CN112749139A - Log file processing method, electronic device and storage medium - Google Patents

Info

Publication number
CN112749139A
Authority
CN
China
Prior art keywords
coding
coding table
character
encoding
subset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011614132.5A
Other languages
Chinese (zh)
Other versions
CN112749139B (en)
Inventor
邵传贤
周振江
王浩然
马兵
吴庆双
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
MIGU Culture Technology Co Ltd
Application filed by China Mobile Communications Group Co Ltd, MIGU Culture Technology Co Ltd
Priority to CN202011614132.5A
Publication of CN112749139A
Application granted
Publication of CN112749139B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a log file processing method, electronic equipment and a storage medium, wherein the method comprises the following steps: determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table; respectively aiming at each coding field, determining a coding table corresponding to the coding field in a coding table set according to the coding type and the ID of the coding table corresponding to the coding field, and determining a source character segment corresponding to the coding field according to the coding number and the coding table; and generating a source file according to the source character segment corresponding to each encoding field. According to the log file processing method, the electronic device and the storage medium, the encoding table set is established and updated based on the character segments in the source file, the source file is encoded and compressed through the encoding table set, the storage space of the source file in the storage process is released, the cost of hardware equipment is reduced, and meanwhile, the encoded file is decoded through the encoding table set, so that rapid decoding is achieved.

Description

Log file processing method, electronic device and storage medium
Technical Field
The present invention relates to the field of information encoding and decoding technologies, and in particular, to a log file processing method, an electronic device, and a storage medium.
Background
In the operation process of the business system, the request and the system response of the user are recorded in a log file mode. The log files can be collected to a big data analysis platform for big data analysis, and the analyzed data can be uniformly transferred to a data storage system for storage. The log file is stored, on one hand, the deeper mining of the historical service can be carried out in the later period, on the other hand, when problems exist, the historical data can be analyzed, problem rules can be found, and the problem positioning and solving are convenient.
If the log file is stored in the source string format, a large amount of hardware storage space is needed. For this reason, the log file needs to be encoded and compressed to reduce the storage space it occupies.
Disclosure of Invention
Aiming at the problems in the prior art, the invention provides a log file processing method, electronic equipment and a storage medium.
The invention provides a log file processing method, which comprises the following steps:
determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table;
for each coding field respectively, determining the coding table corresponding to the coding field in a coding table set according to the coding type and the coding table ID corresponding to the coding field, and determining the source character segment corresponding to the coding field according to the coding number and the coding table corresponding to the coding field;
generating a source file according to the source character segment corresponding to each coding field;
wherein the set of encoding tables includes one or more subsets of encoding tables; each coding table subset comprises at least one coding table, and the coding types of the coding tables are the same; each coding table in the same coding table subset corresponds to different coding table IDs respectively; the encoding type is determined based on a source segment length.
According to the log file processing method provided by the invention, the lengths of the code numbers in the code table are sequentially increased, and the source character segments sequentially correspond to the code numbers according to the occurrence times from more to less.
According to the log file processing method provided by the invention, the number of the coding table subsets is equal to the value of a preset intercepting length, and the intercepting length is the reference length for segmenting the source file in the source file coding process.
According to the log file processing method provided by the invention, the method further comprises the following steps:
acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to the intercepting length to obtain each subsection;
and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
According to the log file processing method provided by the invention, the encoding of the source character segments in each sub-segment is respectively carried out based on the existing encoding table set to obtain the encoding file, and the encoding table set is updated, and the method comprises the following steps:
judging whether an existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the subsections to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can be matched with the subsections or not to obtain a second judgment result, and configuring coding numbers for the subsections according to the second judgment result;
if the first judgment result is negative, establishing a new coding table subset, configuring a corresponding coding type according to the maximum length of the subsegment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the subsegment.
According to the log file processing method provided by the invention, when the first judgment result is negative, the method further comprises the following steps:
for each single character of the subsegment, when it is determined that the existing coding table set has a coding table subset of the corresponding coding type but the coding tables in that subset cannot be matched with the single character, configuring a coding number for the single character in the coding table;
and for each single character of the subsegment, when it is determined that the existing coding table set does not have a coding table subset of the corresponding coding type, establishing a coding table in a new coding table subset, and configuring a coding number for the single character in that coding table.
According to the log file processing method provided by the invention, when the first judgment result is negative, the method further comprises the following steps:
acquiring the character segments S (0, i) of the subsegment, wherein i takes any value from 1 to (L-1), S (0, i) represents the character segment formed by sequentially splicing the characters from 0 to the ith in the subsegment, and L is the maximum length of the subsegment;
determining that the character segment S (0, i) does not have a coding table subset of a corresponding coding type in an existing coding table set, establishing a new coding table subset, configuring the corresponding coding type according to the length of the character segment S (0, i), establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the character segment S (0, i);
and determining that the character segment S (0, i) has the corresponding coding table subset of the coding type in the existing coding table set, determining that the character segment S (0, i) cannot be matched in the coding table in which the existing coding table subset exists, matching a new coding number for the character segment S (0, i), and updating the coding table.
According to a log file processing method provided by the present invention, the configuring an encoding number for a sub-segment according to a second judgment result includes:
determining that the sub-segments can not be matched in the coding tables in the existing coding table subset, configuring a new coding number for the sub-segments, and updating the coding tables;
and if the sub-sections can be matched in the coding tables in the existing coding table subset, configuring the matched coding numbers for the sub-sections.
The invention also provides a log file processing method, which comprises the following steps:
acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to an interception length to obtain each sub-segment, wherein the interception length is a reference length for segmenting and dividing the source file;
and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor executes the program to realize the steps of the log file processing method.
The invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the log file processing method according to any one of the above.
According to the log file processing method, the electronic device and the storage medium, the encoding table set is established and updated based on the character segments in the source file, the source file is encoded and compressed through the encoding table set, the storage space of the source file in the storage process is released, the cost of hardware equipment is reduced, and meanwhile, the encoded file is decoded through the encoding table set, so that rapid decoding is achieved.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or the prior art, the drawings needed for the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic flow chart of a log file processing method provided by the present invention;
FIG. 2 is another schematic flow chart of a log file processing method provided by the present invention;
FIG. 3 is a schematic structural diagram of a log file processing apparatus provided in the present invention;
FIG. 4 is a schematic diagram of another structure of the log file processing apparatus provided by the present invention;
FIG. 5 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The following describes a log file processing method, apparatus, electronic device and storage medium provided by the present invention with reference to fig. 1 to 5.
Fig. 1 shows a schematic flow chart of a log file processing method provided by the present invention, and referring to fig. 1, the method includes the following steps:
S11, determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table;
S12, for each coding field respectively, determining the coding table corresponding to the coding field in a coding table set according to the coding type and the coding table ID corresponding to the coding field, and determining the source character segment corresponding to the coding field according to the coding number and the coding table corresponding to the coding field;
S13, generating a source file according to the source character segment corresponding to each coding field.
Wherein the set of encoding tables includes one or more subsets of encoding tables; each coding table subset comprises at least one coding table, and the coding types of the coding tables are the same; each coding table in the same coding table subset corresponds to different coding table IDs respectively; the encoding type is determined based on the source segment length.
Regarding steps S11 to S13, it should be noted that during the operation of a business system, user requests and system responses are recorded in log files. The log files can be collected to a big data analysis platform for big data analysis, and the analyzed data can be uniformly transferred to a data storage system for storage. Storing the log files allows, on one hand, deeper mining of historical service data at a later stage and, on the other hand, analysis of historical data when problems occur, so that problem patterns can be found and problems can be located and solved more easily.
If the log file is stored in the source string format, a large amount of hardware storage space is needed. For this reason, the log file needs to be encoded and compressed to reduce the storage space it occupies.
Each log file is a character string, and each complete character string can generate different character segments due to the arrangement sequence of characters, so that the coded information is different due to the difference of the character segments. Therefore, a corresponding encoding table set needs to be established based on the characteristics of the log file, and encoding and decoding of the log file are completed through the established encoding table set.
Character segments differ in length depending on how many consecutive characters they contain. For example, the length of the character segment ab is 2 bytes and the length of the character segment abcdf is 5 bytes.
In order to save encoding time, character segments of different lengths are matched in the encoding table corresponding to their length to obtain the encoding number. To this end, the encoding table set includes one or more encoding table subsets, and each encoding table subset stores one or more encoding tables of the same type. That is, each encoding table subset has a unique encoding type and comprises one or more encoding tables.
Since the character segments are distinguished based on different lengths (i.e., byte lengths), in the present invention, the encoding type of each encoding table subset is determined based on the length of the character segment.
For example, in the present invention, a character segment includes one character or a plurality of characters.
The code table corresponding to the character segment of one character is called a single character code table, and the code type is called a single character.
The coding table corresponding to the character segments of the two characters is called a double-character coding table, and the coding type is called 'double characters'.
……
The coding table corresponding to the character segment of the i characters is called an i character coding table, and the coding type is called an 'i character'.
In the present invention, each coding table subset includes one or more coding tables. Within a coding table subset, each coding table has a unique coding table ID.
For example, if a coding table subset includes 3 coding tables, the IDs of the coding tables are 1.1, 1.2 and 1.3 respectively.
In the present invention, the coding table is used to encode the log file. Therefore, the coding table includes the correspondence between code numbers and source character segments. Each log file to be encoded is referred to as a source file, and the log file is composed of character segments, so the character segments of the source file are referred to herein as source character segments. The source character segments and the code numbers form a one-to-one correspondence and are stored in a coding table.
The code number is a unique number corresponding to the source character segment. The code number is determined by binary coding.
In the invention, a source file is coded by adopting a segmented coding mode, and therefore, the coded file comprises a plurality of coding fields, and each coding field comprises a coding type, a coding table ID and a coding number in a coding table.
In the decoding process, a file to be decoded is obtained, and the coding type, the coding table ID and the coding number in the coding table of each coding field are determined according to the file to be decoded.
The coding table corresponding to each coding field is then determined in the coding table set according to the coding type and the coding table ID in the coding field, and the source character segment corresponding to each coding field is obtained according to the coding number and the coding table.
Finally, a source file is generated according to the source character segment corresponding to each coding field.
For example, the file to be decoded includes encoding fields of { three characters, 3.1, 011}, { three characters, 3.2, 001}, respectively.
If there is a correspondence between 011 and abc in the encoding table with the encoding table ID of 3.1 in the three-character encoding table subset, then the source character segment with { three characters, 3.1, 011} encoding field is abc.
If there is a correspondence between 001 and jkh in the encoding table with an ID of 3.2 in the subset of three-character encoding tables, the source character segment with the encoding field of { three-character, 3.2, 001} is jkh.
At this time, the generated source file is abcjkh.
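The decoding example above can be sketched in a few lines of Python. This is only an illustrative sketch: the nested-dictionary layout of the coding table set and the names `table_set` and `decode` are assumptions made for the example and are not specified by the source.

```python
# Hypothetical in-memory layout (an assumption, not from the source):
# {coding type: {coding table ID: {code number: source character segment}}}
table_set = {
    "three characters": {
        "3.1": {"011": "abc"},
        "3.2": {"001": "jkh"},
    },
}


def decode(fields, table_set):
    """Rebuild the source text from (coding type, table ID, code number) fields."""
    segments = []
    for coding_type, table_id, code_number in fields:
        # S12: select the coding table by coding type and coding table ID.
        table = table_set[coding_type][table_id]
        # S12: look up the source character segment by code number.
        segments.append(table[code_number])
    # S13: concatenate the segments in field order.
    return "".join(segments)


fields = [
    ("three characters", "3.1", "011"),
    ("three characters", "3.2", "001"),
]
source = decode(fields, table_set)
print(source)
```

Because each field names its own table, decoding needs only two dictionary lookups per field in this layout.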
The log file processing method provided by the invention establishes and updates the coding table set based on the character segments in the source file, and codes and compresses the source file through the coding table set, so that the storage space of the source file in the storage process is released, the cost of hardware equipment is reduced, and meanwhile, the coding file is decoded through the coding table set, so that the rapid decoding is realized.
In the further explanation of the above method, the specific explanation is mainly for the code numbers in the code table, the lengths of the code numbers are sequentially increased, and the source character segments sequentially correspond to the code numbers according to the occurrence times from more to less.
In this regard, it should be noted that each code number has its own length. For example, in the binary code sequence 0, 1, 10, 11, 100, 101, 110, 111, 1000, ……, the length of the code number increases overall.
In the present invention, some character segments occur more often than others, yet the coding table may have assigned them code numbers of a longer length. Therefore, the coding table needs to be updated and optimized so that source character segments are assigned code numbers in order of occurrence count, from most to least frequent.
For example, suppose the source character segment abcdefgh appears many times across different source files, but its code number in the coding table is 10000, while the code number of another, less frequent source character segment abcvgjhk is 0. In this case, the code number of abcdefgh is changed to 0, and the code number of abcvgjhk is changed to 10000. The code numbers of other frequently occurring source character segments are likewise moved, in order, toward the front of the code numbers in the coding table.
In the invention, the same character segment is compressed to obtain the same compression result, the occurrence frequency of the character segment can be counted through the compression result, and the coding table is optimized, thereby obtaining better character compression ratio.
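A minimal sketch of this reordering, assuming that occurrence counts have already been collected and that code numbers are simply the binary representations of a rank starting at 0. The function name `reassign_code_numbers` and the counts are hypothetical.

```python
from collections import Counter


def reassign_code_numbers(counts):
    """Map each segment to a binary code number; more frequent means shorter."""
    # Sort by descending count; ties are broken alphabetically for determinism.
    ordered = sorted(counts, key=lambda seg: (-counts[seg], seg))
    # Rank 0 -> "0", rank 1 -> "1", rank 2 -> "10", and so on.
    return {seg: format(rank, "b") for rank, seg in enumerate(ordered)}


# Hypothetical occurrence counts for three 8-character segments.
counts = Counter({"abcdefgh": 120, "abcvgjhk": 3, "abcxyzuv": 40})
codes = reassign_code_numbers(counts)
print(codes)
```

Files encoded with the old numbering would need to be decoded with the old table or re-encoded; the source does not say how this is handled, so the sketch leaves it out.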
In the further explanation of the above method, the number of the encoding table subsets is mainly explained specifically, the number of the encoding table subsets is the same as a preset intercepting length, and the intercepting length is a reference length for segmenting the source file in the source file encoding process.
In this regard, it should be noted that the method of the present invention is intended to encode and compress a source file, and for compression, a character segment of a certain byte length (i.e., an intercepted length) in a source file is actually compressed into an encoding with a shorter byte length. Namely: the maximum length of the code number in each code table is made shorter than the truncated length, so that the file compression can be realized.
Based on the above explanation of the encoding types, the number of the encoding table subsets is the same as the preset value of the truncation length. The interception length is a reference length for segmenting the source file in the source file encoding process.
For example, if the configured interception length is 10 bytes, a 98-byte source file is divided into 9 sub-segments of 10 bytes and 1 sub-segment of 8 bytes.
In the above encoding process with the truncation length of 10 bytes, since 10 bytes is the maximum length of the segment division, the number of the encoding table subsets can only be 10.
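The segmentation in this example can be sketched as follows. The sketch assumes single-byte characters so that string length equals byte length; `split_into_subsegments` is a hypothetical helper name.

```python
def split_into_subsegments(source, truncation_length):
    """Cut `source` into consecutive pieces of at most `truncation_length` characters."""
    if truncation_length <= 0:
        raise ValueError("truncation_length must be positive")
    return [
        source[start:start + truncation_length]
        for start in range(0, len(source), truncation_length)
    ]


source = "x" * 98                      # stand-in for a 98-byte log file
parts = split_into_subsegments(source, 10)
lengths = [len(part) for part in parts]
print(lengths)
```

Only the final sub-segment can be shorter than the truncation length, which is why sub-segment lengths never exceed 10 here.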
The further method of the invention can reasonably manage the number of the established subset of the coding table by reasonably setting the interception length.
In the further explanation of the above method, the establishment and updating process of the coding table is mainly explained as follows:
acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to the intercepting length to obtain each subsection;
and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
In this regard, it should be noted that, in the present invention, the encoding tables in the encoding table set can be dynamically updated during the encoding process of the log file.
The source file is encoded and compressed, in fact, character segments of a certain byte length (namely, the intercepted length) in the source file are compressed into codes with shorter byte length.
Therefore, after the source file to be coded is obtained, the source file to be coded is segmented and divided according to the set intercepting length to obtain each subsection.
And then, coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
In the present invention, an initial encoding table may be configured. The character segment corresponding to the code number in the code table is a common character or a character segment. That is, the common characters or character segments are encoded, and the corresponding relation between the code numbers and the character segments in the code table corresponding to different character segment lengths is established.
Then the initial coding table set is used for coding a certain number of source files to obtain a coding file and a more complete coding table set.
And then, the more complete coding table set is used for coding the subsequent source file to obtain a coding file, and meanwhile, the dynamic updating of the coding table set is also realized.
The further method of the invention encodes the source file by the existing encoding table set, and realizes the dynamic update of the encoding table set while obtaining the encoded file, so that the encoding table set is more perfect and has better adaptability.
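The bootstrapping flow described above (seed an initial table set, encode files, and let the set grow) might look like the sketch below. It is simplified on purpose: it assumes one coding table per subset, uses the segment length as the coding type, and stores tables in the segment-to-code direction. The seed contents and the name `encode_file` are hypothetical.

```python
def encode_file(source, table_set, truncation_length):
    """Encode `source` into fields and grow `table_set` with unseen sub-segments."""
    fields = []
    for start in range(0, len(source), truncation_length):
        sub = source[start:start + truncation_length]
        coding_type = len(sub)                       # assumption: type = length
        table_id = f"{coding_type}.1"                # assumption: one table per subset
        table = table_set.setdefault(coding_type, {}).setdefault(table_id, {})
        if sub not in table:
            # Unseen sub-segment: assign the next binary code number.
            table[sub] = format(len(table), "b")
        fields.append((coding_type, table_id, table[sub]))
    return fields


# Initial table set seeded with one common segment (hypothetical content).
table_set = {4: {"4.1": {"GET ": "0"}}}
first = encode_file("GET POSTGET ", table_set, 4)
second = encode_file("POSTPOST", table_set, 4)
print(first, second)
```

The second call reuses the entry that the first call added, which is the dynamic update the text describes.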
In the further explanation of the above method, the source character segments in each sub-segment are encoded based on the existing encoding table set to obtain the encoded file, and the processing procedure of updating the encoding table set is explained as follows:
judging whether an existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the subsections to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can be matched with the subsections or not to obtain a second judgment result, and configuring coding numbers for the subsections according to the second judgment result;
if the first judgment result is negative, establishing a new coding table subset, configuring a corresponding coding type according to the maximum length of the subsegment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the subsegment.
In this regard, it should be noted that when the configured truncation length is 10 bytes, a 98-byte source file is divided into 9 sub-segments of 10 bytes and 1 sub-segment of 8 bytes.
For 9 sub-segments of length 10 bytes, the maximum length of the sub-segment is 10.
For 1 sub-segment of length 8 bytes, the maximum length of the sub-segment is 8.
In the encoding process, it is first judged whether the current sub-segment can be matched with a corresponding encoding number in an encoding table. If it can, the matched encoding number is assigned directly, and the encoding compression of the current sub-segment is complete.
When it is determined that there is no corresponding coding table subset of the coding type in the existing coding table set according to the maximum length of the sub-segment, it indicates that the character segment of the length has not been coded in the coding process, at this time, a new coding table subset is established for the length of the byte, a coding table is established in the coding table subset, a corresponding coding table ID is configured, a coding number is configured for the sub-segment, and then the sub-segment is coded and compressed by using the just configured coding number.
For example, for the sub-segment abcdefghlk, a subset of the coding table with the coding type of "10 characters" is established, then the coding table with the ID of 10.1 is configured, and the corresponding relationship between "0" and "abcdefghlk" is established in the coding table.
When determining that the existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the sub-segment, it cannot be guaranteed that a corresponding relationship between the sub-segment and the coding number exists in a certain coding table in the coding table subset. For this purpose, matching needs to be performed in the coding table.
Determining that the sub-segments can not be matched in the coding tables in the existing coding table subset, configuring a new coding number for the sub-segments, and updating the coding tables;
and if the sub-sections can be matched in the coding tables in the existing coding table subset, configuring the matched coding numbers for the sub-sections.
The further method of the invention is to establish a new coding table and configure the coding number when the existing coding table can not match the sub-segments, so as to directly match the subsequent sub-segments.
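The two judgments can be sketched as a single function. This is an illustration under stated assumptions rather than the patented procedure: the coding type is taken to be the sub-segment length, table IDs follow the "length.index" pattern seen in the examples, new code numbers are appended to the last table of the subset, and the name `encode_subsegment` is hypothetical.

```python
def encode_subsegment(sub, table_set):
    """Return (coding type, table ID, code number) for `sub`, updating `table_set`."""
    coding_type = len(sub)                     # assumption: type = segment length
    subset = table_set.get(coding_type)

    # First judgment: is there a subset for this coding type?
    if subset is None:
        table_id = f"{coding_type}.1"
        table_set[coding_type] = {table_id: {sub: "0"}}
        return coding_type, table_id, "0"

    # Second judgment: does any table in the subset already hold this sub-segment?
    for table_id, table in subset.items():
        if sub in table:
            return coding_type, table_id, table[sub]

    # No match: configure a new code number in the last table of the subset.
    table_id, table = list(subset.items())[-1]
    table[sub] = format(len(table), "b")
    return coding_type, table_id, table[sub]


table_set = {}
first = encode_subsegment("abcdefghlk", table_set)   # creates subset and table 10.1
again = encode_subsegment("abcdefghlk", table_set)   # matched in table 10.1
other = encode_subsegment("zzzzzzzzzz", table_set)   # new code number in 10.1
print(first, again, other)
```

The sketch does not open a second table when code numbers grow long; that rule would need a length check before the final assignment.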
In a further description of the method, the processing performed when the first judgment result is negative is additionally described as follows:
Because the maximum length of a coding number in each coding table is kept shorter than the truncation length, file compression can be achieved. Therefore, when the length of the coding numbers in a coding table approaches the truncation length, a new coding table needs to be established, and new sub-segments that do not yet have a correspondence with a coding number are given new coding numbers in that new coding table.
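One way to picture the creation of a new coding table is the following sketch. The exact threshold is not stated in the text, so the rule used here (a coding number must stay strictly shorter than the truncation length) is an assumption, as are the name assign_code, the decimal coding numbers, and the parameter cut_length, which is assumed to be at least 2.

```python
from typing import Dict, List, Tuple


def assign_code(tables: List[Dict[str, str]], segment: str, cut_length: int) -> Tuple[int, str]:
    """Give segment the next decimal coding number in one coding table subset.

    Returns (1-based index of the coding table within the subset, coding number).
    """
    if not tables:
        tables.append({})
    code = str(len(tables[-1]))
    # Assumed rollover rule: once the next coding number would be as long as
    # the truncation length, open a new coding table and restart from "0".
    if len(code) >= cut_length:
        tables.append({})
        code = "0"
    tables[-1][segment] = code
    return len(tables), code


subset: List[Dict[str, str]] = []
results = [assign_code(subset, f"seg{i:02d}", 2) for i in range(11)]
print(results[9], results[10])
```

With a truncation length of 2, the first ten segments receive the one-digit coding numbers 0 to 9 in the first table, and the eleventh segment starts a second table.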
In the present invention, when no sub-segment of that length has been coded so far, it cannot be determined whether character segments of other lengths contained in the sub-segment already correspond to coding numbers in the corresponding coding tables. This matters because, when a source file is divided, the last character segment that remains may be shorter than the maximum length of a sub-segment. For example, the maximum length of a sub-segment is 10 and the length of the last character segment is 2.
At this time, the character segments S(0, i) of the sub-segment are acquired successively, where i takes each value from 1 to (L-1), S(0, i) denotes the character segment formed by the 0th to the i-th characters of the sub-segment, and L is the maximum length of the sub-segment;
if it is determined that the existing coding table set has no coding table subset of the coding type corresponding to the character segment S(0, i), a new coding table subset is established, the corresponding coding type is configured according to the length of the character segment S(0, i), a new coding table is established in the new coding table subset and given a corresponding coding table ID, and a coding number is configured for the character segment S(0, i); and if it is determined, according to the length of the character segment S(0, i), that the existing coding table set has a coding table subset of the corresponding coding type but the character segment S(0, i) cannot be matched in the coding tables of that subset, a new coding number is configured for the character segment S(0, i) and the coding table is updated.
The processing of the character segment S(0, i) of the sub-segment is the same as the processing of the entire sub-segment described above, and is not repeated here.
In addition, when the first judgment result is negative: if it is determined, for each single character of the sub-segment, that the existing coding table set has a coding table subset of the corresponding coding type but the single character cannot be matched in the coding tables of that subset, a coding number is configured for the single character in the coding table; and if it is determined that the existing coding table set has no coding table subset of the corresponding coding type, a coding table is established in a new coding table subset and a coding number is configured for the single character in that coding table. This processing is the same as that described above and is not repeated here.
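The handling of single characters and of the character segments S(0, i) can be sketched as below. This is a simplified illustration only: it assumes one coding table per coding type and consecutive decimal coding numbers, and the name register_prefixes is hypothetical.

```python
from typing import Dict

# Simplified, hypothetical layout: coding type (length) -> one coding table
# mapping a character segment to its coding number.
TableSet = Dict[int, Dict[str, str]]


def register_prefixes(table_set: TableSet, segment: str) -> None:
    """Make sure every single character and every prefix of segment has a coding number."""
    for i, char in enumerate(segment):
        prefix = segment[: i + 1]  # characters 0..i spliced in order
        for piece in (char, prefix):
            # Create the subset/table for this length if it does not exist yet.
            table = table_set.setdefault(len(piece), {})
            if piece not in table:
                table[piece] = str(len(table))


table_set: TableSet = {}
register_prefixes(table_set, "abc")
print(table_set)
```

After the call, the single-character table holds a, b and c, the two-character table holds ab, and the three-character table holds abc, so a later occurrence of any of these pieces can be matched directly.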
The above-described processing is explained below by specific examples:
Suppose the source file to be coded is the character string &₁&₂…&₁₀@₁@₂…@₁₀%₁%₂…%₁₀…, which is 101 characters long. With 10 characters compressed at a time, the compression takes 11 passes, i.e., 10 character segments of 10 characters and 1 character segment of 1 character.
First, characters 1 to 10 are taken from the source file to be coded as the first character segment to be compressed, "&₁&₂…&₁₀";
if the coding table set contains a 10-character coding table subset, "&₁&₂…&₁₀" has a corresponding coding table in that subset, and its coding number in that coding table is 01, the code 01 is output directly;
if there is no 10-character coding table subset, or there is one but the match in its coding tables is not successful, the first character "&₁" is taken from "&₁&₂…&₁₀" and it is judged whether "&₁" is in the single-character coding table; if yes, the next step is carried out; if not, "&₁" is first put into the single-character coding table and allocated a coding number, and then the next step is carried out;
the second character "&₂" is read from "&₁&₂…&₁₀", and this single character is processed in the same way as in the previous step; after the single-character processing is finished, the previously processed character "&₁" and the newly read character "&₂" are spliced to obtain the character string "&₁&₂". It is judged whether "&₁&₂" is in the two-character coding table; if yes, the next step is carried out; if not, "&₁&₂" is added to the two-character coding table and allocated a corresponding coding number;
the third character "&₃" is read from "&₁&₂…&₁₀", and this single character is processed in the same way as in the previous step; after the single-character processing is finished, the previously processed character segment "&₁&₂" and the newly read character "&₃" are spliced to obtain the character string "&₁&₂&₃". It is judged whether "&₁&₂&₃" is in the three-character coding table; if yes, the next step is carried out; if not, "&₁&₂&₃" is added to the three-character coding table and allocated a corresponding coding number;
the i-th character "&ᵢ" is read from "&₁&₂…&₁₀", where i takes the values 4 to 10, and this single character is processed in the same way as in the previous step; after the single-character processing is finished, the previously processed character segment "&₁…&ᵢ₋₁" and the newly read character "&ᵢ" are spliced to obtain the character segment "&₁…&ᵢ". It is judged whether "&₁…&ᵢ" is in the i-character coding table; if yes, the next step is carried out; if not, "&₁…&ᵢ" is added to the i-character coding table and allocated a corresponding coding number;
after "&₁&₂…&₁₀" has been processed, the code of this character segment in the 10-character coding table is available, and the processing of the first character segment is finished. The final code is output in the format {ten characters, 10.1, 00}, where "ten characters" is the coding type, 10.1 is the coding table ID, and 00 is the coding number.
Then the subsequent character segments "@₁@₂…@₁₀", "%₁%₂…%₁₀", … are taken in turn and processed on the same principle as "&₁&₂…&₁₀", until the whole source file to be coded has been processed, at which point the compression is complete.
In this further method of the invention, whenever a sub-segment of the source file to be coded is not successfully matched in the coding table set, each character string of that sub-segment is analysed and coded individually, which achieves dynamic updating of the coding table set.
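The segment count in the example above (a 101-character source file cut 10 characters at a time) can be checked with a short sketch. The function name split_source and the placeholder string of repeated "x" characters are assumptions made for illustration; the actual characters of the example are not reproduced.

```python
from typing import List


def split_source(source: str, cut_length: int) -> List[str]:
    """Cut the source into consecutive sub-segments of at most cut_length characters."""
    return [source[i : i + cut_length] for i in range(0, len(source), cut_length)]


# Placeholder 101-character source file (stand-in for the example string).
source = "x" * 101
segments = split_source(source, 10)
print(len(segments), len(segments[0]), len(segments[-1]))  # prints: 11 10 1
```

The result is 10 sub-segments of 10 characters followed by 1 sub-segment of 1 character, matching the 11 compression passes described in the example.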
Fig. 2 shows a schematic flow chart of a log file processing method provided by the present invention, and referring to fig. 2, the method includes the following steps:
S21, obtaining a source file to be coded, and dividing the source file to be coded into segments according to the truncation length to obtain the sub-segments, wherein the truncation length is the reference length for segmenting the source file;
and S22, coding the source character segments in each sub-segment based on the existing coding table set to obtain a coded file, and updating the coding table set.
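Steps S21 and S22 can be combined into one small sketch. It is a simplification made under explicit assumptions: a single coding table per coding type (so the coding table ID is always of the form "n.1"), consecutive decimal coding numbers, and the hypothetical names encode_file and Field.

```python
from typing import Dict, List, Tuple

Field = Tuple[int, str, str]  # (coding type, coding table ID, coding number)


def encode_file(source: str, cut_length: int,
                table_set: Dict[int, Dict[str, str]]) -> List[Field]:
    """S21: split by the truncation length. S22: code each sub-segment and update the tables."""
    fields: List[Field] = []
    for start in range(0, len(source), cut_length):
        segment = source[start : start + cut_length]
        coding_type = len(segment)
        # Simplification: exactly one coding table per coding type.
        table = table_set.setdefault(coding_type, {})
        if segment not in table:
            table[segment] = str(len(table))  # dynamic update of the coding table set
        fields.append((coding_type, f"{coding_type}.1", table[segment]))
    return fields


tables: Dict[int, Dict[str, str]] = {}
print(encode_file("abcabcabxd", 3, tables))
print(tables)
```

The same table_set object can be passed to later calls, so segments learned while coding one source file are matched directly in the next.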
In a further explanation of the above method, the process of coding the source character segments in each sub-segment based on the existing coding table set to obtain the coded file, and of updating the coding table set, is explained as follows:
judging whether an existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the subsections to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can be matched with the subsections or not to obtain a second judgment result, and configuring coding numbers for the subsections according to the second judgment result;
if the first judgment result is negative, establishing a new coding table subset, configuring a corresponding coding type according to the maximum length of the subsegment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the subsegment.
In a further explanation of the above method, the processing performed when the first judgment result is negative is explained as follows:
if it is determined, for each single character of the sub-segment, that the existing coding table set has a coding table subset of the corresponding coding type but the single character cannot be matched in the coding tables of that subset, a coding number is configured for the single character in the coding table;
and if it is determined, for each single character of the sub-segment, that the existing coding table set has no coding table subset of the corresponding coding type, a coding table is established in a new coding table subset and a coding number is configured for the single character in that coding table.
In a further explanation of the above method, the processing performed when the first judgment result is negative is also explained as follows:
acquiring the character segments S(0, i) of the sub-segment, wherein i takes each value from 1 to (L-1), S(0, i) denotes the character segment formed by sequentially splicing the 0th to the i-th characters of the sub-segment, and L is the maximum length of the sub-segment;
if it is determined that the existing coding table set has no coding table subset of the coding type corresponding to the character segment S(0, i), establishing a new coding table subset, configuring the corresponding coding type according to the length of the character segment S(0, i), establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the character segment S(0, i);
and if it is determined that the existing coding table set has a coding table subset of the coding type corresponding to the character segment S(0, i) but the character segment S(0, i) cannot be matched in the coding tables of that subset, configuring a new coding number for the character segment S(0, i) and updating the coding table.
In a further explanation of the above method, the process of configuring a coding number for the sub-segment according to the second judgment result is explained as follows:
if it is determined that the sub-segment cannot be matched in the coding tables of the existing coding table subset, a new coding number is configured for the sub-segment and the coding table is updated;
and if the sub-segment can be matched in the coding tables of the existing coding table subset, the matched coding number is configured for the sub-segment.
The coding process has been described in detail above and is not repeated here.
The method of the invention codes the source file using the existing coding table set and dynamically updates the coding table set while obtaining the coded file, so that the coding table set becomes more complete and more adaptable.
The following describes the log file processing apparatus provided by the present invention, and the log file processing apparatus described below and the log file processing method described above may be referred to in correspondence with each other.
Fig. 3 shows a schematic structural diagram of a log file processing apparatus provided by the present invention, referring to fig. 3, the apparatus includes a parsing module 31, a decoding module 32, and a generating module 33, where:
the parsing module 31 is configured to determine each coding field in the file to be decoded, where each coding field includes a corresponding coding type, a coding table ID, and a coding number in a coding table;
a decoding module 32, configured to determine, for each coding field, a coding table corresponding to the coding field in a coding table set according to a coding type and a coding table ID corresponding to the coding field, and determine, according to a coding number and the coding table corresponding to the coding field, a source character segment corresponding to the coding field;
a generating module 33, configured to generate a source file according to the source character segments corresponding to each encoding field;
wherein the set of encoding tables includes one or more subsets of encoding tables; each coding table subset comprises at least one coding table, and the coding types of the coding tables are the same; each coding table in the same coding table subset corresponds to different coding table IDs respectively; the encoding type is determined based on a source segment length.
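The cooperation of the three modules can be illustrated with a short decoding sketch. The nested dictionary layout (coding type, then coding table ID, then coding number), the name decode_file, and the sample data are assumptions chosen for illustration; the text does not prescribe a storage layout.

```python
from typing import Dict, List, Tuple

Field = Tuple[int, str, str]  # (coding type, coding table ID, coding number)
# Hypothetical layout: coding type -> coding table ID -> coding number -> source segment.
TableSet = Dict[int, Dict[str, Dict[str, str]]]


def decode_file(fields: List[Field], table_set: TableSet) -> str:
    """Rebuild the source file from the coding fields of a file to be decoded."""
    pieces = []
    for coding_type, table_id, code in fields:
        table = table_set[coding_type][table_id]  # subset by coding type, table by ID
        pieces.append(table[code])                # coding number -> source character segment
    return "".join(pieces)                        # generate the source file


table_set: TableSet = {
    3: {"3.1": {"0": "abc", "1": "abx"}},
    1: {"1.1": {"0": "d"}},
}
fields = [(3, "3.1", "0"), (3, "3.1", "1"), (1, "1.1", "0")]
print(decode_file(fields, table_set))  # prints: abcabxd
```

Because each coding field names its coding type and coding table ID, each lookup touches a single table, which is one plausible reading of why decoding can be fast.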
In a further explanation of the above apparatus, the coding numbers in a coding table increase in length in sequence, and the source character segments are assigned to the coding numbers in descending order of occurrence count, so that the most frequently occurring segments correspond to the shortest coding numbers.
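The ordering of coding numbers by occurrence count can be sketched as follows, assuming decimal coding numbers that start at 0 and a hypothetical helper named build_table. Ties are broken by first occurrence, which is a property of Python's Counter.most_common and not something the text specifies.

```python
from collections import Counter
from typing import Dict, List


def build_table(segments: List[str]) -> Dict[str, str]:
    """Assign the smallest (shortest) coding numbers to the most frequent segments."""
    counts = Counter(segments)
    # most_common() sorts by count, highest first; equal counts keep first-seen order.
    return {segment: str(rank) for rank, (segment, _) in enumerate(counts.most_common())}


print(build_table(["b", "a", "a", "c", "a", "b"]))  # prints: {'a': '0', 'b': '1', 'c': '2'}
```

Since coding numbers grow in length as they grow in value, giving low numbers to frequent segments keeps the coded file short.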
In a further description of the above apparatus, the number of the encoding table subsets is the same as a preset truncation length, where the truncation length is a reference length for segmenting the source file during the encoding process of the source file.
In a further description of the above apparatus, the apparatus further comprises an encoding module configured to:
acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to the intercepting length to obtain each subsection;
and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
In a further description of the above apparatus, in the process of coding the source character segments in each sub-segment based on the existing coding table set to obtain a coded file and updating the coding table set, the coding module is specifically configured to perform the following:
judging whether an existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the subsections to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can be matched with the subsections or not to obtain a second judgment result, and configuring coding numbers for the subsections according to the second judgment result;
if the first judgment result is negative, establishing a new coding table subset, configuring a corresponding coding type according to the maximum length of the subsegment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the subsegment.
In a further description of the foregoing apparatus, when the first judgment result is negative, the coding module is further configured to perform the following:
if it is determined, for each single character of the sub-segment, that the existing coding table set has a coding table subset of the corresponding coding type but the single character cannot be matched in the coding tables of that subset, configuring a coding number for the single character in the coding table;
and if it is determined, for each single character of the sub-segment, that the existing coding table set has no coding table subset of the corresponding coding type, establishing a coding table in a new coding table subset and configuring a coding number for the single character in that coding table.
In a further description of the foregoing apparatus, when the first judgment result is negative, the coding module is further configured to perform the following:
acquiring the character segments S(0, i) of the sub-segment, wherein i takes values from 0 to (L-1), S(0, i) denotes the character segment formed by sequentially splicing the 0th to the i-th characters of the sub-segment, and L is the maximum length of the sub-segment;
if it is determined that the existing coding table set has no coding table subset of the coding type corresponding to the character segment S(0, i), establishing a new coding table subset, configuring the corresponding coding type according to the length of the character segment S(0, i), establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the character segment S(0, i);
and if it is determined that the existing coding table set has a coding table subset of the coding type corresponding to the character segment S(0, i) but the character segment S(0, i) cannot be matched in the coding tables of that subset, configuring a new coding number for the character segment S(0, i) and updating the coding table.
In a further description of the above apparatus, in the process of configuring a coding number for the sub-segment according to the second judgment result, the coding module is specifically configured to perform the following:
if it is determined that the sub-segment cannot be matched in the coding tables of the existing coding table subset, configuring a new coding number for the sub-segment and updating the coding table;
and if the sub-segment can be matched in the coding tables of the existing coding table subset, configuring the matched coding number for the sub-segment.
Since the principle of the apparatus according to this embodiment of the present invention is the same as that of the method according to the above embodiment, it is not described in further detail here.
It should be noted that, in the embodiment of the present invention, the relevant functional modules may be implemented by a hardware processor.
The log file processing method provided by the invention establishes and updates the coding table set based on the character segments in the source file, and codes and compresses the source file by means of the coding table set, which reduces the storage space occupied by the source file and lowers the cost of hardware equipment; meanwhile, the coded file is decoded by means of the coding table set, so that rapid decoding is achieved.
Fig. 4 shows a schematic structural diagram of a log file processing apparatus provided by the present invention, referring to fig. 4, the apparatus includes a dividing module 41 and an encoding module 42, where:
the dividing module 41 is configured to obtain a source file to be encoded, and segment the source file to be encoded according to an interception length to obtain each sub-segment, where the interception length is a reference length for segmenting the source file;
and the encoding module 42 is configured to encode the source character segments in each sub-segment based on an existing encoding table set to obtain an encoding file, and update the encoding table set.
In a further description of the above apparatus, the encoding module is specifically configured to, in a processing procedure of respectively encoding the source character segments in each sub-segment based on an existing encoding table set to obtain an encoded file and updating the encoding table set:
judging whether an existing coding table set has a coding table subset of a corresponding coding type according to the maximum length of the subsections to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can be matched with the subsections or not to obtain a second judgment result, and configuring coding numbers for the subsections according to the second judgment result;
if the first judgment result is negative, establishing a new coding table subset, configuring a corresponding coding type according to the maximum length of the subsegment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the subsegment.
In a further description of the foregoing apparatus, when the first judgment result is negative, the coding module is further configured to perform the following:
if it is determined, for each single character of the sub-segment, that the existing coding table set has a coding table subset of the corresponding coding type but the single character cannot be matched in the coding tables of that subset, configuring a coding number for the single character in the coding table;
and if it is determined, for each single character of the sub-segment, that the existing coding table set has no coding table subset of the corresponding coding type, establishing a coding table in a new coding table subset and configuring a coding number for the single character in that coding table.
In a further description of the foregoing apparatus, when the first judgment result is negative, the coding module is further configured to perform the following:
acquiring the character segments S(0, i) of the sub-segment, wherein i takes each value from 1 to (L-1), S(0, i) denotes the character segment formed by sequentially splicing the 0th to the i-th characters of the sub-segment, and L is the maximum length of the sub-segment;
if it is determined that the existing coding table set has no coding table subset of the coding type corresponding to the character segment S(0, i), establishing a new coding table subset, configuring the corresponding coding type according to the length of the character segment S(0, i), establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the character segment S(0, i);
and if it is determined that the existing coding table set has a coding table subset of the coding type corresponding to the character segment S(0, i) but the character segment S(0, i) cannot be matched in the coding tables of that subset, configuring a new coding number for the character segment S(0, i) and updating the coding table.
In a further description of the above apparatus, in the process of configuring a coding number for the sub-segment according to the second judgment result, the coding module is specifically configured to perform the following:
if it is determined that the sub-segment cannot be matched in the coding tables of the existing coding table subset, configuring a new coding number for the sub-segment and updating the coding table;
and if the sub-segment can be matched in the coding tables of the existing coding table subset, configuring the matched coding number for the sub-segment.
Since the principle of the apparatus according to this embodiment of the present invention is the same as that of the method according to the above embodiment, it is not described in further detail here.
It should be noted that, in the embodiment of the present invention, the relevant functional modules may be implemented by a hardware processor.
The apparatus of the invention codes the source file using the existing coding table set and dynamically updates the coding table set while obtaining the coded file, so that the coding table set becomes more complete and more adaptable.
Fig. 5 is a schematic diagram of the physical structure of an electronic device. As shown in fig. 5, the electronic device may include a processor 51, a communication interface 52, a memory 53 and a communication bus 54, wherein the processor 51, the communication interface 52 and the memory 53 communicate with each other through the communication bus 54. The processor 51 may call logic instructions in the memory 53 to perform a log file processing method comprising: determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table; for each coding field, determining the coding table corresponding to the coding field in a coding table set according to the coding type and the coding table ID corresponding to the coding field, and determining the source character segment corresponding to the coding field according to the coding number corresponding to the coding field and that coding table; and generating a source file according to the source character segment corresponding to each coding field. The coding table set includes one or more coding table subsets; each coding table subset comprises at least one coding table, and the coding tables in a subset have the same coding type; each coding table in the same coding table subset corresponds to a different coding table ID; the coding type is determined based on the source segment length.
In addition, the logic instructions in the memory 53 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the log file processing method provided by the above methods, the method comprising: determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table; respectively aiming at each coding field, determining a coding table corresponding to the coding field in a coding table set according to the coding type and the ID of the coding table corresponding to the coding field, and determining a source character segment corresponding to the coding field according to the coding number and the coding table corresponding to the coding field; and generating a source file according to the source character segment corresponding to each encoding field. Wherein the set of encoding tables includes one or more subsets of encoding tables; each coding table subset comprises at least one coding table, and the coding types of the coding tables are the same; each coding table in the same coding table subset corresponds to different coding table IDs respectively; the encoding type is determined based on the source segment length.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the log file processing method provided above, the method including: determining each coding field in the file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID and a coding number in a coding table; respectively aiming at each coding field, determining a coding table corresponding to the coding field in a coding table set according to the coding type and the ID of the coding table corresponding to the coding field, and determining a source character segment corresponding to the coding field according to the coding number and the coding table corresponding to the coding field; and generating a source file according to the source character segment corresponding to each encoding field. Wherein the set of encoding tables includes one or more subsets of encoding tables; each coding table subset comprises at least one coding table, and the coding types of the coding tables are the same; each coding table in the same coding table subset corresponds to different coding table IDs respectively; the encoding type is determined based on the source segment length.
The present invention provides an electronic device, which may include a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus. The processor may call logic instructions in the memory to perform a log file processing method, the method comprising: acquiring a source file to be coded, and dividing the source file to be coded into segments according to a truncation length to obtain the sub-segments, wherein the truncation length is the reference length for segmenting the source file; and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coded file, and updating the coding table set.
In addition, the logic instructions in the memory may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the log file processing method provided by the above methods, the method comprising: acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to an interception length to obtain each sub-segment, wherein the interception length is a reference length for segmenting and dividing the source file; and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor is implemented to perform the log file processing method provided above, the method including: acquiring a source file to be coded, and segmenting and dividing the source file to be coded according to an interception length to obtain each sub-segment, wherein the interception length is a reference length for segmenting and dividing the source file; and coding the source character segments in each sub-segment based on the existing coding table set to obtain a coding file, and updating the coding table set.
The apparatus embodiments described above are merely illustrative. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed across a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (11)

1. A log file processing method, characterized by comprising the following steps:
determining each coding field in a file to be decoded, wherein each coding field comprises a corresponding coding type, a coding table ID, and a coding number in a coding table;
for each coding field, determining the coding table corresponding to the coding field in a coding table set according to the coding type and the coding table ID corresponding to the coding field, and determining the source character segment corresponding to the coding field according to the coding number and the coding table corresponding to the coding field;
generating a source file according to the source character segment corresponding to each coding field;
wherein the coding table set includes one or more coding table subsets; each coding table subset comprises at least one coding table, and the coding tables in a subset have the same coding type; the coding tables in the same coding table subset each correspond to a different coding table ID; and the coding type is determined based on the length of the source character segment.
2. The log file processing method according to claim 1, wherein the lengths of the coding numbers in the coding table increase sequentially, and the source character segments are assigned to the coding numbers in order of occurrence count, from most to least frequent.
3. The log file processing method according to claim 2, wherein the maximum number of coding table subsets is equal to a preset interception length, and the interception length is the reference length used to divide the source file into segments during encoding of the source file.
4. The log file processing method according to any one of claims 1-3, wherein the method further comprises:
acquiring a source file to be encoded, and dividing the source file into sub-segments according to the interception length;
and encoding the source character segments in each sub-segment based on the existing coding table set to obtain an encoded file, and updating the coding table set.
5. The log file processing method according to claim 4, wherein the encoding of the source character segments in each sub-segment based on the existing coding table set to obtain an encoded file, and the updating of the coding table set, comprise:
judging, according to the maximum length of the sub-segment, whether the existing coding table set has a coding table subset of the corresponding coding type, to obtain a first judgment result;
if the first judgment result is yes, judging whether the coding tables in the existing coding table subset can match the sub-segment to obtain a second judgment result, and configuring a coding number for the sub-segment according to the second judgment result;
if the first judgment result is no, establishing a new coding table subset, configuring the corresponding coding type according to the maximum length of the sub-segment, establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the sub-segment.
6. The log file processing method according to claim 5, further comprising, when the first judgment result is no:
for each single character of the sub-segment, upon determining that the existing coding table set has a coding table subset of the corresponding coding type and that the coding tables in the existing coding table subset cannot match the single character, configuring a coding number for the single character in the coding table;
and for each single character of the sub-segment, upon determining that the existing coding table set does not have a coding table subset of the corresponding coding type, establishing a coding table in a new coding table subset, and configuring a coding number for the single character in that coding table.
7. The log file processing method according to claim 6, further comprising, when the first judgment result is no:
acquiring the character segments S(0, i) of the sub-segment, where i takes any value from 1 to (L-1), S(0, i) denotes the character segment formed by sequentially concatenating the 0th through i-th characters of the sub-segment, and L is the maximum length of the sub-segment;
upon determining that the existing coding table set has no coding table subset of the coding type corresponding to the character segment S(0, i), establishing a new coding table subset, configuring the corresponding coding type according to the length of the character segment S(0, i), establishing a new coding table in the new coding table subset, configuring a corresponding coding table ID, and configuring a coding number for the character segment S(0, i);
and upon determining that the existing coding table set has a coding table subset of the coding type corresponding to the character segment S(0, i), and that the character segment S(0, i) cannot be matched in the coding tables of the existing coding table subset, assigning a new coding number to the character segment S(0, i) and updating the coding table.
8. The log file processing method according to claim 5, wherein the configuring of a coding number for the sub-segment according to the second judgment result comprises:
if the sub-segment cannot be matched in the coding tables in the existing coding table subset, configuring a new coding number for the sub-segment and updating the coding table;
and if the sub-segment can be matched in the coding tables in the existing coding table subset, configuring the matched coding number for the sub-segment.
9. A log file processing method, characterized by comprising the following steps:
acquiring a source file to be encoded, and dividing the source file into sub-segments according to an interception length, wherein the interception length is the reference length used to divide the source file;
and encoding the source character segments in each sub-segment based on the existing coding table set to obtain an encoded file, and updating the coding table set.
10. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the log file processing method according to any one of claims 1 to 8 or implements the steps of the log file processing method according to claim 9 when executing the program.
11. A non-transitory computer-readable storage medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the log file processing method according to any one of claims 1 to 8, or implements the steps of the log file processing method according to claim 9.
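As an illustration of the decoding flow recited in claim 1 and the frequency ordering recited in claim 2, the lookup can be sketched as below. This is a hedged sketch, not the claimed implementation: the names `build_table` and `decode_fields`, the nested-dictionary layout, and the use of consecutive integers as coding numbers are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical layout: coding type -> coding table ID -> {coding number: source segment}
DecodeTables = Dict[int, Dict[int, Dict[int, str]]]
CodeField = Tuple[int, int, int]  # (coding type, coding table ID, coding number)


def build_table(segments: List[str]) -> Dict[int, str]:
    """Give the most frequent segments the smallest coding numbers."""
    ranked = Counter(segments).most_common()  # sorted by count, descending
    return {number: seg for number, (seg, _count) in enumerate(ranked)}


def decode_fields(fields: List[CodeField], tables: DecodeTables) -> str:
    """Look up each coding field and concatenate the recovered segments."""
    parts = []
    for coding_type, table_id, number in fields:
        table = tables[coding_type][table_id]  # KeyError if the table is unknown
        parts.append(table[number])
    return "".join(parts)


segments = ["ab", "cd", "ab", "ef", "ab", "cd"]
tables: DecodeTables = {2: {0: build_table(segments)}}
fields = [(2, 0, 0), (2, 0, 1), (2, 0, 0), (2, 0, 2), (2, 0, 0), (2, 0, 1)]
restored = decode_fields(fields, tables)
```

Because smaller integers have shorter representations, ranking segments by occurrence count means the most common segments cost the fewest characters in the encoded file, which appears to be the intent of the ordering in claim 2.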
CN202011614132.5A 2020-12-30 2020-12-30 Log file processing method, electronic equipment and storage medium Active CN112749139B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011614132.5A CN112749139B (en) 2020-12-30 2020-12-30 Log file processing method, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112749139A (en) 2021-05-04
CN112749139B (en) 2024-04-19

Family

ID=75649990

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011614132.5A Active CN112749139B (en) 2020-12-30 2020-12-30 Log file processing method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112749139B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2006126839A1 (en) * 2005-05-27 2006-11-30 Lg Electronics Inc. Method and apparatus for decoding an audio signal
CN102567293A (en) * 2010-12-13 2012-07-11 汉王科技股份有限公司 Coded format detection method and coded format detection device for text files
US20140250230A1 (en) * 2012-10-11 2014-09-04 Verizon Patent And Licensing Inc. Media manifest file generation for adaptive streaming cost management
CN104813588A (en) * 2012-10-09 2015-07-29 阿尔卡特朗讯 Secure and lossless data compression
CN106648467A (en) * 2016-12-28 2017-05-10 税友软件集团股份有限公司 Log generation method and system
CN109324996A (en) * 2018-10-12 2019-02-12 平安科技(深圳)有限公司 Journal file processing method, device, computer equipment and storage medium
CN109901978A (en) * 2017-12-08 2019-06-18 航天信息股份有限公司 A kind of Hadoop log lossless compression method and system
CN111352907A (en) * 2020-03-30 2020-06-30 见知数据科技(上海)有限公司 Method and device for analyzing pipeline file, computer equipment and storage medium

Also Published As

Publication number Publication date
CN112749139B (en) 2024-04-19

Similar Documents

Publication Publication Date Title
CN107395209B (en) Data compression method, data decompression method and equipment thereof
CN112165331A (en) Data compression method and device, data decompression method and device, storage medium and electronic equipment
US9282443B2 (en) Short message service (SMS) message segmentation
CN108322220A (en) Decoding method, device and coding/decoding apparatus
CN101984405A (en) Method of software version upgrade and terminal and system
CN104199951B (en) Web page processing method and device
CN104408100B (en) The compression method of structured web site daily record
US7605721B2 (en) Adaptive entropy coding compression output formats
US6304676B1 (en) Apparatus and method for successively refined competitive compression with redundant decompression
CN114666212B (en) Configuration data issuing method
CN106849956B (en) Compression method, decompression method, device and data processing system
US7518538B1 (en) Adaptive entropy coding compression with multi-level context escapes
CN104811209A (en) Compressed file data embedding method and device capable of resisting longest matching detection
CN116303297B (en) File compression processing method, device, equipment and medium
CN111209741A (en) Processing method and device of table data dictionary
CN112749139B (en) Log file processing method, electronic equipment and storage medium
CN116505954B (en) Huffman coding method, system, device and medium
CN110266834B (en) Area searching method and device based on internet protocol address
CN113987556B (en) Data processing method and device, electronic equipment and storage medium
CN114791904A (en) Persistent compression method and device for bloom filter
CN112100168A (en) Method and device for determining data association relationship
JP6005273B2 (en) Data stream encoding method, transmission method, transmission method, encoding device for encoding data stream, transmission device, and transmission device
CN111026748B (en) Data compression method, device and system for network access frequency management and control
CN114501011A (en) Image compression method, image decompression method and device
CN114092577A (en) Image data processing method, image data processing device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant