CN104050269A - Log compression method and device and log decompression method and device - Google Patents

Log compression method and device and log decompression method and device

Info

Publication number
CN104050269A
Authority
CN
China
Prior art keywords
record
character string
field
order
character
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410283777.3A
Other languages
Chinese (zh)
Other versions
CN104050269B (en)
Inventor
乔志刚
高亚明
顾庆荣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Supreme Being Joins Information Technology Share Co Ltd
Original Assignee
Shanghai Supreme Being Joins Information Technology Share Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Supreme Being Joins Information Technology Share Co Ltd
Priority to CN201410283777.3A
Publication of CN104050269A
Application granted
Publication of CN104050269B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention provides a log compression method and device and a log decompression method and device. The log compression method includes: reading the records in a log; storing the records of the log by field, and adding to the fields of each stored record position-order information indicating the record's position in the log; comparing the character string of each field of the records in the log with the character string in the same field of a selected reference record, and merging the character strings of that field to obtain a merged string; creating a compressed file that comprises header information; and compressing the resulting merged strings and adding them to the created compressed file one after another, in the order in which the fields appear in the record. This scheme can effectively improve the compression ratio of logs, and it is simple and efficient.

Description

Log compression method and device, and log decompression method and device
Technical field
The present invention relates to the field of data compression, and in particular to a log compression method and device and a log decompression method and device.
Background
The Internet generates big data, and as Internet technology develops, data will become a strategic resource, like energy and materials. Using data resources to drive innovation and improve efficiency is a goal pursued by many IT enterprises. The big data generated by the Internet comes mainly from in-depth analysis of Internet access logs, so collecting and storing those logs becomes critical. Log storage matters in particular: uncompressed original logs occupy too much storage space, so they must be compressed before they can be kept long term.
In the prior art there are various file compression methods that can be applied to Internet access log records. However, because these methods do not fully take the characteristics of Internet access logs into account, they suffer from a low compression ratio.
Summary of the invention
The problem solved by embodiments of the present invention is how to effectively improve the compression ratio of log files.
To solve the above problem, an embodiment of the present invention provides a log compression method, the method comprising:
reading the records in the log, each record comprising at least one field, and each field comprising a character string of at least one character;
storing the records of the log by field, and adding, to each field of a stored record, position-order information indicating the record's position in the log;
comparing the character string of each field of the records in the log with the character string in the same field of a selected reference record, and merging the character strings of that field across the records to obtain a merged string;
creating a compressed file, the compressed file comprising header information, the header information comprising identification information identifying the log compression method, the number of record lines in the log, and the number of fields in each record of the log;
compressing the resulting merged strings, and adding the compressed merged strings to the created compressed file one after another, in the order in which the fields appear in the record.
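The container layout described in these steps can be sketched as follows. This is a minimal illustration, not the patented format: the `LGC1` identifier, the little-endian header layout, whitespace-separated fields, the `\x1f` separator, and zlib as the compressor are all assumptions, and the merge step is reduced to a plain join of position-tagged values. It also assumes every record has the same number of fields.

```python
import struct
import zlib

MAGIC = b"LGC1"   # hypothetical identifier for the compression method
SEP = "\x1f"      # hypothetical separator; must not occur in field values


def compress_log(lines):
    """Store records by field, then write a header and one compressed block per field."""
    records = [line.split() for line in lines]
    n_fields = len(records[0])
    # Header: method identifier, number of records, number of fields per record.
    out = bytearray(MAGIC + struct.pack("<II", len(records), n_fields))
    for f in range(n_fields):
        # Tag each field value with the position of its record in the log.
        merged = SEP.join(f"{pos}:{rec[f]}" for pos, rec in enumerate(records))
        block = zlib.compress(merged.encode("utf-8"))
        # Blocks are appended in the order the fields appear in the record.
        out += struct.pack("<I", len(block)) + block
    return bytes(out)
```

Grouping values by field places similar strings (all IP addresses, all request methods) next to each other, which is the property a general-purpose compressor benefits from.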
Optionally, when the character strings in the field of the records of the log are in sorted order, the step of comparing the character string of each field of the records in the log with the character string in the same field of the selected reference record, and merging the character strings of that field to obtain a merged string, comprises:
traversing the character strings in the field of the records of the log;
adding, to the field of the first-position record, information indicating a repeated-character count of zero, to obtain a new character string for the field of the first-position record;
comparing the character string in the field of each non-first-position record in the log with the character string in the field of the first-position record, and obtaining and recording the number of characters repeated between the two;
removing from the field of the non-first-position record the characters it repeats from the field of the first-position record, leaving the non-repeated characters, to obtain a new character string for the field of the non-first-position record, the new character string comprising the position-order information of the non-first-position record and the number of characters repeated between its field and the field of the first-position record;
starting with the new character string of the field of the first-position record, appending the generated new character strings of the field of the non-first-position records after it in turn, with a separator placed between the new character string of the first-position record and that of a non-first-position record, and between the new character strings of successive non-first-position records, to obtain the merged string.
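The merge against the first-position record can be sketched as follows. The text does not fix an encoding, so the `position:count:rest` entry layout and the `\x1f` separator are assumptions. Treating "repeated characters" as a shared leading prefix is also an assumption, suggested by the restoration step later in this description, which puts the repeated characters back in front of the non-repeated ones.

```python
SEP = "\x1f"  # hypothetical separator; must not occur in field values


def shared_prefix_len(a, b):
    """Number of leading characters that a and b have in common."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def merge_field(values):
    """values: (position, string) pairs for one field, already in sorted order."""
    ref_pos, ref = values[0]
    # The first-position record is stored whole, with a repeated count of zero.
    entries = [f"{ref_pos}:0:{ref}"]
    for pos, s in values[1:]:
        n = shared_prefix_len(ref, s)
        # Keep only the non-repeated characters plus position and count.
        entries.append(f"{pos}:{n}:{s[n:]}")
    return SEP.join(entries)
```

For example, `merge_field([(2, "/img/a.png"), (0, "/img/b.png")])` stores the second value as `0:5:b.png`, because it shares the five characters `/img/` with the reference.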
Optionally, when the character strings in the field of the records of the log are not in sorted order, the character strings in the field are first sorted, and then the step of comparing the character string of each field of the records in the log with the character string in the same field of the selected reference record, and merging the character strings of that field to obtain a merged string, is performed.
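Sorting a field's values discards the original record order, which is why each value carries its position-order information. A minimal sketch, assuming the position is simply the record's zero-based index in the log (the function name is hypothetical):

```python
def sort_field(column):
    """Return (position, value) pairs for one field, ordered by value.

    column holds the field's value for every record, in log order.
    """
    return sorted(enumerate(column), key=lambda pair: pair[1])
```

After sorting, similar values sit next to each other, and the attached positions allow the original order to be recovered during decompression.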
Optionally, the step of comparing the character string of each field of the records in the log with the character string in the same field of the selected reference record, and merging the character strings of that field to obtain a merged string, comprises:
traversing the character strings in the field of the records of the log, to obtain the character strings of that field;
replacing the character strings in the field of the records of the log with preset strings to obtain new character strings, the number of characters in a preset string being smaller than the number of characters in the character string it replaces;
merging the resulting new character strings to obtain the merged string, with a separator placed between the new character strings in the merged string.
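The preset-string variant amounts to a dictionary substitution. The sketch below is one possible reading: the text does not say how preset strings are chosen, so the `~N` token scheme and the frequency-based assignment are assumptions, as is the `\x1f` separator. It also assumes that no real field value looks like a token.

```python
from collections import Counter

SEP = "\x1f"  # hypothetical separator; must not occur in field values


def build_presets(values):
    """Assign a short token to each distinct value, most frequent first."""
    presets = {}
    for value, _count in Counter(values).most_common():
        token = f"~{len(presets)}"
        # Only substitute when the preset string is actually shorter.
        if len(token) < len(value):
            presets[value] = token
    return presets


def merge_with_presets(values, presets):
    """Replace each value by its preset string, then join with separators."""
    return SEP.join(presets.get(v, v) for v in values)
```

The preset table itself has to be available at decompression time, for example by storing it alongside the compressed data; the source does not specify where it lives.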
A log decompression method, characterized by comprising:
obtaining and decompressing, from a compressed file, the compressed merged string of each field of the records of the log;
comparing the merged string with a reference string and restoring the merged string, to obtain the restored character strings of the field of the records of the log, each restored character string comprising the position-order information of its record in the log;
splicing together, in field order, the restored character strings of the fields that carry the same position-order information, to obtain the records;
sorting the spliced records according to their position-order information, to obtain the decompressed log.
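The last two steps, splicing by position and then sorting, can be sketched as follows. The input shape (one list of `(position, value)` pairs per field) and the single-space field delimiter are assumptions made for illustration.

```python
def rebuild_log(fields):
    """Splice restored field values into records and put them back in log order.

    fields: one list per field, in field order; each list holds
    (position, value) pairs in any order.
    """
    records = {}
    for column in fields:
        for pos, value in column:
            # Values with the same position belong to the same record.
            records.setdefault(pos, []).append(value)
    # Sorting by position restores the original order of the log.
    return [" ".join(records[pos]) for pos in sorted(records)]
```

Because the fields are visited in field order, each record's values are appended in the order they appeared in the original line.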
Optionally, comparing the merged string with the reference string and restoring the merged string, to obtain the restored character strings of the field of the records of the log, comprises:
obtaining the character string before the first separator in the merged string, which comprises the character string in the field of the first-position record after sorting, information indicating a repeated-character count of zero, and the position-order information of that record in the log;
obtaining the character string between each two adjacent separators in the merged string, which comprises the characters of the field of a non-first-position record after sorting that are not repeated from the field of the first-position record after sorting, the number of characters repeated between the two, and the position-order information in the log of that non-first-position record;
according to the number of characters repeated between the field of the first-position record after sorting and the field of the non-first-position record after sorting, adding those repeated characters before the non-repeated characters in the obtained character string between two adjacent separators, and at the same time deleting the repeated-character count, to obtain the restored character string of the field of the non-first-position record.
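The restoration steps can be sketched as the inverse of a prefix-based merge. The `position:count:rest` entry layout and the `\x1f` separator are assumptions, not something the text specifies; what the text does state is that the repeated characters are put back in front of the non-repeated ones and the count is dropped.

```python
SEP = "\x1f"  # hypothetical separator between entries


def restore_field(merged):
    """Rebuild (position, string) pairs from a merged string."""
    entries = merged.split(SEP)
    # The entry before the first separator holds the reference string whole.
    ref_pos, _zero, ref = entries[0].split(":", 2)
    restored = [(int(ref_pos), ref)]
    for entry in entries[1:]:
        pos, count, rest = entry.split(":", 2)
        # Prepend the repeated characters taken from the reference string.
        restored.append((int(pos), ref[:int(count)] + rest))
    return restored
```

Splitting with a maximum of two splits keeps any colons inside the field value itself intact.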
Optionally, comparing the merged string with the reference string and restoring the merged string, to obtain the restored character strings of the field of the records of the log, comprises:
obtaining the character string before the first separator in the merged string, which comprises the character string in the field of the first-position record after sorting, information indicating a repeated-character count of zero, and the position-order information of that record in the log;
obtaining the character string between each two adjacent separators in the merged string, which comprises the characters of the field of a non-first-position record after sorting that are not repeated from the field of the first-position record after sorting, the number of characters repeated between the two, and the position-order information in the log of that non-first-position record;
according to the number of characters repeated between the field of the first-position record after sorting and the field of the non-first-position record after sorting, adding those repeated characters before the non-repeated characters in the obtained character string between two adjacent separators, and at the same time deleting the repeated-character count, to obtain the restored character string of the field of the non-first-position record.
Optionally, comparing the merged string with the reference string and restoring the merged string, to obtain the restored character strings of the field of the records of the log, comprises:
obtaining the character string before the first separator in the merged string and the character strings between each two adjacent separators;
replacing the preset strings in the obtained character string before the first separator and in the character strings between adjacent separators with the corresponding repeated character strings, to obtain the restored character strings of the field of the records of the log, the number of characters in a preset string being smaller than the number of characters in the repeated character string it stands for.
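Restoring preset strings is a reverse lookup. The sketch below assumes the same hypothetical `\x1f` separator and a `presets` table mapping each original string to its shorter preset string; how that table reaches the decompressor is not stated in the source.

```python
SEP = "\x1f"  # hypothetical separator between entries


def restore_presets(merged, presets):
    """Expand preset strings back into the original field values.

    presets maps original string -> preset string.
    """
    reverse = {token: value for value, token in presets.items()}
    # Entries without a preset string were stored as-is and pass through.
    return [reverse.get(item, item) for item in merged.split(SEP)]
```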
An embodiment of the present invention also provides a log compression device, comprising:
a reading unit, adapted to read the records in the log, each record comprising at least one field, and each field comprising a character string of at least one character;
a storage unit, adapted to store by field the records of the log read by the reading unit, and to add, to each field of a stored record, position-order information indicating the record's position in the log;
a merging unit, adapted to compare the character string in each field of the records of the log stored in the storage unit with the character string in the same field of a selected reference record, and to merge the character strings of that field across the stored records to obtain a merged string;
a creating unit, adapted to create a compressed file, the compressed file comprising header information, the header information comprising identification information identifying the log compression method, the number of record lines in the log, and the number of fields in each record of the log;
a compression unit, adapted to compress the merged strings obtained by the merging unit, and to add the compressed merged strings to the compressed file created by the creating unit one after another, in the order in which the fields appear in the record.
Optionally, the merging unit comprises:
a first traversal subunit, adapted to traverse the character strings in the field of the records of the log, the character strings in the field being in sorted order;
a first adding subunit, adapted to add, to the field of the first-position record, information indicating a repeated-character count of zero, to obtain a new character string for the field of the first-position record;
a first comparing subunit, adapted to compare the character string in the field of each non-first-position record in the log with the character string in the field of the first-position record, and to obtain and record the number of characters repeated between the two;
a first generating subunit, adapted to remove the characters found by the first comparing subunit to be repeated between the field of the non-first-position record and the field of the first-position record, leaving the non-repeated characters, to obtain a new character string for the field of the non-first-position record, the new character string comprising the position-order information of the non-first-position record and the number of characters repeated between its field and the field of the first-position record;
a first merging subunit, adapted to start with the new character string of the field of the first-position record obtained by the first adding subunit, and to append after it in turn the new character strings of the field of the non-first-position records generated by the first generating subunit, with a separator placed between the new character string of the first-position record and that of a non-first-position record, and between the new character strings of successive non-first-position records, to obtain the merged string.
Optionally, the merging unit comprises:
a sorting subunit, adapted to sort the character strings in the field of the records of the log when those character strings are not in sorted order;
a second traversal subunit, adapted to traverse the character strings in the field of the records of the log after sorting;
a second adding subunit, adapted to add, to the field of the first-position record after sorting, information indicating a repeated-character count of zero, to obtain a new character string for the field of the first-position record after sorting;
a second comparing subunit, adapted to compare the character string in the field of each non-first-position record after sorting with the character string in the field of the first-position record after sorting, and to obtain and record the number of characters repeated between the two;
a second generating subunit, adapted to remove the characters repeated between the character string in the field of the non-first-position record after sorting and the character string in the field of the first-position record after sorting, leaving the non-repeated characters, to generate a new character string for the field of the non-first-position record after sorting, the new character string comprising the position-order information of the non-first-position record after sorting and the number of characters repeated between its field and the field of the first-position record after sorting;
a second merging subunit, adapted to append, in turn, the new character strings of the field of the non-first-position records after sorting generated by the second generating subunit after the new character string of the field of the first-position record after sorting, with a separator placed between the new character string of the first-position record and that of a non-first-position record, and between the new character strings of successive non-first-position records, to obtain the merged string.
Optionally, the merging unit comprises:
a third traversal subunit, adapted to traverse the character strings in the field of the records of the log, to obtain the repeated character strings among the character strings of that field;
a first replacing subunit, adapted to replace the character strings in the field of the records of the log with preset strings to obtain new character strings, the number of characters in a character string in the field being greater than the number of characters in the preset string that replaces it;
a third merging subunit, adapted to merge the new character strings obtained by the first replacing subunit to obtain the merged string, with a separator placed between the new character strings in the merged string.
An embodiment of the present invention provides a log decompression device, comprising:
a decompression unit, adapted to obtain and decompress, from the compressed file, the compressed merged string of each field of the records of the log;
a recovery unit, adapted to compare the merged string obtained by the decompression unit with a selected reference string and to restore the merged string, to obtain the restored character strings in the field of the records of the log, each restored character string comprising the position-order information of its record in the log;
a splicing unit, adapted to splice together, in field order, the restored character strings of the fields obtained by the recovery unit that carry the same position-order information, to obtain the records of the log;
a sorting unit, adapted to sort the records of the log spliced by the splicing unit according to the position-order information of the records, to obtain the decompressed log.
Optionally, the recovery unit comprises:
a first obtaining subunit, adapted to obtain the character string before the first separator in the merged string, which comprises the character string in the field of the first-position record, information indicating a repeated-character count of zero, and the position-order information of that record in the log;
a second obtaining subunit, adapted to obtain the character string between each two adjacent separators in the merged string, which comprises the characters of the field of a non-first-position record that are not repeated from the field of the first-position record, the number of characters repeated between the field of the non-first-position record and the field of the first-position record, and the position-order information in the log of that non-first-position record;
a first recovering subunit, adapted to, according to the character string before the first separator obtained by the first obtaining subunit and the repeated-character count obtained by the second obtaining subunit, add the characters repeated between the field of the non-first-position record and the field of the first-position record before the non-repeated characters in the character string between two separators, and at the same time delete the repeated-character count, to obtain the restored character string of the field of the non-first-position record.
Optionally, the recovery unit comprises:
a third obtaining subunit, adapted to obtain the character string before the first separator in the merged string, which comprises the character string in the field of the first-position record after sorting, information indicating a repeated-character count of zero, and the position-order information of that record in the log;
a fourth obtaining subunit, adapted to obtain the character string between each two adjacent separators in the merged string, which comprises the characters of the field of a non-first-position record after sorting that are not repeated from the field of the first-position record after sorting, the number of characters repeated between the field of the non-first-position record after sorting and the field of the first-position record after sorting, and the position-order information in the log of that non-first-position record;
a second recovering subunit, adapted to, according to the character string before the first separator obtained by the third obtaining subunit and the repeated-character count obtained by the fourth obtaining subunit, add the characters repeated between the field of the non-first-position record after sorting and the field of the first-position record after sorting before the non-repeated characters in the character string between two separators, and at the same time delete the repeated-character count, to obtain the restored character string in the field of the non-first-position record after sorting.
Optionally, the recovery unit comprises:
a fifth obtaining subunit, adapted to obtain the character string before the first separator in the merged string and the character strings between each two adjacent separators;
a second replacing subunit, adapted to replace the preset strings in the character string before the first separator and in the character strings between adjacent separators obtained by the fifth obtaining subunit with the corresponding character strings of the field of the records of the log, to obtain the restored character strings of the field of the records of the log.
Compared with the prior art, the technical solution of the present invention has the following advantages:
In the above technical solution, the strings of each field of the log records are merged accordingly, which can significantly reduce the size of each field's strings; the merged strings of each field are then added to the compressed file, which can effectively improve the compression ratio of the log in a simple and efficient manner.
Brief Description of the Drawings
Fig. 1 is a flowchart of a log compression method in an embodiment of the present invention;
Fig. 2 is a flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of a selected reference record to obtain a merged string;
Fig. 3 is another flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of a selected reference record to obtain a merged string;
Fig. 4 is yet another flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of a selected reference record to obtain a merged string;
Fig. 5 is a flowchart of a log decompression method in an embodiment of the present invention;
Fig. 6 is a flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log;
Fig. 7 is another flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log;
Fig. 8 is yet another flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log;
Fig. 9 is a schematic structural diagram of a log compression device in an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a merging unit in an embodiment of the present invention;
Fig. 11 is a schematic structural diagram of another merging unit in an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of yet another merging unit in an embodiment of the present invention;
Fig. 13 is a schematic structural diagram of a log decompression device in an embodiment of the present invention;
Fig. 14 is a schematic structural diagram of a recovery unit in an embodiment of the present invention;
Fig. 15 is a schematic structural diagram of another recovery unit in an embodiment of the present invention;
Fig. 16 is a schematic structural diagram of yet another recovery unit in an embodiment of the present invention.
Detailed Description of the Embodiments
An internet access log consists of countless user access records. Each record is composed of relatively fixed fields, the same field in different records tends to be long, and many characters repeat across the values of the same field. Examples include the Internet Protocol (IP) address, the Uniform Resource Locator (URL), and the cookie (data that a website stores on the user's local terminal in order to identify the user and track sessions).
File compression methods in the prior art do not take these characteristics of internet logs into account and therefore achieve a low compression ratio on logs.
To solve the above problem in the prior art, the technical solution adopted by the embodiments of the present invention merges the strings that are stored together by field, which reduces the number of characters in the merged string and thus effectively improves the compression ratio of internet logs.
To make the above objects, features and advantages of the present invention more apparent, specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a log compression method in an embodiment of the present invention. The log compression method shown in Fig. 1 comprises:
Step S11: read the records in the log.
In a specific implementation, each record in the log may comprise one or more fields, and each field may in turn comprise a string formed of one or more characters.
In a specific implementation, the log may be an internet access log. An internet access log records users' visits to the relevant website; by analyzing the records in the log, one can learn users' habits in visiting the website, whether the website is favored by search engines, and so on.
Step S12: store the records of the log by field, and add to each stored field the position-order information of the record in the log.
In a specific implementation, each record in the log has relatively fixed fields, and the strings in the same field of different records usually exhibit regularity; for example, the strings of the same field share repeated characters. The records of the log can therefore be stored by field, that is, the same field of every record in the log is stored together, so that the strings in that field can be merged.
In a specific implementation, so that the merged fields of the log records can be restored during decompression, the position-order information of each record in the log can be added to the stored fields when the records are stored by field. With the position-order information, the fields of the same record can be spliced back together.
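By way of illustration only, the field-wise storage of step S12 might be sketched as follows. The function name `store_by_field`, the dictionary layout, the 1-based position numbering and the sample records are assumptions made for this sketch and do not appear in the embodiment.

```python
from typing import Dict, List, Tuple


def store_by_field(records: List[List[str]]) -> Dict[int, List[Tuple[int, str]]]:
    """Group the values of each field together, tagging each value with the
    position of its record in the log (hypothetical layout)."""
    columns: Dict[int, List[Tuple[int, str]]] = {}
    for position, record in enumerate(records, start=1):
        for field_index, value in enumerate(record):
            # Each stored value carries the position-order information
            # needed to splice the record back together later.
            columns.setdefault(field_index, []).append((position, value))
    return columns


# Hypothetical sample log: timestamp, IP address, URL.
log = [
    ["2014-04-28 13:52:23", "10.0.0.1", "/index.html"],
    ["2014-04-28 13:53:31", "10.0.0.2", "/about.html"],
]
columns = store_by_field(log)
```

Here `columns[0]` holds every timestamp together with its record position, so that field can be merged on its own.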
Step S13: merge the strings of the field of the records in the log by comparing each of them with the string in the same field of a selected reference record, to obtain a merged string.
In a specific implementation, according to the characteristics of the strings of each field, the string in the same field of each record can be compared with that of the selected reference record and merged according to the comparison result, so as to reduce the number of characters in the log records. Compressing the log after the number of characters has been reduced effectively reduces the size of the compressed log, which improves the compression ratio and saves storage space.
Step S14: create a compressed file, the compressed file comprising a file header.
In a specific implementation, the header may comprise identification information of the log, the number of lines of records in the log, and the number of fields contained in each record of the log. The identification information is used to identify the log compression method.
Step S15: compress the resulting merged strings, and add the compressed merged strings to the created compressed file one after another, in the order in which the fields appear in the record.
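Steps S14 and S15 might be sketched as below. The embodiment names neither a compressor nor a byte layout, so the use of `zlib`, the identifier bytes `LOGC`, the little-endian 32-bit counters and the per-field length prefix are all assumptions made for this sketch.

```python
import struct
import zlib
from typing import List

MAGIC = b"LOGC"  # hypothetical identification bytes; not specified in the embodiment


def build_compressed_file(merged_fields: List[str], record_count: int) -> bytes:
    """Emit a header (identifier, record count, field count) followed by each
    compressed merged string, in the order of the fields within a record."""
    header = MAGIC + struct.pack("<II", record_count, len(merged_fields))
    body = bytearray()
    for merged in merged_fields:
        payload = zlib.compress(merged.encode("utf-8"))
        # The 4-byte length prefix is an assumption so a reader can find
        # where each field's compressed data ends.
        body += struct.pack("<I", len(payload)) + payload
    return header + bytes(body)


# Two hypothetical merged strings, one per field.
blob = build_compressed_file(["1,0,a\t2,1,b", "1,0,x\t2,0,y"], record_count=2)
```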
In a specific implementation, the strings in a field of the log records can be merged according to the characteristics of the strings in the different fields of the log records.
Fig. 2 shows a flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of the selected reference record to obtain a merged string. The specific steps are described in detail below:
Step S21: traverse the strings in the field of the records of the log.
In a specific implementation, because the strings in the field of the records of the log are already ordered, traversing them allows each string to be compared with the string in the same field of the selected reference record and processed accordingly.
Step S22: add to the field of the first record information indicating a repeated-character count of zero, to obtain the new string in the field of the first record.
In a specific implementation, the strings in the same field of the records of the log are ordered, for example chronologically. The string in the field of the first record can therefore be chosen as the reference string, and the strings in the field of the other, non-first records are compared with it.
In a specific implementation, because the strings in the field of the non-first records are compared with the field of the first record, a repeated-character count of zero can be added to the field of the first record so that every separator-delimited string in the merged string has the same structure; this yields the new string in the field of the first record.
Step S23: compare the string in the field of each non-first record in the log with the string in the field of the first record, and obtain and record the number of repeated characters between the two.
In a specific implementation, comparing the field of every non-first record in the log with the field of the first record one by one yields the repeated characters shared by the two and their count.
Step S24: remove the repeated characters between the field of the non-first record and the field of the first record, leaving the non-repeated characters, to obtain the new string of the field of the non-first record; the new string comprises the position-order information of the non-first record and the number of repeated characters between the field of the non-first record and the field of the first record.
In a specific implementation, the string in the field of every non-first record in the log is compared character by character with the string in the field of the first record; the repeated characters shared by the two are deleted, leaving the non-repeated characters, and the number of repeated characters shared by the two is added before the non-repeated characters, which yields the new string in the field of the non-first record. Because the new string in the field of a non-first record no longer retains the characters it shares with the field of the first record, the number of characters in the field of the non-first records is reduced.
Step S25: append the generated new strings of the field of the non-first records, in order, after the new string of the field of the first record, with a separator placed between the new string of the first record and the new string of a non-first record and between the new strings of the non-first records, to obtain the merged string.
In a specific implementation, the separator can be set as needed. For example, "\t" can be used as the separator to delimit the new strings in the merged string.
The merging method shown in Fig. 2 can be used for fields of the log records whose strings are ordered. For example, for the timestamp field of the log records, the string sequence "2014-04-2813:52:23", "2014-04-2813:53:31", "2014-04-2814:00:09", "2014-04-2814:03:06" can be merged into "01002014-04-2813:52:23\t02141:31\t031114:00:09\t04114:03:06".
As can be seen, the merged string of the timestamp field obtained by the merging shown in Fig. 2 contains noticeably fewer characters than the strings of the timestamp field before merging.
In a specific implementation, when the strings of the same field of the records of the log are ordered, the strings of that field can be merged in the manner shown in Fig. 2.
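Steps S21 to S25 might be sketched as follows, on the assumption that the "repeated characters" are the leading characters a string shares with the first record's string. The entry layout `position,count,remainder` with a comma between the parts is a hypothetical encoding chosen for readability; it is not the fixed-width layout of the example above, and it assumes that field values contain no tab character.

```python
from typing import List

SEPARATOR = "\t"  # separator between entries in the merged string


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def merge_ordered(values: List[str]) -> str:
    """Merge one field's already-ordered strings against the first record."""
    if not values:
        return ""
    base = values[0]
    # Step S22: the first record carries a repeated-character count of zero.
    entries = [f"1,0,{base}"]
    for position, value in enumerate(values[1:], start=2):
        repeated = common_prefix_len(base, value)              # step S23
        entries.append(f"{position},{repeated},{value[repeated:]}")  # step S24
    return SEPARATOR.join(entries)                             # step S25


timestamps = [
    "2014-04-2813:52:23",
    "2014-04-2813:53:31",
    "2014-04-2814:00:09",
    "2014-04-2814:03:06",
]
merged = merge_ordered(timestamps)
```

With this hypothetical encoding the four timestamps, 72 characters in total, become a 58-character merged string.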
Fig. 3 shows another flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of the selected reference record to obtain a merged string. The specific steps are described in detail below:
Step S31: sort the strings in the field of the records of the log, the strings in the field of the records of the log being unordered.
Step S32: traverse the strings in the field of the records of the log.
Step S33: add to the field of the sorted first record information indicating a repeated-character count of zero, to obtain the new string in the field of the sorted first record.
Step S34: compare the string in the field of each sorted non-first record in the log with the string in the field of the sorted first record, and obtain and record the number of repeated characters between the two.
Step S35: remove the repeated characters between the field of the sorted non-first record and the field of the sorted first record, leaving the non-repeated characters, to obtain the new string of the field of the sorted non-first record; the new string comprises the position-order information of the sorted non-first record and the number of repeated characters between the field of the sorted non-first record and the field of the sorted first record.
Step S36: append the generated new strings of the field of the sorted non-first records, in order, after the new string of the field of the sorted first record, with a separator placed between the new string of the sorted first record and the new string of a sorted non-first record and between the new strings of the sorted non-first records, to obtain the merged string.
Unlike Fig. 2, the strings in the field of the log records handled by the method of Fig. 3 are unordered. Step S31 can therefore be performed first: sort the strings in the field of the records of the log. The string in the field of the sorted first record then serves as the reference string, and the strings in the field of the sorted non-first records are compared with it one by one and merged.
In a specific implementation, the method shown in Fig. 3 can be used to merge fields of the log records that hold many strings which partially repeat, for example the Internet Protocol (IP) field and the Uniform Resource Locator (URL) field.
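The sort-then-merge variant of steps S31 to S36 might be sketched as below. As before, the `position,count,remainder` entry layout is a hypothetical encoding, lexicographic sorting is an assumed choice of order, and each entry keeps the record's original position in the log so that the order can be recovered after sorting.

```python
from typing import List

SEPARATOR = "\t"


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters shared by a and b."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def merge_unordered(values: List[str]) -> str:
    """Sort one field's strings, then merge them against the sorted first one."""
    if not values:
        return ""
    # Step S31: sort, remembering each value's original position in the log.
    indexed = sorted(enumerate(values, start=1), key=lambda pair: pair[1])
    base_position, base = indexed[0]
    entries = [f"{base_position},0,{base}"]                    # step S33
    for position, value in indexed[1:]:
        repeated = common_prefix_len(base, value)              # step S34
        entries.append(f"{position},{repeated},{value[repeated:]}")  # step S35
    return SEPARATOR.join(entries)                             # step S36


# Hypothetical IP field values, in log order.
ips = ["192.168.1.20", "192.168.1.3", "192.168.1.12"]
merged = merge_unordered(ips)
```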
Fig. 4 shows yet another flowchart, in an embodiment of the present invention, of merging the strings of the field of the records in the log by comparing each of them with the string in the same field of the selected reference record to obtain a merged string. The specific steps are described in detail below:
Step S41: traverse the strings in the field of the records of the log, to obtain the strings of the field in the records of the log.
In a specific implementation, the strings in the field of the records of the log are identical.
Step S42: replace the strings in the field of the records of the log with a preset string, to obtain new strings.
In a specific implementation, the preset string can have far fewer characters than the string in the field of the records of the log, so the number of characters in the merged string can be effectively reduced.
Step S43: merge the resulting new strings to obtain a merged string, a separator being placed between the new strings in the merged string.
The merging method shown in Fig. 4 is suited to fields of the log records in which the string is identical in every record of the log and contains many characters. For example, the agent field of picture access log records contains many occurrences of "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/6.0; .NET4.0E; .NET4.0C; InfoPath.2)". Because such a field takes few distinct values and its strings are identical, a correspondence can be established between the string in the field and a preset string; when the field is merged, the string in the field of each record in the log is treated as a whole and replaced by the preset string. Since the preset string can have far fewer characters than the corresponding string in the field of the records of the log, the number of characters in the merged string can be greatly reduced, which in turn improves the compression ratio.
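The preset-string replacement of steps S41 to S43, together with its inverse, might be sketched as follows. The preset table, the one-letter preset "A" and both function names are assumptions made for this sketch; the embodiment does not say how the correspondence is stored.

```python
from typing import Dict, List

SEPARATOR = "\t"


def merge_with_presets(values: List[str], presets: Dict[str, str]) -> str:
    """Replace each field string, taken as a whole, with its short preset string."""
    return SEPARATOR.join(presets[value] for value in values)


def restore_from_presets(merged: str, presets: Dict[str, str]) -> List[str]:
    """Inverse operation: map each preset string back to the original string."""
    if not merged:
        return []
    originals = {short: full for full, short in presets.items()}
    return [originals[token] for token in merged.split(SEPARATOR)]


AGENT = ("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; "
         "Trident/6.0; .NET4.0E; .NET4.0C; InfoPath.2)")
presets = {AGENT: "A"}  # hypothetical correspondence table
agents = [AGENT, AGENT, AGENT]
merged = merge_with_presets(agents, presets)
restored = restore_from_presets(merged, presets)
```

Three copies of the agent string collapse to three one-character presets joined by separators, and the original strings come back through the same table.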
Fig. 5 shows a flowchart of a log decompression method in an embodiment of the present invention. The log decompression method shown in Fig. 5 may comprise:
Step S51: obtain and decompress the compressed merged strings of the fields of the records of the log in the compressed file.
In a specific implementation, decompressing the compressed file yields the merged string of each field contained in the records of the log.
Step S52: restore the merged string by comparing it with the reference string, to obtain the restored strings of the field of the records of the log, the restored strings comprising the position-order information of the records in the log.
In a specific implementation, because the merged string was obtained by comparing the strings in the field of the log records with the selected reference string, the strings in the field of the records of the log can be recovered by comparing the merged string with the reference string.
Step S53: splice together, in order, the restored strings of the fields that have the same position-order information, to obtain the record.
In a specific implementation, the restored strings of the fields of the log records include the position-order information of the records in the log; splicing the strings of the fields that have the same position-order information together, in the order of the fields within the record, yields the record.
Step S54: sort the spliced records according to their position-order information, to obtain the decompressed log.
In a specific implementation, the position-order information of a record indicates the position of the record in the log; sorting the spliced records according to their position-order information restores each record to its position in the log and yields the decompressed log.
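Steps S53 and S54 might be sketched as below. The input layout, a dictionary from field index to `(position, value)` pairs, is a hypothetical representation of the restored fields; the pairs of one field may arrive in a different order from those of another, for example when a field was sorted before merging.

```python
from typing import Dict, List, Tuple


def splice_records(columns: Dict[int, List[Tuple[int, str]]]) -> List[List[str]]:
    """Rebuild the log from restored per-field values and their positions."""
    records: Dict[int, List[str]] = {}
    # Step S53: visit the fields in record order and attach each value to the
    # record that has the same position-order information.
    for field_index in sorted(columns):
        for position, value in columns[field_index]:
            records.setdefault(position, []).append(value)
    # Step S54: sort the spliced records by their position in the log.
    return [records[position] for position in sorted(records)]


# Hypothetical restored fields; field 1 arrives in sorted (not log) order.
restored_columns = {
    0: [(1, "2014-04-28 13:52:23"), (2, "2014-04-28 13:53:31")],
    1: [(2, "10.0.0.2"), (1, "10.0.0.9")],
}
log = splice_records(restored_columns)
```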
Fig. 6 shows a flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log. The specific steps are described below:
Step S61: obtain the string before the first separator in the merged string, the string before the first separator comprising the string in the field of the first record, a repeated-character count of zero, and the position-order information of the record in the log.
Step S62: obtain the string between two adjacent separators in the merged string, the string between two adjacent separators comprising the non-repeated characters between the field of a non-first record and the field of the first record, the number of repeated characters between the two, and the position-order information of the non-first record in the log.
Step S63: according to the number of repeated characters between the field of the first record and the field of the non-first record, add the repeated characters between the field of the non-first record and the field of the first record before the non-repeated characters in the obtained string between the two adjacent separators, and at the same time delete the number of repeated characters between the field of the non-first record and the field of the first record, to obtain the restored string of the field of the non-first record.
In a specific implementation, the recovery method shown in Fig. 6 for the strings in the field of the records of the log is the inverse of the merging method shown in Fig. 2, and can be understood with reference to the foregoing description of the merging method shown in Fig. 2.
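Steps S61 to S63 might be sketched as follows. The sketch assumes a hypothetical entry layout of `position,count,remainder` with tab-separated entries, which is not the fixed-width layout of the timestamp example in the description, and it assumes that the repeated characters are a shared leading prefix of the first record's string.

```python
from typing import List, Tuple

SEPARATOR = "\t"


def restore_field(merged: str) -> List[Tuple[int, str]]:
    """Rebuild (position, string) pairs of one field from its merged string."""
    if not merged:
        return []
    entries = merged.split(SEPARATOR)
    # Step S61: the entry before the first separator holds the reference string.
    position, _zero, base = entries[0].split(",", 2)
    restored = [(int(position), base)]
    for entry in entries[1:]:
        # Step S62: position, repeated-character count, non-repeated characters.
        position, repeated, remainder = entry.split(",", 2)
        # Step S63: put the repeated characters back in front of the remainder
        # and drop the count.
        restored.append((int(position), base[:int(repeated)] + remainder))
    return restored


merged = "1,0,2014-04-2813:52:23\t2,14,3:31\t3,11,4:00:09"
restored = restore_field(merged)
```

Because every entry carries its own position, the same routine would also serve a field that was sorted before merging, as in the method of Fig. 3.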
Fig. 7 shows another flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log. The specific steps are described below:
Step S71: obtain the string before the first separator in the merged string, the string before the first separator comprising the string in the field of the sorted first record, a repeated-character count of zero, and the position-order information of the record in the log.
Step S72: obtain the string between two adjacent separators in the merged string, the string between two adjacent separators comprising the non-repeated characters between the field of a sorted non-first record and the field of the sorted first record, the number of repeated characters between the two, and the position-order information of the sorted non-first record in the log.
Step S73: according to the number of repeated characters between the field of the sorted first record and the field of the sorted non-first record, add the repeated characters between the field of the sorted non-first record and the field of the sorted first record before the non-repeated characters in the obtained string between the two adjacent separators, and at the same time delete the number of repeated characters between the field of the sorted non-first record and the field of the sorted first record, to obtain the restored string of the field of the sorted non-first record.
In a specific implementation, the recovery method shown in Fig. 7 for the strings in the field of the records of the log is the inverse of the merging method shown in Fig. 3, and can be understood with reference to the foregoing description of the merging method shown in Fig. 3.
Fig. 8 shows yet another flowchart, in an embodiment of the present invention, of restoring the merged string to obtain the restored strings of the field of the records of the log. The specific steps are described below:
Step S81: obtain the string before the first separator and the strings between adjacent separators in the merged string.
Step S82: replace the preset strings in the string before the first separator and in the strings between adjacent separators of the obtained merged string with the corresponding strings in the field of the records of the log, to obtain the restored strings of the field of the records of the log, the preset string having fewer characters than the string in the field of the records of the log.
In a specific implementation, the recovery method shown in Fig. 8 for the strings in the field of the records of the log is the inverse of the merging method shown in Fig. 4, and can be understood with reference to the foregoing description of the merging method shown in Fig. 4.
Fig. 9 is a schematic structural diagram of a log compression device in an embodiment of the present invention. As shown in Fig. 9, the log compression device 90 may comprise a reading unit 91, a storage unit 92, a merging unit 93, a creating unit 94 and a compression unit 95. The reading unit 91, the storage unit 92 and the merging unit 93 are connected in sequence, and the compression unit 95 is connected to the merging unit 93 and to the creating unit 94. Specifically:
The reading unit 91 is adapted to read the records in the log, each record comprising at least one field and each field comprising a string of at least one character.
The storage unit 92 is adapted to store the log records read by the reading unit 91 field by field, and to add to each field of a stored record the sequence position information of that record in the log.
The merging unit 93 is adapted to merge the strings of a field of the log records stored in the storage unit 92, by comparing each of those strings with the string in the same field of a selected reference record, to obtain a merged string.
The creating unit 94 is adapted to create a compressed file. The compressed file comprises a file header, and the header comprises identification information identifying the log compression method, the number of rows of log records, and the number of fields in each log record.
The compression unit 95 is adapted to compress the merged strings obtained by the merging unit 93, and to add the compressed merged strings to the compressed file created by the creating unit 94 one after another, in the order in which the fields appear in the record.
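The cooperation of the creating unit 94 and the compression unit 95 might look roughly like the following Python sketch. The byte layout of the header, the identifier `LGC1`, the length prefix on each block, and the use of zlib as the compression algorithm are all assumptions chosen for illustration; the embodiment only states which items the header contains.

```python
import struct
import zlib

METHOD_ID = b"LGC1"       # hypothetical identifier of the compression method
HEADER_FORMAT = "<4sII"   # assumed layout: method id, row count, field count


def build_compressed_file(merged_fields: list[str], row_count: int) -> bytes:
    """Write the header, then one compressed block per field, in field order."""
    chunks = [struct.pack(HEADER_FORMAT, METHOD_ID, row_count, len(merged_fields))]
    for merged in merged_fields:
        block = zlib.compress(merged.encode("utf-8"))
        # Assumed framing: each block is preceded by its length in bytes.
        chunks.append(struct.pack("<I", len(block)))
        chunks.append(block)
    return b"".join(chunks)


def read_header(data: bytes) -> tuple[bytes, int, int]:
    """Return (method id, row count, field count) from the start of the file."""
    return struct.unpack_from(HEADER_FORMAT, data, 0)
```

Storing the row and field counts in the header lets a reader check that it has recovered the expected number of records and fields before it reassembles the log.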
Fig. 10 is a schematic structural diagram of a merging unit in an embodiment of the present invention. As shown in Fig. 10, the merging unit 100 may comprise a first traversal subunit 101, a first adding subunit 102, a first comparison subunit 103, a first generation subunit 104 and a first merging subunit 105. The first traversal subunit 101 is connected to the first adding subunit 102 and to the first comparison subunit 103; the first comparison subunit 103, the first generation subunit 104 and the first merging subunit 105 are connected in sequence; and the first adding subunit 102 is also connected to the first merging subunit 105. Specifically:
The first traversal subunit 101 is adapted to traverse the strings in the field of the log records. In a specific implementation, the strings in the field of the log records are arranged in order.
The first adding subunit 102 is adapted to add, to the field of the first-ranked record, information indicating a repeated-character count of zero, to obtain the new string of the field of the first-ranked record.
The first comparison subunit 103 is adapted to compare the string in the field of each non-first-ranked record in the log with the string in the field of the first-ranked record, and to obtain and record the number of characters repeated between the two.
The first generation subunit 104 is adapted to remove, from the field of each non-first-ranked record, the characters that the first comparison subunit 103 found to be repeated in the field of the first-ranked record, keeping the non-repeated characters, to obtain the new string of the field of the non-first-ranked record. In a specific implementation, the new string of the field of a non-first-ranked record comprises the sequence position information of that record and the number of characters repeated between its field and the field of the first-ranked record.
The first merging subunit 105 is adapted to append the new strings generated by the first generation subunit 104 for the field of the non-first-ranked records, one after another, to the new string obtained by the first adding subunit 102 for the field of the first-ranked record, with a separator placed between the new string of the first-ranked record and the new string of a non-first-ranked record and between adjacent new strings of non-first-ranked records, to obtain the merged string.
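A minimal Python sketch of the merging performed by subunits 101 to 105 follows. It assumes that the "repeated characters" are the leading characters a string shares with the reference string (consistent with the restoration, which places them in front of the non-repeated characters), and it assumes a `position:count:remainder` layout and a separator character that the embodiment does not specify.

```python
SEP = "\x1f"  # assumed separator; taken never to occur inside a field value


def common_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def merge_field(values: list[tuple[int, str]]) -> str:
    """Merge the (position, string) pairs of one field into a single string.

    The first pair is the reference (first-ranked) record.
    """
    ref_pos, ref = values[0]
    # First-ranked record: repeated-character count of zero, full string kept.
    parts = [f"{ref_pos}:0:{ref}"]
    for pos, text in values[1:]:
        k = common_prefix_len(text, ref)
        # Non-first-ranked record: position, repeat count, non-repeated characters.
        parts.append(f"{pos}:{k}:{text[k:]}")
    return SEP.join(parts)
```

For example, with the reference string `GET /index.html`, the string `GET /index.php` is reduced to `1:11:php`, because its first 11 characters repeat those of the reference.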
Fig. 11 is a schematic structural diagram of another merging unit in an embodiment of the present invention. As shown in Fig. 11, the merging unit 110 may comprise a sorting subunit 111, a second traversal subunit 112, a second adding subunit 113, a second comparison subunit 114, a second generation subunit 115 and a second merging subunit 116. The second traversal subunit 112 is connected to the sorting subunit 111, the second adding subunit 113 and the second comparison subunit 114; the second generation subunit 115 is connected to the second comparison subunit 114 and to the second merging subunit 116; and the second adding subunit 113 is also connected to the second merging subunit 116. Specifically:
The sorting subunit 111 is adapted to arrange the strings in the field of the log records in order when those strings are unordered.
The second traversal subunit 112 is adapted to traverse the strings in the field of the log records after they have been sorted by the sorting subunit 111.
The second adding subunit 113 is adapted to add, to the field of the record ranked first after sorting by the sorting subunit 111, information indicating a repeated-character count of zero, to obtain the new string of the field of the first-ranked record after sorting.
The second comparison subunit 114 is adapted to compare the string in the field of each non-first-ranked record after sorting with the string in the field of the first-ranked record after sorting, and to obtain and record the number of characters repeated between the two.
The second generation subunit 115 is adapted to remove, from the field of each non-first-ranked record after sorting, the characters that the second comparison subunit 114 found to be repeated in the field of the first-ranked record after sorting, keeping the non-repeated characters, to generate the new string of the field of the non-first-ranked record after sorting. That new string comprises the sequence position information of the non-first-ranked record and the number of repeated characters, obtained by the second comparison subunit 114, between its field and the field of the first-ranked record after sorting.
The second merging subunit 116 is adapted to append the new strings generated by the second generation subunit 115 for the field of the non-first-ranked records after sorting, one after another, to the new string of the field of the first-ranked record after sorting, with a separator placed between the new string of the first-ranked record and the new string of a non-first-ranked record and between adjacent new strings of non-first-ranked records, to obtain the merged string.
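The sorting step is what makes the sequence position information necessary: once the strings of a field are reordered, the original order can only be recovered from the positions stored with them. A short Python sketch, with hypothetical function names and ordinary lexicographic ordering assumed as the sort order:

```python
def sort_with_positions(column: list[str]) -> list[tuple[int, str]]:
    """Tag each string with its position in the log, then sort by string value."""
    return sorted(enumerate(column), key=lambda pair: pair[1])


def original_order(tagged: list[tuple[int, str]]) -> list[str]:
    """Undo the sort by ordering on the stored positions."""
    return [text for _, text in sorted(tagged)]
```

After sorting, strings with long shared beginnings tend to end up next to the reference string, which is presumably why the embodiment sorts before comparing.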
Fig. 12 is a schematic structural diagram of yet another merging unit in an embodiment of the present invention. As shown in Fig. 12, the merging unit 120 may comprise a third traversal subunit 121, a first replacement subunit 122 and a third merging subunit 123, connected in sequence. Specifically:
The third traversal subunit 121 is adapted to traverse the strings in the field of the log records and to obtain the strings that are repeated among them.
The first replacement subunit 122 is adapted to replace the strings in the field of the log records with preset strings to obtain new strings, the number of characters in a string in the field being greater than the number of characters in its preset string.
The third merging subunit 123 is adapted to merge the new strings obtained by the first replacement subunit 122 into a merged string, in which a separator is placed between adjacent new strings.
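The substitution-based merging of subunits 121 to 123 can be sketched in Python as follows. The `#n` form of the preset strings, the separator character and the function name are assumptions for illustration, and field values are assumed never to look like a preset string themselves.

```python
from collections import Counter

SEP = "\x1f"  # assumed separator; taken never to occur inside a field value


def merge_by_substitution(column: list[str]) -> tuple[str, dict[str, str]]:
    """Replace repeated strings with shorter preset strings and merge the result.

    Returns the merged string and the one-to-one table needed for restoration.
    """
    table: dict[str, str] = {}
    # Traverse the field and find the strings that occur more than once.
    for value, count in Counter(column).most_common():
        preset = f"#{len(table)}"  # hypothetical preset string
        # Substitute only when the preset string is actually shorter.
        if count > 1 and len(preset) < len(value):
            table[value] = preset
    merged = SEP.join(table.get(value, value) for value in column)
    return merged, table
```

The table has to be kept alongside the merged string, since the restoration depends on the one-to-one relation between each original string and its preset string.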
Fig. 13 is a schematic structural diagram of a log decompression device in an embodiment of the present invention. As shown in Fig. 13, the log decompression device 130 may comprise a decompression unit 131, a recovery unit 132, a splicing unit 133 and a sorting unit 134, connected in sequence. Specifically:
The decompression unit 131 is adapted to obtain and decompress the compressed merged strings of the fields of the log records contained in the compressed file.
The recovery unit 132 is adapted to restore the merged string obtained by the decompression unit 131, by comparing it with a selected reference string, to obtain the restored strings of the field of the log records; each restored string includes the sequence position information of its record in the log.
The splicing unit 133 is adapted to splice together, in field order, the restored field strings obtained by the recovery unit 132 that carry the same sequence position information, to obtain the log records.
The sorting unit 134 is adapted to sort the log records obtained by the splicing unit 133 according to their sequence position information, to obtain the decompressed log.
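The work of the splicing unit 133 and the sorting unit 134 can be illustrated with a short Python sketch. The representation of each restored field as a list of `(position, string)` pairs and the single-space delimiter between fields are assumptions; the embodiment does not state how fields are delimited within a record.

```python
def splice_and_sort(fields: list[list[tuple[int, str]]], delimiter: str = " ") -> list[str]:
    """Rebuild log records from restored fields.

    `fields` holds one list per field, in the order the fields appear in a
    record; each entry pairs a record's sequence position with its string.
    """
    rows: dict[int, list[str]] = {}
    for column in fields:
        for pos, value in column:
            # Strings carrying the same position belong to the same record.
            rows.setdefault(pos, []).append(value)
    # Emit the records in their original log order.
    return [delimiter.join(rows[pos]) for pos in sorted(rows)]
```

Because every restored string carries its own position, the fields may arrive in different orders (for example, one sorted and one not) and the records are still reassembled correctly.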
Fig. 14 is a schematic structural diagram of a recovery unit in an embodiment of the present invention. As shown in Fig. 14, the recovery unit 140 may comprise a first obtaining subunit 141, a second obtaining subunit 142 and a first recovery subunit 143, the first recovery subunit 143 being connected to the first obtaining subunit 141 and to the second obtaining subunit 142. Specifically:
The first obtaining subunit 141 is adapted to obtain the string before the first separator in the merged string; this string comprises the string in the field of the first-ranked record, the information indicating a repeated-character count of zero, and the sequence position information of that record in the log.
The second obtaining subunit 142 is adapted to obtain the strings between each pair of adjacent separators in the merged string; each such string comprises the characters of the field of a non-first-ranked record that are not repeated in the field of the first-ranked record, the number of characters repeated between the field of that non-first-ranked record and the field of the first-ranked record, and the sequence position information of that non-first-ranked record in the log.
The first recovery subunit 143 is adapted to use the string before the first separator obtained by the first obtaining subunit 141, together with the repeated-character count obtained by the second obtaining subunit 142 for the field of a non-first-ranked record and the field of the first-ranked record, to add the characters repeated between those two fields in front of the non-repeated characters in the string between the two adjacent separators, and at the same time to delete the repeated-character count, thereby obtaining the restored string of the field of the non-first-ranked record.
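The restoration carried out by subunits 141 to 143 can be sketched in Python under the same kind of assumptions as before: an assumed `position:count:remainder` layout for each piece, an assumed separator character, and repeated characters taken to be the leading characters shared with the first-ranked record's string.

```python
SEP = "\x1f"  # assumed separator; taken never to occur inside a field value


def restore_field(merged: str) -> list[tuple[int, str]]:
    """Restore the (position, string) pairs of one field from a merged string."""
    pieces = merged.split(SEP)
    # String before the first separator: the first-ranked record, count zero.
    ref_pos, _zero, ref = pieces[0].split(":", 2)
    restored = [(int(ref_pos), ref)]
    for piece in pieces[1:]:
        pos, count, remainder = piece.split(":", 2)
        # Put the repeated characters back in front of the non-repeated ones;
        # the count itself is dropped from the restored string.
        restored.append((int(pos), ref[: int(count)] + remainder))
    return restored
```

Splitting each piece at most twice keeps any colon inside the field value intact.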
Fig. 15 is a schematic structural diagram of another recovery unit in an embodiment of the present invention. As shown in Fig. 15, the recovery unit 150 may comprise a third obtaining subunit 151, a fourth obtaining subunit 152 and a second recovery subunit 153, the second recovery subunit 153 being connected to the third obtaining subunit 151 and to the fourth obtaining subunit 152. Specifically:
The third obtaining subunit 151 is adapted to obtain the string before the first separator in the merged string; this string comprises the string in the field of the first-ranked record after sorting, the information indicating a repeated-character count of zero, and the sequence position information of that record in the log.
The fourth obtaining subunit 152 is adapted to obtain the strings between each pair of adjacent separators in the merged string; each such string comprises the characters of the field of a non-first-ranked record after sorting that are not repeated in the field of the first-ranked record after sorting, the number of characters repeated between the field of that non-first-ranked record and the field of the first-ranked record after sorting, and the sequence position information of that non-first-ranked record in the log.
The second recovery subunit 153 is adapted to use the string before the first separator obtained by the third obtaining subunit 151, together with the repeated-character count obtained by the fourth obtaining subunit 152 for the field of a non-first-ranked record after sorting and the field of the first-ranked record after sorting, to add the characters repeated between those two fields in front of the non-repeated characters in the string between the two adjacent separators, and at the same time to delete the repeated-character count, thereby obtaining the restored string of the field of the non-first-ranked record after sorting.
Fig. 16 is a schematic structural diagram of yet another recovery unit in an embodiment of the present invention. As shown in Fig. 16, the recovery unit 160 may comprise a fifth obtaining subunit 161 and a second replacement subunit 162, connected in sequence. Specifically:
The fifth obtaining subunit 161 is adapted to obtain, from the merged string, the string before the first separator and the strings between each pair of adjacent separators.
The second replacement subunit 162 is adapted to replace each preset string found in the string before the first separator and in the strings between adjacent separators, as obtained by the fifth obtaining subunit 161, with the corresponding string of the field of the log record, to obtain the restored strings of the field of the log records. In a specific implementation, the strings in the field of the log records and the preset strings are in one-to-one correspondence.
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments may be carried out by a program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include ROM, RAM, a magnetic disk, an optical disc, or the like.
The method and system of the embodiments of the present invention have been described in detail above, but the present invention is not limited thereto. Any person skilled in the art may make various changes or modifications without departing from the spirit and scope of the present invention, so the scope of protection of the present invention shall be that defined by the claims.

Claims (16)

1. A log compression method, characterized by comprising:
reading the records in the log, each record comprising at least one field, and each field comprising a string of at least one character;
storing the log records field by field, and adding to each field of a stored record the sequence position information of that record in the log;
merging the strings of a field of the log records, by comparing each of those strings with the string in the same field of a selected reference record, to obtain a merged string;
creating a compressed file, the compressed file comprising a file header, and the header comprising identification information identifying the log compression method, the number of rows of log records, and the number of fields in each log record;
compressing the merged strings obtained, and adding the compressed merged strings to the created compressed file one after another, in the order in which the fields appear in the record.
2. The log compression method according to claim 1, characterized in that the strings in the field of the log records are arranged in order, and that merging the strings of the field of the log records, by comparing each of those strings with the string in the same field of the selected reference record, to obtain a merged string comprises:
traversing the strings in the field of the log records;
adding, to the field of the first-ranked record, information indicating a repeated-character count of zero, to obtain the new string of the field of the first-ranked record;
comparing the string in the field of each non-first-ranked record in the log with the string in the field of the first-ranked record, and obtaining and recording the number of characters repeated between the two;
removing the characters repeated between the field of the non-first-ranked record and the field of the first-ranked record, keeping the non-repeated characters, to obtain the new string of the field of the non-first-ranked record, the new string of the field of the non-first-ranked record comprising the sequence position information of the non-first-ranked record and the number of characters repeated between the field of the non-first-ranked record and the field of the first-ranked record;
taking the new string obtained for the field of the first-ranked record as the beginning, and appending the new strings generated for the field of the non-first-ranked records to it one after another, with a separator placed between the new string of the first-ranked record and the new string of a non-first-ranked record and between adjacent new strings of non-first-ranked records, to obtain the merged string.
3. The log compression method according to claim 2, characterized in that, when the strings in the field of the log records are unordered, the strings in the field of the log records are first arranged in order, and the operation of merging the strings of the field of the log records, by comparing each of those strings with the string in the same field of the selected reference record, to obtain a merged string is then performed.
4. The log compression method according to claim 2, characterized in that merging the strings of the field of the log records, by comparing each of those strings with the string in the same field of the selected reference record, to obtain a merged string comprises:
traversing the strings in the field of the log records, and obtaining the strings in the field of the log records that are identical to one another;
replacing the strings in the field of the log records with preset strings to obtain new strings, the number of characters in each preset string being smaller than the number of characters in the string of the field of the log record that it replaces;
merging the new strings obtained into a merged string, in which a separator is placed between adjacent new strings.
5. A log decompression method, characterized by comprising:
obtaining and decompressing the compressed merged strings of the fields of the log records in a compressed file;
restoring each merged string, by comparing the merged string with a reference string, to obtain the restored strings of the field of the log records, each restored string including the sequence position information of its record in the log;
splicing together, in field order, the restored field strings that carry the same sequence position information, to obtain the records;
sorting the records obtained by splicing according to their sequence position information, to obtain the decompressed log.
6. The log decompression method according to claim 5, characterized in that restoring the merged string, by comparing the merged string with the reference string, to obtain the restored strings of the field of the log records comprises:
obtaining the string before the first separator in the merged string, this string comprising the string in the field of the first-ranked record after sorting, the information indicating a repeated-character count of zero, and the sequence position information of that record in the log;
obtaining the strings between each pair of adjacent separators in the merged string, each such string comprising the characters of the field of a non-first-ranked record after sorting that are not repeated in the field of the first-ranked record after sorting, the number of characters repeated between the two, and the sequence position information of that non-first-ranked record in the log;
according to the number of characters repeated between the field of the first-ranked record after sorting and the field of the non-first-ranked record after sorting, adding the characters repeated between those two fields in front of the non-repeated characters in the obtained string between the two adjacent separators, and at the same time deleting the repeated-character count, to obtain the restored string of the field of the non-first-ranked record.
7. The log decompression method according to claim 5, characterized in that restoring the merged string, by comparing the merged string with the reference string, to obtain the restored strings of the field of the log records comprises:
obtaining the string before the first separator in the merged string, this string comprising the string in the field of the first-ranked record after sorting, the information indicating a repeated-character count of zero, and the sequence position information of that record in the log;
obtaining the strings between each pair of adjacent separators in the merged string, each such string comprising the characters of the field of a non-first-ranked record after sorting that are not repeated in the field of the first-ranked record after sorting, the number of characters repeated between the two, and the sequence position information of that non-first-ranked record in the log;
according to the number of characters repeated between the field of the first-ranked record after sorting and the field of the non-first-ranked record after sorting, adding the characters repeated between those two fields in front of the non-repeated characters in the obtained string between the two adjacent separators, and at the same time deleting the repeated-character count, to obtain the restored string of the field of the non-first-ranked record.
8. The log decompression method according to claim 5, characterized in that restoring the merged string, by comparing the merged string with the reference string, to obtain the restored strings of the field of the log records comprises:
obtaining, from the merged string, the string before the first separator and the strings between each pair of adjacent separators;
replacing each preset string found in the obtained string before the first separator and in the obtained strings between adjacent separators with the corresponding string of the field of the log record, to obtain the restored strings of the field of the log records, the number of characters in each preset string being smaller than the number of characters in the corresponding string of the field of the log record.
9. A log compression device, characterized by comprising:
a reading unit, adapted to read the records in the log, each record comprising at least one field, and each field comprising a string of at least one character;
a storage unit, adapted to store the log records read by the reading unit field by field, and to add to each field of a stored record the sequence position information of that record in the log;
a merging unit, adapted to merge the strings of a field of the log records stored in the storage unit, by comparing each of those strings with the string in the same field of a selected reference record, to obtain a merged string;
a creating unit, adapted to create a compressed file, the compressed file comprising a file header, and the header comprising identification information identifying the log compression method, the number of rows of log records, and the number of fields in each log record;
a compression unit, adapted to compress the merged strings obtained by the merging unit, and to add the compressed merged strings to the compressed file created by the creating unit one after another, in the order in which the fields appear in the record.
10. The log compression device according to claim 9, characterized in that the merge unit comprises:
A first traversal subunit, adapted to traverse the character strings in the field of the records of the log, the character strings in the field of the records of the log being arranged in order;
A first adding subunit, adapted to add, to the field of the first-position record, repeated-character count information whose value is zero, to obtain a new character string in the field of the first-position record;
A first comparing subunit, adapted to compare the character string in the field of each non-first-position record in the log with the character string in the field of the first-position record, and to obtain and record the number of repeated characters between the two;
A first generating subunit, adapted to remove the characters that, as determined by the first comparing subunit, are repeated between the field of the non-first-position record and the field of the first-position record, leaving the non-repeated characters, to obtain a new character string in the field of the non-first-position record, the new character string comprising the position order information of the non-first-position record and the number of characters repeated between its field and the field of the first-position record;
A first merging subunit, adapted to start with the new character string in the field of the first-position record obtained by the first adding subunit and to append after it, one after another, the new character strings in the field of the non-first-position records generated by the first generating subunit, with a separator placed between the new character string of the first-position record and that of the following non-first-position record, and between the new character strings of successive non-first-position records, to obtain the merged character string.
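A short Python sketch of the merging described in claim 10 follows. It reads "repeated characters" as the common leading prefix shared with the first-position record, which is an interpretation consistent with the restoration step that puts the repeated characters back in front of the remaining ones. The `position:count:remainder` layout, the `\x1f` separator, and the function names are assumptions made for illustration.

```python
SEP = "\x1f"  # hypothetical separator; assumed absent from field values


def common_prefix_len(a, b):
    """Number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def merge_against_first(column):
    """column: list of (position, value) pairs whose values are already ordered."""
    first_pos, first_val = column[0]
    # The first-position record carries a repeated-character count of zero.
    parts = [f"{first_pos}:0:{first_val}"]
    for pos, val in column[1:]:
        k = common_prefix_len(first_val, val)
        # Keep only the non-repeated characters, plus position and count.
        parts.append(f"{pos}:{k}:{val[k:]}")
    return SEP.join(parts)
```

For example, merging the URL field of three records gives one string in which the shared prefix is stored only once:

```python
column = [(0, "/img/a.png"), (1, "/img/b.png"), (2, "/index.html")]
merged = merge_against_first(column)
# merged == "0:0:/img/a.png\x1f1:5:b.png\x1f2:2:ndex.html"
```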
11. The log compression device according to claim 9, characterized in that the merge unit comprises:
A sorting subunit, adapted to arrange the character strings in the field of the records of the log in order when those character strings are unordered;
A second traversal subunit, adapted to traverse the sorted character strings in the field of the records of the log;
A second adding subunit, adapted to add, to the field of the record in first position after sorting, repeated-character count information whose value is zero, to obtain a new character string in the field of the record in first position after sorting;
A second comparing subunit, adapted to compare the character string in the field of each record not in first position after sorting with the character string in the field of the record in first position after sorting, and to obtain and record the number of repeated characters between the two;
A second generating subunit, adapted to remove the characters repeated between the field of the record not in first position after sorting and the field of the record in first position after sorting, leaving the non-repeated characters, to generate a new character string in the field of the record not in first position after sorting, the new character string comprising the position order information of that record and the number of characters repeated between its field and the field of the record in first position after sorting;
A second merging subunit, adapted to append, one after another, the new character strings in the field of the records not in first position after sorting, as generated by the second generating subunit, after the new character string of the field of the record in first position after sorting, with a separator placed between the new character string of the record in first position after sorting and that of the following record, and between the new character strings of successive records not in first position, to obtain the merged character string.
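The sorted variant of claim 11 can be sketched in Python as below. Because each value keeps its original position, sorting does not lose the record order. As before, "repeated characters" is read as the common leading prefix, and the `position:count:remainder` layout, the separator, and the name `sort_then_merge` are assumptions. The standard-library function `os.path.commonprefix` is used only because it compares strings character by character.

```python
import os.path

SEP = "\x1f"  # hypothetical separator; assumed absent from field values


def sort_then_merge(column):
    """column: list of (position, value) pairs in arbitrary order."""
    # Sort by field value so that similar strings end up next to the reference.
    ordered = sorted(column, key=lambda item: item[1])
    first_pos, first_val = ordered[0]
    parts = [f"{first_pos}:0:{first_val}"]
    for pos, val in ordered[1:]:
        k = len(os.path.commonprefix([first_val, val]))
        parts.append(f"{pos}:{k}:{val[k:]}")
    return SEP.join(parts)
```

With an IP-address field, `sort_then_merge([(0, "10.0.0.9"), (1, "10.0.0.1"), (2, "10.0.1.7")])` puts record 1 first, and the other two records keep only their differing tails together with their original positions 0 and 2.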
12. The log compression device according to claim 9, characterized in that the merge unit comprises:
A third traversal subunit, adapted to traverse the character strings in the field of the records of the log, to obtain the character string in the field of each record of the log;
A first replacing subunit, adapted to replace the character string in the field of a record of the log with a preset character string, to obtain a new character string, the character string in the field of the record having more characters than the preset character string;
A third merging subunit, adapted to merge the new character strings obtained by the first replacing subunit into a merged character string, with a separator placed between the new character strings.
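The replacement variant of claim 12 amounts to a lookup table from field strings to shorter preset strings. The Python sketch below is illustrative only: the `codes` table, the `position:code` layout, the separator, and the pass-through of values that have no code are assumptions, and the sketch presumes that no raw value collides with a code.

```python
SEP = "\x1f"  # hypothetical separator; assumed absent from field values


def replace_and_merge(column, codes):
    """column: list of (position, value); codes: field string -> shorter preset string."""
    for original, code in codes.items():
        # The preset string must have fewer characters than the string it replaces.
        if len(code) >= len(original):
            raise ValueError(f"preset string {code!r} is not shorter than {original!r}")
    parts = []
    for pos, val in column:
        # Assumption: values without a preset string pass through unchanged.
        parts.append(f"{pos}:{codes.get(val, val)}")
    return SEP.join(parts)
```

For a request-method field, `replace_and_merge([(0, "GET"), (1, "POST"), (2, "GET")], {"GET": "G", "POST": "P"})` yields `"0:G\x1f1:P\x1f2:G"`.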
13. A log decompression device, characterized in that it comprises:
A decompression unit, adapted to obtain and decompress the compressed merged character string of a field of the records of the log in the compressed file;
A recovery unit, adapted to restore the merged character string obtained by the decompression unit by comparing it with a selected reference character string, to obtain the restored character strings in the field of the records of the log, each restored character string comprising the position order information of its record in the log;
A splicing unit, adapted to splice together, in order, the restored field character strings obtained by the recovery unit that carry the same position order information, to obtain the records of the log;
A sorting unit, adapted to sort the records of the log obtained by the splicing unit according to the position order information of the records, to obtain the decompressed log.
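The splicing and sorting units of claim 13 can be illustrated with a few lines of Python. The sketch assumes the recovery step has already produced, for each field, a list of `(position, value)` pairs; the space delimiter and the name `rebuild_log` are hypothetical.

```python
def rebuild_log(restored_columns, delimiter=" "):
    """restored_columns: one list of (position, value) pairs per field, in field order."""
    records = {}
    for column in restored_columns:
        for pos, val in column:
            # Values that share a position belong to the same record.
            records.setdefault(pos, []).append(val)
    # Sorting by position restores the original order of the log.
    return [delimiter.join(records[pos]) for pos in sorted(records)]
```

Because fields are processed in field order, each record's values are joined in the order they appeared in the original line, even if the values inside a field arrive sorted rather than in log order.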
14. The log decompression device according to claim 13, characterized in that the recovery unit comprises:
A first obtaining subunit, adapted to obtain the character string before the first separator in the merged character string, the character string before the first separator comprising the character string in the field of the first-position record, repeated-character count information whose value is zero, and the position order information of that record in the log;
A second obtaining subunit, adapted to obtain the character strings between each pair of adjacent separators in the merged character string, each such character string comprising the characters of the field of a non-first-position record that are not repeated in the field of the first-position record, the number of characters repeated between the field of the non-first-position record and the field of the first-position record, and the position order information of the non-first-position record in the log;
A first recovering subunit, adapted to use the character string before the first separator obtained by the first obtaining subunit, together with the repeated-character count obtained by the second obtaining subunit, to add, in front of the non-repeated characters in the character string between two adjacent separators, the characters repeated between the field of the non-first-position record and the field of the first-position record, and at the same time to delete the repeated-character count information, to obtain the restored character string of the field of the non-first-position record.
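The recovery described in claim 14 can be sketched in Python as the inverse of prefix removal. The sketch assumes a hypothetical `position:count:remainder` layout with a `\x1f` separator and reads "repeated characters" as the leading characters shared with the first-position record; neither the layout nor the name `restore_column` comes from the patent.

```python
SEP = "\x1f"  # hypothetical separator; assumed absent from field values


def restore_column(merged):
    """Rebuild (position, value) pairs from a merged string of 'pos:count:remainder' items."""
    parts = merged.split(SEP)
    # The item before the first separator holds the full reference string, count zero.
    first_pos, _zero, first_val = parts[0].split(":", 2)
    restored = [(int(first_pos), first_val)]
    for part in parts[1:]:
        pos, count, remainder = part.split(":", 2)
        # Put the repeated characters back in front and drop the count.
        restored.append((int(pos), first_val[:int(count)] + remainder))
    return restored
```

Splitting with a maximum of two splits keeps any colons inside the field value itself intact.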
15. The log decompression device according to claim 13, characterized in that the recovery unit comprises:
A third obtaining subunit, adapted to obtain the character string before the first separator in the merged character string, the character string before the first separator comprising the character string in the field of the record in first position after sorting, repeated-character count information whose value is zero, and the position order information of that record in the log;
A fourth obtaining subunit, adapted to obtain the character strings between each pair of adjacent separators in the merged character string, each such character string comprising the characters of the field of a record not in first position after sorting that are not repeated in the field of the record in first position after sorting, the number of characters repeated between the field of the record not in first position after sorting and the field of the record in first position after sorting, and the position order information in the log of the record not in first position after sorting;
A second recovering subunit, adapted to use the character string before the first separator obtained by the third obtaining subunit, together with the repeated-character count obtained by the fourth obtaining subunit, to add, in front of the non-repeated characters in the character string between two adjacent separators, the characters repeated between the field of the record not in first position after sorting and the field of the record in first position after sorting, and at the same time to delete the repeated-character count information, to obtain the restored character string in the field of the record not in first position after sorting.
16. The log decompression device according to claim 13, characterized in that the recovery unit comprises:
A fifth obtaining subunit, adapted to obtain, in the merged character string, the character string before the first separator and the character strings between each pair of adjacent separators;
A second replacing subunit, adapted to replace the preset character strings in the character string before the first separator and in the character strings between adjacent separators, as obtained by the fifth obtaining subunit, with the corresponding character strings of the field of the records of the log, to obtain the restored character strings of the field of the records of the log.
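Restoring a field that was compressed by preset-string replacement, as in claim 16, is a reverse lookup. The Python sketch below assumes a hypothetical `position:code` layout, a `\x1f` separator, and a `codes` table mapping each original field string to its preset string; items with no matching preset string are assumed to have been stored unchanged.

```python
SEP = "\x1f"  # hypothetical separator; assumed absent from field values


def restore_replaced(merged, codes):
    """merged: 'pos:code' items joined by SEP; codes: field string -> preset string."""
    reverse = {code: original for original, code in codes.items()}
    restored = []
    for part in merged.split(SEP):
        pos, code = part.split(":", 1)
        # Assumption: anything that is not a known preset string was stored as-is.
        restored.append((int(pos), reverse.get(code, code)))
    return restored
```

The same table must be available on both sides; how it is shared between compressor and decompressor is not covered by this sketch.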
CN201410283777.3A 2014-06-23 2014-06-23 log compression method and device, decompression method and device Active CN104050269B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410283777.3A CN104050269B (en) 2014-06-23 2014-06-23 log compression method and device, decompression method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410283777.3A CN104050269B (en) 2014-06-23 2014-06-23 log compression method and device, decompression method and device

Publications (2)

Publication Number Publication Date
CN104050269A true CN104050269A (en) 2014-09-17
CN104050269B CN104050269B (en) 2017-06-16

Family

ID=51503101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410283777.3A Active CN104050269B (en) 2014-06-23 2014-06-23 log compression method and device, decompression method and device

Country Status (1)

Country Link
CN (1) CN104050269B (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104410424A (en) * 2014-11-26 2015-03-11 西安电子科技大学 Quick lossless compression method of memory data of embedded device
CN105468748A (en) * 2015-11-26 2016-04-06 航天恒星科技有限公司 Distributed storage location data method and system
CN105516307A (en) * 2015-12-09 2016-04-20 浪潮电子信息产业股份有限公司 Method for optimizing log storage of cloud storage system based on compression
CN105654259A (en) * 2015-12-25 2016-06-08 中国民航信息网络股份有限公司 Mass agent freight rate search compression method
CN106021417A (en) * 2016-05-12 2016-10-12 京信通信***(广州)有限公司 Log compression method and device
CN106844565A (en) * 2016-12-30 2017-06-13 上海帝联信息科技股份有限公司 Charactor comparison method and device between data row
CN107025233A (en) * 2016-01-29 2017-08-08 苏宁云商集团股份有限公司 A kind of processing method and processing device of data characteristics
CN107391583A (en) * 2017-06-23 2017-11-24 微梦创科网络科技(中国)有限公司 Website logins log information is converted to the method and system of vectorization data
CN107688624A (en) * 2017-08-18 2018-02-13 杭州迪普科技股份有限公司 A kind of daily record index structuring method and device
CN108256017A (en) * 2018-01-08 2018-07-06 武汉斗鱼网络科技有限公司 A kind of method, apparatus and computer equipment for data storage
CN108306771A (en) * 2018-02-09 2018-07-20 腾讯科技(深圳)有限公司 Log reporting method, apparatus and system
CN108933781A (en) * 2018-06-19 2018-12-04 上海点融信息科技有限责任公司 Method, apparatus and computer readable storage medium for processing character string
CN109617708A (en) * 2018-10-31 2019-04-12 浙江口碑网络技术有限公司 A kind of compression method burying a log, equipment and system
CN110543458A (en) * 2019-09-13 2019-12-06 北京上下文***软件有限公司 compression algorithm for mobile network internet log data
CN110851409A (en) * 2019-11-06 2020-02-28 南京星环智能科技有限公司 Log compression and decompression method, device and storage medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130031064A1 (en) * 2009-12-22 2013-01-31 At&T Intellectual Property I, L.P. Compressing Massive Relational Data
CN103379136A (en) * 2012-04-17 2013-10-30 ***通信集团公司 Compression method and decompression method of log acquisition data, compression apparatus and decompression apparatus of log acquisition data

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SEBASTIAN DEOROWICZ et al.: "Sub-Atomic Field Processing for Improved Web Log Compression", 《PROCEEDINGS OF IEEE INTERNATIONAL CONFERENCE ON MODERN PROBLEMS OF RADIO ENGINEERING, TELECOMMUNICATIONS AND COMPUTER SCIENCE》 *
甄成 et al.: "一种基于日志结构的自动压缩/解压缩文件***的实现方案", 《计算机工程与应用》 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104410424B (en) * 2014-11-26 2017-06-16 西安电子科技大学 The fast and lossless compression method of embedded device internal storage data
CN104410424A (en) * 2014-11-26 2015-03-11 西安电子科技大学 Quick lossless compression method of memory data of embedded device
CN105468748A (en) * 2015-11-26 2016-04-06 航天恒星科技有限公司 Distributed storage location data method and system
CN105468748B (en) * 2015-11-26 2020-05-26 航天恒星科技有限公司 Distributed storage position data method and system
CN105516307A (en) * 2015-12-09 2016-04-20 浪潮电子信息产业股份有限公司 Method for optimizing log storage of cloud storage system based on compression
CN105654259A (en) * 2015-12-25 2016-06-08 中国民航信息网络股份有限公司 Mass agent freight rate search compression method
CN105654259B (en) * 2015-12-25 2021-07-06 中国民航信息网络股份有限公司 Large-batch agent freight rate search compression method
CN107025233B (en) * 2016-01-29 2020-04-28 苏宁云计算有限公司 Data feature processing method and device
CN107025233A (en) * 2016-01-29 2017-08-08 苏宁云商集团股份有限公司 A kind of processing method and processing device of data characteristics
CN106021417A (en) * 2016-05-12 2016-10-12 京信通信***(广州)有限公司 Log compression method and device
CN106844565B (en) * 2016-12-30 2020-07-07 上海帝联信息科技股份有限公司 Character comparison method and device between data lines
CN106844565A (en) * 2016-12-30 2017-06-13 上海帝联信息科技股份有限公司 Charactor comparison method and device between data row
CN107391583A (en) * 2017-06-23 2017-11-24 微梦创科网络科技(中国)有限公司 Website logins log information is converted to the method and system of vectorization data
CN107391583B (en) * 2017-06-23 2020-07-28 微梦创科网络科技(中国)有限公司 Method and system for converting website login log information into vectorized data
CN107688624B (en) * 2017-08-18 2020-12-29 杭州迪普科技股份有限公司 Log index construction method and device
CN107688624A (en) * 2017-08-18 2018-02-13 杭州迪普科技股份有限公司 A kind of daily record index structuring method and device
CN108256017A (en) * 2018-01-08 2018-07-06 武汉斗鱼网络科技有限公司 A kind of method, apparatus and computer equipment for data storage
CN108256017B (en) * 2018-01-08 2020-12-15 武汉斗鱼网络科技有限公司 Method and device for data storage and computer equipment
CN108306771B (en) * 2018-02-09 2021-06-18 腾讯科技(深圳)有限公司 Log reporting method, device and system
CN108306771A (en) * 2018-02-09 2018-07-20 腾讯科技(深圳)有限公司 Log reporting method, apparatus and system
CN108933781A (en) * 2018-06-19 2018-12-04 上海点融信息科技有限责任公司 Method, apparatus and computer readable storage medium for processing character string
CN108933781B (en) * 2018-06-19 2021-07-02 上海点融信息科技有限责任公司 Method, apparatus and computer-readable storage medium for processing character string
CN109617708B (en) * 2018-10-31 2020-07-31 浙江口碑网络技术有限公司 Compression method, device and system for embedded point log
CN109617708A (en) * 2018-10-31 2019-04-12 浙江口碑网络技术有限公司 A kind of compression method burying a log, equipment and system
CN110543458A (en) * 2019-09-13 2019-12-06 北京上下文***软件有限公司 compression algorithm for mobile network internet log data
CN110851409A (en) * 2019-11-06 2020-02-28 南京星环智能科技有限公司 Log compression and decompression method, device and storage medium

Also Published As

Publication number Publication date
CN104050269B (en) 2017-06-16

Similar Documents

Publication Publication Date Title
CN104050269A (en) Log compression method and device and log decompression method and device
CN102937926B (en) Method and device for recovering deleted sqlite files on mobile terminal
US7965841B2 (en) Method and apparatus for compressing and decompressing data, and computer product
CN106844607B (en) SQLite data recovery method suitable for non-integer main key and idle block combination
CN102929946A (en) Data synchronization method, device and system
CN102902762A (en) Method, device and system for deleting repeating data
US11989161B2 (en) Generating readable, compressed event trace logs from raw event trace logs
CN105447146A (en) Massive data collecting and exchanging system and method
CN105068885A (en) JPG fragmented file recovery and reconstruction method
CN105447168A (en) Method for restoring and recombining fragmented files in MP4 format
CN103646048A (en) Method and device for achieving multimedia pictures
CN104021217A (en) System and method for extracting fragment file and deleted file of mobile phone
CN112732191A (en) Method, system, device and medium for merging tree merging data based on log structure
Hadi Reviewing and evaluating existing file carving techniques for JPEG files
CN109617708A (en) A kind of compression method burying a log, equipment and system
KR101218087B1 (en) Method for Extracting InputFormat for Binary Format Data in Hadoop MapReduce and Binary Data Analysis Using the Same
CN105282206A (en) Method, device and system for processing website resource files
CN105704215A (en) File sharing system and corresponding file sending and receiving method and device
CN116910820A (en) Data report processing method, device, computer equipment and storage medium
KR101200773B1 (en) Method for Extracting InputFormat for Handling Network Packet Data on Hadoop MapReduce
CN106874147A (en) A kind of recovery simultaneously parses the method that Windows operating system pre-reads file
CN113467997A (en) Data recovery method and device, mobile device and storage medium
CN103902567A (en) Data processing method, device and system
CN104618644B (en) A kind of view data writes the method and terminal of file
CN102053881A (en) Zip file carving recovery method based on contents

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant