CN108233942A - Method, apparatus and computer device for data storage - Google Patents

Method, apparatus and computer device for data storage

Info

Publication number
CN108233942A
Authority
CN
China
Prior art keywords
coding
compression
data
character string
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810014959.9A
Other languages
Chinese (zh)
Other versions
CN108233942B (en)
Inventor
胡耀文
张文明
陈少杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Douyu Network Technology Co Ltd
Original Assignee
Wuhan Douyu Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Douyu Network Technology Co Ltd
Priority to CN201810014959.9A
Publication of CN108233942A
Application granted
Publication of CN108233942B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

An embodiment of the present invention provides a method, apparatus and computer device for data storage, applied in a live-streaming platform. The method includes: obtaining multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data; performing string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables; looking up each encoded method name based on the multiple unshared coding tables; performing primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression; serializing each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence; and performing secondary compression on the string sequence to obtain a compression result, and storing the compression result in a database.

Description

Method, apparatus and computer device for data storage
Technical field
The invention belongs to the technical field of network operations, and in particular relates to a method, apparatus and computer device for data storage.
Background
After xhprof is used for performance sampling, the sampled data needs to be stored in a database for later retrieval and analysis.
Performance sampling data is a very large array. The common approach is to compress it with a conventional compression algorithm before storing it. The main drawback of this approach is that the content to be compressed is encoded character by character, which results in a low compression ratio.
Summary of the invention
In view of the problems in the prior art, embodiments of the present invention provide a method, apparatus and computer device for data storage, to solve the technical problem that, when performance sampling data is compressed in the prior art, the low compression ratio causes a large amount of storage space to be occupied.
The present invention provides a method for data storage, applied in a live-streaming platform, the method including:
obtaining multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data;
performing string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables;
looking up each encoded method name based on the multiple unshared coding tables;
performing primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression;
serializing each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence;
performing secondary compression on the string sequence to obtain a compression result, and storing the compression result in a database.
In the above scheme, performing string encoding with each method name as a distinct basic element to generate the coding tables includes:
counting the occurrences of each method name in the multiple copies of performance sampling data;
assigning a unique character string to each method name in each copy of performance sampling data;
storing each method name in each copy of performance sampling data, the occurrence count of each method name and the corresponding character string in a corresponding mapping table, where the mapping table is the unshared coding table and the corresponding character strings are preset.
In the above scheme, a separator is placed between different character strings in the string sequence.
In the above scheme, the operation data includes: the running time of each method, the number of runs, the memory occupied while running, and the central processing unit (CPU) usage.
The present invention also provides an apparatus for data storage, the apparatus including:
an acquiring unit, configured to obtain multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data;
a coding unit, configured to perform string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables;
a lookup unit, configured to look up each encoded method name based on the multiple unshared coding tables;
a first compression unit, configured to perform primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression;
a generation unit, configured to serialize each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence;
a second compression unit, configured to perform secondary compression on the string sequence to obtain a compression result;
a storage unit, configured to store the compression result in a database.
In the above scheme, the coding unit is specifically configured to:
count the occurrences of each method name in the multiple copies of performance sampling data;
assign a unique character string to each method name in each copy of performance sampling data;
store each method name in each copy of performance sampling data, the occurrence count of each method name and the corresponding character string in a corresponding mapping table, where the mapping table is the unshared coding table and the corresponding character strings are preset.
In the above scheme, a separator is placed between different character strings in the string sequence.
In the above scheme, the operation data includes: the running time of each method, the number of runs, the memory occupied while running, and the central processing unit (CPU) usage.
The present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, carries out any of the methods described above.
The present invention also provides a computer device for data storage, including:
at least one processor; and
at least one memory communicatively connected with the processor, wherein
the memory stores program instructions executable by the processor, and the processor calls the program instructions to carry out any of the methods described above.
Embodiments of the present invention provide a method, apparatus and computer device for data storage, applied in a live-streaming platform. The method includes: obtaining multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data; performing string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables; looking up each encoded method name based on the multiple unshared coding tables; performing primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression; serializing each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence; and performing secondary compression on the string sequence to obtain a compression result, and storing the compression result in a database. In this way, when data is stored, each unshared coding table is first compressed together with the corresponding encoded method names, and the resulting string sequence is then compressed a second time; passing the data through two rounds of compression improves the compression ratio. Moreover, method names are encoded with each whole method name as the basic element, rather than character by character as in the prior art, which further reduces the data to be stored and thus the storage space occupied.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method for data storage provided by Embodiment one of the present invention;
Fig. 2 is a schematic structural diagram of the apparatus for data storage provided by Embodiment two of the present invention;
Fig. 3 is a schematic diagram of the overall structure of the computer device for data storage provided by Embodiment three of the present invention.
Detailed description
To solve the technical problem that, when performance sampling data is compressed in the prior art, the low compression ratio causes a large amount of storage space to be occupied, the present invention provides a method, apparatus and computer device for data storage, applied in a live-streaming platform. The method includes: obtaining multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data; performing string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables; looking up each encoded method name based on the multiple unshared coding tables; performing primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression; serializing each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence; and performing secondary compression on the string sequence to obtain a compression result, and storing the compression result in a database.
The technical solution of the present invention is described in further detail below with reference to the drawings and specific embodiments.
Embodiment one
This embodiment provides a method for data storage, applied in a live-streaming platform. As shown in Fig. 1, the method includes:
S110: obtain multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding operation data.
In this step, performance sampling data needs to be obtained. The performance sampling data comprises multiple copies, and each copy includes multiple method names and corresponding operation data. For example, if a user name needs to be obtained, the corresponding method name is username or userID. The operation data includes the running time of each method, the number of runs, the memory occupied while running, the CPU usage, and so on.
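To make the later steps concrete, one copy of performance sampling data might be modeled as below. This is only a sketch: the class name, field names and sample values are illustrative assumptions and are not taken from the patent or from xhprof's output format.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One sampled method call; all field names are hypothetical."""
    method: str         # method name, e.g. "username"
    run_time_us: int    # running time
    calls: int          # number of runs
    memory_bytes: int   # memory occupied while running
    cpu_percent: float  # CPU usage


# One "copy" of performance sampling data, modeled as a list of calls.
sample = [
    Call("username", 120, 1, 2048, 0.5),
    Call("userID", 80, 2, 1024, 0.3),
    Call("username", 95, 1, 2048, 0.4),
]

# The method names are what the encoding steps operate on.
method_names = [c.method for c in sample]
```

Keeping the method names separate from the numeric fields mirrors the split the method relies on: names are string-encoded, while the operation data is serialized as-is.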
S111: perform string encoding on each method name in each copy of performance sampling data, treating each method name as a distinct basic element, to generate multiple unshared coding tables.
Here, each method name in each copy of performance sampling data can be string-encoded as a distinct basic element to generate multiple unshared coding tables.
Specifically, the occurrences of each method name in the multiple copies of performance sampling data are counted; a unique character string is assigned to each method name in each copy of performance sampling data; and, based on Huffman coding, each method name in each copy, its occurrence count and its corresponding character string are stored in a corresponding mapping table. The mapping table is the unshared coding table, and the corresponding character strings are preset.
For example, the mapping table generated for one copy of performance sampling data is shown in Table 1; the mapping tables for the other copies are built in the same way. The character strings can be chosen from the 26 letters in upper and lower case.
Table 1
Method name | Occurrences | Character code
A | 243 | f
B | 173 | d
C | 99 | e
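The construction of such a mapping table can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name is hypothetical, and codes are handed out in frequency order from the 52 upper- and lower-case letters, whereas the patent states only that the character strings are preset (so the codes here differ from the f, d, e of Table 1).

```python
from collections import Counter
from string import ascii_letters  # the 52 upper- and lower-case letters


def build_coding_table(method_names):
    """Map each distinct method name to (occurrence count, code string)."""
    counts = Counter(method_names)
    table = {}
    # Most frequent names first, so they receive the earliest codes.
    for code, (name, count) in zip(ascii_letters, counts.most_common()):
        table[name] = (count, code)
    if len(table) < len(counts):
        raise ValueError("more distinct names than single-letter codes")
    return table


# Occurrence counts taken from Table 1.
names = ["A"] * 243 + ["B"] * 173 + ["C"] * 99
table = build_coding_table(names)
```

A sample with more than 52 distinct method names would need multi-character codes; this sketch simply rejects that case.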
It should be noted that when different copies of performance sampling data contain the same method name, that method name must be given the same character string.
Of course, if the compression ratio needs to be improved further, any one copy of the performance sampling data can also be used to generate a single shared coding table. It is generated in the same way as the unshared coding tables above, so the details are not repeated here.
S112: look up each encoded method name based on the multiple unshared coding tables.
In this step, when the encoded form of each method name in a copy of performance sampling data is needed, it can be looked up in the corresponding unshared coding table. For example, if Table 1 is the coding table of the first copy of performance sampling data, then when the string code for method name A in the first copy is needed, Table 1 gives the code f.
S113: perform primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression.
After the encoded method names have been obtained, primary compression is performed on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression.
Still taking the first copy of performance sampling data as an example, suppose method name A is encoded as f and method name B is encoded as d, and the first copy contains: method A, method A, method B, method A. The method names after primary compression are then: method A->f, method B->d; f,f,d,f.
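The example above can be reproduced with a short sketch. The function name and the exact separators ("->", "," and ";") are assumptions modeled on the example string; the patent does not fix a format.

```python
def first_compress(method_names, codes):
    """Return (coding-table text, encoded call sequence) for one sample."""
    table_text = ",".join(f"{name}->{code}" for name, code in codes.items())
    encoded = ",".join(codes[name] for name in method_names)
    return table_text, encoded


codes = {"A": "f", "B": "d"}  # preset codes from the example
table_text, encoded = first_compress(["A", "A", "B", "A"], codes)
result = f"{table_text};{encoded}"
```

Each method name appears in full only once, in the table part; every further occurrence costs a single code character.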
S114: serialize each method name after primary compression together with the operation data corresponding to that method name, to generate a string sequence.
In this step, when multiple copies of performance sampling data of the same application need to be stored, each method name after primary compression is serialized together with its corresponding operation data to generate a string sequence. In the string sequence, separators are added between different character strings to allow accurate retrieval. Here, the corresponding character strings are preset.
For example, suppose the method names in a copy of performance sampling data are A and B. In the first copy, the operation data corresponding to method A is 1 and the operation data corresponding to method B is 3; the string sequence is then:
method A->f, method B->d;f,d;1,3;
Similarly, if 1000 copies of performance sampling data need to be stored, a string sequence is generated for each copy in the same way. Each piece of operation data has a corresponding offset or extraction identifier; the offset can be a time offset, a sequence-number offset or an address offset.
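A minimal sketch of this serialization step follows. The function name and the three-part layout (coding table; encoded names; operation data) are assumptions inferred from the example string, with "," separating items and ";" separating sections.

```python
def serialize(codes, calls):
    """Serialize one sample.

    codes: {method name: code string}
    calls: list of (method name, operation data) pairs in call order
    """
    table_text = ",".join(f"{name}->{code}" for name, code in codes.items())
    encoded = ",".join(codes[name] for name, _ in calls)
    data = ",".join(str(value) for _, value in calls)
    return f"{table_text};{encoded};{data};"


sequence = serialize({"A": "f", "B": "d"}, [("A", 1), ("B", 3)])
```

Because the encoded names and the operation data are written in the same order, the position of a code in the second section doubles as a sequence-number offset into the third.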
S115: perform secondary compression on the string sequence to obtain a compression result, and store the compression result in a database.
After the string sequence has been obtained, secondary compression is performed on it to obtain a compression result, and the compression result is stored in the database.
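The secondary compression and storage step might look like the sketch below. The patent does not name the compressor or the database, so zlib and an in-memory SQLite database are stand-ins chosen for illustration, and the table and column names are hypothetical.

```python
import sqlite3
import zlib

sequence = "A->f,B->d;f,d;1,3;"

# Secondary compression with a general-purpose compressor (assumed: zlib).
blob = zlib.compress(sequence.encode("utf-8"))

# Store the result; the table and column names are hypothetical.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE samples (id INTEGER PRIMARY KEY, payload BLOB)")
db.execute("INSERT INTO samples (payload) VALUES (?)", (blob,))
db.commit()

# Reading the row back and decompressing reverses the secondary compression.
(stored,) = db.execute("SELECT payload FROM samples").fetchone()
restored = zlib.decompress(stored).decode("utf-8")
```

On a string this short the compressed blob is not smaller than the input; the benefit shows up only on realistically sized sampling data.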
Method names generally contain from a few to a few dozen characters, so compressing them directly character by character takes up relatively more space. After string encoding with the method name as the basic element, each name is generally represented by only one to a few characters, so the compressed data occupies less space.
Take the sampling data of one live-room request as an example: 476 distinct method names appear, with 1788 occurrences in total, so each method name appears 3.8 times on average. The average method-name length weighted by occurrences is 28.6 characters. If the number of occurrences is ignored, the average length of these method names is 31.4 characters. In the serialized string, the characters of method names account for 58.0% of the overall content.
After the Huffman coding of method names described above, a method name can be represented with 8.0 bits on average. By contrast, if stored as characters, each character of a method name requires 1 byte (1 byte = 8 bits). This means that after encoding, the space occupied by a method name is only about 1/28.6 of what it was before.
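For reference, textbook Huffman coding with whole method names as the symbols can be sketched as below. This illustrates the general technique only; the patent does not spell out its exact procedure, and the function name is hypothetical.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(symbols):
    """Return {symbol: bit string}, treating each method name as one symbol."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, prefixing their codes.
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


# Occurrence counts taken from Table 1.
names = ["A"] * 243 + ["B"] * 173 + ["C"] * 99
codes = huffman_codes(names)
total_bits = sum(len(codes[n]) for n in names)
```

With the counts from Table 1, the most frequent name A receives a 1-bit code and B and C receive 2-bit codes, which is the property that keeps the average code length per method name small.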
Because the coding table must also be stored, each method name still has to appear once in character form. Therefore, if a method name appears twice in the sampling data, this coding saves about 50% of the character content (of the method-name part only); if it appears three times, about 64% is saved; at the average of 3.8 occurrences, nearly 70% of the character space is saved. Since the characters of method names account for 58% of the overall content, the overall saving is 58% x 70% = 40.6%. That is, in theory, this scheme can shrink the content to be compressed by more than 40% before a conventional compression method is applied.
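The percentages above can be checked with a rough model. The model is an assumption, not the patent's stated accounting: a name of length L used r times costs r x L characters before encoding, and L (one coding-table entry) plus r one-character codes afterwards, with separators ignored.

```python
def saved_fraction(repeats, name_len, code_len=1.0):
    """Fraction of method-name characters saved for a name used `repeats` times.

    Before: repeats * name_len characters.
    After:  name_len (one coding-table entry) + repeats * code_len.
    """
    before = repeats * name_len
    after = name_len + repeats * code_len
    return 1 - after / before


L = 28.6  # occurrence-weighted average method-name length from the example
savings = {r: saved_fraction(r, L) for r in (2, 3, 3.8)}

# Method names make up 58% of the serialized content.
overall = 0.58 * savings[3.8]
```

Under this model the savings come out slightly below the rounded 50% and 64% quoted for two and three occurrences, close to 70% at 3.8 occurrences, and just over 40% overall, which is consistent with the conclusion in the text.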
It should be noted that the remaining 42% of the content, the operation data, is compressed and stored in the conventional way.
When performance sampling data needs to be retrieved, the string code corresponding to the target method name can be looked up in the coding table (shared or unshared), the total operation data corresponding to that string code can then be found, and the target operation data can be extracted from the total operation data according to its offset address or extraction identifier.
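Retrieval from a decompressed string sequence could then work as sketched here. The function names are hypothetical, and the sketch assumes the three-part "table;codes;data;" layout of the earlier example, method names that contain none of the separator characters, and the position in the code list as the offset.

```python
def parse(sequence):
    """Split 'A->f,B->d;f,d;1,3;' into a coding table, codes and data."""
    table_text, encoded, data = sequence.rstrip(";").split(";")
    table = dict(item.split("->") for item in table_text.split(","))
    return table, encoded.split(","), data.split(",")


def lookup(sequence, method_name):
    """Return the operation data recorded for every call of method_name."""
    table, codes, data = parse(sequence)
    target = table[method_name]
    # The position in the code list serves as the offset into the data list.
    return [data[i] for i, code in enumerate(codes) if code == target]


values = lookup("A->f,B->d;f,d;1,3;", "B")
```

The values come back as strings; converting them to numbers is left to the caller, since the patent does not specify the type of the operation data.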
Embodiment two
Corresponding to Embodiment one, this embodiment also provides an apparatus for data storage. As shown in Fig. 2, the apparatus includes: an acquiring unit 21, a coding unit 22, a lookup unit 23, a first compression unit 24, a generation unit 25, a second compression unit 26 and a storage unit 27, wherein:
The acquiring unit 21 is configured to obtain performance sampling data. The performance sampling data comprises multiple copies, and each copy includes multiple method names and corresponding operation data. For example, if a user name needs to be obtained, the corresponding method name is username or userID. The operation data includes the running time of each method, the number of runs, the memory occupied while running, the CPU usage, and so on.
Here, the coding unit 22 can string-encode each method name in each copy of performance sampling data as a distinct basic element to generate multiple unshared coding tables.
Specifically, the coding unit 22 counts the occurrences of each method name in the multiple copies of performance sampling data; assigns a unique character string to each method name in each copy of performance sampling data; and, based on Huffman coding, stores each method name in each copy, its occurrence count and its corresponding character string in a corresponding mapping table. The mapping table is the unshared coding table, and the corresponding character strings are preset.
For example, the mapping table generated for one copy of performance sampling data is shown in the table below; the mapping tables for the other copies are built in the same way. The character strings can be chosen from the 26 letters in upper and lower case.
Table 1
Method name | Occurrences | Character code
A | 243 | f
B | 173 | d
C | 99 | e
It should be noted that when different copies of performance sampling data contain the same method name, that method name must be given the same character string.
Of course, if the compression ratio needs to be improved further, any one copy of the performance sampling data can also be used to generate a single shared coding table. It is generated in the same way as the unshared coding tables above, so the details are not repeated here.
After the coding tables have been generated, when the encoded form of each method name in a copy of performance sampling data is needed, the lookup unit 23 can look it up in the corresponding unshared coding table. For example, if Table 1 is the coding table of the first copy of performance sampling data, then when the string code for method name A in the first copy is needed, Table 1 gives the code f.
After the encoded method names have been obtained, the first compression unit 24 performs primary compression on each unshared coding table together with the corresponding encoded method names, to obtain the method names after primary compression.
Still taking the first copy of performance sampling data as an example, suppose method name A is encoded as f and method name B is encoded as d, and the first copy contains: method A, method A, method B, method A. The method names after primary compression are then: method A->f, method B->d; f,f,d,f.
When multiple copies of performance sampling data of the same application need to be stored, the generation unit 25 serializes each method name after primary compression together with its corresponding operation data to generate a string sequence. In the string sequence, separators are added between different character strings to allow accurate retrieval. Here, the corresponding character strings are preset.
For example, suppose the method names in a copy of performance sampling data are A and B. In the first copy, the operation data corresponding to method A is 1 and the operation data corresponding to method B is 3; the string sequence is then:
method A->f, method B->d;f,d;1,3;
Similarly, if 1000 copies of performance sampling data need to be stored, a string sequence is generated for each copy in the same way. Each piece of operation data has a corresponding offset or extraction identifier; the offset can be a time offset, a sequence-number offset or an address offset.
Finally, the second compression unit 26 performs secondary compression on the string sequence to obtain a compression result, and the storage unit 27 stores the compression result in the database.
Method names generally contain from a few to a few dozen characters, so compressing them directly character by character takes up relatively more space. After string encoding with the method name as the basic element, each name is generally represented by only one to a few characters, so the compressed data occupies less space.
Take the sampling data of one live-room request as an example: 476 distinct method names appear, with 1788 occurrences in total, so each method name appears 3.8 times on average. The average method-name length weighted by occurrences is 28.6 characters. If the number of occurrences is ignored, the average length of these method names is 31.4 characters. In the serialized string, the characters of method names account for 58.0% of the overall content.
After the Huffman coding of method names described above, a method name can be represented with 8.0 bits on average. By contrast, if stored as characters, each character of a method name requires 1 byte (1 byte = 8 bits). This means that after encoding, the space occupied by a method name is only about 1/28.6 of what it was before.
Because the coding table must also be stored, each method name still has to appear once in character form. Therefore, if a method name appears twice in the sampling data, this coding saves about 50% of the character content (of the method-name part only); if it appears three times, about 64% is saved; at the average of 3.8 occurrences, nearly 70% of the character space is saved. Since the characters of method names account for 58% of the overall content, the overall saving is 58% x 70% = 40.6%. That is, in theory, this scheme can shrink the content to be compressed by more than 40% before a conventional compression method is applied.
It should be noted that the remaining 42% of the content, the operation data, is compressed in the conventional way.
When performance sampling data needs to be retrieved, the string code corresponding to the target method name can be looked up in the coding table (shared or unshared), the total operation data corresponding to that string code can then be found, and the target operation data can be extracted from the total operation data according to its offset address or extraction identifier.
Embodiment three
The present embodiment also provides a kind of computer equipment for data storage, as shown in figure 3, the computer equipment packet It includes:Radio frequency (Radio Frequency, RF) circuit 310, memory 320, input unit 330, display unit 340, voicefrequency circuit 350th, the components such as WiFi module 360, processor 370 and power supply 380.It will be understood by those skilled in the art that it is shown in Fig. 3 Computer equipment structure do not form restriction to computer equipment, can include than illustrate more or fewer components or Person combines certain components or different components arrangement.
Each component parts of computer equipment is specifically introduced with reference to Fig. 3:
RF circuits 310 can be used for sending and receiving for signal, particularly, after the downlink information of base station is received, to processing Device 370 is handled.In general, RF circuits 310 include but not limited at least one amplifier, transceiver, coupler, low noise amplification Device (Low Noise Amplifier, LNA), duplexer etc..
Memory 320 can be used for storage software program and module, and processor 370 is stored in memory 320 by operation Software program and module, so as to perform the various function application of computer equipment and data processing.Memory 320 can be led To include storing program area and storage data field, wherein, storing program area can storage program area, needed at least one function Application program etc.;Storage data field can be stored uses created data etc. according to computer equipment.In addition, memory 320 Can include high-speed random access memory, can also include nonvolatile memory, a for example, at least disk memory, Flush memory device or other volatile solid-state parts.
Input unit 330 can be used for receiving the number inputted or character information and generation and the user of computer equipment Setting and function control it is related key signals input.Specifically, input unit 330 may include keyboard 331 and other inputs Equipment 332.Keyboard 331 collects the input operation of user on it, and drives corresponding connection according to preset formula Device.Keyboard 331 gives processor 370 again after collecting output information.In addition to keyboard 331, input unit 330 can also include Other input equipments 332.Specifically, other input equipments 332 can include but is not limited to touch panel, function key (such as sound Measure control button, switch key etc.), it is trace ball, mouse, one or more in operating lever etc..
Display unit 340 can be used for display by information input by user or be supplied to the information and computer equipment of user Various menus.Display unit 340 may include display panel 341, optionally, liquid crystal display (Liquid may be used Crystal Display, LCD), the forms such as Organic Light Emitting Diode (Organic Light-Emitting Diode, OLED) Display panel 341 is configured.Further, keyboard 331 can cover display panel 341, when keyboard 331 detect it is on it or attached After near touch operation, processor 370 is sent to determine the type of touch event, is followed by subsequent processing device 370 according to incoming event Type corresponding visual output is provided on display panel 341.Although keyboard 331 and display panel 341 are conducts in figure 3 Two independent components realize the input of computer equipment and input function, but in some embodiments it is possible to by keyboard 331 is integrated with display panel 341 and that realizes computer equipment output and input function.
The audio circuit 350, the loudspeaker 351, and the microphone 352 may provide an audio interface between the user and the computer equipment. The audio circuit 350 may convert received audio data into an electrical signal and transmit it to the loudspeaker 351, which converts it into a sound signal for output.
WiFi is a short-range wireless transmission technology. Through the WiFi module 360, the computer equipment can help the user send and receive e-mail, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although Fig. 3 shows the WiFi module 360, it is understood that the module is not an essential component of the computer equipment and may be omitted as needed without changing the essence of the invention.
The processor 370 is the control center of the computer equipment. It connects the various parts of the entire computer equipment through various interfaces and lines, and it performs the various functions of the computer equipment and processes data by running or executing the software programs and/or modules stored in the memory 320 and calling the data stored in the memory 320, thereby monitoring the computer equipment as a whole. Optionally, the processor 370 may include one or more processing units. Preferably, the processor 370 may integrate an application processor, where the application processor mainly handles the operating system, user interface, application programs, and the like.
The computer equipment further includes a power supply 380 (such as a power adapter) that supplies power to all components. Preferably, the power supply may be logically connected to the processor 370 through a power management system.
The method, apparatus, and computer equipment for data storage provided in the embodiments of the present invention bring at least the following beneficial effects:
The embodiments of the present invention provide a method, apparatus, and computer equipment for data storage, applied in a live-streaming platform. The method includes: obtaining multiple copies of performance sampling data, the performance sampling data including multiple method names and corresponding run data; performing string encoding with each method name in each copy of performance sampling data as a distinct basic element, generating multiple non-shared encoding tables; looking up each encoded method name based on the multiple non-shared encoding tables; performing primary compression on the non-shared encoding tables with each encoded method name respectively, obtaining each method name after primary compression; serializing each method name after primary compression together with the run data corresponding to that method name, generating a string sequence; and performing secondary compression on the string sequence to obtain a compression result, which is stored in a database. In this way, when data is stored, the non-shared encoding tables and the encoded method names first undergo primary compression, and the resulting string sequence then undergoes secondary compression, so the data passes through two rounds of compression and the compression ratio improves. Furthermore, each method name is encoded as a whole basic element rather than character by character as in the prior art, which further reduces the data to be stored and therefore the storage space occupied.
Moreover, because encoding treats each method name as a distinct basic element and produces character strings, and character strings are readable data, the data remains readable after it is stored. When retrieving information, a user can therefore search by the corresponding character string to find the method name and then obtain the performance sampling data for that method name. In addition, because each method name is encoded as one basic element, less data has to be compressed when the method names are compressed, which also improves the compression ratio.
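The storage flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the patented implementation: the record fields `time_ms` and `calls`, the two separators, the hexadecimal codes, and the use of `zlib` for the final compression pass are all hypothetical choices, and the first compression pass is interpreted here as replacing each whole method name with its code from the sample's own table.

```python
import zlib

FIELD_SEP = ","   # hypothetical separator between the fields of one record
RECORD_SEP = ";"  # hypothetical separator between records


def build_table(sample):
    """Build a per-sample (non-shared) table: whole method name -> short code."""
    names = sorted({rec["method"] for rec in sample})
    return {name: format(i, "x") for i, name in enumerate(names)}


def compress_sample(sample):
    """Encode names, serialize records into one string, then compress it."""
    table = build_table(sample)
    # First pass (assumed interpretation): substitute each name by its code.
    records = [
        FIELD_SEP.join([table[rec["method"]], str(rec["time_ms"]), str(rec["calls"])])
        for rec in sample
    ]
    sequence = RECORD_SEP.join(records)
    # Second pass: general-purpose compression of the serialized string.
    return table, zlib.compress(sequence.encode("utf-8"))


def decompress_sample(table, blob):
    """Reverse both passes and restore the original records."""
    names = {code: name for name, code in table.items()}
    sequence = zlib.decompress(blob).decode("utf-8")
    if not sequence:
        return []
    sample = []
    for record in sequence.split(RECORD_SEP):
        code, time_ms, calls = record.split(FIELD_SEP)
        sample.append(
            {"method": names[code], "time_ms": int(time_ms), "calls": int(calls)}
        )
    return sample
```

Because the table is built per sample in this sketch, it has to be kept alongside the compressed bytes so that the records can be decoded later. The sketch also assumes that method names never contain the chosen separators.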
The algorithms and displays provided herein are not inherently related to any particular computer, virtual system, or other equipment. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such a system is apparent. Furthermore, the present invention is not directed at any particular programming language. It should be understood that the content of the invention described herein may be implemented in various programming languages, and that the description of a specific language above is given to disclose the best mode of the invention.
In the specification provided here, numerous specific details are set forth. It is to be understood, however, that embodiments of the present invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the various inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in the above description of exemplary embodiments. However, the disclosed method should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will understand that the modules in the equipment of an embodiment may be adaptively changed and arranged in one or more pieces of equipment different from that embodiment. The modules, units, or components in the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple sub-modules, sub-units, or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings), and all processes or units of any method or equipment so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The component embodiments of the present invention may be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the gateway, proxy server, or system according to embodiments of the present invention. The present invention may also be implemented as equipment or apparatus programs (for example, computer programs and computer program products) for performing part or all of the methods described herein. Such programs implementing the invention may be stored on a computer-readable storage medium or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form. It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art may design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of multiple such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order. These words may be interpreted as names.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

  1. A method for data storage, characterized in that it is applied in a live-streaming platform, the method comprising:
    obtaining multiple copies of performance sampling data, the performance sampling data comprising multiple method names and corresponding run data;
    performing string encoding with each method name in each copy of performance sampling data as a distinct basic element, generating multiple non-shared encoding tables;
    looking up each encoded method name based on the multiple non-shared encoding tables;
    performing primary compression on the non-shared encoding tables with each encoded method name respectively, obtaining each method name after primary compression;
    serializing each method name after primary compression together with the run data corresponding to each method name, generating a string sequence;
    performing secondary compression on the string sequence to obtain a compression result, and storing the compression result in a database.
  2. The method according to claim 1, characterized in that performing string encoding with each method name as a distinct basic element to generate an encoding table comprises:
    counting the quantity of each method name in the multiple copies of performance sampling data;
    assigning a unique character string to each method name in each copy of performance sampling data;
    storing each method name in each copy of performance sampling data, the quantity of each method name, and the corresponding character string in a corresponding mapping table, wherein the mapping table is the non-shared encoding table and the corresponding character strings are preset.
  3. The method according to claim 1, characterized in that, in the string sequence, separators are placed between different character strings.
  4. The method according to claim 1, characterized in that the run data comprises: the running time of each method name, the number of runs, the memory occupied during running, and the central processing unit (CPU) utilization.
  5. An apparatus for data storage, characterized in that the apparatus comprises:
    an obtaining unit, configured to obtain multiple copies of performance sampling data, the performance sampling data comprising multiple method names and corresponding run data;
    an encoding unit, configured to perform string encoding with each method name in each copy of performance sampling data as a distinct basic element, generating multiple non-shared encoding tables;
    a lookup unit, configured to look up each encoded method name based on the multiple non-shared encoding tables;
    a first compression unit, configured to perform primary compression on the non-shared encoding tables with each encoded method name respectively, obtaining each method name after primary compression;
    a generation unit, configured to serialize each method name after primary compression together with the run data corresponding to each method name, generating a string sequence;
    a second compression unit, configured to perform secondary compression on the string sequence to obtain a compression result;
    a storage unit, configured to store the compression result in a database.
  6. The apparatus according to claim 5, characterized in that the encoding unit is specifically configured to:
    count the quantity of each method name in the multiple copies of performance sampling data;
    assign a unique character string to each method name in each copy of performance sampling data;
    store each method name in each copy of performance sampling data, the quantity of each method name, and the corresponding character string in a corresponding mapping table, wherein the mapping table is the non-shared encoding table and the corresponding character strings are preset.
  7. The apparatus according to claim 5, characterized in that, in the string sequence, separators are placed between different character strings.
  8. The apparatus according to claim 5, characterized in that the run data comprises: the running time of each method name, the number of runs, the memory occupied during running, and the central processing unit (CPU) utilization.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, is able to perform the method according to any one of claims 1 to 4.
  10. Computer equipment for data storage, characterized in that it comprises:
    at least one processor; and
    at least one memory communicatively connected to the processor, wherein
    the memory stores program instructions executable by the processor, and the processor, by calling the program instructions, is able to perform the method according to any one of claims 1 to 4.
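
As an illustration of the encoding step recited in claims 2 and 6, the following Python sketch builds one mapping table per sample from a preset list of character strings. It is a hedged example rather than the claimed implementation: the function name `build_mapping_table`, the row layout, and the choice to hand the earliest preset strings to the most frequent method names are assumptions that do not come from the claims.

```python
from collections import Counter


def build_mapping_table(method_names, preset_codes):
    """Return one non-shared table: a row per distinct method name.

    method_names: every method name observed in ONE sample (with repeats).
    preset_codes: preset unique strings, handed out in the given order.
    """
    counts = Counter(method_names)
    if len(set(preset_codes)) != len(preset_codes):
        raise ValueError("preset codes must be unique")
    if len(counts) > len(preset_codes):
        raise ValueError("not enough preset codes for this sample")
    # Assumption: more frequent names receive earlier codes; ties are
    # broken alphabetically so the table is deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [
        {"method": name, "count": count, "code": code}
        for (name, count), code in zip(ordered, preset_codes)
    ]
```

Calling the function once per sample yields independent tables, so the same preset string may stand for different method names in different samples.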
CN201810014959.9A 2018-01-08 2018-01-08 Method and device for data storage and computer equipment Active CN108233942B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810014959.9A CN108233942B (en) 2018-01-08 2018-01-08 Method and device for data storage and computer equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810014959.9A CN108233942B (en) 2018-01-08 2018-01-08 Method and device for data storage and computer equipment

Publications (2)

Publication Number Publication Date
CN108233942A true CN108233942A (en) 2018-06-29
CN108233942B CN108233942B (en) 2022-02-22

Family

ID=62645456

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810014959.9A Active CN108233942B (en) 2018-01-08 2018-01-08 Method and device for data storage and computer equipment

Country Status (1)

Country Link
CN (1) CN108233942B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4955066A (en) * 1989-10-13 1990-09-04 Microsoft Corporation Compressing and decompressing text files
WO1995019662A1 (en) * 1994-01-13 1995-07-20 Telco Systems, Inc. Data compression apparatus and method
CN102122960A (en) * 2011-01-18 2011-07-13 西安理工大学 Multi-character combination lossless data compression method for binary data
US20130019029A1 (en) * 2011-07-13 2013-01-17 International Business Machines Corporation Lossless compression of a predictive data stream having mixed data types
CN103138764A (en) * 2011-11-22 2013-06-05 上海麦杰科技股份有限公司 Method and system for lossless compression of real-time data
JP2016134808A (en) * 2015-01-20 2016-07-25 富士通株式会社 Data compression program, data decompression program, data compression device, and data decompression device
CN106503003A (en) * 2015-09-06 2017-03-15 阿里巴巴集团控股有限公司 A kind of compression of expandable mark language XML document, decompressing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
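Xue Xiangyang (薛向阳): "Analysis and Research on Text File Compression Based on Huffman Coding" (基于哈夫曼编码的文本文件压缩分析与研究), Science Technology and Engineering (《科学技术与工程》) *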

Also Published As

Publication number Publication date
CN108233942B (en) 2022-02-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant