CN103761316B - Data compression storage method and device based on a sparse matrix


Info

Publication number
CN103761316B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410037979.XA
Other languages
Chinese (zh)
Other versions
CN103761316A (en
Inventor
刘道新
胡航海
张健
徐秀敏
张启伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Beijing Guodiantong Network Technology Co Ltd
Beijing China Power Information Technology Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Zhejiang Electric Power Co Ltd
Beijing China Power Information Technology Co Ltd
Application filed by State Grid Corp of China SGCC, State Grid Zhejiang Electric Power Co Ltd, Beijing China Power Information Technology Co Ltd
Priority to CN201410037979.XA
Publication of CN103761316A
Priority to CA2871435A (CA2871435C)
Application granted
Publication of CN103761316B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608: Saving storage space on storage systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22: Indexing; Data structures therefor; Storage structures

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a data compression storage method and device based on a sparse matrix. The method stores the attribute data of each first data record separately from its value data: the first storage file stores the attribute data of the first data records and the attribute identifier corresponding to each attribute data item, the attribute identifiers corresponding to a record's attribute data serve as the dimension data that identifies that record, and the second storage file stores only the dimension data and the single value data item. The prior art stores a sparse matrix using five storage fields per node, with the actual data values held in the data value field; because the data values of different nodes contain a large amount of identical data, that identical data is stored repeatedly and storage space is wasted. The technical solution provided by this application improves the degree of data sharing, saves storage space, and enhances the dynamic scalability of the data.

Description

Data compression storage method and device based on a sparse matrix
Technical field
This application relates to computer data storage technology, and in particular to a data compression storage method and device based on a sparse matrix.
Background
In recent years, enterprise informatization has steadily matured, and the business data produced by various business applications has grown explosively. Analysis of this mass data shows, however, that its structure has the characteristics of a sparse matrix, that is, a matrix in which most data elements are 0. Storing the large number of 0 elements when saving a sparse matrix wastes storage space, so a technique is needed to compress sparse matrix data for storage.
At present, one way to compress and store such sparse matrix data is the orthogonal (cross) linked list. In this method, each non-zero element of the sparse matrix is stored in a node, and each node needs five storage fields that hold a row value, a column value, a data value, a row pointer, and a column pointer. The row value and column value give the node's row and column position in the sparse matrix, and the row pointer and column pointer point to the next non-zero element in the same row and in the same column, respectively. In addition, a header node must be established for every row and every column; each row header node and column header node stores a pointer to the first node of that row or column.
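For comparison with the method proposed later, the five-field node described above can be sketched as follows. This is a minimal illustration rather than code from the patent; the names `CrossNode`, `right`, `down`, and `build_cross_list` are assumptions chosen for the sketch, and the matrix positions are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CrossNode:
    """One non-zero element, with the five storage fields of the prior art."""
    row: int
    col: int
    value: float
    right: Optional["CrossNode"] = None  # row pointer: next non-zero in this row
    down: Optional["CrossNode"] = None   # column pointer: next non-zero in this column


def build_cross_list(n_rows, n_cols, entries):
    """Link (row, col, value) triples into per-row and per-column chains."""
    row_heads = [None] * n_rows
    col_heads = [None] * n_cols
    row_tails = [None] * n_rows
    col_tails = [None] * n_cols
    # Sorting by (row, col) keeps every chain in ascending order.
    for row, col, value in sorted(entries):
        node = CrossNode(row, col, value)
        if row_tails[row] is None:
            row_heads[row] = node
        else:
            row_tails[row].right = node
        row_tails[row] = node
        if col_tails[col] is None:
            col_heads[col] = node
        else:
            col_tails[col].down = node
        col_tails[col] = node
    return row_heads, col_heads


row_heads, col_heads = build_cross_list(
    3, 3, [(0, 0, 1000.0), (1, 1, 2000.0), (2, 1, 3000.0)]
)
```

Every non-zero element carries two pointers in addition to its position and value, which is the per-node overhead the background section refers to.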
In this storage scheme, each node contains five storage fields and the actual data values are held in the data value field. Because the data values of different nodes contain a large amount of identical data, this scheme wastes storage space.
Summary of the invention
In view of this, this application provides a data compression storage method and device based on a sparse matrix, to solve the technical problem that the prior-art sparse matrix compression scheme wastes storage space: each node contains five storage fields, the actual data values are held in the data value field, and the data values of different nodes contain a large amount of identical data. The technical solution provided by this application is as follows:
A data compression storage method based on a sparse matrix, including:
Receiving a data set containing multiple first data records, where each first data record contains multiple attribute data items and one value data item, each of the attribute data items corresponds to a different attribute, and all first data records have the same set of attributes;
Generating a first storage file according to the attributes, where the first storage file contains the attribute data belonging to each attribute and the attribute identifier corresponding to each attribute data item;
Determining the attribute identifier corresponding to each attribute data item in each first data record, and combining those attribute identifiers to generate the dimension data of that record;
Taking the dimension data and the value data of each first data record as one data tuple to be stored;
Generating a second storage file from the data tuples to be stored.
Preferably, in the above method, generating the first storage file according to the attributes includes:
Determining the attribute data corresponding to each attribute;
Determining, according to the type of each attribute, the attribute elements corresponding to that attribute;
Judging whether the attribute elements are identical to the attribute data; if so, generating the attribute identifier corresponding to each attribute data item and storing the attribute data and their attribute identifiers in the generated first storage file;
If not, generating the element identifier corresponding to each attribute element and storing the attribute elements and their element identifiers in the generated first storage file.
Preferably, in the above method, the first storage file and the second storage file are data tables.
Preferably, in the above method, after the second storage file is generated from the data tuples to be stored, the method further includes:
Receiving a second data record containing multiple attribute data items and one value data item, where the attributes corresponding to the attribute data in the second data record are the same as the attributes of the first data records;
Obtaining the attribute identifier corresponding to each attribute data item, and combining those attribute identifiers to generate the dimension data of the second data record;
Taking the dimension data and the value data of the second data record as one data tuple to be stored;
Adding the data tuple to be stored to the second storage file.
Preferably, after the above method, the method further includes:
Determining a classified query rule according to the attribute data corresponding to each attribute in the first storage file;
Searching the second storage file, according to the classified query rule, for the target dimension data that matches the rule;
Displaying the attribute data corresponding to the target dimension data together with the value data corresponding to the target dimension data.
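The query steps above might look like the following sketch. It assumes, purely for illustration, that a classified query rule is a mapping from attribute name to the wanted attribute data, that dimension data is a comma-joined string in a fixed attribute order, and that the two storage files are held in memory as `FIRST_STORAGE` and `SECOND_STORAGE`; none of these names or formats come from the patent.

```python
ATTRIBUTE_ORDER = ("unit", "time", "indicator")  # assumed fixed order

# Assumed in-memory form of the first storage file: attribute data -> identifier.
FIRST_STORAGE = {
    "unit": {"A": "A", "B": "B", "C": "C"},
    "time": {"2013-01": "T1", "2013-02": "T2"},
    "indicator": {"power consumption": "Elec"},
}

# Assumed in-memory form of the second storage file: (dimension data, value data).
SECOND_STORAGE = [("A,T1,Elec", 1000), ("B,T2,Elec", 2000), ("C,T2,Elec", 3000)]


def query(rule):
    """Return (attribute data, value data) for tuples matching the rule."""
    # Translate the rule from attribute data to attribute identifiers.
    wanted = {attr: FIRST_STORAGE[attr][data] for attr, data in rule.items()}
    # Reverse lookup used to display attribute data instead of identifiers.
    reverse = {
        attr: {ident: data for data, ident in FIRST_STORAGE[attr].items()}
        for attr in ATTRIBUTE_ORDER
    }
    results = []
    for dimension, value in SECOND_STORAGE:
        ids = dict(zip(ATTRIBUTE_ORDER, dimension.split(",")))
        if all(ids[attr] == ident for attr, ident in wanted.items()):
            shown = {attr: reverse[attr][ids[attr]] for attr in ATTRIBUTE_ORDER}
            results.append((shown, value))
    return results


february = query({"time": "2013-02"})
```

Here `february` holds the records for units B and C, each shown with its original attribute data and its value.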
This application also provides a data compression storage device based on a sparse matrix, including:
A data set receiving unit, configured to receive a data set containing multiple first data records, where each first data record contains multiple attribute data items and one value data item, each of the attribute data items corresponds to a different attribute, and all first data records have the same set of attributes;
A first storage file generation unit, configured to generate a first storage file according to the attributes, where the first storage file contains the attribute data belonging to each attribute and the attribute identifier corresponding to each attribute data item;
A dimension data generation unit, configured to determine the attribute identifier corresponding to each attribute data item in each first data record, and to combine those attribute identifiers to generate the dimension data of that record;
A to-be-stored data tuple determination unit, configured to take the dimension data and the value data of each first data record as one data tuple to be stored;
A second storage file generation unit, configured to generate a second storage file from the data tuples to be stored.
Preferably, in the above device, the first storage file generation unit includes:
An attribute data determination subunit, configured to determine the attribute data corresponding to each attribute;
An attribute element determination subunit, configured to determine, according to the type of each attribute, the attribute elements corresponding to that attribute;
A judgment subunit, configured to judge whether the attribute elements are identical to the attribute data; if so, to trigger the first result subunit; if not, to trigger the second result subunit;
A first result subunit, configured to generate the attribute identifier corresponding to each attribute data item, and to store the attribute data and their attribute identifiers in the generated first storage file;
A second result subunit, configured to generate the element identifier corresponding to each attribute element, and to store the attribute elements and their element identifiers in the generated first storage file.
Preferably, in the above device, the first storage file generated by the first storage file generation unit and the second storage file generated by the second storage file generation unit are data tables.
Preferably, the above device further includes:
A second data receiving unit, configured to receive a second data record containing multiple attribute data items and one value data item, where the attributes corresponding to the attribute data in the second data record are the same as the attributes of the first data records;
A second data dimension data generation unit, configured to obtain the attribute identifier corresponding to each attribute data item, and to combine those attribute identifiers to generate the dimension data of the second data record;
A second to-be-stored data tuple determination unit, configured to take the dimension data and the value data of the second data record as one data tuple to be stored;
A second data adding unit, configured to add the data tuple to be stored to the second storage file.
Preferably, the above device further includes:
A rule determination unit, configured to determine a classified query rule according to the attribute data corresponding to each attribute in the first storage file;
A data search unit, configured to search the second storage file, according to the classified query rule, for the target dimension data that matches the rule;
A data display unit, configured to display the attribute data corresponding to the target dimension data together with the value data corresponding to the target dimension data.
As can be seen from the above technical solution, compared with the prior art, this application provides a data compression storage method and device based on a sparse matrix. The method stores the attribute data of the first data records separately from their value data: the first storage file stores the attribute data and the attribute identifier corresponding to each attribute data item, and the second storage file stores, for each first data record, the combination of the attribute identifiers corresponding to its attribute data together with its value data. The prior art stores sparse matrix data using five storage fields, with the actual data values held in the data value field; because the data values of different nodes contain a large amount of identical data, that identical data is stored repeatedly and storage space is wasted. This application stores the attribute data of the first data records separately and uses the attribute identifiers corresponding to the attribute data as the dimension data that identifies each record, which improves the degree of data sharing and effectively saves storage space.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow chart of one embodiment of the data compression storage method based on a sparse matrix provided by this application;
Fig. 2 is a partial flow chart of another embodiment of the data compression storage method based on a sparse matrix provided by this application;
Fig. 3 is an example diagram for the present embodiment provided by this application;
Fig. 4 is a partial flow chart of yet another embodiment of the data compression storage method based on a sparse matrix provided by this application;
Fig. 5 is a partial flow chart of yet another embodiment of the data compression storage method based on a sparse matrix provided by this application;
Fig. 6 is a structural diagram of one embodiment of the data compression storage device based on a sparse matrix provided by this application;
Fig. 7 is a partial structural diagram of another embodiment of the data compression storage device based on a sparse matrix provided by this application.
Detailed description of the embodiments
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of this application.
Referring to Fig. 1, which shows a flow chart of one embodiment of the data compression storage method based on a sparse matrix provided by this application, the embodiment can include:
Step 101: Receive a data set containing multiple first data records, where each first data record contains multiple attribute data items and one value data item, each of the attribute data items corresponds to a different attribute, and all first data records have the same set of attributes.
The data set containing multiple first data records can be regarded as a sparse matrix, and the first data records in the data set are element data of that sparse matrix. Note that a sparse matrix is a matrix in which most data elements are 0. The value data of the first data records contained in the data set is not 0.
The first data can be, but is not limited to, business data generated in an industrial field, and each first data record contains multiple attribute data items and one value data item. For example, the first data is the power consumption data of organizational units such as companies over a certain period, specifically: unit A consumed 1000 kWh in January 2013, unit B consumed 2000 kWh in February 2013, and unit C consumed 3000 kWh in February 2013.
This example contains three first data records. Unit A, January 2013, and power consumption are the three attribute data items of the first record; unit B, February 2013, and power consumption are the three attribute data items of the second record; unit C, February 2013, and power consumption are the three attribute data items of the third record. The value data of the first record is 1000 kWh, that of the second record is 2000 kWh, and that of the third record is 3000 kWh.
In addition, the three attribute data items of each first data record in the example correspond to different attributes. Specifically, unit A, unit B, and unit C correspond to the unit attribute; January 2013, February 2013, and February 2013 correspond to the time attribute; and power consumption corresponds to the indicator attribute.
Note that all first data records have the same set of attributes. For example, the first, second, and third records in the example all have the unit attribute, the time attribute, and the indicator attribute.
Step 102: Generate a first storage file according to the attributes, where the first storage file contains the attribute data belonging to each attribute and the attribute identifier corresponding to each attribute data item.
Here, an attribute identifier must be generated for each attribute data item, and the correspondence between each attribute data item and its attribute identifier must be established. For example, the three first data records in the example of step 101 have the unit attribute, the time attribute, and the indicator attribute. The attribute data of the first record is unit A, January 2013, and power consumption; the attribute identifier generated for unit A is A, the one generated for January 2013 is T1, and the one generated for power consumption is Elec. Unit A with its identifier A, January 2013 with its identifier T1, and power consumption with its identifier Elec are then stored in the first storage file.
Note that identical attribute data is not stored repeatedly in the first storage file; identical attribute data is represented by the same attribute identifier. For example, the attribute data of the second record in the example is unit B, February 2013, and power consumption, and the attribute data of the third record is unit C, February 2013, and power consumption. The attribute data shared by the two records uses shared identifiers: February 2013 uses the same attribute identifier T2, and power consumption uses the same attribute identifier Elec.
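Step 102 can be sketched as building one lookup table per attribute in which each distinct attribute data item appears once. The record layout, the function names, and the identifier scheme in `make_identifier` are assumptions made for this sketch; the patent does not specify how identifiers such as T1 or Elec are derived.

```python
INDICATOR_IDS = {"power consumption": "Elec"}  # assumed naming table


def make_identifier(attribute, data):
    """Hypothetical scheme that reproduces the identifiers A, T1, T2, Elec."""
    if attribute == "time":            # "2013-01" -> "T1"
        return "T" + str(int(data.split("-")[1]))
    if attribute == "indicator":
        return INDICATOR_IDS[data]
    return data                        # unit "A" -> "A"


def build_first_storage(records):
    """Map each distinct attribute data item to one identifier, per attribute."""
    first_storage = {}
    for record in records:
        for attribute, data in record["attributes"].items():
            table = first_storage.setdefault(attribute, {})
            if data not in table:      # identical attribute data is stored once
                table[data] = make_identifier(attribute, data)
    return first_storage


records = [
    {"attributes": {"unit": "A", "time": "2013-01",
                    "indicator": "power consumption"}, "value": 1000},
    {"attributes": {"unit": "B", "time": "2013-02",
                    "indicator": "power consumption"}, "value": 2000},
    {"attributes": {"unit": "C", "time": "2013-02",
                    "indicator": "power consumption"}, "value": 3000},
]
first_storage = build_first_storage(records)
```

Although the three records carry nine attribute data items in total, the resulting tables hold only six entries, because February 2013 and power consumption are shared.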
Step 103: Determine the attribute identifier corresponding to each attribute data item in each first data record, and combine those attribute identifiers to generate the dimension data of that record.
Each first data record is parsed to determine the attribute data it contains. Using the correspondence between attribute data and attribute identifiers saved in the first storage file in step 102, the attribute identifier of each attribute data item is determined, and the attribute identifiers are then combined to generate the dimension data of the record. For example, the three attribute data items of the first record in the example of step 101 are unit A, January 2013, and power consumption; according to the correspondence generated in step 102, the dimension data generated for that record is A, T1, Elec.
Note that the dimension data of a first data record can be any permutation of the attribute identifiers corresponding to that record. For example, the attribute identifiers of the first record can be combined as any one of A, T1, Elec; A, Elec, T1; T1, Elec, A; T1, A, Elec; Elec, A, T1; or Elec, T1, A.
Step 104: Take the dimension data and the value data of each first data record as one data tuple to be stored.
Each first data record contains multiple attribute data items and one value data item. The attribute data items can be represented by the dimension data generated for the record in step 103, so each record can be represented by the combination of its dimension data and its value data, and that combination is taken as the data tuple to be stored for the record. For example, the dimension data of the first record in the example of step 101 can be A, T1, Elec and its value data is 1000, so the record can be expressed as A, T1, Elec and 1000, which is taken as its data tuple to be stored.
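Steps 103 and 104 can be sketched as a lookup followed by a join. The text allows any permutation of the identifiers; this sketch assumes one fixed order (`ATTRIBUTE_ORDER`) and a comma-joined string so that all tuples are directly comparable. Those choices, and the names used, are assumptions rather than part of the patent.

```python
ATTRIBUTE_ORDER = ("unit", "time", "indicator")  # assumed fixed permutation

# Assumed in-memory form of the first storage file from step 102.
FIRST_STORAGE = {
    "unit": {"A": "A", "B": "B", "C": "C"},
    "time": {"2013-01": "T1", "2013-02": "T2"},
    "indicator": {"power consumption": "Elec"},
}


def to_stored_tuple(record):
    """Step 103: build the dimension data; step 104: pair it with the value."""
    identifiers = [
        FIRST_STORAGE[attribute][record["attributes"][attribute]]
        for attribute in ATTRIBUTE_ORDER
    ]
    return (",".join(identifiers), record["value"])


first_record = {
    "attributes": {"unit": "A", "time": "2013-01",
                   "indicator": "power consumption"},
    "value": 1000,
}
stored = to_stored_tuple(first_record)
```

For the first record of the example, `stored` is the pair of the dimension data `A,T1,Elec` and the value 1000.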
Step 105: Generate a second storage file from the data tuples to be stored.
Note that the second storage file stores the data tuple to be stored for each first data record in the data set.
In the prior art, sparse matrix data is stored using five storage fields, with the actual data values (attribute data and value data) held in the data value field. Because the data values of different nodes contain a large amount of identical data, mainly multiple identical attribute data items, that identical data is stored repeatedly, which wastes storage space.
With the above technical solution, this embodiment provides a data compression storage method based on a sparse matrix. The method stores the attribute data of the first data records separately from their value data: the first storage file stores the attribute data and the attribute identifier corresponding to each attribute data item, and identical attribute data uses the same attribute identifier, which effectively removes redundant duplicate data, improves the degree of data sharing, and saves storage space. The second storage file contains only two storage fields, which store the combination of attribute identifiers corresponding to a record's attribute data and the record's value data, respectively; compared with the five storage fields used for sparse matrix data in the prior art, this saves further storage space.
Note that the data compression storage method provided by this application is applicable to, but not limited to, sparse matrices; it is equally applicable to storing non-sparse matrix data.
Note also that the attribute data in each first data record of the above embodiment can have one or more attribute items. For example, the attribute data January 2013 contains two attribute items: 2013 belongs to the year attribute item and January belongs to the month attribute item. Correspondingly, the attribute identifier of that attribute data must reflect the attribute item identifier of each attribute item. For example, the attribute identifier corresponding to January 2013 is T1, where T can be regarded as corresponding to the year attribute item and 1 to the month attribute item.
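A composite identifier of this kind could be assembled and taken apart as in the sketch below. The table `YEAR_ITEM_IDS` and both function names are hypothetical; the text only states that T corresponds to the year item and 1 to the month item, and does not say how other years would be encoded.

```python
YEAR_ITEM_IDS = {2013: "T"}  # assumed: one letter per year attribute item


def time_identifier(year, month):
    """Concatenate the year item identifier and the month item identifier."""
    return YEAR_ITEM_IDS[year] + str(month)


def split_time_identifier(identifier):
    """Recover (year, month) from a composite identifier such as 'T1'."""
    year_by_id = {item_id: year for year, item_id in YEAR_ITEM_IDS.items()}
    return year_by_id[identifier[0]], int(identifier[1:])


january = time_identifier(2013, 1)
```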
The first storage file can be a data table made up of multiple data cells. Each data cell has two parts: one part stores an attribute data item of the first data records, and the other part stores the attribute identifier corresponding to that attribute data item. For example, one data cell stores unit A in one part and the attribute identifier A in the other; another data cell stores unit B and the attribute identifier B; and another data cell stores unit C and the attribute identifier C.
The second storage file can also be a data table made up of multiple data cells. Each data cell has two parts: one part stores the dimension data of a first data record, and the other part stores the value data of that record. For example, the first record in step 101 is stored in one data cell, one part of which stores A, T1, Elec and the other part of which stores 1000.
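If both storage files are realized as database tables, the layout might look like the following sketch using Python's built-in `sqlite3` module. The table and column names are assumptions, and the extra `attribute` column in the first table is an addition made here to keep identifiers of different attributes apart; the text itself describes only two parts per cell.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# First storage file: attribute data and its identifier.
conn.execute(
    "CREATE TABLE first_storage ("
    " attribute TEXT, attribute_data TEXT, identifier TEXT,"
    " PRIMARY KEY (attribute, attribute_data))"
)
# Second storage file: only two fields, dimension data and value data.
conn.execute(
    "CREATE TABLE second_storage (dimension_data TEXT PRIMARY KEY, value REAL)"
)

conn.executemany(
    "INSERT INTO first_storage VALUES (?, ?, ?)",
    [
        ("unit", "unit A", "A"),
        ("unit", "unit B", "B"),
        ("unit", "unit C", "C"),
        ("time", "January 2013", "T1"),
        ("time", "February 2013", "T2"),
        ("indicator", "power consumption", "Elec"),
    ],
)
conn.executemany(
    "INSERT INTO second_storage VALUES (?, ?)",
    [("A,T1,Elec", 1000), ("B,T2,Elec", 2000), ("C,T2,Elec", 3000)],
)

rows = conn.execute(
    "SELECT value FROM second_storage WHERE dimension_data = ?", ("A,T1,Elec",)
).fetchall()
first_count = conn.execute("SELECT COUNT(*) FROM first_storage").fetchone()[0]
```

The primary key on the first table enforces the rule that identical attribute data is not stored twice.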
The above embodiment can be used to compress and store a sparse matrix, in which most data elements are 0. To allow the sparse matrix to be restored for display, see Fig. 2, which shows a partial flow chart of another embodiment of the data compression storage method based on a sparse matrix provided by this application. Step 102 of the above embodiment can be implemented as follows:
Step 201: Determine the attribute data corresponding to each attribute.
The first data records in the data set received in step 101 have the same set of attributes, and each record contains the attribute data corresponding to each attribute. For example, every first data record has the time attribute, and the attribute data corresponding to the time attribute across the records is January 2013, February 2013, March 2013, April 2013, May 2013, June 2013, July 2013, August 2013, September 2013, October 2013, and November 2013.
Step 202: Determine, according to the type of each attribute, the attribute elements corresponding to that attribute.
Attributes are of a first type or a second type. The attribute elements of a first-type attribute are fixed. For example, the attribute elements corresponding to the time attribute are January 2013 through December 2013, that is, January 2013, February 2013, and so on up to December 2013.
The attribute elements of a second-type attribute are not fixed; they depend on the attribute data contained in the first data records of the data set received in step 101. Specifically, each distinct attribute data item serves as an attribute element of the attribute. For example, the second-type attributes include the unit attribute; the distinct attribute data corresponding to the unit attribute in the data set received in step 101 is unit A, unit B, and unit C, so the attribute elements corresponding to the unit attribute are unit A, unit B, and unit C.
Step 203: Judge whether the attribute elements are identical to the attribute data; if so, perform step 204; if not, perform step 205.
Here, the attribute elements are those determined in step 202, and the attribute data is that determined in step 201. Identical means both that the number of items is the same and that the content is the same.
For example, the attribute elements determined in step 202 for the time attribute are January 2013 through December 2013, but the attribute data determined in step 201 for the time attribute lacks December 2013, so the judgment result is no. The attribute elements determined in step 202 for the unit attribute are identical to the attribute data determined in step 201, namely unit A, unit B, and unit C, so the judgment result is yes.
Step 204: Generate the attribute identifier corresponding to each attribute data item, and store the attribute data and their attribute identifiers in the generated first storage file.
Step 205: Generate the element identifier corresponding to each attribute element, and store the attribute elements and their element identifiers in the generated first storage file.
For example, the element identifiers generated for the time elements are T1, T2, T3, and so on up to T12. A judgment result of no in step 203 indicates that some matrix elements are 0 in the sparse matrix represented by the data set received in step 101; for example, the power consumption of unit A in December 2013 is 0, that of unit B in December 2013 is 0, and that of unit C in December 2013 is 0.
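Steps 201 to 205 can be sketched for a single attribute as follows. The type labels `"fixed"` and `"derived"` stand in for the first and second attribute types, and the function name, parameters, and identifier lambdas are assumptions made for this illustration.

```python
def first_storage_entries(attribute_type, observed_data, fixed_elements, make_id):
    """Return (judgment result, entries to store) for one attribute."""
    # Step 201: distinct attribute data seen in the data set, in order.
    observed = list(dict.fromkeys(observed_data))
    # Step 202: fixed elements for the first type, observed data for the second.
    elements = list(fixed_elements) if attribute_type == "fixed" else observed
    # Step 203: identical means same count and same content.
    same = len(elements) == len(observed) and set(elements) == set(observed)
    # Step 204 stores the attribute data; step 205 stores the attribute elements.
    source = observed if same else elements
    return same, {item: make_id(item) for item in source}


months = [f"2013-{m:02d}" for m in range(1, 13)]

# Time attribute: December 2013 never occurs in the received data.
same_time, time_entries = first_storage_entries(
    "fixed", months[:11], months, lambda m: "T" + str(int(m[-2:]))
)

# Unit attribute: elements are derived from the data, so they always match.
same_unit, unit_entries = first_storage_entries(
    "derived", ["A", "B", "C", "B"], [], lambda u: u
)
```

For the time attribute the judgment is no, so all twelve elements T1 to T12 are stored, including T12 for the month that has no data; this is what later allows zero elements to be shown.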
With the above technical solution, this embodiment stores the zero elements of the sparse matrix by storing the element identifiers of the sparse matrix's attribute elements. For example, see Fig. 3, which shows an example diagram for this embodiment. Each data cell in the figure has two parts: 301 is the dimension data part and 302 is the value data part. If part 302 is null, the corresponding matrix element in the sparse matrix is 0. Data cells whose part 302 is null need not be stored in the second storage file; only the attribute elements and their element identifiers need to be stored in the first storage file.
Specifically, the first storage file corresponding to the figure stores X1, X2, X3, Y1, Y2, Y3, and Y4, along with the attribute data or attribute element corresponding to each identifier. To display the null element corresponding to X1Y3, only the attribute data or attribute elements corresponding to X1Y3 need to be queried. For example, the attribute element corresponding to X1 is December 2013 and the attribute data corresponding to Y3 is unit C, so the data value for unit C in December 2013 can be displayed as 0.
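Restoring the matrix for display can be sketched as a lookup that falls back to 0 when a dimension key is absent from the second storage file. Only the X1 and Y3 entries below come from the text; every other mapping and all stored values are hypothetical placeholders, since Fig. 3 itself is not reproduced here.

```python
X_IDS = ["X1", "X2", "X3"]
Y_IDS = ["Y1", "Y2", "Y3", "Y4"]

# First storage file: identifier -> attribute element or attribute data.
FIRST_STORAGE = {
    "X1": "December 2013",   # from the text
    "X2": "January 2013",    # hypothetical
    "X3": "February 2013",   # hypothetical
    "Y1": "unit A",          # hypothetical
    "Y2": "unit B",          # hypothetical
    "Y3": "unit C",          # from the text
    "Y4": "unit D",          # hypothetical
}

# Second storage file: only cells with a non-null value part are stored.
SECOND_STORAGE = {"X2Y1": 1000, "X3Y2": 2000, "X3Y3": 3000}  # hypothetical values


def lookup(x_id, y_id):
    """Return the displayed labels and value; missing cells read as 0."""
    value = SECOND_STORAGE.get(x_id + y_id, 0)
    return FIRST_STORAGE[x_id], FIRST_STORAGE[y_id], value


def restore_matrix():
    """Rebuild the full matrix, one row per X identifier."""
    return [[SECOND_STORAGE.get(x + y, 0) for y in Y_IDS] for x in X_IDS]


cell = lookup("X1", "Y3")
matrix = restore_matrix()
```

The twelve-cell matrix is rebuilt from three stored tuples plus the seven identifier entries, without any zero ever being written to the second storage file.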
When data needs to be inserted into the data set, refer to Fig. 4, which shows a partial flowchart of another embodiment of the sparse-matrix-based data compression storage method provided by this application. After step 105 of the above embodiment, the method may further include:
Step 401: Receive second data that includes multiple items of attribute data and one item of value data, where the attributes corresponding to the attribute data in the second data are the same as the attributes of the first data.
Here, the first data are the first data contained in the data set received in step 101. For example, the attributes of the second data are a unit attribute, a time attribute and an indicator attribute, the same as the unit, time and indicator attributes of the first data.
Step 402: Obtain the attribute identifier corresponding to each item of attribute data, and combine the attribute identifiers to generate the dimension data of the second data.
Parse the second data to obtain each item of attribute data it contains, and determine the attribute identifier corresponding to each item. For example, the second data is "unit D consumed 3000 kWh in April 2013". The attribute data it contains are unit D, April 2013 and power consumption, and the value data is 3000. The attribute identifiers corresponding to these items are D, T4 and Elec respectively. Combining these attribute identifiers yields the dimension data of the second data, for example D, T4, Elec.
Step 403: Determine the dimension data and value data of the second data as one data tuple to be stored.
Step 404: Add the data tuple to be stored to the second storage file.
The tuple may be added at any position in the second storage file: it may be inserted before or after any stored data tuple, or appended directly to the end of the file.
In the prior art, matrix data are stored as an orthogonal list. To insert a data tuple, the position given by its row identifier and column identifier must first be located and the tuple inserted there, so the insertion position is fixed. This embodiment, by contrast, allows a data tuple to be added at any position in the second storage file.
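Steps 401 to 404 can be sketched as below. This is a minimal illustration that assumes the second storage file is an in-memory Python list and that the identifier mapping already exists; the names `identifier_of` and `add_record` and the pre-existing tuple for unit A are hypothetical.

```python
# Assumed reverse view of the first storage file: attribute data -> identifier.
identifier_of = {
    "Unit D": "D",
    "April 2013": "T4",
    "Power consumption": "Elec",
}

# Second storage file as a list of (dimension data, value data) tuples.
second_storage = [(("A", "T1", "Elec"), 1000)]  # assumed existing content


def add_record(attribute_data, value, position=None):
    """Encode one record and place it anywhere in the second storage file."""
    dimension = tuple(identifier_of[item] for item in attribute_data)  # step 402
    data_tuple = (dimension, value)                                     # step 403
    if position is None:
        second_storage.append(data_tuple)            # step 404: append at the end
    else:
        second_storage.insert(position, data_tuple)  # step 404: any other position
    return data_tuple


new_tuple = add_record(["Unit D", "April 2013", "Power consumption"], 3000)
```

Since each tuple carries its own dimension data, its position in the list has no meaning, which is why the sketch can append or insert freely without first locating a row and column.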
When some matrix data need to be looked up among the stored items of the sparse matrix, refer to Fig. 5, which shows a partial flowchart of yet another embodiment of the sparse-matrix-based data compression storage method provided by this application. After step 105 of the above embodiment, the method may further include:
Step 501: Determine a classified query rule according to the attribute data corresponding to each attribute in the first storage file.
The attribute data in the first storage file correspond to different attributes, for example the time attribute, the unit attribute and the indicator attribute. When a query is to be made on a particular attribute, the classified query rule can be determined from the attribute data of that attribute in the first storage file. For example, the attribute data of the time attribute include January 2013, February 2013, March 2013, April 2013, May 2013 and so on. A rule can then query the power consumption of every unit in the first quarter of 2013, that is, January to March 2013.
Step 502: According to the classified query rule, search the second storage file for the target dimension data corresponding to the query rule.
Specifically, determine the attribute identifiers covered by the classified query rule, traverse the dimension data of every stored data tuple in the second storage file, and treat any dimension data containing one of those attribute identifiers as target dimension data. For example, if the attribute identifiers of the classified query rule are T1, T2 and T3, the target dimension data found are (A, T1, Elec), (B, T2, Elec) and (B, T3, Elec).
Step 503: Display the attribute data corresponding to the target dimension data and the value data corresponding to the target dimension data.
Determine the attribute data and the value data corresponding to the target dimension data, and display both. For example, continuing the example in step 502, the displayed content is: unit A consumed 1000 kWh in January 2013, unit B consumed 2000 kWh in February 2013, and unit B consumed 2000 kWh in March 2013.
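The query in steps 501 to 503 amounts to a scan of the second storage file followed by decoding through the first storage file. The Python sketch below reuses the identifiers and values from the example; the data structures and the function name `query` are assumptions for illustration.

```python
# First storage file: identifier -> attribute data.
first_storage = {
    "A": "Unit A", "B": "Unit B", "D": "Unit D",
    "T1": "January 2013", "T2": "February 2013",
    "T3": "March 2013", "T4": "April 2013",
    "Elec": "power consumption",
}

# Second storage file: (dimension data, value data) tuples.
second_storage = [
    (("A", "T1", "Elec"), 1000),
    (("B", "T2", "Elec"), 2000),
    (("B", "T3", "Elec"), 2000),
    (("D", "T4", "Elec"), 3000),
]


def query(wanted_identifiers):
    """Return decoded rows whose dimension data contain a wanted identifier."""
    wanted = set(wanted_identifiers)
    rows = []
    for dimension, value in second_storage:       # step 502: traverse every tuple
        if wanted & set(dimension):               # dimension data hit by the rule
            labels = [first_storage[i] for i in dimension]  # step 503: decode
            rows.append((labels, value))
    return rows


# Classified query rule for the first quarter of 2013.
first_quarter = query(["T1", "T2", "T3"])
```

The scan is linear in the number of stored tuples; the patent text does not describe an index, so none is assumed here.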
Corresponding to the above method embodiments, this application also provides a data compression storage device. Refer to Fig. 6, which shows a schematic structural diagram of one embodiment of the sparse-matrix-based data compression storage device provided by this application. This embodiment may include: a data set receiving unit 601, a first storage file generating unit 602, a dimension data generating unit 603, a to-be-stored data tuple determining unit 604 and a second storage file generating unit 605, where:
The data set receiving unit 601 is configured to receive a data set containing multiple items of first data, where each item of first data includes multiple items of attribute data and one item of value data, the items of attribute data each correspond to a different attribute, and every item of first data has the same multiple attributes;
The first storage file generating unit 602 is configured to generate a first storage file according to each attribute, where the first storage file contains the attribute data of each attribute and the attribute identifier corresponding to each item of attribute data;
The dimension data generating unit 603 is configured to determine the attribute identifier corresponding to each item of attribute data in every item of first data, and to combine those attribute identifiers to generate the dimension data of that item of first data;
The to-be-stored data tuple determining unit 604 is configured to determine the dimension data and value data of every item of first data as one data tuple to be stored;
The second storage file generating unit 605 is configured to generate a second storage file from the data tuples to be stored.
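The cooperation of units 601 to 605 can be sketched as a single Python function. This is a simplified sketch under stated assumptions: records arrive as dictionaries with a "value" key, identifiers are formed from the attribute name plus a counter, and the attribute-element branch of unit 602 is omitted. None of these choices is prescribed by the patent text.

```python
def compress(records, attributes):
    """Build the first and second storage files from a list of records."""
    first_storage = {}   # identifier -> attribute data      (unit 602)
    id_of = {}           # (attribute, data) -> identifier
    for attr in attributes:
        count = 0
        for rec in records:
            key = (attr, rec[attr])
            if key not in id_of:
                count += 1
                ident = f"{attr}{count}"   # assumed identifier scheme
                id_of[key] = ident
                first_storage[ident] = rec[attr]

    second_storage = []
    for rec in records:
        # Unit 603: combine the attribute identifiers into dimension data.
        dimension = tuple(id_of[(a, rec[a])] for a in attributes)
        # Units 604 and 605: form the tuple and add it to the second file.
        second_storage.append((dimension, rec["value"]))
    return first_storage, second_storage


# Unit 601: the received data set (values taken from the query example).
records = [
    {"unit": "Unit A", "time": "January 2013",
     "indicator": "power consumption", "value": 1000},
    {"unit": "Unit B", "time": "February 2013",
     "indicator": "power consumption", "value": 2000},
]
first_storage, second_storage = compress(records, ["unit", "time", "indicator"])
```

Each repeated attribute string is stored once in the first storage file and referenced by a short identifier in every tuple, which is where the space saving comes from when attribute data repeat across many records.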
Optionally, refer to Fig. 7, which shows a partial structural diagram of another embodiment of the sparse-matrix-based data compression storage device provided by this application. In the above device embodiment, the first storage file generating unit 602 may be implemented with the following subunits: an attribute data determining subunit 701, an attribute element determining subunit 702, a judging subunit 703, a first result subunit 704 and a second result subunit 705.
The attribute data determining subunit 701 is configured to determine the attribute data corresponding to each attribute;
The attribute element determining subunit 702 is configured to determine, according to the type of each attribute, the attribute elements corresponding to that attribute;
The judging subunit 703 is configured to judge whether the attribute elements are the same as the attribute data; if so, it triggers the first result subunit 704; if not, it triggers the second result subunit 705;
The first result subunit 704 is configured to generate the attribute identifier corresponding to the attribute data, and to store the attribute data and its attribute identifier in the generated first storage file;
The second result subunit 705 is configured to generate the element identifier corresponding to the attribute element, and to store the attribute element and its element identifier in the generated first storage file.
For details of this device embodiment, refer to the above method embodiments; they are not repeated here.
It should be noted that the first storage file generated by the first storage file generating unit 602 and the second storage file generated by the second storage file generating unit 605 are both data tables.
Optionally, on the basis of the above device embodiment, this embodiment may further include:
A second data receiving unit, configured to receive second data that includes multiple items of attribute data and one item of value data, where the attributes corresponding to the attribute data in the second data are the same as the attributes of the first data;
A second data dimension data generating unit, configured to obtain the attribute identifier corresponding to each item of attribute data, and to combine the attribute identifiers to generate the dimension data of the second data;
A second to-be-stored data tuple determining unit, configured to determine the dimension data and value data of the second data as one data tuple to be stored;
A second data adding unit, configured to add the data tuple to be stored to the second storage file.
For details of this device embodiment, refer to the above method embodiments; they are not repeated here.
Optionally, on the basis of the above device embodiment, this embodiment may further include:
A rule determining unit, configured to determine a classified query rule according to the attribute data corresponding to each attribute in the first storage file;
A data searching unit, configured to search the second storage file, according to the classified query rule, for the target dimension data corresponding to the query rule;
A data display unit, configured to display the attribute data corresponding to the target dimension data and the value data corresponding to the target dimension data.
For details of this device embodiment, refer to the above method embodiments; they are not repeated here.
It should be noted that the embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and for the parts that are the same or similar the embodiments may be referred to one another.
The sparse-matrix-based data compression storage method and device provided by the present invention have been described in detail above. The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (8)

  1. A data compression storage method based on a sparse matrix, characterized by comprising:
    receiving a data set containing multiple items of first data, wherein each item of first data includes multiple items of attribute data and one item of value data, the items of attribute data each correspond to a different attribute, and every item of first data has the same multiple attributes;
    generating a first storage file according to each attribute, wherein the first storage file contains the attribute data of each attribute and the attribute identifier corresponding to each item of attribute data;
    determining the attribute identifier corresponding to each item of attribute data in every item of first data, and combining those attribute identifiers to generate the dimension data of that item of first data, wherein each attribute identifier represents one dimension;
    determining the dimension data and value data of every item of first data as one data tuple to be stored;
    generating a second storage file from the data tuples to be stored;
    wherein generating the first storage file according to each attribute comprises:
    determining the attribute data corresponding to each attribute;
    determining, according to the type of each attribute, the attribute elements corresponding to that attribute;
    judging whether the attribute elements are the same as the attribute data; if so, generating the attribute identifier corresponding to the attribute data, and storing the attribute data and its attribute identifier in the generated first storage file;
    if not, generating the element identifier corresponding to the attribute element, and storing the attribute element and its element identifier in the generated first storage file.
  2. The method according to claim 1, characterized in that the first storage file and the second storage file are data tables.
  3. The method according to claim 1, characterized in that, after generating the second storage file from the data tuples to be stored, the method further comprises:
    receiving second data that includes multiple items of attribute data and one item of value data, wherein the attributes corresponding to the attribute data in the second data are the same as the attributes of the first data;
    obtaining the attribute identifier corresponding to each item of attribute data, and combining the attribute identifiers to generate the dimension data of the second data;
    determining the dimension data and value data of the second data as one data tuple to be stored;
    adding the data tuple to be stored to the second storage file.
  4. The method according to any one of claims 1 to 3, characterized in that, after the method, it further comprises:
    determining a classified query rule according to the attribute data corresponding to each attribute in the first storage file;
    searching the second storage file, according to the classified query rule, for the target dimension data corresponding to the query rule;
    displaying the attribute data corresponding to the target dimension data and the value data corresponding to the target dimension data.
  5. A data compression storage device based on a sparse matrix, characterized by comprising:
    a data set receiving unit, configured to receive a data set containing multiple items of first data, wherein each item of first data includes multiple items of attribute data and one item of value data, the items of attribute data each correspond to a different attribute, and every item of first data has the same multiple attributes;
    a first storage file generating unit, configured to generate a first storage file according to each attribute, wherein the first storage file contains the attribute data of each attribute and the attribute identifier corresponding to each item of attribute data;
    a dimension data generating unit, configured to determine the attribute identifier corresponding to each item of attribute data in every item of first data, and to combine those attribute identifiers to generate the dimension data of that item of first data, wherein each attribute identifier represents one dimension;
    a to-be-stored data tuple determining unit, configured to determine the dimension data and value data of every item of first data as one data tuple to be stored;
    a second storage file generating unit, configured to generate a second storage file from the data tuples to be stored;
    wherein the first storage file generating unit comprises:
    an attribute data determining subunit, configured to determine the attribute data corresponding to each attribute;
    an attribute element determining subunit, configured to determine, according to the type of each attribute, the attribute elements corresponding to that attribute;
    a judging subunit, configured to judge whether the attribute elements are the same as the attribute data; if so, to trigger the first result subunit; if not, to trigger the second result subunit;
    a first result subunit, configured to generate the attribute identifier corresponding to the attribute data, and to store the attribute data and its attribute identifier in the generated first storage file;
    a second result subunit, configured to generate the element identifier corresponding to the attribute element, and to store the attribute element and its element identifier in the generated first storage file.
  6. The device according to claim 5, characterized in that the first storage file generated by the first storage file generating unit and the second storage file generated by the second storage file generating unit are data tables.
  7. The device according to claim 5, characterized by further comprising:
    a second data receiving unit, configured to receive second data that includes multiple items of attribute data and one item of value data, wherein the attributes corresponding to the attribute data in the second data are the same as the attributes of the first data;
    a second data dimension data generating unit, configured to obtain the attribute identifier corresponding to each item of attribute data, and to combine the attribute identifiers to generate the dimension data of the second data;
    a second to-be-stored data tuple determining unit, configured to determine the dimension data and value data of the second data as one data tuple to be stored;
    a second data adding unit, configured to add the data tuple to be stored to the second storage file.
  8. The device according to any one of claims 5 to 7, characterized by further comprising:
    a rule determining unit, configured to determine a classified query rule according to the attribute data corresponding to each attribute in the first storage file;
    a data searching unit, configured to search the second storage file, according to the classified query rule, for the target dimension data corresponding to the query rule;
    a data display unit, configured to display the attribute data corresponding to the target dimension data and the value data corresponding to the target dimension data.
CN201410037979.XA 2014-01-26 2014-01-26 A kind of data compression storage method and device based on sparse matrix Active CN103761316B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201410037979.XA CN103761316B (en) 2014-01-26 2014-01-26 A kind of data compression storage method and device based on sparse matrix
CA2871435A CA2871435C (en) 2014-01-26 2014-11-18 Method and device for compressing and storing data based on sparse matrix

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410037979.XA CN103761316B (en) 2014-01-26 2014-01-26 A kind of data compression storage method and device based on sparse matrix

Publications (2)

Publication Number Publication Date
CN103761316A CN103761316A (en) 2014-04-30
CN103761316B true CN103761316B (en) 2018-02-06

Family

ID=50528553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410037979.XA Active CN103761316B (en) 2014-01-26 2014-01-26 A kind of data compression storage method and device based on sparse matrix

Country Status (2)

Country Link
CN (1) CN103761316B (en)
CA (1) CA2871435C (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156407B (en) * 2014-07-29 2017-08-25 华为技术有限公司 Storage method, device and the storage device of index data
CN104574159B (en) * 2015-01-30 2018-01-23 华为技术有限公司 Data storage, querying method and device
US10644721B2 (en) 2018-06-11 2020-05-05 Tenstorrent Inc. Processing core data compression and storage system
CN109710611B (en) * 2018-12-25 2019-09-17 北京三快在线科技有限公司 The method of storage table data, the method, apparatus of lookup table data and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1317116A (en) * 1998-07-08 2001-10-10 必需技术公司 Value-instance-connectivity computer-implemented database
CN101311930A (en) * 2007-05-21 2008-11-26 Sap股份公司 Block compression of tables with repeated values
CN101311931A (en) * 2007-05-21 2008-11-26 Sap股份公司 Compression of tables based on occurrence of values
CN102402617A (en) * 2011-12-23 2012-04-04 天津神舟通用数据技术有限公司 Easily compressed database index storage system using fragments and sparse bitmap, and corresponding construction, scheduling and query processing methods

Also Published As

Publication number Publication date
CA2871435C (en) 2017-02-07
CA2871435A1 (en) 2015-07-26
CN103761316A (en) 2014-04-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160322

Address after: 100192 Beijing city Haidian District Qinghe small Camp Road No. 15

Applicant after: BEIJING CHINA POWER INFORMATION TECHNOLOGY Co.,Ltd.

Applicant after: State Grid Corporation of China

Applicant after: STATE GRID ZHEJIANG ELECTRIC POWER Co.

Address before: 100192 Beijing city Haidian District Qinghe small Camp Road No. 15

Applicant before: BEIJING CHINA POWER INFORMATION TECHNOLOGY Co.,Ltd.

Applicant before: State Grid Corporation of China

GR01 Patent grant
CP01 Change in the name or title of a patent holder

Address after: 100192 Beijing city Haidian District Qinghe small Camp Road No. 15

Co-patentee after: STATE GRID CORPORATION OF CHINA

Patentee after: BEIJING CHINA POWER INFORMATION TECHNOLOGY Co.,Ltd.

Co-patentee after: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.

Address before: 100192 Beijing city Haidian District Qinghe small Camp Road No. 15

Co-patentee before: State Grid Corporation of China

Patentee before: BEIJING CHINA POWER INFORMATION TECHNOLOGY Co.,Ltd.

Co-patentee before: STATE GRID ZHEJIANG ELECTRIC POWER Co.

TR01 Transfer of patent right

Effective date of registration: 20190726

Address after: 100085 Building 32-3-4108-4109, Pioneer Road, Haidian District, Beijing

Co-patentee after: STATE GRID CORPORATION OF CHINA

Patentee after: BEIJING GUODIANTONG NETWORK TECHNOLOGY Co.,Ltd.

Co-patentee after: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.

Address before: 100192 Beijing city Haidian District Qinghe small Camp Road No. 15

Co-patentee before: STATE GRID CORPORATION OF CHINA

Patentee before: BEIJING CHINA POWER INFORMATION TECHNOLOGY Co.,Ltd.

Co-patentee before: STATE GRID ZHEJIANG ELECTRIC POWER Co.,Ltd.
