CN106844703B - In-memory data warehouse query processing implementation method for database all-in-one machines - Google Patents

In-memory data warehouse query processing implementation method for database all-in-one machines

Info

Publication number
CN106844703B
Authority
CN
China
Prior art keywords
dimensional
data
memory
vector
indexing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710064131.XA
Other languages
Chinese (zh)
Other versions
CN106844703A (en)
Inventor
张延松
王珊
杜小勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China
Priority to CN201710064131.XA
Publication of CN106844703A
Application granted
Publication of CN106844703B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G06F16/235 Update request formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to an in-memory data warehouse query processing implementation method for database all-in-one machines, comprising the following steps: constructing an in-memory data warehouse storage model; constructing a distributed storage model for the in-memory data warehouse all-in-one machine; applying a data update strategy on the high-performance computing server cluster, whereby, when the memory capacity of the high-performance computing server cluster is insufficient, the oldest data is evicted using a circular-queue update strategy and replaced with the newest data; and implementing OLAP query processing on the in-memory data warehouse all-in-one machine. The invention improves the utilization of the asymmetric storage and computing resources of the database all-in-one machine and improves overall in-memory OLAP performance. The different processing stages of multiple queries can further be processed in a parallel pipeline on the database all-in-one machine platform, improving OLAP query throughput. The invention is suited to in-memory OLAP application scenarios on in-memory data warehouse all-in-one machines and meets the in-memory OLAP acceleration requirements of the asymmetric hardware architecture of database all-in-one machines.

Description

In-memory data warehouse query processing implementation method for database all-in-one machines
Technical field
The present invention relates to a data warehouse implementation method, and in particular to an in-memory data warehouse query processing implementation method for database all-in-one machines.
Background art
A database all-in-one machine is an integrated hardware and software solution designed around the characteristics of big data storage and high-performance query processing in database applications. In terms of hardware design, a database all-in-one machine is usually a server cluster architecture organized by cabinet, which provides scalable big data storage and processing capacity through a built-in high-speed network and server clusters. Each cabinet supports different degrees of server cluster expansion, and the system scales horizontally with the cabinet as the unit. A database all-in-one machine usually uses a small high-performance computing server cluster for complex query processing and a large low-end storage server cluster for big data storage, which makes it an asymmetric server cluster architecture. A database all-in-one machine also usually relies on special hardware to accelerate storage access and query processing. For example, the Oracle Exadata database all-in-one machine uses large-capacity PCI-e flash memory to cache disk data and improve data access performance, while IBM Netezza uses field-programmable gate arrays (FPGAs) as dedicated database accelerator cards to handle simple but computationally costly operations such as decompression, projection, and filtering, leaving more complex operations such as aggregation, joins, and summarization to the multi-core CPUs. On the software side, the database system must optimize its software design for the special hardware architecture of the database all-in-one machine, for example by optimizing the distributed data storage strategy, optimizing the query processing strategy for asymmetric clusters, and optimizing query techniques for new flash memory devices and new accelerator card devices (such as FPGA, GPU, and Intel MIC Phi).
Data warehousing is the most important application area of database all-in-one machines. With the development of new storage and processing devices, in-memory data warehouses are increasingly becoming the emerging platform for real-time OLAP analysis, and an in-memory data warehouse built for the database all-in-one machine architecture can better meet the demands of real-time OLAP on big data. Current in-memory data warehouse technology mainly targets hardware architectures of homogeneous server clusters; optimization techniques for asymmetric server clusters and for new storage and computing devices are still immature. Therefore, systematically designing an in-memory OLAP query processing framework that targets the characteristics of the in-memory data warehouse all-in-one machine architecture and of new storage and computing devices has become a technical problem in urgent need of a solution. The key issue is how to adapt to the hardware architecture of the in-memory data warehouse all-in-one machine, give full play to its hardware performance advantages, and improve the overall performance of in-memory OLAP.
Summary of the invention
In view of the above problems, the object of the present invention is to provide an in-memory data warehouse query processing implementation method for database all-in-one machines. The method meets the in-memory OLAP acceleration requirements of the asymmetric hardware architecture of the in-memory data warehouse all-in-one machine, gives full play to its hardware performance advantages, and improves the overall performance of in-memory OLAP.
To achieve the above object, the present invention adopts the following technical solution: an in-memory data warehouse query processing implementation method for database all-in-one machines, characterized by comprising the following steps: 1) constructing an in-memory data warehouse storage model; 2) constructing a distributed storage model for the in-memory data warehouse all-in-one machine; 3) applying a data update strategy on the high-performance computing server cluster: when the memory capacity of the high-performance computing server cluster is insufficient, the oldest data is evicted using a circular-queue update strategy and replaced with the newest data; 4) implementing OLAP query processing on the in-memory data warehouse all-in-one machine.
In step 1), the in-memory data warehouse storage model uses a fused multidimensional-relational OLAP model, which is constructed as follows: 1.1) logical data model: the data cube structure of the data warehouse is divided into three data structures: dimensions, multidimensional indexes, and measures; 1.2) physical data model: a dimension is stored as a dimension table and a dimension vector; the dimension table uses a row-store or column-store database engine, and the dimension vector represents the dimension as an array whose subscripts map to dimension coordinates; the multidimensional index uses a column-store model; measures are stored as a fact table using column storage; 1.3) the multidimensional OLAP query model comprises three processing stages: dimension mapping, multidimensional index computation, and aggregation.
In step 1.3), the specific process is as follows: 1.3.1) dimension mapping: the OLAP query is mapped onto the relevant dimension tables to generate dimension vectors; the non-null values in a dimension vector identify the component values, on the corresponding dimension, of the multidimensional data subset selected by the current OLAP query; 1.3.2) multidimensional index computation: the multidimensional index is mapped onto the corresponding dimension vectors to filter the measure data multidimensionally, and a vector index is created that marks the multidimensional index entries satisfying the current OLAP query; the non-null values in the vector index represent multidimensional addresses in the aggregate data cube built from the grouping attributes of the OLAP query; multidimensional filtering yields the set of measure data that satisfies the OLAP query conditions, and the vector index is created for that measure data; 1.3.3) aggregation: grouped aggregation is performed on the measure data based on the vector index.
In step 2), the distributed storage model of the in-memory data warehouse all-in-one machine uses one of the following two distributed storage strategies: 2.1) dimension tables and multidimensional indexes stored centrally, fact table stored in distributed form; 2.2) dimension tables stored centrally, multidimensional indexes and fact table stored in distributed form.
In step 2.1), the specific storage strategy is as follows: 2.1.1) the smaller dimension tables are stored centrally in the high-performance computing server cluster; when the computing cluster has a high configuration, the multidimensional index of the in-memory data warehouse is stored centrally on the nodes of the high-performance computing server cluster; 2.1.2) the huge fact table data is distributed across the storage server cluster nodes by horizontal fragmentation; 2.1.3) the vector index generated by multidimensional computation is transferred to the corresponding storage server cluster nodes, which complete the aggregation.
In step 2.2), the specific storage strategy is as follows: when the memory capacity of the high-performance computing server cluster is small relative to that of the storage server cluster and cannot hold all multidimensional index data of the in-memory data warehouse, the dimension tables are stored centrally in the high-performance computing server cluster, while the multidimensional indexes and the fact table are distributed by horizontal fragmentation across the high-performance computing server cluster and the storage server cluster.
In step 4), the specific in-memory OLAP query processing method is as follows: 4.1) the OLAP query is executed on the high-performance computing server cluster; the OLAP query command is decomposed into dimension vector generation commands on the relevant dimension tables, which filter the dimension table records, project out the grouping attributes, and dictionary-encode them; the dictionary code is used as the value of the dimension vector cell corresponding to each dimension table record, the dimension vector cells of records that do not satisfy the filter condition are set to null, and a dimension vector is thus created for each dimension relevant to the OLAP query; 4.2) under the strategy of centrally stored multidimensional indexes and a distributed fact table, the multidimensional index is logically fragmented according to the physical partitioning of the fact table; 4.3) under the strategy of distributed multidimensional indexes and fact table, each server node holds complete multidimensional index and fact data fragments, downloads the dimension vectors from the high-performance computing server cluster to the local node, and completes the OLAP computation locally; 4.4) when a server node is equipped with a many-core coprocessor accelerator card, the accelerator card is used to accelerate multidimensional index computation; 4.5) on the storage server node side, when the memory capacity is smaller than the data fragment, optimization strategy one is used to complete the multidimensional index computation.
In step 4.2), OLAP query processing comprises the following three steps: 4.2.1) multidimensional filtering is performed on the multidimensional index using the dimension vectors generated for the OLAP query, producing the corresponding vector index; null cells in the vector index filter out fact table records, and non-null values represent the group code of the fact table record in the OLAP query; when the positions to which a multidimensional index entry maps in all dimension vectors relevant to the OLAP query are non-null, the multidimensional coordinate in the grouping data cube given by the mapped dimension vector values is converted to a one-dimensional coordinate and stored in the corresponding cell of the vector index; 4.2.2) the vector index is sent, by logical fragment, to the corresponding nodes of the storage server cluster, where the measure columns are filtered by the vector index and aggregated; 4.2.3) the aggregation results on the storage server cluster nodes are sent back to the high-performance computing server cluster, which merges them into the global aggregation result, maps the multidimensional coordinates of the grouping data cube corresponding to the aggregation result through each dimension vector's grouping dictionary table back to grouping attributes, and outputs the OLAP query result.
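Steps 4.2.2) and 4.2.3) can be illustrated with a minimal Python sketch. It assumes the vector index has already been computed, uses `None` for the null value, and linearizes group coordinates in row-major order; the function names, the SUM aggregate, and the sample data are hypothetical and are not taken from the patent.

```python
from collections import Counter

def fragment(vector_index, fragment_sizes):
    """Cut the vector index into logical fragments matching the fact table partitions."""
    fragments, start = [], 0
    for size in fragment_sizes:
        fragments.append(vector_index[start:start + size])
        start += size
    return fragments

def local_aggregate(measure_fragment, index_fragment):
    """Storage node side: filter by the vector index and SUM per group address."""
    partial = Counter()
    for value, address in zip(measure_fragment, index_fragment):
        if address is not None:          # None cells filter the fact row out
            partial[address] += value
    return partial

def merge_and_decode(partials, dictionaries):
    """Computing cluster side: merge partial sums, then map each one-dimensional
    address back to grouping attribute values through the dictionaries."""
    total = Counter()
    for partial in partials:
        total.update(partial)            # Counter.update adds the values
    decoded = {}
    for address, value in total.items():
        coords = []
        for dictionary in reversed(dictionaries):
            address, code = divmod(address, len(dictionary))
            coords.append(dictionary[code])
        decoded[tuple(reversed(coords))] = value
    return decoded

# Hypothetical data: two grouping dictionaries, address = region_code * 2 + year_code.
dictionaries = [["ASIA", "EUROPE"], [2016, 2017]]
vector_index = [0, None, 1, 1, 3, 2]
measure_fragments = [[10, 20, 30], [40, 50, 60]]   # one list per storage node
index_fragments = fragment(vector_index, [3, 3])
partials = [local_aggregate(m, i) for m, i in zip(measure_fragments, index_fragments)]
result = merge_and_decode(partials, dictionaries)
```

Only the vector index fragments and the small partial results cross the network in this sketch; the measure columns never leave the node that stores them.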
In step 4.4), the specific steps are as follows: 4.4.1) the multidimensional index and the vector index are partitioned according to the memory capacity of the coprocessor accelerator card; following the principle of maximizing accelerator card memory utilization, the largest fragment that fits in the accelerator card memory is allocated and copied to it; 4.4.2) at query execution, the dimension vectors are copied to the accelerator card memory, the accelerator card performs the multidimensional index computation based on dimension vector mapping, and the generated vector index is copied back to main memory to update the corresponding vector index fragment; 4.4.3) for the multidimensional index fragment in main memory, the CPU performs the multidimensional index computation based on the dimension vectors and generates the corresponding vector index fragment; 4.4.4) the CPU and the coprocessor accelerator card process different multidimensional index fragments, and the computations on the two fragments execute in parallel.
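The partitioning in step 4.4) can be sketched in Python as follows. No real accelerator API is used: a second worker thread stands in for the coprocessor card, so the sketch shows only the structure of the split, not an actual speed-up. The per-row byte cost, the capacity figure, and all function names are assumptions, and the space the dimension vectors occupy on the card is ignored.

```python
from concurrent.futures import ThreadPoolExecutor

def map_fragment(index_rows, dim_vectors, cardinalities):
    """Multidimensional index computation on one fragment (same logic on either device)."""
    out = []
    for coords in index_rows:
        address = 0
        for d, coord in enumerate(coords):
            code = dim_vectors[d][coord]
            if code is None:             # filtered out on this dimension
                address = None
                break
            address = address * cardinalities[d] + code
        out.append(address)
    return out

def split_by_capacity(n_rows, bytes_per_row, device_bytes):
    """Number of rows (index entry plus vector index cell) that fit on the device."""
    return min(n_rows, device_bytes // bytes_per_row)

def hybrid_vector_index(index_rows, dim_vectors, cardinalities, bytes_per_row, device_bytes):
    cut = split_by_capacity(len(index_rows), bytes_per_row, device_bytes)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stand-in for the accelerator card: a worker thread handling the first fragment.
        device_part = pool.submit(map_fragment, index_rows[:cut], dim_vectors, cardinalities)
        host_part = pool.submit(map_fragment, index_rows[cut:], dim_vectors, cardinalities)
        # Concatenating in fragment order plays the role of copying the result back.
        return device_part.result() + host_part.result()

# Hypothetical data: two dimensions, four fact rows, room for two rows on the device.
dim_vectors = [[0, None, 0], [0, 1]]
index_rows = [(0, 0), (1, 0), (2, 1), (0, 1)]
vector_index = hybrid_vector_index(index_rows, dim_vectors, [1, 2],
                                   bytes_per_row=12, device_bytes=30)
```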
In step 4.5), optimization strategy one is as follows: 4.5.1) when node memory can hold the multidimensional index and part of the measure columns, the multidimensional index is stored entirely in memory; fact data is managed with the column as the storage unit, the frequently accessed measure columns are kept in memory according to the LRU algorithm, and the infrequently accessed measure columns are stored in flash memory; 4.5.2) when node memory cannot hold all multidimensional index columns, the multidimensional index is stored column by column in node memory or flash memory, and the LRU algorithm selects, per column, the frequently used multidimensional index columns to keep in memory; 4.5.3) during multidimensional index computation, the in-memory multidimensional index columns are mapped against the dimension vectors first; the vector index records the partial results from the in-memory columns, and its non-null positions are then used as an index to access the multidimensional index columns in flash memory and complete the remaining multidimensional index computation.
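The column-level LRU placement in steps 4.5.1) and 4.5.2) can be sketched in Python as below. The `ColumnCache` class, the dictionary standing in for flash storage, and the capacity counted in whole columns are all illustrative assumptions.

```python
from collections import OrderedDict

class ColumnCache:
    """Keeps the most recently used columns in memory; the rest stay in 'flash'."""

    def __init__(self, flash_columns, capacity):
        self.flash = flash_columns           # column name -> values (stand-in for flash)
        self.capacity = capacity             # maximum number of columns held in memory
        self.memory = OrderedDict()          # ordered from least to most recently used
        self.flash_reads = 0

    def get(self, name):
        if name in self.memory:
            self.memory.move_to_end(name)    # mark as most recently used
            return self.memory[name]
        self.flash_reads += 1                # miss: load the column from flash
        column = self.flash[name]
        self.memory[name] = column
        if len(self.memory) > self.capacity:
            self.memory.popitem(last=False)  # evict the least recently used column
        return column

# Hypothetical measure columns; memory has room for two of the three.
cache = ColumnCache({"revenue": [10, 20], "quantity": [1, 2], "tax": [3, 4]}, capacity=2)
for name in ["revenue", "quantity", "revenue", "tax"]:
    cache.get(name)
```

After this access sequence, `quantity` is the least recently used column and is the one evicted when `tax` is loaded.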
By adopting the above technical solution, the present invention has the following advantages:
1. The invention builds an in-memory OLAP data model for the new computing and storage hardware of the database all-in-one machine: the high-performance computing server cluster, the storage server cluster, many-core coprocessor accelerator cards, and flash memory. The conceptual data set of the data warehouse is divided into three classes of data (dimensions, multidimensional indexes, and measures), which correspond respectively to the memory and computing resources of the high-performance computing server cluster and many-core coprocessor accelerator cards and to the memory, flash memory, and computing resources of the storage server cluster, so that data storage and computation characteristics match the hardware characteristics of the database all-in-one machine. In-memory OLAP query processing is reduced to dimension mapping, multidimensional index computation, and aggregation; the join, the most complex database operation, is converted into multidimensional index computation over simple vector data structures, which makes the data structures and algorithms better suited to programming many-core coprocessor accelerator cards and lets new computing hardware accelerate core OLAP performance. OLAP query tasks are optimally allocated across the high-performance computing server cluster, the many-core coprocessor accelerator cards, and the storage server cluster, improving utilization of the asymmetric storage and computing resources and the overall in-memory OLAP performance. Because OLAP query processing is decomposed into pipelined tasks across different computing platforms, the different processing stages of multiple queries can further be processed in a parallel pipeline on the database all-in-one machine platform, improving OLAP query throughput.
2. For the asymmetric server cluster architecture of the database all-in-one machine and its flash memory and coprocessor accelerator card hardware, the invention provides hardware-aware in-memory OLAP query optimization techniques that maximize the benefit the hardware of the in-memory data warehouse all-in-one machine brings to in-memory OLAP performance.
3. In terms of the storage model, under the asymmetric hardware architecture of the in-memory data warehouse all-in-one machine, the invention uses a data distribution strategy that stores the smaller dimension and multidimensional index data centrally in the high-performance computing server cluster and the larger measure data in the storage server cluster, matching the data characteristics of the data warehouse to the storage capacities of the high-performance computing server cluster and the storage cluster.
4. In terms of the computation model, the invention accelerates in-memory OLAP query processing with many-core coprocessors, using the strong parallel computing capability, low price, and low energy consumption of many-core coprocessor accelerator cards (such as FPGA, GPU, and Intel MIC Phi) to accelerate the multidimensional index computation stage and improve overall OLAP query processing performance.
In summary, the present invention is suited to in-memory OLAP application scenarios on in-memory data warehouse all-in-one machines and meets the in-memory OLAP acceleration requirements of the asymmetric hardware architecture of database all-in-one machines.
Brief description of the drawings
Fig. 1 is a schematic diagram of the hardware architecture of a database all-in-one machine;
Fig. 2 is a schematic diagram of the logical data model, physical data model, and multidimensional OLAP computation model used in the present invention;
Fig. 3 is a schematic diagram of the storage strategy of the present invention in which dimension tables and multidimensional indexes are stored centrally and the fact table is distributed;
Fig. 4 is a schematic diagram of the storage strategy of the present invention in which dimension tables are stored centrally and multidimensional indexes and the fact table are distributed;
Fig. 5 is a schematic diagram of the data update strategy of the high-performance computing server cluster of the present invention;
Fig. 6 is a schematic diagram of multidimensional index computation for the CPU and many-core coprocessor architecture of the present invention;
Fig. 7 is a schematic diagram of OLAP query processing on the database all-in-one machine cluster of the present invention;
Fig. 8 is a schematic diagram of the pipelined parallel execution method for multiple queries of the present invention;
Fig. 9 is a schematic diagram of the OLAP query processing procedure in an embodiment of the present invention.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and embodiments.
The present invention provides an in-memory data warehouse query processing implementation method for database all-in-one machines. The method is optimized for the asymmetric hardware architecture of the database all-in-one machine and for new storage and computing hardware such as flash memory and many-core coprocessor accelerator cards, matching them to the characteristics of in-memory OLAP query processing to provide high-performance OLAP query processing for in-memory data warehouses. The specific steps are as follows:
1) Construct the in-memory data warehouse storage model:
As shown in Fig. 1, a database all-in-one machine usually has an asymmetric hardware architecture, consisting of a high-performance computing server cluster and a storage server cluster. The high-performance computing server cluster has a higher hardware configuration, for example large-capacity memory or several high-performance many-core coprocessor accelerator cards. The storage server cluster usually has a lower hardware configuration, with relatively small memory and possibly a small number of coprocessor accelerator cards. Given these hardware characteristics, the high-performance computing cluster is mainly responsible for the main multidimensional computation tasks of the in-memory data warehouse, while the storage cluster is suited to data processing tasks of lower computational complexity.
In view of the hardware architecture of the database all-in-one machine, and as shown in Fig. 2, the in-memory data warehouse storage model of the present invention uses a fused multidimensional-relational OLAP model, constructed as follows:
1.1) Logical data model
The data cube structure of the data warehouse is divided into three data structures: dimensions, multidimensional indexes, and measures. A dimension corresponds to a coordinate axis of the in-memory data warehouse's multidimensional data cube and is used to build the conceptual data cube model of the data warehouse. The multidimensional index corresponds to the spatial coordinates of fact data within the multidimensional data cube and maps measure data to its position in the multidimensional space of the cube. Measures correspond to the individual attributes of the fact data.
1.2) Physical data model
In the physical data model, a dimension is stored as a dimension table and a dimension vector. The dimension table may use a row-store or column-store database engine, and each dimension table record maps to a unique coordinate value on the dimension. The dimension vector represents the dimension as an array whose subscripts map to dimension coordinates. The multidimensional index uses a column-store model: the multidimensional coordinates are stored as separate multidimensional index columns, each identifying one coordinate component of the fact data in the multidimensional data cube space. The vector index is an array of the same length as the measure columns and is used to retrieve the fact data corresponding to the multidimensional index. Measures are stored as a fact table, using column storage to improve the data compression ratio and analytical processing performance.
1.3) Multidimensional OLAP query model
An OLAP query is a multidimensional operation over the multidimensional data cube structure. OLAP query processing based on the multidimensional-relational OLAP model comprises three processing stages:
1.3.1) Dimension mapping: the OLAP query is mapped onto the relevant dimension tables to generate dimension vectors; the non-null values in a dimension vector identify the component values, on the corresponding dimension, of the multidimensional data subset selected by the current OLAP query;
1.3.2) Multidimensional index computation: the multidimensional index is mapped onto the corresponding dimension vectors (each multidimensional index value is the array subscript into the relevant dimension vector) to filter the measure data multidimensionally, and a vector index is created that marks the multidimensional index entries satisfying the current OLAP query; the non-null values in the vector index represent multidimensional addresses in the aggregate data cube built from the grouping attributes of the OLAP query; multidimensional filtering yields the set of measure data that satisfies the OLAP query conditions, and the vector index is created for that measure data;
1.3.3) Aggregation: grouped aggregation is performed on the measure data based on the vector index.
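The three stages can be made concrete with a small Python sketch over a toy star schema. It is illustrative only: `None` plays the role of the null value, group coordinates are linearized in row-major order, SUM is the only aggregate, and every name and data value is an assumption rather than part of the patented method.

```python
def build_dimension_vector(dim_rows, predicate, group_attr):
    """Stage 1, dimension mapping. dim_rows[i] is the record at dimension
    coordinate i. Returns (vector, dictionary), where vector[i] is the
    dictionary code of the grouping attribute or None if the record is filtered."""
    dictionary, codes, vector = [], {}, []
    for row in dim_rows:
        if not predicate(row):
            vector.append(None)
            continue
        value = row[group_attr]
        if value not in codes:
            codes[value] = len(dictionary)
            dictionary.append(value)
        vector.append(codes[value])
    return vector, dictionary

def build_vector_index(index_columns, dim_vectors, cardinalities):
    """Stage 2, multidimensional index computation. index_columns[d][r] is the
    coordinate of fact row r on dimension d. Each result cell is the
    one-dimensional group address of the row, or None if it is filtered out."""
    vector_index = []
    for r in range(len(index_columns[0])):
        address = 0
        for d, column in enumerate(index_columns):
            code = dim_vectors[d][column[r]]
            if code is None:
                address = None
                break
            address = address * cardinalities[d] + code   # row-major linearization
        vector_index.append(address)
    return vector_index

def aggregate(measure_column, vector_index):
    """Stage 3, grouped aggregation (SUM) driven by the vector index."""
    sums = {}
    for value, address in zip(measure_column, vector_index):
        if address is not None:
            sums[address] = sums.get(address, 0) + value
    return sums

# Hypothetical query: total revenue by (region, year) for customers in ASIA.
customers = [{"region": "ASIA"}, {"region": "EUROPE"}, {"region": "ASIA"}]
dates = [{"year": 2016}, {"year": 2017}]
cust_vec, cust_dict = build_dimension_vector(customers, lambda r: r["region"] == "ASIA", "region")
date_vec, date_dict = build_dimension_vector(dates, lambda r: True, "year")
index_columns = [[0, 1, 2, 0], [0, 0, 1, 1]]   # customer and date coordinates per fact row
revenue = [10, 20, 30, 40]
vector_index = build_vector_index(index_columns, [cust_vec, date_vec],
                                  [len(cust_dict), len(date_dict)])
result = aggregate(revenue, vector_index)
```

In this sketch the join between the fact table and the dimension tables is replaced by array lookups, which is the property the text relies on when it moves this stage onto accelerator cards.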
2) Construct the distributed storage model of the in-memory data warehouse all-in-one machine:
In a data warehouse, dimension tables are usually small and grow slowly, whereas the fact table is huge and grows quickly, but fact table data is append-only (insert-only mode). Under the database all-in-one machine hardware architecture of the present invention, one of the following two distributed storage strategies is used, depending on the hardware configuration:
2.1) Dimension tables and multidimensional indexes stored centrally, fact table distributed:
2.1.1) As shown in Fig. 3, the smaller dimension tables are stored centrally in the high-performance computing server cluster. When the computing cluster has a high configuration, for example large-capacity memory and several many-core coprocessor accelerator cards, the multidimensional index of the in-memory data warehouse is stored centrally on the high-performance computing server cluster nodes, and the strong computing performance of that cluster is used to complete the multidimensional computation tasks of in-memory OLAP queries;
2.1.2) The huge fact table data is distributed across the storage server cluster nodes by horizontal fragmentation;
2.1.3) The vector index generated by multidimensional computation is transferred to the corresponding storage server cluster nodes, which complete the aggregation.
When the multidimensional index exceeds the storage capacity of the high-performance computing cluster nodes, the earliest multidimensional index data, in physical storage order, is demoted to cold data and distributed to the storage server cluster nodes that hold the corresponding fact table fragments, so that this part of the multidimensional index computation is pushed down to the storage server cluster nodes.
2.2) Dimension tables stored centrally, multidimensional indexes and fact table distributed:
As shown in Fig. 4, when the memory capacity of the high-performance computing server cluster is small relative to that of the storage server cluster and cannot hold all multidimensional index data of the in-memory data warehouse, the dimension tables are stored centrally in the high-performance computing server cluster, while the multidimensional indexes and the fact table are distributed by horizontal fragmentation across the high-performance computing server cluster and the storage server cluster.
3) Data update strategy of the high-performance computing server cluster: when the memory capacity of the high-performance computing server cluster is insufficient, the oldest data is evicted using a circular-queue update strategy (as shown in Fig. 5) and replaced with the newest data. Specifically:
The multidimensional index and fact data use column storage, with columns stored in units of row groups. The size of a row-group column is an integer multiple of the flash I/O block size, and the row group size (such as 1M, 2M, 4M ... rows) is set according to column data access performance. The maximum number n of row groups that memory can hold is calculated from the storage strategy (the high-performance server cluster stores only the multidimensional index, or stores both the multidimensional index and fact data), the column data width, and the server's available memory capacity. Newly added data is stored in the data columns in units of row groups. When the number of row groups exceeds a threshold, such as 90% of the maximum, the column data of the earliest row group is asynchronously synchronized to flash memory; once all row groups are full, the earliest row group is reused as the storage unit for newly inserted data. The row groups as a whole act as a circular queue: the row group at the tail receives new records, and the row group at the head is evicted to flash memory. Data evicted to flash memory is copied asynchronously to the storage server cluster nodes, and once synchronization completes, the data fragment in the flash memory of the high-performance computing server cluster node is deleted.
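The circular queue of row groups can be sketched in Python as follows. This is a simplification under stated assumptions: the copy to flash is done synchronously rather than asynchronously, a plain list stands in for flash storage, row groups hold whole rows rather than column segments, and the class and parameter names are hypothetical.

```python
class RowGroupRing:
    """Fixed ring of row groups; old groups reach flash before their slot is reused."""

    def __init__(self, n_groups, rows_per_group, flash, threshold=0.9):
        self.groups = [[] for _ in range(n_groups)]
        self.synced = [False] * n_groups     # True once the group is copied to flash
        self.rows_per_group = rows_per_group
        self.flash = flash                   # list standing in for flash storage
        # Keep at least one slot synced so the tail always finds a reusable slot.
        self.limit = min(n_groups - 1, int(n_groups * threshold))
        self.head = 0                        # oldest group not yet copied to flash
        self.tail = 0                        # group receiving new rows
        self.unsynced_full = 0

    def append(self, row):
        group = self.groups[self.tail]
        group.append(row)
        if len(group) < self.rows_per_group:
            return
        self.unsynced_full += 1
        # Past the threshold: copy the oldest full group to flash.
        while self.unsynced_full > self.limit:
            self.flash.append(list(self.groups[self.head]))
            self.synced[self.head] = True
            self.head = (self.head + 1) % len(self.groups)
            self.unsynced_full -= 1
        # Advance the tail and reuse the next slot if it already holds synced data.
        self.tail = (self.tail + 1) % len(self.groups)
        if self.groups[self.tail]:
            self.groups[self.tail] = []
            self.synced[self.tail] = False

# Hypothetical run: 4 row groups of 2 rows each, threshold at 75% of the ring.
flash = []
ring = RowGroupRing(n_groups=4, rows_per_group=2, flash=flash, threshold=0.75)
for row in range(10):
    ring.append(row)
```

A production version would issue the flash write in the background and only block if the tail catches up with an unsynced group; the sketch keeps the two pointers and the threshold test, which are the parts the text describes.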
In the storage strategy shown in Figure 3, the high-performance computing server cluster stores the multidimensional index data centrally. Evicted multidimensional index row groups are synchronized from the flash of the high-performance computing server cluster node to the memory of the corresponding storage server node, following the distribution strategy of the fact data on the storage server cluster. A multidimensional index row group and its corresponding fact table row group therefore reside on the same node, and part of the multidimensional index computation is pushed down to the storage server nodes. In the storage strategy shown in Figure 4, the high-performance server nodes store both multidimensional index data and fact data. The in-memory data replacement strategy is shown in Figure 5: an evicted row group consists of multidimensional index and fact data, and when the number of row groups in flash reaches a threshold (for example 32, 64, ...; this number determines the granularity of replication to the storage servers), those row groups are treated as one data shard and assigned to a storage server cluster node according to the data distribution strategy of the storage server cluster, completing the transfer of old data from the high-performance computing server cluster to the storage server cluster.
4) OLAP query processing on the in-memory data warehouse all-in-one machine:
The database all-in-one machine is asymmetric in several respects: the high-performance computing server cluster and the storage server cluster differ in storage capacity and processing power; within a server node, the processor and the many-core coprocessor accelerator card differ in processing power; and memory and flash differ in capacity and performance. These asymmetries require OLAP query processing on the in-memory data warehouse all-in-one machine to be a loosely coupled distributed computing mechanism in which different computation stages can be assigned to different storage and computing resources according to the hardware configuration. Taking into account the different hardware configurations and data distribution strategies of the all-in-one machine, the in-memory OLAP query processing method is as follows:
4.1) The OLAP query is executed on the high-performance computing server cluster. The OLAP query command is decomposed into dimension vector generation commands on the relevant dimension tables: the dimension table records are filtered, the grouping attributes are projected out and dictionary-encoded, and the dictionary code is used as the value of the dimension vector cell corresponding to each dimension table record. Dimension vector cells corresponding to dimension table records that do not satisfy the filter condition are set to null. In this way each dimension vector relevant to the OLAP query is created.
The group codes of the dimension vectors together define a grouping data cube; a group value in a dimension vector is the coordinate component of the grouping data cube on that dimension.
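Step 4.1 can be illustrated with a short Python sketch. It assumes that a dimension table is a list of records whose list position equals the dimension key, and that null is represented by `None`; the function name `build_dimension_vector` and the sample rows are hypothetical and serve only as an example.

```python
def build_dimension_vector(rows, predicate, group_attr):
    """Filter a dimension table and dictionary-encode its grouping attribute.

    Returns (vector, dictionary): vector[i] is the group code of row i,
    or None when row i fails the filter; dictionary[code] is the
    original grouping attribute value.
    """
    dictionary = []   # code -> attribute value
    codes = {}        # attribute value -> code
    vector = []
    for row in rows:
        if not predicate(row):
            vector.append(None)          # filtered-out record -> null cell
            continue
        value = row[group_attr]
        if value not in codes:           # assign the next unused code
            codes[value] = len(dictionary)
            dictionary.append(value)
        vector.append(codes[value])
    return vector, dictionary


# Hypothetical customer dimension table; list position = dimension key.
customer = [
    {"c_nation": "China",  "c_region": "ASIA"},
    {"c_nation": "Canada", "c_region": "AMERICA"},
    {"c_nation": "Brazil", "c_region": "AMERICA"},
    {"c_nation": "Canada", "c_region": "AMERICA"},
]
vector, dictionary = build_dimension_vector(
    customer, lambda r: r["c_region"] == "AMERICA", "c_nation")
```

The length of the dictionary gives the extent of the grouping data cube on this dimension, and each non-null cell of the vector is a coordinate on that axis.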
4.2) Under the strategy of centralized multidimensional index storage and distributed fact table storage, the multidimensional index is logically sharded to match the physical sharding of the fact table. OLAP query processing comprises the following three steps:
4.2.1) Multidimensional filtering is computed on the multidimensional index according to the dimension vectors generated for the OLAP query, producing the corresponding vector index. A null cell in the vector index filters out the corresponding fact table record, and a non-null value is the group code of the fact table record in the OLAP query. When all the dimension vector positions to which a multidimensional index entry maps are non-null, the multidimensional coordinate in the grouping data cube formed by the mapped dimension vector values is converted to a one-dimensional coordinate and stored in the corresponding cell of the vector index;
4.2.2) The vector index is sent, by logical shard, to the corresponding nodes of the storage server cluster, as shown in Figure 2; on each node the measure columns are filtered through the vector index and aggregated;
4.2.3) The aggregation results on the storage server cluster nodes are sent back to the high-performance computing server cluster, where they are merged into the global aggregation result; the multidimensional coordinates of the grouping data cube cells in the aggregation result are mapped to the grouping dictionary table of each dimension vector and converted to grouping attribute values, and the OLAP query result is output.
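Steps 4.2.2 and 4.2.3 can be sketched as follows. The sketch assumes a SUM aggregate, a vector index that has already been computed, and shard sizes equal to the row counts of the fact table shards; the function names and sample values are hypothetical and the network transfer is omitted.

```python
def split_by_shards(vector_index, shard_sizes):
    """Cut the vector index into logical shards matching the fact table shards."""
    shards, start = [], 0
    for size in shard_sizes:
        shards.append(vector_index[start:start + size])
        start += size
    return shards


def aggregate_shard(vector_shard, measure_shard, cube_size):
    """Runs on a storage node: filter the measure column and sum per group."""
    agg = [0] * cube_size
    for code, value in zip(vector_shard, measure_shard):
        if code is not None:             # null cell: record is filtered out
            agg[code] += value
    return agg


def merge(partials):
    """Runs on the computing cluster: cell-wise merge of the partial results."""
    return [sum(cells) for cells in zip(*partials)]


# Hypothetical data: six fact records on two storage nodes, a 2 x 3 cube.
vector_index = [3, None, 0, 3, None, 5]
measure_shards = [[946, 10, 20], [54, 7, 100]]
vector_shards = split_by_shards(vector_index, [3, 3])
partials = [aggregate_shard(v, m, 6)
            for v, m in zip(vector_shards, measure_shards)]
global_agg = merge(partials)
```

Only the vector index shards and the fixed-size partial arrays cross the network in this scheme; the measure columns stay on the storage nodes. Initializing cells to 0 cannot distinguish an empty group from a zero sum, so a real system would also track which cells were touched.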
4.3) Under the strategy of distributed multidimensional index and fact table storage, each server node holds the complete multidimensional index and fact data of its shard; each server node downloads the dimension vectors from the high-performance computing server cluster to the local node and completes the OLAP computation locally.
On the local node, multidimensional index computation, vector index generation, and aggregation can form a pipeline, which improves OLAP query processing performance. The local aggregation results are returned to the high-performance computing server cluster nodes, which merge them into the global aggregation result and output the query result.
4.4) When a server node is equipped with a many-core coprocessor accelerator card, the accelerator card is used to speed up multidimensional index computation. The specific steps are as follows:
4.4.1) The multidimensional index and the vector index are partitioned according to the memory size of the coprocessor accelerator card. Following the principle of maximizing accelerator card memory utilization, the largest shard that fits the accelerator card memory is allocated and copied to the accelerator card memory;
4.4.2) At query execution time, the dimension vectors are copied to the accelerator card memory; the accelerator card performs the multidimensional index computation based on dimension vector mapping and generates the vector index, which is copied back to main memory to update the corresponding vector index shard;
4.4.3) For the multidimensional index shard in main memory, the CPU performs the multidimensional index computation based on the dimension vectors and generates the corresponding vector index shard;
4.4.4) The CPU and the coprocessor accelerator card process different multidimensional index shards, so the computations on the two multidimensional index shards can run in parallel.
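The division of work in steps 4.4.1 to 4.4.4 can be sketched in Python. This is an illustration only: two threads stand in for the accelerator card and the CPU (in CPython they do not provide real hardware parallelism), no device API is used, and the names `accel_rows_that_fit`, `map_shard`, and `compute_vector_index` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def accel_rows_that_fit(n_rows, bytes_per_row, accel_mem_bytes, reserved_bytes=0):
    """Largest index shard (in rows) that fits the accelerator card memory.

    bytes_per_row covers the index columns plus one vector index cell;
    reserved_bytes is space kept free for the dimension vectors.
    """
    usable = max(0, accel_mem_bytes - reserved_bytes)
    return min(n_rows, usable // bytes_per_row)


def map_shard(index_columns, dim_vectors, dim_sizes):
    """Multidimensional index computation for one shard of index columns."""
    out = []
    for keys in zip(*index_columns):
        code = 0
        for key, vec, size in zip(keys, dim_vectors, dim_sizes):
            group = vec[key]             # positional lookup in the dimension vector
            if group is None:            # any null dimension filters the record
                code = None
                break
            code = code * size + group   # row-major one-dimensional coordinate
        out.append(code)
    return out


def compute_vector_index(index_columns, dim_vectors, dim_sizes, accel_rows):
    """Process the accelerator shard and the CPU shard concurrently."""
    accel_part = [col[:accel_rows] for col in index_columns]
    cpu_part = [col[accel_rows:] for col in index_columns]
    with ThreadPoolExecutor(max_workers=2) as pool:
        accel = pool.submit(map_shard, accel_part, dim_vectors, dim_sizes)
        cpu = pool.submit(map_shard, cpu_part, dim_vectors, dim_sizes)
        # Concatenate in shard order so positions still match the fact table.
        return accel.result() + cpu.result()
```

Because each shard reads only the shared dimension vectors and writes only its own part of the vector index, the two computations need no synchronization beyond the final concatenation.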
4.5) On the storage server node side, when the memory capacity is smaller than the data shard, multidimensional index computation uses the following optimization strategy:
4.5.1) When node memory can hold the multidimensional index and part of the measure columns, the multidimensional index is stored entirely in memory and the fact data is stored column by column: frequently accessed measure columns are kept in memory according to the LRU (least recently used) algorithm, and infrequently accessed measure columns are stored in flash;
4.5.2) When node memory cannot hold all multidimensional index columns, the multidimensional index is stored column by column in the node server's memory or flash; the LRU algorithm selects the frequently used multidimensional index columns to keep in memory;
4.5.3) During multidimensional index computation, the multidimensional index columns in memory perform the dimension vector mapping first, and the vector index records the partial results from the in-memory columns; the non-null positions in the vector index are then used as an index to access the multidimensional index columns in flash and complete the remaining multidimensional index computation.
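The two-phase computation of step 4.5.3, together with the column-level LRU placement of steps 4.5.1 and 4.5.2, can be sketched as follows. The sketch makes several assumptions: a dictionary stands in for flash, memory capacity is counted in whole columns, and the names `ColumnCache` and `two_phase_vector_index` are hypothetical.

```python
from collections import OrderedDict


class ColumnCache:
    """Column-granular LRU placement: hot columns in memory, the rest in flash."""

    def __init__(self, flash_columns, capacity):
        self.flash = flash_columns       # name -> list, stand-in for flash
        self.capacity = capacity         # number of columns memory can hold
        self.memory = OrderedDict()      # least recently used column first
        self.flash_reads = 0

    def resident(self, name):
        return name in self.memory

    def load(self, name):
        """Use a column in memory, evicting the least recently used one if full."""
        if name in self.memory:
            self.memory.move_to_end(name)
        else:
            if len(self.memory) >= self.capacity:
                self.memory.popitem(last=False)
            self.memory[name] = self.flash[name]
        return self.memory[name]

    def read_flash_cell(self, name, pos):
        self.flash_reads += 1            # count positional flash accesses
        return self.flash[name][pos]


def two_phase_vector_index(cache, dims, n_rows):
    """dims: list of (column name, dimension vector, dimension size)."""
    groups = [[None] * len(dims) for _ in range(n_rows)]
    alive = [True] * n_rows              # False once any dimension maps to null

    def apply(d, read_key):
        vec = dims[d][1]
        for i in range(n_rows):
            if alive[i]:
                group = vec[read_key(i)]
                if group is None:
                    alive[i] = False
                else:
                    groups[i][d] = group

    in_memory = [d for d, dim in enumerate(dims) if cache.resident(dim[0])]
    in_flash = [d for d in range(len(dims)) if d not in in_memory]
    for d in in_memory:                  # phase 1: memory-resident columns
        col = cache.memory[dims[d][0]]
        apply(d, lambda i, col=col: col[i])
    for d in in_flash:                   # phase 2: flash, surviving rows only
        name = dims[d][0]
        apply(d, lambda i, name=name: cache.read_flash_cell(name, i))

    result = []
    for i in range(n_rows):
        if not alive[i]:
            result.append(None)
            continue
        code = 0
        for group, dim in zip(groups[i], dims):
            code = code * dim[2] + group
        result.append(code)
    return result
```

The benefit depends on the selectivity of the in-memory columns: every record they filter out is a flash access that phase 2 never issues.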
In conclusion memory database all-in-one machine OLAP query processing technique of the present invention draws OLAP query task It is divided into dimension mapping calculation, multi-dimensional indexing calculates and polymerize three flowing water of calculating and execute the stages, as shown in fig. 7, at OLAP query Dimension mapping calculation, the multi-dimensional indexing of reason calculate and polymerization calculates three calculation stages and is respectively distributed to high performance computing service device collection Group CPU, high performance computing service device cluster coprocessor and when storage server clustered node, the calculated result in each stage with Vector mode passes to next hardware platform and continues to execute.As shown in figure 8, the different execution stages of multiple OLAP queries can be with Flowing water is parallel, improves the utilization rate of each computing resource in database all-in-one machine asymmetry hardware platform, improves system queries and handles up Performance.The ideal conditions of flowing water parallel computation is that the calculating time of three phases is close, calculating time in each stage by data volume, The Multiple factors such as computation complexity, processor memory size, processor quantity, processor performance determine, need to match by optimization Setting hardware keeps the calculating time of three phases relatively uniform, improves the computational efficiency of database all-in-one machine hardware platform.
The present invention is further described below with reference to an embodiment.
As shown in Figure 9, in this embodiment the entire OLAP query processing is divided into three stages. The high-performance server cluster of the in-memory database all-in-one machine acts as the master node and receives the OLAP query.
In the dimension table processing stage, the CPUs of the high-performance server cluster apply the selection, projection, and grouping operations on dimension tables in the SQL command to the corresponding dimension tables. The grouping attributes are projected out and compressed with a dictionary table that assigns a unique sequence number to each distinct value; the grouping projection of the dimension table is then updated into a grouping projection vector in which the original grouping attribute values are replaced by dictionary codes. For example, on the customer table the WHERE clause c_region='AMERICA' filters the records and the grouping attribute c_nation is projected out; its attribute values 'Canada' and 'Brazil' receive the dictionary codes 0 and 1 respectively, producing a dimension vector whose positions map one-to-one to the dimension table records. Similarly, a dimension vector is generated on the supplier table, where the three members of the grouping attribute receive the dictionary codes 0, 1, and 2. The two dimension tables thus yield two dimension vectors.
In the multidimensional index computation stage, each multidimensional index value maps directly to the corresponding offset in its dimension vector and reads the group value stored there. When any mapped position is null, the current fact table record does not satisfy the output condition of the query and the corresponding vector index position is set to null. When the dimension vector positions mapped by both multidimensional index values are non-null, the corresponding group codes are used as multidimensional array subscripts. For example, the multidimensional index columns l_CK and l_SK of the first record have the values 2 and 0, which map to dimension vector positions holding the values 1 and 0; the multidimensional array subscript A[1][0] is converted to the one-dimensional array subscript 3 and stored in the first position of the vector index. When sufficient many-core coprocessor accelerator cards are configured, the multidimensional index computation runs on the accelerator cards: the dimension vectors are copied to the accelerator card memory and, together with the multidimensional index columns stored there, are used to perform the multidimensional index computation; the resulting vector index is copied back to main memory. When the accelerator card memory cannot hold the entire multidimensional index computation, the computation can run concurrently on the multidimensional index column shards in main memory and in accelerator card memory. The generated vector index is used for aggregation over the measure columns; it is split into vector shards that correspond to the measure data shards and transferred to the corresponding nodes of the storage server cluster.
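The subscript conversion in this example follows row-major linearization: with 2 customer groups and 3 supplier groups, A[1][0] becomes 1 × 3 + 0 = 3. A minimal Python sketch of one vector index cell is given below; the function names and the contents of the two dimension vectors are hypothetical, chosen only so that positions 2 and 0 hold the group codes 1 and 0 as in the text.

```python
def linearize(coords, dim_sizes):
    """Row-major conversion of a multidimensional subscript to one dimension."""
    code = 0
    for coord, size in zip(coords, dim_sizes):
        code = code * size + coord
    return code


def vector_index_cell(keys, dim_vectors, dim_sizes):
    """Compute one vector index cell from one multidimensional index entry."""
    coords = []
    for key, vec in zip(keys, dim_vectors):
        group = vec[key]        # direct positional lookup replaces the join
        if group is None:       # any null dimension filters the fact record
            return None
        coords.append(group)
    return linearize(coords, dim_sizes)


# Hypothetical dimension vectors (None marks a filtered-out dimension record).
customer_vec = [None, 0, 1, 0]      # position 2 holds group code 1
supplier_vec = [0, None, 1, 2]      # position 0 holds group code 0
dim_sizes = (2, 3)                  # 2 customer groups, 3 supplier groups

first = vector_index_cell((2, 0), (customer_vec, supplier_vec), dim_sizes)
```

Here `first` evaluates to 3, matching the one-dimensional subscript given in the text for the first record.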
In the aggregation stage, the vector index is scanned sequentially, and each non-null position is used to access the measure column record at the same position for aggregation. For example, scanning the first cell of the vector index reads the value 3; the first cell of the measure column l_revenue is accessed, and its value 946 is accumulated into the cell A[1][0] (equivalently A[3]) of the multidimensional array Agg.
After all aggregation is complete, the multidimensional array Agg is obtained. The multidimensional arrays of the storage server nodes are merged on a high-performance computing server cluster node, and each array cell subscript is mapped to the corresponding position in the dimension table dictionary tables to read the actual grouping attribute values and generate the query result records. For example, A[1][0] corresponds to the nation value Brazil in the customer table and the nation value Japan in the supplier table; the multidimensional array subscript is restored to grouping attribute values, which are combined with the aggregate value in the array cell to form an output record.
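The aggregation scan and the decoding of array subscripts back to grouping attribute values can be sketched as follows. The sketch assumes a SUM aggregate and represents untouched cells as `None`; the function names are hypothetical, and in the supplier dictionary only 'Japan' (code 0) comes from the text, while the other two members and the second measure value are placeholders.

```python
def aggregate(vector_index, measure, cube_size):
    """Scan the vector index and accumulate the measure values per group."""
    agg = [None] * cube_size
    for code, value in zip(vector_index, measure):
        if code is not None:
            agg[code] = value if agg[code] is None else agg[code] + value
    return agg


def unravel(code, dim_sizes):
    """Inverse of row-major linearization: one-dimensional subscript -> coordinates."""
    coords = []
    for size in reversed(dim_sizes):
        code, coord = divmod(code, size)
        coords.append(coord)
    return tuple(reversed(coords))


def decode_result(agg, dim_sizes, dictionaries):
    """Turn the non-empty cells of Agg into output records."""
    records = []
    for code, total in enumerate(agg):
        if total is None:
            continue
        coords = unravel(code, dim_sizes)
        names = tuple(d[c] for d, c in zip(dictionaries, coords))
        records.append(names + (total,))
    return records


customer_dict = ["Canada", "Brazil"]
supplier_dict = ["Japan", "China", "Korea"]   # last two members are placeholders
agg = aggregate([3, None], [946, 500], cube_size=6)
records = decode_result(agg, (2, 3), [customer_dict, supplier_dict])
```

With these inputs, `records` contains the single output record ('Brazil', 'Japan', 946); the second fact record is skipped because its vector index cell is null.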
Multidimensional index computation accounts for the largest share of OLAP query execution time. The algorithm uses fixed-length dimension vectors, multidimensional index columns, and vector indexes, and the join operation is reduced to a positional mapping of the multidimensional index onto the dimension vectors. An algorithm design based on array access adapts better to the hardware characteristics of many-core coprocessor accelerator cards, which integrate simple cores on a large scale, and better exploits their parallel computing capability. Under the in-memory data warehouse all-in-one machine architecture, multidimensional index computation is designed as an independent computation process that can use new many-core coprocessor accelerator cards to further improve multidimensional index computation performance. The generated vector index significantly improves aggregation performance on the measure data at the storage server nodes, reduces the computational complexity on those nodes, and improves aggregation efficiency.
In conclusion database all-in-one machine is a kind of asymmetric hardware structure, high-end calculation server cluster and low side are deposited It stores up server cluster to service respectively for high-performance complicated calculations and the access of high extension storage, novel flash memory and many-core coprocessor Accelerator card hardware technology further improves the storage and calculated performance of database all-in-one machine.For internal storage data warehouse applications Speech improves memory real-time OLAP query processing performance needs according to the storage and calculated performance feature of different hardware, targetedly Ground optimizes distributed data storage and distribution calculating task, utilizes advanced hardware-accelerated OLAP query process performance.The present invention towards Database all-in-one machine asymmetry hardware structure and devise multi-dimensional relation OLAP model, data warehouse is divided into lesser dimension Degree, medium sized multi-dimensional indexing and biggish metric data three parts are handled with high performance computing service device cluster, many-core association The storage capacity of device accelerator card memory and storage server cluster matches, and optimizes distributed data storage strategy;Meanwhile it will OLAP query treatment process is decomposed into dimension mapping calculation, multi-dimensional indexing calculates and polymerization calculates three phases, at OLAP query The main cost that calculates of reason focuses on multi-dimensional indexing calculation stages, and hardware-accelerated more by novel many-core coprocessor accelerator card Dimension index calculating process, promotes memory OLAP query processing performance by advanced hardware.
The above embodiments merely illustrate the present invention. The data structures, data types, placement, and implementation techniques of the components may vary; any improvement or equivalent transformation of individual components made on the basis of the technical solution of the present invention and in accordance with its principles shall not be excluded from the scope of protection of the present invention.

Claims (5)

1. An in-memory data warehouse query processing implementation method for a database all-in-one machine, characterized by comprising the following steps:
1) constructing an in-memory data warehouse model;
the in-memory data warehouse model uses a fused multidimensional-relational OLAP model, which is constructed as follows:
1.1) logical data model: the cube structure of the data warehouse is divided into three data structures: dimensions, the multidimensional index, and measures;
1.2) physical data model: dimensions are stored as dimension tables and dimension vectors; dimension tables are stored using a row-store or column-store database engine; a dimension vector represents a dimension as an array whose subscripts map to dimension coordinates; the multidimensional index uses a column storage model; measures are stored as a fact table using column storage;
1.3) the multidimensional OLAP query model comprises three processing stages: dimension mapping, multidimensional index computation, and aggregation;
2) constructing a distributed storage model of the in-memory data warehouse all-in-one machine;
the distributed storage model of the in-memory data warehouse all-in-one machine uses the following two distributed storage strategies:
2.1) centralized storage of the dimension tables and the multidimensional index, distributed storage of the fact table:
2.1.1) the dimension tables are stored centrally in the high-performance computing server cluster; when the computing cluster is configured with large-capacity memory and multiple many-core coprocessor accelerator cards, the multidimensional index of the in-memory data warehouse is stored centrally on the high-performance computing server cluster nodes;
2.1.2) the fact table data is horizontally sharded and distributed across the storage server cluster nodes;
2.1.3) the vector index generated by the multidimensional computation is transferred to the corresponding storage server cluster nodes, which complete the aggregation;
2.2) centralized storage of the dimension tables, distributed storage of the multidimensional index and the fact table:
when the memory capacity of the high-performance computing server cluster is small relative to that of the storage server cluster and cannot hold all of the multidimensional index data of the in-memory data warehouse, the dimension tables are stored centrally in the high-performance computing server cluster, and the multidimensional index and the fact table are horizontally sharded and distributed across the high-performance computing server cluster and the storage server cluster;
3) data update strategy for the high-performance computing server cluster: when the memory capacity of the high-performance computing server cluster is insufficient, a circular-queue update strategy evicts the oldest data and replaces it with the newest data;
4) implementing OLAP query processing on the in-memory data warehouse all-in-one machine, comprising the following steps:
4.1) the OLAP query is executed on the high-performance computing server cluster; the OLAP query command is decomposed into dimension vector generation commands on the relevant dimension tables; the dimension table records are filtered, the grouping attributes are projected out and dictionary-encoded, and the dictionary code is used as the value of the dimension vector cell corresponding to each dimension table record; dimension vector cells corresponding to dimension table records that do not satisfy the filter condition are set to null; each dimension vector relevant to the OLAP query is thereby created;
4.2) under the strategy of centralized multidimensional index storage and distributed fact table storage, the multidimensional index is logically sharded to match the physical sharding of the fact table;
4.3) under the strategy of distributed multidimensional index and fact table storage, each server node holds the complete multidimensional index and fact data of its shard, downloads the dimension vectors from the high-performance computing server cluster to the local node, and completes the OLAP computation locally;
4.4) when a server node is equipped with a many-core coprocessor accelerator card, the accelerator card is used to speed up multidimensional index computation;
4.5) on the storage server node side, when the memory capacity is smaller than the data shard, multidimensional index computation is completed using optimization strategy one.
2. The in-memory data warehouse query processing implementation method for a database all-in-one machine according to claim 1, characterized in that step 1.3) proceeds as follows:
1.3.1) dimension mapping: the OLAP query is mapped to the relevant dimension tables and dimension vectors are generated; the non-null values in a dimension vector identify the component values, on each relevant dimension, of the multidimensional data subset corresponding to the current OLAP query;
1.3.2) multidimensional index computation: the multidimensional index is mapped to the corresponding dimension vectors to perform multidimensional filtering of the measure data and to create the vector index, which marks the multidimensional index entries that satisfy the current OLAP query; a non-null value in the vector index is the multidimensional address in the aggregate data cube constructed from the grouping attributes of the OLAP query; multidimensional filtering yields the set of measure data that satisfies the OLAP query conditions, and the vector index is created for that measure data;
1.3.3) aggregation: grouped aggregation is performed on the measure data based on the vector index.
3. The in-memory data warehouse query processing implementation method for a database all-in-one machine according to claim 1, characterized in that in step 4.2) the OLAP query comprises the following three steps:
4.2.1) multidimensional filtering is computed on the multidimensional index according to the dimension vectors generated for the OLAP query, producing the corresponding vector index; a null cell in the vector index filters out the corresponding fact table record, and a non-null value is the group code of the fact table record in the OLAP query; when all the dimension vector positions to which a multidimensional index entry maps are non-null, the multidimensional coordinate in the grouping data cube formed by the mapped dimension vector values is converted to a one-dimensional coordinate and stored in the corresponding cell of the vector index;
4.2.2) the vector index is sent, by logical shard, to the corresponding nodes of the storage server cluster, where the measure columns are filtered through the vector index and aggregated;
4.2.3) the aggregation results on the storage server cluster nodes are sent back to the high-performance computing server cluster, where they are merged into the global aggregation result; the multidimensional coordinates of the grouping data cube cells in the aggregation result are mapped to the grouping dictionary table of each dimension vector and converted to grouping attribute values, and the OLAP query result is output.
4. The in-memory data warehouse query processing implementation method for a database all-in-one machine according to claim 1, characterized in that step 4.4) comprises the following specific steps:
4.4.1) the multidimensional index and the vector index are partitioned according to the memory size of the coprocessor accelerator card; following the principle of maximizing accelerator card memory utilization, the largest shard that fits the accelerator card memory is allocated and copied to the accelerator card memory;
4.4.2) at query execution time, the dimension vectors are copied to the accelerator card memory; the accelerator card performs the multidimensional index computation based on dimension vector mapping and generates the vector index, which is copied back to main memory to update the corresponding vector index shard;
4.4.3) for the multidimensional index shard in main memory, the CPU performs the multidimensional index computation based on the dimension vectors and generates the corresponding vector index shard;
4.4.4) the CPU and the coprocessor accelerator card process different multidimensional index shards, and the computations on the two multidimensional index shards are executed in parallel.
5. The in-memory data warehouse query processing implementation method for a database all-in-one machine according to claim 1, characterized in that in step 4.5) optimization strategy one is as follows:
4.5.1) when node memory can hold the multidimensional index and part of the measure columns, the multidimensional index is stored entirely in memory and the fact data is stored column by column; frequently accessed measure columns are kept in memory according to the LRU algorithm, and infrequently accessed measure columns are stored in flash;
4.5.2) when node memory cannot hold all multidimensional index columns, the multidimensional index is stored column by column in the node server's memory or flash; the LRU algorithm selects the frequently used multidimensional index columns to keep in memory;
4.5.3) during multidimensional index computation, the multidimensional index columns in memory perform the dimension vector mapping first, and the vector index records the partial results from the in-memory columns; the non-null positions in the vector index are then used as an index to access the multidimensional index columns in flash and complete the remaining multidimensional index computation.
CN201710064131.XA 2017-02-04 2017-02-04 A kind of internal storage data warehouse query processing implementation method of data base-oriented all-in-one machine Active CN106844703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710064131.XA CN106844703B (en) 2017-02-04 2017-02-04 A kind of internal storage data warehouse query processing implementation method of data base-oriented all-in-one machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710064131.XA CN106844703B (en) 2017-02-04 2017-02-04 A kind of internal storage data warehouse query processing implementation method of data base-oriented all-in-one machine

Publications (2)

Publication Number Publication Date
CN106844703A CN106844703A (en) 2017-06-13
CN106844703B true CN106844703B (en) 2019-08-02

Family

ID=59122907

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710064131.XA Active CN106844703B (en) 2017-02-04 2017-02-04 A kind of internal storage data warehouse query processing implementation method of data base-oriented all-in-one machine

Country Status (1)

Country Link
CN (1) CN106844703B (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103309958A (en) * 2013-05-28 2013-09-18 中国人民大学 OLAP star connection query optimizing method under CPU and GPU mixing framework
CN105868388A (en) * 2016-04-14 2016-08-17 中国人民大学 Method for memory on-line analytical processing (OLAP) query optimization based on field programmable gate array (FPGA)
CN106354434A (en) * 2016-08-31 2017-01-25 中国人民大学 Log data storing method and system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
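Research on OLAP query processing techniques for asymmetric in-memory computing platforms; Zhang Yansong et al.; Journal of East China Normal University (Natural Science); 20160929 (No. 5); pp. 89-91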

Also Published As

Publication number Publication date
CN106844703A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
CN106844703B (en) A kind of internal storage data warehouse query processing implementation method of data base-oriented all-in-one machine
US10691646B2 (en) Split elimination in mapreduce systems
CN106372114B (en) A kind of on-line analysing processing system and method based on big data
CN108292315B (en) Storing and retrieving data in a data cube
CN107133342A (en) A kind of IndexR real-time data analysis storehouse
CN103246749B (en) The matrix database system and its querying method that Based on Distributed calculates
Wang et al. Supporting a light-weight data management layer over hdf5
Lu et al. Scalagist: Scalable generalized search trees for mapreduce systems [innovative systems paper]
US8229916B2 (en) Method for massively parallel multi-core text indexing
CN110059067A (en) A kind of water conservancy space vector big data memory management method
Liang et al. Express supervision system based on NodeJS and MongoDB
CN103942342A (en) Memory database OLTP and OLAP concurrency query optimization method
US10977280B2 (en) Systems and methods for memory optimization interest-driven business intelligence systems
Dehne et al. The cgmCUBE project: Optimizing parallel data cube generation for ROLAP
Giannakouris et al. MuSQLE: Distributed SQL query execution over multiple engine environments
CN106095951B (en) Data space multi-dimensional indexing method based on load balancing and inquiry log
Chattopadhyay et al. Procella: Unifying serving and analytical data at YouTube
CN113032427B (en) Vectorization query processing method for CPU and GPU platform
CN104376109A (en) Multi-dimension data distribution method based on data distribution base
Zhao et al. A practice of TPC-DS multidimensional implementation on NoSQL database systems
CN112651618A (en) Construction method of audit dimension model for online audit of metering data
Theeten et al. Chive: Bandwidth optimized continuous querying in distributed clouds
CN103365923A (en) Method and device for assessing partition schemes of database
Ho et al. Data partition optimization for column-family NoSQL databases
CN103870342B (en) Task core value calculating method based on node attribute function in cloud computing environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant