CN109726225A - Storm-based distributed stream data storage and query method - Google Patents
- Publication number
- CN109726225A
- Authority
- CN
- China
- Prior art keywords
- data
- subquery
- query
- server
- storage
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a Storm-based method for storing and querying distributed stream data. Built on the Storm stream computing framework, with CephFS as the underlying storage system, the method analyzes the characteristics of distributed stream data, partitions and indexes the data in real time, and compresses each partitioned data block into CephFS. At query time, the query is decomposed into subqueries along two dimensions of the data blocks, key and time; a Bloom filter (bloomFilter) ensures that only files that may contain the required data are read; qualifying data are selected by predicate; and the subquery results are merged, aggregated, and returned to the user. Computing resources are thereby fully used to improve the efficiency of data storage and querying. The invention applies to a wide range of scenarios, offers low latency and load balancing, and achieves high-speed storage.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a Storm-based distributed stream data storage and query method.
Background art
With the rapid development of network technology, the real-time stream data generated by social networks, location-based service platforms, and similar sources is growing rapidly, and more and more fields require massive stream data to be processed and answered in real time. High-speed insertion and real-time lookup have therefore become very important data processing capabilities, allowing users to obtain both historical data and new data in real time. Platforms that provide location services, such as Baidu Maps and Amap, generate massive amounts of position and trajectory data every second. To meet user needs and improve business returns, the platform must support real-time insertion of millions of stream data records together with low-latency queries. For example, a client may need the GPS information of all vehicles within 5 km at the current time, or the driving trajectory of a given vehicle over the past hour.
Common open-source key-value stores such as HBase use an LSM-Tree to reduce the time overhead of updating leaf nodes, but newly inserted data and historical data must be merged on every update, so query latency over a time range is too high. Common time-series databases such as the open-source Druid support only inverted indexes and are inefficient for key-range queries. To solve this problem, a distributed database technique is needed that provides high-speed storage and real-time querying for massive stream data and supports efficient queries over both key ranges and time ranges. This requires that newly arrived data be separated from historical data, that queries avoid traversing data in irrelevant ranges as far as possible, and that the load across the nodes of the system be balanced so as to maximize resource utilization.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a Storm-based distributed stream data storage and query method. The invention starts from two observations: in real environments, stream data arrives in approximately sequential order and its data distribution is stable; and existing database technology cannot query efficiently over both key ranges and time ranges. On this basis it provides an efficient indexing method and a real-time time-range query processing method for massive stream data. Incoming stream data is partitioned by range, indexed in parallel on different machine nodes, and stored in a distributed file system. At query time the query is decomposed, the subqueries are executed in parallel, and after filtering, aggregation, and similar operations the merged result is returned.
The technical solution of the invention is a Storm-based distributed stream data storage and query method. As distributed stream data is received, B+Tree indexes are built in real time over several mutually isolated ranges and, once a threshold is reached, stored to a distributed file system. At query time the query is decomposed, the subqueries over different ranges are processed in parallel with the load kept balanced, and the results are merged and returned once complete, achieving high-throughput real-time stream data insertion and querying. The method specifically comprises the following steps:
S1) receive source data and distribute it to downstream units, which build index structures;
S2) compress the index structure into data blocks and write them to the distributed file storage system CephFS;
S3) decompose the query into several independent subqueries based on the query condition and data block information;
S4) process the subqueries distributed to the independent downstream query processing units by accessing the distributed file storage system CephFS;
S5) receive the returned subquery results, merge them, and return them to the user.
Further, in step S1), each piece of source data received by the stream data storage system is a data tuple, defined as d = {d_k, d_t, d_r}, where d_k is the key of the tuple, d_t is its time attribute, and d_r holds its other attribute values. K and T define a two-dimensional space D = (K, T) of key and time domain. The key range is fixed, while the time range grows continuously. A key interval K is written K(k-, k+) and a time interval T is written T(t-, t+); the two intervals determine a unique rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T}.
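The tuple and rectangle definitions above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the class names `DataTuple` and `Rect`, the helper `route`, and the choice of closed intervals are assumptions made here for clarity.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass(frozen=True)
class DataTuple:
    key: int          # d_k, the key of the tuple
    time: float       # d_t, the time attribute
    rest: Any = None  # d_r, all other attribute values


@dataclass(frozen=True)
class Rect:
    """Rectangle over a key interval K(k-, k+) and a time interval T(t-, t+).

    Closed intervals are an assumption of this sketch.
    """
    k_lo: int
    k_hi: int
    t_lo: float
    t_hi: float

    def contains(self, d: DataTuple) -> bool:
        return self.k_lo <= d.key <= self.k_hi and self.t_lo <= d.time <= self.t_hi


def route(d: DataTuple, rects: List[Rect]) -> Optional[int]:
    """Return the index of the first rectangle that contains d, or None."""
    for i, r in enumerate(rects):
        if r.contains(d):
            return i
    return None
```

With two rectangles covering keys 0 to 99 and 100 to 199, a tuple with key 150 is routed to the second one, and a tuple outside every key interval is routed nowhere.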
Further, the data tuples that fall within the rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique corresponding template B+Tree, with the key as the index. When the template B+Tree in memory reaches the threshold size chunkSize, it is stored to the distributed file system as a chunk. A chunk consists of a key array and a data array; the key array stores the key values in order, each with an offset that points into the data array.
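The key array and data array layout can be illustrated with a short Python sketch. It assumes that an offset is simply a list position rather than a byte offset, and the function names `build_chunk` and `range_lookup` are hypothetical.

```python
import bisect


def build_chunk(tuples):
    """Build (key_array, data_array) from (key, time, payload) tuples.

    key_array holds (key, offset) pairs in key order; offset is the
    position of the tuple in data_array (an assumption of this sketch).
    """
    ordered = sorted(tuples, key=lambda d: d[0])  # stable sort by key
    key_array, data_array = [], []
    for d in ordered:
        key_array.append((d[0], len(data_array)))
        data_array.append(d)
    return key_array, data_array


def range_lookup(key_array, data_array, k_lo, k_hi):
    """Find qualifying keys in the key array, then fetch from the data array."""
    keys = [k for k, _ in key_array]
    lo = bisect.bisect_left(keys, k_lo)
    hi = bisect.bisect_right(keys, k_hi)
    return [data_array[offset] for _, offset in key_array[lo:hi]]
```

Because the key array is sorted, a key-range lookup needs only two binary searches before it touches the data array.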
Further, based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the key and time domains, so that the query range is cut out as a rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined filter function that decides whether a tuple satisfies the user's selection.
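A query triple of this kind might be modeled as below. The class name `Query`, the record shape `(key, time, attributes)`, and the closed ranges are assumptions of this sketch rather than details given in the text.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

Record = Tuple[int, float, dict]  # (key, time, other attributes), assumed shape


@dataclass
class Query:
    k_range: Tuple[int, int]                         # K_q
    t_range: Tuple[float, float]                     # T_q
    f: Callable[[Record], bool] = lambda rec: True   # f_q, user-defined filter

    def matches(self, rec: Record) -> bool:
        """A record qualifies if it lies in the rectangle and passes f_q."""
        key, t, _ = rec
        return (self.k_range[0] <= key <= self.k_range[1]
                and self.t_range[0] <= t <= self.t_range[1]
                and self.f(rec))
```

For example, `Query((100, 200), (0.0, 60.0), lambda r: r[2].get("speed", 0) > 50)` selects records with keys 100 to 200, in the first 60 time units, whose speed attribute exceeds 50.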
Further, because different subquery servers (Subquery Server) store different file blocks and cache different template B+Tree leaf nodes, a query decomposition scheduling algorithm is implemented. It computes, for each Subquery Server, a priority queue of unprocessed subqueries and assigns subqueries from these queues until the set of unprocessed subqueries is empty, and it writes recently queried leaf node data to the cache. Query assignment thereby achieves cache locality, data block locality, and load balancing. The algorithm proceeds as follows:
For each subquery q_i, S(q_i) and the array of the remaining Subquery Servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority. For each element of the new array, q_i is added, together with that priority, to the subquery priority queue of the corresponding Subquery Server. After every q_i has been processed, the priority queue of each Subquery Server in turn yields its highest-priority unprocessed q_i for assignment, until all q_i have been assigned. Here S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, and q_i ∈ q denotes a subquery produced by decomposing one query.
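One way to read the scheduling procedure is sketched below in Python. The function name `schedule`, the `holders` mapping that stands in for S(q_i), and the fixed random seed are assumptions of this sketch; the patent does not specify them.

```python
import random
from collections import defaultdict


def schedule(subqueries, servers, holders, seed=0):
    """Assign each subquery to a server.

    holders maps a subquery to the servers that hold its data, i.e. S(q_i).
    Returns a dict from subquery to the chosen server.
    """
    if not servers:
        raise ValueError("at least one Subquery Server is required")
    rng = random.Random(seed)
    queues = defaultdict(list)  # server -> [(priority, subquery), ...]
    for q in subqueries:
        local = [s for s in holders.get(q, []) if s in servers]
        rest = [s for s in servers if s not in local]
        rng.shuffle(local)
        rng.shuffle(rest)
        # Servers holding the data come first; a smaller index is a higher priority.
        for priority, server in enumerate(local + rest):
            queues[server].append((priority, q))
    for server in servers:
        queues[server].sort(key=lambda item: item[0])  # stable sort by priority
    assignment = {}
    pending = set(subqueries)
    while pending:
        # Each server in turn takes its highest-priority unprocessed subquery.
        for server in servers:
            for _, q in queues[server]:
                if q in pending:
                    assignment[q] = server
                    pending.remove(q)
                    break
            if not pending:
                break
    return assignment
```

Taking one subquery per server per round is what spreads the load, while placing S(q_i) at the front of each preference array favors servers that already hold or cache the data.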
Further, in step S2), the index structure is a tree index. Once the size of the tree index exceeds a specified threshold, the data tuples in its leaf nodes are compressed with the Snappy algorithm and written as data blocks to the distributed file storage system CephFS for persistent storage, and the metadata of each data block (its tuple key range and time range) is recorded in the metadata manager (metadata keeper). Because the key domain of stream data varies only within a certain range while the time domain grows continuously, the non-leaf part of the tree index is retained as a template so that the next index can be built directly on it, which avoids the node splits of ordinary B+ tree construction and their large time overhead.
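The idea of retaining the non-leaf part as a template can be illustrated with a deliberately simplified two-level structure. This is a stand-in, not a real B+Tree: the class `TemplateIndex`, its fixed separator keys, and its `flush` method are assumptions of this sketch.

```python
import bisect


class TemplateIndex:
    """Two-level stand-in for a template B+Tree with fixed separator keys."""

    def __init__(self, separators):
        self.separators = sorted(separators)  # the retained non-leaf "template"
        self.leaves = [[] for _ in range(len(self.separators) + 1)]
        self.size = 0

    def insert(self, key, value):
        # The template already fixes the leaf boundaries, so no node is split.
        leaf = self.leaves[bisect.bisect_right(self.separators, key)]
        pos = bisect.bisect_right([k for k, _ in leaf], key)
        leaf.insert(pos, (key, value))
        self.size += 1

    def flush(self):
        """Hand over the leaf contents and reset them, keeping the template."""
        full = self.leaves
        self.leaves = [[] for _ in range(len(self.separators) + 1)]
        self.size = 0
        return full
```

After `flush()`, the same separators serve the next batch of tuples, which mirrors how a stable key distribution lets the template be reused from one chunk to the next.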
Further, in step S3), decomposing the query into several independent subqueries based on the query condition and data block information specifically comprises the following steps:
S301) the query dispatcher compares the key and time ranges in the query condition supplied by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;
S302) based on the equality condition supplied by the user, the Bloom filter (bloomFilter) method filters out the subquery regions that definitely do not contain the target data tuples;
S303) only the subqueries that may contain target data tuples are distributed to the independent downstream subquery servers (Subquery Server).
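Steps S301 to S303 can be sketched as rectangle intersection followed by a Bloom filter check. The `BloomFilter` class here is a small illustrative implementation built on SHA-256, and the chunk metadata layout (`k`, `t`, `bloom`, `file`) is hypothetical; neither is taken from the patent.

```python
import hashlib


class BloomFilter:
    """Minimal Bloom filter; m bits and k hash functions are assumed values."""

    def __init__(self, m=1024, k=3):
        self.m, self.k, self.bits = m, k, 0

    def _positions(self, item):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for p in self._positions(item):
            self.bits |= 1 << p

    def might_contain(self, item):
        return all((self.bits >> p) & 1 for p in self._positions(item))


def decompose(k_range, t_range, chunks, equal_value=None):
    """Split a query rectangle into one subquery per chunk that may match."""
    subqueries = []
    for c in chunks:
        k_lo, k_hi = max(k_range[0], c["k"][0]), min(k_range[1], c["k"][1])
        t_lo, t_hi = max(t_range[0], c["t"][0]), min(t_range[1], c["t"][1])
        if k_lo > k_hi or t_lo > t_hi:
            continue  # S301: no overlap with this chunk's rectangle
        if equal_value is not None and not c["bloom"].might_contain(equal_value):
            continue  # S302: the chunk definitely lacks the value
        subqueries.append({"file": c["file"], "k": (k_lo, k_hi), "t": (t_lo, t_hi)})
    return subqueries  # S303: only these are dispatched
```

A Bloom filter can return false positives but never false negatives, so a skipped chunk is guaranteed not to contain the value, while a kept chunk still has to be filtered exactly by the subquery server.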
Further, in step S4), processing the subqueries distributed to the independent downstream query processing units by accessing the distributed file storage system CephFS specifically comprises the following steps:
S401) the subquery servers (Subquery Server) read, in parallel, the data blocks in CephFS that correspond to their subqueries. Each server first reads the template part of the index structure in the data block to obtain each leaf node's relative offset among all leaf nodes and its offset after packing and compression, and from these computes the offsets of the leaf nodes that may contain the target key range;
S402) based on these offsets, the leaf node part of the index structure is read from the data block file; the leaf node group bytes are decompressed with the Snappy algorithm and deserialized into leaf nodes, which are then filtered by time range and equality condition;
S403) the filtered data tuples are aggregated, serialized, and sent to the query dispatcher.
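The read path of steps S401 to S403 can be sketched as follows. To stay within the Python standard library, this sketch substitutes `zlib` for Snappy and JSON for the unspecified serialization format; the function `run_subquery`, the `group_index` row layout, and the count and sum aggregation are likewise assumptions.

```python
import json
import zlib


def run_subquery(blob, group_index, k_range, t_range, pred=lambda rec: True):
    """Execute one subquery against the compressed leaf groups of a chunk.

    blob: bytes holding the compressed leaf groups.
    group_index: rows of (min_key, max_key, offset, length), standing in
    for the offsets that the template part of the index would provide.
    """
    count, total = 0, 0.0
    for min_k, max_k, offset, length in group_index:
        if max_k < k_range[0] or min_k > k_range[1]:
            continue  # S401: this group cannot contain the target key range
        # S402: read only the needed bytes, decompress, and deserialize.
        records = json.loads(zlib.decompress(blob[offset:offset + length]))
        for key, t, value in records:
            if (k_range[0] <= key <= k_range[1]
                    and t_range[0] <= t <= t_range[1]
                    and pred((key, t, value))):
                count += 1
                total += value
    # S403: aggregate, then serialize the partial result for the dispatcher.
    return json.dumps({"count": count, "sum": total}).encode()
```

Leaf groups whose key range misses the query are never decompressed, which is the point of keeping per-group offsets in the template.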
The beneficial effects of the invention are:
1. The invention applies to a wide range of scenarios. Distributed stream data processing applications, such as network traffic monitoring and analysis by telecom carriers, real-time vehicle flow and trajectory changes on location-based networked platforms, and holiday transaction indicators on e-commerce platforms, can use it to transmit and process massive data in real time.
2. The invention achieves high-speed storage. An efficient data partitioning scheme separates newly arrived data from historical data, and, by exploiting the stability of the data range, B+Tree indexes are built from a retained index template, which avoids the large time cost of tree node splits.
3. The invention has low latency. After the query condition is cut by range, only the files whose metadata may match the query range are accessed; key operations such as filtering and aggregation are processed in parallel; and cache locality and file locality are achieved, which improves query efficiency.
4. The invention balances load. The designed query scheduling algorithm assigns the decomposed subqueries to different nodes, making full use of system resources.
Brief description of the drawings
Fig. 1 is a flow diagram of the invention;
Fig. 2 is a structure diagram of a stored data block of distributed stream data according to the invention;
Fig. 3 is an internal structure diagram of the leaf nodes of a stored data block of distributed stream data according to the invention;
Fig. 4 is a query decomposition and scheduling diagram for distributed stream data according to the invention.
Detailed description of the embodiments
Specific embodiments of the invention are further described below with reference to the accompanying drawings.
As shown in Fig. 1, in a Storm-based distributed stream data storage and query method, B+Tree indexes are built in real time over several mutually isolated ranges as distributed stream data is received and, once a threshold is reached, stored to a distributed file system. At query time the query is decomposed, the subqueries over different ranges are processed in parallel with the load kept balanced, and the results are merged and returned once complete, achieving high-throughput real-time stream data insertion and querying. The method specifically comprises the following steps:
S1) receive source data and distribute it to downstream units, which build index structures;
Here, each piece of source data received by the stream data storage system is called a data tuple and is defined as d = {d_k, d_t, d_r}, where d_k is the key of the tuple, d_t is its time attribute, and d_r holds its other attribute values. K and T define a two-dimensional space D = (K, T) of key and time domain. The key range is fixed, while the time range grows continuously. A key interval K is written K(k-, k+) and a time interval T is written T(t-, t+); the two intervals determine a unique rectangle:
r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T};
The data tuples that fall within the rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique corresponding template B+Tree, with the key as the index. When the template B+Tree in memory reaches the threshold size chunkSize, it is stored to the distributed file system as a data block (data chunk). A chunk consists of a key array and a data array; the key array stores the key values in order, each with an offset that points into the data array.
Based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the key and time domains, so that the query range is cut out as a rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined filter function that decides whether a tuple satisfies the user's selection.
Because different subquery servers (Subquery Server) store different file blocks and cache different template B+Tree leaf nodes, a query decomposition scheduling algorithm is implemented. It computes, for each Subquery Server, a priority queue of unprocessed subqueries and assigns subqueries from these queues until the set of unprocessed subqueries is empty, and it writes recently queried leaf node data to the cache. Query assignment thereby achieves cache locality, data block locality, and load balancing. The algorithm proceeds as follows:
For each subquery q_i, S(q_i) and the array of the remaining Subquery Servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority. For each element of the new array, q_i is added, together with that priority, to the subquery priority queue of the corresponding Subquery Server. After every q_i has been processed, the priority queue of each Subquery Server in turn yields its highest-priority unprocessed q_i for assignment, until all q_i have been assigned. Here S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, and q_i ∈ q denotes a subquery produced by decomposing one query.
S2) compress the index structure into data blocks and write them to the distributed file storage system CephFS;
Here, the index structure is a tree index. Once the size of the tree index exceeds a specified threshold, the data tuples in its leaf nodes are compressed with the Snappy algorithm and written as data blocks to the distributed file storage system CephFS for persistent storage, and the metadata of each data block (its tuple key range and time range) is recorded in the metadata manager (metadata keeper). Because the key domain of stream data varies only within a certain range while the time domain grows continuously, the non-leaf part of the tree index is retained as a template so that the next index can be built directly on it, which avoids the node splits of ordinary B+ tree construction and their large time overhead.
S3) decompose the query into several independent subqueries based on the query condition and data block information, which specifically comprises the following steps:
S301) the query dispatcher compares the key and time ranges in the query condition supplied by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;
S302) based on the equality condition supplied by the user, the Bloom filter (bloomFilter) method filters out the subquery regions that definitely do not contain the target data tuples;
S303) only the subqueries that may contain target data tuples are distributed to the independent downstream subquery servers (Subquery Server).
S4) process the subqueries distributed to the independent downstream query processing units by accessing the distributed file storage system CephFS, which specifically comprises the following steps:
S401) the subquery servers (Subquery Server) read, in parallel, the data blocks in CephFS that correspond to their subqueries. Each server first reads the template part of the index structure in the data block to obtain each leaf node's relative offset among all leaf nodes and its offset after packing and compression, and from these computes the offsets of the leaf nodes that may contain the target key range;
S402) based on these offsets, the leaf node part of the index structure is read from the data block file; the leaf node group bytes are decompressed with the Snappy algorithm and deserialized into leaf nodes, which are then filtered by time range and equality condition;
S403) the filtered data tuples are aggregated, serialized, and sent to the query dispatcher.
S5) receive the returned subquery results, merge them, and return them to the user.
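The merge in step S5 can be sketched as combining partial aggregates from the subquery servers. The text does not say which aggregates are supported, so the count, sum, min, and max fields and the function name `merge_results` are assumptions of this sketch.

```python
def merge_results(partials):
    """Merge partial aggregates returned by the subquery servers.

    Each partial is a dict with 'count', 'sum', 'min', and 'max';
    'min' and 'max' may be None when 'count' is 0.
    """
    merged = {"count": 0, "sum": 0.0, "min": None, "max": None}
    for p in partials:
        if p["count"] == 0:
            continue  # an empty subquery result contributes nothing
        merged["count"] += p["count"]
        merged["sum"] += p["sum"]
        merged["min"] = p["min"] if merged["min"] is None else min(merged["min"], p["min"])
        merged["max"] = p["max"] if merged["max"] is None else max(merged["max"], p["max"])
    # The average must be derived from the merged sum and count,
    # not by averaging the per-server averages.
    merged["avg"] = merged["sum"] / merged["count"] if merged["count"] else None
    return merged
```

Aggregates of this kind merge correctly because each one can be computed from the partial values alone, which is what lets the subqueries run independently.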
Fig. 2 shows the internal structure of a chunk file written to the distributed file system from stream data. A chunk consists of two parts, the B+Tree template and the leaf nodes. In the figure, "template" denotes the B+Tree template part, "leaf node" denotes the leaf node part, and "compress chunk" denotes a data block of leaf nodes after packing and compression.
The B+Tree template part comprises the root node and the internal nodes of the B+Tree. Each node records key values, child nodes, and so on. The lowest-level internal nodes also record the relative offset of each group of leaf nodes among all leaf nodes, and the offset within the chunk of each group of leaf nodes after packing and compression.
The leaf nodes comprise a key array part and a data array part, and all nodes are stored contiguously in left-to-right order. When the file is stored, the template part is written to the chunk as a whole, while the leaf nodes are written in packed, compressed form. The number of leaf nodes per group, N, is set to 20, which improves the compression ratio and thus reduces the storage space required.
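Packing leaf nodes in groups of N = 20 and recording their offsets might look like the sketch below. It uses `zlib` in place of Snappy and JSON as the serialization format so that it runs on the standard library alone; the function names and the index row layout are hypothetical.

```python
import json
import zlib


def pack_leaves(leaves, group_size=20):
    """Compress leaf nodes in groups and record where each group lands.

    leaves: list of leaf nodes, each a key-sorted list of [key, time, value].
    Returns (blob, index); index rows are
    (first_leaf, min_key, max_key, offset, length).
    """
    blob = bytearray()
    index = []
    for start in range(0, len(leaves), group_size):
        group = leaves[start:start + group_size]
        keys = [rec[0] for leaf in group for rec in leaf]
        if not keys:
            continue  # skip a group whose leaves are all empty
        packed = zlib.compress(json.dumps(group).encode())
        index.append((start, min(keys), max(keys), len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_group(blob, row):
    """Decompress and deserialize the single leaf group described by row."""
    _, _, _, offset, length = row
    return json.loads(zlib.decompress(blob[offset:offset + length]))
```

Compressing several leaves together generally gives the compressor more repeated content to work with than compressing each leaf alone, at the cost of decompressing a whole group to read one leaf.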
Fig. 3 shows the data layout of the leaf node part of the stream data storage structure chunk. The layout has two parts: a key array and a data array. In the figure, "index array" denotes the key array and "data array" denotes the data array. The key array stores the key values in order, each with an offset that points into the data array. A search first finds the key values and offsets within the qualifying range in the key array and then fetches the corresponding data tuples from the data array.
As shown in Fig. 4, the query decomposition scheduling algorithm can be represented as a diagram. In the figure, Pending Set denotes all subqueries that have not yet been assigned, S(q_i) denotes the best-suited Subquery Servers for each subquery, the Subquery Server priority array of each subquery is shown alongside it, and the preferred server queues (PreferedServer Arrays) denote each subquery server's (Subquery Server) priority queue over all unprocessed subqueries. While the Pending Set is not empty, each Subquery Server ranks the subqueries in the set by priority according to the data ranges it holds locally in the file system and in its cache. After ranking, the Subquery Servers are visited in ID order and each is assigned unprocessed subqueries from the preferred server queues (PreferedServer Arrays), until every subquery in the Pending Set has been processed.
The above embodiments and description merely illustrate the principle and preferred embodiments of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention.
Claims (8)
1. A Storm-based distributed stream data storage and query method, characterized in that: as distributed stream data is received, B+Tree indexes are built in real time over several mutually isolated ranges and, once a threshold is reached, stored to a distributed file system; at query time the query is decomposed, the subqueries over different ranges are processed in parallel with the load kept balanced, and the results are merged and returned once complete, achieving high-throughput real-time stream data insertion and querying; the method specifically comprises the following steps:
S1) receive source data and distribute it to downstream units, which build index structures;
S2) compress the index structure into data blocks and write them to the distributed file storage system CephFS;
S3) decompose the query into several independent subqueries based on the query condition and data block information;
S4) process the subqueries distributed to the independent downstream query processing units by accessing the distributed file storage system CephFS;
S5) receive the returned subquery results, merge them, and return them to the user.
2. The Storm-based distributed stream data storage and query method according to claim 1, characterized in that: in step S1), each piece of source data received by the stream data storage system is a data tuple, defined as d = {d_k, d_t, d_r}, where d_k is the key of the tuple, d_t is its time attribute, and d_r holds its other attribute values; K and T define a two-dimensional space D = (K, T) of key and time domain; the key range is fixed and the time range grows continuously; a key interval K is written K(k-, k+) and a time interval T is written T(t-, t+); and the two intervals determine a unique rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T}.
3. The Storm-based distributed stream data storage and query method according to claim 2, characterized in that: the data tuples that fall within the rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K, t ∈ T} are written into the unique corresponding template B+Tree, with the key as the index; when the template B+Tree in memory reaches the threshold size chunkSize, it is stored to the distributed file system as a chunk; a chunk consists of a key array and a data array; and the key array stores the key values in order, each with an offset that points into the data array.
4. The Storm-based distributed stream data storage and query method according to claim 3, characterized in that: based on the two-dimensional space D = (K, T), a query condition of the stream data storage system can be defined as a triple q = {K_q, T_q, f_q}, where K_q and T_q are the selection ranges on the key and time domains, so that the query range is cut out as a rectangle r⟨K, T⟩ = {(k, t) ∈ R | k ∈ K_q, t ∈ T_q}, and f_q: t -> {true, false} is a user-defined filter function that decides whether a tuple satisfies the user's selection.
5. The Storm-based distributed stream data storage and query method according to claim 4, characterized in that: because different subquery servers (Subquery Server) store different file blocks and cache different template B+Tree leaf nodes, a query decomposition scheduling algorithm is implemented, which computes, for each Subquery Server, a priority queue of unprocessed subqueries and assigns subqueries from these queues until the set of unprocessed subqueries is empty, and which writes recently queried leaf node data to the cache, so that query assignment achieves cache locality, data block locality, and load balancing; the algorithm proceeds as follows:
for each subquery q_i, S(q_i) and the array of the remaining Subquery Servers are each shuffled and then concatenated, with S(q_i) first, into a new array in which a smaller index means a higher priority; for each element of the new array, q_i is added, together with that priority, to the subquery priority queue of the corresponding Subquery Server; after every q_i has been processed, the priority queue of each Subquery Server in turn yields its highest-priority unprocessed q_i for assignment, until all q_i have been assigned; here S(q_i) denotes the array of Subquery Servers that hold data in the range of q_i, and q_i ∈ q denotes a subquery produced by decomposing one query.
6. The Storm-based distributed stream data storage and query method according to claim 1, characterized in that: in step S2), the index structure is a tree index; once the size of the tree index exceeds a specified threshold, the data tuples in its leaf nodes are compressed with the Snappy algorithm and written as data blocks to the distributed file storage system CephFS for persistent storage, and the metadata of each data block (its tuple key range and time range) is recorded in the metadata manager (metadata keeper).
7. The Storm-based distributed stream data storage and query method according to claim 1, characterized in that: in step S3), decomposing the query into several independent subqueries based on the query condition and data block information specifically comprises the following steps:
S301) the query dispatcher compares the key and time ranges in the query condition supplied by the user with the data block metadata read from the metadata manager (metadata keeper), and divides the query region into a series of two-dimensional index regions;
S302) based on the equality condition supplied by the user, the Bloom filter (bloomFilter) method filters out the subquery regions that definitely do not contain the target data tuples;
S303) only the subqueries that may contain target data tuples are distributed to the independent downstream subquery servers (Subquery Server).
8. The Storm-based distributed stream data storage and query method according to claim 1, characterized in that: in step S4), processing the subqueries distributed to the independent downstream query processing units by accessing the distributed file storage system CephFS specifically comprises the following steps:
S401) the subquery servers (Subquery Server) read, in parallel, the data blocks in CephFS that correspond to their subqueries; each server first reads the template part of the index structure in the data block to obtain each leaf node's relative offset among all leaf nodes and its offset after packing and compression, and from these computes the offsets of the leaf nodes that may contain the target key range;
S402) based on these offsets, the leaf node part of the index structure is read from the data block file; the leaf node group bytes are decompressed with the Snappy algorithm and deserialized into leaf nodes, which are then filtered by time range and equality condition;
S403) the filtered data tuples are aggregated, serialized, and sent to the query dispatcher.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910026601.2A CN109726225B (en) | 2019-01-11 | 2019-01-11 | Storm-based distributed stream data storage and query method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910026601.2A CN109726225B (en) | 2019-01-11 | 2019-01-11 | Storm-based distributed stream data storage and query method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109726225A (en) | 2019-05-07 |
CN109726225B (en) | 2023-08-01 |
Family
ID=66299136
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910026601.2A Active CN109726225B (en) | 2019-01-11 | 2019-01-11 | Storm-based distributed stream data storage and query method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109726225B (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110515990A (en) * | 2019-07-23 | 2019-11-29 | 华信永道(北京)科技股份有限公司 | Data query methods of exhibiting and inquiry display systems |
CN111241099A (en) * | 2020-01-09 | 2020-06-05 | 佛山科学技术学院 | Industrial big data storage method and device |
CN111310230A (en) * | 2020-02-10 | 2020-06-19 | 腾讯云计算(北京)有限责任公司 | Spatial data processing method, device, equipment and medium |
WO2020248150A1 (en) * | 2019-06-12 | 2020-12-17 | Alibaba Group Holding Limited | Method and system for answering multi-dimensional analytical queries under local differential privacy |
CN115563103A (en) * | 2022-09-15 | 2023-01-03 | 河南星环众志信息科技有限公司 | Multi-dimensional aggregation method, system, electronic device and storage medium |
CN116244313A (en) * | 2023-05-08 | 2023-06-09 | 北京四维纵横数据技术有限公司 | JSON data storage and access method, device, computer equipment and medium |
CN117076466A (en) * | 2023-10-18 | 2023-11-17 | 河北因朵科技有限公司 | Rapid data indexing method for large archive database |
CN117689451A (en) * | 2024-01-31 | 2024-03-12 | 浙江大学 | Flink-based stream vector search method, device and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103678520A (en) * | 2013-11-29 | 2014-03-26 | 中国科学院计算技术研究所 | Multi-dimensional interval query method and system based on cloud computing |
US20140172867A1 (en) * | 2012-12-17 | 2014-06-19 | General Electric Company | Method for storage, querying, and analysis of time series data |
CN105589951A (en) * | 2015-12-18 | 2016-05-18 | 中国科学院计算机网络信息中心 | Distributed type storage method and parallel query method for mass remote-sensing image metadata |
CN107357659A (en) * | 2017-07-04 | 2017-11-17 | 东北大学 | Towards the group technology and querying method of Storm successive ranges inquiry GSLB |
2019-01-11: CN application CN201910026601.2A, patent CN109726225B, status Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140172867A1 (en) * | 2012-12-17 | 2014-06-19 | General Electric Company | Method for storage, querying, and analysis of time series data |
CN103678520A (en) * | 2013-11-29 | 2014-03-26 | 中国科学院计算技术研究所 | Multi-dimensional interval query method and system based on cloud computing |
CN105589951A (en) * | 2015-12-18 | 2016-05-18 | 中国科学院计算机网络信息中心 | Distributed type storage method and parallel query method for mass remote-sensing image metadata |
CN107357659A (en) * | 2017-07-04 | 2017-11-17 | 东北大学 | Towards the group technology and querying method of Storm successive ranges inquiry GSLB |
Non-Patent Citations (1)
Title |
---|
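Zhu Dongsheng et al.: "Research on a Metro NCC Data Center Scheme Based on the Hadoop Platform" (基于Hadoop平台的地铁NCC数据中心方案研究), Computer Measurement & Control (《计算机测量与控制》) * |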
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020248150A1 (en) * | 2019-06-12 | 2020-12-17 | Alibaba Group Holding Limited | Method and system for answering multi-dimensional analytical queries under local differential privacy |
CN110515990A (en) * | 2019-07-23 | 2019-11-29 | 华信永道(北京)科技股份有限公司 | Data query methods of exhibiting and inquiry display systems |
CN111241099A (en) * | 2020-01-09 | 2020-06-05 | 佛山科学技术学院 | Industrial big data storage method and device |
CN111310230A (en) * | 2020-02-10 | 2020-06-19 | 腾讯云计算(北京)有限责任公司 | Spatial data processing method, device, equipment and medium |
CN111310230B (en) * | 2020-02-10 | 2023-04-14 | 腾讯云计算(北京)有限责任公司 | Spatial data processing method, device, equipment and medium |
CN115563103A (en) * | 2022-09-15 | 2023-01-03 | 河南星环众志信息科技有限公司 | Multi-dimensional aggregation method, system, electronic device and storage medium |
CN115563103B (en) * | 2022-09-15 | 2023-12-08 | 河南星环众志信息科技有限公司 | Multi-dimensional aggregation method, system, electronic equipment and storage medium |
CN116244313A (en) * | 2023-05-08 | 2023-06-09 | 北京四维纵横数据技术有限公司 | JSON data storage and access method, device, computer equipment and medium |
CN117076466A (en) * | 2023-10-18 | 2023-11-17 | 河北因朵科技有限公司 | Rapid data indexing method for large archive database |
CN117076466B (en) * | 2023-10-18 | 2023-12-29 | 河北因朵科技有限公司 | Rapid data indexing method for large archive database |
CN117689451A (en) * | 2024-01-31 | 2024-03-12 | 浙江大学 | Flink-based stream vector search method, device and system |
CN117689451B (en) * | 2024-01-31 | 2024-04-26 | 浙江大学 | Flink-based stream vector search method, device and system |
Also Published As
Publication number | Publication date |
---|---|
CN109726225B (en) | 2023-08-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109726225A (en) | A kind of storage of distributed stream data and querying method based on Storm | |
US6438562B1 (en) | Parallel index maintenance | |
CN106528773B (en) | Map computing system and method based on Spark platform supporting spatial data management | |
CN110162528A (en) | Magnanimity big data search method and system | |
Zhang et al. | Trajspark: A scalable and efficient in-memory management system for big trajectory data | |
CN110287391A (en) | Multi-level trajectory data storage method, storage medium and terminal based on Hadoop | |
CN103793493B (en) | A kind of method and system for handling car-mounted terminal mass data | |
US20150006509A1 (en) | Incremental maintenance of range-partitioned statistics for query optimization | |
CN108804602A (en) | A kind of distributed spatial data storage computational methods based on SPARK | |
CN102054000A (en) | Data querying method, device and system | |
WO2009082116A1 (en) | System and method for analysis of information | |
CN106528787A (en) | Mass data multi-dimensional analysis-based query method and device | |
CN109241159A (en) | A kind of subregion querying method, system and the terminal device of data cube | |
EP3767486B1 (en) | Multi-record index structure for key-value stores | |
CN108920552A (en) | A kind of distributed index method towards multi-source high amount of traffic | |
WO2021017269A1 (en) | Data migration method and apparatus, computer device, and storage medium | |
CN107193898A (en) | The inquiry sharing method and system of log data stream based on stepped multiplexing | |
CN108733781B (en) | Cluster temporal data indexing method based on memory calculation | |
CN110059149A (en) | Electronic map spatial key Querying Distributed directory system and method | |
CN107704475A (en) | Multilayer distributed unstructured data storage method, querying method and device | |
Zhang et al. | Aggregate keyword nearest neighbor queries on road networks | |
CN110471925A (en) | Realize the method and system that index data is synchronous in search system | |
CN109726219A (en) | The method and terminal device of data query | |
CN110275885A (en) | Multi-level track data storage device based on Hadoop | |
Jiang et al. | MOIST: A scalable and parallel moving object indexer with school tracking |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||