CN109213793A - A kind of stream data processing method and system - Google Patents
- Publication number
- CN109213793A (application number CN201810889376.0A)
- Authority
- CN
- China
- Prior art keywords
- data
- summary feature
- record
- feature data
- data set
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a stream data processing method and system. Summary feature data are extracted from the large volume of stream data generated in e-commerce, multiple processing threads are established, and the summary feature data are merged into multiple data sets. The data are preprocessed to reduce their dimensionality, and the data similarity value between the reference data and each of the other data items is calculated to determine whether each item in a data set is sufficiently associated with that set and therefore whether it should be retained. As a result, when the stream data volume is large and access is highly concurrent, the system can respond to requests in time and filter out erroneous data, query time is reduced, and transmission performance is improved.
Description
Technical field
The present invention relates to the field of computer data processing technology, and in particular to a stream data processing method and system.
Background
E-commerce is a booming business model that brings new opportunities for the development of small and medium-sized enterprises (SMEs). As SMEs and e-commerce develop together, informatization is an essential intermediate link. However, informatization in SMEs is currently progressing slowly, and there is little research on building warehouse and logistics information systems for SMEs. The systems implemented so far provide basic functions but lack good detailed design and user experience. For an e-commerce company, a backward internal level of informatization is likely to become a major factor restricting its service efficiency. The design of e-commerce applications must be centered on data storage and management and on database technology, realizing a high-performance, data-centric network system in terms of both logical concept and hardware and software technology, and providing users with an effective data storage management system.
However, the prior art generally uses the concurrency control mechanism of a client/server architecture: the client receives requests, and the server responds to the data sent by the client and processes the data in parallel. When the stream data volume is large and access is highly concurrent, the system cannot respond to requests in time, client management is cumbersome, query time increases, and transmission performance is hard to guarantee. Some of the data are neither screened and filtered nor otherwise optimized, so the data stored in database tables often suffer from missing values, redundant information, data errors, and other problems. A method for processing stream data is therefore urgently needed.
Summary of the invention
Embodiments of the present invention provide a stream data processing method and system that optimize the processing of stream data, so as to solve problems in existing stream data processing such as data errors, the system's failure to respond to requests in time, increased query time, and transmission performance that is hard to guarantee.
To solve the above problems, the present invention discloses the following technical solutions:
In a first aspect, a stream data processing method is provided, comprising:
establishing a window of length S, and extracting summary feature data from the current window of a plurality of data streams using a CPU;
establishing a plurality of parallel thread processing units using a GPU, each parallel thread processing unit corresponding to one of the plurality of data streams;
merging the summary feature data to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set;
preprocessing the data in the plurality of summary feature data sets to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes;
traversing the data of each summary feature data set one by one and performing a string matching operation, in which the first record of the summary feature data set is compared with the subsequent records;
calculating the data similarity value between the reference data and each of the other data in the summary feature data set, and comparing the resulting data similarity value Q with a preset reference data similarity value to obtain a comparison result;
determining, according to the comparison result, whether the other data are retained, the retained data being the archive data of the current window.
In a second aspect, a stream data processing system is provided, comprising:
an extraction module, which establishes a window of length S and extracts summary feature data from the current window of a plurality of data streams;
a multithreading module, which establishes a plurality of parallel thread processing units, each parallel thread processing unit corresponding to one of the plurality of data streams;
a merging module, which merges the summary feature data to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set;
a preprocessing module, which preprocesses the data in the plurality of summary feature data sets to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes;
a comparison module, which traverses the data of each summary feature data set one by one and performs a string matching operation, comparing the first record of the summary feature data set with the subsequent records;
a calculation module, which calculates the data similarity value between the reference data and each of the other data in the summary feature data set, and compares the resulting data similarity value Q with a preset reference data similarity value to obtain a comparison result;
a result confirmation module, which determines, according to the comparison result, whether the other data are retained, the retained data being the archive data of the current window.
The invention discloses an e-commerce data processing method and system. Summary feature data are extracted from the large volume of stream data generated in e-commerce, multiple processing threads are established, and the summary feature data are merged into multiple data sets. The data are preprocessed to reduce their dimensionality, and the data similarity value between the reference data and each of the other data items is calculated to determine whether each item in a data set is sufficiently associated with that set and therefore whether it should be retained. With this method, when the stream data volume is large and access is highly concurrent, the system can respond to requests in time and filter out erroneous data, query time is reduced, and transmission performance is improved.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flow diagram of the stream data processing method in one embodiment of the present invention.
Fig. 2 is a structural diagram of the stream data processing system in another embodiment of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the present invention.
Referring to Fig. 1, one embodiment of the present invention provides a stream data processing method. First, a window of length S is established, and summary feature data are extracted from the current window of a plurality of data streams using a CPU. Summary feature data are the data in a data stream that best reflect the attributes of that stream; they can be obtained by word-frequency analysis or other algorithms.
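The extraction step can be illustrated with a minimal Python sketch. It assumes that each record is a whitespace-separated string and that the "summary features" of a window are simply its most frequent tokens; the function names, the `top_k` parameter, and the sample records are hypothetical and are not taken from the patent, which only says that word frequency or other algorithms may be used.

```python
from collections import Counter, deque


def summary_features(window, top_k=3):
    """Return the top_k most frequent tokens across the records of one window."""
    counts = Counter()
    for record in window:
        counts.update(record.lower().split())
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [token for token, _ in ranked[:top_k]]


def sliding_windows(stream, size):
    """Yield every full window of `size` consecutive records from a stream."""
    window = deque(maxlen=size)
    for record in stream:
        window.append(record)
        if len(window) == size:
            yield list(window)


# Hypothetical e-commerce records; S is the window length from the text.
stream = [
    "red shoes size 42",
    "red shirt size M",
    "blue shoes size 42",
    "red shoes size 43",
]
S = 3
features = [summary_features(w, top_k=2) for w in sliding_windows(stream, S)]
print(features)
```

A bounded `deque` is used so that the oldest record is dropped automatically when the window slides forward by one record.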
Next, a plurality of parallel thread processing units are established using a GPU, each parallel thread processing unit corresponding to one of the plurality of data streams. GPU threads are lightweight, and switching between them can incur zero overhead. The advantage of this kind of switching is that the GPU can switch to a ready thread and hide the latency of one thread behind computation in another; the more threads there are, the better the latency is hidden. A CPU, by contrast, implements multithreading as coarse-grained, software-managed multithreading, in which a thread switch generally costs hundreds of clock cycles, a considerable overhead. Multi-core CPUs typically have 2 to 8 compute cores, but gains in hardware performance are limited, so the number of cores cannot easily keep increasing. A GPU typically has 1 to 30 streaming multiprocessors and, when fully loaded, has a clear advantage in floating-point throughput, so a mainstream GPU can deliver 10 times the performance of a CPU or more. In both memory bandwidth and computing capability, a GPU exceeds a contemporary CPU by several times or more. Given the characteristics of stream data, it is therefore reasonable to hand the parallel algorithm, or the parallel portion of an algorithm, to the GPU and to use its high memory bandwidth and many stream processors for large-scale parallel computation, thereby accelerating stream data processing.
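The one-unit-per-stream mapping can be sketched as follows. This is only an illustration of the mapping: it uses a CPU thread pool from the Python standard library as a stand-in for the GPU thread units described above, and the stream names and the `process_stream` helper are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def process_stream(stream_id, records):
    """Placeholder per-stream work: count the records of one stream."""
    return stream_id, len(records)


# Hypothetical streams; each one is handled by its own worker.
streams = {
    "orders": ["o1", "o2", "o3"],
    "clicks": ["c1", "c2"],
    "payments": ["p1"],
}

# One worker per stream, mirroring "one processing unit per data stream".
with ThreadPoolExecutor(max_workers=len(streams)) as pool:
    futures = [pool.submit(process_stream, sid, recs) for sid, recs in streams.items()]
    results = dict(f.result() for f in futures)

print(results)
```

On an actual GPU the per-stream work would be written as a kernel; the thread pool here only shows how streams are assigned to workers, not GPU performance characteristics.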
Next, the summary feature data are merged to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set. Because the volume of stream data is massive, the data processing task can be decomposed starting from the data themselves, by splitting the original data set into multiple small data sets. Suppose the data contain M records and each record takes time t to process; processing all M records one after another then takes M*t. If the M records are split into n small data sets of M/n records each and these data sets are processed simultaneously, then, ignoring the effects of memory and CPU, the processing time can be taken as (M/n)*t.
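A short worked example makes the time estimate concrete. It assumes the reading in which the M records are split into n equally sized data sets that are processed simultaneously, and it ignores memory and scheduling costs, as the text does; the numbers are illustrative only.

```python
import math


def sequential_time(m, t):
    """Time to process m records one after another, t time units per record."""
    return m * t


def parallel_time(m, n, t):
    """Time when m records are split into n data sets processed simultaneously.

    The slowest data set holds ceil(m / n) records, so it bounds the total time.
    """
    return math.ceil(m / n) * t


M = 1_000_000  # number of records (illustrative)
t = 2          # microseconds per record (illustrative)
n = 8          # number of small data sets processed at once (illustrative)

print(sequential_time(M, t))   # 2000000 microseconds
print(parallel_time(M, n, t))  # 250000 microseconds
```

With these values the idealized speed-up equals n, which is the best case; real systems fall short of it because of the memory and CPU effects that the estimate leaves out.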
Specifically, the summary feature data are merged into a plurality of summary feature data sets as follows. The first record in the summary feature data is extracted, treated as a new summary feature data set, and saved. The second record is then analyzed by comparing it with the attributes of the summary feature data sets that already exist; if a match is found, the record is assigned to the matching summary feature data set. If the record matches none of the existing summary feature data sets, a new summary feature data set is created for it, together with its matching attribute. These two steps are repeated until every record has been scanned, finally yielding a plurality of summary feature data sets.
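The merging steps above can be sketched as a single pass over the records. The sketch assumes that records are dictionaries and that "matching" means equality on one chosen attribute; the `key` parameter and the sample records are hypothetical, since the patent does not specify the matching attribute.

```python
def merge_into_datasets(records, key):
    """Group records into data sets; the first record of each set is its reference."""
    datasets = []  # list of (match_value, members) pairs, in creation order
    for record in records:
        for match_value, members in datasets:
            if record[key] == match_value:
                members.append(record)  # matches an existing data set
                break
        else:
            # No existing data set matched: create a new one with its match attribute.
            datasets.append((record[key], [record]))
    return datasets


records = [
    {"category": "shoes", "title": "red shoes"},
    {"category": "shirts", "title": "blue shirt"},
    {"category": "shoes", "title": "red shoe"},
]
datasets = merge_into_datasets(records, "category")
for match_value, members in datasets:
    print(match_value, [m["title"] for m in members])
```

Because records are appended in arrival order, `members[0]` is always the record that created the data set, which is the record the text uses as reference data.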
Next, the data in the plurality of summary feature data sets are preprocessed to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes. Reducing the dimensionality of the small data sets obtained by decomposition greatly lowers the time complexity of the algorithm and reduces error.
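The patent does not say how redundant or weakly relevant attributes are identified. As one possible illustration, the sketch below treats a field as uninformative if it has the same value in every record, and as redundant if its values duplicate those of an earlier field; both criteria, the function name, and the sample records are assumptions.

```python
def drop_uninformative_fields(records):
    """Remove constant fields and fields that duplicate an earlier field."""
    if not records:
        return []
    keep = []
    seen_columns = set()
    for field in records[0]:
        column = tuple(r[field] for r in records)
        if len(set(column)) <= 1:
            continue  # same value everywhere: carries no information
        if column in seen_columns:
            continue  # identical to a field that is already kept
        seen_columns.add(column)
        keep.append(field)
    return [{f: r[f] for f in keep} for r in records]


records = [
    {"sku": "A1", "code": "A1", "site": "cn", "price": "10"},
    {"sku": "B2", "code": "B2", "site": "cn", "price": "12"},
]
reduced = drop_uninformative_fields(records)
print(reduced)
```

Here `code` is dropped as a duplicate of `sku` and `site` is dropped as constant, so each record shrinks from four fields to two.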
Next, the data of each summary feature data set are traversed one by one and a string matching operation is performed, in which the first record of the summary feature data set is compared with the subsequent records. The data sliding window model is a processing window over the data set that can slide: when data are processed, the window starts at the first record of the data set and slides continuously backward.
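The traversal can be sketched as a generator that walks the data set from the second record onward and compares each record with the first one, field by field. Exact string equality is used here as the simplest form of string matching; the function name and the sample data are hypothetical.

```python
def compare_with_reference(dataset):
    """Yield (position, per-field match flags) for every record after the first."""
    reference = dataset[0]
    for position, record in enumerate(dataset[1:], start=1):
        matches = {field: reference[field] == record.get(field) for field in reference}
        yield position, matches


dataset = [
    {"brand": "acme", "color": "red"},   # reference record
    {"brand": "acme", "color": "blue"},
    {"brand": "apex", "color": "red"},
]
comparisons = list(compare_with_reference(dataset))
print(comparisons)
```

Using a generator keeps only one comparison in memory at a time, which suits a window that slides over a long data set.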
Next, the data similarity value between the reference data and each of the other data in the summary feature data set is calculated, and the resulting data similarity value Q is compared with a preset reference data similarity value to obtain a comparison result.
Finally, whether the other data are retained is determined according to the comparison result, and the retained data are the archive data of the current window. If the data similarity value Q of a data item is greater than or equal to the reference data similarity value, the item is highly associated with the data set and is not erroneous data. Conversely, if Q is less than the reference data similarity value, the item is weakly associated with the data set and is erroneous data.
The data similarity value Q is calculated by the following formula:
[formula image missing from the source text]
where D is the total length of the data window of the summary feature data set, q_i is the similarity of field i, p is the number of identical characters in the two compared strings, N_max is the larger of the two compared string lengths, and m_i is the weight of field i.
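Since the formula itself is not available in this text, the following Python sketch shows only one plausible reading of the listed symbols, and every part of it is an assumption: the field similarity is taken as q_i = p / N_max with p counted position by position, and Q is taken as the weighted sum of the q_i with weights m_i that add up to 1. The threshold value and sample records are hypothetical as well. The sketch also applies the retention rule described above: a record is kept when Q is greater than or equal to the reference data similarity value.

```python
def field_similarity(a, b):
    """Assumed q_i = p / N_max, with p = number of positions where characters agree."""
    n_max = max(len(a), len(b))
    if n_max == 0:
        return 1.0  # two empty strings are treated as identical
    p = sum(1 for x, y in zip(a, b) if x == y)
    return p / n_max


def record_similarity(reference, other, weights):
    """Assumed Q = sum of m_i * q_i over the weighted fields."""
    return sum(w * field_similarity(reference[f], other[f]) for f, w in weights.items())


def filter_dataset(dataset, weights, threshold):
    """Keep the reference record and every record whose Q reaches the threshold."""
    reference = dataset[0]
    kept = [reference]
    for record in dataset[1:]:
        if record_similarity(reference, record, weights) >= threshold:
            kept.append(record)
    return kept


dataset = [
    {"title": "red shoes", "size": "42"},  # reference data (first record)
    {"title": "red shoe", "size": "42"},
    {"title": "xx", "size": "7"},
]
weights = {"title": 0.5, "size": 0.5}      # assumed field weights m_i
kept = filter_dataset(dataset, weights, threshold=0.8)
print([r["title"] for r in kept])
```

Other string similarity measures, such as edit distance, would fit the same structure; only `field_similarity` would need to change.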
In the present invention, summary feature data are extracted from the large volume of stream data generated in e-commerce, multiple processing threads are established, and the summary feature data are merged into multiple data sets. The data are preprocessed to reduce their dimensionality, and the data similarity value between the reference data and each of the other data items is calculated to determine whether each item in a data set is sufficiently associated with that set and therefore whether it should be retained. With this method, when the stream data volume is large and access is highly concurrent, the system can respond to requests in time and filter out erroneous data, query time is reduced, and transmission performance is improved.
Fig. 2 is a structural diagram of the stream data processing system in another embodiment of the present invention. The stream data processing system comprises an extraction module 201, a multithreading module 202, a merging module 203, a preprocessing module 204, a comparison module 205, a calculation module 206, and a result confirmation module 207, wherein:
The extraction module 201 establishes a window of length S and extracts summary feature data from the current window of a plurality of data streams using a CPU. Summary feature data are the data in a data stream that best reflect the attributes of that stream; they can be obtained by word-frequency analysis or other algorithms.
The multithreading module 202 establishes a plurality of parallel thread processing units using a GPU, each parallel thread processing unit corresponding to one of the plurality of data streams. GPU threads are lightweight, and switching between them can incur zero overhead. The advantage of this kind of switching is that the GPU can switch to a ready thread and hide the latency of one thread behind computation in another; the more threads there are, the better the latency is hidden. A CPU, by contrast, implements multithreading as coarse-grained, software-managed multithreading, in which a thread switch generally costs hundreds of clock cycles, a considerable overhead. Multi-core CPUs typically have 2 to 8 compute cores, but gains in hardware performance are limited, so the number of cores cannot easily keep increasing. A GPU typically has 1 to 30 streaming multiprocessors and, when fully loaded, has a clear advantage in floating-point throughput, so a mainstream GPU can deliver 10 times the performance of a CPU or more. In both memory bandwidth and computing capability, a GPU exceeds a contemporary CPU by several times or more. Given the characteristics of stream data, it is therefore reasonable to hand the parallel algorithm, or the parallel portion of an algorithm, to the GPU and to use its high memory bandwidth and many stream processors for large-scale parallel computation, thereby accelerating stream data processing.
The merging module 203 merges the summary feature data to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set. Because the volume of stream data is massive, the data processing task can be decomposed starting from the data themselves, by splitting the original data set into multiple small data sets. Suppose the data contain M records and each record takes time t to process; processing all M records one after another then takes M*t. If the M records are split into n small data sets of M/n records each and these data sets are processed simultaneously, then, ignoring the effects of memory and CPU, the processing time can be taken as (M/n)*t.
Specifically, the summary feature data are merged into a plurality of summary feature data sets as follows. The first record in the summary feature data is extracted, treated as a new summary feature data set, and saved. The second record is then analyzed by comparing it with the attributes of the summary feature data sets that already exist; if a match is found, the record is assigned to the matching summary feature data set. If the record matches none of the existing summary feature data sets, a new summary feature data set is created for it, together with its matching attribute. These two steps are repeated until every record has been scanned, finally yielding a plurality of summary feature data sets.
The preprocessing module 204 preprocesses the data in the plurality of summary feature data sets to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes. Reducing the dimensionality of the small data sets obtained by decomposition greatly lowers the time complexity of the algorithm and reduces error.
The comparison module 205 traverses the data of each summary feature data set one by one and performs a string matching operation, comparing the first record of the summary feature data set with the subsequent records. The data sliding window model is a processing window over the data set that can slide: when data are processed, the window starts at the first record of the data set and slides continuously backward.
The calculation module 206 calculates the data similarity value between the reference data and each of the other data in the summary feature data set, and compares the resulting data similarity value Q with a preset reference data similarity value to obtain a comparison result.
The result confirmation module 207 determines, according to the comparison result, whether the other data are retained; the retained data are the archive data of the current window. If the data similarity value Q of a data item is greater than or equal to the reference data similarity value, the item is highly associated with the data set and is not erroneous data. Conversely, if Q is less than the reference data similarity value, the item is weakly associated with the data set and is erroneous data.
The data similarity value Q is calculated by the following formula:
[formula image missing from the source text]
where D is the total length of the data window of the summary feature data set, q_i is the similarity of field i, p is the number of identical characters in the two compared strings, N_max is the larger of the two compared string lengths, and m_i is the weight of field i.
In the above system, summary feature data are extracted from the large volume of stream data generated in e-commerce, multiple processing threads are established, and the summary feature data are merged into multiple data sets. The data are preprocessed to reduce their dimensionality, and the data similarity value between the reference data and each of the other data items is calculated to determine whether each item in a data set is sufficiently associated with that set and therefore whether it should be retained. As a result, when the stream data volume is large and access is highly concurrent, the system can respond to requests in time and filter out erroneous data, query time is reduced, and transmission performance is improved.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element introduced by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
A person of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium and, when executed, performs the steps of the above method embodiments. The storage medium includes any medium capable of storing program code, such as ROM, RAM, a magnetic disk, or an optical disc.
Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention. It is intended merely to illustrate the technical solutions of the invention and not to limit its protection scope. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the protection scope of the present invention.
Claims (8)
1. A stream data processing method, characterized in that the method comprises:
establishing a window of length S, and extracting summary feature data from the current window of a plurality of data streams using a CPU;
establishing a plurality of parallel thread processing units using a GPU, each parallel thread processing unit corresponding to one of the plurality of data streams;
merging the summary feature data to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set;
preprocessing the data in the plurality of summary feature data sets to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes;
traversing the data of each summary feature data set one by one and performing a string matching operation, in which the first record of the summary feature data set is compared with the subsequent records;
calculating the data similarity value between the reference data and each of the other data in the summary feature data set, and comparing the resulting data similarity value Q with a preset reference data similarity value to obtain a comparison result;
determining, according to the comparison result, whether the other data are retained, the retained data being the archive data of the current window.
2. The method according to claim 1, characterized in that determining whether the other data are retained according to the comparison result specifically comprises: if the data similarity value of the other data is greater than or equal to the reference data similarity value, adding the other data to a record set and finally saving them to a new data table; if the resulting data similarity value Q is less than the reference data similarity value, deleting the other data from the summary feature data.
3. The method according to claim 1, characterized in that merging the summary feature data to form a plurality of summary feature data sets specifically comprises: extracting the first record in the summary feature data, treating it as a new summary feature data set, and saving it; analyzing the second record in the summary feature data by comparing it with the attributes of the summary feature data sets that already exist and, if a match is found, assigning the second record to the matching summary feature data set; if the record matches none of the existing summary feature data sets, creating a new summary feature data set for it together with its matching attribute; and repeating these two steps until every record has been scanned, finally yielding a plurality of summary feature data sets.
4. The method according to claim 1, characterized in that the data similarity value Q is calculated by the following formula:
[formula image missing from the source text]
where D is the total length of the data window of the summary feature data set, q_i is the similarity of field i, p is the number of identical characters in the two compared strings, N_max is the larger of the two compared string lengths, and m_i is the weight of field i.
5. A stream data processing system, characterized in that the system comprises:
an extraction module, which establishes a window of length S and extracts summary feature data from the current window of a plurality of data streams;
a multithreading module, which establishes a plurality of parallel thread processing units, each parallel thread processing unit corresponding to one of the plurality of data streams;
a merging module, which merges the summary feature data to form a plurality of summary feature data sets, wherein the first record in each summary feature data set serves as the reference data of that data set;
a preprocessing module, which preprocesses the data in the plurality of summary feature data sets to reduce the dimensionality of the data by deleting redundant or weakly relevant attributes;
a comparison module, which traverses the data of each summary feature data set one by one and performs a string matching operation, comparing the first record of the summary feature data set with the subsequent records;
a calculation module, which calculates the data similarity value between the reference data and each of the other data in the summary feature data set, and compares the resulting data similarity value Q with a preset reference data similarity value to obtain a comparison result;
a result confirmation module, which determines, according to the comparison result, whether the other data are retained, the retained data being the archive data of the current window.
6. The system according to claim 5, characterized in that determining whether the other data are retained according to the comparison result specifically comprises: if the data similarity value of the other data is greater than or equal to the reference data similarity value, adding the other data to a record set and finally saving them to a new data table; if the resulting data similarity value Q is less than the reference data similarity value, deleting the other data from the summary feature data.
7. The system according to claim 5, characterized in that merging the summary feature data to form a plurality of summary feature data sets specifically comprises: extracting the first record in the summary feature data, treating it as a new summary feature data set, and saving it; analyzing the second record in the summary feature data by comparing it with the attributes of the summary feature data sets that already exist and, if a match is found, assigning the second record to the matching summary feature data set; if the record matches none of the existing summary feature data sets, creating a new summary feature data set for it together with its matching attribute; and repeating these two steps until every record has been scanned, finally yielding a plurality of summary feature data sets.
8. The system according to claim 5, characterized in that the data similarity value Q is calculated by the following formula:
[formula image missing from the source text]
where D is the total length of the data window of the summary feature data set, q_i is the similarity of field i, p is the number of identical characters in the two compared strings, N_max is the larger of the two compared string lengths, and m_i is the weight of field i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810889376.0A CN109213793A (en) | 2018-08-07 | 2018-08-07 | A kind of stream data processing method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109213793A true CN109213793A (en) | 2019-01-15 |
Family
ID=64988067
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810889376.0A Pending CN109213793A (en) | 2018-08-07 | 2018-08-07 | A kind of stream data processing method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213793A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111158898A (en) * | 2019-11-25 | 2020-05-15 | 国网浙江省电力有限公司建设分公司 | BIM data processing method and device aiming at power transmission and transformation project site arrangement standardization |
CN111680065A (en) * | 2020-05-25 | 2020-09-18 | 泰康保险集团股份有限公司 | Processing system, equipment and method for lag data in streaming computation |
CN112650895A (en) * | 2021-01-26 | 2021-04-13 | 南京超辰信息科技有限公司 | Surveying and mapping operation data acquisition and processing system and method thereof |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101933015A (en) * | 2007-12-13 | 2010-12-29 | 图形软件科技公司 | System and method for editing cartographic data |
CN103136243A (en) * | 2011-11-29 | 2013-06-05 | 中国电信股份有限公司 | File system duplicate removal method and device based on cloud storage |
CN103279332A (en) * | 2013-06-09 | 2013-09-04 | 浪潮电子信息产业股份有限公司 | Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm |
CN103279542A (en) * | 2013-06-05 | 2013-09-04 | 中国电子科技集团公司第十五研究所 | Data importing processing method and data processing device |
US20130335432A1 (en) * | 2011-11-07 | 2013-12-19 | Square Enix Holdings Co., Ltd. | Rendering server, central server, encoding apparatus, control method, encoding method, and recording medium |
CN104317751A (en) * | 2014-11-18 | 2015-01-28 | 浪潮电子信息产业股份有限公司 | Data stream processing system on GPU (Graphic Processing Unit) and data stream processing method thereof |
CN104391679A (en) * | 2014-11-18 | 2015-03-04 | 浪潮电子信息产业股份有限公司 | GPU (graphics processing unit) processing method for high-dimensional data stream in irregular stream |
CN107273412A (en) * | 2017-05-04 | 2017-10-20 | 北京拓尔思信息技术股份有限公司 | A kind of clustering method of text data, device and system |
- 2018-08-07 CN CN201810889376.0A patent/CN109213793A/en active Pending
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101933015A (en) * | 2007-12-13 | 2010-12-29 | 图形软件科技公司 | System and method for editing cartographic data |
US20130335432A1 (en) * | 2011-11-07 | 2013-12-19 | Square Enix Holdings Co., Ltd. | Rendering server, central server, encoding apparatus, control method, encoding method, and recording medium |
CN103136243A (en) * | 2011-11-29 | 2013-06-05 | 中国电信股份有限公司 | File system duplicate removal method and device based on cloud storage |
CN103279542A (en) * | 2013-06-05 | 2013-09-04 | 中国电子科技集团公司第十五研究所 | Data importing processing method and data processing device |
CN103279332A (en) * | 2013-06-09 | 2013-09-04 | 浪潮电子信息产业股份有限公司 | Data flow parallel processing method based on GPU-CUDA platform and genetic algorithm |
CN104317751A (en) * | 2014-11-18 | 2015-01-28 | 浪潮电子信息产业股份有限公司 | Data stream processing system on GPU (Graphic Processing Unit) and data stream processing method thereof |
CN104391679A (en) * | 2014-11-18 | 2015-03-04 | 浪潮电子信息产业股份有限公司 | GPU (graphics processing unit) processing method for high-dimensional data stream in irregular stream |
CN107273412A (en) * | 2017-05-04 | 2017-10-20 | 北京拓尔思信息技术股份有限公司 | A kind of clustering method of text data, device and system |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111158898A (en) * | 2019-11-25 | 2020-05-15 | 国网浙江省电力有限公司建设分公司 | BIM data processing method and device aiming at power transmission and transformation project site arrangement standardization |
CN111158898B (en) * | 2019-11-25 | 2022-07-15 | 国网浙江省电力有限公司建设分公司 | BIM data processing method and device aiming at power transmission and transformation project site arrangement standardization |
CN111680065A (en) * | 2020-05-25 | 2020-09-18 | 泰康保险集团股份有限公司 | Processing system, equipment and method for lag data in streaming computation |
CN111680065B (en) * | 2020-05-25 | 2023-11-10 | 泰康保险集团股份有限公司 | Processing system, equipment and method for hysteresis data in stream type calculation |
CN112650895A (en) * | 2021-01-26 | 2021-04-13 | 南京超辰信息科技有限公司 | Surveying and mapping operation data acquisition and processing system and method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2953959C (en) | | Feature processing recipes for machine learning |
Wang et al. | | Application of improved time series Apriori algorithm by frequent itemsets in association rule data mining based on temporal constraint |
Becker et al. | | A comparative survey of business process similarity measures |
Lim et al. | | Business intelligence and analytics: Research directions |
Parameswaran et al. | | Answering queries using humans, algorithms and databases |
US9965531B2 (en) | | Data storage extract, transform and load operations for entity and time-based record generation |
US6763354B2 (en) | | Mining emergent weighted association rules utilizing backlinking reinforcement analysis |
CN113424173B (en) | | Materialized graph view for active graph analysis |
CN104573130B (en) | | The entity resolution method and device calculated based on colony |
CN109684330A (en) | | User's portrait base construction method, device, computer equipment and storage medium |
Kolchinsky et al. | | Lazy evaluation methods for detecting complex events |
CN109213793A (en) | | A kind of stream data processing method and system |
CN109033281B (en) | | Intelligent pushing system of knowledge resource library |
CN113157947A (en) | | Knowledge graph construction method, tool, device and server |
CN111310032A (en) | | Resource recommendation method and device, computer equipment and readable storage medium |
CN103995828B (en) | | A kind of cloud storage daily record data analysis method |
US10628421B2 (en) | | Managing a single database management system |
CN109165119A (en) | | A kind of electronic commerce data processing method and system |
CN112749325A (en) | | Training method and device for search ranking model, electronic equipment and computer medium |
CN110062112A (en) | | Data processing method, device, equipment and computer readable storage medium |
Mathai et al. | | An efficient approach for item set mining using both utility and frequency based methods |
CN111159213A (en) | | Data query method, device, system and storage medium |
CN112052365A (en) | | Cross-border scene portrait construction method and device |
Tüker | | Application development for improving website usability by web mining methods |
Wu et al. | | A frequent itemset mining algorithm based on composite granular computing |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190115 | |