CN106504169A - Waterlogging data processing system based on stream processing, and processing method thereof - Google Patents
- Publication number
- CN106504169A
- Authority
- CN
- China
- Prior art keywords
- modules
- result
- waterlogging
- flume
- stream process
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/10—Services
- G06Q50/26—Government or public services
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24568—Data stream processing; Continuous queries
Abstract
The invention discloses a waterlogging data processing system based on stream processing, comprising a waterlogging model computation module, a Flume module, a Kafka module, a Spark Streaming module, and an application system. The Spark Streaming stream-processing framework is used to improve reading and processing efficiency: computation results are submitted to the stream-processing framework at timestamp intervals, the Shp files are parsed within the framework, each node's result is compared with that node's result from the previous time step, and for each node the triangular mesh cells whose water depth values differ from the previous result are output. This meets practical demands and improves the efficiency of processing and display.
Description
Technical field
The invention belongs to the field of big data stream processing applications, and in particular relates to a system and method for processing waterlogging data.
Background art
With the development of big data, the requirements placed on big data processing keep rising. The original batch-processing framework, MapReduce, is suited to offline computation but cannot satisfy services with high real-time requirements, such as real-time recommendation and user behavior analysis.
Spark Streaming is a real-time computing framework built on Spark that extends Spark's ability to process large-scale streaming data. Through the rich API it provides and its in-memory high-speed execution engine, users can combine streaming, batch, and interactive query applications. Spark itself is a distributed computing framework similar to MapReduce. Its core is the resilient distributed dataset (RDD), which provides a richer model than MapReduce and can iterate over a dataset many times quickly in memory, supporting complex data mining algorithms and graph computation algorithms.
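Spark Streaming handles a stream by cutting it into small batches at a fixed batch interval. The following is a minimal pure-Python sketch of that micro-batching idea only; it does not use the Spark API, and the function name `micro_batches` and the `(timestamp, payload)` record shape are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[float, str]  # (event timestamp in seconds, payload)


def micro_batches(records: Iterable[Record], interval: float) -> Dict[int, List[str]]:
    """Group records into consecutive fixed-length time windows.

    Batch k holds every record whose timestamp t satisfies
    k * interval <= t < (k + 1) * interval.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    batches: Dict[int, List[str]] = defaultdict(list)
    for timestamp, payload in records:
        batches[int(timestamp // interval)].append(payload)
    # Return the batches ordered by their window index.
    return dict(sorted(batches.items()))


if __name__ == "__main__":
    stream = [(0.4, "a"), (1.9, "b"), (2.1, "c"), (5.0, "d")]
    for index, batch in micro_batches(stream, interval=2.0).items():
        print(index, batch)
```

A real deployment would let the framework form the batches as data arrives; the sketch groups an already collected list so that the windowing rule is easy to see.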
Flume is a distributed, reliable, and highly available system for collecting, aggregating, and transporting massive volumes of logs. It supports customizing various data senders in the logging system for collecting data; Flume also provides the ability to perform simple processing on the data and write it to various (customizable) data receivers.
Flume is mainly composed of 3 important components:
Source: collects the log data, splits it into transactions and events, and pushes them into the channel.
Channel: mainly provides the function of a queue, simply buffering the data supplied by the source.
Sink: takes the data out of the channel and stores it in the corresponding file system or database, or submits it to a remote server.
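The division of labour among the three components can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Flume itself: the `Channel` class, `run_source`, and `run_sink` are hypothetical names, and a plain list stands in for the destination file system or database.

```python
import queue
from typing import Iterable, List


class Channel:
    """Bounded FIFO buffer between a source and a sink."""

    def __init__(self, capacity: int = 100) -> None:
        self._queue: "queue.Queue[str]" = queue.Queue(maxsize=capacity)

    def put(self, event: str) -> None:
        self._queue.put_nowait(event)  # raises queue.Full when the buffer is full

    def take(self) -> str:
        return self._queue.get_nowait()  # raises queue.Empty when nothing is buffered

    def __len__(self) -> int:
        return self._queue.qsize()


def run_source(lines: Iterable[str], channel: Channel) -> int:
    """Turn each non-empty log line into an event and push it into the channel."""
    count = 0
    for line in lines:
        event = line.strip()
        if event:
            channel.put(event)
            count += 1
    return count


def run_sink(channel: Channel, store: List[str]) -> int:
    """Drain the channel into the destination store."""
    delivered = 0
    while len(channel) > 0:
        store.append(channel.take())
        delivered += 1
    return delivered


if __name__ == "__main__":
    channel = Channel(capacity=10)
    run_source(["node1 depth=0.3\n", "\n", "node2 depth=0.1\n"], channel)
    destination: List[str] = []
    run_sink(channel, destination)
    print(destination)
```

The bounded queue is the point of the design: the source and the sink never call each other directly, so a slow destination shows up as a full channel rather than as lost events.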
The usage mode that requires the fewest changes to an existing program is to read directly the log files that the program already writes. This achieves essentially seamless access, with no change needed to the existing program.
Flume is logically divided into three tiers: agent, collector, and storage.
① agent
The agent collects data. It is where data flows originate in Flume, and it transmits the data streams it produces to the collector.
② collector
The collector aggregates the data from multiple agents and loads it into storage.
③ storage
Storage is the storage system, which can be an ordinary file or HDFS, Hive, HBase, etc.
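The three tiers can be illustrated with a small sketch in which several agents feed one collector, which in turn loads a store. The names `agent`, `collector`, and `store`, the event tuple layout, and the in-memory dictionary used as storage are all assumptions for illustration; merging by timestamp is one possible aggregation policy, not something the text prescribes.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Tuple

Event = Tuple[float, str, str]  # (timestamp, agent name, payload)


def agent(name: str, readings: Iterable[Tuple[float, str]]) -> Iterator[Event]:
    """An agent tags each (timestamp, payload) reading with its own name."""
    for timestamp, payload in readings:
        yield (timestamp, name, payload)


def collector(streams: List[Iterable[Event]]) -> Iterator[Event]:
    """Merge several time-ordered agent streams into one time-ordered stream."""
    return heapq.merge(*streams)


def store(events: Iterable[Event]) -> Dict[str, List[str]]:
    """Storage tier stand-in: keep the payloads of each agent in arrival order."""
    table: Dict[str, List[str]] = {}
    for _timestamp, name, payload in events:
        table.setdefault(name, []).append(payload)
    return table


if __name__ == "__main__":
    north = agent("node-a", [(1.0, "a1"), (3.0, "a2")])
    south = agent("node-b", [(2.0, "b1"), (4.0, "b2")])
    print(store(collector([north, south])))
```

`heapq.merge` assumes each agent's stream is already ordered by timestamp, which holds here because each agent emits readings in the order it receives them.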
At present, owing to the characteristics of geographic information, real-time prediction with waterlogging models has not used distributed computing to improve its own computational efficiency. For large-area waterlogging models, therefore, multiple nodes compute different regions and the results of the individual nodes are then processed. As the area covered by the model prediction grows, however, so does the volume of data to be processed, and a single workstation, or even a server with a higher configuration, finds it increasingly difficult to keep up with this demand.
Summary of the invention
To overcome the deficiencies of the prior art, an object of the invention is to provide a waterlogging data processing system based on stream processing, so as to improve the efficiency and timeliness with which results are displayed.
To achieve the above technical purpose and technical effect, the invention adopts the following technical solution:
A waterlogging data processing system based on stream processing comprises a waterlogging model computation module, a Flume module, a Kafka module, a Spark Streaming module, and an application system. The waterlogging model computation module produces a large volume of waterlogging prediction result data, which is stored as Shp files in the Shp format (the Shp format was developed by ESRI; an ESRI Shp file set comprises a main file, an index file, and a dBASE table, and the suffix of the main file is .shp). The Flume module collects the Shp files through its agents and aggregates them at the collector of the Flume module; the sink of the Flume module delivers the logs to the Kafka module, completing the production flow of the data. The Spark Streaming module tracks the consumer offset of the data it consumes. The Spark Streaming module contains a program that parses the Shp files; after parsing the Shp files, the program returns the results that have changed each time and sends them on to the Kafka module. The application system then establishes communication with the Kafka module, listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
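The text does not say how the Shp files are parsed. As one illustration of what a parser sees first, the sketch below reads the fixed 100-byte header of a .shp main file as laid out in the published ESRI shapefile format: a big-endian file code of 9994, a big-endian file length counted in 16-bit words, then a little-endian version, shape type, and bounding box. The function names `parse_shp_header` and `demo_header` are hypothetical, and the demo header is synthetic rather than taken from real waterlogging output.

```python
import struct
from typing import Dict, Tuple, Union

HEADER_SIZE = 100  # the main-file header is always 100 bytes
FILE_CODE = 9994   # magic number stored big-endian in bytes 0-3

HeaderValue = Union[int, Tuple[float, float, float, float]]


def parse_shp_header(data: bytes) -> Dict[str, HeaderValue]:
    """Decode the fixed-size header of a .shp main file."""
    if len(data) < HEADER_SIZE:
        raise ValueError("a .shp header needs at least 100 bytes")
    file_code = struct.unpack(">i", data[0:4])[0]
    if file_code != FILE_CODE:
        raise ValueError(f"unexpected file code {file_code}")
    # File length is stored in 16-bit words, so double it to get bytes.
    length_words = struct.unpack(">i", data[24:28])[0]
    version, shape_type = struct.unpack("<ii", data[28:36])
    bbox = struct.unpack("<4d", data[36:68])  # xmin, ymin, xmax, ymax
    return {
        "length_bytes": length_words * 2,
        "version": version,
        "shape_type": shape_type,
        "bbox": bbox,
    }


def demo_header() -> bytes:
    """Build a synthetic header for an empty polygon file (shape type 5)."""
    return (
        struct.pack(">i", FILE_CODE)
        + b"\x00" * 20                          # five unused integers
        + struct.pack(">i", HEADER_SIZE // 2)   # file length in 16-bit words
        + struct.pack("<ii", 1000, 5)           # version, shape type
        + struct.pack("<4d", 0.0, 1.0, 2.0, 3.0)
        + b"\x00" * 32                          # Z and M ranges, unused here
    )


if __name__ == "__main__":
    print(parse_shp_header(demo_header()))
```

In practice a maintained shapefile library would normally be used instead of hand-written decoding; the sketch is only meant to show why the main file can be validated and bounded before any geometry is read.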
Another object of the invention is to provide a waterlogging data processing method based on stream processing, comprising the following steps:
1) The waterlogging model computation module performs the computation for different regions on multiple nodes;
2) The Flume module collects the prediction results of these nodes;
3) The Spark Streaming module processes the collected results: the computation results are submitted to the stream-processing framework at timestamp intervals, and the Shp files are parsed within the stream-processing framework;
4) By means of the Kafka module, each node's result is compared with that node's result from the previous time step;
5) The application system outputs each node's result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
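The comparison in steps 4) and 5) amounts to a difference between two snapshots of the same node. A minimal sketch follows, assuming each snapshot is a mapping from a triangle identifier to a water depth in metres; the function name `changed_cells` and the optional `tolerance` parameter are illustrative additions, not part of the described method.

```python
from typing import Dict


def changed_cells(
    previous: Dict[int, float],
    current: Dict[int, float],
    tolerance: float = 0.0,
) -> Dict[int, float]:
    """Return the cells of `current` whose depth differs from `previous`.

    A cell counts as changed when it is new, or when its depth moved by
    more than `tolerance` since the previous time step.
    """
    changed: Dict[int, float] = {}
    for cell_id, depth in current.items():
        old = previous.get(cell_id)
        if old is None or abs(depth - old) > tolerance:
            changed[cell_id] = depth
    return changed


if __name__ == "__main__":
    earlier = {1: 0.10, 2: 0.25, 3: 0.00}
    later = {1: 0.10, 2: 0.40, 3: 0.00, 4: 0.05}
    print(changed_cells(earlier, later))
```

Only the changed triangles need to be sent on for display, which is what keeps the message volume small when most of the mesh is stable between time steps. Cells that disappear from the current snapshot are not reported by this sketch.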
The beneficial effects of the invention are as follows:
Compared with the prior art, the system and method of the invention feed the computation results of the waterlogging model into a stream-computing framework, which speeds up the display of waterlogging early warnings, so that managers can take precautionary measures sooner and reduce losses.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and implemented according to the content of the specification, preferred embodiments of the invention are described in detail below with reference to the accompanying drawing. Specific embodiments of the invention are given in detail by the following examples and the accompanying drawing.
Brief description of the drawings
The drawing described here is provided for further understanding of the invention and forms part of this application. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the system framework of the invention.
Detailed description of the embodiments
The invention is described in detail below with reference to the accompanying drawing and in conjunction with the embodiments.
Referring to Fig. 1, a waterlogging data processing system based on stream processing comprises a waterlogging model computation module 1, a Flume module 2, a Kafka module 3, a Spark Streaming module 4, and an application system 5. The waterlogging model computation module 1 produces a large volume of waterlogging prediction result data, which is stored as Shp files in the Shp format. The Flume module 2 collects the Shp files through its agents and aggregates them at the collector of the Flume module 2; the sink of the Flume module 2 delivers the logs to the Kafka module 3, completing the production flow of the data. The Spark Streaming module 4 tracks the consumer offset of the data it consumes. The Spark Streaming module 4 contains a program that parses the Shp files; after parsing the Shp files, the program returns the results that have changed each time and sends them on to the Kafka module 3. The application system 5 then establishes communication with the Kafka module 3, listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
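Tracking a consumer offset means remembering, per consumer, the position of the next unread message in an append-only log. The sketch below models that bookkeeping in memory, assuming a single partition; `TopicLog`, `OffsetTracker`, and `consume` are hypothetical names and none of this is the Kafka client API.

```python
from typing import Callable, Dict, List


class TopicLog:
    """Append-only message log standing in for one partition of a topic."""

    def __init__(self) -> None:
        self._messages: List[str] = []

    def append(self, message: str) -> int:
        """Store a message and return the offset it was written at."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, start: int, max_messages: int) -> List[str]:
        return self._messages[start:start + max_messages]


class OffsetTracker:
    """Remembers, per consumer group, the offset of the next unread message."""

    def __init__(self) -> None:
        self._committed: Dict[str, int] = {}

    def position(self, group: str) -> int:
        return self._committed.get(group, 0)

    def commit(self, group: str, offset: int) -> None:
        self._committed[group] = offset


def consume(
    log: TopicLog,
    tracker: OffsetTracker,
    group: str,
    handler: Callable[[str], None],
    max_messages: int = 10,
) -> int:
    """Process one batch, committing the offset only after the handler succeeds."""
    start = tracker.position(group)
    batch = log.read(start, max_messages)
    for message in batch:
        handler(message)
    tracker.commit(group, start + len(batch))
    return len(batch)


if __name__ == "__main__":
    log, tracker = TopicLog(), OffsetTracker()
    for result in ["r0", "r1", "r2"]:
        log.append(result)
    while consume(log, tracker, "gis-display", print, max_messages=2):
        pass
```

Because the commit happens after the handler has run, a batch whose handler raises is read again on the next call, so messages may be handled more than once but are not skipped.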
The processing method of the waterlogging data processing system of this embodiment is as follows:
1) The waterlogging model computation module 1 performs the computation for different regions on multiple nodes;
2) The Flume module 2 collects the prediction results of these nodes;
3) The Spark Streaming module 4 processes the collected results: the computation results are submitted to the stream-processing framework at timestamp intervals, and the Shp files are parsed within the stream-processing framework;
4) By means of the Kafka module 3, each node's result is compared with that node's result from the previous time step;
5) The application system 5 outputs each node's result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
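Across many time steps and several nodes, the method needs to remember the last result of each node so that every new result can be reduced to its changes. A minimal stateful sketch follows, assuming results arrive as `(node_id, mesh)` pairs where a mesh maps a triangle identifier to a water depth; `NodeResultTracker` and `run_steps` are hypothetical names, and the sketch leaves out transport and Shp parsing entirely.

```python
from typing import Dict, List, Tuple

Mesh = Dict[int, float]  # triangle id -> water depth


class NodeResultTracker:
    """Keeps the latest mesh of every node and reports per-node changes."""

    def __init__(self) -> None:
        self._last: Dict[str, Mesh] = {}

    def update(self, node_id: str, mesh: Mesh) -> Mesh:
        """Record `mesh` for `node_id` and return the cells that changed."""
        previous = self._last.get(node_id, {})
        delta = {
            cell: depth
            for cell, depth in mesh.items()
            if previous.get(cell) != depth
        }
        self._last[node_id] = dict(mesh)
        return delta


def run_steps(results: List[Tuple[str, Mesh]]) -> List[Tuple[str, Mesh]]:
    """Feed successive node results through the tracker, keeping non-empty deltas."""
    tracker = NodeResultTracker()
    output: List[Tuple[str, Mesh]] = []
    for node_id, mesh in results:
        delta = tracker.update(node_id, mesh)
        if delta:
            output.append((node_id, delta))
    return output


if __name__ == "__main__":
    arrivals = [
        ("north", {1: 0.0, 2: 0.1}),
        ("south", {7: 0.3}),
        ("north", {1: 0.0, 2: 0.2}),
        ("south", {7: 0.3}),
    ]
    for node_id, delta in run_steps(arrivals):
        print(node_id, delta)
```

The first result of a node is reported in full because there is nothing to compare it with; an unchanged repeat produces no output at all.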
The above are only preferred embodiments of the invention and do not limit the invention. For those skilled in the art, the invention may be subject to various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the invention shall fall within the scope of protection of the invention.
Claims (2)
1. A waterlogging data processing system based on stream processing, characterized by comprising a waterlogging model computation module (1), a Flume module (2), a Kafka module (3), a Spark Streaming module (4), and an application system (5);
the waterlogging model computation module (1) produces a large volume of waterlogging prediction result data, which is stored as Shp files in the Shp format; the Flume module (2) collects the Shp files through its agents and aggregates them at the collector of the Flume module (2); the sink of the Flume module (2) delivers the logs to the Kafka module (3), completing the production flow of the data; the Spark Streaming module (4) tracks the consumer offset of the data it consumes; the Spark Streaming module (4) contains a program that parses the Shp files; after parsing the Shp files, the program returns the results that have changed each time and sends them on to the Kafka module (3); the application system (5) then establishes communication with the Kafka module (3), listens on a specific message queue, obtains the changed results, and completes the display of the GIS information.
2. A waterlogging data processing method based on stream processing, characterized by comprising the following steps:
1) the waterlogging model computation module (1) performs the computation for different regions on multiple nodes;
2) the Flume module (2) collects the prediction results of these nodes;
3) the Spark Streaming module (4) processes the collected results: the computation results are submitted to the stream-processing framework at timestamp intervals, and the Shp files are parsed within the stream-processing framework;
4) by means of the Kafka module (3), each node's result is compared with that node's result from the previous time step;
5) the application system (5) outputs each node's result relative to the previous one, that is, the triangular mesh cells whose water depth values differ.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611026709.4A CN106504169A (en) | 2016-11-22 | 2016-11-22 | Waterlogging data processing system based on stream processing, and processing method thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611026709.4A CN106504169A (en) | 2016-11-22 | 2016-11-22 | Waterlogging data processing system based on stream processing, and processing method thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106504169A true CN106504169A (en) | 2017-03-15 |
Family
ID=58328051
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611026709.4A Pending CN106504169A (en) | 2016-11-22 | 2016-11-22 | Waterlogging data processing system based on stream processing, and processing method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106504169A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060095202A1 (en) * | 2004-11-01 | 2006-05-04 | Hitachi, Ltd. | Method of delivering difference map data |
CN101727261A (en) * | 2008-10-17 | 2010-06-09 | 华硕电脑股份有限公司 | Page operation method and electronic device |
Non-Patent Citations (1)
Title |
---|
Chen Renfei et al.: "Design and implementation of a distributed log stream processing *** based on Flume/Kafka/Spark", 《中国科技论文在线》 (Sciencepaper Online) * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107317838A (en) * | 2017-05-24 | 2017-11-03 | 重庆邮电大学 | A kind of astronomical metadata archiving method and system based on stream data processing framework |
CN107317838B (en) * | 2017-05-24 | 2020-11-17 | 重庆邮电大学 | Astronomical metadata filing method and system based on streaming data processing architecture |
CN110377653A (en) * | 2019-07-15 | 2019-10-25 | 武汉中地数码科技有限公司 | A kind of real-time big data calculates and storage method and system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Yang | IoT stream processing and analytics in the fog | |
CN106709035B (en) | A kind of pretreatment system of electric power multidimensional panoramic view data | |
CN103297503B (en) | Mobile terminal intelligent perception system based on information retrieval server by different level | |
CN102902752B (en) | Method and system for monitoring log | |
Wang et al. | A deep learning based energy-efficient computational offloading method in Internet of vehicles | |
CN105512297A (en) | Distributed stream-oriented computation based spatial data processing method and system | |
CN106951552A (en) | A kind of user behavior data processing method based on Hadoop | |
CN109710731A (en) | A kind of multidirectional processing system of data flow based on Flink | |
CN104216889B (en) | Data dissemination analyzing and predicting method and system based on cloud service | |
CN111586091A (en) | Edge computing gateway system for realizing computing power assembly | |
Yan et al. | Big data driven wireless communications: A human-in-the-loop pushing technique for 5G systems | |
Du | Energy analysis of Internet of things data mining algorithm for smart green communication networks | |
CN106815254A (en) | A kind of data processing method and device | |
CN103916478B (en) | The method and apparatus that streaming based on distributed system builds data side | |
CN106504169A (en) | Waterlogging data processing system based on stream processing, and processing method thereof | |
CN107995278B (en) | A kind of scene intelligent analysis system and method based on metropolitan area grade Internet of Things perception data | |
CN106990913B (en) | A kind of distributed approach of extensive streaming collective data | |
CN104778355A (en) | Trajectory outlier detection method based on wide-area distributed traffic system | |
CN111049898A (en) | Method and system for realizing cross-domain architecture of computing cluster resources | |
CN110941836A (en) | Distributed vertical crawler method and terminal equipment | |
CN115391429A (en) | Time sequence data processing method and device based on big data cloud computing | |
CN114219165A (en) | Electricity consumption big data storage system, prediction algorithm and visual display platform | |
CN105991366B (en) | A kind of business monitoring method and system | |
Liu et al. | Distributed and real-time query framework for processing participatory sensing data streams | |
CN113360576A (en) | Power grid mass data real-time processing method and device based on Flink Streaming |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB03 | Change of inventor or designer information | Inventor after: Shi Xinming, Li Yujie, Liu Jia, Chen Kun, Liu Changxin, Yang Fang; Inventor before: Shi Xinming |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170315 |