CN110297871A - A method for real-time acquisition of heterogeneous data - Google Patents

A method for real-time acquisition of heterogeneous data

Info

Publication number
CN110297871A
Authority
CN
China
Prior art keywords
data
server
docking
management server
real time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910534200.8A
Other languages
Chinese (zh)
Inventor
顾凌云
王伟
李军军
李海全
张力华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Ice Stephen Mdt Infotech Ltd
Original Assignee
Changzhou Ice Stephen Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Changzhou Ice Stephen Mdt Infotech Ltd
Priority to CN201910534200.8A
Publication of CN110297871A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/283: Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00: Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08: Configuration management of networks or network elements
    • H04L41/0803: Configuration setting
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/56: Provisioning of proxy services
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00: Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04: Protocols for data compression, e.g. ROHC

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a method for real-time acquisition of heterogeneous data, belonging to the field of big data technology. The method comprises establishing several proxy servers, several docking servers and a management server; establishing one or more Agent components in each proxy server and one or more Sink components in each docking server; and establishing an MQ log queue and a configuration page in the management server. It solves the technical problem that data acquisition is untimely and that concentrated acquisition times affect business systems. The invention acquires data from different data sources in real time: different technologies are used at the bottom layer, while management and operation are unified at the upper layer. Data is acquired by parsing database logs in real time, and a data packet protocol is used to address transmission performance and reliability.

Description

A method for real-time acquisition of heterogeneous data
Technical field
The invention belongs to the field of big data technology, and in particular relates to a method for real-time acquisition of heterogeneous data.
Background art
In the big data era, enterprises hold more and more data. Many enterprises have built data warehouses and, because some business is time-sensitive, have also built real-time computing platforms. Both the data warehouse and real-time computing depend on data acquisition. The usual technical solution for data acquisition is a T+1 batch job. However, because T+1 batch acquisition runs as a scheduled task, it has the following problems:
1. Acquisition is not timely, so real-time business demands cannot be actively met;
2. Acquisition times are concentrated, which puts excessive pressure on databases and business systems;
3. Acquisition tools and technologies differ across data sources, which makes management costly.
Summary of the invention
The object of the invention is to provide a method for real-time acquisition of heterogeneous data that solves the technical problem that data acquisition is untimely and that concentrated acquisition times affect business systems.
To achieve the above object, the invention adopts the following technical solution:
A method for real-time acquisition of heterogeneous data comprises the following steps:
Step 1: establish several proxy servers, several docking servers and a management server. All proxy servers communicate with the management server over the Internet, and all docking servers communicate with the management server over the Internet. The proxy servers are used to connect to different databases, and the docking servers are used to connect to different distributed storage systems;
Step 2: establish one or more Agent components in each proxy server, establish one or more Sink components in each docking server, and establish an MQ log queue and a configuration page in the management server;
Step 3: each Agent component simulates a database, obtains the data changes of the database in real time, generates data change information, and parses the data change information into a JSON file;
Step 4: the proxy server sends the JSON file to the management server; after receiving the JSON file, the management server stores it in the MQ log queue;
Step 5: the docking server simulates a distributed storage system through each Sink component; the Sink component reads the JSON file from the MQ log queue in the management server and writes the JSON file to the distributed storage system in real time, so that the docking server and the proxy server are connected through the JSON file;
Step 6: through the configuration page provided by the management server, the administrator configures and deploys all proxy servers and all docking servers, and starts, stops and monitors data transmission between the proxy servers and the docking servers.
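The data path of Steps 3 to 5 can be pictured with a minimal in-memory sketch. It is an illustration under stated assumptions, not the patented implementation: the class names `Agent`, `ManagementServer` and `Sink`, the JSON field names, and the use of Python's `queue.Queue` in place of a real MQ log queue are all hypothetical.

```python
import json
import queue


class Agent:
    """Hypothetical Agent: turns one database change into a JSON document (Step 3)."""

    def __init__(self, source):
        self.source = source

    def capture(self, table, op, row):
        # Field names are illustrative; the text only says the change is parsed into JSON.
        return json.dumps(
            {"source": self.source, "table": table, "op": op, "row": row},
            sort_keys=True,
        )


class ManagementServer:
    """Hypothetical management server holding the MQ log queue (Step 4)."""

    def __init__(self):
        self.mq = queue.Queue()  # stand-in for a real message queue

    def receive(self, doc):
        self.mq.put(doc)


class Sink:
    """Hypothetical Sink: reads JSON from the queue and writes it to storage (Step 5)."""

    def __init__(self, storage):
        self.storage = storage  # a list standing in for a distributed storage system

    def drain(self, server):
        while True:
            try:
                doc = server.mq.get_nowait()
            except queue.Empty:
                break
            self.storage.append(json.loads(doc))


def run_demo():
    server = ManagementServer()
    agent = Agent("mysql-orders")
    server.receive(agent.capture("orders", "insert", {"id": 1, "amount": 9.5}))
    server.receive(agent.capture("orders", "update", {"id": 1, "amount": 12.0}))
    stored = []
    Sink(stored).drain(server)
    return stored
```

Because the Agent and the Sink only share the queue and the JSON format, either side can be swapped for another database or storage system without touching the other, which is the decoupling the steps describe.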
Preferably, data transmitted between the docking servers and the management server is packed, several data records at a time, into one compressed data packet using the following protocol format:
MagicCode+packagelength+Compresstype+Packagesize+data1length+data2length+...+dataNlength+data1+data1crc+data2+data2crc+...+dataN+dataNcrc;
where MagicCode is fixed as the characters ICKXPKG and marks the beginning of a data packet;
packagelength is 4 bytes of INT-type data and represents the length of the packet;
Compresstype is 1 byte of data and represents the compression algorithm used;
Packagesize represents the number of data records contained in the data packet;
data1length, data2length, ..., dataNlength indicate the sizes of the corresponding data records, where N is a positive integer;
data1, data2, ..., dataN are the contents of the data records;
data1crc, data2crc, ..., dataNcrc are the check values of the respective data records, and each check value has a length of 4.
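As a rough illustration of the packet layout, the sketch below packs a list of records into one packet. Several details are assumptions because the text does not fix them: the widths of Packagesize and of each length field (4-byte big-endian here), the meaning of packagelength (taken as the number of bytes following that field), the Compresstype codes (0 for none, 1 for zlib), the choice of CRC-32 stored in 4 bytes as the check value, and the decision to compress only the record section.

```python
import struct
import zlib

MAGIC = b"ICKXPKG"   # fixed marker given in the text
COMPRESS_NONE = 0    # assumed code values; the text does not list them
COMPRESS_ZLIB = 1


def pack_records(records, compress_type=COMPRESS_ZLIB):
    """Pack a list of bytes objects into one packet (illustrative layout)."""
    # data1length ... dataNlength, assumed 4-byte unsigned big-endian each
    lengths = b"".join(struct.pack(">I", len(r)) for r in records)
    # data1 + data1crc + ... + dataN + dataNcrc, assuming CRC-32 in 4 bytes
    body = b"".join(r + struct.pack(">I", zlib.crc32(r)) for r in records)
    if compress_type == COMPRESS_ZLIB:
        body = zlib.compress(body)
    elif compress_type != COMPRESS_NONE:
        raise ValueError("unknown compression type")
    tail = (
        struct.pack(">B", compress_type)   # Compresstype, 1 byte
        + struct.pack(">I", len(records))  # Packagesize, assumed 4 bytes
        + lengths
        + body
    )
    # packagelength: 4-byte INT, assumed to count the bytes that follow it
    return MAGIC + struct.pack(">i", len(tail)) + tail
```

Batching many records into one packet amortizes the header cost and gives the compressor more data to work with, while the per-record check value lets a receiver reject a single damaged record.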
Preferably, when Step 3 is executed, the data changes of the database are the data changes in the database whose format is binary.
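To make the binary-to-JSON conversion concrete, here is a small sketch that decodes one change record and emits JSON. The binary layout used (a 1-byte operation code, a 4-byte row id, a 2-byte table-name length, the table name, then an opaque payload) is entirely hypothetical; real database logs such as the MySQL binlog have their own, more complex formats.

```python
import json
import struct

# Hypothetical operation codes for the illustrative record layout.
OPS = {1: "insert", 2: "update", 3: "delete"}
HEADER = ">BIH"  # op code, row id, table-name length (big-endian, no padding)


def change_to_json(raw):
    """Decode one hypothetical binary change record into a JSON string."""
    op_code, row_id, name_len = struct.unpack_from(HEADER, raw, 0)
    start = struct.calcsize(HEADER)
    table = raw[start:start + name_len].decode("utf-8")
    payload = raw[start + name_len:]
    return json.dumps(
        {
            "op": OPS[op_code],
            "table": table,
            "row_id": row_id,
            "payload_hex": payload.hex(),  # opaque bytes kept as hex text
        },
        sort_keys=True,
    )
```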
Preferably, when Step 6 is executed, the administrator enters configuration information through the configuration page; the management server generates a configuration file from the configuration information and sends the configuration file to the proxy server or docking server specified in the configuration file.
The method for real-time acquisition of heterogeneous data according to the invention solves the technical problem that data acquisition is untimely and that concentrated acquisition times affect business systems. The invention acquires data from different data sources in real time: different technologies are used at the bottom layer, while management and operation are unified at the upper layer. Data is acquired by parsing database logs in real time, and a data packet protocol is used to address transmission performance and reliability.
Brief description of the drawings
Fig. 1 is a system architecture diagram of the invention;
Fig. 2 is a system management schematic diagram of the invention;
Fig. 3 is a data flow diagram of the invention.
Detailed description of the embodiments
A method for real-time acquisition of heterogeneous data, as shown in Figs. 1 to 3, comprises the following steps:
Step 1: establish several proxy servers, several docking servers and a management server. All proxy servers communicate with the management server over the Internet, and all docking servers communicate with the management server over the Internet. The proxy servers are used to connect to different databases, and the docking servers are used to connect to different distributed storage systems;
Step 2: establish one or more Agent components in each proxy server, establish one or more Sink components in each docking server, and establish an MQ log queue and a configuration page in the management server;
Step 3: each Agent component simulates a database, obtains the data changes of the database in real time, generates data change information, and parses the data change information into a JSON file;
Step 4: the proxy server sends the JSON file to the management server; after receiving the JSON file, the management server stores it in the MQ log queue;
Step 5: the docking server simulates a distributed storage system through each Sink component; the Sink component reads the JSON file from the MQ log queue in the management server and writes the JSON file to the distributed storage system in real time, so that the docking server and the proxy server are connected through the JSON file;
Step 6: through the configuration page provided by the management server, the administrator configures and deploys all proxy servers and all docking servers, and starts, stops and monitors data transmission between the proxy servers and the docking servers.
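The start, stop and monitor operations of Step 6 could be modelled along the following lines. This is only a sketch: the `TransferController` class, its method names and the per-link record counter are hypothetical and stand in for whatever the configuration page actually drives.

```python
class TransferController:
    """Hypothetical control plane for transfers between proxy and docking servers."""

    def __init__(self):
        self._state = {}  # (proxy, docking) -> "running" or "stopped"
        self._sent = {}   # (proxy, docking) -> number of records forwarded

    def start(self, proxy, docking):
        key = (proxy, docking)
        self._state[key] = "running"
        self._sent.setdefault(key, 0)

    def stop(self, proxy, docking):
        key = (proxy, docking)
        if key not in self._state:
            raise KeyError("unknown transfer: %s -> %s" % key)
        self._state[key] = "stopped"

    def record(self, proxy, docking, n=1):
        """Count n forwarded records; refused unless the transfer is running."""
        key = (proxy, docking)
        if self._state.get(key) != "running":
            return False
        self._sent[key] += n
        return True

    def monitor(self):
        """Return a snapshot suitable for display on a monitoring page."""
        return {
            "%s->%s" % key: {"state": state, "sent": self._sent[key]}
            for key, state in self._state.items()
        }
```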
In this embodiment, the Agent component has several concrete implementations to cope with different databases. The Agent implementations in this embodiment are of three kinds, MySQLAgent, OracleAgent and MongoAgent, which acquire in real time the data of MySQL, Oracle and MongoDB respectively. The corresponding Sink implementations are HBaseSink, HiveSink and HdfsSink, which deliver the data in the MQ log queue to HBase, Hive and HDFS respectively. The distributed storage systems include HBase, Hive and HDFS.
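One simple way to picture the pairing of Agent and Sink implementations is a lookup from system type to component name. The component names below come from the embodiment; the dictionaries, the lowercase keys and the `plan_route` function are hypothetical.

```python
# Component names are taken from the embodiment; the lookup keys are illustrative.
AGENT_TYPES = {
    "mysql": "MySQLAgent",
    "oracle": "OracleAgent",
    "mongodb": "MongoAgent",
}
SINK_TYPES = {
    "hbase": "HBaseSink",
    "hive": "HiveSink",
    "hdfs": "HdfsSink",
}


def plan_route(source_kind, target_kind):
    """Pick the Agent and Sink implementation for a source and target pair."""
    source = source_kind.lower()
    target = target_kind.lower()
    if source not in AGENT_TYPES:
        raise ValueError("no Agent implementation for source: " + source_kind)
    if target not in SINK_TYPES:
        raise ValueError("no Sink implementation for target: " + target_kind)
    return AGENT_TYPES[source], SINK_TYPES[target]
```

Since every Agent writes to the same queue and every Sink reads from it, any source in the first table can in principle be routed to any target in the second.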
Preferably, data transmitted between the docking servers and the management server is packed, several data records at a time, into one compressed data packet using the following protocol format:
MagicCode+packagelength+Compresstype+Packagesize+data1length+data2length+...+dataNlength+data1+data1crc+data2+data2crc+...+dataN+dataNcrc;
where MagicCode is fixed as the characters ICKXPKG and marks the beginning of a data packet;
packagelength is 4 bytes of INT-type data and represents the length of the packet;
Compresstype is 1 byte of data and represents the compression algorithm used;
Packagesize represents the number of data records contained in the data packet;
data1length, data2length, ..., dataNlength indicate the sizes of the corresponding data records, where N is a positive integer;
data1, data2, ..., dataN are the contents of the data records. For example, data1 can be read from the decompressed data according to data1length.
data1crc, data2crc, ..., dataNcrc are the check values of the respective data records, and each check value has a length of 4. After a data record has been read, the check information of the corresponding length is read and used to verify whether the data was altered in transmission.
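The read path described above (decompress, read each record by its length, then read and verify its check value) might look like the following sketch. It rests on assumptions the text leaves open: 4-byte big-endian Packagesize and length fields, packagelength counting the bytes after that field, Compresstype 0 for none and 1 for zlib, CRC-32 in 4 bytes as the check value, and compression applied only to the record section.

```python
import struct
import zlib

MAGIC = b"ICKXPKG"  # fixed marker given in the text


def unpack_records(packet):
    """Decode one packet into a list of records, verifying each check value."""
    if packet[:len(MAGIC)] != MAGIC:
        raise ValueError("bad magic code")
    pos = len(MAGIC)
    (tail_len,) = struct.unpack_from(">i", packet, pos)  # packagelength
    pos += 4
    if len(packet) - pos != tail_len:
        raise ValueError("packet length mismatch")
    compress_type = packet[pos]                          # Compresstype, 1 byte
    pos += 1
    (count,) = struct.unpack_from(">I", packet, pos)     # Packagesize (assumed width)
    pos += 4
    lengths = struct.unpack_from(">%dI" % count, packet, pos)
    pos += 4 * count
    body = packet[pos:]
    if compress_type == 1:        # assumed code for zlib
        body = zlib.decompress(body)
    elif compress_type != 0:      # assumed code for no compression
        raise ValueError("unknown compression type")
    records = []
    offset = 0
    for n in lengths:
        data = body[offset:offset + n]                   # read dataK by dataKlength
        (crc,) = struct.unpack_from(">I", body, offset + n)
        if zlib.crc32(data) != crc:
            raise ValueError("check value mismatch: record altered in transmission")
        records.append(data)
        offset += n + 4
    return records
```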
Preferably, when Step 3 is executed, the data changes of the database are the data changes in the database whose format is binary.
Preferably, when Step 6 is executed, the administrator enters configuration information through the configuration page; the management server generates a configuration file from the configuration information and sends the configuration file to the proxy server or docking server specified in the configuration file.
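The configuration flow of Step 6 can be sketched as a function that turns form input into a configuration file and hands it to the server named in it. The form fields (`target`, `role`, `component`, `settings`), the JSON file format and the `outboxes` dictionary standing in for network delivery are all assumptions made for illustration.

```python
import json


def build_config(form):
    """Turn configuration-page input into (target server, configuration file text)."""
    required = ("target", "role", "component")
    missing = [key for key in required if key not in form]
    if missing:
        raise ValueError("missing fields: " + ", ".join(missing))
    if form["role"] not in ("proxy", "docking"):
        raise ValueError("role must be 'proxy' or 'docking'")
    config = {
        "target": form["target"],
        "role": form["role"],
        "component": form["component"],
        "settings": form.get("settings", {}),
    }
    return form["target"], json.dumps(config, indent=2, sort_keys=True)


def dispatch(form, outboxes):
    """Queue the generated file for the server it names; outboxes mimics delivery."""
    target, text = build_config(form)
    outboxes.setdefault(target, []).append(text)
    return target
```

Keeping the target inside the generated file means the management server needs no separate routing table: the file itself says which proxy or docking server should receive it.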
A UI human-computer interaction system is also provided in the management server to enable interaction between the administrator and the management server.
The method for real-time acquisition of heterogeneous data according to the invention solves the technical problem that data acquisition is untimely and that concentrated acquisition times affect business systems. The invention acquires data from different data sources in real time: different technologies are used at the bottom layer, while management and operation are unified at the upper layer. Data is acquired by parsing database logs in real time, and a data packet protocol is used to address transmission performance and reliability.

Claims (4)

1. A method for real-time acquisition of heterogeneous data, characterized by comprising the following steps:
Step 1: establish several proxy servers, several docking servers and a management server. All proxy servers communicate with the management server over the Internet, and all docking servers communicate with the management server over the Internet. The proxy servers are used to connect to different databases, and the docking servers are used to connect to different distributed storage systems;
Step 2: establish one or more Agent components in each proxy server, establish one or more Sink components in each docking server, and establish an MQ log queue and a configuration page in the management server;
Step 3: each Agent component simulates a database, obtains the data changes of the database in real time, generates data change information, and parses the data change information into a JSON file;
Step 4: the proxy server sends the JSON file to the management server; after receiving the JSON file, the management server stores it in the MQ log queue;
Step 5: the docking server simulates a distributed storage system through each Sink component; the Sink component reads the JSON file from the MQ log queue in the management server and writes the JSON file to the distributed storage system in real time, so that the docking server and the proxy server are connected through the JSON file;
Step 6: through the configuration page provided by the management server, the administrator configures and deploys all proxy servers and all docking servers, and starts, stops and monitors data transmission between the proxy servers and the docking servers.
2. The method for real-time acquisition of heterogeneous data according to claim 1, characterized in that: data transmitted between the docking servers and the management server is packed, several data records at a time, into one compressed data packet using the following protocol format:
MagicCode+packagelength+Compresstype+Packagesize+data1length+data2length+...+dataNlength+data1+data1crc+data2+data2crc+...+dataN+dataNcrc;
where MagicCode is fixed as the characters ICKXPKG and marks the beginning of a data packet;
packagelength is 4 bytes of INT-type data and represents the length of the packet;
Compresstype is 1 byte of data and represents the compression algorithm used;
Packagesize represents the number of data records contained in the data packet;
data1length, data2length, ..., dataNlength indicate the sizes of the corresponding data records, where N is a positive integer;
data1, data2, ..., dataN are the contents of the data records;
data1crc, data2crc, ..., dataNcrc are the check values of the respective data records, and each check value has a length of 4.
3. The method for real-time acquisition of heterogeneous data according to claim 1, characterized in that: when Step 3 is executed, the data changes of the database are the data changes in the database whose format is binary.
4. The method for real-time acquisition of heterogeneous data according to claim 1, characterized in that: when Step 6 is executed, the administrator enters configuration information through the configuration page; the management server generates a configuration file from the configuration information and sends the configuration file to the proxy server or docking server specified in the configuration file.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910534200.8A CN110297871A (en) 2019-06-20 2019-06-20 A kind of method that isomeric data acquires in real time

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910534200.8A CN110297871A (en) 2019-06-20 2019-06-20 A kind of method that isomeric data acquires in real time

Publications (1)

Publication Number Publication Date
CN110297871A true CN110297871A (en) 2019-10-01

Family

ID=68028239

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910534200.8A Pending CN110297871A (en) 2019-06-20 2019-06-20 A kind of method that isomeric data acquires in real time

Country Status (1)

Country Link
CN (1) CN110297871A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115022402A (en) * 2022-07-01 2022-09-06 杭州乘云数字技术有限公司 Agent acquisition method and system based on one-stack integration technology

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102571272A (en) * 2011-12-14 2012-07-11 展讯通信(上海)有限公司 Method and device for receiving service data in communication system, and baseband chip
CN103595504A (en) * 2013-11-04 2014-02-19 上海数字电视国家工程研究中心有限公司 Encapsulation method and calibration method for data package
CN106850788A (en) * 2017-01-22 2017-06-13 中国科学院电子学研究所苏州研究院 Towards the integrated framework and integrated approach of multi-source heterogeneous geographic information resources
CN107908690A (en) * 2017-11-01 2018-04-13 南京欣网互联网络科技有限公司 A kind of data processing method based on big data OA operation analysis
CN108512911A (en) * 2018-03-15 2018-09-07 成都优易数据有限公司 A kind of distributed capture agency plant and its implementation based on Flume
CN109492012A (en) * 2018-10-31 2019-03-19 厦门安胜网络科技有限公司 A kind of method, apparatus and storage medium of data real-time statistics and retrieval

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
刘劲松: "《专业学位研究生实验课—工科篇》", 30 November 2017 *
方中纯,赵江鹏: ""基于 Flume 和 HDFS 的大数据采集***的研究与实现"", 《内蒙古科技大学学报》 *
曾明宇: ""一种基于Storm和Mongodb的分布式实时日志数据存储与处理***的设计与实现及应用"", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *
胡水友,刘孜学: ""高速铁路监测***与集成平台的交互接口设计"", 《高速铁路技术》 *

Similar Documents

Publication Publication Date Title
US10270788B2 (en) Machine learning based anomaly detection
US10169709B2 (en) Avoiding incompatibility between data and computing processes to enhance computer performance
CN106227780B (en) A kind of the automation screenshot evidence collecting method and system of magnanimity webpage
CN108038207A (en) A kind of daily record data processing system, method and server
CN104022902A (en) Method and system of monitoring server cluster
US10025599B1 (en) Connectivity as a service
CN106572087A (en) Voice outbound system
CN109151056B (en) Method and system for pushing messages based on Canal
CN104462562A (en) Data migration system and method based on data warehouse automation
CN103457802A (en) Information transmission system and method
CN108319539A (en) A kind of method and system generating GPU card slot position information
CN113704004B (en) Method, device, equipment and storage medium for realizing notification service
CN103117878A (en) Design method of Nagios-based distribution monitoring system
CN110297871A (en) A kind of method that isomeric data acquires in real time
US20170236132A1 (en) Automatically modeling or simulating indications of interest
CN103685363A (en) Efficient and reliable method and system for multitask processing
CN114022711A (en) Industrial identification data caching method and device, medium and electronic equipment
CN110111068A (en) Production executive system and method based on micro services framework
CN108614820A (en) The method and apparatus for realizing the parsing of streaming source data
CN103092932A (en) Distributed document transcoding system
CN115599571A (en) Data processing method and device, electronic equipment and storage medium
CN111401819B (en) Intersystem data pushing method and system
CN204652428U (en) A kind of distributed data base management system (DDBMS)
CN104462220B (en) Web page screen-cutting and coding and transmission method and device
CN207903729U (en) A kind of escalator remote information integrated system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20191001