CN108664643A - Distributed storage system and method for collected data - Google Patents

Publication number
CN108664643A
Authority
CN
China
Prior art keywords
data
server
database
backup
cluster
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810467501.9A
Other languages
Chinese (zh)
Inventor
周士凯
蔡茜
王强
惠兴海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SICHUAN HUADI INFORMATION TECHNOLOGY CO LTD
Chongqing Technology and Business Institute
Original Assignee
SICHUAN HUADI INFORMATION TECHNOLOGY CO LTD
Chongqing Technology and Business Institute
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SICHUAN HUADI INFORMATION TECHNOLOGY CO LTD and Chongqing Technology and Business Institute
Priority to CN201810467501.9A
Publication of CN108664643A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1458 Management of the backup or restore process
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2201/00 Indexing scheme relating to error detection, to error correction, and to monitoring
    • G06F2201/80 Database-specific techniques

Abstract

The present invention discloses a distributed storage system and method for collected data. The system comprises an application server, a database server and a big data cluster server; the application server is connected to the database server, and the big data cluster server is connected to the application server. The application server hosts the application programs. The database server stores the database; it returns requested data to the application server and interacts with the big data cluster server for computation, either supplying the big data cluster with data to compute on or storing the cluster's computation results. The big data cluster server performs the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the data source to be analyzed or writing the results of data analysis and computation to the database server. The invention enables effective processing, distribution and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system.

Description

Distributed storage system and method for collected data
Technical field
The invention belongs to the technical field of data storage, and in particular relates to a distributed storage system and method for collected data.
Background
With the development of society and technology and the continuing growth of regional scale and population, data volumes of all kinds are growing explosively, and the pressure on data storage is growing with them. In particular, systems that must store large amounts of data, such as big data service platforms for vocational ability analysis, have huge data volumes and must process large amounts of data concurrently, so they need a storage system that can cope effectively with this situation.
However, most existing storage systems store data in a centralized manner and cannot cope with the processing, distribution and storage of large volumes of concurrent data; in this situation, existing storage systems suffer from low storage efficiency and poor performance and cannot be scaled out.
Summary of the invention
To solve the above problems, the present invention proposes a distributed storage system and method for collected data, which enables effective processing, distribution and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system.
To achieve the above objectives, the technical solution adopted by the present invention is: a distributed storage system for collected data, comprising an application server, a database server and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server;
The application server is used to host the application programs;
The database server is used to store the database; the database server returns requested data to the application server and interacts with the big data cluster server for computation, supplying the big data cluster with data to compute on or storing the computation results of the big data cluster;
The big data cluster server is used for the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the data source to be analyzed or writing the results of data analysis and computation to the database server.
Further, so that the database can respond quickly and handle a large volume of concurrent data, the database server uses a database-and-table hashing structure, which distributes the data in the database across multiple databases and tables; the database server also mirrors critical data, which improves the operating efficiency, security and reliability of the system and ensures the database's fault tolerance and redundancy against data corruption.
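The database-and-table hashing described above can be illustrated with a minimal sketch. The patent does not specify a hash function or a shard layout, so the function name `shard_for`, the use of SHA-256 and the parameters below are assumptions made only for illustration.

```python
import hashlib
from typing import Tuple


def shard_for(key: str, num_databases: int, tables_per_database: int) -> Tuple[int, int]:
    """Map a record key to a (database index, table index) pair.

    Hypothetical helper: a stable hash of the key picks one slot out of
    num_databases * tables_per_database, so records spread across shards.
    """
    if num_databases < 1 or tables_per_database < 1:
        raise ValueError("shard counts must be positive")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % (num_databases * tables_per_database)
    # divmod splits the slot into a database index and a table index.
    return divmod(slot, tables_per_database)


if __name__ == "__main__":
    for record_key in ("sensor-001", "sensor-002", "sensor-003"):
        print(record_key, shard_for(record_key, num_databases=4, tables_per_database=8))
```

Because the hash depends only on the key, the same record always lands in the same database and table, which is what lets reads find data without a central lookup.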
Further, the database server uses a read/write splitting structure: with a one-master, multiple-slave or dual-master, multiple-slave strategy, reads and writes are separated, so that large volumes of concurrent data can be read and written at the same time; the database server also uses different time periods as the basis for partitioning when handling reads and writes, which improves read/write performance.
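A minimal sketch of the one-master, multiple-slave read/write split and of time-period partitioning follows. The class `ReadWriteRouter`, the function `partition_for`, the server names and the monthly `records_YYYY_MM` naming are all hypothetical; the patent names the strategy but not an implementation.

```python
import itertools
from datetime import date
from typing import List


class ReadWriteRouter:
    """Send reads to replicas in round-robin order and everything else to the master."""

    def __init__(self, master: str, replicas: List[str]) -> None:
        if not replicas:
            raise ValueError("at least one replica is required")
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def route(self, sql: str) -> str:
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        # Only plain SELECT statements are treated as reads in this sketch.
        if verb == "SELECT":
            return next(self._replicas)
        return self.master


def partition_for(day: date) -> str:
    """Name the partition that holds data for the month containing `day` (assumed scheme)."""
    return f"records_{day:%Y_%m}"


if __name__ == "__main__":
    router = ReadWriteRouter("master-1", ["replica-1", "replica-2"])
    print(router.route("SELECT * FROM records"))
    print(router.route("INSERT INTO records VALUES (1)"))
    print(partition_for(date(2018, 5, 11)))
```

Routing by statement type keeps write traffic on one node while reads scale with the number of replicas; the time-based partition name shows how a period can serve as the partitioning key.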
Further, the distributed storage system comprises at least two database servers forming a database cluster, and the database cluster presents a virtual single logical database image; this improves the scalability of the system and provides a transparent data service to clients.
Further, the system also comprises a database backup server, which is connected in parallel with the database server; it backs up the database and improves system reliability.
Further, the database backup server uses three-machine hot-standby redundancy: it comprises three backup servers that back each other up. When one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time; when the database backup server currently providing service stops, the network service is not interrupted. This makes the system secure, reliable and efficient, with system availability of up to 99.99%.
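The takeover behavior and the 99.99% figure can be made concrete with a short sketch. The heartbeat-timeout mechanism, the function names and the server names are assumptions; the patent only states that another server detects the failure and takes over. For scale, 99.99% availability corresponds to roughly 52.6 minutes of downtime per year.

```python
from typing import Dict, Optional

MINUTES_PER_YEAR = 365 * 24 * 60


def allowed_downtime_minutes(availability: float) -> float:
    """Yearly downtime budget implied by an availability fraction such as 0.9999."""
    return (1.0 - availability) * MINUTES_PER_YEAR


def choose_active(current: str, last_heartbeat: Dict[str, float],
                  now: float, timeout: float) -> Optional[str]:
    """Keep the current server if it is healthy, otherwise pick a healthy peer.

    A server counts as healthy when its last heartbeat is at most `timeout`
    seconds old. Returns None when no server is healthy.
    """
    healthy = [name for name, seen in last_heartbeat.items() if now - seen <= timeout]
    if current in healthy:
        return current
    if not healthy:
        return None
    # Take over with the peer that reported most recently.
    return max(healthy, key=lambda name: last_heartbeat[name])


if __name__ == "__main__":
    heartbeats = {"backup-a": 100.0, "backup-b": 109.0, "backup-c": 108.0}
    print(choose_active("backup-a", heartbeats, now=110.0, timeout=5.0))
    print(round(allowed_downtime_minutes(0.9999), 2))
```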
Further, the big data cluster server comprises a distributed computing cluster made up of a manager machine and at least two worker machines; the manager machine is responsible for load balancing across the distributed computing cluster, which improves the working efficiency of the system.
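One way a manager machine might balance load is to hand each task to the currently least-loaded worker. The sketch below uses that greedy policy as an assumption; the patent does not say which balancing algorithm is used, and `assign_tasks` and the task and worker names are hypothetical.

```python
import heapq
from typing import Dict, List


def assign_tasks(task_costs: Dict[str, float], workers: List[str]) -> Dict[str, List[str]]:
    """Greedy balancing: largest tasks first, each to the least-loaded worker."""
    if len(workers) < 2:
        raise ValueError("the cluster is described as having at least two worker machines")
    heap = [(0.0, worker) for worker in workers]
    heapq.heapify(heap)
    plan: Dict[str, List[str]] = {worker: [] for worker in workers}
    for task, cost in sorted(task_costs.items(), key=lambda item: -item[1]):
        load, worker = heapq.heappop(heap)  # worker with the smallest total load
        plan[worker].append(task)
        heapq.heappush(heap, (load + cost, worker))
    return plan


if __name__ == "__main__":
    print(assign_tasks({"a": 5, "b": 3, "c": 2, "d": 2}, ["w1", "w2"]))
```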
In another aspect, the present invention also provides a distributed storage method for collected data, comprising the steps of:
S100: initiating a big data cluster computation through the application server;
S200: exchanging data between the big data cluster server and the database server, reading the data source to be analyzed, and processing the data in the cluster through a distributed file system;
S300: storing the processed cluster data in the database server; the database server returns the requested data to the application server.
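Steps S100 to S300 can be traced with a small in-memory sketch. Every class here is a hypothetical stand-in: the "cluster" simply counts values in one process, whereas the patent describes a distributed computation over a distributed file system.

```python
from collections import Counter
from typing import Dict, Iterable, List


class DatabaseServer:
    """Holds the source data and the computed results."""

    def __init__(self, source_rows: Iterable[str]) -> None:
        self.source_rows: List[str] = list(source_rows)
        self.results: Dict[str, int] = {}

    def read_source(self) -> List[str]:
        return list(self.source_rows)

    def write_results(self, results: Dict[str, int]) -> None:
        self.results = dict(results)

    def query(self, key: str) -> int:
        return self.results.get(key, 0)


class ClusterServer:
    """Stand-in for the big data cluster."""

    def compute(self, db: DatabaseServer) -> None:
        rows = db.read_source()          # S200: read the data source to analyze
        counts = Counter(rows)           # S200: placeholder for the cluster job
        db.write_results(counts)         # S300: store the processed data


class ApplicationServer:
    def __init__(self, db: DatabaseServer, cluster: ClusterServer) -> None:
        self.db = db
        self.cluster = cluster

    def run_analysis(self, key: str) -> int:
        self.cluster.compute(self.db)    # S100: initiate the cluster computation
        return self.db.query(key)        # S300: database returns the requested data


if __name__ == "__main__":
    database = DatabaseServer(["verified", "unverified", "verified"])
    app = ApplicationServer(database, ClusterServer())
    print(app.run_analysis("verified"))
```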
Further, the distributed file system is HDFS.
HDFS is highly fault tolerant. The collected data falls into two broad types, unverified and verified, and some of the data may need to be verified several times; high fault tolerance ensures the persistent storage and computation of the collected data.
The volume of collected data is very large, and the verification, computation and analysis tasks are numerous; HDFS provides high-throughput access to application data and suits applications with very large data sets.
Data collection and data analysis proceed at the same time, which requires streaming access to the data; HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
Further, the data in the database server is backed up by the database backup server; the backup modes include full backup, incremental backup and differential backup;
Full backup: backs up the entire data set of the database;
Differential backup: backs up the data that has changed since the last backup;
Incremental backup: backs up the data that has changed since the last full or incremental backup.
Differential backup handles the data changes caused by the owner's frequent business processing; incremental backup suits the case where the owner's business volume is small and the resulting volume of incremental data is therefore small; full backup guards against loss of the owner's business data.
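The three backup modes differ only in the cutoff used to decide what has changed. The sketch below uses the conventional definitions, which is an assumption where the text's wording is looser: differential means changed since the last full backup, and incremental means changed since the most recent backup of any kind. The function name and the integer timestamps are hypothetical.

```python
from typing import Dict, List, Optional


def select_for_backup(modified_at: Dict[str, int], mode: str,
                      last_full: int, last_any: int) -> List[str]:
    """Return the items a backup in the given mode would copy.

    modified_at maps an item name to its last modification time;
    last_full and last_any are the times of the last full backup and of
    the most recent backup of any kind.
    """
    cutoff: Optional[int]
    if mode == "full":
        cutoff = None            # copy everything
    elif mode == "differential":
        cutoff = last_full       # assumed: changes since the last full backup
    elif mode == "incremental":
        cutoff = last_any        # assumed: changes since the latest backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(name for name, ts in modified_at.items()
                  if cutoff is None or ts > cutoff)


if __name__ == "__main__":
    tables = {"t1": 5, "t2": 12, "t3": 20}
    for backup_mode in ("full", "differential", "incremental"):
        print(backup_mode, select_for_backup(tables, backup_mode, last_full=10, last_any=15))
```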
The beneficial effects of this technical solution are:
The present invention enables effective processing, distribution and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system;
The present invention has online transaction processing capability: new data keeps entering the system while it is being stored, the data of each business system can be computed on, and detailed historical data can be queried;
The present invention allows storage to be expanded in parallel: in the overall storage scheme, to keep hardware costs as low as possible, the architecture supports deployment in stages as required;
The present invention supports clustering and load balancing, so that on the existing hardware the system can still provide high-speed service under large-scale user access.
Description of the drawings
Fig. 1 is a flow diagram of the distributed storage method for collected data of the present invention;
Fig. 2 is a structural diagram of the distributed storage system for collected data of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is further described below with reference to the accompanying drawings.
In this embodiment, referring to Fig. 1, the present invention proposes a distributed storage system for collected data, comprising an application server, a database server and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server;
The application server is used to host the application programs;
The database server is used to store the database; the database server returns requested data to the application server and interacts with the big data cluster server for computation, supplying the big data cluster with data to compute on or storing the computation results of the big data cluster;
The big data cluster server is used for the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the data source to be analyzed or writing the results of data analysis and computation to the database server.
As a refinement of the above embodiment, so that the database can respond quickly and handle a large volume of concurrent data, the database server uses a database-and-table hashing structure, which distributes the data in the database across multiple databases and tables; the database server also mirrors critical data, which improves the operating efficiency, security and reliability of the system and ensures the database's fault tolerance and redundancy against data corruption.
The database server uses a read/write splitting structure: with a one-master, multiple-slave or dual-master, multiple-slave strategy, reads and writes are separated, so that large volumes of concurrent data can be read and written at the same time; the database server also uses different time periods as the basis for partitioning when handling reads and writes, which improves read/write performance.
The distributed storage system comprises at least two database servers forming a database cluster, and the database cluster presents a virtual single logical database image; this improves the scalability of the system and provides a transparent data service to clients.
As a refinement of the above embodiment, the system also comprises a database backup server, which is connected in parallel with the database server; it backs up the database and improves system reliability.
The database backup server uses three-machine hot-standby redundancy: it comprises three backup servers that back each other up. When one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time; when the database backup server currently providing service stops, the network service is not interrupted. This makes the system secure, reliable and efficient, with system availability of up to 99.99%.
As a refinement of the above embodiment, the big data cluster server comprises a distributed computing cluster made up of a manager machine and at least two worker machines; the manager machine is responsible for load balancing across the distributed computing cluster, which improves the working efficiency of the system.
To accompany the implementation of the method of the present invention, and based on the same inventive concept, as shown in Fig. 2, the present invention also provides a distributed storage method for collected data, comprising the steps of:
S100: initiating a big data cluster computation through the application server;
S200: exchanging data between the big data cluster server and the database server, reading the data source to be analyzed, and processing the data in the cluster through a distributed file system;
S300: storing the processed cluster data in the database server; the database server returns the requested data to the application server.
As a refinement of the above embodiment, the distributed file system is HDFS.
HDFS is highly fault tolerant. The collected data falls into two broad types, unverified and verified, and some of the data may need to be verified several times; high fault tolerance ensures the persistent storage and computation of the collected data.
The volume of collected data is very large, and the verification, computation and analysis tasks are numerous; HDFS provides high-throughput access to application data and suits applications with very large data sets.
Data collection and data analysis proceed at the same time, which requires streaming access to the data; HDFS relaxes some POSIX requirements so that data in the file system can be accessed as a stream.
As a refinement of the above embodiment, the data in the database server is backed up by the database backup server; the backup modes include full backup, incremental backup and differential backup;
Full backup: backs up the entire data set of the database;
Differential backup: backs up the data that has changed since the last backup;
Incremental backup: backs up the data that has changed since the last full or incremental backup.
Differential backup handles the data changes caused by the owner's frequent business processing; incremental backup suits the case where the owner's business volume is small and the resulting volume of incremental data is therefore small; full backup guards against loss of the owner's business data.
For example: database backup uses full backup plus incremental backup. After the system goes live, a full backup of the database (an RMAN backup) is taken every Monday, and incremental backups are taken from Tuesday to Sunday.
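The weekly schedule in this example can be written as a short sketch. The function names are hypothetical, and the restore chain assumes the usual rule that restoring a given day needs that week's full backup plus every incremental backup taken after it; the patent itself only gives the schedule.

```python
from datetime import date, timedelta
from typing import List


def backup_mode_for(day: date) -> str:
    """Full backup on Mondays, incremental backup on every other day."""
    return "full" if day.weekday() == 0 else "incremental"


def restore_chain(day: date) -> List[date]:
    """Backups needed to restore `day`: Monday's full backup, then each later incremental."""
    monday = day - timedelta(days=day.weekday())
    return [monday + timedelta(days=offset) for offset in range(day.weekday() + 1)]


if __name__ == "__main__":
    wednesday = date(2018, 5, 16)
    print(backup_mode_for(wednesday))
    for backup_day in restore_chain(wednesday):
        print(backup_day.isoformat(), backup_mode_for(backup_day))
```

The chain grows through the week, which is the trade-off of this schedule: daily backups stay small, while a restore late in the week has to replay more incremental backups.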
The above shows and describes the basic principles, main features and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments; the above embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made to the invention without departing from its spirit and scope, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection claimed is defined by the appended claims and their equivalents.

Claims (10)

1. A distributed storage system for collected data, characterized in that it comprises an application server, a database server and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server;
The application server is used to host the application programs;
The database server is used to store the database; the database server returns requested data to the application server and interacts with the big data cluster server for computation, supplying the big data cluster with data to compute on or storing the computation results of the big data cluster;
The big data cluster server is used for the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the data source to be analyzed or writing the results of data analysis and computation to the database server.
2. The distributed storage system for collected data according to claim 1, characterized in that the database server uses a database-and-table hashing structure, which distributes the data in the database across multiple databases and tables; and the database server mirrors critical data.
3. The distributed storage system for collected data according to claim 2, characterized in that the database server uses a read/write splitting structure in which reads and writes are separated with a one-master, multiple-slave or dual-master, multiple-slave strategy; and the database server uses different time periods as the basis for partitioning when handling reads and writes.
4. The distributed storage system for collected data according to claim 3, characterized in that the distributed storage system comprises at least two database servers forming a database cluster, and the database cluster presents a virtual single logical database image.
5. The distributed storage system for collected data according to claim 1, characterized in that it further comprises a database backup server, which is connected in parallel with the database server.
6. The distributed storage system for collected data according to claim 5, characterized in that the database backup server uses three-machine hot-standby redundancy; the database backup server comprises three backup servers that back each other up; when one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time; and when the database backup server currently providing service stops, the network service is not interrupted.
7. The distributed storage system for collected data according to claim 1, characterized in that the big data cluster server comprises a distributed computing cluster made up of a manager machine and at least two worker machines; and the manager machine is responsible for load balancing across the distributed computing cluster.
8. A distributed storage method for collected data, characterized in that it comprises the steps of:
S100: initiating a big data cluster computation through the application server;
S200: exchanging data between the big data cluster server and the database server, reading the data source to be analyzed, and processing the data in the cluster through a distributed file system;
S300: storing the processed cluster data in the database server; the database server returns the requested data to the application server.
9. The distributed storage method for collected data according to claim 8, characterized in that the distributed file system is HDFS.
10. The distributed storage method for collected data according to claim 8, characterized in that the data in the database server is backed up by a database backup server, and the backup modes include full backup, incremental backup and differential backup;
Full backup: backs up the entire data set of the database;
Differential backup: backs up the data that has changed since the last backup;
Incremental backup: backs up the data that has changed since the last full or incremental backup.
CN201810467501.9A 2018-05-11 2018-05-11 A kind of distributed memory system and method for gathered data Pending CN108664643A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810467501.9A CN108664643A (en) 2018-05-11 2018-05-11 A kind of distributed memory system and method for gathered data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810467501.9A CN108664643A (en) 2018-05-11 2018-05-11 A kind of distributed memory system and method for gathered data

Publications (1)

Publication Number Publication Date
CN108664643A true CN108664643A (en) 2018-10-16

Family

ID=63779630

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810467501.9A Pending CN108664643A (en) 2018-05-11 2018-05-11 A kind of distributed memory system and method for gathered data

Country Status (1)

Country Link
CN (1) CN108664643A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109658867A (en) * 2018-12-10 2019-04-19 北京欧徕德微电子技术有限公司 Data read-write method and its device
CN110489397A (en) * 2019-08-09 2019-11-22 上海富欣智能交通控制有限公司 Date storage method, apparatus and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104767813A (en) * 2015-04-08 2015-07-08 江苏国盾科技实业有限责任公司 Public bank big data service platform based on openstack
CN105577558A (en) * 2015-12-21 2016-05-11 浪潮集团有限公司 Solution to improving high concurrence of website server
CN105824744A (en) * 2016-03-21 2016-08-03 焦点科技股份有限公司 Real-time log collection and analysis method on basis of B2B (Business to Business) platform
US20170039236A1 (en) * 2015-08-06 2017-02-09 International Business Machines Corporation Vertical tuning of distributed analytics clusters

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
简玲: "基于Hadoop 的PB 级海量数据处理***的设计与实现", 《理论研究》 *

Similar Documents

Publication Publication Date Title
CN103116596B (en) System and method of performing snapshot isolation in distributed databases
CN108804112B (en) Block chain settlement processing method and system
KR102437664B1 (en) System and method for transaction recovery in a multitenant application server environment
CN104081353B (en) Balancing dynamic load in scalable environment
US9582520B1 (en) Transaction model for data stores using distributed file systems
CN104081354B (en) Subregion is managed in scalable environment
RU2208834C2 (en) Method and system for recovery of database integrity in system of bitslice databases without resource sharing using shared virtual discs and automated data medium for them
US8108634B1 (en) Replicating a thin logical unit
CN106021016A (en) Virtual point in time access between snapshots
US20100023564A1 (en) Synchronous replication for fault tolerance
US10127077B2 (en) Event distribution pattern for use with a distributed data grid
CN101814045A (en) Data organization method for backup services
US20040107381A1 (en) High performance transaction storage and retrieval system for commodity computing environments
CN103064728A (en) Fault-tolerant scheduling method of Map Reduce task
US20060190460A1 (en) Method and mechanism of handling reporting transactions in database systems
Douglis et al. Content-aware load balancing for distributed backup
WO2013138774A1 (en) Systems and methods for supporting transaction recovery based on a strict ordering of two-phase commit calls
CN110727709A (en) Cluster database system
CN108038201B (en) A kind of data integrated system and its distributed data integration system
CN106354548A (en) Virtual cluster creating and management method and device in distributed database system
CN110083490A (en) A kind of database backup method, restoring method and storage medium
US20220121527A1 (en) Dynamically updating database archive log dependency and backup copy recoverability
CN110263095A (en) Backup and recovery method, apparatus, computer equipment and storage medium
CN111930716A (en) Database capacity expansion method, device and system
EP3710936A1 (en) Methods and systems for power failure resistance for a distributed storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181016