CN108664643A - Distributed storage system and method for collected data

Info

Publication number: CN108664643A
Application number: CN201810467501.9A
Authority: CN
Inventors: 周士凯; 蔡茜; 王强; 惠兴海
Original assignee: SICHUAN HUADI INFORMATION TECHNOLOGY CO LTD; Chongqing Technology and Business Institute
Current assignee: SICHUAN HUADI INFORMATION TECHNOLOGY CO LTD; Chongqing Technology and Business Institute
Priority date: 2018-05-11
Filing date: 2018-05-11
Publication date: 2018-10-16

Abstract

The present invention discloses a distributed storage system and method for collected data. The system comprises an application server, a database server, and a big data cluster server; the application server is connected to the database server, and the big data cluster server is connected to the application server. The application server hosts the application program. The database server stores the database; it returns requested data to the application server, exchanges computation data with the big data cluster server, and either supplies the big data cluster with data to compute on or stores the cluster's computation results. The big data cluster server performs the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the analysis data source or writing the results of data analysis and computation to the database server. The invention enables effective processing, distribution, and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system.

Description

Distributed storage system and method for collected data

Technical field

The invention belongs to the technical field of data storage, and in particular relates to a distributed storage system and method for collected data.

Background art

With the development of society and technology, and the continuing growth of regions and populations, data volumes of all kinds are growing explosively, and the pressure on data storage is growing with them. In particular, systems that must store large amounts of data, such as a big data service platform for vocational ability analysis, handle huge data volumes and must process massive amounts of data concurrently, so they need a storage system that can cope with these conditions effectively.

However, most existing storage systems store data in a centralized manner and cannot cope with the processing, distribution, and storage of large volumes of concurrent data. Under such conditions, existing storage systems have low storage efficiency and poor performance, and cannot be scaled out.

Summary of the invention

To solve the above problems, the present invention proposes a distributed storage system and method for collected data that enables effective processing, distribution, and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system.

To achieve the above objective, the technical solution adopted by the present invention is a distributed storage system for collected data, comprising an application server, a database server, and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server.

The application server hosts the application program.

The database server stores the database. It returns requested data to the application server, exchanges computation data with the big data cluster server, and either supplies the big data cluster with data to compute on or stores the cluster's computation results.

The big data cluster server performs the distributed computation of the big data cluster. The big data cluster exchanges data with the database server, reading the analysis data source or writing the results of data analysis and computation to the database server.

Further, so that the database is fast and can handle a large amount of concurrent data, the database server uses a database-and-table hashing structure, in which the data in the database is stored in a distributed manner. The database server applies database mirroring to critical data, which improves the operating efficiency, security, and reliability of the system and ensures that the database has fault tolerance and redundancy against data corruption.
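The database-and-table hashing structure described above can be illustrated with a minimal sketch. The patent does not specify a hash function, the number of databases, or a naming scheme; `NUM_DATABASES`, `TABLES_PER_DATABASE`, `route`, and `table_name` are hypothetical names chosen for illustration only.

```python
import hashlib
from typing import Tuple

# Hypothetical layout (not given in the source): 4 databases of 8 tables each.
NUM_DATABASES = 4
TABLES_PER_DATABASE = 8


def route(record_key: str) -> Tuple[int, int]:
    """Map a record key to a (database index, table index) pair."""
    # A stable digest is used instead of the built-in hash(), which is
    # randomised per process for strings and would change between restarts.
    digest = hashlib.md5(record_key.encode("utf-8")).digest()
    slot = int.from_bytes(digest[:8], "big") % (NUM_DATABASES * TABLES_PER_DATABASE)
    return slot // TABLES_PER_DATABASE, slot % TABLES_PER_DATABASE


def table_name(record_key: str) -> str:
    """Return an illustrative physical table name for the key."""
    db, table = route(record_key)
    return f"db{db}.records_{table:02d}"
```

Because the mapping depends only on the key, every application instance sends the same record to the same database and table without consulting a central directory.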

Further, the database server uses a read-write separation structure: with a read-write strategy of one master and multiple slaves, or two masters and multiple slaves, reads and writes are handled separately, so that large amounts of concurrent data can be read and written at the same time. The database server partitions data by time period and reads and writes by partition, which improves read-write performance.
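A minimal sketch of the one-master, multiple-slave strategy follows. It assumes that statements are classified by their leading SQL keyword and that reads are spread over the slaves in round-robin order; neither detail is stated in the patent, and `ReadWriteRouter` is a hypothetical name.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Send writes to the master and spread reads across the slaves."""

    def __init__(self, master: str, slaves: List[str]) -> None:
        if not slaves:
            raise ValueError("at least one slave is required")
        self.master = master
        self._slaves = itertools.cycle(slaves)  # round-robin over the slaves

    def target_for(self, sql: str) -> str:
        """Return the name of the node that should execute the statement."""
        parts = sql.split(None, 1)
        verb = parts[0].upper() if parts else ""
        if verb == "SELECT":
            return next(self._slaves)
        return self.master  # INSERT, UPDATE, DELETE, DDL and anything else
```

A two-master variant would keep the same interface and alternate writes between the two masters.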

Further, the distributed storage system includes at least two of the database servers, which form a database cluster; the database cluster presents a virtual single-database logical image. This improves the scalability of the system and provides a transparent data service to user terminals.

Further, the system also includes a database backup server connected in parallel with the database server. It backs up the database and improves system reliability.

Further, the database backup server uses three-machine hot-standby redundancy. It comprises three backup servers that back each other up: when one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time. When the database backup server that is providing service stops, the network service is not interrupted. This makes the system safe, reliable, and efficient, with a system availability of up to 99.99%.
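The takeover behaviour of the three backup servers can be sketched as a heartbeat check. The patent does not say how a failure is detected; the heartbeat mechanism, the `HEARTBEAT_TIMEOUT` value, and the `StandbyGroup` class are assumptions made for illustration. Time is passed in explicitly so that the example is deterministic.

```python
from typing import Dict, List, Optional

HEARTBEAT_TIMEOUT = 3.0  # seconds; hypothetical value


class StandbyGroup:
    """Track heartbeats of mutually backing-up servers and pick the active one."""

    def __init__(self, nodes: List[str], now: float) -> None:
        self.last_seen: Dict[str, float] = {node: now for node in nodes}
        self.active: Optional[str] = nodes[0]

    def heartbeat(self, node: str, now: float) -> None:
        """Record that a node reported in at time `now`."""
        self.last_seen[node] = now

    def healthy(self, now: float) -> List[str]:
        """Nodes whose last heartbeat is within the timeout."""
        return [n for n, t in self.last_seen.items() if now - t <= HEARTBEAT_TIMEOUT]

    def check(self, now: float) -> Optional[str]:
        """Promote a healthy peer if the active node has gone silent."""
        alive = self.healthy(now)
        if self.active not in alive:
            self.active = alive[0] if alive else None
        return self.active
```

With three nodes, the service survives the loss of the active server as long as at least one peer is still sending heartbeats.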

Further, the big data cluster server comprises a distributed computing cluster made up of one manager machine and at least two worker machines. The manager machine is responsible for load balancing across the distributed computing cluster, which improves the working efficiency of the system.
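One possible load-balancing policy for the manager machine is to hand each task to the worker with the least accumulated load. The patent does not name a policy; the least-loaded rule, the `cost` estimate, and the `Manager` class below are assumptions for illustration.

```python
import heapq
from typing import Dict, List, Tuple


class Manager:
    """Assign each task to the worker with the smallest accumulated load."""

    def __init__(self, workers: List[str]) -> None:
        if len(workers) < 2:
            raise ValueError("the cluster needs at least two worker machines")
        # Min-heap of (accumulated cost, worker name).
        self._heap: List[Tuple[float, str]] = [(0.0, w) for w in workers]
        heapq.heapify(self._heap)
        self.assignments: Dict[str, List[str]] = {w: [] for w in workers}

    def submit(self, task: str, cost: float) -> str:
        """Assign a task with an estimated cost and return the chosen worker."""
        load, worker = heapq.heappop(self._heap)
        self.assignments[worker].append(task)
        heapq.heappush(self._heap, (load + cost, worker))
        return worker
```

Ties are broken by worker name, so the assignment order is reproducible.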

In another aspect, the present invention also provides a distributed storage method for collected data, comprising the steps of:

S100: initiating a big data cluster computation through the application server;

S200: exchanging data between the big data cluster server and the database server, reading the analysis data source, and processing the data cluster through a distributed file system;

S300: storing the processed data cluster in the database server, the database server returning the requested data to the application server.
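Steps S100 to S300 can be traced with in-memory stand-ins for the three servers. This is only a sketch of the data flow: the class names, the `run` and `compute` methods, and the use of a plain Python function in place of a distributed job are all hypothetical.

```python
from typing import Callable, Dict, List


class DatabaseServer:
    """Holds the analysis data source and the stored computation results."""

    def __init__(self, source_rows: List[float]) -> None:
        self.source = list(source_rows)
        self.results: Dict[str, float] = {}


class BigDataCluster:
    """Stand-in for the distributed computing cluster."""

    def compute(self, db: DatabaseServer, job: str,
                fn: Callable[[List[float]], float]) -> None:
        rows = db.source       # S200: read the analysis data source
        result = fn(rows)      # S200: process the data
        db.results[job] = result  # S300: store the processed result


class ApplicationServer:
    """Initiates computations and receives the requested data."""

    def __init__(self, db: DatabaseServer, cluster: BigDataCluster) -> None:
        self.db = db
        self.cluster = cluster

    def run(self, job: str, fn: Callable[[List[float]], float]) -> float:
        self.cluster.compute(self.db, job, fn)  # S100: initiate the computation
        return self.db.results[job]  # S300: database returns the requested data
```

In a real deployment the `fn` call would be a job over files in the distributed file system rather than a local function.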

Further, the distributed file system is HDFS.

HDFS is highly fault tolerant. The collected data falls into two broad types, unverified and verified, and some data may need to be verified several times; high fault tolerance ensures the persistent storage and computation of the collected data.

The volume of collected data is very large, and the verification, computation, and analysis tasks are numerous. HDFS provides high-throughput access to application data and suits applications with very large data sets.

Data collection and data analysis run at the same time, so streaming access to the data is required. HDFS relaxes some POSIX requirements, which allows the data in the file system to be accessed as a stream.

Further, the database backup server backs up the data in the database server. The backup modes include full backup, incremental backup, and differential backup:

Full backup: backs up the entire data set of the database.

Differential backup: backs up the data that has changed since the last backup.

Incremental backup: backs up the data that has changed since the last full backup or incremental backup.

Differential backup handles the data changes brought by the owner's frequent business processing; incremental backup suits periods when the owner's business volume is small and the resulting volume of incremental data is small; full backup guards against loss of the owner's business data.
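The difference between the three backup modes comes down to which cutoff time is used when selecting changed data. The sketch below assumes the conventional reading, in which a differential backup is measured from the last full backup and an incremental backup from the most recent backup of any kind; the patent's wording for differential backup ("since the last backup") is less specific. The function name and the integer timestamps are hypothetical.

```python
from typing import Dict, List


def select_for_backup(modified_at: Dict[str, int], mode: str,
                      last_full: int, last_any: int) -> List[str]:
    """Return the items to copy, given each item's last-modified timestamp.

    last_full: time of the last full backup.
    last_any:  time of the most recent backup of any kind.
    """
    if mode == "full":
        return sorted(modified_at)  # everything, changed or not
    if mode == "differential":
        cutoff = last_full  # assumption: measured from the last full backup
    elif mode == "incremental":
        cutoff = last_any   # changes since the last full or incremental backup
    else:
        raise ValueError(f"unknown backup mode: {mode}")
    return sorted(name for name, t in modified_at.items() if t > cutoff)
```

For the same data, an incremental backup is never larger than a differential one, which in turn is never larger than a full backup.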

Advantageous effects of this technical solution:

The present invention enables effective processing, distribution, and storage of large volumes of concurrent data, improves storage efficiency, and improves the performance and scalability of the storage system.

The present invention has transaction processing capability: new data continually enters the system while data is being stored, the data of each business system can be computed on, and detailed historical data can be queried.

The storage of the present invention can be expanded in parallel: in the overall storage scheme, to keep hardware cost as low as possible, the architecture supports phased investment on demand.

The present invention supports clustering and load balancing, so that on the existing hardware the system can still provide high-speed service when accessed by a large number of users.

Description of the drawings

Fig. 1 is a flow diagram of the distributed storage method for collected data of the present invention;

Fig. 2 is a structural diagram of the distributed storage system for collected data of the present invention.

Detailed description

To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is further described below with reference to the accompanying drawings.

In this embodiment, as shown in Fig. 1, the present invention proposes a distributed storage system for collected data, comprising an application server, a database server, and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server.

The application server hosts the application program.

As a refinement of the above embodiment, so that the database is fast and can handle a large amount of concurrent data, the database server uses a database-and-table hashing structure, in which the data in the database is stored in a distributed manner. The database server applies database mirroring to critical data, which improves the operating efficiency, security, and reliability of the system and ensures that the database has fault tolerance and redundancy against data corruption.

The database server uses a read-write separation structure: with a read-write strategy of one master and multiple slaves, or two masters and multiple slaves, reads and writes are handled separately, so that large amounts of concurrent data can be read and written at the same time. The database server partitions data by time period and reads and writes by partition, which improves read-write performance.

The distributed storage system includes at least two of the database servers, which form a database cluster; the database cluster presents a virtual single-database logical image. This improves the scalability of the system and provides a transparent data service to user terminals.

As a refinement of the above embodiment, the system also includes a database backup server connected in parallel with the database server. It backs up the database and improves system reliability.

The database backup server uses three-machine hot-standby redundancy. It comprises three backup servers that back each other up: when one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time. When the database backup server that is providing service stops, the network service is not interrupted. This makes the system safe, reliable, and efficient, with a system availability of up to 99.99%.
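To put the 99.99% availability figure in concrete terms, it can be converted into a downtime budget. The short calculation below is plain arithmetic and is not part of the patent; the function name and the 365-day year are illustrative assumptions.

```python
def allowed_downtime_minutes(availability: float, days: float = 365.0) -> float:
    """Downtime budget, in minutes, for a given availability over `days` days."""
    return (1.0 - availability) * days * 24 * 60


# 99.99% availability over a 365-day year.
yearly_budget = allowed_downtime_minutes(0.9999)
```

For 99.99% this works out to roughly 52.6 minutes of downtime per year, which is the window within which any takeover and restart must fit.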

As a refinement of the above embodiment, the big data cluster server comprises a distributed computing cluster made up of one manager machine and at least two worker machines. The manager machine is responsible for load balancing across the distributed computing cluster, which improves the working efficiency of the system.

To accompany the system of the present invention, and based on the same inventive concept, as shown in Fig. 2, the present invention also provides a distributed storage method for collected data, comprising the steps of:

S100: initiating a big data cluster computation through the application server;

As a refinement of the above embodiment, the distributed file system is HDFS.

As a refinement of the above embodiment, the database backup server backs up the data in the database server. The backup modes include full backup, incremental backup, and differential backup:

Full backup: backs up the entire data set of the database.

Differential backup: backs up the data that has changed since the last backup.

For example, the database backup uses full backup plus incremental backup: after the system goes live, a full backup of the database (an RMAN backup) is taken every Monday, and incremental backups are taken from Tuesday to Sunday.
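The weekly schedule in this example can be written as a small lookup, together with the set of backups needed to restore a given day. The function names are hypothetical, and the restore chain assumes that each incremental backup captures the changes since the previous day's backup.

```python
import datetime
from typing import List


def backup_mode_for(day: datetime.date) -> str:
    """Full backup on Mondays, incremental backup on every other day."""
    # date.weekday() returns 0 for Monday and 6 for Sunday.
    return "full" if day.weekday() == 0 else "incremental"


def restore_chain(day: datetime.date) -> List[datetime.date]:
    """Backups needed to restore `day`: that week's Monday full backup
    followed by each daily incremental up to and including `day`."""
    monday = day - datetime.timedelta(days=day.weekday())
    return [monday + datetime.timedelta(days=i) for i in range(day.weekday() + 1)]
```

Under this schedule, a restore on Sunday needs the longest chain: one full backup and six incremental backups.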

The above shows and describes the basic principles, main features, and advantages of the present invention. Those skilled in the art should understand that the present invention is not limited to the above embodiments; the embodiments and the description only illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the claimed scope of protection. The claimed scope of the invention is defined by the appended claims and their equivalents.

Claims

1. A distributed storage system for collected data, characterized by comprising an application server, a database server, and a big data cluster server, wherein the application server is connected to the database server and the big data cluster server is connected to the application server;

the application server is configured to host the application program;

the database server is configured to store the database; the database server returns requested data to the application server, exchanges computation data with the big data cluster server, and supplies the big data cluster with data to compute on or stores the computation results of the big data cluster;

the big data cluster server is configured to perform the distributed computation of the big data cluster; the big data cluster exchanges data with the database server, reading the analysis data source or writing the results of data analysis and computation to the database server.

2. The distributed storage system for collected data according to claim 1, characterized in that the database server uses a database-and-table hashing structure, in which the data in the database is stored in a distributed manner; and the database server applies database mirroring to critical data.

3. The distributed storage system for collected data according to claim 2, characterized in that the database server uses a read-write separation structure in which, with a read-write strategy of one master and multiple slaves or two masters and multiple slaves, reads and writes are handled separately; and the database server partitions data by time period and reads and writes by partition.

4. The distributed storage system for collected data according to claim 3, characterized in that the distributed storage system includes at least two of the database servers, which form a database cluster, the database cluster presenting a virtual single-database logical image.

5. The distributed storage system for collected data according to claim 1, characterized by further comprising a database backup server connected in parallel with the database server.

6. The distributed storage system for collected data according to claim 5, characterized in that the database backup server uses three-machine hot-standby redundancy; the database backup server comprises three backup servers that back each other up; when one of the three backup servers fails, another server detects the failure immediately and takes over, so that the distributed storage system can restart the failed service in the shortest possible time; and when the database backup server that is providing service stops, the network service is not interrupted.

7. The distributed storage system for collected data according to claim 1, characterized in that the big data cluster server comprises a distributed computing cluster made up of one manager machine and at least two worker machines, the manager machine being responsible for load balancing across the distributed computing cluster.

8. A distributed storage method for collected data, characterized by comprising the steps of:

S100: initiating a big data cluster computation through the application server;

S200: exchanging data between the big data cluster server and the database server, reading the analysis data source, and processing the data cluster through a distributed file system;

S300: storing the processed data cluster in the database server, the database server returning the requested data to the application server.

9. The distributed storage method for collected data according to claim 8, characterized in that the distributed file system is HDFS.

10. The distributed storage method for collected data according to claim 8, characterized in that the database backup server backs up the data in the database server, the backup modes including full backup, incremental backup, and differential backup;

full backup: backing up the entire data set of the database;

differential backup: backing up the data that has changed since the last backup;