CN111046044A

CN111046044A - High-reliability framework of distributed object storage system based on memory type database

Info

Publication number: CN111046044A
Application number: CN201911278933.6A
Authority: CN
Inventors: 庄鹏盛
Original assignee: Nanjing Fujitsu Nanda Software Technology Co Ltd
Current assignee: Nanjing Fujitsu Nanda Software Technology Co Ltd
Priority date: 2019-12-13
Filing date: 2019-12-13
Publication date: 2020-04-21

Abstract

The invention discloses a high-reliability framework of a distributed object storage system based on a memory type database. The index mapping is small relative to the object stored on disk, so its impact on read-write performance is small; when an index is lost, it can be recovered in the background within a short time, so reliability is high.

Description

High-reliability framework of distributed object storage system based on memory type database

Technical Field

The invention belongs to the technical field of distributed object storage systems, and particularly relates to a high-reliability framework of a distributed object storage system based on a memory type database.

Background

The read-write performance of the memory type database is high, but the risk of data loss exists. Once the index map is lost, the user will not have access to the object and the object data stored on disk becomes garbage data.

Traditional databases are characterized by data safety and persistence. A memory type database instead stores data directly in memory, which greatly increases read-write speed and therefore application performance. However, on a sudden power loss the data in memory is lost. In a distributed object storage system, the memory type database is often used as a level 1 cache, or as an accelerator for the level 1 cache, to improve the overall performance of the system. Once the in-memory data is lost, however, the disk data that has already been persisted immediately becomes garbage data because its index is gone.

Although replacing a traditional mechanical hard disk with a solid state disk can improve database performance, the read-write speed of a solid state disk is still lower than that of memory.

Exploiting the high read-write performance of memory while also ensuring the safety and persistence of data are two seemingly contradictory goals. A common practice is to sacrifice part of the memory read-write performance in order to ensure data safety and persistence, for example by flushing data to disk asynchronously in the background.

In a distributed object storage system, strong consistency of data is the most important requirement. If data is flushed to disk asynchronously, an exception can leave the data inconsistent. Strong consistency is therefore usually ensured by flushing to disk synchronously.

Disclosure of Invention

To address the above deficiencies of the prior art, the present invention provides a high-reliability architecture of a distributed object storage system based on a memory type database.

In order to achieve the technical purpose, the technical scheme adopted by the invention is as follows:

In the distributed object storage system, an object name defined by a user is used as the primary index in the storage system, the location at which the object is finally stored on disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database in a one-to-one mapping relationship.
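The one-to-one mapping described above can be pictured as a simple key-value structure. The following is a minimal sketch, not the patented implementation: the names `DiskLocation` and `InMemoryIndex`, and the `node` and `path` fields, are hypothetical, and a plain Python dict stands in for the memory type database.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DiskLocation:
    """Secondary index: where the object finally lives on disk (hypothetical fields)."""
    node: str  # storage node identifier (assumption)
    path: str  # file path on that node (assumption)


class InMemoryIndex:
    """Stand-in for the memory type database: object name -> disk location."""

    def __init__(self) -> None:
        self._map: Dict[str, DiskLocation] = {}

    def put(self, object_name: str, location: DiskLocation) -> None:
        # Each object name (primary index) maps to exactly one location.
        self._map[object_name] = location

    def get(self, object_name: str) -> Optional[DiskLocation]:
        return self._map.get(object_name)

    def remove(self, object_name: str) -> None:
        self._map.pop(object_name, None)


index = InMemoryIndex()
index.put("photos/cat.jpg", DiskLocation(node="node-1", path="/data/ab/cd/0001.obj"))
print(index.get("photos/cat.jpg"))
```

Because each entry holds only a name and a location, the index stays small compared with the objects it points to.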

In order to optimize the technical scheme, the specific measures adopted further comprise:

When the user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to fetch the object data.

When the object is uploaded, the index mapping of the primary index and the secondary index and the object data are written into a disk together, and then the mapping is written into the memory type database.

When the object is deleted, the index mapping of the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.

When a mapping in the database is lost, a periodically executed audit task reads the mapping from the disk and provides a list of lost mappings in csv format to the user, who decides whether to insert the mapping into the database.

When a mapping in the database is lost, the periodically executed audit task reads the mapping from the disk and provides a list of lost mappings in csv format to the user, and the user decides whether to insert the mapping into the database, specifically:

Step 1: start a timed audit task whose schedule is adjustable by the user;

Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;

Step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2, otherwise go to step 4;

Step 4: provide the audit result to the user as a csv file;

Step 5: the user decides which objects to recover;

Step 6: insert the index mappings of the recovered objects into the memory type database.
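Steps 2 to 6 above could be sketched roughly as follows. This is an illustration under stated assumptions rather than the patent's implementation: each object file is assumed to start with a one-line JSON header holding the object name (the on-disk copy of the mapping), a dict stands in for the memory type database, each storage node is modelled as a local directory, and the function names `scan_node`, `audit`, and `restore` are hypothetical. Scheduling (step 1) is left to the caller, and a pass that stays at or below the threshold simply returns so that the next scheduled run repeats step 2.

```python
import csv
import json
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Set, Tuple


def scan_node(node_dir: str, index: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 2 for one node: (object name, path) pairs on disk that the index lacks."""
    missing = []
    for fname in sorted(os.listdir(node_dir)):
        path = os.path.join(node_dir, fname)
        with open(path, "rb") as f:
            name = json.loads(f.readline())["name"]  # assumed header format
        if index.get(name) != path:
            missing.append((name, path))
    return missing


def audit(node_dirs: List[str], index: Dict[str, str],
          threshold: int, report_path: str) -> bool:
    """One audit pass (steps 2 to 4). Returns True if a csv report was written."""
    with ThreadPoolExecutor() as pool:  # nodes are scanned in parallel
        results = list(pool.map(lambda d: scan_node(d, index), node_dirs))
    cache = [row for node_rows in results for row in node_rows]
    if len(cache) <= threshold:  # step 3: threshold not exceeded, no report yet
        return False
    with open(report_path, "w", newline="") as f:  # step 4: csv for the user
        writer = csv.writer(f)
        writer.writerow(["object_name", "disk_path"])
        writer.writerows(cache)
    return True


def restore(report_path: str, chosen: Set[str], index: Dict[str, str]) -> None:
    """Steps 5 and 6: reinsert only the mappings the user selected."""
    with open(report_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["object_name"] in chosen:
                index[row["object_name"]] = row["disk_path"]


# Usage: two nodes with one object each, and an index that has lost everything.
root = tempfile.mkdtemp()
node_dirs = []
for n in range(2):
    d = os.path.join(root, f"node-{n}")
    os.mkdir(d)
    node_dirs.append(d)
    with open(os.path.join(d, "0001.obj"), "wb") as f:
        f.write(json.dumps({"name": f"obj-{n}"}).encode() + b"\n" + b"payload")

index: Dict[str, str] = {}
report = os.path.join(root, "lost.csv")
if audit(node_dirs, index, threshold=0, report_path=report):
    restore(report, chosen={"obj-0"}, index=index)
print(index)
```

Keeping the user in the loop between the report and the reinsertion mirrors the text: the audit only proposes mappings, and nothing is written back to the database without a decision.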

The invention has the following beneficial effects:

The index mapping is small relative to the object stored on disk, so its impact on read-write performance is small; when an index is lost, it can be recovered in the background within a short time, so reliability is high.

Drawings

FIG. 1 is an exemplary diagram of a memory-based database losing data;

FIG. 2 is a flow chart of recovering lost mappings in the database in the high-reliability architecture of a distributed object storage system based on a memory type database according to the present invention.

Detailed Description

Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.

Referring to FIG. 1, in the embodiment, when a user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to fetch the object data. The read-write performance of the memory type database is high, but there is a risk of data loss. Once the index mapping is lost, the user can no longer access the object, and the object data stored on disk becomes garbage data.
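The read path, and what happens when the mapping is lost, can be illustrated with a short sketch. It assumes a plain dict as the memory type database and a local file path as the secondary index; the function name `read_object` is hypothetical.

```python
import os
import tempfile
from typing import Dict


def read_object(index: Dict[str, str], object_name: str) -> bytes:
    """Look up the secondary index (disk path) by object name, then read from disk."""
    path = index.get(object_name)
    if path is None:
        # Mapping lost or never created: the data may still be on disk but is unreachable.
        raise KeyError(f"no index mapping for {object_name!r}")
    with open(path, "rb") as f:
        return f.read()


# Usage: one object on disk, one mapping in memory.
data_dir = tempfile.mkdtemp()
obj_path = os.path.join(data_dir, "0001.obj")
with open(obj_path, "wb") as f:
    f.write(b"hello")

index = {"greeting.txt": obj_path}
print(read_object(index, "greeting.txt"))

index.clear()  # simulate losing the in-memory mapping
try:
    read_object(index, "greeting.txt")
except KeyError:
    print("object unreachable; file still on disk:", os.path.exists(obj_path))
```

After the mapping is cleared, the file is still present but nothing points to it, which is the "garbage data" situation the text describes.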

In the embodiment, when the object is uploaded, the index mapping of the primary index and the secondary index and the object data are written into a disk together, and then the mapping is written into the memory type database.

In the embodiment, when the object is deleted, the index mapping of the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.
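The disk-first ordering of upload and delete described above might look like the following sketch. The on-disk layout is an assumption made only for illustration: the mapping is stored as a one-line JSON header in the same file as the object data, so both are written and removed together. The function names `upload` and `delete` and the use of a dict as the memory type database are likewise hypothetical.

```python
import json
import os
import tempfile
import uuid
from typing import Dict


def upload(index: Dict[str, str], data_dir: str, object_name: str, data: bytes) -> str:
    """Write mapping and data to disk first, then publish the mapping in memory."""
    path = os.path.join(data_dir, uuid.uuid4().hex + ".obj")
    header = json.dumps({"name": object_name, "path": path}).encode() + b"\n"
    with open(path, "wb") as f:
        f.write(header)       # on-disk copy of the index mapping (assumed layout)
        f.write(data)         # object data
        f.flush()
        os.fsync(f.fileno())  # synchronous flush before the upload is acknowledged
    index[object_name] = path  # only now write the mapping to the in-memory database
    return path


def delete(index: Dict[str, str], object_name: str) -> None:
    """Delete mapping and data from disk first, then drop the in-memory mapping."""
    path = index[object_name]
    os.remove(path)           # removes the on-disk mapping and the data together
    del index[object_name]


# Usage
data_dir = tempfile.mkdtemp()
index: Dict[str, str] = {}
stored_path = upload(index, data_dir, "report.pdf", b"%PDF-...")
print(index["report.pdf"] == stored_path)
delete(index, "report.pdf")
print("report.pdf" in index, os.path.exists(stored_path))
```

With this ordering, an object is only reachable through the index once its mapping and data are already on disk, so a lost in-memory entry can later be rebuilt from the disk copy.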

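Referring to FIG. 2, in the embodiment, when a mapping in the database is lost, a periodically executed audit task reads the mapping from the disk and provides a list of lost mappings in csv format to the user, and the user decides whether to insert the mapping into the database, specifically:

Step 1: start a timed audit task whose schedule is adjustable by the user;

Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;

Step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2, otherwise go to step 4;

Step 4: provide the audit result to the user as a csv file;

Step 5: the user decides which objects to recover;

Step 6: insert the index mappings of the recovered objects into the memory type database.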

The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment, and all technical solutions that follow the idea of the present invention fall within its protection scope. It should be noted that those skilled in the art may make improvements and refinements without departing from the principle of the invention, and these also fall within the protection scope of the invention.

Claims

1. A high-reliability architecture of a distributed object storage system based on a memory type database, characterized in that, in the distributed object storage system, an object name defined by a user is used as the primary index in the storage system, the location at which the object is finally stored on disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database in a one-to-one mapping relationship.

2. The high-reliability architecture of a distributed object storage system based on a memory type database as claimed in claim 1, wherein when a user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to fetch the object data.

3. The architecture of claim 1, wherein when an object is uploaded, the index mapping between the primary index and the secondary index is written to the disk together with the object data, and then the mapping is written to the memory type database.

4. The architecture of claim 1, wherein when an object is deleted, the index mapping between the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.

5. The architecture of claim 1, wherein when a mapping in the database is lost, a periodically executed audit task reads the mapping from the disk and provides a list of lost mappings in csv format to the user, who decides whether to insert the mapping into the database.

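6. The architecture of claim 5, wherein when a mapping in the database is lost, the periodically executed audit task reads the mapping from the disk and provides a list of lost mappings in csv format to the user, and the user decides whether to insert the mapping into the database, specifically:

Step 1: start a timed audit task whose schedule is adjustable by the user;

Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;

Step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2, otherwise go to step 4;

Step 4: provide the audit result to the user as a csv file;

Step 5: the user decides which objects to recover;

Step 6: insert the index mappings of the recovered objects into the memory type database.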