CN111046044A - High-reliability framework of distributed object storage system based on memory type database - Google Patents

Info

Publication number
CN111046044A
Authority
CN
China
Prior art keywords
mapping
index
database
disk
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911278933.6A
Other languages
Chinese (zh)
Inventor
庄鹏盛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Fujitsu Nanda Software Technology Co Ltd
Original Assignee
Nanjing Fujitsu Nanda Software Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2019-12-13
Filing date
2019-12-13
Publication date
2020-04-21
Application filed by Nanjing Fujitsu Nanda Software Technology Co Ltd
Priority to CN201911278933.6A
Publication of CN111046044A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G06F16/23 Updating
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2471 Distributed queries

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a high-reliability framework of a distributed object storage system based on a memory type database. The index mapping used by the invention is small relative to the object stored on disk, so its effect on read-write performance is small; when an index is lost, a background task can recover it in a short time, so reliability is high.

Description

High-reliability framework of distributed object storage system based on memory type database
Technical Field
The invention belongs to the technical field of distributed object storage systems, and particularly relates to a high-reliability framework of a distributed object storage system based on a memory type database.
Background
A memory type database has high read-write performance but carries a risk of data loss. Once the index mapping is lost, the user can no longer access the object, and the object data stored on disk becomes garbage data.
Compared with a traditional database, which offers data safety and persistence, a memory type database stores data directly in memory, so read and write speeds, and with them application performance, improve greatly. On a sudden power loss, however, the data in memory is lost. In a distributed object storage system, the memory type database is often used as a level 1 cache, or as an accelerator for the level 1 cache, to improve the overall performance of the system. Once the in-memory data is lost, the data already persisted to disk immediately becomes garbage data because its index is gone.
Although replacing a traditional mechanical hard disk with a solid state disk can improve database performance, the read-write speed of a solid state disk is still lower than that of memory.
Exploiting the high read-write performance of memory and guaranteeing the safety and persistence of data are two seemingly contradictory goals. A common practice is to sacrifice part of the memory read-write performance to ensure data safety and persistence, for example by flushing to disk asynchronously in the background.
In a distributed object storage system, strong data consistency is the most important requirement. If data is persisted by asynchronous disk flushing, any exception can leave the data inconsistent. Strong consistency is therefore usually ensured by synchronous disk flushing.
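The difference between the two flushing modes can be illustrated with a minimal Python sketch. It is not taken from the patent; the function names `write_synchronously` and `write_buffered` are hypothetical. The synchronous variant returns only after the operating system has been asked to push the data to the storage device, while the buffered variant leaves the timing of the flush to the operating system.

```python
import os


def write_synchronously(path: str, payload: bytes) -> None:
    """Write payload and return only after a flush to the device was requested."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # move data from Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device


def write_buffered(path: str, payload: bytes) -> None:
    """Write payload and return at once; the OS flushes it later on its own."""
    with open(path, "wb") as f:
        f.write(payload)
```

With the buffered variant, a power loss shortly after the call returns may lose the data, which is the inconsistency risk described above; the synchronous variant pays for its stronger guarantee with a slower write.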
Disclosure of Invention
The present invention provides a high-reliability architecture of a distributed object storage system based on a memory type database, to address the above deficiencies of the prior art.
In order to achieve the technical purpose, the technical scheme adopted by the invention is as follows:
in the distributed object storage system, the object name defined by the user is used as the primary index in the storage system, the location at which the object is finally stored on disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database as a one-to-one mapping.
To refine the technical scheme, the specific measures adopted further comprise:
when the user reads an object, the storage system looks up the secondary index in the memory type database by object name and then goes to the disk to fetch the object data.
When an object is uploaded, the index mapping between the primary index and the secondary index is written to disk together with the object data, and the mapping is then written to the memory type database.
When an object is deleted, the index mapping between the primary index and the secondary index is deleted from disk together with the object data, and the mapping is then deleted from the memory type database.
When a mapping in the database is lost, a periodically executed audit task reads the mapping from disk and provides the user with a list of lost mappings in csv format; the user decides whether to insert each mapping into the database.
Specifically, the audit and recovery of lost mappings proceed as follows:
Step 1: start a user-adjustable timed audit task;
Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;
Step 3: determine whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, proceed to step 4;
Step 4: provide the audit result to the user as a csv file;
Step 5: the user decides which objects to recover;
Step 6: insert the index mappings of the recovered objects into the memory type database.
The invention has the following beneficial effects:
the index mapping used by the invention is small relative to the object stored on disk, so its effect on read-write performance is small; when an index is lost, a background task can recover it in a short time, so reliability is high.
Drawings
FIG. 1 is an example diagram of a memory type database losing data;
FIG. 2 is a flow chart for recovering lost mappings in the database in the high-reliability architecture of a distributed object storage system based on a memory type database according to the present invention.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In the distributed object storage system, the object name defined by the user is used as the primary index in the storage system, the location at which the object is finally stored on disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database as a one-to-one mapping.
Referring to fig. 1, in the embodiment, when a user reads an object, the storage system looks up the secondary index in the memory type database by object name and then goes to the disk to fetch the object data. A memory type database has high read-write performance but carries a risk of data loss. Once the index mapping is lost, the user can no longer access the object, and the object data stored on disk becomes garbage data.
In the embodiment, when an object is uploaded, the index mapping between the primary index and the secondary index is written to disk together with the object data, and the mapping is then written to the memory type database.
In the embodiment, when an object is deleted, the index mapping between the primary index and the secondary index is deleted from disk together with the object data, and the mapping is then deleted from the memory type database.
Referring to fig. 2, in the embodiment, when a mapping in the database is lost, a periodically executed audit task reads the mapping from disk and provides the user with a list of lost mappings in csv format, and the user decides whether to insert each mapping into the database, specifically:
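The read path can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a plain dictionary stands in for the memory type database, the secondary index is assumed to be a file path, and the function name `read_object` is hypothetical.

```python
from typing import Dict, Optional


def read_object(index_db: Dict[str, str], name: str) -> Optional[bytes]:
    """Resolve the object name to its disk location, then read the data."""
    location = index_db.get(name)   # primary index -> secondary index
    if location is None:
        # Mapping lost: the bytes may still be on disk, but nothing points to them.
        return None
    with open(location, "rb") as f:
        return f.read()
```

If the dictionary is emptied, as would happen to an in-memory store after a power loss, `read_object` returns `None` even though the file is still on disk, which is the "garbage data" situation the description refers to.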
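The ordering of the upload and delete operations (disk first, memory type database second) might look like the following Python sketch. The on-disk layout is an assumption made for illustration: each object is one file whose first line is a JSON header carrying the object name, followed by the object data. The class name `ObjectStore`, the header format, and the use of a dictionary as the memory type database are all hypothetical.

```python
import json
import os
import uuid
from typing import Dict


class ObjectStore:
    def __init__(self, data_dir: str) -> None:
        self.data_dir = data_dir
        self.index_db: Dict[str, str] = {}  # stand-in for the memory type database

    def upload(self, name: str, data: bytes) -> str:
        """Persist mapping and data to disk first, then update the in-memory index."""
        location = os.path.join(self.data_dir, uuid.uuid4().hex)
        header = json.dumps({"name": name}).encode("utf-8") + b"\n"
        with open(location, "wb") as f:
            f.write(header + data)      # mapping and object data travel together
            f.flush()
            os.fsync(f.fileno())        # synchronous flush before touching memory
        self.index_db[name] = location  # only now publish the mapping in memory
        return location

    def delete(self, name: str) -> None:
        """Remove mapping and data from disk first, then from the in-memory index."""
        location = self.index_db[name]
        os.remove(location)
        del self.index_db[name]
```

Because the name travels with the data on disk, an in-memory index that is lost can later be rebuilt by scanning the files, which is what the audit task relies on.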
Step 1: start a user-adjustable timed audit task;
Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;
Step 3: determine whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, proceed to step 4;
Step 4: provide the audit result to the user as a csv file;
Step 5: the user decides which objects to recover;
Step 6: insert the index mappings of the recovered objects into the memory type database.
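Steps 2 to 6 can be sketched in Python as follows. The sketch makes several assumptions that are not in the patent: each storage node is modeled as a local directory, each object file begins with a one-line JSON header holding the object name, a dictionary stands in for the memory type database, and the csv column names `object_name` and `location` are invented. Scheduling (step 1) is left out; `audit_once` would be called from whatever timer the deployment uses.

```python
import csv
import json
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, Iterable, List, Set, Tuple


def scan_node(data_dir: str, index_db: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 2: list objects on one node whose mapping is missing from memory."""
    missing = []
    for entry in sorted(os.listdir(data_dir)):
        location = os.path.join(data_dir, entry)
        with open(location, "rb") as f:
            name = json.loads(f.readline())["name"]  # mapping stored with the data
        if index_db.get(name) != location:
            missing.append((name, location))
    return missing


def audit_once(data_dirs: Iterable[str], index_db: Dict[str, str],
               threshold: int, csv_path: str) -> bool:
    """Steps 2-4: scan nodes in parallel; write the csv if the threshold is exceeded."""
    with ThreadPoolExecutor() as pool:
        per_node = list(pool.map(lambda d: scan_node(d, index_db), data_dirs))
    missing = [item for node in per_node for item in node]
    if len(missing) <= threshold:
        return False                      # step 3: not exceeded, keep auditing
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["object_name", "location"])
        writer.writerows(missing)         # step 4: lost list for the user
    return True


def restore(csv_path: str, approved: Set[str], index_db: Dict[str, str]) -> int:
    """Steps 5-6: reinsert only the mappings the user chose to recover."""
    restored = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["object_name"] in approved:
                index_db[row["object_name"]] = row["location"]
                restored += 1
    return restored
```

The csv file should live outside the scanned directories so that the scan does not try to parse it as an object. Keeping the user in the loop before `restore` matches the text: the audit only proposes mappings, and the user decides which ones return to the database.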
The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment; all technical solutions that fall under the idea of the present invention belong to its protection scope. It should be noted that those skilled in the art may make modifications and refinements without departing from the principle of the invention, and these also fall within its protection scope.

Claims (6)

1. A high-reliability architecture of a distributed object storage system based on a memory type database, characterized in that, in the distributed object storage system, the object name defined by the user is used as the primary index in the storage system, the location at which the object is finally stored on disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database as a one-to-one mapping.
2. The high-reliability architecture of a distributed object storage system based on a memory type database as claimed in claim 1, wherein when a user reads an object, the storage system looks up the secondary index in the memory type database by object name and then goes to the disk to fetch the object data.
3. The architecture of claim 1, wherein when an object is uploaded, the index mapping between the primary index and the secondary index is written to disk together with the object data, and the mapping is then written to the memory type database.
4. The architecture of claim 1, wherein when an object is deleted, the index mapping between the primary index and the secondary index is deleted from disk together with the object data, and the mapping is then deleted from the memory type database.
5. The architecture of claim 1, wherein when a mapping in the database is lost, a periodically executed audit task reads the mapping from disk and provides the user with a list of lost mappings in csv format, and the user decides whether to insert each mapping into the database.
6. The architecture of claim 5, wherein when a mapping in the database is lost, the periodically executed audit task reads the mapping from disk and provides the user with a list of lost mappings in csv format, and the user decides whether to insert each mapping into the database, specifically:
Step 1: start a user-adjustable timed audit task;
Step 2: each storage node scans the objects on its disk in parallel and adds every object whose index mapping is missing from memory to the audit result cache;
Step 3: determine whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, proceed to step 4;
Step 4: provide the audit result to the user as a csv file;
Step 5: the user decides which objects to recover;
Step 6: insert the index mappings of the recovered objects into the memory type database.
CN201911278933.6A 2019-12-13 2019-12-13 High-reliability framework of distributed object storage system based on memory type database Pending CN111046044A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911278933.6A CN111046044A (en) 2019-12-13 2019-12-13 High-reliability framework of distributed object storage system based on memory type database

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911278933.6A CN111046044A (en) 2019-12-13 2019-12-13 High-reliability framework of distributed object storage system based on memory type database

Publications (1)

Publication Number Publication Date
CN111046044A true CN111046044A (en) 2020-04-21

Family

ID=70235941

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911278933.6A Pending CN111046044A (en) 2019-12-13 2019-12-13 High-reliability framework of distributed object storage system based on memory type database

Country Status (1)

Country Link
CN (1) CN111046044A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023197404A1 (en) * 2022-04-14 2023-10-19 上海川源信息科技有限公司 Object storage method and apparatus based on distributed database

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110287201A (en) * 2019-07-02 2019-09-27 重庆紫光华山智安科技有限公司 Data access method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN102364474B (en) Metadata storage system for cluster file system and metadata management method
US20120030413A1 (en) Memory management device, information processing device, and memory management method
EP2209074A1 (en) Data storage processing method, data searching method and devices thereof
CN104156380A (en) Distributed memory Hash indexing method and system
CN105786410A (en) Method for increasing processing speed of data storage system and data storage system
CN103559027A (en) Design method of separate-storage type key-value storage system
CN101707633B (en) Message-oriented middleware persistent message storing method based on file system
CN103324699B (en) A kind of rapid data de-duplication method adapting to large market demand
CN103226965B (en) Based on the audio/video data access method of time bitmap
KR20140116617A (en) Method for Page-level address mapping using flash memory and System thereof
CN101819509A (en) Solid state disk read-write method
CN105117415A (en) Optimized SSD data updating method
CN109582598B (en) Preprocessing method for realizing efficient hash table searching based on external storage
CN113821171B (en) Key value storage method based on hash table and LSM tree
CN103488709A (en) Method and system for building indexes and method and system for retrieving indexes
CN104268088A (en) Vehicle DVR (Digital Video Recorder) hard disk data storage method
CN105159616A (en) Disk space management method and device
WO2018133762A1 (en) File merging method and apparatus
CN104360825A (en) Hybrid internal memory system and management method thereof
CN104156432A (en) File access method
CN111046044A (en) High-reliability framework of distributed object storage system based on memory type database
CN103455284A (en) Method and device for reading and writing data
KR101226600B1 (en) Memory System And Memory Mapping Method thereof
CN102323907A (en) Method for embedded ARM (advanced RISC machines) processor to store and delete NANDFLASH data
CN104391802A (en) Simplified pool metadata node refreshing consistency protection method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination