CN111046044A - High-reliability framework of distributed object storage system based on memory type database - Google Patents
- Publication number
- CN111046044A
- Authority
- CN
- China
- Prior art keywords
- mapping
- index
- database
- disk
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2264—Multidimensional index structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a high-reliability framework of a distributed object storage system based on a memory type database. The index mapping used by the invention is small relative to the object stored on the disk, so its influence on read-write performance is small; when the index is lost, it can be recovered in the background within a short time, so reliability is high.
Description
Technical Field
The invention belongs to the technical field of distributed object storage systems, and particularly relates to a high-reliability framework of a distributed object storage system based on a memory type database.
Background
The read-write performance of a memory type database is high, but there is a risk of data loss. Once the index mapping is lost, the user can no longer access the object, and the object data stored on disk becomes garbage data.
Compared with a traditional database, which offers data safety and persistence, a memory type database stores data directly in memory, which greatly improves read-write speed and therefore application performance. However, on a sudden power loss, the data in memory is lost. In a distributed object storage system, the memory type database is often used as a level 1 cache, or as an accelerator for the level 1 cache, to improve the overall performance of the system. Once the in-memory data is lost, however, the data already persisted on disk immediately becomes garbage data because its index is gone.
Although replacing a traditional mechanical hard disk with a solid state disk can improve database performance, the read-write speed of a solid state disk is still lower than that of memory.
Exploiting the high read-write performance of memory and ensuring the safety and persistence of data are two seemingly contradictory goals. A common practice is to sacrifice part of the memory read-write performance to ensure data safety and persistence, for example by flushing to disk asynchronously in the background.
In a distributed object storage system, strong consistency of data is the most important requirement. If data is persisted by asynchronous disk flushing, an exception can leave the data inconsistent. Strong consistency is therefore often ensured by synchronous disk flushing.
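The difference between the two flushing modes can be illustrated with a minimal Python sketch. The function names and the demo file are assumptions made for illustration; the patent does not specify how flushing is implemented.

```python
import os
import tempfile


def write_synchronously(path: str, payload: bytes) -> None:
    """Return only after the OS has been asked to push the data to the device."""
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()             # drain Python's userspace buffer into the OS cache
        os.fsync(f.fileno())  # ask the OS to write its cached pages to the device


def write_buffered(path: str, payload: bytes) -> None:
    """Return once the OS accepts the data; the device write happens later."""
    with open(path, "wb") as f:
        f.write(payload)      # a power loss before the OS flushes can lose this data


# Hypothetical demo file in a temporary directory.
demo_path = os.path.join(tempfile.mkdtemp(), "object.bin")
write_synchronously(demo_path, b"object data")
```

The synchronous variant costs latency on every write, which is the performance sacrifice described above.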
Disclosure of Invention
The present invention addresses the above deficiencies of the prior art by providing a high-reliability architecture of a distributed object storage system based on a memory type database.
To achieve this technical purpose, the invention adopts the following technical scheme:
In the distributed object storage system, an object name defined by a user is used as the primary index in the storage system, the position at which the object is finally stored on a disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database in a one-to-one mapping relationship.
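As a rough illustration of the one-to-one mapping, the following Python sketch keeps a forward and a reverse dictionary so that no two object names can share a disk position. The class names and the fields of `DiskLocation` are hypothetical; the patent does not define the layout of the secondary index.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DiskLocation:
    """Secondary index: where the object is finally stored (fields are assumed)."""
    node_id: str
    path: str


class IndexMap:
    """One-to-one mapping between object names and disk locations."""

    def __init__(self) -> None:
        self._by_name: Dict[str, DiskLocation] = {}
        self._by_location: Dict[DiskLocation, str] = {}

    def put(self, name: str, location: DiskLocation) -> None:
        owner = self._by_location.get(location)
        if owner is not None and owner != name:
            raise ValueError(f"{location} already belongs to {owner!r}")
        previous = self._by_name.get(name)
        if previous is not None:
            del self._by_location[previous]  # drop the stale reverse entry
        self._by_name[name] = location
        self._by_location[location] = name

    def lookup(self, name: str) -> Optional[DiskLocation]:
        return self._by_name.get(name)

    def remove(self, name: str) -> None:
        location = self._by_name.pop(name, None)
        if location is not None:
            del self._by_location[location]
```

Holding the reverse dictionary doubles the memory used per entry; it is included here only to make the one-to-one constraint explicit.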
To further optimize the technical scheme, the following specific measures are also adopted:
When the user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to retrieve the object data.
When the object is uploaded, the index mapping of the primary index and the secondary index and the object data are written into a disk together, and then the mapping is written into the memory type database.
When the object is deleted, the index mapping of the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.
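The ordering of the read, upload, and delete paths described above can be sketched as follows. This is a minimal single-node Python illustration under stated assumptions: a plain dictionary stands in for the memory type database, and the `.data`/`.idx` file pair and hash-based placement are hypothetical choices, not the patent's on-disk format.

```python
import hashlib
import os
import tempfile
from typing import Dict


class ObjectStore:
    def __init__(self, root: str) -> None:
        self.root = root
        self.memdb: Dict[str, str] = {}  # stand-in for the memory type database

    def _location(self, name: str) -> str:
        # Assumed placement rule: derive the disk position from a hash of the name.
        return os.path.join(self.root, hashlib.sha256(name.encode()).hexdigest())

    def upload(self, name: str, data: bytes) -> None:
        loc = self._location(name)
        with open(loc + ".data", "wb") as f:   # object data goes to disk first
            f.write(data)
        with open(loc + ".idx", "w", encoding="utf-8") as f:
            f.write(name)                      # the mapping is persisted with it
        self.memdb[name] = loc                 # memory is updated only afterwards

    def read(self, name: str) -> bytes:
        loc = self.memdb[name]                 # KeyError if the mapping was lost
        with open(loc + ".data", "rb") as f:
            return f.read()

    def delete(self, name: str) -> None:
        loc = self.memdb[name]
        os.remove(loc + ".idx")                # disk copy of the mapping first
        os.remove(loc + ".data")
        del self.memdb[name]                   # then the in-memory mapping


store = ObjectStore(tempfile.mkdtemp())
store.upload("report.pdf", b"object data")
print(store.read("report.pdf"))  # b'object data'
```

Because the disk always holds the mapping before memory does, clearing `memdb` in this sketch loses nothing that a later scan of the `.idx` files could not rebuild.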
When a mapping in the database is lost, a periodically executed audit task reads the mapping from the disk and provides a lost list in csv format to the user, who decides whether to insert the mapping into the database.
Specifically, a lost mapping is recovered as follows:
step 1: start the scheduled audit task, whose timing is user-adjustable;
step 2: each storage node scans the objects on its disk in parallel and adds the objects whose index mapping is missing from memory to the audit result cache;
step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, execute step 4;
step 4: provide the audit result to the user as a csv format file;
step 5: the user decides which objects to retrieve;
step 6: insert the index mappings of the retrieved objects into the memory type database.
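The parallel scan and threshold check (steps 2 and 3) might look like the following Python sketch. It assumes each node's on-disk mappings are already available as a dictionary of object name to disk position, and it uses threads to stand in for storage nodes; all function names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

Mapping = Tuple[str, str]  # (object name, disk position)


def scan_node(on_disk: Dict[str, str], memdb: Dict[str, str]) -> List[Mapping]:
    """Return one node's on-disk mappings that are absent from the memory database."""
    return [(name, loc) for name, loc in sorted(on_disk.items()) if name not in memdb]


def audit_round(nodes: Dict[str, Dict[str, str]], memdb: Dict[str, str]) -> List[Mapping]:
    """Step 2: scan every storage node in parallel and merge the missing mappings."""
    with ThreadPoolExecutor(max_workers=max(1, len(nodes))) as pool:
        per_node = pool.map(lambda disk: scan_node(disk, memdb), nodes.values())
        return [mapping for found in per_node for mapping in found]


def exceeds_threshold(cache: List[Mapping], threshold: int) -> bool:
    """Step 3: report to the user only when the audit result exceeds the threshold."""
    return len(cache) > threshold
```

A caller would repeat `audit_round` on the configured schedule and move on to the csv report once `exceeds_threshold` returns true.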
The invention has the following beneficial effects:
The index mapping used by the invention is small relative to the object stored on the disk, so its influence on read-write performance is small; when the index is lost, it can be recovered in the background within a short time, so reliability is high.
Drawings
FIG. 1 is an exemplary diagram of a memory-based database losing data;
FIG. 2 is a flow chart of retrieving lost mappings in the database in the high-reliability architecture of a distributed object storage system based on a memory type database according to the present invention.
Detailed Description
Embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
In the distributed object storage system, an object name defined by a user is used as the primary index in the storage system, the position at which the object is finally stored on a disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database in a one-to-one mapping relationship.
Referring to fig. 1, in the embodiment, when a user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to retrieve the object data. The read-write performance of the memory type database is high, but there is a risk of data loss. Once the index mapping is lost, the user can no longer access the object, and the object data stored on disk becomes garbage data.
In the embodiment, when the object is uploaded, the index mapping of the primary index and the secondary index and the object data are written into a disk together, and then the mapping is written into the memory type database.
In the embodiment, when the object is deleted, the index mapping of the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.
Referring to fig. 2, in the embodiment, when a mapping in the database is lost, a periodically executed audit task reads the mapping from the disk and provides a lost list in csv format to the user, who decides whether to insert the mapping into the database, specifically:
step 1: start the scheduled audit task, whose timing is user-adjustable;
step 2: each storage node scans the objects on its disk in parallel and adds the objects whose index mapping is missing from memory to the audit result cache;
step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, execute step 4;
step 4: provide the audit result to the user as a csv format file;
step 5: the user decides which objects to retrieve;
step 6: insert the index mappings of the retrieved objects into the memory type database.
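Steps 4 to 6 can be illustrated with a short Python sketch that writes the lost list as csv text and re-inserts only the mappings the user selects. The column names `object_name` and `disk_location` and both function names are assumptions; the patent only states that the list is in csv format.

```python
import csv
import io
from typing import Dict, Iterable, List, Set, Tuple


def lost_list_csv(missing: Iterable[Tuple[str, str]]) -> str:
    """Step 4: render the audit result as csv text for the user."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["object_name", "disk_location"])  # assumed column names
    writer.writerows(missing)
    return buf.getvalue()


def restore_selected(csv_text: str, chosen: Set[str], memdb: Dict[str, str]) -> List[str]:
    """Steps 5 and 6: insert only the mappings the user chose to retrieve."""
    restored = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["object_name"] in chosen:
            memdb[row["object_name"]] = row["disk_location"]
            restored.append(row["object_name"])
    return restored


report = lost_list_csv([("b.jpg", "/d1/b"), ("c.jpg", "/d2/c")])
memdb: Dict[str, str] = {}
restore_selected(report, {"c.jpg"}, memdb)  # the user keeps only c.jpg
```

Objects the user does not select stay out of the memory type database, so they remain unreachable through the primary index.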
The above is only a preferred embodiment of the present invention. The protection scope of the present invention is not limited to the above embodiment, and all technical solutions that follow the idea of the present invention fall within its protection scope. It should be noted that those skilled in the art may make modifications and refinements without departing from the principle of the invention, and these also fall within its protection scope.
Claims (6)
1. A high-reliability architecture of a distributed object storage system based on a memory type database, characterized in that, in the distributed object storage system, an object name defined by a user is used as the primary index in the storage system, the position at which the object is finally stored on a disk is used as the secondary index, and the primary index and the secondary index are stored in the memory type database in a one-to-one mapping relationship.
2. The high-reliability architecture of a distributed object storage system based on a memory type database as claimed in claim 1, wherein when a user reads an object, the storage system looks up the secondary index in the memory type database according to the object name, and then goes to the disk to retrieve the object data.
3. The architecture of claim 1, wherein when an object is uploaded, the index mapping between the primary index and the secondary index is written to a disk together with the object data, and then the mapping is written to the memory type database.
4. The architecture of claim 1, wherein when an object is deleted, the index mapping between the primary index and the secondary index is deleted from the disk together with the object data, and then the mapping is deleted from the memory type database.
5. The architecture of claim 1, wherein when a mapping in the database is lost, a periodically executed audit task reads the mapping from disk and provides a lost list in csv format to the user, who decides whether to insert the mapping into the database.
6. The architecture of claim 5, wherein when a mapping in the database is lost, the periodically executed audit task reads the mapping from the disk and provides a lost list in csv format to the user, and the user determines whether to insert the mapping into the database, specifically:
step 1: start the scheduled audit task, whose timing is user-adjustable;
step 2: each storage node scans the objects on its disk in parallel and adds the objects whose index mapping is missing from memory to the audit result cache;
step 3: judge whether the audit result exceeds a set threshold; if not, return to step 2; otherwise, execute step 4;
step 4: provide the audit result to the user as a csv format file;
step 5: the user decides which objects to retrieve;
step 6: insert the index mappings of the retrieved objects into the memory type database.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911278933.6A CN111046044A (en) | 2019-12-13 | 2019-12-13 | High-reliability framework of distributed object storage system based on memory type database |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911278933.6A CN111046044A (en) | 2019-12-13 | 2019-12-13 | High-reliability framework of distributed object storage system based on memory type database |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111046044A (en) | 2020-04-21 |
Family
ID=70235941
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911278933.6A (Pending; published as CN111046044A (en)) | 2019-12-13 | 2019-12-13 | High-reliability framework of distributed object storage system based on memory type database |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111046044A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2023197404A1 (en) * | 2022-04-14 | 2023-10-19 | 上海川源信息科技有限公司 | Object storage method and apparatus based on distributed database |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110287201A (en) * | 2019-07-02 | 2019-09-27 | 重庆紫光华山智安科技有限公司 | Data access method, device, equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN102364474B (en) | Metadata storage system for cluster file system and metadata management method | |
US20120030413A1 (en) | Memory management device, information processing device, and memory management method | |
EP2209074A1 (en) | Data storage processing method, data searching method and devices thereof | |
CN104156380A (en) | Distributed memory Hash indexing method and system | |
CN105786410A (en) | Method for increasing processing speed of data storage system and data storage system | |
CN103559027A (en) | Design method of separate-storage type key-value storage system | |
CN101707633B (en) | Message-oriented middleware persistent message storing method based on file system | |
CN103324699B (en) | A kind of rapid data de-duplication method adapting to large market demand | |
CN103226965B (en) | Based on the audio/video data access method of time bitmap | |
KR20140116617A (en) | Method for Page-level address mapping using flash memory and System thereof | |
CN101819509A (en) | Solid state disk read-write method | |
CN105117415A (en) | Optimized SSD data updating method | |
CN109582598B (en) | Preprocessing method for realizing efficient hash table searching based on external storage | |
CN113821171B (en) | Key value storage method based on hash table and LSM tree | |
CN103488709A (en) | Method and system for building indexes and method and system for retrieving indexes | |
CN104268088A (en) | Vehicle DVR (Digital Video Recorder) hard disk data storage method | |
CN105159616A (en) | Disk space management method and device | |
WO2018133762A1 (en) | File merging method and apparatus | |
CN104360825A (en) | Hybrid internal memory system and management method thereof | |
CN104156432A (en) | File access method | |
CN111046044A (en) | High-reliability framework of distributed object storage system based on memory type database | |
CN103455284A (en) | Method and device for reading and writing data | |
KR101226600B1 (en) | Memory System And Memory Mapping Method thereof | |
CN102323907A (en) | Method for embedded ARM (advanced RISC machines) processor to store and delete NANDFLASH data | |
CN104391802A (en) | Simplified pool metadata node refreshing consistency protection method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||