CN106599096B - High-performance file system design method based on nonvolatile memory - Google Patents

Info

Publication number
CN106599096B
Authority
CN
China
Prior art keywords
data
file system
memory
cache line
persistence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611058790.4A
Other languages
Chinese (zh)
Other versions
CN106599096A (en)
Inventor
陈海波
董明凯
余倩倩
臧斌宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University
Priority to CN201611058790.4A
Publication of CN106599096A
Application granted
Publication of CN106599096B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/11: File system administration, e.g. details of archiving or snapshots

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a high-performance file system design method based on non-volatile memory, comprising the following steps: storing file system metadata in self-verifiable data structures; processing file system metadata with non-destructive updates and deferred reclamation and reuse of data structures, so that no cache line flush instruction or memory barrier appears on the critical path and only the persistence dependencies of key operations are recorded; having a background thread guarantee the persistence of key information and reclaim deleted data structures; and checking and repairing the file system with a data verification and recovery algorithm. While guaranteeing file system consistency, the invention reduces the use of cache line flush instructions and memory barriers, thereby lowering file system latency, increasing access throughput, and improving file system performance.

Description

High-performance file system design method based on nonvolatile memory
Technical Field
The invention relates to a file system design method, in particular to a high-performance file system design method based on a nonvolatile memory.
Background
The file system is a very important part of the operating system. Program loading and data access both depend on it, so file system performance is particularly important. Conventional file systems are designed around disks: their design considers how to arrange disk reads and writes to reduce seek time and how to make full use of the in-memory page cache (PageCache). With the development of Flash and SSD technologies, many file systems have also come to take the characteristics of SSDs into account.
With the development of non-volatile memory technologies such as STT-MRAM, Phase Change Memory, 3D XPoint, and NVDIMM, non-volatile memory is becoming increasingly mature; NVDIMM technology is already in commercial use. Because non-volatile memory retains its data after power failure, it can replace a traditional magnetic disk or SSD as the main data storage medium. Unlike disks or SSDs, non-volatile memory sits directly on the memory bus and can be addressed and accessed by the processor at byte granularity. Its power consumption is low and its access speed is close to that of DRAM; non-volatile memory built with certain technologies is even faster than DRAM.
The advent of non-volatile memory technology has had a significant impact on file system design. In a disk-based file system, data is cached in cache pages, and a disk write can be regarded as an atomic write of one block (Block). In non-volatile memory, the granularity of data persistence is usually one cache line (Cache Line). In a disk file system, the flushing, eviction, and fetching of cache pages can all be controlled by software. In a non-volatile memory file system, because of the processor cache (CPU Cache), software does not fully determine when data is written to non-volatile memory, so it is difficult for the file system to control the order and time in which data is persisted. Out-of-order persistence can leave data and metadata in the file system inconsistent, which leads to serious problems such as data corruption and data loss.
To ensure the consistency of data in the file system, existing non-volatile memory file systems use a cache line flush instruction (Cache Line Flush) together with a memory barrier instruction (Memory Barrier) to force a given line of the processor cache to be written back to non-volatile memory, which guarantees that the data in that cache line is persisted. However, frequent use of cache line flush and memory barrier instructions greatly degrades program execution performance. In a disk file system, disk I/O is the performance bottleneck of the whole process. In a non-volatile memory file system, by contrast, the storage medium (the non-volatile memory) is very fast, persistent storage is no longer the bottleneck, and instructions such as cache line flushes and memory barriers therefore weigh heavily on the performance of the entire file system.
Therefore, how to design a file system around the characteristics of non-volatile memory, reducing the use of cache line flush instructions and memory barriers while still guaranteeing the consistency and persistence of the file system, so as to lower file system latency and increase access throughput, has become one of the technical problems to be solved in the art.
Disclosure of Invention
In view of the defects in the prior art, the invention aims to provide a high-performance file system design method based on non-volatile memory that makes full use of the advantages and characteristics of non-volatile memory and reduces the cache line flush instructions and memory barriers issued while system calls are processed, thereby lowering file system latency and increasing access throughput.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
A high-performance file system design method based on non-volatile memory, characterized by comprising the following:
self-verifiable structures: the metadata in the file system is self-verifiable, that is, whether a piece of metadata is complete can be judged by reading that metadata itself;
system call processing: metadata in the file system is processed using non-destructive updates and deferred reclamation and reuse of data structures; no cache line flush instruction or memory barrier appears on the critical path, and only the persistence dependencies of key operations are recorded;
background process: a background process is responsible for guaranteeing the persistence of key information and for reclaiming deleted data structures;
data verification and recovery: the file system is checked and repaired using a data verification and recovery algorithm.
Preferably, in the system call processing, all file-system-related system calls are divided into three categories, which are processed as follows:
system calls involving one inode: the modification is performed directly;
system calls involving two inodes: for a delete operation, the target directory entry is only marked as invalid, no actual deletion is performed, and its resources are reclaimed later; for an add operation, the directory entry is created and added without damaging the existing data or its structure, but it is not persisted immediately;
system calls involving three or more inodes: enough binding information is recorded in the system before and during the operation to bind together all inodes taking part in it; no cache line flush instruction or memory barrier is needed, and instead the cache line flush granularity is exploited to ensure that, whatever subset of the data has been flushed, the persisted data suffices to either completely undo or complete the operation.
Preferably, in the system call processing, every operation is recorded in the corresponding inode so that the background process can persist it.
Preferably, in the system call processing, no operation affects data that has already been persisted, and no other data is affected whether or not the operation is completely persisted.
Preferably, in the system call processing, all newly added data is self-verifiable, so whether it has been completely persisted can be judged by checking the data itself.
Preferably, the system call processing contains no cache line flush instructions or memory barriers, and the consistency of the file system is guaranteed through data recovery.
Preferably, the background process drives the persistence progress of the whole file system: it reads the series of operations recorded in each inode, issues cache line flush instructions in order for the memory addresses those operations involve, and finally issues a memory barrier to ensure that the data is persisted.
Preferably, after a delete operation has been persisted, the background process returns the deleted resources to the allocator to prevent memory leaks.
Preferably, in the data verification and recovery, after an unexpected restart is detected, a dedicated process checks the data and repairs inconsistent data; the check starts from the file system root and traverses all directories and files in turn, and operations that were not completely persisted are cancelled or completed.
Preferably, in the data verification and recovery, no dedicated log or similar information is needed; whether an operation should be cancelled or completed is judged solely from the integrity of the data itself.
Compared with the prior art, the invention has the following beneficial effects:
The high-performance file system design method based on non-volatile memory adopts self-verifiable structures and non-destructive updates, which reduces the use of cache line flush instructions and memory barriers while guaranteeing file system consistency, thereby lowering file system latency, increasing access throughput, and improving file system performance.
Drawings
Other features, objects and advantages of the invention will become more apparent upon reading of the detailed description of non-limiting embodiments with reference to the following drawings:
FIG. 1 is an illustration of a self-verifiable structure;
FIG. 2 is an example of the processing of a system call involving only one inode in the present invention;
FIG. 3 is an example of the processing of a system call involving two inodes in the present invention;
FIG. 4 is an example of processing for a system call involving multiple inodes in accordance with the present invention;
FIG. 5 is a first diagram illustrating the background process and the operation list saved in an inode according to the present invention;
FIG. 6 is a second diagram illustrating the background process and the operation list saved in an inode according to the present invention.
Detailed Description
The present invention will be described in detail with reference to specific examples. The following examples will help those skilled in the art to further understand the invention, but do not limit the invention in any way. It should be noted that those skilled in the art can make various changes and modifications without departing from the spirit of the invention, and all of these fall within the scope of the present invention.
The high-performance file system design method based on non-volatile memory comprises the following four aspects, implemented as follows:
First, self-verifiable structures: the metadata in the file system is self-verifiable, that is, whether a piece of metadata is complete can be judged by reading that metadata itself. For example, as shown in FIG. 1, a checksum (Checksum) is saved alongside the data, so that if any one of the five blocks of data in the figure has not been completely persisted, this can be detected by verifying the checksum. The persistence of the five blocks thus behaves as an atomic operation, and the order in which they are persisted does not matter.
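The checksum idea can be illustrated with a minimal Python sketch. The record layout (five 8-byte fields followed by a 4-byte CRC32) and the names `pack_record` and `is_complete` are assumptions made for illustration; the patent does not specify a layout or a checksum algorithm.

```python
import struct
import zlib

# Hypothetical layout: five 8-byte fields followed by a 4-byte CRC32.
FIELDS = struct.Struct("<5Q")
CHECKSUM = struct.Struct("<I")


def pack_record(fields):
    """Serialize five integer fields and append a CRC32 computed over them."""
    body = FIELDS.pack(*fields)
    return body + CHECKSUM.pack(zlib.crc32(body))


def is_complete(record):
    """Return True if the stored checksum matches the stored fields."""
    if len(record) != FIELDS.size + CHECKSUM.size:
        return False
    body, tail = record[:FIELDS.size], record[FIELDS.size:]
    return CHECKSUM.unpack(tail)[0] == zlib.crc32(body)


record = pack_record([1, 2, 3, 4, 5])

# Simulate a crash in which the third field never reached the medium.
torn = bytearray(record)
torn[16:24] = bytes(8)
torn = bytes(torn)
```

Here `is_complete(record)` holds for the fully written record and fails for `torn`, regardless of which field is missing, which is why the write order of the fields does not need to be enforced.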
Second, system call processing: all file-system-related system calls are divided into three classes and processed separately. For a system call involving only one inode, such as chmod or chown, modifying the non-volatile memory cannot cause a file system consistency problem, so the modification is performed directly and the operation is recorded to await persistence in the background, as with the operation that changes the group (chgrp) in FIG. 2. For a system call involving two inodes, such as mkdir, rmdir, link, unlink, create, or symlink, a delete operation only marks the target directory entry (Dentry) as invalid; no actual deletion is performed and the entry's resources are reclaimed later, so the cache line flush instructions and memory barriers in this path can be removed, as with the delete (unlink) operation in FIG. 3. For an add operation, the directory entry is created and added in a way that guarantees the existing data and its structure are not destroyed, but it is not persisted immediately, so the cache line flush instructions and memory barriers in this path can also be omitted, as with the create operation in FIG. 3. In either case, the operation is recorded at the end to await persistence by the background process. For a system call involving three or more inodes, such as rename, enough binding information is recorded in the system before and during the operation to bind together all inodes taking part in it; no cache line flush instruction or memory barrier is used, and instead the cache line flush granularity is exploited to guarantee that, whatever subset of the data has been flushed, the persisted data suffices to either completely undo or complete the operation, as with the rename operation in FIG. 4.
These operations are likewise recorded so that the background process can guarantee persistence progress. None of the above operations affects data that has already been persisted, so no other data is affected whether or not an operation is completely persisted. In addition, all newly added data uses the self-verifiable structure, that is, whether it has been completely persisted can be judged by checking the data itself; if an operation's new data was not fully persisted, this is detected during data verification and the operation is cancelled. None of the above operations uses a cache line flush instruction or a memory barrier, and the consistency of the file system is guaranteed through data recovery.
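The one-inode and two-inode cases can be sketched as a small in-memory simulation in Python. The `Inode` and `Dentry` classes, the `pending` list, and the function names are hypothetical stand-ins chosen for illustration, not structures taken from the patent; the three-inode (rename) case is left out because the text does not describe its binding information in enough detail.

```python
from dataclasses import dataclass, field


@dataclass
class Dentry:
    name: str
    ino: int
    valid: bool = True


@dataclass
class Inode:
    ino: int
    mode: int = 0o644
    dentries: list = field(default_factory=list)  # used by directories
    pending: list = field(default_factory=list)   # operations awaiting persistence


def chmod(inode, mode):
    """One-inode call: modify in place, then record the operation."""
    inode.mode = mode
    inode.pending.append(("chmod", inode.ino))


def create(directory, name, ino):
    """Two-inode add: append a new entry without touching existing ones."""
    directory.dentries.append(Dentry(name, ino))
    directory.pending.append(("create", name))
    return Inode(ino)


def unlink(directory, name):
    """Two-inode delete: mark the entry invalid; reclamation is deferred."""
    for entry in directory.dentries:
        if entry.valid and entry.name == name:
            entry.valid = False
            directory.pending.append(("unlink", name))
            return True
    return False


def lookup(directory, name):
    """Return the inode number for name, ignoring invalidated entries."""
    for entry in directory.dentries:
        if entry.valid and entry.name == name:
            return entry.ino
    return None


root = Inode(ino=1, mode=0o755)
create(root, "a.txt", 2)
unlink(root, "a.txt")
```

After the `unlink`, the entry is still physically present in `root.dentries` but is invisible to `lookup`, and both operations sit in `root.pending` for a background worker to persist and reclaim later. No flush or barrier appears anywhere in these functions, which is the point of the design.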
Third, the background process: the background process drives the persistence progress of the whole file system. It reads the series of operations recorded in each inode, issues cache line flush instructions in order for the memory addresses those operations involve, and finally issues a memory barrier to ensure that the data is persisted. In addition, after a delete operation has been persisted, the background process returns the deleted resources to the allocator to prevent memory leaks. As shown in FIG. 5 and FIG. 6, the background process persists the operations in each inode and reclaims their resources in sequence.
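A minimal Python sketch of one pass of such a background worker follows. `FakeNVM` merely logs the calls that real hardware instructions would replace, and the dictionary layout of inodes and operations (`pending`, `addresses`, `freed`) is an assumption made for illustration.

```python
class FakeNVM:
    """Stand-in for hardware: logs flushes and fences instead of issuing them."""

    def __init__(self):
        self.log = []

    def flush(self, address):
        self.log.append(("flush", address))  # stands in for a cache line flush

    def fence(self):
        self.log.append(("fence",))          # stands in for a memory barrier


def persist_pending(inodes, nvm, free_list):
    """Flush every recorded operation in order, fence once, then reclaim."""
    reclaimable = []
    for inode in inodes:
        for op in inode["pending"]:
            for address in op["addresses"]:
                nvm.flush(address)
            if op["kind"] == "delete":
                reclaimable.extend(op["freed"])
    nvm.fence()
    # Only after the fence are the operations durable: clear the lists and
    # hand deleted resources back to the allocator.
    for inode in inodes:
        inode["pending"] = []
    free_list.extend(reclaimable)


nvm = FakeNVM()
free_list = []
inodes = [
    {"pending": [{"kind": "create", "addresses": [0x1000, 0x1040], "freed": []}]},
    {"pending": [{"kind": "delete", "addresses": [0x2000], "freed": [7]}]},
]
persist_pending(inodes, nvm, free_list)
```

The single fence at the end amortizes the ordering cost over every operation in the batch, and resources are returned to the allocator only once the deletion that released them is durable, so a crash cannot expose a reused block through a still-valid entry.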
Fourth, data verification and recovery: after an unexpected restart is detected, a dedicated process checks the data and repairs inconsistent data. The check starts from the file system root, traverses all directories and files in turn, and verifies the integrity of the data to decide whether an ordinary operation needs to be cancelled. For a rename operation, even if it was only half executed, it can be carried through to completion provided that enough information was persisted.
In summary, the high-performance file system design method based on non-volatile memory provided by the invention adopts self-verifiable structures and non-destructive updates, which reduces the use of cache line flush instructions and memory barriers while guaranteeing file system consistency, thereby lowering file system latency, increasing access throughput, and improving file system performance.
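The cancel path of this traversal can be sketched in Python as follows. The dictionary-based tree, the use of a CRC32 over the entry name as the integrity check, and the function names are all assumptions made for illustration; completing a half-finished rename is not modelled here.

```python
import zlib


def make_entry(name, children=None):
    """Build a directory entry whose checksum covers its name."""
    return {"name": name, "crc": zlib.crc32(name.encode()), "children": children or []}


def is_intact(entry):
    """An entry is intact if its stored checksum matches its content."""
    return entry["crc"] == zlib.crc32(entry["name"].encode())


def recover(entry, path=""):
    """Walk from the root, dropping children that fail verification."""
    here = (path + "/" + entry["name"]) if entry["name"] else ""
    kept, cancelled = [], []
    for child in entry["children"]:
        if is_intact(child):
            kept.append(child)
            cancelled.extend(recover(child, here))
        else:
            cancelled.append(here + "/" + child["name"])
    entry["children"] = kept
    return cancelled


root = make_entry("", [
    make_entry("docs", [make_entry("a.txt"), make_entry("b.txt")]),
    make_entry("tmp"),
])
# Simulate an entry whose creation was only partially persisted.
root["children"][0]["children"][1]["crc"] ^= 1
undone = recover(root)
```

After the walk, `undone` lists `/docs/b.txt` as the cancelled operation and the rest of the tree is untouched. No log is consulted: the decision rests entirely on whether each entry verifies against its own checksum.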
The foregoing description of specific embodiments of the present invention has been presented. It is to be understood that the present invention is not limited to the specific embodiments described above, and that various changes or modifications may be made by one skilled in the art within the scope of the appended claims without departing from the spirit of the invention. The embodiments and features of the embodiments of the present application may be combined with each other arbitrarily without conflict.

Claims (5)

1. A high-performance file system design method based on non-volatile memory, characterized by comprising the following:
self-verifiable structures: the metadata in the file system is self-verifiable;
system call processing: metadata in the file system is processed using non-destructive updates and deferred reclamation and reuse of data structures; no cache line flush instruction or memory barrier appears on the critical path, and only the persistence dependencies of key operations are recorded;
background process: a background process is responsible for guaranteeing the persistence of key information and for reclaiming deleted data structures;
data verification and recovery: the file system is checked and repaired using a data verification and recovery algorithm;
in the system call processing, all file-system-related system calls are divided into three categories, which are processed as follows:
system calls involving one inode: the modification is performed directly;
system calls involving two inodes: for a delete operation, the target directory entry is only marked as invalid, no actual deletion is performed, and its resources are reclaimed later; for an add operation, the directory entry is created and added without damaging the existing data or its structure, but it is not persisted immediately;
system calls involving three or more inodes: enough binding information is recorded in the system before and during the operation to bind together all inodes taking part in it; no cache line flush instruction or memory barrier is needed, and the cache line flush granularity is exploited so that, whatever subset of the data has been flushed, the persisted data suffices to either completely undo or complete the operation;
in the system call processing, every operation is recorded in the corresponding inode so that the background process can persist it;
in the system call processing, no operation affects data that has already been persisted, and no other data is affected whether or not the operation is completely persisted;
in the system call processing, all newly added data is self-verifiable, and whether it has been completely persisted can be judged by checking the data itself;
the system call processing contains no cache line flush instructions or memory barriers, and the consistency of the file system is guaranteed through data recovery.
2. The method according to claim 1, wherein the background process reads the series of operations recorded in each inode, issues cache line flush instructions in order for the memory addresses those operations involve, and finally issues a memory barrier to ensure that the data is persisted.
3. The method according to claim 2, wherein after a delete operation has been persisted, the background process returns the deleted resources to the allocator to prevent memory leaks.
4. The method according to claim 1, wherein in the data verification and recovery, after an unexpected restart is detected, a dedicated process checks the data and repairs inconsistent data; the check starts from the file system root and traverses all directories and files in turn, and operations that were not completely persisted are cancelled or completed.
5. The method according to claim 4, wherein in the data verification and recovery, no dedicated log information is consulted, and whether an operation should be cancelled or completed is judged solely from the integrity of the data itself.
CN201611058790.4A 2016-11-24 2016-11-24 High-performance file system design method based on nonvolatile memory Active CN106599096B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611058790.4A CN106599096B (en) 2016-11-24 2016-11-24 High-performance file system design method based on nonvolatile memory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611058790.4A CN106599096B (en) 2016-11-24 2016-11-24 High-performance file system design method based on nonvolatile memory

Publications (2)

Publication Number Publication Date
CN106599096A CN106599096A (en) 2017-04-26
CN106599096B CN106599096B (en) 2020-09-15

Family

ID=58593449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611058790.4A Active CN106599096B (en) 2016-11-24 2016-11-24 High-performance file system design method based on nonvolatile memory

Country Status (1)

Country Link
CN (1) CN106599096B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107577492A (en) * 2017-08-10 2018-01-12 上海交通大学 The NVM block device drives method and system of accelerating file system read-write
CN107784121B (en) * 2017-11-18 2020-04-24 中国人民解放军国防科技大学 Lowercase optimization method of log file system based on nonvolatile memory
CN110647764B (en) * 2019-09-05 2022-10-28 上海交通大学 Protection method and system for user-mode nonvolatile memory file system
CN112667588B (en) * 2019-10-16 2022-12-02 青岛海信移动通信技术股份有限公司 Intelligent terminal device and method for writing file system data
CN111736996B (en) * 2020-06-17 2022-08-16 上海交通大学 Process persistence method and device for distributed non-volatile memory system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104391930A (en) * 2014-11-21 2015-03-04 用友软件股份有限公司 Distributed file storage device and method
CN104881371A (en) * 2015-05-29 2015-09-02 清华大学 Persistent internal memory transaction processing cache management method and device
CN105138275A (en) * 2015-07-06 2015-12-09 中国科学院高能物理研究所 Data sharing method for Lustre storage system
CN105404673A (en) * 2015-11-19 2016-03-16 清华大学 NVRAM-based method for efficiently constructing file system
CN105701156A (en) * 2015-12-29 2016-06-22 青岛海信网络科技股份有限公司 Distributed file system management method and device

Also Published As

Publication number Publication date
CN106599096A (en) 2017-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant