WO2022037016A1 - Method, system and related apparatus for storing tree-structured data - Google Patents

Method, system and related apparatus for storing tree-structured data

Info

Publication number
WO2022037016A1
Authority
WO
WIPO (PCT)
Prior art keywords
block address
storage
key
value pair
tree structure
Prior art date
Application number
PCT/CN2021/073607
Other languages
English (en)
French (fr)
Inventor
刚亚州
Original Assignee
苏州浪潮智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州浪潮智能科技有限公司
Publication of WO2022037016A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations

Definitions

  • the present application relates to the field of data storage, and in particular, to a method, system and related apparatus for storing tree-structured data.
  • all-flash storage includes a deduplication function.
  • the deduplication function means that only one copy of duplicated data is stored on the SSD (Solid State Drive). Deduplication can therefore save a great deal of SSD space and achieve capacity reduction.
  • the deduplication function generates a mapping between multiple LBAs (logical block addresses) and one PBA (physical block address). Because of this P-L mapping, that is, the one-to-many relationship between a PBA and its LBAs, standard B+ tree operations cannot support fast lookup of P-L key-value pairs.
  • the purpose of this application is to provide a storage method, system, computer-readable storage medium and electronic device for tree structure data, which can improve the search efficiency of logical block data.
  • the present application provides a storage method for tree structure data, and the specific technical solutions are as follows:
  • the key-value pair includes a physical block address and a corresponding logical block address
  • before storing the logical block addresses exceeding the storage threshold in the overflow page of the leaf node, the method further comprises:
  • the last storage unit of the leaf node is used to store the address of the overflow page.
  • the last storage unit of the overflow page is used to store the address of the second overflow page.
  • a space of a preset size is requested from the cache, and the overflow page or the second overflow page is generated.
  • storing the key-value pair in the tree structure includes:
  • the logical block address corresponding to the physical block address in the key-value pair is stored in the leaf node corresponding to the intermediate node.
  • storing the key-value pair exceeding the storage threshold in the overflow page of the leaf node includes:
  • the logical block addresses that do not exceed the storage threshold are stored in the leaf node, and the remaining logical block addresses are stored in the overflow page of the leaf node.
  • the number of storage units of the leaf node, the overflow page and the second overflow page is the same.
  • the present application also provides a storage system for tree-structured data, including:
  • an acquisition module for acquiring a key-value pair;
  • the key-value pair includes a physical block address and a corresponding logical block address;
  • the judgment module is used to judge whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of the leaf nodes in the tree structure;
  • the second storage module is configured to store the key-value pair exceeding the storage threshold in the overflow page of the leaf node when the determination result of the determination module is yes.
  • the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, implements the steps of the above-described method.
  • the present application also provides an electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the above method when the computer program in the memory is invoked.
  • the present application provides a method for storing tree structure data, comprising: obtaining a key-value pair, where the key-value pair includes a physical block address and a corresponding logical block address; judging whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of the leaf node in the tree structure; if not, storing the key-value pair in the tree structure; if yes, storing the key-value pair exceeding the storage threshold in the overflow page of the leaf node.
  • when storing key-value pairs, the present application determines whether the number of logical block addresses corresponding to the physical block address exceeds the storage threshold of the corresponding leaf node. If it does, no new correspondence between an intermediate node and a leaf node is established; instead, the overflow page of the leaf node is used to store the key-value pairs, so that each physical block address corresponds to only one intermediate node in the tree structure and the same physical block address does not need to be stored in multiple intermediate nodes when it has too many logical block addresses.
  • this avoids the high search error rate caused by the same physical block address having multiple intermediate nodes and improves the data reduction ratio of the system, thereby improving the efficiency of metadata access and the availability of the data organization.
  • the present application also provides a tree-structured data storage system, a computer-readable storage medium, and an electronic device, which have the above-mentioned beneficial effects, and will not be repeated here.
  • FIG. 1 is a schematic diagram of a storage process of existing tree structure data provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of a method for storing tree structure data provided by an embodiment of the present application
  • FIG. 3 is a schematic diagram of a storage process of one-to-many tree structure data provided by an embodiment of the present application
  • FIG. 4 is a schematic structural diagram of a storage system for tree-structured data provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of a computer-readable storage medium provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the present application.
  • FIG. 1 is a schematic diagram of a storage process of existing tree structure data provided by an embodiment of the present application.
  • when the intermediate node Pt corresponds to the logical block addresses Lg, Li, Lj, Lk, Ll, Lm, Ln, ..., the number of logical block addresses exceeds the storage threshold of a single leaf node, so two leaf nodes are required to store them and, correspondingly, Pt appears twice among the intermediate nodes. When a logical block address is then searched for, it is unclear under which intermediate node Pt it is located, which leads to low search efficiency and makes search errors likely.
  • the present application provides a storage method for tree structure data, and the specific steps are as follows:
  • FIG. 2 is a flowchart of a method for storing tree structure data provided by an embodiment of the present application. The method includes:
  • S101 Obtain a key-value pair; the key-value pair includes a physical block address and a corresponding logical block address;
  • this step obtains key-value pairs, that is, P-L key-value pairs, each containing a physical block address and a corresponding logical block address; in other words, each logical block address and its corresponding physical block address need to be confirmed.
  • the physical block address and logical block address can usually be obtained directly from the storage device.
  • the key-value pair obtained in this step does not have to be in <P, L> format. The correspondence between each logical block and its physical block address can also be obtained from a mapping table or another data structure that contains the mapping between physical block addresses and logical block addresses, so as to obtain the required key-value pair; this should also be within the protection scope of this application.
  • S102 determine whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of the leaf node in the tree structure; if not, enter S103; if so, enter S104;
  • in this step it is necessary to judge whether the number of logical block addresses corresponding to each physical block address is greater than the storage threshold of the corresponding leaf node, that is, to compare the two. Since the key-value pairs may contain repeated physical block addresses, the count is accumulated only over logical block addresses that correspond to the same physical block address.
  • the specific size of the storage threshold is not limited here.
  • the storage threshold of the leaf node is usually determined when the tree structure is established; for example, it may be 32 or more, and is usually a power of 2.
  • if the number of logical block addresses is greater than the storage threshold, step S104 is performed; otherwise, step S103 is performed.
  • if the number of logical block addresses corresponding to the same physical block address does not exceed the storage threshold of the leaf node, the physical block address does not need two identical intermediate nodes, and the key-value pair can be directly stored in the tree structure.
  • how the key-value pair is stored in the tree structure is not limited here; for example, it can be stored directly in the leaf node of the tree structure as shown in FIG. 1.
  • S104 Store the key-value pair exceeding the storage threshold in the overflow page of the leaf node.
  • if the number of logical block addresses corresponding to the same physical block address exceeds the storage threshold of the leaf node, then, to avoid two identical intermediate nodes, the overflow page is used to store the key-value pairs that exceed the storage space of the leaf node, and the last storage unit of the leaf node is used to hold the address of the overflow page.
  • in other words, the number of storage units the leaf node actually uses for logical block addresses is the storage threshold minus one. The key point of this step is to store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node, while the leaf node itself still stores key-value pairs.
  • when the overflow page is saturated, the overflow page can be split to obtain a second overflow page, so that the excess logical block addresses are stored in the second overflow page; likewise, the last storage unit of the overflow page is used to store the address of the second overflow page.
  • it should be noted that the overflow page is not considered saturated only when all of its storage units are full. Rather, it is saturated when only one storage unit is empty and logical block addresses still remain to be stored. At that point the overflow page needs to be split to obtain the second overflow page.
  • if there are enough logical block addresses, the second overflow page is in turn split as an overflow page to obtain further overflow pages that meet the storage requirement. It can be seen that, except for the last overflow page, the leaf node and each preceding overflow page can store only their respective storage threshold minus one logical block addresses.
  • the storage thresholds of overflow pages and leaf nodes are not limited here: the storage thresholds of different overflow pages can be the same or different, and the storage thresholds of overflow pages and leaf nodes can be the same or different.
  • as a preferred implementation, and to simplify calculating the number of overflow pages required for each physical block, the storage threshold of each overflow page can be set to match the leaf node, that is, the storage threshold of the leaf node is used as the storage threshold of each overflow page. Then, once the number of logical block addresses corresponding to a physical block address is known, the number of overflow pages it requires is easy to determine, so that overflow pages can be generated in a targeted manner without wasting storage space of the tree structure.
  • for example, if a physical block address corresponds to 96 logical block addresses and the storage threshold of the leaf node is 32, the leaf node is also regarded as an overflow page, and every overflow page except the last can store only 31 logical block addresses; 96/32 + 1 = 4 overflow pages are then needed, that is, 3 overflow pages in addition to the leaf node.
  • in other words, with M logical block addresses and a storage threshold N, if M/N is not an integer, the required number of overflow pages n is M/N rounded up, which is the number of overflow pages needed in addition to the leaf node.
  • if M/N is an integer, the required number of overflow pages n is given by a formula that is not reproduced in this text (in the example above, 96/32 gives 3 overflow pages in addition to the leaf node). Of course, this requires the leaf node, the overflow page and the second overflow page to have the same number of storage units.
  • when storing logical block addresses, the embodiment of the present application determines whether the number of logical block addresses corresponding to the same physical block address exceeds the storage threshold of the corresponding leaf node. If it does, no new correspondence between an intermediate node and a leaf node is established; instead, the overflow page of the leaf node is used to store the key-value pairs, so that each physical block address corresponds to only one leaf node. This improves the data reduction ratio of the system, avoids the high search error rate caused by the same physical block address having multiple intermediate nodes when it corresponds to too many logical block addresses, and improves the efficiency of metadata access and the availability of the data organization.
  • to further improve the data reduction ratio of the tree structure, when the key-value pair is stored in the tree structure, the physical block address in the key-value pair can be stored in an intermediate node of the tree structure.
  • the logical block address corresponding to the physical block address in the key-value pair is then stored in the leaf node corresponding to that intermediate node. Because the physical block address is determined first when logical block addresses are traversed, the leaf node can store only the logical block addresses and need not repeat the physical block address, which reduces the amount of data the leaf node has to store.
  • similarly, when key-value pairs exceeding the storage threshold are stored in the overflow page, the logical block addresses that do not exceed the storage threshold can be stored in the leaf node and the remaining logical block addresses in the overflow page of the leaf node. That is, the physical block address is again stored in the intermediate node, and only logical block addresses are stored in the leaf node and the overflow page.
  • FIG. 3 is a schematic diagram of a storage process of one-to-many tree structure data provided by an embodiment of the present application, in which the storage method disclosed in the present application is applied to the key-value pairs shown in FIG. 1.
  • only the logical block addresses corresponding to the physical block address Pt are stored in the leaf node corresponding to the intermediate node where Pt is located and in its overflow page, so that the tree structure contains only one intermediate node for Pt. Retrieving the key-value pairs corresponding to Pt therefore does not fail or repeat because of multiple intermediate nodes, which improves retrieval efficiency.
  • a storage system for tree structure data provided by an embodiment of the present application is introduced below.
  • the storage system described below and the storage method for tree structure data described above may refer to each other correspondingly.
  • the present application also provides a storage system for tree-structured data, including:
  • Obtaining module 100 is used to obtain key-value pairs; the key-value pairs include physical block addresses and corresponding logical block addresses;
  • Judging module 200 for judging whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of the leaf node in the tree structure
  • the first storage module 300 is used for storing the key-value pair in the tree structure when the judgment result of the judgment module is no;
  • the second storage module 400 is configured to store the key-value pair exceeding the storage threshold in the overflow page of the leaf node when the determination result of the determination module is yes.
  • an overflow page generation module for generating an overflow page of the leaf node
  • the last storage unit of the leaf node is used to store the address of the overflow page.
  • An overflow page request module configured to request a preset size space from the cache, and generate the overflow page or the second overflow page.
  • the first storage module 300 may include:
  • a first storage unit for storing the physical block address in the key-value pair in an intermediate node in the tree structure
  • the second storage unit is configured to store the logical block address corresponding to the physical block address in the key-value pair in the leaf node corresponding to the intermediate node.
  • the second storage module 400 may specifically be a module configured to store the logical block addresses that do not exceed the storage threshold in the leaf node and to store the remaining logical block addresses in the overflow page of the leaf node.
  • the present application also provides a computer-readable storage medium 5, as shown in FIG. 5, on which a computer program 51 is stored, and when the computer program 51 is executed, the steps provided in the above embodiments can be implemented.
  • the storage medium may include: U disk, removable hard disk, read-only memory, random access memory, magnetic disk or optical disk and other media that can store program codes.
  • the present application also provides an electronic device, as shown in FIG. 6, which may include a memory 6 and a processor 7, where a computer program is stored in the memory 6, and when the processor 7 calls the computer program in the memory, the steps provided in the above embodiments can be implemented.
  • the electronic device may also include various network interfaces, power supplies and other components.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

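The present application provides a method for storing tree-structured data, comprising: obtaining a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address; determining whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure; if not, storing the key-value pair in the tree structure; if so, storing the key-value pairs exceeding the storage threshold in an overflow page of the leaf node. With the present application, each physical block address corresponds to only one intermediate node in the tree structure, which avoids the high search error rate caused by the same physical block address having multiple intermediate nodes when it corresponds to too many logical block addresses, improves the data reduction ratio of the system, and thereby improves metadata access efficiency and the availability of the data organization. The present application also provides a storage system for tree-structured data, a computer-readable storage medium and an electronic device, which have the above beneficial effects.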

Description

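Method, system and related apparatus for storing tree-structured data
This application claims priority to Chinese patent application No. 202010844595.4, entitled "Method, system and related apparatus for storing tree-structured data", filed with the China National Intellectual Property Administration on August 20, 2020, the entire contents of which are incorporated herein by reference.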
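Technical Field
The present application relates to the field of data storage, and in particular to a method, system and related apparatus for storing tree-structured data.
Background
All-flash storage includes a deduplication function, which means that duplicated data is stored only once on the SSD (Solid State Drive); deduplication can therefore save a great deal of SSD space and achieve capacity reduction. Deduplication produces a mapping between multiple LBAs (Logical Block Addresses) and one PBA (Physical Block Address). Because of this P-L mapping, that is, the one-to-many relationship between a PBA and its LBAs, standard B+ tree operations cannot support fast lookup of P-L key-value pairs.
Therefore, how to change the way data is stored so as to improve lookup efficiency is a technical problem that those skilled in the art urgently need to solve.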
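Summary
The purpose of the present application is to provide a method, system, computer-readable storage medium and electronic device for storing tree-structured data, which can improve the lookup efficiency of logical block data.
To solve the above technical problem, the present application provides a method for storing tree-structured data, with the following technical solution:
obtaining a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address;
determining whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure;
if not, storing the key-value pair in the tree structure;
if so, storing the key-value pairs exceeding the storage threshold in an overflow page of the leaf node.
Optionally, before storing the logical block addresses exceeding the storage threshold in the overflow page of the leaf node, the method further comprises:
generating the overflow page of the leaf node;
wherein the last storage unit of the leaf node is used to hold the address of the overflow page.
Optionally, the method further comprises:
when the overflow page is saturated, splitting the overflow page to obtain a second overflow page;
storing the logical block addresses in the second overflow page;
wherein the last storage unit of the overflow page is used to hold the address of the second overflow page.
Optionally, the method further comprises:
requesting a space of a preset size from the cache, and generating the overflow page or the second overflow page.
Optionally, storing the key-value pair in the tree structure comprises:
storing the physical block address of the key-value pair in an intermediate node of the tree structure;
storing the logical block address corresponding to the physical block address of the key-value pair in the leaf node corresponding to the intermediate node.
Optionally, if the physical block address of the key-value pair is stored in an intermediate node of the tree structure, storing the key-value pairs exceeding the storage threshold in the overflow page of the leaf node comprises:
storing the logical block addresses that do not exceed the storage threshold in the leaf node, and storing the remaining logical block addresses in the overflow page of the leaf node.
Optionally, the leaf node, the overflow page and the second overflow page have the same number of storage units.
The present application also provides a storage system for tree-structured data, comprising:
an acquisition module, configured to obtain a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address;
a judgment module, configured to determine whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure;
a first storage module, configured to store the key-value pair in the tree structure when the judgment result of the judgment module is no;
a second storage module, configured to store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node when the judgment result of the judgment module is yes.
The present application also provides a computer-readable storage medium on which a computer program is stored, the computer program implementing the steps of the method described above when executed by a processor.
The present application also provides an electronic device, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the method described above when it calls the computer program in the memory.
The present application provides a method for storing tree-structured data, comprising: obtaining a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address; determining whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure; if not, storing the key-value pair in the tree structure; if so, storing the key-value pairs exceeding the storage threshold in an overflow page of the leaf node.
When storing key-value pairs, the present application determines whether the number of logical block addresses corresponding to the physical block address exceeds the storage threshold of the corresponding leaf node. If it does, no new correspondence between an intermediate node and a leaf node is established; instead, the overflow page of the leaf node is used to store the key-value pairs, so that each physical block address corresponds to only one intermediate node in the tree structure and the same physical block address does not need multiple intermediate nodes when it has too many logical block addresses. This avoids the high search error rate caused by the same physical block address having multiple intermediate nodes, improves the data reduction ratio of the system, and thereby improves metadata access efficiency and the availability of the data organization. The present application also provides a storage system for tree-structured data, a computer-readable storage medium and an electronic device, which have the above beneficial effects and are not described again here.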
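Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of the storage process of existing tree-structured data provided by an embodiment of the present application;
FIG. 2 is a flowchart of a method for storing tree-structured data provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of the storage process of one-to-many tree-structured data provided by an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a storage system for tree-structured data provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of a computer-readable storage medium provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of an electronic device provided by an embodiment of the present application.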
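Detailed Description
To make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a schematic diagram of the storage process of existing tree-structured data provided by an embodiment of the present application. When the intermediate node Pt corresponds to the logical block addresses Lg, Li, Lj, Lk, Ll, Lm, Ln, ..., the number of logical block addresses exceeds the storage threshold of a single leaf node, so two leaf nodes are needed to store them and, correspondingly, Pt appears twice among the intermediate nodes. When a logical block address is then looked up, it is unclear under which intermediate node Pt it is located, which makes lookup inefficient and error-prone.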
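The ambiguity described for FIG. 1 can be reproduced with a toy model. The sketch below is an illustration only and does not represent the layout of any real B+ tree implementation; the function name `split_into_leaves` and the threshold of 4 are assumptions chosen to keep the example small.

```python
def split_into_leaves(pba, lbas, threshold):
    """Model the existing scheme: every leaf gets its own intermediate entry."""
    return [(pba, lbas[i:i + threshold]) for i in range(0, len(lbas), threshold)]

# Seven logical block addresses for Pt with a hypothetical leaf threshold of 4.
intermediate_entries = split_into_leaves(
    "Pt", ["Lg", "Li", "Lj", "Lk", "Ll", "Lm", "Ln"], 4
)

# A lookup by key "Pt" now matches more than one intermediate entry,
# so the search cannot tell in advance which leaf holds a given address.
matches = [leaf for key, leaf in intermediate_entries if key == "Pt"]
print(len(matches))
```

With more than one match for the same key, every lookup has to probe each candidate leaf, which is the inefficiency the application sets out to remove.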
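To solve the above problem, the present application provides a method for storing tree-structured data, with the following steps:
Referring to FIG. 2, FIG. 2 is a flowchart of a method for storing tree-structured data provided by an embodiment of the present application. The method includes: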
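S101: Obtain a key-value pair; the key-value pair includes a physical block address and a corresponding logical block address;
This step obtains key-value pairs, that is, P-L key-value pairs, each containing a physical block address and a corresponding logical block address; in other words, each logical block address and its corresponding physical block address must be identified. The physical and logical block addresses can usually be obtained directly from the storage device. Note, however, that if the storage device includes a deduplication function, a given physical block address usually exists only once in the device while it corresponds to multiple logical block addresses. The key-value pairs obtained in this step therefore need not be in <P, L> format: the correspondence between each logical block and its physical block address can also be obtained from a mapping table or another data structure that contains the mapping between physical and logical block addresses, so as to obtain the required key-value pairs, and this should also fall within the protection scope of the present application.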
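As one possible reading of S101, the P-L pairs can be derived from a mapping table that is keyed by logical block address. This is a minimal sketch; the table contents and the name `group_by_pba` are hypothetical and are not taken from the application.

```python
from collections import defaultdict

def group_by_pba(lba_to_pba):
    """Invert an LBA -> PBA mapping table into PBA -> [LBA, ...] groups."""
    groups = defaultdict(list)
    for lba, pba in sorted(lba_to_pba.items()):
        groups[pba].append(lba)
    return dict(groups)

# Hypothetical mapping table: after deduplication, LBAs 0, 1 and 3 share PBA 7.
mapping_table = {0: 7, 1: 7, 2: 9, 3: 7}
groups = group_by_pba(mapping_table)
print(groups)
```

Each entry of `groups` is one physical block address together with all logical block addresses that refer to it, which is the input the later steps operate on.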
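S102: Determine whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of the leaf node in the tree structure; if not, go to S103; if so, go to S104;
This step determines whether the number of logical block addresses corresponding to each physical block address is greater than the storage threshold of the corresponding leaf node, that is, it compares the two. Since the key-value pairs may contain repeated physical block addresses, the count is accumulated only over logical block addresses that correspond to the same physical block address. The specific size of the storage threshold is not limited here. The storage threshold of a leaf node is usually fixed when the tree structure is created; for example, it may be 32 or more, and is usually a power of 2.
If the number of logical block addresses is greater than the storage threshold, step S104 is performed; otherwise, step S103 is performed.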
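The decision in S102 amounts to a per-PBA count followed by a comparison. The sketch below assumes the pairs arrive as `(pba, lba)` tuples; the function name `plan_storage` and the sample addresses are illustrative assumptions.

```python
from collections import Counter

def plan_storage(pairs, threshold):
    """Return, per physical block address, which step of the method applies."""
    # Accumulate the number of LBAs seen for each identical PBA.
    counts = Counter(pba for pba, _lba in pairs)
    return {pba: ("S104" if count > threshold else "S103")
            for pba, count in counts.items()}

# "Pt" has 33 logical block addresses, "Pu" has one; the threshold is 32.
pairs = [("Pt", lba) for lba in range(33)] + [("Pu", 100)]
plan = plan_storage(pairs, threshold=32)
print(plan)
```

Note that a count exactly equal to the threshold still takes the S103 branch, because the text requires the count to be greater than the threshold before an overflow page is used.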
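S103: Store the key-value pair in the tree structure;
If the number of logical block addresses corresponding to the same physical block address does not exceed the threshold of the leaf node, the physical block address does not need two identical intermediate nodes, and the key-value pair can be stored directly in the tree structure. How the key-value pair is stored in the tree structure is not limited here; for example, it can be stored directly in a leaf node of the tree structure as in FIG. 1.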
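S104: Store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node.
If the number of logical block addresses corresponding to the same physical block address exceeds the storage threshold of the leaf node, then, to avoid two identical intermediate nodes, an overflow page is used to hold the key-value pairs that exceed the storage space of the leaf node, and the last storage unit of the leaf node is used to hold the address of the overflow page. In other words, the number of storage units the leaf node actually uses for logical block addresses is the storage threshold minus one. The key point of this step is to store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node, while the leaf node itself still stores key-value pairs.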
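The leaf layout described in S104 can be sketched as follows. This is a simplified model under stated assumptions: the `pages` dictionary stands in for wherever overflow pages are allocated, the `("overflow", address)` tuple is a hypothetical way to mark the pointer cell, and only a single overflow page is modelled.

```python
def store_lbas(lbas, threshold, pages):
    """Fill a leaf of `threshold` cells; spill the rest into one overflow page.

    `pages` is a hypothetical page table (address -> list of LBAs).
    """
    if len(lbas) <= threshold:
        return list(lbas)                      # S103: everything fits in the leaf
    overflow_addr = len(pages)                 # next free (hypothetical) address
    pages[overflow_addr] = list(lbas[threshold - 1:])
    # The last cell of the leaf holds the overflow page address, not an LBA,
    # so the leaf itself keeps only threshold - 1 logical block addresses.
    return list(lbas[:threshold - 1]) + [("overflow", overflow_addr)]

pages = {}
leaf = store_lbas(list(range(6)), 4, pages)
print(leaf, pages)
```

The leaf still occupies exactly `threshold` cells; one of them is simply repurposed as a link, which is why its usable capacity drops by one.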
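Note that this embodiment does not specifically limit when the overflow page is generated. After the result of S102 is obtained, a corresponding overflow page can be generated for each leaf node whose logical block addresses exceed the storage threshold. Alternatively, an overflow page can be configured for every leaf node in advance, before this step or before this embodiment is performed; either approach implements the technical solution of this embodiment. In addition, the way each overflow page is generated is not specifically limited: a space of a preset size can be requested from the cache to generate the overflow page or the second overflow page. The size of the space requested from the cache is likewise not specifically limited.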
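In addition, when the overflow page is saturated, it can be split to obtain a second overflow page so that the excess logical block addresses are stored in the second overflow page; likewise, the last storage unit of the overflow page is used to hold the address of the second overflow page. Note that the overflow page is not considered saturated only when all of its storage units are full. Rather, it is saturated when only one storage unit is empty and logical block addresses still remain to be stored; at that point the overflow page must be split to obtain the second overflow page. If there are enough logical block addresses, the second overflow page is in turn split as an overflow page to obtain further overflow pages that meet the storage requirement. It follows that, except for the last overflow page, the leaf node and each preceding overflow page can store only their respective storage threshold minus one logical block addresses.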
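The repeated splitting described above yields a chain of pages. The following sketch builds such a chain in one pass and reads it back; it assumes all pages share one threshold, uses list indices in place of real page addresses, and marks pointer cells with a hypothetical `("next", index)` tuple.

```python
def build_chain(lbas, threshold):
    """Return a list of pages; pages[0] is the leaf, the rest are overflow pages.

    Every page except the last keeps threshold - 1 addresses plus, in its
    last cell, the index of the next page.
    """
    pages, rest = [], list(lbas)
    while len(rest) > threshold:
        pages.append(rest[:threshold - 1] + [("next", len(pages) + 1)])
        rest = rest[threshold - 1:]
    pages.append(rest)                      # the last page needs no pointer cell
    return pages

def read_chain(pages):
    """Follow the chain from the leaf and collect every logical block address."""
    found, index = [], 0
    while index is not None:
        cells, index = pages[index], None
        for cell in cells:
            if isinstance(cell, tuple):     # pointer cell: continue with next page
                index = cell[1]
            else:
                found.append(cell)
    return found

chain = build_chain(list(range(10)), 4)
print(len(chain), read_chain(chain))
```

Running `build_chain` on 96 addresses with a threshold of 32 gives four pages in total, which agrees with the worked example in the text (one leaf plus three overflow pages).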
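The storage thresholds of overflow pages and leaf nodes are not limited here: the storage thresholds of different overflow pages can be the same or different, and the storage thresholds of overflow pages and leaf nodes can be the same or different. As a preferred implementation, and to simplify calculating the number of overflow pages required for each physical block, the storage threshold of each overflow page can be set to match the leaf node, that is, the storage threshold of the leaf node is used as the storage threshold of each overflow page. Then, once the number of logical block addresses corresponding to a physical block address is known, the number of overflow pages it requires is easy to determine, so that overflow pages can be generated in a targeted manner without wasting storage space of the tree structure. For example, if a physical block address corresponds to 96 logical block addresses and the storage threshold of the leaf node is 32, the leaf node is also regarded as an overflow page, and every overflow page except the last can store only 31 logical block addresses; 96/32 + 1 = 4 overflow pages are then needed, that is, 3 overflow pages in addition to the leaf node.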
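In other words, with M logical block addresses and a storage threshold of N, if the quotient M/N (Figure PCTCN2021073607-appb-000001) is not an integer, the required number of overflow pages is n = ⌈M/N⌉ (Figure PCTCN2021073607-appb-000002), that is, the quotient of M and N rounded up is the number of overflow pages required in addition to the leaf node. If the quotient M/N (Figure PCTCN2021073607-appb-000003) is an integer, the required number of overflow pages n is
Figure PCTCN2021073607-appb-000004
Of course, this requires the leaf node, the overflow page and the second overflow page to have the same number of storage units.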
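For a concrete check of the page count, the sketch below computes the smallest number of overflow pages directly from the layout described earlier (every page except the last keeps N − 1 addresses, the last may use all N cells). This derivation is an assumption of this sketch and is not copied from the formula images; the function name is hypothetical.

```python
def overflow_pages_needed(m, n):
    """Smallest number of overflow pages for m addresses and threshold n (n >= 2).

    Assumes the leaf and every non-final overflow page keep n - 1 addresses
    and the final page may use all n cells.
    """
    if m <= n:
        return 0                        # S103: the leaf alone is enough
    # ceil((m - n) / (n - 1)) using integer arithmetic
    return -(-(m - n) // (n - 1))

print(overflow_pages_needed(96, 32))
```

For the example in the text (M = 96, N = 32) this yields 3 overflow pages in addition to the leaf node. For other values it can be smaller than M/N rounded up, so it should be read as a lower bound under the stated layout rather than as the application's own formula.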
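When storing logical block addresses, the embodiment of the present application determines whether the number of logical block addresses corresponding to the same physical block address exceeds the storage threshold of the corresponding leaf node. If it does, no new correspondence between an intermediate node and a leaf node is established; instead, the overflow page of the leaf node is used to store the key-value pairs, so that each physical block address corresponds to only one leaf node. This improves the data reduction ratio of the system, avoids the high search error rate caused by the same physical block address having multiple intermediate nodes when it corresponds to too many logical block addresses, and improves metadata access efficiency and the availability of the data organization.
Based on the above embodiment, as a preferred embodiment, to further improve the data reduction ratio of the tree structure system, when the key-value pair is stored in the tree structure, the physical block address of the key-value pair can be stored in an intermediate node of the tree structure, and the logical block addresses corresponding to that physical block address are then stored in the leaf node corresponding to the intermediate node. With a one-to-one correspondence established between intermediate nodes and leaf nodes, and because the physical block address is determined first when logical block addresses are traversed, the leaf node can store only logical block addresses without repeating the physical block address, which reduces the amount of data the leaf node needs to store.
Similarly, if the physical block address of the key-value pair is stored in an intermediate node of the tree structure, then when the key-value pairs exceeding the storage threshold are stored in the overflow page of the leaf node, the logical block addresses that do not exceed the storage threshold can be stored in the leaf node and the remaining logical block addresses in the overflow page of the leaf node. That is, the physical block address is again stored in the intermediate node, and only logical block addresses are stored in the leaf node and the overflow page.
Referring now to FIG. 3, FIG. 3 is a schematic diagram of the storage process of one-to-many tree-structured data provided by an embodiment of the present application, in which the storage method disclosed in the present application is applied to the key-value pairs shown in FIG. 1. Only the logical block addresses corresponding to the physical block address Pt are stored in the leaf node corresponding to the intermediate node where Pt is located and in its overflow page, so the tree structure contains only one intermediate node for Pt. Retrieving the key-value pairs corresponding to Pt therefore does not fail or repeat because of multiple intermediate nodes, which improves retrieval efficiency. At the same time, because only logical block addresses are stored in the leaf node or overflow page, and the mapping between physical and logical block addresses is no longer stored there, the storage space required by each storage unit is reduced, which saves the storage space occupied by the tree-structured data.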
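The arrangement of FIG. 3 can be approximated with a dictionary standing in for the intermediate level. This is a sketch under simplifying assumptions: every page keeps threshold − 1 addresses (the pointer cell is implied rather than stored), and the names `build_index` and `lbas_of` are hypothetical.

```python
def build_index(groups, threshold):
    """Map each physical block address to exactly one chain of pages.

    The key (PBA) is stored once, at the intermediate level; pages hold LBAs only.
    """
    step = threshold - 1                # one cell per page is reserved for the link
    index = {}
    for pba, lbas in groups.items():
        index[pba] = [lbas[i:i + step] for i in range(0, len(lbas), step)] or [[]]
    return index

def lbas_of(index, pba):
    """Single lookup by PBA, then a walk over its leaf and overflow pages."""
    return [lba for page in index[pba] for lba in page]

index = build_index({"Pt": list(range(7)), "Pu": [100]}, 4)
print(len(index), lbas_of(index, "Pt"))
```

However many logical block addresses Pt has, `index` contains a single entry for it, so a lookup by Pt never has to choose between several intermediate nodes.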
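A storage system for tree-structured data provided by an embodiment of the present application is introduced below. The storage system described below and the storage method for tree-structured data described above may be read with reference to each other.
The present application also provides a storage system for tree-structured data, comprising:
an acquisition module 100, configured to obtain a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address;
a judgment module 200, configured to determine whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure;
a first storage module 300, configured to store the key-value pair in the tree structure when the judgment result of the judgment module is no;
a second storage module 400, configured to store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node when the judgment result of the judgment module is yes.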
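Based on the above embodiment, as a preferred embodiment, the system further comprises:
an overflow page generation module, configured to generate the overflow page of the leaf node;
wherein the last storage unit of the leaf node is used to hold the address of the overflow page.
Based on the above embodiment, as a preferred embodiment, the system further comprises:
an overflow page request module, configured to request a space of a preset size from the cache and generate the overflow page or the second overflow page.
Based on the above embodiment, as a preferred embodiment, the first storage module 300 may include:
a first storage unit, configured to store the physical block address of the key-value pair in an intermediate node of the tree structure;
a second storage unit, configured to store the logical block address corresponding to the physical block address of the key-value pair in the leaf node corresponding to the intermediate node.
Based on the above embodiment, as a preferred embodiment, if the first storage module 300 includes the first storage unit, the second storage module 400 may specifically be a module configured to store the logical block addresses that do not exceed the storage threshold in the leaf node and to store the remaining logical block addresses in the overflow page of the leaf node.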
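The present application also provides a computer-readable storage medium 5, as shown in FIG. 5, on which a computer program 51 is stored; when the computer program 51 is executed, the steps provided in the above embodiments can be implemented. The storage medium may include a USB flash drive, a removable hard disk, read-only memory, random access memory, a magnetic disk, an optical disc, or any other medium that can store program code.
The present application also provides an electronic device, as shown in FIG. 6, which may include a memory 6 and a processor 7. A computer program is stored in the memory 6, and when the processor 7 calls the computer program in the memory, the steps provided in the above embodiments can be implemented. The electronic device may of course also include components such as network interfaces and a power supply.
The embodiments in this specification are described progressively: each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments can be read with reference to each other. Because the system provided by the embodiments corresponds to the method provided by the embodiments, its description is relatively brief, and the relevant details can be found in the description of the method.
Specific examples are used herein to explain the principles and implementations of the present application; the description of the above embodiments is only intended to help in understanding the method of the present application and its core idea. It should be noted that a person of ordinary skill in the art can make improvements and modifications to the present application without departing from its principles, and these improvements and modifications also fall within the protection scope of the claims of the present application.
It should also be noted that, in this specification, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or device that includes the element.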

Claims (10)

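  1. A method for storing tree-structured data, characterized by comprising:
    obtaining a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address;
    determining whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure;
    if not, storing the key-value pair in the tree structure;
    if so, storing the key-value pairs exceeding the storage threshold in an overflow page of the leaf node.
  2. The storage method according to claim 1, characterized in that, before storing the logical block addresses exceeding the storage threshold in the overflow page of the leaf node, the method further comprises:
    generating the overflow page of the leaf node;
    wherein the last storage unit of the leaf node is used to hold the address of the overflow page.
  3. The storage method according to claim 1 or 2, characterized by further comprising:
    when the overflow page is saturated, splitting the overflow page to obtain a second overflow page;
    storing the logical block addresses in the second overflow page;
    wherein the last storage unit of the overflow page is used to hold the address of the second overflow page.
  4. The storage method according to claim 3, characterized by further comprising:
    requesting a space of a preset size from the cache, and generating the overflow page or the second overflow page.
  5. The storage method according to claim 1, characterized in that storing the key-value pair in the tree structure comprises:
    storing the physical block address of the key-value pair in an intermediate node of the tree structure;
    storing the logical block address corresponding to the physical block address of the key-value pair in the leaf node corresponding to the intermediate node.
  6. The storage method according to claim 5, characterized in that, if the physical block address of the key-value pair is stored in an intermediate node of the tree structure, storing the key-value pairs exceeding the storage threshold in the overflow page of the leaf node comprises:
    storing the logical block addresses that do not exceed the storage threshold in the leaf node, and storing the remaining logical block addresses in the overflow page of the leaf node.
  7. The storage method according to claim 1, characterized in that the leaf node, the overflow page and the second overflow page have the same number of storage units.
  8. A storage system for tree-structured data, characterized by comprising:
    an acquisition module, configured to obtain a key-value pair, the key-value pair comprising a physical block address and a corresponding logical block address;
    a judgment module, configured to determine whether the number of logical block addresses corresponding to the same physical block address is greater than the storage threshold of a leaf node in the tree structure;
    a first storage module, configured to store the key-value pair in the tree structure when the judgment result of the judgment module is no;
    a second storage module, configured to store the key-value pairs exceeding the storage threshold in the overflow page of the leaf node when the judgment result of the judgment module is yes.
  9. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-7.
  10. An electronic device, characterized by comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the method according to any one of claims 1-7 when it calls the computer program in the memory.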
PCT/CN2021/073607 2020-08-20 2021-01-25 Method, system and related apparatus for storing tree-structured data WO2022037016A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010844595.4 2020-08-20
CN202010844595.4A CN111984650A (zh) 2020-08-20 2020-08-20 Method, system and related apparatus for storing tree-structured data

Publications (1)

Publication Number Publication Date
WO2022037016A1 (zh) 2022-02-24

Family

ID=73442684

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/073607 WO2022037016A1 (zh) 2020-08-20 2021-01-25 Method, system and related apparatus for storing tree-structured data

Country Status (2)

Country Link
CN (1) CN111984650A (zh)
WO (1) WO2022037016A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662019A (zh) * 2023-07-31 2023-08-29 苏州浪潮智能科技有限公司 Request allocation method and apparatus, storage medium and electronic apparatus

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984650A (zh) * 2020-08-20 2020-11-24 苏州浪潮智能科技有限公司 Method, system and related apparatus for storing tree-structured data

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
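CN108027764A (zh) * 2015-10-30 2018-05-11 桑迪士克科技有限责任公司 Memory mapping of convertible leaves
CN110781101A (zh) * 2019-10-25 2020-02-11 苏州浪潮智能科技有限公司 Storage method and apparatus for one-to-many mapping relationships, electronic device and medium
CN111984650A (zh) * 2020-08-20 2020-11-24 苏州浪潮智能科技有限公司 Method, system and related apparatus for storing tree-structured data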

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
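CN108027764A (zh) * 2015-10-30 2018-05-11 桑迪士克科技有限责任公司 Memory mapping of convertible leaves
CN110781101A (zh) * 2019-10-25 2020-02-11 苏州浪潮智能科技有限公司 Storage method and apparatus for one-to-many mapping relationships, electronic device and medium
CN111984650A (zh) * 2020-08-20 2020-11-24 苏州浪潮智能科技有限公司 Method, system and related apparatus for storing tree-structured data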

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
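CN116662019A (zh) * 2023-07-31 2023-08-29 苏州浪潮智能科技有限公司 Request allocation method and apparatus, storage medium and electronic apparatus
CN116662019B (zh) * 2023-07-31 2023-11-03 苏州浪潮智能科技有限公司 Request allocation method and apparatus, storage medium and electronic apparatus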

Also Published As

Publication number Publication date
CN111984650A (zh) 2020-11-24

Similar Documents

Publication Publication Date Title
US11079953B2 (en) Packing deduplicated data into finite-sized containers
US11010300B2 (en) Optimized record lookups
WO2017201977A1 (zh) Data writing and reading method and apparatus, and distributed object storage cluster
US10860494B2 (en) Flushing pages from solid-state storage device
JP5732536B2 (ja) System, method and non-transitory computer-readable storage medium for scalable reference management in a deduplication-based storage system
US10331641B2 (en) Hash database configuration method and apparatus
US10545987B2 (en) Replication to the cloud
US9021189B2 (en) System and method for performing efficient processing of data stored in a storage node
US9092321B2 (en) System and method for performing efficient searches and queries in a storage node
CA2893304C (en) Data storage method, data storage apparatus, and storage device
WO2022037016A1 (zh) Method, system and related apparatus for storing tree-structured data
US9430492B1 (en) Efficient scavenging of data and metadata file system blocks
WO2022048356A1 (zh) Data processing method and system for a cloud platform, electronic device and storage medium
US9336135B1 (en) Systems and methods for performing search and complex pattern matching in a solid state drive
US11599290B2 (en) Data storage method, electronic device, and computer program product
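WO2015027731A1 (zh) Bloom filter generation method and apparatus
WO2022166265A1 (zh) Data recovery method, apparatus, device and medium
WO2019072088A1 (zh) File management method, file management apparatus, electronic device and storage medium
WO2017028718A1 (zh) Data reading method and apparatus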
Prabavathy et al. Multi-index technique for metadata management in private cloud storage
US11714805B1 (en) Method and system for streaming data from portable storage devices
US11372554B1 (en) Cache management system and method
CN115525219A (zh) Object data storage method, apparatus and medium
Feng et al. OLP Scheme on Backup Log and Hbase
CN113625955A (zh) Dirty data processing method, apparatus and medium for a distributed storage system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application. Ref document number: 21857122; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: pct application non-entry in european phase. Ref document number: 21857122; Country of ref document: EP; Kind code of ref document: A1