CN114153848B - Block chain data storage method and device and electronic equipment - Google Patents

Info

Publication number
CN114153848B
Authority
CN
China
Prior art keywords
node
tree
data
key
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111443113.5A
Other languages
Chinese (zh)
Other versions
CN114153848A (en)
Inventor
俞本权
卓海振
陆钟豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alipay Hangzhou Information Technology Co Ltd
Ant Blockchain Technology Shanghai Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Ant Blockchain Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd and Ant Blockchain Technology Shanghai Co Ltd
Priority to CN202111443113.5A
Publication of CN114153848A
Application granted
Publication of CN114153848B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2246 Trees, e.g. B+trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23 Updating
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/04 Trading; Exchange, e.g. stocks, commodities, derivatives or currency exchange

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Business, Economics & Management (AREA)
  • Software Systems (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Technology Law (AREA)
  • Health & Medical Sciences (AREA)
  • Bioethics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Computing Systems (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A blockchain data storage method and apparatus, and an electronic device. The method comprises: acquiring key-value pairs of blockchain data to be stored; converting the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes of a logical tree structure, wherein the root node and the intermediate nodes each comprise a main position and a plurality of sub-positions for storing characters in a key of the blockchain data; the main position comprises a plurality of slots that correspond respectively to the sub-positions and store the hash value of the storage content of each sub-position; each sub-position comprises a plurality of slots for storing characters in keys of the blockchain data, and each slot in a sub-position stores the hash value of the next-layer node linked to the node; and storing key-value pairs of the root node, the intermediate nodes and the leaf nodes in a database.

Description

Block chain data storage method and device and electronic equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of blockchain technologies, and in particular, to a blockchain data storage method and apparatus, and an electronic device.
Background
Blockchain technology, also known as distributed ledger technology, is an emerging technology in which several node devices jointly participate in "accounting" and together store and maintain a complete distributed database. For a node device of the blockchain, the blockchain data that needs to be stored and maintained generally includes block data and account state data corresponding to the blockchain accounts in the blockchain; the block data may further include block header data, the transaction data in the block, transaction receipts corresponding to the transaction data in the block, and so on.
When storing the various kinds of blockchain data described above, a node device of the blockchain typically organizes the data into a Merkle tree and stores it in a database in the form of key-value pairs. When the blockchain data stored by the node device needs to be queried, the data can be queried efficiently by traversing the Merkle tree with the key of the blockchain data as the query index.
Disclosure of Invention
A blockchain data storage method is proposed, the method comprising:
acquiring key-value pairs of blockchain data to be stored;
converting the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes of a logical tree structure; the root node and the intermediate nodes each comprise a main position and a plurality of sub-positions for storing characters in a key of the blockchain data; the main position comprises a plurality of slots that correspond respectively to the sub-positions and store the hash value of the storage content of each sub-position; each sub-position comprises a plurality of slots for storing characters in keys of the blockchain data; each slot in a sub-position stores the hash value of the next-layer node linked to the node; the hash value of the root node or of an intermediate node is the hash of the storage content of the main position of that node;
storing key-value pairs of the root node, the intermediate nodes and the leaf nodes in a database; in the key-value pair of each leaf node, intermediate node and root node, the value is the storage content of the node and the key is the hash value of the storage content of the node.
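The node layout and storage step described above can be illustrated with a minimal Python sketch. The sketch rests on several assumptions that are not stated in the text: SHA-256 as the hash function, 16 sub-positions with 16 slots each (one per hexadecimal character), serialization by plain concatenation, an all-zero placeholder for empty slots, and a Python dict standing in for the database. All names (TreeNode, store, h) are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field

FANOUT = 16            # assumption: one slot per hexadecimal character
EMPTY = b"\x00" * 32   # assumption: placeholder for an unused slot


def h(data: bytes) -> bytes:
    """Hash helper; SHA-256 is an assumption, the text names no algorithm."""
    return hashlib.sha256(data).digest()


@dataclass
class TreeNode:
    # sub_positions[i][j] holds the hash of the next-layer node linked via slot j
    # of sub-position i.
    sub_positions: list = field(
        default_factory=lambda: [[EMPTY] * FANOUT for _ in range(FANOUT)]
    )

    def main_position(self) -> list:
        # Slot i of the main position stores the hash of sub-position i's content.
        return [h(b"".join(sub)) for sub in self.sub_positions]

    def content(self) -> bytes:
        # Storage content of the node: main position followed by all sub-positions.
        subs = b"".join(b"".join(sub) for sub in self.sub_positions)
        return b"".join(self.main_position()) + subs

    def node_hash(self) -> bytes:
        # The node hash is the hash of the main position's content.
        return h(b"".join(self.main_position()))


def store(db: dict, node: TreeNode) -> bytes:
    """Write the node as a key-value pair: key = node hash, value = content."""
    key = node.node_hash()
    db[key] = node.content()
    return key


db = {}
leaf_value = b"account-state"          # hypothetical leaf content
leaf_key = h(leaf_value)
db[leaf_key] = leaf_value              # leaf stored under the hash of its content

root = TreeNode()
root.sub_positions[0xA][0x7] = leaf_key  # link the leaf under characters "a", "7"
root_key = store(db, root)
```

In this sketch a node with 16 x 16 slots carries 256 child hashes, which is the sense in which a single node holds more data than a 16-slot branch node.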
There is also presented a blockchain data storage device, the device comprising:
an acquisition module, which acquires key-value pairs of blockchain data to be stored;
a conversion module, which converts the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes of a logical tree structure; the root node and the intermediate nodes each comprise a main position and a plurality of sub-positions for storing characters in a key of the blockchain data; the main position comprises a plurality of slots that correspond respectively to the sub-positions and store the hash value of the storage content of each sub-position; each sub-position comprises a plurality of slots for storing characters in keys of the blockchain data; each slot in a sub-position stores the hash value of the next-layer node linked to the node; the hash value of the root node or of an intermediate node is the hash of the storage content of the main position of that node;
a storage module, which stores key-value pairs of the root node, the intermediate nodes and the leaf nodes in a database; in the key-value pair of each leaf node, intermediate node and root node, the value is the storage content of the node and the key is the hash value of the storage content of the node.
In the above technical solution, in a first aspect, the root node and each intermediate node in the logical tree structure include a plurality of positions that respectively represent different characters, and each position further includes a plurality of slots that respectively represent different characters. With this design, the root node and each intermediate node have a larger data storage capacity and data carrying capacity. As a result, when the root node and the intermediate nodes are written to the database, or when those nodes stored in the database are accessed, the amount of data held by each node better matches the single-IO read/write capacity of the storage medium carrying the database, so that this capacity can be fully utilized and data read/write efficiency is improved. Moreover, increasing the storage and carrying capacity of the nodes also tends to increase the overall storage and carrying capacity of the logical tree structure, so that more blockchain data can be stored in it.
In a second aspect, the root node and each intermediate node in the logical tree structure adopt a uniform data structure, so the character length of the character prefix of the blockchain data keys actually stored by the root node and the intermediate nodes also remains fixed. This design avoids frequent node splitting caused by the root node and the intermediate nodes storing character prefixes of unfixed length, and thus keeps the number of layers of root and intermediate nodes in the logical tree structure relatively stable.
In a third aspect, the main position of the root node and of each intermediate node includes a plurality of slots that correspond respectively to the sub-positions and are filled with the hash of each sub-position; each sub-position in turn includes a plurality of slots filled with the hash value of the next-layer node linked to the character node; and the hash value of the root node or of an intermediate node can be represented by the hash of its main position. Therefore, when the hash value filled in any slot of a sub-position of the root node or of an intermediate node is updated, recalculating the hash value of that node only requires the following: the hash values filled in the slots of the updated sub-position are concatenated and hashed; the result is filled into the slot of the main position that corresponds to that sub-position; and the hashes filled in the slots of the main position are then concatenated and hashed to obtain the hash of the node. The hash values filled in the slots of the other sub-positions, where no data update occurred, need not be concatenated and hashed again. This reduces the amount and time of hash computation when recalculating the hash value of the root node or of an intermediate node, and improves the efficiency of the hash calculation.
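The saving described in the third aspect can be sketched as follows. This is an illustrative sketch only, assuming SHA-256, 16 sub-positions of 16 slots each, and hashing by concatenation; the class name Node, the method update_slot and the call counter are hypothetical and serve only to make the amount of hashing visible.

```python
import hashlib

FANOUT = 16            # assumption: 16 sub-positions, 16 slots each
EMPTY = b"\x00" * 32   # assumption: placeholder for an unused slot
hash_calls = 0         # counts hash computations for illustration


def h(data: bytes) -> bytes:
    global hash_calls
    hash_calls += 1
    return hashlib.sha256(data).digest()


class Node:
    def __init__(self):
        self.subs = [[EMPTY] * FANOUT for _ in range(FANOUT)]
        # Main position: one slot per sub-position, holding that sub-position's hash.
        self.main = [h(b"".join(s)) for s in self.subs]
        # Node hash: hash of the concatenated main-position slots.
        self.hash = h(b"".join(self.main))

    def update_slot(self, i: int, j: int, child_hash: bytes) -> bytes:
        """Update slot j of sub-position i and recompute only what changed."""
        self.subs[i][j] = child_hash
        self.main[i] = h(b"".join(self.subs[i]))   # rehash the touched sub-position
        self.hash = h(b"".join(self.main))         # rehash the main position
        return self.hash


node = Node()
old_hash = node.hash
child_hash = h(b"child")            # hypothetical next-layer node hash
before = hash_calls
new_hash = node.update_slot(0xA, 0x7, child_hash)
calls_for_update = hash_calls - before
```

Under these assumptions the update costs two hash computations (one for the touched sub-position, one for the main position), whereas recomputing every sub-position and then the main position would cost seventeen.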
Drawings
FIG. 1 is a schematic diagram of organizing blockchain account state data into an MPT state tree, provided by an exemplary embodiment;
FIG. 2 is a schematic diagram of organizing contract data stored in a storage space corresponding to a contract account into an MPT storage tree, provided by an exemplary embodiment;
FIG. 3 is a flowchart of a blockchain data storage method provided by an exemplary embodiment;
FIG. 4 is a tree structure diagram of an FDMT tree provided by an exemplary embodiment;
FIG. 5 is a tree structure diagram of another FDMT tree provided by an exemplary embodiment;
FIG. 6 is a block diagram of a Tree node provided by an exemplary embodiment;
FIG. 7 is a block diagram of bucket data provided by an exemplary embodiment;
FIG. 8 is a schematic diagram of setting node IDs for nodes in an FDMT tree, provided by an exemplary embodiment;
FIG. 9 is a schematic diagram of recursive compression of Tree nodes on an FDMT tree, provided by an exemplary embodiment;
FIG. 10 is a schematic diagram of writing account state data to a Merkle state storage tree, provided by an exemplary embodiment;
FIG. 11 is a schematic diagram of node splitting for a bucket node, provided by an exemplary embodiment;
FIG. 12 is a schematic diagram of an electronic device provided by an exemplary embodiment;
FIG. 13 is a block diagram of a blockchain data storage device provided by an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with one or more embodiments of the present specification. Rather, they are merely examples of apparatus and methods consistent with aspects of one or more embodiments of the present description as detailed in the accompanying claims.
It should be noted that: in other embodiments, the steps of the corresponding method are not necessarily performed in the order shown and described in this specification. In some other embodiments, the method may include more or fewer steps than described in this specification. Furthermore, individual steps described in this specification, in other embodiments, may be described as being split into multiple steps; while various steps described in this specification may be combined into a single step in other embodiments.
Blockchains are generally divided into three types: public chains (Public Blockchain), private chains (Private Blockchain) and consortium chains (Consortium Blockchain). In addition, there may be combinations of these types, such as private chain + consortium chain, consortium chain + public chain, and the like.
Among them, the public chain has the highest degree of decentralization. Participants that join a public chain (which may also be referred to as nodes in the blockchain) can read data records on the chain, participate in transactions, compete for the accounting right for new blocks, and so on. Moreover, each node can freely join or leave the network and perform the relevant operations.
The private chain is the opposite: write permission to the network is controlled by a particular organization or institution, and data read permission is specified by that organization. In short, a private chain can be a weakly centralized system with strict restrictions on nodes and a small number of nodes. This type of blockchain is better suited to use within a particular organization.
The consortium chain lies between the public chain and the private chain and can achieve "partial decentralization". Each node in a consortium chain typically has a corresponding entity institution or organization; nodes join the network by authorization and form a stakeholder alliance that jointly maintains the operation of the blockchain.
Based on the basic characteristics of a blockchain, a blockchain is typically made up of several blocks. Each block records a timestamp corresponding to its creation time, and all blocks form a time-ordered data chain strictly according to the timestamps recorded in the blocks.
Data generated outside the chain can be constructed into a standard transaction format supported by the blockchain and then published to the blockchain. The node devices in the blockchain perform consensus on the transaction; after consensus is reached, the node device serving as the accounting node packs the transaction into a block, and the block is persisted in the blockchain.
In the blockchain field, an important concept is the account (Account). In practical applications, accounts can generally be divided into two types: external accounts and contract accounts. An external account is an account directly controlled by a user and is also called a user account; a contract account is an account that is created by a user through an external account and contains contract code (i.e., a smart contract).
For accounts in a blockchain, the account status of the account is typically maintained by a structure. When a transaction in a block is executed, the status of the account in the blockchain associated with the transaction will typically change.
In one example, the structure of an account typically includes fields such as Balance, Nonce, Code, and Storage, where:
the Balance field maintains the current account balance of the account;
the Nonce field maintains the transaction count of the account; it is a counter used to guarantee that each transaction is processed only once, which effectively prevents replay attacks;
the Code field maintains the contract code of the account; in practical applications, the Code field usually maintains only the hash value of the contract code, so it is also commonly referred to as the Codehash field;
the Storage field maintains the stored content of the account. For a contract account, a separate persistent storage space is generally allocated to store the contract data of the contract account; this separate storage space is commonly referred to as the account storage of the contract account. The stored content of the contract account is typically organized, in the form of key-value pairs, into the data structure of an MPT (Merkle Patricia Trie) tree and stored in this separate storage space. An MPT tree is a logical tree structure used in the blockchain field to store and maintain blockchain data, and it typically includes a root node, intermediate nodes, and leaf nodes.
The MPT tree built from the stored content of a contract account is also commonly referred to as the Storage tree. The Storage field typically maintains only the hash value of the root node of the Storage tree, so it is also commonly referred to as the Storage root hash field. For an external account, the field values of the Code field and the Storage field are null.
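The account structure described above can be sketched as a small Python data class. The field types, the helper is_contract and the nonce check in apply_transaction are assumptions made for illustration; the text only states that the Nonce is a counter ensuring each transaction is processed once, and that Code and Storage are null for external accounts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Account:
    balance: int                           # Balance: current account balance
    nonce: int                             # Nonce: transaction counter
    code_hash: Optional[bytes] = None      # Codehash: null for external accounts
    storage_root: Optional[bytes] = None   # Storage root hash: null for external accounts

    def is_contract(self) -> bool:
        # Hypothetical helper: a contract account carries a contract code hash.
        return self.code_hash is not None


def apply_transaction(account: Account, tx_nonce: int) -> bool:
    """Assumed replay check: accept only a transaction whose nonce matches the counter."""
    if tx_nonce != account.nonce:
        return False           # stale or replayed transaction is rejected
    account.nonce += 1         # each accepted transaction advances the counter
    return True


user = Account(balance=100, nonce=0)
first = apply_transaction(user, tx_nonce=0)    # accepted
replay = apply_transaction(user, tx_nonce=0)   # same nonce again: rejected
```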
Most blockchain models typically use a Merkle tree, or a logical tree structure such as a Merkle tree variant based on the Merkle tree data structure. For example, the MPT tree is a Merkle tree variant that incorporates the tree structure of a Trie (dictionary tree) and is used to store and maintain blockchain data.
The following description takes the use of MPT trees to store blockchain data as an example.
In one example, the blockchain data that needs to be stored and maintained in the blockchain typically includes account state data, transaction data, and receipt data. Therefore, in practical applications, the account state data, the transaction data and the receipt data can be organized, in the form of key-value pairs, into three MPT trees, namely an MPT state tree (i.e., the world state), an MPT transaction tree and an MPT receipt tree, which are stored and maintained separately.
In addition to these three MPT trees, the contract data stored in the storage space corresponding to a contract account is also generally built into an MPT storage tree (hereinafter referred to as the Storage tree). The hash value of the root node of the Storage tree is added to the Storage field in the structure of the contract account corresponding to the Storage tree.
The MPT state tree is an MPT tree organized, in the form of key-value pairs, from the account state data of all accounts (including external accounts and contract accounts) in the blockchain; the MPT transaction tree is an MPT tree organized, in the form of key-value pairs, from the transaction data in the blockchain; and the MPT receipt tree is an MPT tree organized, in the form of key-value pairs, from the transaction receipts that correspond to each transaction and are generated after the transactions in the block have been executed.
The hash values of the root nodes of the MPT status tree, MPT transaction tree, and MPT receipt tree shown above are eventually added to the block header of the corresponding block.
The MPT transaction tree and the MPT receipt tree correspond to blocks, i.e. each block has its own MPT transaction tree and MPT receipt tree. The MPT state tree, by contrast, is a global MPT tree that does not correspond to a particular block but covers the account state data of all accounts in the blockchain. Each time the blockchain generates a latest block, the account states of the accounts (external accounts or contract accounts) related to the executed transactions typically change after the transactions in the latest block are executed.
For example, when a "transfer transaction" in a block is completed, the balances of the transferor account and the transferee account associated with the "transfer transaction" (i.e., the field values of the Balance fields of these accounts) typically change. After the transactions in the latest block generated by the blockchain have been executed, the account states in the blockchain have changed, so the node device needs to construct an MPT state tree from the current account state data of all accounts in the blockchain in order to maintain the latest state of all accounts.
That is, each time a latest block is generated in the blockchain and the transactions in that block have been executed, the account states of some accounts in the blockchain change, and the node device needs to rebuild an MPT state tree based on the latest account state data of all accounts in the blockchain. In other words, each block in the blockchain has a corresponding MPT state tree, which maintains the latest account states of all accounts in the blockchain after the transactions in that block have been executed.
Referring to FIG. 1, FIG. 1 is a schematic diagram of organizing the account state data of each blockchain account in a blockchain into an MPT state tree in the form of key-value pairs.
The MPT tree is a relatively traditional improved variant of the Merkle tree that combines the advantages of two tree structures: the Merkle tree and the Trie (dictionary tree, also known as prefix tree).
An MPT tree typically includes three types of nodes: leaf nodes, extension nodes, and branch nodes. The root node of the MPT tree is generally an extension node; the intermediate nodes of the MPT tree are generally branch nodes or other extension nodes.
Extension nodes and branch nodes may be collectively referred to as character nodes and are used to store the character prefix portion of the character string corresponding to the key (i.e., the account address) of the account state data. For MPT trees, this character prefix portion is usually a shared character prefix, that is, a prefix composed of one or more identical characters shared by the keys (i.e., the blockchain account addresses) of all account state data. A leaf node is used to store the character suffix portion of the character string corresponding to the key of the blockchain data, together with the Value (i.e., the specific account state data).
An extension node stores one or more characters of the shared character prefix of the account address (i.e., the Shared nibble shown in FIG. 1) and the hash value of the next-layer node to which the extension node is linked (i.e., the Next node shown in FIG. 1).
A branch node contains 17 slots. The first 16 slots correspond to the 16 possible hexadecimal characters in a key, where one character corresponds to one nibble (half a byte); each of these slots represents one character in the shared character prefix of the account address and is filled with the hash value of the next-layer node to which the branch node is linked. The last slot is a value slot, which is typically null.
A leaf node stores the character suffix of the account address (i.e., the key-end shown in FIG. 1) and the value of the account state data (i.e., the account structure described above). The character suffix of the account address and the shared character prefix of the account address together form the complete account address; the character suffix is the suffix formed by the last one or more characters of the account address other than the shared character prefix.
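The three node types and the way a key is resolved through them can be sketched in Python as follows. This is a simplified illustration, not the encoding used by any particular blockchain: nodes are plain dicts, SHA-256 over a JSON serialization stands in for the node hash, and a dict stands in for the database. The keys used in the example are hypothetical and are not the contents of Table 1.

```python
import hashlib
import json

db = {}  # stand-in for the key-value database


def put(node: dict) -> str:
    """Store a node under the hash of its content and return that hash."""
    content = json.dumps(node, sort_keys=True).encode()
    key = hashlib.sha256(content).hexdigest()
    db[key] = node
    return key


def leaf(key_end: str, value: str) -> str:
    return put({"type": "leaf", "key_end": key_end, "value": value})


def extension(shared: str, next_hash: str) -> str:
    return put({"type": "extension", "shared": shared, "next": next_hash})


def branch(children: dict) -> str:
    # 16 slots for the characters 0-f plus one value slot (left null here).
    slots = [children.get(c) for c in "0123456789abcdef"]
    return put({"type": "branch", "slots": slots, "value": None})


def lookup(root_hash: str, key: str):
    """Walk from the root, consuming the key's characters node by node."""
    node_hash, rest = root_hash, key
    while node_hash is not None:
        node = db[node_hash]
        if node["type"] == "leaf":
            return node["value"] if node["key_end"] == rest else None
        if node["type"] == "extension":
            if not rest.startswith(node["shared"]):
                return None
            rest = rest[len(node["shared"]):]
            node_hash = node["next"]
        else:  # branch: one character selects one of the 16 slots
            if not rest:
                return node["value"]
            node_hash = node["slots"][int(rest[0], 16)]
            rest = rest[1:]
    return None


# Hypothetical keys sharing the prefix "a7".
root = extension("a7", branch({
    "1": leaf("1355", "state-1"),
    "f": leaf("9365", "state-2"),
}))
found = lookup(root, "a711355")
missing = lookup(root, "a700000")
```

The shared prefix, the branch character and the key-end together spell out the full key, which is why a lookup only succeeds when all three parts match.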
Assume that account state data that needs to be organized into an MPT state tree is shown in table 1 below:
TABLE 1
In Table 1, the blockchain accounts corresponding to the account addresses in the first three rows are external accounts, and their Codehash and Storage root fields are null. The blockchain account corresponding to the account address in the fourth row is a contract account; its Codehash field maintains the hash value of the contract code corresponding to the contract account, and its Storage root field maintains the hash value of the root node of the Storage tree formed from the stored content of the contract account.
The MPT state tree finally organized from the account state data in Table 1 is shown in FIG. 1; it is composed of 4 leaf nodes, 2 branch nodes, and 2 extension nodes (one of which serves as the root node).
In FIG. 1, the prefix field is a field common to extension nodes and leaf nodes. Different values of the prefix field may be used to represent different node types.
For example, a prefix field value of 0 indicates an extension node containing an even number of nibbles; as described above, a nibble is a half-byte consisting of 4 binary bits, and one nibble corresponds to one of the characters that make up the account address. A prefix field value of 1 indicates an extension node containing an odd number of nibbles; a value of 2 indicates a leaf node containing an even number of nibbles; and a value of 3 indicates a leaf node containing an odd number of nibbles.
A branch node does not have the prefix field, because it is a character node that lists nibbles in parallel.
The Shared nibble field in an extension node corresponds to the key of the key-value pair contained in the extension node and represents a common character prefix between account addresses; for example, all account addresses in the table above have the common character prefix a7. The Next Node field is filled with the hash value (hash pointer) of the next node.
The hexadecimal character fields 0-f in a branch node correspond to the keys of the key-value pairs contained in the branch node; each of the 0-f fields is filled with the hash value of a next-layer node. If the branch node is an intermediate node on the search path of an account address in the MPT tree, the Value field of the branch node may be null.
The Key-end field in a leaf node corresponds to the key of the key-value pair contained in the leaf node and represents the last few characters of the account address (the character suffix of the account address). The keys of the nodes on the search path from the root node to a leaf node together constitute a complete account address. The Value field of the leaf node is filled with the account state data corresponding to the account address; for example, the structure formed by the Balance, Nonce, Code and Storage fields described above may be encoded and then filled into the Value field of the leaf node.
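The four prefix values listed above follow a simple pattern: the node type selects 0 or 2, and an odd nibble count adds 1. A minimal Python sketch of that mapping, with a hypothetical function name and hypothetical example inputs:

```python
def prefix_value(is_leaf: bool, nibbles: str) -> int:
    """Return the prefix field value for an extension or leaf node.

    0: extension node, even number of nibbles
    1: extension node, odd number of nibbles
    2: leaf node, even number of nibbles
    3: leaf node, odd number of nibbles
    """
    odd = len(nibbles) % 2
    return (2 if is_leaf else 0) + odd


extension_even = prefix_value(False, "a7")   # two nibbles in an extension node
leaf_odd = prefix_value(True, "397")         # three nibbles in a leaf node
```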
Further, the nodes on the MPT state tree shown in fig. 1 are ultimately stored in the database in the form of key-value pairs.
When a node on the MPT state tree is stored in the database, the key of its key-value pair may be the hash value of the data content contained in the node, and the value is the data content contained in the node.
That is, when a node on the MPT state tree is stored in the database, the hash value of the data content contained in the node can be calculated (namely, the hash is computed over the whole node); the calculated hash value is used as the key and the data content contained in the node is used as the value to generate a key-value pair, which is then stored in the database.
Because the nodes on the MPT state tree are stored as key-value pairs whose key is the hash value of the node's data content and whose value is that data content, a node on the MPT state tree is generally queried by content addressing, using the hash value of the data content contained in the node as the key.
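The content-addressed storage described above can be sketched as follows. This is only a minimal illustration under stated assumptions: SHA-256 stands in for the hash function (the specification does not name one), a Python dict stands in for the database, and the byte strings used as node content are hypothetical placeholders rather than a real node encoding.

```python
import hashlib


def node_key(content: bytes) -> bytes:
    # Key = hash over the node's whole content (SHA-256 is an assumption).
    return hashlib.sha256(content).digest()


def put_node(db: dict, content: bytes) -> bytes:
    # Store the node as a key-value pair and return its key (the hash pointer).
    key = node_key(content)
    db[key] = content
    return key


def get_node(db: dict, key: bytes) -> bytes:
    # Content addressing: look the node up by the hash of its content.
    content = db[key]
    if node_key(content) != key:
        raise ValueError("stored content does not match its key")
    return content


db = {}
# Hypothetical placeholder encodings, for illustration only.
leaf = b"leaf|key-end=d397|value=<encoded account state>"
leaf_hash = put_node(db, leaf)
# The upper-layer node links to the leaf by embedding the leaf's hash.
parent = b"branch|7=" + leaf_hash
parent_hash = put_node(db, parent)
```

Because the upper-layer node embeds the hash of the lower-layer node, any change to the leaf content changes the leaf's key and therefore the content and key of every node above it.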
Referring to fig. 2, fig. 2 is a schematic diagram illustrating organization of contract data stored in a storage space corresponding to a contract account into an MPT storage tree according to the present disclosure.
With continued reference to table 1, the account with the account address "a77d397" shown in table 1 is a contract account; therefore, the contract data stored in the storage space corresponding to the contract account is organized into a storage tree, and the hash value S1 of the root node of the storage tree is filled into the storage root field of the leaf node corresponding to the contract account in the MPT state tree shown in fig. 1.
Assume that the contract data stored in the storage space of the contract account is as shown in table 2 below:
TABLE 2
Note that the contract data stored in the storage space of the contract account generally takes the form of state variables; when stored, the state variables may be organized as key-value pairs into the storage tree shown in fig. 2.
For example, in one example, a hash value computed from the account address of the contract account and the storage location of the state variable in the account storage of the contract account may be used as the key, and the variable value corresponding to the state variable may be used as the value.
The basic structure of the storage tree shown in fig. 2 is similar to the MPT state tree shown in fig. 1, and will not be described in detail in this specification. As can be seen from the description of fig. 1 and fig. 2, based on the tree structure design of the MPT tree, the branch node may store one of the characters in the shared character prefix of all account addresses; the extension node may store one or more characters in the shared character prefix for all account addresses.
In practical applications, the character length of the shared character prefix of the keys of all data stored on the MPT tree is generally not fixed; moreover, after new data is written to the MPT tree, the character length of the shared character prefix may change accordingly, which may cause an extension node on the MPT tree to split and produce a new branch node. That is, the splitting condition for nodes on the MPT tree is a change in the character length of the shared character prefix.
For example, taking the MPT state tree shown in fig. 1 as an example, assume that account state data for an account address whose first two characters are "a8" is newly added to the MPT state tree; the shared character prefix stored in the "Shared nibble" field of the extension node serving as the root node in fig. 1 then changes from "a7" to "a". According to the splitting condition of the nodes of the MPT state tree, the extension node serving as the root node is split into an extension node storing the shared character prefix "a" and a branch node in which the slot for the character "8" is occupied.
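The splitting condition can be illustrated with a small sketch that only computes the shared character prefix before and after a write; it does not implement an MPT tree. Apart from "a77d397", the addresses below are hypothetical values chosen for illustration.

```python
def shared_prefix(keys):
    # Longest character prefix common to every key in the list.
    prefix = keys[0]
    for key in keys[1:]:
        i = 0
        while i < min(len(prefix), len(key)) and prefix[i] == key[i]:
            i += 1
        prefix = prefix[:i]
    return prefix


# Hypothetical account addresses that all start with "a7".
addresses = ["a711355", "a77d397", "a7f9365"]
before = shared_prefix(addresses)

# Writing an address that starts with "a8" shortens the shared prefix,
# which is the condition under which the extension node has to split.
after = shared_prefix(addresses + ["a8c0521"])
split_needed = len(after) != len(before)
```

Here the shared prefix shrinks from "a7" to "a", so the write changes the shape of the tree rather than only its content.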
Once an extension node on the MPT tree is split, the number of node layers of the MPT tree changes, so the number of node layers of the MPT tree is not stable. Moreover, because the character length of the shared character prefix of the keys of all data stored on the MPT tree changes frequently as new data is written, the node splitting described above occurs frequently, which reduces data storage efficiency when new data is written to the MPT tree.
In view of this, the present specification proposes a new design for a logical tree structure.
When implemented, the logical tree structure may still include a root node, intermediate nodes, and leaf nodes; the root node and the intermediate nodes are used for storing characters of the key of the blockchain data, and the leaf nodes are used for storing the value of the blockchain data.
Unlike the MPT tree, the root node and the intermediate nodes may each include a main position and a plurality of sub-positions for storing characters of the key of the blockchain data. The main position includes a plurality of slots, each corresponding to one of the sub-positions and used for storing the hash value of the content stored in that sub-position. Each sub-position includes a plurality of slots for storing characters of the key of the blockchain data; a slot in a sub-position is used for storing the hash value of the next-layer node linked to the node. The hash value of the root node or of an intermediate node is the hash of the content stored in its main position.
When a node device in the blockchain stores blockchain data, it can obtain the key-value pairs of the blockchain data to be stored and convert them into a root node, intermediate nodes and leaf nodes on the logical tree structure; the key-value pairs of the root node, intermediate nodes and leaf nodes are then stored in a database. In the key-value pairs of the leaf nodes, intermediate nodes and root node, the value is the storage content of the node and the key is the hash value of the storage content of the node.
In the first aspect, each root node and intermediate node in the logical tree structure includes a plurality of positions respectively representing different characters, and each position further includes a plurality of slots respectively representing different characters. With this design, each root node and intermediate node has a larger data storage capacity and data bearing capacity, so that when root nodes and intermediate nodes of the logical tree structure are written to the database, or when those stored in the database are accessed, their data storage capacity better matches the single-IO read-write capacity of the storage medium carrying the database; the IO read-write capability of the storage medium can therefore be fully utilized and data read-write efficiency is improved. Moreover, the larger data storage capacity and data bearing capacity of the individual nodes also increases the overall data storage capacity and data bearing capacity of the logical tree structure, so that more blockchain data can be stored on it.
In the second aspect, each root node and intermediate node in the logical tree structure adopts a uniform data structure, so the character length of the part of the character prefix of the key of the blockchain data actually stored by the root node and each intermediate node remains fixed. This design avoids the frequent node splitting that results when the character length actually stored by the root node and intermediate nodes is not fixed, and thereby keeps the number of layers of root and intermediate nodes in the logical tree structure in a relatively stable state.
In the third aspect, the main position of the root node or an intermediate node includes a plurality of slots respectively corresponding to the sub-positions, each filled with the hash of the corresponding sub-position; each sub-position further includes a plurality of slots filled with the hash values of the next-layer nodes linked to the node; and the hash value of the root node or intermediate node may be represented by the hash of its main position. Therefore, when the hash value filled in any slot of a sub-position of the root node or an intermediate node is updated and the hash value of that node is recalculated, only the following is needed: the hash values filled in the slots of the sub-position in which the data update occurred are spliced together and hashed; the resulting hash value is filled into the slot of the main position corresponding to that sub-position; and the hashes filled in the slots of the main position are then spliced together and hashed to obtain the hash of the root node or intermediate node. The hash values filled in the slots of the other sub-positions, in which no data update occurred, do not need to be spliced and used as calculation parameters, which reduces the amount and time of hash calculation when the node's hash value is recalculated and improves the efficiency of the hash calculation.
Referring to fig. 3, fig. 3 is a flowchart of a blockchain data storage method according to an exemplary embodiment. The method comprises the following steps:
Step 302, acquiring key-value pairs of the blockchain data to be stored;
Step 304, converting the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate nodes each comprise a main position and a plurality of sub-positions for storing characters of the key of the blockchain data; the main position comprises a plurality of slots, each corresponding to one of the sub-positions and used for storing the hash value of the content stored in that sub-position; each sub-position comprises a plurality of slots for storing characters of the key of the blockchain data; a slot in a sub-position is used for storing the hash value of the next-layer node linked to the node; the hash value of the root node or of an intermediate node is the hash of the content stored in its main position;
Step 306, storing the key-value pairs of the root node, the intermediate nodes and the leaf nodes in a database; in the key-value pairs of the leaf nodes, the intermediate nodes and the root node, the value is the storage content of the node and the key is the hash value of the storage content of the node.
In this specification, in order to avoid frequent splitting of nodes, a logical tree structure is proposed in which the number of layers of the nodes that store characters of the key of the blockchain data (the first N layers) is fixed.
A logical tree structure means that only the physical data of each node on the tree structure and the link relationship data between the nodes are stored in the database, and the tree structure can be restored at the logical level from the physical data and link relationship data stored in the database.
The logical tree structure may include a root node, intermediate nodes and leaf nodes; the nodes on the logical tree structure can be linked to the nodes of the upper layer through their own hash values. The root node and the intermediate nodes are specifically used for storing at least one character of the key in the key-value pair of the blockchain data; the leaf nodes are specifically used for storing the value of the blockchain data (i.e., the specific content of the blockchain data). There may be one or more layers of intermediate nodes, which is not particularly limited in the present specification.
For example, in one example, the key of the blockchain data may still include a character prefix portion (Shared nibble) and a character suffix portion (key-end); in this case, the root node and the intermediate nodes may be used to store the characters of the character prefix, and the leaf nodes may be used to store the character suffix and the value of the blockchain data.
On the one hand, because the root node and the intermediate nodes in the logical tree structure can store characters of the key of the blockchain data, the logical tree structure has the characteristics of a Trie dictionary tree; on the other hand, because the nodes on the logical tree structure can be linked to the nodes of the upper layer through their own hash values, the logical tree structure also has the characteristics of a Merkle tree. In summary, the logical tree structure described in this specification is, like the MPT tree, a Merkle tree variant that incorporates the tree structure of a Trie dictionary tree.
Referring to fig. 4, fig. 4 is a tree structure diagram of an FDMT (Fixed Depth Merkle Tree) tree shown in the present specification. The FDMT tree is a Merkle tree variant that incorporates the tree structure of a Trie dictionary tree.
As shown in fig. 4, the tree structure of the FDMT tree shown in the present specification includes Tree nodes in the first N layers (3 layers are shown in fig. 4, which is only illustrative) and Leaf nodes (i.e., leaf nodes) in the last layer. Among the Tree nodes of the first N layers, the Tree node in the first layer serves as the root node, and the Tree nodes in the other layers serve as intermediate nodes.
Unlike the MPT tree described above, the Tree nodes (i.e., the root node and the intermediate nodes) of the first N layers of the FDMT tree adopt a unified data structure, so that these Tree nodes do not split when newly written data changes the length of the stored characters.
As shown in fig. 4, the Tree nodes of the first N layers of the FDMT tree may each include a plurality of blocks respectively representing different characters; a block is the "position" described above for storing characters of the key of the blockchain data. Each block may further include a plurality of slots respectively representing different characters; the slots are likewise used to store characters of the key of the blockchain data. For example, fig. 4 shows that each Tree node includes N blocks and each block further includes N slots. The nodes of each layer of the FDMT tree are still linked by filling the hash value (hash pointer) of the next-layer node into the node of the previous layer; that is, a node in the FDMT tree is linked to the node of the previous layer by its own hash value. Correspondingly, a slot is specifically used to fill in the hash value of the next-layer node linked to the current Tree node. The next-layer node of a Tree node may be a Tree node or a Leaf node.
In practical applications, after the hash value filled in any slot of any block of a Tree node in the FDMT tree is updated, the hash value of the Tree node usually needs to be recalculated, and the Tree node is re-linked to the node of the upper layer using the recalculated hash value.
When calculating the hash value of a Tree node, it is generally necessary to use all data filled in the Tree node as calculation parameters; therefore, when the Tree node adopts the data structure shown in fig. 4, calculating the hash value of the Tree node requires first calculating the hash of each block included in the Tree node, then splicing the hashes of the blocks together, and then performing a second hash calculation on the spliced hashes.
In this way, when the hash value filled in any slot of any block of a Tree node on the FDMT tree is updated, the other blocks of the Tree node still need to participate in the hash calculation as calculation parameters even though no data update occurred in them, so the amount of hash calculation is clearly large.
In view of this, in the present specification, in order to reduce the amount of computation when calculating the hash of a Tree node, the Tree node may specifically adopt a data structure consisting of a main block (i.e., the main position) and a plurality of sub-blocks (i.e., the sub-positions).
Referring to fig. 5, fig. 5 is a tree structure diagram of another FDMT tree shown in the present specification.
As shown in fig. 5, the tree structure of the FDMT tree shown in the present specification still includes Tree nodes in the first N layers and leaf nodes in the last layer.
Unlike the FDMT tree structure shown in fig. 4, the Tree nodes of the first N layers of the FDMT tree shown in fig. 5 may include a main block (i.e., the root block shown in fig. 5) and a plurality of sub-blocks respectively representing different characters; each block may further include a plurality of slots. For example, fig. 5 shows that each Tree node includes a main block and N sub-blocks, and each sub-block further includes N slots.
The slots contained in the main block and in the sub-blocks serve different functions. As shown in fig. 5, the main block may include a plurality of slots respectively corresponding to the sub-blocks, and each slot is specifically used to fill in the hash of the data content stored in the corresponding sub-block.
For example, when calculating the hash of any sub-block, the hash values filled in the slots of the sub-block may be spliced together, and a second hash calculation is then performed on the spliced hashes to obtain the hash of the sub-block.
A sub-block may include a plurality of slots respectively representing different characters, and each slot is specifically used to fill in the hash value of a next-layer node linked to the Tree node.
In the present specification, with the Tree node structure shown in fig. 5, the hash value of the main block in a Tree node may be used to represent the hash value of the Tree node. In this case, when the hash value filled in a slot of any sub-block of a Tree node on the FDMT tree is updated and the hash value of the Tree node is recalculated, only the following is needed: the hash values filled in the slots of the sub-block in which the data update occurred are spliced together and hashed; the resulting hash value is filled into the slot of the main block corresponding to that sub-block; and the hashes filled in the slots of the main block are then spliced together and hashed to obtain the hash of the Tree node.
Throughout this process, the hash values filled in the slots of the other sub-blocks, in which no data update occurred, do not need to be spliced and used as calculation parameters, which reduces the amount and time of hash calculation when the hash value of the Tree node is recalculated and improves the efficiency of the hash calculation.
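The recalculation procedure can be sketched as follows. This is an illustrative sketch rather than the implementation of the specification: SHA-256 as the hash function, an all-zero 32-byte value for an unfilled slot, and the names `TreeNode`, `update_slot` and `full_rehash` are all assumptions introduced here.

```python
import hashlib

EMPTY = b"\x00" * 32  # assumption: placeholder for an unfilled slot


def h(data: bytes) -> bytes:
    # SHA-256 is an assumption; the specification does not name a hash function.
    return hashlib.sha256(data).digest()


class TreeNode:
    def __init__(self, n_sub: int = 16, n_slots: int = 16):
        # sub[b][s] holds the hash of the next-layer node linked via sub-block b, slot s.
        self.sub = [[EMPTY] * n_slots for _ in range(n_sub)]
        # main[b] holds the hash of the content stored in sub-block b.
        self.main = [h(b"".join(slots)) for slots in self.sub]

    def node_hash(self) -> bytes:
        # The hash of the node is the hash of the main block's content.
        return h(b"".join(self.main))

    def update_slot(self, block: int, slot: int, child_hash: bytes) -> bytes:
        self.sub[block][slot] = child_hash
        # Re-hash only the sub-block in which the update occurred ...
        self.main[block] = h(b"".join(self.sub[block]))
        # ... then re-hash the main block; other sub-blocks are not touched.
        return self.node_hash()

    def full_rehash(self) -> bytes:
        # Reference computation that re-hashes every sub-block, for comparison.
        return h(b"".join(h(b"".join(slots)) for slots in self.sub))


node = TreeNode()
before = node.node_hash()
after = node.update_slot(6, 4, h(b"next-layer node"))
```

With 16 sub-blocks, `update_slot` performs two hash computations (one sub-block plus the main block), whereas `full_rehash` performs seventeen; both yield the same node hash.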
It should be noted that the link relationships between the nodes at each level in the FDMT tree shown in fig. 4 and 5 are merely illustrative, and do not refer to a specific limitation on the link relationships between the nodes at each level in the FDMT tree.
With continued reference to fig. 4 and 5, each Tree node on the FDMT Tree shown in fig. 4 and 5 may be used to store at least a portion of the characters in the key of the blockchain data.
For each Tree node on the FDMT tree shown in fig. 4, the characters actually stored are the character string generated by concatenating the character represented by a non-empty block in the Tree node (i.e., a block with at least one slot filled with a hash value) with the character represented by a non-empty slot in that block (i.e., a slot filled with a hash value).
For each Tree node on the FDMT tree shown in fig. 5, the characters actually stored are the character string generated by concatenating the character represented by a sub-block in the Tree node with the character represented by a slot in that sub-block that is filled with a hash value.
In practical applications, each block in a Tree node may represent only one character; that is, with the storage format of the Tree nodes shown in fig. 4 and 5, each Tree node actually stores a part of the character prefix of the key of the blockchain data in the form of character strings two characters long.
For example, referring to fig. 6, fig. 6 is a structure diagram of a Tree node shown in the present specification.
As shown in fig. 6, the Tree node contains 16 sub-blocks respectively representing different 16-ary characters; each sub-block further includes 16 slots respectively representing different 16-ary characters (fig. 6 shows only the 16 slots contained in block 6). Assume that sub-block 6 (representing the 16-ary character 6) in the Tree node is a non-empty block, and that slot 4 (representing the 16-ary character 4), slot 6 (representing the 16-ary character 6) and slot 9 (representing the 16-ary character 9) in that sub-block are non-empty slots filled with the hash values of next-layer nodes linked to the Tree node; the Tree node then stores the 16-ary character strings "64", "66" and "69" as parts of the character prefix of the key of the blockchain data.
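The example of fig. 6 can be reproduced with a short sketch. The 16×16 nested-list representation, the use of `None` for an empty slot, and the placeholder hash bytes are assumptions made for illustration.

```python
HEX = "0123456789abcdef"


def stored_strings(node):
    # node[b][s] is None for an empty slot, otherwise the hash of the
    # next-layer node. Each non-empty slot contributes the two-character
    # string "<block character><slot character>".
    return [
        HEX[b] + HEX[s]
        for b in range(16)
        for s in range(16)
        if node[b][s] is not None
    ]


# Sub-block 6 is non-empty: slots 4, 6 and 9 hold hashes of next-layer nodes.
node = [[None] * 16 for _ in range(16)]
for slot in (4, 6, 9):
    node[6][slot] = b"<hash of next-layer node>"  # placeholder value

strings = stored_strings(node)
```

In this representation the characters themselves are never written into the node; they are implied by the indexes of the sub-block and slot that hold a hash value.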
The number of sub-blocks included in a Tree node and the number of slots included in each sub-block are not particularly limited in the present specification; in practical applications, both numbers may be determined based on the number of types of character elements contained in the character string corresponding to the key of the blockchain data.
For example, assume that the key of the blockchain data is a 16-ary character string; the number of types of character elements contained in the character string corresponding to the key is then 16, and the number of sub-blocks included in a Tree node and the number of slots included in each sub-block may both be 16.
In practical applications, the number of sub-blocks included in a Tree node and the number of slots included in each sub-block may be the same.
For example, in one example, where the character string corresponding to the key of the blockchain data is a 16-ary character string, the Tree nodes of the first N layers may each include 16 sub-blocks respectively representing different 16-ary characters, and each sub-block may further include 16 slots respectively representing different 16-ary characters.
In this way, a single Tree node in each of the N preceding layers of the FDMT Tree may have 16×16=256 slots, and it is apparent that the Tree node on the FDMT Tree shown in fig. 4 will have a larger storage capacity than the node for storing the character prefix on the MPT Tree shown in fig. 1.
Of course, in practical applications, the number of sub-blocks included in a Tree node may also differ from the number of slots included in each sub-block.
For example, in practical applications, the character string corresponding to the key of the blockchain data may be composed of characters of two different numeral bases, such as a mixture of 16-ary characters and 10-ary characters. In this case, the Tree nodes of the first N layers may each include 16 sub-blocks respectively representing different 16-ary characters, with each sub-block further including 10 slots respectively representing different 10-ary characters; or the Tree nodes of the first N layers may each include 10 sub-blocks respectively representing different 10-ary characters, with each sub-block further including 16 slots respectively representing different 16-ary characters.
In the illustrated embodiment, the number N of layers of Tree nodes included in the FDMT tree may be a fixed value; in practical applications, N may be an integer greater than or equal to 1. That is, the FDMT tree may be a Merkle tree that includes at least one layer of Tree nodes and in which the number of layers of Tree nodes is relatively fixed.
For example, in one example, where the key of the blockchain data is a blockchain account address, assume that the blockchain account addresses supported by the blockchain system are designed such that the first 6 address characters may be identical; in this case, the first 6 address characters of the blockchain account address may be used as its character prefix. Moreover, since each Tree node stores 2 characters of the character prefix of the blockchain account address, the FDMT tree can be designed as a tree structure containing three layers of Tree nodes.
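How a key maps onto three fixed layers can be sketched as follows. The function name `split_key` and the representation of each layer as a (sub-block index, slot index) pair are assumptions for illustration, and the short address "a77d397" is used only as sample input.

```python
def split_key(key: str, layers: int = 3):
    # Each Tree node layer consumes two characters of the character prefix:
    # the first selects the sub-block, the second selects the slot.
    prefix_len = 2 * layers
    if len(key) < prefix_len:
        raise ValueError("key is shorter than the character prefix")
    prefix, suffix = key[:prefix_len], key[prefix_len:]
    path = [
        (int(prefix[i], 16), int(prefix[i + 1], 16))
        for i in range(0, prefix_len, 2)
    ]
    # The remaining characters (the character suffix) go to the leaf node.
    return path, suffix


path, suffix = split_key("a77d397")
```

The length of the path depends only on the number of layers and not on the data already stored, which is why writing new keys does not change the depth of the tree.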
Based on the tree structure design of the FDMT tree shown in fig. 4 and 5, in the first aspect, each Tree node in the FDMT tree includes a plurality of blocks respectively representing different characters, and each block further includes a plurality of slots respectively representing different characters. With this design, each Tree node has a larger data storage capacity and data bearing capacity, so that when Tree nodes of the FDMT tree are written to the database, or when Tree nodes stored in the database are accessed, the data storage capacity of a Tree node better matches the single-IO read-write capacity of the storage medium carrying the database; the IO read-write capability of the storage medium can therefore be fully utilized and data read-write efficiency is improved. Moreover, the larger data storage capacity and data bearing capacity of a single Tree node also increases the overall data storage capacity and data bearing capacity of the FDMT tree, so that more blockchain data can be stored on it.
For example, assume that the storage medium carrying the database is a disk with a physical sector size of 4 KB and that the capacity of a single IO read or write of the disk is 4 KB (one sector). For one Branch Node on the MPT tree shown in fig. 1, assuming that all 16 fields of the Branch Node are filled with 32-byte hash values, the data storage capacity of the Branch Node is about 32 bytes × 16 = 512 bytes. Obviously, a single IO read of a Branch Node can fetch at most about 512 bytes, which is far less than the 4 KB that the disk can read in a single IO, so the IO read capability of the disk is not fully utilized and performance is seriously wasted.
However, if the tree structure design of the FDMT tree shown in fig. 4 and 5 is adopted, and assuming that the Tree nodes of the first N layers each include 16 blocks respectively representing different 16-ary characters and each block further includes 16 slots respectively representing different 16-ary characters, a single Tree node in each of the first N layers of the FDMT tree has 16 × 16 = 256 slots. Assuming that each slot is filled with a 32-byte hash value, the maximum storage capacity of one Tree node with all slots full is 256 × 32 bytes = 8192 bytes = 8 KB, which is exactly the capacity of two sectors. Obviously, the data storage capacity of each Tree node in the FDMT tree shown in fig. 4 and 5 better matches the single-IO read-write capacity of the disk, so the IO read-write capability of the disk can be fully utilized and data read-write efficiency is improved.
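The sizing arithmetic in this comparison can be checked with a few lines. The constants mirror the assumptions stated in the text (32-byte hash values, 4 KB sectors); the variable names are introduced here for illustration.

```python
HASH_BYTES = 32          # size of one hash value, as assumed in the text
SECTOR_BYTES = 4 * 1024  # one physical sector / one IO, as assumed in the text

# MPT Branch Node: 16 fields, each filled with one hash value.
branch_bytes = 16 * HASH_BYTES

# FDMT Tree node: 16 blocks x 16 slots, each slot filled with one hash value.
tree_node_bytes = 16 * 16 * HASH_BYTES

# Fraction of a single IO that one full Branch Node occupies.
branch_io_utilization = branch_bytes / SECTOR_BYTES

# Number of whole sectors occupied by one full Tree node.
tree_node_sectors = tree_node_bytes // SECTOR_BYTES
```

A full Branch Node fills one eighth of a sector, while a full Tree node fills exactly two sectors with no remainder.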
Moreover, the larger data storage capacity and data bearing capacity of a single Tree node on the FDMT tree also increases the overall data storage capacity and data bearing capacity of the FDMT tree, so that more blockchain data can be stored on it.
For example, assume that the FDMT tree shown in fig. 4 and 5 contains 3 layers of Tree nodes, that each Tree node includes 16 sub-blocks respectively representing different 16-ary characters, and that each sub-block further includes 16 slots respectively representing different 16-ary characters; a single Tree node in each layer then has 16 × 16 = 256 slots. The three layers of Tree nodes can then carry 256 × 256 × 256 ≈ 16.77M character combinations and can link 16.77M bucket data blocks in total. Assuming that each bucket data block is user-defined to carry 16 data records, the whole FDMT tree can carry at most 16.77M × 16 pieces of blockchain data. It is apparent that the FDMT tree shown in fig. 4 and 5 has a greater data bearing capacity than the MPT tree shown in fig. 1 and can store more blockchain data.
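The capacity figures follow from raising the per-node slot count to the number of layers, as the short calculation below shows. The variable names are introduced here for illustration.

```python
slots_per_node = 16 * 16   # 16 sub-blocks x 16 slots
layers = 3                 # layers of Tree nodes
records_per_bucket = 16    # user-defined bucket capacity assumed in the text

# Each layer multiplies the number of distinct paths by the slots per node.
max_buckets = slots_per_node ** layers

# Upper bound on blockchain data records carried by the whole tree.
max_records = max_buckets * records_per_bucket
```

Here `max_buckets` is 16,777,216 (the 16.77M in the text) and `max_records` is 268,435,456.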
In the second aspect, each Tree node in the FDMT tree adopts a uniform data structure, so for the Tree nodes of each layer of the FDMT tree, the character length of the part of the character prefix of the key of the blockchain data that is actually stored remains fixed. This design avoids the frequent node splitting that results when the character length actually stored by the Tree nodes of each layer is not fixed, and thereby keeps the number of layers of Tree nodes in the FDMT tree in a relatively stable state.
In the third aspect, with this design each Tree node has a larger data storage capacity and data bearing capacity, and the number of layers of Tree nodes in the FDMT tree is relatively stable; together, these two properties ensure, to some extent, that the FDMT tree has fewer layers of Tree nodes. Therefore, when the system cold-starts and needs to load the Tree nodes of the first N layers of the FDMT tree from the storage medium carrying the database into memory as frequently read data, both the number of IO reads required and the overall time taken to load those Tree nodes into memory can be significantly reduced, which substantially shortens the startup latency of a cold start.
For example, FDMT shown in fig. 4 and 5 is fixed for 3 layers of Tree nodes, and the Tree node of each layer includes 16 blocks representing different 16-ary characters respectively; each block further comprises 16 slots respectively representing different 16-system characters, and in the case of full load of all slots, the maximum storage capacity of one Tree node is 256 x 32 byte=8192 byte=8 kb; for the MPT tree shown in FIG. 1, the number of layers of the MPT tree is not fixed because it requires frequent node splitting; moreover, because of its single Branch Node's storage capacity 512 bytes, which is much smaller than the Tree Node on FDMT shown in FIGS. 4 and 5; tends to result in MPT trees having a greater number of layers (e.g., MPT trees can reach up to 64 layers, much greater than 3 layers). When the system is started in a cold mode and N layers in front of FDMT trees are read into the memory, the system is usually read layer by layer; thus, based on the MPT tree shown in FIG. 1, it is apparent that more reads are required.
Moreover, because the storage capacity of a single Branch Node on the MPT tree is 512 bytes, only one eighth of the 4 KB of a single physical sector of the storage medium carrying the database, its read efficiency is very low; thus, even if the same data is stored in the MPT tree of fig. 1 and in the FDMT tree of fig. 4 and 5, at cold start the system may need at least 8 times as many IO reads for the MPT tree as for the FDMT tree.
Obviously, at cold start, the number of IO reads the system performs for the FDMT tree shown in fig. 4 and 5 is far smaller than for the MPT tree shown in fig. 1; thus, the FDMT tree design shown in fig. 4 and 5 is friendlier to system cold starts.
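The capacity figures used in this comparison can be checked with a few lines of arithmetic. The following is only an illustrative sketch: the constants are taken from the example above, and the variable names are hypothetical.

```python
# Illustrative arithmetic only; all constants come from the example above.
HASH_SIZE = 32          # bytes per slot (one hash value)
BLOCKS_PER_NODE = 16    # blocks in one Tree node
SLOTS_PER_BLOCK = 16    # slots in one block
SECTOR_SIZE = 4096      # bytes in one physical sector (4 KB)
BRANCH_NODE_SIZE = 512  # bytes in one MPT Branch Node, as stated above

# Maximum capacity of one fully loaded Tree node.
tree_node_capacity = BLOCKS_PER_NODE * SLOTS_PER_BLOCK * HASH_SIZE
# Number of whole sectors one full Tree node occupies.
sectors_per_tree_node = tree_node_capacity // SECTOR_SIZE
# Number of Branch Nodes that would fit in one sector.
branch_nodes_per_sector = SECTOR_SIZE // BRANCH_NODE_SIZE

print(tree_node_capacity)       # 8192
print(sectors_per_tree_node)    # 2
print(branch_nodes_per_sector)  # 8
```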
In one embodiment, the character string corresponding to the key of the blockchain data may still include a character prefix and a character suffix; in this case, the above Tree node may be used to store characters in the character prefix of the key of the blockchain data; the leaf node may be used to store a character suffix for a key of the blockchain data and a Value for the blockchain data.
It should be noted that, because the leaf node actually stores data, the leaf node generally has a larger data capacity than the Tree node; for example, the value of the blockchain data actually stored by the leaf node is usually the original content of the blockchain data, and the original content of the blockchain data occupies a larger storage space than the character prefix of the blockchain data; therefore, in this specification, in order to ensure that the leaf node can have a larger data capacity, the leaf node may specifically store data in the form of a large data block.
The specific form and storage structure of the data block are not particularly limited in the present specification;
in one embodiment shown, the leaf nodes may take the form of buckets (data buckets); a bucket is a container or storage space for storing data.
Referring to fig. 7, fig. 7 is a structural diagram of a bucket shown in the present specification;
as shown in fig. 7, the bucket (i.e., the bucket node shown in fig. 7) may include several data records; it should be noted that the data records included in the bucket need not form one logical whole, and may instead be a plurality of logically separate data records. Each data record corresponds to a piece of blockchain data and is used for storing the value of the blockchain data; that is, a data record is a stored record whose data content includes the value of the blockchain data.
It should be noted that saying the data records included in the bucket do not form one logical whole means that each data record may correspond to an independent query key value (key), so that each data record in the bucket can be queried precisely based on its own query key value, without reading all the data records stored in the bucket as a whole.
The specific form and specific content of the query key value corresponding to each data record are not particularly limited in this specification, and may be any form of character string that can be used as a query index for each data record.
In one embodiment, the query key value corresponding to each data record may specifically be a hash value of the data content contained in each data record; the data record included in the bucket data may specifically include a key-value key pair formed by a hash value of the blockchain data and a data content corresponding to the value of the blockchain data.
Of course, in practical application, the query key value corresponding to each data record may be a character string other than the hash value, which can be used as a query index of each data record, and the description is not particularly limited; for example, in one example, the query key value corresponding to each data record may specifically be a unique identifier (such as a number) set by the node device for each data record.
In one embodiment shown, if the Tree node is used to store characters in a character prefix of a key of the blockchain data; the leaf node is used for storing the character suffix of the key of the blockchain data and the Value of the blockchain data;
correspondingly, the data record included in the bucket data may specifically be a key-value key pair formed by a hash value obtained by performing a hash calculation on the whole data content corresponding to the value of the blockchain data and the character suffix of the key of the blockchain data.
In practical applications, the data record may be in a form of data other than key-value key value pairs, and this is not specifically described in the present specification.
With the above embodiment, the plurality of data records contained in a leaf node of the FDMT tree are not one logical whole, but logically separate key-value pairs; therefore, the data contained in a leaf node of the FDMT tree is not accessed as a whole in the database. Instead, each key-value pair contained in the leaf node serves as a separate access unit in the database, which makes data access to the database more flexible;
For example, take the value of the blockchain data to be the latest account state data corresponding to a blockchain account in the blockchain, and the key of the blockchain data to be the blockchain account address. In this case, the key of the key-value pair corresponding to a data record in the bucket may be the hash value of two pieces of data content, namely the character suffix of the blockchain account address and the corresponding account state data; the value of the key-value pair may be the character suffix of the blockchain account address together with the data content of the corresponding account state data.
Suppose the character suffix of a specific account address and the corresponding account state data contained in the bucket need to be read. Because each key-value pair in the bucket is an independent access unit, it is only necessary to perform content addressing in the database using the hash value of the two pieces of data content, namely the character suffix of the specific account address and the corresponding account state data; there is no need to first read all the data content contained in the leaf node from the database into memory and then search in memory for the character suffix and account state data to be read;
Correspondingly, if the character suffix of a new account address and the corresponding account state data need to be written into the bucket, or the account state data corresponding to a specific account address contained in the bucket needs to be updated, a key-value pair can be constructed directly from the character suffix of the new account address and the corresponding account state data and written into the bucket; or content addressing can be performed in the database based on the hash value of the character suffix of the specific account address and the corresponding account state data, the corresponding key-value pair found, and its original value overwritten with the updated account state data corresponding to the specific account address.
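The per-record access described above can be illustrated with a minimal sketch. It assumes an in-memory dictionary as a stand-in for the database and SHA-256 as the hash function; neither is specified in this description, and the names `database`, `record_key`, `put_record` and `get_record` are hypothetical.

```python
import hashlib

# Hypothetical in-memory stand-in for the key-value database.
database = {}

def record_key(suffix: str, state: bytes) -> str:
    """Hash of the character suffix plus the account state (SHA-256 is an assumption)."""
    return hashlib.sha256(suffix.encode() + state).hexdigest()

def put_record(bucket: list, suffix: str, state: bytes) -> str:
    """Write one data record as its own key-value pair and list its key in the bucket."""
    key = record_key(suffix, state)
    database[key] = (suffix, state)
    bucket.append(key)
    return key

def get_record(key: str):
    """Read a single data record by content addressing, without loading the whole bucket."""
    return database[key]

bucket = []
k1 = put_record(bucket, "a1b2", b"balance=10")
k2 = put_record(bucket, "c3d4", b"balance=25")
print(get_record(k2))  # ('c3d4', b'balance=25')
```

In this sketch the bucket only lists record keys, so reading one record touches a single database entry.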
The number of data records included in the bucket is not particularly limited in the present specification; in one implementation manner, the number of data records contained in the bucket data may be specifically configured by a user in a user-defined manner.
For example, taking the blockchain data as the latest account state data corresponding to the blockchain account in the blockchain, and taking the key of the blockchain data as the blockchain account address as an example, in this case, each data record in the bucket corresponds to the account state of one blockchain account respectively; the number of data records in the bucket actually represents the account bearing capacity of the bucket for accommodating the blockchain account; therefore, the user can flexibly customize the account bearing capacity of the bucket through customizing the number of the data records which can be accommodated by the bucket; for example, in one example, the number of data records contained in the bucket may be configured by a user to be 16 or 64 such that a single bucket may carry state data for 16 or 64 blockchain accounts.
Note that, the total storage capacity of the data records stored in the bucket is not particularly limited in the present specification; in one implementation shown, the total storage capacity of the data records stored in the bucket may be specifically configured by a user.
For example, in implementation, taking as the storage medium carrying the database a disk whose single physical sector is 4 KB in size, the user may set the maximum storage capacity of the data records stored in the bucket to 4 KB, or to an integer multiple of 4 KB, so that the maximum storage capacity of the bucket matches the storage capacity of a single physical sector of the storage medium.
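As a minimal sketch of this sector alignment, a requested bucket capacity can be rounded up to a whole number of sectors. The function name `aligned_capacity` is hypothetical, and the 4 KB sector size is the one assumed in the example above.

```python
SECTOR_SIZE = 4096  # bytes; the 4 KB physical sector from the example above

def aligned_capacity(requested_bytes: int, sector_size: int = SECTOR_SIZE) -> int:
    """Round a requested bucket capacity up to a whole number of sectors."""
    if requested_bytes <= 0:
        raise ValueError("capacity must be positive")
    sectors = -(-requested_bytes // sector_size)  # ceiling division
    return sectors * sector_size

print(aligned_capacity(3000))  # 4096
print(aligned_capacity(4096))  # 4096
print(aligned_capacity(5000))  # 8192
```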
In this specification, as described above, each Tree node of the first N layers on the FDMT Tree adopts a unified data structure; thus, each Tree node of the first N layers on the FDMT Tree will no longer experience node splitting. And the leaf nodes on the FDMT tree can still perform node splitting when a certain splitting condition is met.
Unlike the MPT tree, in the present specification, the leaf nodes on the FDMT tree may be split according to the storage capacity of the leaf nodes instead of being split according to the character length of the shared character prefix. When the storage capacity of the leaf node on the FDMT tree meets the node splitting condition, an intermediate node is split from the leaf node. The split intermediate node is an upper node of the leaf node. Multiple splits may be performed for the same leaf node.
The node splitting condition may include any type of condition related to the storage capacity of the leaf node, and is not particularly limited in the present specification;
In the illustrated embodiment, the node splitting condition may specifically be any one of condition 1 and condition 2 illustrated below; or may be a combination of condition 1 and condition 2 shown below:
condition 1: the total number of data records stored in the leaf node is greater than a threshold;
Condition 2: the total storage capacity of the data records stored in the leaf nodes is greater than a threshold.
For example, in one example, the node splitting condition may be set such that when the total number of data records stored in the leaf node is greater than 64; and/or, when the total storage capacity of the data records stored in the leaf node is greater than 4KB, splitting an intermediate node from the leaf node.
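The check implied by conditions 1 and 2 can be sketched as follows. The thresholds are the example values above; the function name `should_split` and the `require_both` switch (modelling the "and/or" combination) are hypothetical.

```python
MAX_RECORDS = 64      # threshold for condition 1 (value from the example above)
MAX_BYTES = 4 * 1024  # threshold for condition 2 (value from the example above)

def should_split(records: list, require_both: bool = False) -> bool:
    """Return True when a leaf node's stored records meet the splitting condition.

    `records` is a list of byte strings, one per data record.
    """
    too_many = len(records) > MAX_RECORDS
    too_large = sum(len(r) for r in records) > MAX_BYTES
    return (too_many and too_large) if require_both else (too_many or too_large)

small_leaf = [b"x" * 50] * 10    # 10 records, 500 bytes in total
crowded_leaf = [b"x" * 50] * 65  # 65 records, 3250 bytes in total
print(should_split(small_leaf))                       # False
print(should_split(crowded_leaf))                     # True
print(should_split(crowded_leaf, require_both=True))  # False
```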
In the present specification, the extended node, which is also the intermediate node of the FDMT Tree, may adopt the same data structure as the Tree node described above, and is specifically used to store characters split from the character suffix stored in the leaf node;
For example, an extended node split from a leaf node may also include a plurality of blocks, each representing a different character; each block may further include a plurality of slots, each representing a different character; a slot may be filled with the hash value of the next-layer node linked to the node. For the extended node, the next-layer node is a leaf node; therefore, the slots contained in the blocks of the extended node may be filled with the hash values of the next-layer leaf nodes linked to the extended node.
In the present specification, the extended node split from the leaf nodes may be merged with the leaf node of the lower layer to which the extended node is linked according to the storage capacity of the extended node. When the storage capacity of the extended node on the FDMT tree satisfies the node merge condition, the extended node may be merged to the leaf node of the lower layer linked thereto.
The merging of the extended node into the leaf node of the lower layer linked with the extended node means that the characters stored in the extended node are written into one or more leaf nodes of the lower layer linked with the extended node as the character suffix.
The node combination condition may include any type of condition related to the storage capacity of the extended node, and is not particularly limited in the present specification;
In the illustrated embodiment, the node merging condition may specifically be any one of condition 3 and condition 4 illustrated below; or may be a combination of condition 3 and condition 4 shown below:
Condition 3: the total number of characters stored in the extended node is less than or equal to a threshold; that is, the total number of non-empty slots (i.e., slots filled with hash values) across the blocks in the extended node is less than or equal to the threshold;
Condition 4: the total storage capacity of the characters stored in the extended node is less than or equal to a threshold; that is, the total storage capacity of the hash values filled in the non-empty slots across the blocks in the extended node is less than or equal to the threshold.
For example, in one example, the node merge condition may be set such that when the total number of characters stored in the extended node is less than or equal to 1; and/or when the total storage capacity of the characters stored in the extended node is less than or equal to 1KB, writing the characters stored in the extended node into one or more leaf nodes of the lower layer of the extended node as character suffixes.
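Conditions 3 and 4 can be sketched in the same way. The sketch below assumes a 32-byte hash per filled slot (as in the Tree node example) and 16 blocks of 16 slots; the names `should_merge` and `make_node` are hypothetical.

```python
HASH_SIZE = 32        # bytes per filled slot (assumed, as for Tree nodes)
MAX_FILLED_SLOTS = 1  # threshold for condition 3 (value from the example above)
MAX_BYTES = 1024      # threshold for condition 4 (value from the example above)

def should_merge(blocks: list, require_both: bool = False) -> bool:
    """Return True when an extended node meets the merging condition.

    `blocks` is a list of blocks; each block is a list of slots holding
    either a hash value (bytes) or None for an empty slot.
    """
    filled = sum(1 for block in blocks for slot in block if slot is not None)
    few_slots = filled <= MAX_FILLED_SLOTS
    small_size = filled * HASH_SIZE <= MAX_BYTES
    return (few_slots and small_size) if require_both else (few_slots or small_size)

def make_node(filled_count: int) -> list:
    """Build a 16 x 16 extended node with the first `filled_count` slots filled."""
    blocks = [[None] * 16 for _ in range(16)]
    for i in range(filled_count):
        blocks[i // 16][i % 16] = bytes(HASH_SIZE)
    return blocks

print(should_merge(make_node(1)))                     # True
print(should_merge(make_node(5), require_both=True))  # False
print(should_merge(make_node(40)))                    # False
```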
As can be seen from the above description, the extended node described in the present specification may be a dynamically scalable intermediate node on the FDMT tree; when the storage capacity of any leaf node on the FDMT tree meets the node splitting condition, at least one extended node and at least one leaf node can be split from the leaf node; when the storage capacity of any extended node satisfies the node merge condition, the extended node may be merged to the leaf node of the lower layer linked thereto.
In this specification, the blockchain data to be stored may specifically include at least the following four types of data:
Transactions recorded in the block; after the transaction in the block is executed, a transaction receipt corresponding to the transaction recorded in the block; after the transaction in the block is executed, the latest account state data corresponding to the block chain account in the block chain; the storage content of the intelligent contract account;
correspondingly, the FDMT tree may specifically include:
a transaction tree for storing transactions recorded in the blocks; a receipt tree for storing transaction receipts corresponding to transactions recorded in the blocks; a state tree for storing the latest account state data corresponding to blockchain accounts in the blockchain; a storage tree for storing the storage content of smart contract accounts.
Of course, in practical application, the FDMT tree structure may be used to store only some of the four types of data described above; for example, the FDMT tree may be used only to store the latest account state data corresponding to blockchain accounts, while the other types of data are stored using other forms of tree (e.g., an MPT tree or another tree variant).
Wherein the hash values of the root nodes of the transaction tree, receipt tree and state tree may be stored in the block header; the hash value of the root node of the storage tree may be stored in the Storage field in the structure of the contract account corresponding to the storage tree.
The key of the blockchain data may specifically refer to a lookup key value corresponding to the blockchain data in the database; correspondingly, the Value of the blockchain data may be specifically the original content of the blockchain data;
In practical application, the search key value may specifically be a character string corresponding to the blockchain data; when the blockchain data are different types of data, certain differences exist in the corresponding keys;
For example, when the blockchain data is a transaction recorded in a block, or a transaction receipt corresponding to a transaction recorded in the block after that transaction is executed, the key corresponding to the blockchain data may specifically be the sequence number of the transaction in the block, or another form of transaction identifier.
When the blockchain data is the latest account status data corresponding to the blockchain account in the blockchain after the transaction in the block is executed, the key corresponding to the blockchain data at this time may be specifically the account address of the blockchain account.
When the blockchain data is the storage content of the intelligent contract account, the key corresponding to the blockchain data at the moment can be specifically a hash value of the account address of the contract account and the storage position of the storage content in the account storage of the contract account; for example, in one example, the storage content may be a state variable, and the account address of the contract account and the hash value of the storage location of the state variable in the account storage of the contract account may be used as a key.
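The key choices for the four data types can be summarized in a short sketch. It is illustrative only: the function name `lookup_key`, its parameter names, the way the address and storage location are concatenated, and the use of SHA-256 are all assumptions not stated in this description.

```python
import hashlib

def lookup_key(kind: str, **fields) -> str:
    """Return the lookup key for one piece of blockchain data (hypothetical helper)."""
    if kind in ("transaction", "receipt"):
        # Sequence number of the transaction in the block.
        return str(fields["tx_index"])
    if kind == "state":
        # Account address of the blockchain account.
        return fields["account_address"]
    if kind == "storage":
        # Hash of the contract address and the storage location (encoding assumed).
        data = fields["contract_address"] + ":" + str(fields["position"])
        return hashlib.sha256(data.encode()).hexdigest()
    raise ValueError(f"unknown data type: {kind}")

print(lookup_key("transaction", tx_index=3))         # 3
print(lookup_key("state", account_address="a1b2c3"))  # a1b2c3
```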
In the present specification, when a node device in a blockchain stores blockchain data, a key-value key value pair of the blockchain data to be stored may be acquired first; for example, in one example, the node device may, upon acquiring blockchain data to be stored, process the blockchain data into key-value pairs;
After the key-value key value pair of the blockchain data to be stored is obtained, the key-value key value pair of the blockchain data to be stored can be converted into a root node, an intermediate node and a leaf node on the logical tree structure;
For example, in one example, the node device may carry a storage interface or a storage service corresponding to the FDMT tree, where the storage interface or the storage service may be specifically configured to convert a key-value key pair of the blockchain data to be stored into a node on the FDMT tree; the node device may convert the key-value key pair of the blockchain data to be stored into the root node, the intermediate node and the leaf node in fig. 4 or fig. 5 by calling the storage interface or the storage service according to the tree structure of FDMT tree shown in fig. 4 or fig. 5 after obtaining the key-value key pair of the blockchain data to be stored.
After the key-value pairs of the blockchain data to be stored are converted into the root node, intermediate nodes and leaf nodes on the above logical tree structure, the root node, intermediate nodes and leaf nodes may be stored in the database in the form of key-value pairs;
For example, the database is typically stored in a persistent storage medium (e.g., a storage disk) mounted on the node device; the storage medium is a physical storage corresponding to the database; when the FDMT tree is stored in the database, the nodes on the FDMT tree can be further written into a storage medium carrying the database from the memory of the node device in the form of Key-Value Key Value pairs by executing a commit command.
The specific type of the database is not particularly limited in the present specification, and a person skilled in the art can flexibly select the database based on actual requirements;
in one implementation manner, the database may be a Key-Value database; for example, in one example, the database may be a LevelDB database with a multi-level storage structure, or a database based on the LevelDB architecture; for example, the RocksDB database is a typical database based on the LevelDB architecture.
Note that, when a node on the FDMT tree is stored in the database in the form of a Key-Value pair, the Key of the Key-Value pair may specifically be the node ID of that node on the FDMT tree;
The node ID may specifically include identification information capable of uniquely identifying a node on the FDMT tree;
for example, in one implementation, the node ID may specifically be a hash value of the data content contained in the node on the FDMT tree; in this case, when a node on the FDMT tree needs to be queried, content addressing can be performed based on a hash value of data content contained in the node as a key.
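Content addressing of this kind can be sketched as follows, assuming an in-memory dictionary in place of the Key-Value database and SHA-256 as the hash function (both are assumptions); `store_node` and `load_node` are hypothetical names.

```python
import hashlib

# Hypothetical in-memory stand-in for a Key-Value database such as LevelDB.
database = {}

def store_node(encoded_node: bytes) -> str:
    """Store an encoded node under the hash of its content and return that node ID."""
    node_id = hashlib.sha256(encoded_node).hexdigest()
    database[node_id] = encoded_node
    return node_id

def load_node(node_id: str) -> bytes:
    """Fetch a node by its ID and check that the content still matches the ID."""
    encoded_node = database[node_id]
    if hashlib.sha256(encoded_node).hexdigest() != node_id:
        raise ValueError("stored content does not match node ID")
    return encoded_node

node_id = store_node(b"encoded tree node")
print(load_node(node_id))  # b'encoded tree node'
```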
In another implementation manner, the node ID may specifically include path information of the node on the FDMT tree in the FDMT tree; that is, the path information of the node on the FDMT tree in the FDMT tree is set as the node ID of the node; the path information may specifically include any form of information that can describe a link relationship between a node and other nodes, and a position of the node on the FDMT tree;
for example, referring to fig. 8, fig. 8 is a schematic diagram illustrating setting node IDs for nodes on a FDMT tree according to the present specification; for FDMT trees containing three layers of Tree nodes as shown in fig. 8, assuming that the node ID of the Tree node as the root node is represented by 0x00, the node ID of the bucket node shown in fig. 8 may be represented as 0x00123456.
Wherein 0x00 is the node ID of the root node; 123456 is the path information from the root node to the bucket node on the FDMT tree; 12 denotes the 2nd slot of the 1st block of the first-layer tree node; 34 denotes the 4th slot of the 3rd block of the second-layer tree node; 56 denotes the 6th slot of the 5th block of the third-layer tree node.
Based on the node ID, the link relationship between the bucket node and other nodes, and the specific position of the bucket node on the FDMT tree, can be determined; for example, based on the node ID, it can be determined that the bucket node is linked to the 6th slot of the 5th block of the third-layer tree node; the third-layer tree node is linked to the 4th slot of the 3rd block of the second-layer tree node; and the second-layer tree node is in turn linked to the 2nd slot of the 1st block of the first-layer tree node, which is the root node.
In this way, a particular slot in the FDMT tree in which a node is located can be precisely located when the node ID is used to retrieve the node stored on the FDMT tree.
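The path reading of a node ID such as 0x00123456 can be sketched as below. The functions `parse_node_id` and `parent_id` are hypothetical, and the assumption that an upper-layer node's ID is simply the child's ID with its last two characters removed is an inference from the example, not something stated in this description.

```python
ROOT_ID = "0x00"  # node ID of the root node in the example above

def parse_node_id(node_id: str, root_id: str = ROOT_ID) -> list:
    """Split a path-based node ID into (layer, block character, slot character) steps."""
    if not node_id.startswith(root_id):
        raise ValueError("node ID does not start with the root ID")
    path = node_id[len(root_id):]
    if len(path) % 2 != 0:
        raise ValueError("path must consist of block/slot character pairs")
    return [
        (layer, path[i], path[i + 1])
        for layer, i in enumerate(range(0, len(path), 2), start=1)
    ]

def parent_id(node_id: str, root_id: str = ROOT_ID) -> str:
    """Return the ID of the upper-layer node by dropping the last block/slot pair (assumed)."""
    if len(node_id) <= len(root_id):
        raise ValueError("the root node has no parent")
    return node_id[:-2]

print(parse_node_id("0x00123456"))  # [(1, '1', '2'), (2, '3', '4'), (3, '5', '6')]
print(parent_id("0x00123456"))      # 0x001234
```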
In another implementation manner, the node ID may specifically include a relative position of a node on the FDMT tree in the FDMT tree, and a hash value of the data content included in the node (where the node ID may also be used as a hash identifier of the node); that is, the relative position of the node on the FDMT tree in the FDMT tree and the hash value of the data content included in the node are set as the node ID of the node.
For example, in implementation, a string generated by splicing the relative position of a node in the FDMT tree with the hash value of the data content contained in the node may be used as the node ID of the node; of course, in practical application, in the process of splicing, other types of information except the relative position and the hash value can be further introduced to generate a node ID; and are not listed in this specification.
In this way, in addition to content addressing based on the hash value of the data content contained by the node as a key, the node ID may be used to retrieve the node stored on the FDMT tree, and the specific slot in which the node is located on the FDMT tree may be precisely located.
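A node ID that splices the relative position with the content hash might look like the following sketch. The separator character, the position format and the use of SHA-256 are assumptions; `make_node_id` and `split_node_id` are hypothetical names.

```python
import hashlib

SEPARATOR = ":"  # hypothetical delimiter between the position and the hash

def make_node_id(position: str, content: bytes) -> str:
    """Splice a node's relative position with the hash of its content."""
    return position + SEPARATOR + hashlib.sha256(content).hexdigest()

def split_node_id(node_id: str) -> tuple:
    """Recover the position part and the hash part from a spliced node ID."""
    position, digest = node_id.split(SEPARATOR, 1)
    return position, digest

node_id = make_node_id("123456", b"encoded bucket contents")
position, digest = split_node_id(node_id)
print(position)  # 123456
```

The position part supports locating the slot, and the hash part supports content addressing.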
The storage medium on the node device for carrying the database may be a persistent storage medium; for example, a disk, memory, or other form of storage medium capable of persistent storage of data, which is not specifically recited in this specification.
In the present specification, before writing the key-value key pairs of the nodes in the FDMT tree into the database, the nodes in the FDMT tree may be encoded in advance, and then the key-value key pairs of the encoded nodes may be stored in the database.
Wherein, because the Tree nodes and the extended nodes on the FDMT Tree both adopt a data structure that comprises a plurality of blocks, each block further comprising a plurality of slots, in the present specification bitmap encoding can be performed for the blocks in the Tree nodes and the extended nodes when encoding the FDMT Tree.
It should be noted that, when performing bitmap encoding for a block (which may be a main block or a sub-block) in a Tree node or an extended node on the FDMT Tree, it may specifically be determined whether each slot in the block is filled with a hash value; the result is then represented by specific encoding characters to obtain bitmap encoding information, and finally the bitmap encoding information is added to the block, completing the bitmap encoding process for that block;
for example, in one example, the bitmap encoding information may specifically be a hexadecimal string identifying whether each slot in the block is filled with a hash value. Assume that a block to be encoded (which may be a main block or a sub-block) contains 16 slots in total, numbered 0 to 15, and that the slots numbered 0, 8, 9, 10 and 11 are filled with hash values. Whether each slot in the block is filled with a hash value can then be represented by the binary string 0000 1111 0000 0001, in which the leftmost bit is the most significant bit (bit 15) and the rightmost bit is the least significant bit (bit 0). Finally, the binary string may be converted into the hexadecimal string 0x0f01, which is the bitmap encoding information of the block.
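The worked example above can be reproduced with a short sketch. The function name `bitmap_of` and the representation of a block as a Python list (with `None` for an empty slot) are hypothetical.

```python
def bitmap_of(slots: list) -> int:
    """Set bit i when slot i is filled; slot 0 maps to the least significant bit."""
    bits = 0
    for position, value in enumerate(slots):
        if value is not None:
            bits |= 1 << position
    return bits

# A 16-slot block with slots 0, 8, 9, 10 and 11 filled with placeholder hashes.
slots = [None] * 16
for position in (0, 8, 9, 10, 11):
    slots[position] = b"\x11" * 32

bits = bitmap_of(slots)
print(format(bits, "016b"))  # 0000111100000001
print(f"0x{bits:04x}")       # 0x0f01
```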
When the bitmap encoding is performed for the blocks in the Tree node and the extended node in the FDMT Tree, the bitmap encoding process is exactly the same for both the main block and the sub-block.
In this way, the blocks included in the encoded Tree nodes and extended nodes of the FDMT Tree carry bitmap encoding information indicating whether each slot in the block is filled with a hash value; therefore, when the data filled in a block of a Tree node stored in the database needs to be queried, the non-empty slots in the block can be determined from the bitmap encoding information in the block, without traversing every slot in the block, which improves lookup efficiency.
Wherein, the leaf nodes on the FDMT Tree adopt completely different data structures from the Tree node and the extended node; therefore, when encoding the leaf node on the FDMT Tree, a completely different encoding scheme from the Tree node and the extended node may be used.
In one embodiment shown, RLP (Recursive Length Prefix, recursive length prefix coding) coding may still be employed in particular when coding for leaf nodes on the FDMT tree described above; the RLP coding is a coding method commonly used for the MPT tree shown in fig. 1, and the specific coding process thereof is not described in detail in the present specification.
Of course, in practical application, the coding mode adopted when coding the leaf nodes on the FDMT tree may be other coding modes besides RLP coding, and in practical application, flexible selection is possible, and one-to-one enumeration is not performed in this specification.
In the present specification, after encoding is completed for the nodes on the FDMT tree, key-value key pairs of the nodes on the FDMT tree may be stored in the database.
In one embodiment, for the Tree node and the extended node on the FDMT Tree, before the key-value key value pair of the Tree node and the extended node after encoding is stored in the database, the empty slots of the block, which are not filled with the hash value, may be determined according to bitmap encoding information added to the block in the Tree node and the extended node after encoding, and then the determined empty slots are deleted from the block.
In this way, when the key-value pairs of the Tree nodes and extended nodes are stored in the database, the empty slots in each block can be deleted, which further saves the storage space occupied by the Tree nodes and extended nodes in the database and improves storage efficiency.
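One way to picture the removal of empty slots is a compact layout that stores the bitmap followed only by the filled hash values. The sketch below assumes 16 slots per block, 32-byte hashes and a 2-byte big-endian bitmap; this byte layout and the names `encode_block` and `decode_block` are assumptions, not the encoding defined in this description.

```python
HASH_SIZE = 32  # bytes per hash value (assumed)
SLOTS = 16      # slots per block, as in the example above

def encode_block(slots: list) -> bytes:
    """Encode a block as a 2-byte bitmap followed by the non-empty slots only."""
    bits = 0
    payload = b""
    for position, value in enumerate(slots):
        if value is not None:
            bits |= 1 << position
            payload += value
    return bits.to_bytes(2, "big") + payload

def decode_block(encoded: bytes) -> list:
    """Rebuild all 16 slots, using the bitmap to place each stored hash value."""
    bits = int.from_bytes(encoded[:2], "big")
    slots = [None] * SLOTS
    offset = 2
    for position in range(SLOTS):
        if bits & (1 << position):
            slots[position] = encoded[offset:offset + HASH_SIZE]
            offset += HASH_SIZE
    return slots

slots = [None] * SLOTS
for position in (0, 8, 9, 10, 11):
    slots[position] = bytes([position]) * HASH_SIZE

encoded = encode_block(slots)
print(len(encoded))                   # 162
print(decode_block(encoded) == slots)  # True
```

With five filled slots, the sketch stores 162 bytes instead of the 512 bytes that 16 full-width slots would occupy.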
It should be noted that, when calculating the hash value of an encoded block, the hash calculation may be performed over the bitmap encoding information contained in the encoded block together with the hash values filled in its non-empty slots, taken as a whole; of course, the bitmap encoding information may also be excluded from the calculation.
In this specification, after node key-value key pairs on the FDMT Tree are written into the database, if a value corresponding to a piece of blockchain data written on the FDMT Tree is updated, then it may be necessary to update leaf nodes for storing a character suffix and a value of the blockchain data on the FDMT Tree, extended nodes split from the leaf nodes, and Tree nodes for storing a character prefix of a key of the blockchain data, respectively.
Of course, if the leaf node for storing the character suffix and value of the blockchain data has not been split, only the leaf node for storing the character suffix and value of the blockchain data and the Tree node for storing the character prefix of the key of the blockchain data on the FDMT Tree may be updated respectively.
In this case, the node device may search the database for the nodes corresponding to the blockchain data that needs to be updated (the specific search method is not described in detail), read the found nodes from the database into memory, modify and update them in memory, and then write the updated nodes back to the database to overwrite the original nodes.
For example, updating the leaf node on the FDMT tree where the data update occurs may specifically include updating the value of the blockchain data stored in the leaf node. After the update is completed, the hash value of the leaf node may be recalculated, and the leaf node may be re-linked to the node of the layer above it based on that hash value.
Updating an extended node or a Tree node at the layer above the leaf node where the data update occurs may specifically include updating the slot in that extended node or Tree node that is filled with the hash value of the leaf node; after the update is completed, the hash value of the extended node or Tree node may be recalculated, and the node may in turn be re-linked to the node of the layer above it based on that hash value.
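The bottom-up update described above can be sketched as follows. The sketch models each upper-layer node as a dictionary from a slot key to a child hash and uses SHA-256; this simplified node model and the names `h`, `node_hash` and `update_leaf` are assumptions made for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Hash function used throughout this sketch (SHA-256 is an assumption)."""
    return hashlib.sha256(data).digest()

def node_hash(slots: dict) -> bytes:
    """Hash an upper-layer node from its filled slots, taken in slot-key order."""
    return h(b"".join(key.encode() + slots[key] for key in sorted(slots)))

def update_leaf(path: list, leaf_value: bytes) -> bytes:
    """Propagate a leaf update up to the root and return the new root hash.

    `path` lists (node_slots, slot_key) pairs from the root down to the node
    directly above the leaf; slot_key names the slot linking to the next layer.
    """
    child_hash = h(leaf_value)
    for slots, slot_key in reversed(path):
        slots[slot_key] = child_hash   # re-link this node to the updated child
        child_hash = node_hash(slots)  # recalculate this node's own hash
    return child_hash

root, middle = {}, {}
path = [(root, "12"), (middle, "34")]
old_root_hash = update_leaf(path, b"balance=10")
new_root_hash = update_leaf(path, b"balance=25")
print(old_root_hash != new_root_hash)  # True
```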
In one embodiment, the splitting determination and the specific node splitting operation for the leaf node on the FDMT Tree may be performed at a stage of recalculating the hash value of the leaf node and re-linking the hash value with a node (which may be a Tree node or an extended node) of a layer above the leaf node.
In this case, the node device may determine whether a leaf node on the FDMT Tree is data-updated, and if any leaf node on the FDMT Tree is data-updated, the node device may further determine whether the storage capacity of the leaf node satisfies a node splitting condition before re-calculating a hash value of the leaf node and re-linking with a Tree node or an extended node at an upper layer of the leaf node based on the hash value; if the storage capacity of the leaf node satisfies the node splitting condition, at least one extended node may be further split from the leaf node.
For example, in one example, when the storage capacity of a leaf node satisfies a node splitting condition, in order to quickly reduce the storage capacity of the leaf node, the leaf node may be split into one extended node and a plurality of leaf nodes.
The splitting strategy of splitting the extended node from the leaf node is not particularly limited in the specification, and in practical application, a user can perform custom setting according to specific splitting requirements;
In one embodiment shown, the splitting strategy may specifically include: splitting the latest data records written to a leaf node in the current block period out of the leaf node, and additionally creating an extended node and a leaf node based on those latest data records.
In this case, when node splitting is performed for any target leaf node, the latest data records written to the target leaf node during the current block period may be determined; the latest data records are then deleted from the target leaf node, and a character prefix is split from the character suffixes contained in the latest data records;
For example, in one example, the first two characters of the character suffixes of these latest data records may be split off by default; alternatively, the shared character prefix may be split off when the character suffixes contained in the latest data records have a shared character prefix whose length reaches two characters.
Then, at least one extended node for storing the split character prefix, and at least one leaf node for storing the latest data records with the character prefix removed, are further created.
In this way, the latest data records written to the target leaf node in the current block period are deleted from the target leaf node, and at least one extended node and at least one leaf node are created from the deleted records as the split-out nodes. After the split, the data records actually stored by the target leaf node are therefore exactly the same as those it stored in the previous block period, and its hash value is unchanged relative to the previous block period. Consequently, when new blockchain data is written to the target leaf node in the current block period and its storage capacity satisfies the node splitting condition, only the newly written data records are split off to create the additional extended node and leaf node; the hash value of the target leaf node does not need to be recalculated, and the target leaf node does not need to be re-linked with the node at the layer above. This splitting mode keeps the hash value of a leaf node that satisfies the node splitting condition stable and reduces the number of times the hash value of the leaf node is recalculated.
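A minimal sketch of this first splitting strategy is given below. It assumes a leaf node is a dictionary mapping character suffixes to values and that the caller already knows which suffixes were written in the current block period; the function name `split_latest` and the sample suffixes are hypothetical.

```python
def split_latest(leaf, latest_suffixes, prefix_len=2):
    """Move the records written in the current block period out of `leaf`.

    Returns {prefix: new_leaf}: each prefix would be stored in a newly created
    extended node, and new_leaf holds the records with that prefix removed.
    """
    split_out = {}
    for suffix in latest_suffixes:
        value = leaf.pop(suffix)  # delete the latest record from the target leaf
        prefix, rest = suffix[:prefix_len], suffix[prefix_len:]
        split_out.setdefault(prefix, {})[rest] = value
    return split_out


# Content of the leaf at the end of the previous block period (sample data).
leaf = {"125": "s0", "377": "s1"}
previous = dict(leaf)

# Two records are written in the current block period, then split off again.
leaf.update({"c48": "s2", "c49": "s3"})
new_nodes = split_latest(leaf, ["c48", "c49"])
```

After the split, `leaf` equals `previous`, so any hash computed over its content is the same as in the previous block period, which is the property the paragraph above relies on.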
In another embodiment shown, the splitting strategy may specifically include: splitting the shared character prefix among the character suffixes contained in the plurality of data records stored in the leaf node out of the leaf node, and additionally creating an extended node based on the shared character prefix.
In this case, when any target leaf node is subjected to node splitting, it may be determined whether a shared character prefix exists between character suffixes contained in a plurality of data records stored by the target leaf node; if there is a shared character prefix between character suffixes included in the plurality of data records stored by the target leaf node, the shared character prefix may be deleted from the plurality of data records, and at least one extended node for storing the shared character prefix may be created.
For example, the extended node may adopt the same data structure as the Tree node; the characters stored in the extended node may therefore be a character string two characters long, formed by splicing the character represented by a block with the character represented by the slot in that block that is filled with the hash value. Accordingly, the shared character prefix may be split from the plurality of data records when its length reaches two characters.
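The shared-prefix strategy can be illustrated with the short sketch below. The dictionary model of a leaf node and the name `split_shared_prefix` are assumptions; `os.path.commonprefix` is used only because it computes a character-wise common prefix of strings.

```python
from os.path import commonprefix


def split_shared_prefix(leaf, prefix_len=2):
    """Split a two-character shared prefix off all suffixes in `leaf`.

    Returns (prefix, new_leaf), or (None, leaf) when the shared prefix is
    shorter than `prefix_len` and no split is possible.
    """
    shared = commonprefix(list(leaf))
    if len(shared) < prefix_len:
        return None, leaf
    shared = shared[:prefix_len]  # one block character plus one slot character
    return shared, {k[prefix_len:]: v for k, v in leaf.items()}


ext_chars, new_leaf = split_shared_prefix({"125": "s1", "127": "s2", "12f": "s3"})
no_split, same_leaf = split_shared_prefix({"125": "s1", "377": "s2"})
```

Here `ext_chars` is `"12"`, the characters that the new extended node would store, and `new_leaf` keeps only the remaining suffix characters.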
In the illustrated embodiment, the merging determination and the specific node merging operation for an extended node on the FDMT tree may likewise be completed in the stage of recalculating the hash value of the extended node and re-linking it with the node at the layer above based on the hash value.
In this case, the node device may determine whether a data update occurs to an extended node on the FDMT tree. If a data update occurs to any extended node split from a leaf node on the FDMT tree, the node device may further determine, before recalculating the hash value of the extended node and re-linking it with the node at the layer above based on the hash value, whether the storage capacity of the extended node satisfies the node merging condition;
if the storage capacity of the extended node satisfies the node merging condition, the extended node may be further merged into the leaf node of the next layer linked with the extended node.
For example, in implementation, the leaf node of the next layer of the extended node link may be determined first; for example, the leaf node of the next layer of the extended node link may be determined according to the hash value filled in the non-empty slot in the block included in the extended node; then, the characters stored by the extended node may be written into the leaf node of the next layer of the determined extended node link as a part of the character suffix.
Of course, in practical application, a leaf node that is split multiple times may split out multiple layers of extended nodes. In this case, when any extended node satisfies the node merging condition but its lower node is still an extended node rather than a leaf node, the extended node temporarily cannot be merged; the lower extended node is first merged with its leaf node to form a new leaf node, and the upper extended node is then merged into that new leaf node in a second merge. The specific merging mode is not described again.
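The merge operation, including the multi-layer case, can be sketched as the inverse of a prefix split. As before, the dictionary model of a leaf node and the names `merge_extended` and `merge_chain` are assumptions made for illustration.

```python
def merge_extended(ext_chars, leaf):
    """Write the extended node's characters back in front of every suffix."""
    return {ext_chars + suffix: value for suffix, value in leaf.items()}


def merge_chain(ext_chain, leaf):
    """Merge several layers of extended nodes, lowest layer first.

    `ext_chain` lists the characters stored by each extended node from the
    top layer down to the one directly linked to the leaf.
    """
    for ext_chars in reversed(ext_chain):
        leaf = merge_extended(ext_chars, leaf)
    return leaf


merged = merge_chain(["12", "34"], {"5": "s1", "7": "s2"})
```

The lower extended node (`"34"`) is merged first, so the upper one (`"12"`) is only ever merged into a leaf node, matching the order described above.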
In the present specification, since the Tree node on the FDMT Tree adopts a data structure including a plurality of blocks, each block further includes a plurality of slots; thus, the Tree node on the FDMT Tree will have a larger data storage capacity.
However, precisely because a Tree node on the FDMT Tree has a larger data storage capacity, the blockchain data stored on the FDMT Tree tends to be too "sparse". For example, it may frequently happen that only one block of a Tree node is filled with data and only one slot of that block is filled with data, so that the slots holding blockchain data on the FDMT Tree are far apart and the stored data is logically sparse.
When the blockchain data stored on the FDMT tree is too sparse, a large number of empty slots are stored along with the FDMT tree in the database, which inevitably wastes storage space and prevents the storage space of the database from being fully utilized.
In this regard, in the present specification, while or after the key-value pairs of the nodes on the FDMT Tree are stored in the database, each block in a Tree node whose number of non-empty slots is 1 may be compressed into the block of the previous level.
It should be noted that the compression of blocks with exactly one non-empty slot may be performed while the FDMT Tree is being stored in the database, or after it has been stored; this is not particularly limited in the present specification. That is, while or after the key-value pairs of the Tree nodes or extended nodes serving as intermediate nodes on the FDMT Tree are stored in the database, it may be determined whether the number of non-empty slots of each block in those nodes is 1; if the number of non-empty slots of any target block in any Tree node or extended node is 1, the target block is compressed into the previous-level block or into the corresponding block in the previous-level node.
In one embodiment shown, the Tree node adopts the data structure shown in fig. 4; that is, the Tree node includes a plurality of blocks each representing a different character, and each block further includes a plurality of slots each representing a different character. In this case, if the number of non-empty slots of any target block in any Tree node or extended node serving as an intermediate node is 1, the target slot in the upper-layer node linked with that Tree node or extended node, which is used for filling the hash value of that Tree node or extended node, may first be determined.
Then the hash identifier of the target block is filled into the target slot as the hash value of the Tree node or extended node, and the Tree node or extended node is deleted.
The hash identifier may specifically refer to identification information that uniquely identifies an intermediate node on the FDMT tree, or a specific block in an intermediate node; for example, when the hash identifier is used to identify an intermediate node, it may serve as the node ID of that intermediate node.
In this specification, the hash identifier may specifically include the hash value filled in the unique non-empty slot and the compression information corresponding to that non-empty slot; specifically, the hash identifier is a two-tuple formed by concatenating the hash value filled in the unique non-empty slot with the compression information corresponding to the non-empty slot.
The compression information corresponding to the non-empty slot may specifically include the number of times the non-empty slot has been compressed and the relative position of the non-empty slot in the Tree node or extended node serving as the intermediate node. The relative position may be any information capable of locating the non-empty slot within the intermediate node; for example, it may be a character string, in hexadecimal or another base, formed by splicing the number of the block in which the non-empty slot is located with the number of the slot itself.
In another embodiment shown, the Tree node adopts the data structure shown in fig. 5; that is, the Tree node comprises a main block and a plurality of sub-blocks respectively representing different characters, and each block may further comprise a plurality of slots:
On the one hand, if the number of non-empty slots of any target sub-block in any Tree node or extended node as an intermediate node is 1, a target slot for filling the hash value of the storage content in the target sub-block in the primary main block of the upper level linked with the target sub-block can be determined, the hash identifier of the target sub-block is used as the hash value of the storage content in the target sub-block to fill the target slot, and the target sub-block is deleted.
On the other hand, if the number of non-empty slots of the main block in any Tree node or extended node serving as an intermediate node is 1, the target slot in the upper-layer node linked with the Tree node or extended node, which is used for filling the hash value of the Tree node or extended node, may be determined; the hash identifier of the main block is then filled into the target slot as the hash value of the Tree node or extended node, and the Tree node or extended node is deleted.
It should be noted that, when a target block has only one slot filled with data, the compression of that non-empty slot may specifically be a recursive process. Recursive compression means that the non-empty slot of the target block is compressed into the previous-level block; if the previous-level block is in the same situation, with only one slot filled with data, the same compression is performed on it as well, so that the non-empty slot of the target block continues to be compressed step by step into higher-level blocks.
The recursive compression process shown above will be described in detail by way of a specific example, taking the data structure shown in fig. 4 as an example of the above Tree node.
Referring to fig. 9, in the FDMT tree including three layers of blocks shown in fig. 9, the hash value of the leaf node with node ID 0x00123456 is initially stored in the 6th slot of the 5th block in Tree nodeC of the third layer. Assume that only one slot is filled with data in each of the three layers: in Tree nodeB of the second layer, the 4th slot of the 3rd block is filled with data; in Tree nodeA of the first layer, the 2nd slot of the 1st block is filled with data;
Then, in the manner of recursive compression described above, the 6th slot of the 5th block of Tree nodeC of the third layer is filled with the hash value of the leaf node;
Since only the 6th slot of the 5th block of Tree nodeC is filled with data, that slot is compressed into the 3rd block of Tree nodeB of the previous layer; at this time, the hash value of the 5th block filled in the 4th slot of the 3rd block is replaced by the hash-ID of the 5th block. The hash-ID may specifically be a two-tuple formed by concatenating the compression information "0x01,0x65" of the 5th block with the hash value filled in the 6th slot of the 5th block;
Wherein "0x01" indicates that the number of times of compression is 1, and "0x65" indicates the relative position of the slot in Tree nodeC, i.e., the 6th slot in the 5th block; "0x01,0x65" means that the first compression is of the 6th slot of the 5th block.
Since only the 4th slot of the 3rd block of Tree nodeB is filled with data, the same compression process is performed again: the 4th slot is compressed into the 1st block of Tree nodeA of the upper layer, and the hash value of the 3rd block filled in the 2nd slot of the 1st block is replaced by the hash-ID of the 3rd block. The hash-ID may specifically be a two-tuple formed by concatenating the compression information "0x02,0x43,0x65" of the 3rd block with the hash value filled in the 4th slot of the 3rd block;
Wherein "0x02" means that the number of times of compression is 2; "0x43" indicates the relative position of the slot in Tree nodeB, i.e., the 4th slot in the 3rd block; "0x02,0x43,0x65" means that the first compression is of the 6th slot of the 5th block and the second compression is of the 4th slot of the 3rd block.
It should be noted that the representation of the slot position information and the number of compressions described above is only exemplary and is not intended to limit the scheme of the present specification; in practical applications, a person skilled in the art can define this representation flexibly, and the alternatives are not enumerated one by one in the present specification. For example, in order to further reduce the number of bits occupied by the compression information, the number of compressions and the position information may be encoded into a single numerical value.
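The example encoding used in the fig. 9 walkthrough can be reproduced with a few lines of code. This is a sketch of that one exemplary encoding only: the byte layout (slot ordinal in the high half-byte, block ordinal in the low half-byte, as in "0x65" for the 6th slot of the 5th block) is read off the example, while the function name `compress` and the tuple representation of the hash-ID are assumptions.

```python
def compress(hash_id, block_no, slot_no):
    """Fold one more single-slot block into a hash-ID.

    `hash_id` is (compression_info, hash_value); a hash that has never been
    compressed carries empty compression info. `block_no` and `slot_no` are
    the 1-based ordinals used in the text.
    """
    info, value = hash_id
    count = (info[0] if info else 0) + 1   # number of compressions so far
    position = (slot_no << 4) | block_no   # e.g. slot 6, block 5 -> 0x65
    # The newest position goes first, followed by the earlier positions.
    return bytes([count, position]) + info[1:], value


leaf_hash = (b"", "hash-of-leaf")          # placeholder hash value
after_c = compress(leaf_hash, block_no=5, slot_no=6)  # Tree nodeC -> Tree nodeB
after_b = compress(after_c, block_no=3, slot_no=4)    # Tree nodeB -> Tree nodeA
```

With these inputs, `after_c[0]` is the byte sequence 0x01,0x65 and `after_b[0]` is 0x02,0x43,0x65, matching the compression information given in the example.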
It should be noted that, the above example is described taking the data structure shown in fig. 4 as an example, and in practical application, when the Tree node adopts the data structure shown in fig. 5, the process of recursive compression is similar to the process described in the above example;
For example, when the Tree node adopts the data structure shown in fig. 5, the 5th block of Tree nodeC and the 3rd block of Tree nodeB shown in fig. 9 are sub-blocks; the previous-level block of the 5th block of Tree nodeC is then the main block of Tree nodeC, and the previous-level block of the 3rd block of Tree nodeB is the main block of Tree nodeB. In this case, the 6th slot of the 5th block (a sub-block) in Tree nodeC may first be compressed into the main block of Tree nodeC, the main block of Tree nodeC may then be compressed into the 4th slot of the 3rd block (a sub-block) of Tree nodeB, and the process is repeated; for the manner of compressing a sub-block into the main block of the previous level, refer to the example above, which is not described again.
The following describes the specific process of writing the blockchain data in the form of key-value key value pairs into FDMT tree as shown in fig. 4 and 5, taking the blockchain data as the latest account status data corresponding to the blockchain account, and taking the key of the blockchain data as the blockchain account address as an example.
When the method is realized, a user client accessing the blockchain can package transaction data into a standard transaction format supported by the blockchain and then issue the transaction data to the blockchain; the node equipment in the block chain can be used for carrying out consensus on the transactions issued to the block chain by the user client together with other node equipment based on the carried consensus algorithm so as to generate the latest block for the block chain; the specific consensus process is not described herein.
After the node devices in the blockchain have executed transactions in the target blocks, account status of the target accounts in the blockchain associated with the executed transactions will typically change; therefore, after the transaction in the target block is executed, the node device can acquire the latest account state data of the target account after the account is updated (namely, the account state of the target block after the transaction is executed), and process the acquired latest account state data of the target account into a key-value key value pair; wherein, the key of the key-value key value pair is the account address of the target account; the value of the key-value key pair is the latest account state of the target account.
After processing the acquired latest account status data of the target account into a key-value pair, the key-value pair may be converted into a Tree node, an extended node, and a Leaf node on the FDMT Tree, and the key-value pair of the Tree node, the extended node, and the Leaf node may be stored in a database.
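How nodes might be stored as key-value pairs, with the hash of a node's content as the key and the content itself as the value, is sketched below. An in-memory dictionary stands in for the database, SHA-256 over canonical JSON stands in for the unspecified hash function, and `put_node` is a hypothetical helper.

```python
import hashlib
import json


def put_node(db, content):
    """Store one node: key = hash of the stored content, value = the content."""
    encoded = json.dumps(content, sort_keys=True).encode()
    key = hashlib.sha256(encoded).hexdigest()
    db[key] = encoded
    return key


db = {}
# A leaf holding one record, then a parent whose slot links to it by hash.
leaf_key = put_node(db, {"125": "state1"})
parent_key = put_node(db, {"3": {"5": leaf_key}})
linked = json.loads(db[parent_key])["3"]["5"]
```

Because the key is derived from the content, a parent links to a child simply by storing the child's key in one of its slots, and storing identical content twice yields the same key.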
Referring to fig. 10, fig. 10 is a schematic diagram illustrating writing of account status data into FDMT status trees according to the present description;
In the FDMT state Tree shown in fig. 10, the first three layers are Tree nodes and the last layer consists of leaf nodes; the leaf nodes may adopt the bucket storage structure described above. The Tree node of each layer comprises a main block and 16 sub-blocks respectively representing the hexadecimal characters 0-f; each sub-block further includes 16 slots respectively representing the hexadecimal characters 0-f.
Assuming that the account address of the target account is "a71135125", the latest account state data is "state1"; at this time, the character prefix (Shared nibble) of the account address is "a71135"; the character suffix (key-end) is "125".
When the account state data "state1" of the account address "a71135125" is written into the FDMT Tree in the form of a key-value pair, the Tree node serving as the root node of the FDMT Tree may first be located in the database (for example, according to the hash of the root node recorded in the block header); if data is being written for the first time and the root node has not yet been created, the root node can be created at this point. Then, starting from the root node Tree node1 of the first layer, the slots corresponding to the character prefix "a71135" of the account address are determined in the three layers of Tree nodes in sequence. As shown in fig. 10, the slots corresponding to the character prefix "a71135" of the account address are: the 8th slot (representing character 7) in the 11th sub-block (representing character a) of Tree node1; the 2nd slot (representing character 1) in the 2nd sub-block (representing character 1) of Tree node2; and the 6th slot (representing character 5) in the 4th sub-block (representing character 3) of Tree node3.
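The mapping from the account address to sub-blocks and slots in this example can be checked with a short sketch. It assumes, as in fig. 10, three layers of Tree nodes that each consume two hexadecimal characters; the names `split_key` and `locate_slots` are hypothetical.

```python
def split_key(address, layers=3):
    """Split a key into (character prefix, character suffix)."""
    cut = 2 * layers  # each Tree node layer consumes two characters
    return address[:cut], address[cut:]


def locate_slots(prefix):
    """Return the 1-based (sub_block, slot) ordinals for each Tree node layer."""
    if len(prefix) % 2:
        raise ValueError("prefix must contain two characters per Tree node")
    return [
        (int(prefix[i], 16) + 1, int(prefix[i + 1], 16) + 1)
        for i in range(0, len(prefix), 2)
    ]


prefix, suffix = split_key("a71135125")
slots = locate_slots(prefix)
```

For the address "a71135125" this gives the prefix "a71135", the suffix "125", and the positions (11, 8), (2, 2), (4, 6), that is, the 8th slot of the 11th sub-block of Tree node1, the 2nd slot of the 2nd sub-block of Tree node2, and the 6th slot of the 4th sub-block of Tree node3.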
After the slots are determined, a data record consisting of the character suffix "125" and the state data "state1" is written into the bucket node linked from the 6th slot of the 4th sub-block of Tree node3. Of course, if a data record corresponding to the character suffix "125" already exists in the bucket node, this indicates that historical account state data for the account address "a71135125" has already been written to the FDMT tree; in that case, the Value stored in the data record may be updated to "state1".
After the data writing is completed, the hash value of the data content contained in the bucket node can be recalculated and filled into the 6th slot of the 4th sub-block of Tree node3, replacing the original hash value of the slot.
Further, after the original hash value in the 6th slot of the 4th sub-block of Tree node3 is updated, the hash value of the data content contained in the main block of Tree node3 is recalculated and filled into the 2nd slot of the 2nd sub-block of Tree node2, replacing the original hash value of the slot.
Then, after the original hash value of the 2nd slot of the 2nd sub-block of Tree node2 is updated, the hash value of the data content contained in the main block of Tree node2 is recalculated and filled into the 8th slot of the 11th sub-block of Tree node1 (namely the root node), replacing the original hash value of the slot.
After the original hash value of the 8th slot of the 11th sub-block of the root node Tree node1 is updated, the hash value of the data content contained in the main block of the root node Tree node1 is recalculated, and the hash value of the root node of the FDMT state Tree stored in the block header is updated based on it. Once the hash value of the root node stored in the block header has been updated, the update of the FDMT state tree is complete, and the key-value pair formed by the account address "a71135125" and the corresponding account state data "state1" has been successfully written into the FDMT state tree.
Referring to fig. 11, assume that the bucket node shown in fig. 10 satisfies the node splitting condition described above, and that a shared character prefix "12" exists among the character suffixes of several data records included in the bucket node; the bucket node shown in fig. 10 may then be further split to produce an extended node storing the shared character prefix "12", as shown in fig. 11. The specific splitting process is not described in detail again.
It should be emphasized that the above-mentioned blockchain data is merely exemplary, taking the latest account status data corresponding to the blockchain account as an example; in practical application, when the above-mentioned blockchain data is other types of blockchain data of the above-mentioned 4 types of blockchain data, the specific process of writing the blockchain data into the corresponding FDMT tree is similar to the implementation process described above, and will not be described in detail in this specification.
Corresponding to the method embodiments described above, the present specification also provides an embodiment of a blockchain data storage device.
The embodiments of the blockchain data storage device of the present specification may be applied to an electronic device. The device embodiments may be implemented by software, by hardware, or by a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the electronic device in which it is located reading the corresponding computer program instructions from the nonvolatile memory into memory and running them.
In terms of hardware, fig. 12 shows a hardware structure diagram of the electronic device in which the blockchain data storage device of this specification is located. In addition to the processor, memory, network interface, and nonvolatile memory shown in fig. 12, the electronic device in which the device of the embodiment is located generally may include other hardware according to its actual function, which is not described here again.
FIG. 13 is a block diagram of a blockchain data storage device as shown in an exemplary embodiment of the present description.
Referring to fig. 13, the blockchain data storage device 130 may be applied to the electronic device shown in fig. 12, where the device 130 includes:
An acquisition module 1301 acquires key-value key value pairs of blockchain data to be stored;
The conversion module 1302 converts key-value key value pairs of the blockchain data to be stored into root nodes, intermediate nodes and leaf nodes on a logical tree structure; the root node and the intermediate node comprise a main position and a plurality of sub-positions for storing characters in a key of the blockchain data; the main position comprises a plurality of slots which are respectively corresponding to the sub-positions and are used for storing the hash value of the storage content in each sub-position; the sub-positions comprise a plurality of slots for storing characters in keys of the blockchain data; the slot in the sub-position is used for storing the hash value of the next layer node linked with the node; the hash values of the root node and the intermediate node are the hash of the storage content in the main position in the root node and the intermediate node;
A storage module 1303, configured to store key-value pairs of the root node, the intermediate node, and the leaf node in a database; wherein, in the key-value pairs of the leaf node, the intermediate node, and the root node, the value is the storage content of the node and the key is the hash value of the storage content of the node.
In this embodiment, the character string corresponding to the key of the blockchain data includes a character prefix and a character suffix; the root node and the intermediate node are used for storing characters in the character prefix; the leaf node is used for storing the character suffix and the Value of the blockchain data.
In this embodiment, the apparatus 130 further includes:
the splitting module is used for determining whether the storage capacity of the leaf nodes on the tree structure meets the node splitting condition; splitting at least one intermediate node from the leaf node if the storage capacity of the leaf node satisfies a node splitting condition; the intermediate node is used for storing characters split from character suffixes stored in the leaf nodes.
In this embodiment, the characters stored in the root node and the intermediate node are character strings generated by splicing characters represented by each sub-position in the root node and the intermediate node with characters represented by slots filled with hash values in each sub-position.
In this embodiment, the number of sub-positions included in the root node and the intermediate node is the same as the number of slots included in the sub-positions.
In this embodiment, the root node and the intermediate node include 16 sub-positions respectively representing different hexadecimal characters; each sub-position comprises 16 slots respectively representing different hexadecimal characters; the main position comprises 16 slots corresponding to the respective sub-positions.
In this embodiment, the leaf node is a bucket; the bucket comprises a plurality of data records; the data content stored in each data record includes the character suffix and the Value of the blockchain data.
In this embodiment, the splitting module:
Determining whether a leaf node on the tree structure is updated with data;
If any leaf node on the tree structure is updated, before the hash value of the leaf node is recalculated and is linked with a node of a layer above the leaf node again based on the hash value, whether the storage capacity of the leaf node meets the node splitting condition is further determined.
In this embodiment, the apparatus 130 further includes:
the merging module is used for determining whether data update occurs to the intermediate nodes on the tree structure;
If any intermediate node on the tree structure is subjected to data updating, determining whether the storage capacity of the intermediate node meets a node merging condition before recalculating a hash value of the intermediate node and before re-linking with a node of a layer above the intermediate node based on the hash value; if the storage capacity of the intermediate node meets the node merging condition, the intermediate node is further merged to a leaf node of a next layer linked with the intermediate node.
In this embodiment, the node splitting condition includes:
the total number of data records stored in the leaf node is greater than a threshold; and/or the total storage capacity of the data records stored in the leaf node is greater than a threshold;
correspondingly, the node merging condition includes:
the total number of characters stored in the intermediate node is less than or equal to a threshold; and/or the total storage capacity of the characters stored in the intermediate node is less than or equal to a threshold.
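The two conditions can be expressed as simple predicates. The sketch below is illustrative only: the threshold values are arbitrary placeholders (the specification does not fix them), storage capacity is approximated by character counts, and the "and/or" in the text is implemented as a plain "or".

```python
def should_split(leaf, max_records=64, max_size=4096):
    """Node splitting condition for a leaf modeled as {suffix: value}."""
    size = sum(len(suffix) + len(value) for suffix, value in leaf.items())
    return len(leaf) > max_records or size > max_size


def should_merge(ext_chars, max_chars=2):
    """Node merging condition for the characters stored by an intermediate node."""
    return len(ext_chars) <= max_chars


small_leaf = {"125": "s1"}
full_leaf = {"125": "s1", "127": "s2"}
split_small = should_split(small_leaf, max_records=1, max_size=100)
split_full = should_split(full_leaf, max_records=1, max_size=100)
```

In this model a leaf with two records exceeds a record threshold of 1 and would be split, while an intermediate node storing two characters meets a merge threshold of 2.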
In this embodiment, the splitting module:
determining a most recent data record written to the leaf node during a current chunking period;
deleting the latest data record from the leaf node; and
Splitting a character prefix from a character suffix contained in the latest data record, and creating at least one intermediate node for storing the split character prefix, and at least one leaf node for storing the latest data record after the splitting of the character prefix.
In this embodiment, the splitting module:
determining shared character prefixes among character suffixes contained in a plurality of data records stored by the leaf nodes;
Deleting the shared character prefix from the plurality of data records and creating at least one intermediate node for storing the shared character prefix.
In this embodiment, the merging module:
determining a leaf node of a next layer linked with the intermediate node;
and writing the characters stored by the intermediate node into the leaf nodes as the character suffixes.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. A typical implementation device is a computer, which may be in the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email device, game console, tablet computer, wearable device, or a combination of any of these devices.
In a typical configuration, a computer includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage, quantum memory, graphene-based storage media or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of one or more embodiments of this specification. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention to the particular embodiments described.

Claims (17)

1. A blockchain data storage method, the method comprising:
acquiring key-value pairs of blockchain data to be stored, wherein the character string corresponding to the key of the blockchain data comprises a character prefix and a character suffix;
converting the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes on a logical tree structure, wherein:
the root node and the intermediate node are used for storing characters in the character prefix;
the leaf node is used for storing the character suffix and the value of the blockchain data;
the root node and the intermediate node each comprise a main position and a plurality of sub-positions for storing characters in the key of the blockchain data;
the main position comprises a plurality of slots that correspond respectively to the sub-positions and are used for storing the hash value of each sub-position;
each sub-position comprises a plurality of slots for storing characters in the key of the blockchain data;
a slot in a sub-position is used for storing the hash value of the next-layer node linked with the node where the slot is located;
the hash value of a sub-position is the hash value obtained by concatenating the hash values filled in all slots of the sub-position and hashing the result;
the hash value of the root node or the intermediate node is the hash of the content stored in the main position of that node; and
storing key-value pairs of the root node, the intermediate node and the leaf node in a database.
2. The method of claim 1, the method further comprising:
determining whether the storage capacity of leaf nodes on the tree structure meets a node splitting condition;
splitting at least one intermediate node from the leaf node if the storage capacity of the leaf node satisfies the node splitting condition, wherein the intermediate node is used for storing characters split from the character suffixes stored in the leaf node.
3. The method of claim 2, wherein the characters stored in the root node and the intermediate node are character strings generated by splicing characters represented by each sub-position in the root node and the intermediate node with characters represented by slots filled with hash values in each sub-position.
4. The method of claim 3, wherein the number of sub-positions comprised in the root node and the intermediate node is the same as the number of slots comprised in each sub-position.
5. The method of claim 4, wherein the root node and the intermediate node each comprise 16 sub-positions, each representing a different hexadecimal character; each sub-position comprises 16 slots, each representing a different hexadecimal character; and the main position comprises 16 slots, one corresponding to each sub-position.
6. The method of claim 1, wherein the leaf node is a bucket data bucket; the bucket data bucket comprises a plurality of data records; and the data content stored in a data record includes the character suffix and the value of the blockchain data.
7. The method of claim 2, the determining whether the storage capacity of leaf nodes on the tree structure satisfies a node split condition, comprising:
determining whether a data update occurs at a leaf node on the tree structure;
if any leaf node on the tree structure is updated, further determining whether the storage capacity of the leaf node satisfies the node splitting condition before the hash value of the leaf node is recalculated and the leaf node is re-linked, based on that hash value, with the node in the layer above it.
8. The method of claim 7, the method further comprising:
determining whether a data update occurs at an intermediate node on the tree structure;
if any intermediate node on the tree structure is subjected to data updating, determining whether the storage capacity of the intermediate node meets a node merging condition before recalculating a hash value of the intermediate node and before re-linking with a node of a layer above the intermediate node based on the hash value;
if the storage capacity of the intermediate node meets the node merging condition, the intermediate node is further merged to a leaf node of a next layer linked with the intermediate node.
9. The method of claim 8, the node splitting condition comprising:
the total number of data records stored in the leaf node is greater than a threshold; and/or the total storage capacity of the data records stored in the leaf nodes is greater than a threshold;
correspondingly, the node merging condition includes:
the total number of characters stored in the intermediate node is less than or equal to a threshold; and/or, the total storage capacity of the characters stored in the intermediate node is less than or equal to a threshold value.
10. The method of claim 7, splitting at least one intermediate node from the leaf node, comprising:
determining the most recent data record written to the leaf node during the current block-generation period;
deleting the latest data record from the leaf node; and
splitting a character prefix from the character suffix contained in the latest data record, and creating at least one intermediate node for storing the split character prefix and at least one leaf node for storing the latest data record that remains after the character prefix has been split off.
11. The method of claim 7, splitting at least one intermediate node from the leaf node, comprising:
determining shared character prefixes among character suffixes contained in a plurality of data records stored by the leaf nodes;
deleting the shared character prefix from the plurality of data records, and creating at least one intermediate node for storing the shared character prefix.
12. The method of claim 8, the merging the intermediate node to a leaf node of a next level linked to the intermediate node, comprising:
determining a leaf node of a next layer linked with the intermediate node;
and writing the characters stored by the intermediate node into the leaf nodes as the character suffixes.
13. The method of claim 1, the blockchain data including any of the following:
transactions recorded in the block;
a transaction receipt corresponding to the transaction recorded in the block;
latest account state data corresponding to blockchain accounts in the blockchain;
the storage content of a smart contract account.
14. The method of claim 13, the logical tree structure comprising any one of the following:
a transaction tree for storing transactions recorded in the blocks;
a receipt tree for storing transaction receipts corresponding to transactions recorded in the blocks;
a state tree for storing the latest account state data corresponding to blockchain accounts in the blockchain;
a storage tree for storing the storage content of smart contract accounts.
15. A blockchain data storage device, the device comprising:
an acquisition module that acquires key-value pairs of blockchain data to be stored, wherein the character string corresponding to the key of the blockchain data comprises a character prefix and a character suffix;
a conversion module that converts the key-value pairs of the blockchain data to be stored into a root node, intermediate nodes and leaf nodes on a logical tree structure, wherein:
the root node and the intermediate node are used for storing characters in the character prefix;
the leaf node is used for storing the character suffix and the value of the blockchain data;
the root node and the intermediate node each comprise a main position and a plurality of sub-positions for storing characters in the key of the blockchain data;
the main position comprises a plurality of slots that correspond respectively to the sub-positions and are used for storing the hash value of each sub-position;
each sub-position comprises a plurality of slots for storing characters in the key of the blockchain data;
a slot in a sub-position is used for storing the hash value of the next-layer node linked with the node where the slot is located;
the hash value of a sub-position is the hash value obtained by concatenating the hash values filled in all slots of the sub-position and hashing the result;
the hash value of the root node or the intermediate node is the hash of the content stored in the main position of that node; and
a storage module that stores key-value pairs of the root node, the intermediate node and the leaf node in a database.
16. An electronic device, comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to implement the method of any one of claims 1-14 by executing the executable instructions.
17. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method of any of claims 1-14.
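The two-level node layout recited in claims 1 and 5 (a main position with 16 slots, and 16 sub-positions with 16 slots each) can be sketched in Python as follows. This is an illustrative sketch only: the claims do not name a hash function, so SHA-256 is an assumption, as are the class name `Node`, the method names, and the choice to represent an empty sub-position by an empty byte string.

```python
import hashlib

FANOUT = 16  # 16 sub-positions, 16 slots each, as in claim 5


def h(data: bytes) -> bytes:
    # Assumption: SHA-256 stands in for the unspecified hash function.
    return hashlib.sha256(data).digest()


class Node:
    """Root or intermediate node with a main position and sub-positions."""

    def __init__(self):
        # slots[i][j] holds the hash of the next-layer node reached through
        # hex character i (sub-position) then hex character j (slot).
        self.slots = [[None] * FANOUT for _ in range(FANOUT)]

    def link(self, two_hex_chars, child_hash):
        """Fill the slot addressed by two hexadecimal characters."""
        i = int(two_hex_chars[0], 16)
        j = int(two_hex_chars[1], 16)
        self.slots[i][j] = child_hash

    def sub_position_hash(self, i):
        """Hash of the concatenated hashes filled in sub-position i."""
        filled = [s for s in self.slots[i] if s is not None]
        return h(b"".join(filled)) if filled else b""

    def main_position(self):
        """One slot per sub-position, each storing that sub-position's hash."""
        return [self.sub_position_hash(i) for i in range(FANOUT)]

    def node_hash(self):
        """Hash of the content stored in the main position."""
        return h(b"".join(self.main_position()))
```

In this sketch, filling or changing one slot affects only the hash of the sub-position that contains it and the 16 entries of the main position, rather than requiring all 256 slots to be concatenated and rehashed.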
CN202111443113.5A 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment Active CN114153848B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111443113.5A CN114153848B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111443113.5A CN114153848B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment
CN202110494903.XA CN112988908B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN202110494903.XA Division CN112988908B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN114153848A CN114153848A (en) 2022-03-08
CN114153848B true CN114153848B (en) 2024-06-28

Family

ID=76337218

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202110494903.XA Active CN112988908B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment
CN202111443113.5A Active CN114153848B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202110494903.XA Active CN112988908B (en) 2021-05-07 2021-05-07 Block chain data storage method and device and electronic equipment

Country Status (1)

Country Link
CN (2) CN112988908B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988910B (en) * 2021-05-07 2021-09-24 支付宝(杭州)信息技术有限公司 Block chain data storage method and device and electronic equipment
CN112988908B (en) * 2021-05-07 2021-10-15 支付宝(杭州)信息技术有限公司 Block chain data storage method and device and electronic equipment
CN118035503B (en) * 2024-04-11 2024-06-28 福建时代星云科技有限公司 Method for storing key value pair database

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988908B (en) * 2021-05-07 2021-10-15 支付宝(杭州)信息技术有限公司 Block chain data storage method and device and electronic equipment

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107040582B (en) * 2017-02-17 2020-08-14 创新先进技术有限公司 Data processing method and device
KR101878869B1 (en) * 2017-11-17 2018-08-16 주식회사 미탭스플러스 Distributed Ledger Device and Distributed Ledger Method for User Identification Management Based on Block Chain
US10275400B1 (en) * 2018-04-11 2019-04-30 Xanadu Big Data, Llc Systems and methods for forming a fault-tolerant federated distributed database
KR102322729B1 (en) * 2019-03-04 2021-11-05 어드밴스드 뉴 테크놀로지스 씨오., 엘티디. Blockchain World State Merkle Patricia Trie Subtree Update
KR20200128250A (en) * 2019-05-01 2020-11-12 김민규 System and method for providing contract platform service based on block chain
CN110334154B (en) * 2019-06-28 2020-07-21 阿里巴巴集团控股有限公司 Block chain based hierarchical storage method and device and electronic equipment
WO2019179540A2 (en) * 2019-07-11 2019-09-26 Alibaba Group Holding Limited Shared blockchain data storage
CN111324596B (en) * 2020-03-06 2021-06-11 腾讯科技(深圳)有限公司 Data migration method and device for database cluster and electronic equipment
CN111737654B (en) * 2020-08-14 2020-12-11 支付宝(杭州)信息技术有限公司 Infringement detection method and device based on block chain and electronic equipment
CN112579602B (en) * 2020-12-22 2023-06-09 杭州趣链科技有限公司 Multi-version data storage method, device, computer equipment and storage medium
CN112632077A (en) * 2020-12-28 2021-04-09 深圳壹账通智能科技有限公司 Data storage method, device, equipment and storage medium based on redis
CN112988909B (en) * 2021-05-07 2021-09-28 支付宝(杭州)信息技术有限公司 Block chain data storage method and device and electronic equipment

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112988908B (en) * 2021-05-07 2021-10-15 支付宝(杭州)信息技术有限公司 Block chain data storage method and device and electronic equipment

Also Published As

Publication number Publication date
CN114153848A (en) 2022-03-08
CN112988908A (en) 2021-06-18
CN112988908B (en) 2021-10-15

Similar Documents

Publication Publication Date Title
CN114153848B (en) Block chain data storage method and device and electronic equipment
CN110334154B (en) Block chain based hierarchical storage method and device and electronic equipment
CN110457319B (en) Block chain state data storage method and device and electronic equipment
CN112988912B (en) Block chain data storage method and device and electronic equipment
CN113220685B (en) Traversal method and device for intelligent contract storage content and electronic equipment
CN110471795B (en) Block chain state data recovery method and device and electronic equipment
CN112988761B (en) Block chain data storage method and device and electronic equipment
US6910043B2 (en) Compression of nodes in a trie structure
CN108205577B (en) Array construction method, array query method, device and electronic equipment
US6532476B1 (en) Software based methodology for the storage and retrieval of diverse information
CN113961514B (en) Data query method and device
JP3992495B2 (en) Functional memory based on tree structure
JP2010508606A (en) Storage management of individually accessible data units
CN114706848A (en) Block chain data storage, updating and reading method and device and electronic equipment
CN112988909B (en) Block chain data storage method and device and electronic equipment
CN115221176A (en) Block chain data storage method and device and electronic equipment
US20020040361A1 (en) Memory based on a digital trie structure
CN112988911B (en) Block chain data storage method and device and electronic equipment
CN112988910B (en) Block chain data storage method and device and electronic equipment
CN112905607B (en) Block chain data storage method and device and electronic equipment
CN112511629B (en) Data compression method and system for account tree of MPT structure
CN116501760A (en) Efficient distributed metadata management method combining memory and prefix tree
CN116303425A (en) Method for creating account in block chain and block chain link point
CN115982781A (en) Method for creating account in block chain and block chain link point
CN118568175A (en) Method and system for generating n-ary tree

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant