WO2022048358A1 - Data processing method and device, and storage medium - Google Patents

Data processing method and device, and storage medium

Info

Publication number
WO2022048358A1
Authority
WO
WIPO (PCT)
Prior art keywords
block
thread
blocks
write
tasks
Prior art date
Application number
PCT/CN2021/109268
Other languages
French (fr)
Chinese (zh)
Inventor
冯世伟
Original Assignee
深圳壹账通智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2022048358A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/544 Buffers; Shared memory; Pipes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/54 Interprogram communication
    • G06F9/546 Message passing systems or structures, e.g. queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/50 Indexing scheme relating to G06F9/50
    • G06F2209/5018 Thread allocation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00 Indexing scheme relating to G06F9/00
    • G06F2209/54 Indexing scheme relating to G06F9/54
    • G06F2209/548 Queue

Definitions

  • The present application relates to the field of blockchain technology, and in particular to a data processing method, a device, and a storage medium.
  • Blockchain is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • MVCC: Multi-Version Concurrency Control.
  • Embodiments of the present application provide a data processing method, a device, and a storage medium, which help improve the verification efficiency of MVCC.
  • A first aspect of the embodiments of the present application provides a data processing method, applied to a server, including:
  • sending a first information instruction to a first thread, where the first information instruction instructs the first thread, if the data set corresponding to the M-th block has been read from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
  • if it is determined that the current task of the second thread is to obtain the P first tasks generated by the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue and generate P third tasks, where P is a positive integer less than or equal to N;
  • if it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, sending a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when execution of any one of the Q third tasks is completed, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • A second aspect of the embodiments of the present application provides a data processing apparatus, applied to a server, the apparatus including a first sending unit, a second sending unit, and a third sending unit, where:
  • the first sending unit is configured to send a first information instruction to the first thread, where the first information instruction instructs the first thread, if the data set corresponding to the M-th block has been read from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
  • the second sending unit is configured to send a second information instruction to the second thread when it is determined that the current task of the second thread is to obtain the P first tasks generated by the first queue from the data sets corresponding to the P blocks before the N blocks, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue and generate P third tasks, where P is a positive integer less than or equal to N;
  • the third sending unit is configured to send a third information instruction to the third thread if it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when execution of any one of the Q third tasks is completed, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • A third aspect of the embodiments of the present application provides a server. The server includes a processor, a communication interface, a memory, and one or more programs; the processor, the communication interface, and the memory are connected to each other; the memory is used to store a computer program that includes program instructions; and the processor is configured to invoke the program instructions to perform the method of the first aspect, that is, sending the first, second, and third information instructions to the first, second, and third threads under the conditions described above.
  • A fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores a computer program for electronic data exchange, where the computer program causes a computer to execute the method of the first aspect, that is, sending the first, second, and third information instructions to the first, second, and third threads under the conditions described above.
  • In the embodiments of the present application, database reads and writes can be separated into a read set database and a write set database, and a cache is introduced. Each thread can access a different interval while running, which enables multi-threaded database access and reduces the access pressure on the database. During MVCC verification, the cache can be accessed directly without frequent access to the database, which improves the verification efficiency of MVCC. In addition, these operations involve only CPU operations, which helps improve the performance of the blockchain.
  • FIG. 1A provides a schematic diagram of a system architecture for data processing according to an embodiment of the present application.
  • FIG. 1B provides a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 2 provides a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 3 provides a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 4 provides a schematic structural diagram of a server according to an embodiment of the present application.
  • FIG. 5 provides a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
  • This application may relate to the field of artificial intelligence technology, and may be applied to data processing scenarios based on blockchain.
  • For example, medical data can be stored in the blockchain. The medical data can include personal health records, prescriptions, inspection reports, and other data, and the medical data can then be processed through the blockchain.
  • The server mentioned in the embodiments of this application may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms.
  • The server may include, but is not limited to, a background server, a component server, a cloud server, a data processing system server, or a data processing software server. These are only examples and are not exhaustive.
  • FIG. 1A is a schematic diagram of a system architecture for data processing provided by an embodiment of the present application.
  • the embodiments of the present application are applied to a server in which a blockchain network can be deployed.
  • A blockchain is essentially a decentralized database: a series of data blocks linked to one another by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of its information (anti-counterfeiting) and to generate the next block.
  • The blockchain can include an underlying blockchain platform, a platform product service layer, and an application service layer.
  • The FiMax blockchain platform is a blockchain network built with S3C as its basic framework. S3C is a framework system composed of a blockchain solution module, a blockchain kernel module, a blockchain privacy protection module, and a blockchain network management module.
  • the above system architecture may include: a read set database, a write set database, a first queue, a second queue, a write set cache, and a multi-version concurrency control (Multi-Version Concurrency Control, MVCC) module.
  • The MVCC module is used for version verification of the transaction information in a block.
  • the above-mentioned block can be the basic unit structure in the blockchain network, and each block is composed of a block header and a block body; wherein, the block header storage structure
  • The block body is generally a tree structure used to record the transaction information of the block. The MVCC module is mainly used to verify whether the data version in that transaction information is correct: if the read set version used by the current transaction is not equal to the read set version in the current database, the transaction is considered outdated and is marked as an erroneous transaction.
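  • The version comparison described above can be sketched as follows. This is a minimal illustration rather than the application's implementation: the names `Transaction`, `read_set`, and `mvcc_check`, and the use of a plain dictionary as the database's version table, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    """Hypothetical transaction record; field names are illustrative."""
    tx_id: str
    read_set: dict  # key -> version observed when the transaction was simulated
    valid: bool = True


def mvcc_check(tx, db_versions):
    """Mark tx invalid if any read-set version differs from the database's."""
    for key, seen_version in tx.read_set.items():
        if db_versions.get(key) != seen_version:
            tx.valid = False  # outdated read: flag as an erroneous transaction
            return False
    tx.valid = True
    return True


db_versions = {"alice": 3, "bob": 7}
fresh = Transaction("tx1", {"alice": 3})
stale = Transaction("tx2", {"bob": 6})
results = [mvcc_check(fresh, db_versions), mvcc_check(stale, db_versions)]
```

  • Here `fresh` read the version the database still holds, while `stale` read an older version of `bob` and is therefore rejected.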
  • the read and write parts in the database can be separated, and the database can be accessed through multiple threads (two or more threads), and the write set data in the database can be stored in the write set cache.
  • the above data processing method may correspond to 3 threads, the above-mentioned read set database is used to complete the task of the first thread, the above-mentioned MVCC module is used to complete the task of the second thread, and the above-mentioned write set database is used to complete the task of the third thread.
  • The three threads can run at the same time and process the data of multiple blocks concurrently. The read set and write set operations can be understood as IO operations.
  • After the reads and writes of the database are separated, MVCC verification can be done by the CPU, and the above operations are carried out concurrently by three threads. This separates IO operations from CPU operations, which helps improve the performance of the blockchain network.
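  • The three-thread arrangement can be sketched with standard-library queues and threads. This is a simplified sketch under stated assumptions: the function `run_pipeline`, the dictionaries standing in for the read set database, write set cache, and write set database, and the `None` end-of-stream marker are all hypothetical, and the MVCC check itself is left as a placeholder comment.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker


def run_pipeline(read_set_db):
    first_queue = queue.Queue()    # reader -> validator
    second_queue = queue.Queue()   # validator -> writer
    write_set_cache = {}
    write_set_db = {}
    state = {"current_block_id": 0}

    def reader():
        # First thread: load each block's data set (IO) and enqueue a task.
        for block_id in sorted(read_set_db):
            first_queue.put((block_id, read_set_db[block_id]))
        first_queue.put(SENTINEL)

    def validator():
        # Second thread: CPU-only work; MVCC verification would run here.
        while (task := first_queue.get()) is not SENTINEL:
            block_id, data_set = task
            write_set_cache[block_id] = data_set
            second_queue.put(block_id)  # write event for the third thread
        second_queue.put(SENTINEL)

    def writer():
        # Third thread: persist the write set (IO) and advance the BlockID.
        while (block_id := second_queue.get()) is not SENTINEL:
            write_set_db[block_id] = write_set_cache[block_id]
            state["current_block_id"] = block_id

    threads = [threading.Thread(target=f) for f in (reader, validator, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return write_set_db, state["current_block_id"]


persisted, current_id = run_pipeline({1: {"a": 1}, 2: {"b": 2}, 3: {"c": 3}})
```

  • Because each stage only blocks on its own queue, the reader can already be loading a later block while the validator and writer work on earlier ones.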
  • FIG. 1B is a schematic flowchart of a data processing method provided by an embodiment of the present application, applied to a server, and the above method includes the following steps:
  • A first information instruction is sent to the first thread. The first information instruction instructs the first thread, if the data set corresponding to the M-th block has been read from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer.
  • The read set database may correspond to the read set database in the system architecture diagram shown in FIG. 1A. The database used for storing the information of a block may be divided into a read set database and a write set database, so that different threads can subsequently access it in different time periods. This relieves frequent access to the same database and reduces the access pressure on the database.
  • The first information instruction may be set by the user or by system default, which is not limited here; it is used to instruct the first thread to read the data set corresponding to a block from the read set database.
  • The M-th block can be any block that the first thread needs to handle within a certain period of time so that MVCC verification can be performed on it, and the data set corresponding to the M-th block can be read from the read set database. The data set can include at least one of the following: a BlockID, a data version number, data key-value pairs, and a reference count corresponding to the BlockID, which are not limited here. The data version number can be the database version number corresponding to the block.
  • Since the blocks in the blockchain network are arranged in a chain, each block can be numbered, and each BlockID can correspond to one block.
  • BlockIDs can be arranged from small to large. The smaller the BlockID, the earlier the block is processed by the first thread.
  • For the M-th block, the BlockID may be M; the BlockID is used to identify the block.
  • The first thread, the second thread, and the third thread can process data in parallel and together form one data processing flow; the blocks each thread handles at a given time may differ, but the processing steps may be the same.
  • For example, suppose there are seven blocks whose BlockIDs are 1, 2, 3, 4, 5, 6, and 7. In a first time period, if the first thread is processing block 6 and block 7, the second thread can at the same time process block 3, block 4, and block 5, while the third thread processes block 1 and block 2.
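  • The seven-block example can be reproduced with a small helper that reports which batch each thread holds at a given step. The function name `stage_assignments` and the grouping of blocks into batches are assumptions made for illustration only.

```python
def stage_assignments(batches, step, stages=3):
    """Return the batch each pipeline stage holds at a time step (None if idle)."""
    held = []
    for stage in range(stages):
        index = step - stage  # stage k lags k steps behind the first stage
        held.append(batches[index] if 0 <= index < len(batches) else None)
    return held


# Batches in the order they enter the pipeline.
batches = [[1, 2], [3, 4, 5], [6, 7]]
first, second, third = stage_assignments(batches, step=2)
```

  • At step 2 the first thread holds blocks 6 and 7, the second thread blocks 3 to 5, and the third thread blocks 1 and 2, matching the example above.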
  • After the first thread reads a block, the block can be placed in the first queue. The first queue can be used to store blocks and can provide functions needed for executing tasks, such as returning results and retrying tasks. For example, when the first queue receives a block sent by the first thread, it can return a result message to the subsequent thread, that is, the second thread, informing it that the block has been stored, so that the second thread can carry out its subsequent function, such as performing MVCC verification on the block.
  • A first task can then be generated; each block may correspond to one first task.
  • At this point the first queue stores the N blocks before the M-th block. These N blocks were sent to the first queue after being processed by the first thread, and N is a positive integer less than or equal to M.
  • If it is determined that the current task of the second thread is to obtain the P first tasks generated by the first queue from the data sets corresponding to the P blocks before the N blocks, a second information instruction is sent to the second thread. The second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue and generate P third tasks, where P is a positive integer less than or equal to N.
  • Multi-Version Concurrency Control (MVCC) verification is a concurrency control method used to verify whether the data version of the transaction data included in a block is correct. Specifically, it checks whether the read set of a transaction is consistent with the version in the current database ledger (that is, unchanged). If there is no change, the modification of the data in the transaction's write set is valid, the transaction is marked as valid, and the write set of the transaction is updated to the specific database (such as the write set database).
  • The second information instruction can be set by the user or by system default, which is not limited here. After the server detects a preset trigger condition, it can send the second information instruction to the second thread to instruct the second thread to perform MVCC verification on the block. The preset trigger condition can likewise be set by the user or by system default; for example, it can be that the current task of the second thread is to obtain the first task generated by the first queue. Alternatively, when the first thread receives the first information instruction, the second thread may be triggered to start running at the same time to carry out its function.
  • The second thread can obtain the block directly from the first queue, perform MVCC verification on it, and write it into the write set cache after the verification is completed.
  • The write set cache can be set by the user or by system default, which is not limited here; it can be used to store information such as the data set corresponding to a block.
  • A verification result can be generated and written into the write set in the database to record the verification result for the block.
  • The second queue can be set by the user or by system default, which is not limited here. The second queue can have the same structure as the first queue; it can be used to store blocks verified by the second thread and can provide blocks to the write set cache.
  • The second queue can generate a corresponding third task for each block. The third task notifies the write set database that processing by the previous thread is complete and provides the write set database with information such as the data set corresponding to the block, which helps the write set database update that information.
  • The MVCC module shown in FIG. 1A can be used to perform the MVCC verification of the P blocks.
  • Each block can package at least one transaction, and each transaction corresponds to transaction information. The minimum unit of verification can be the transaction information.
  • The following step may be further included: before block A is written to the write set cache, if there is a block B whose data set is associated with the data set of block A, the reference count in the data set of block B in the write set cache is increased by 1 on behalf of block A, where block B is any block written to the write set cache before block A.
  • A previously processed block can affect the block currently being processed. For example, when transaction information in the current block is undergoing MVCC verification, transaction information in previously processed blocks may be needed as reference data to verify the transaction status of the target block (for example, balance information or consumption information). Other blocks associated with the current block may therefore affect its MVCC verification, so before MVCC verification is performed, it can be judged whether the current block is associated with previous blocks.
  • The judgment of data association can be made through the reference count in the data set; the reference count is used to judge whether the write set cache contains a block whose data is associated with the corresponding block.
  • When the second thread processes the current block, if the reference count of any block that has undergone MVCC verification by the second thread, or that has been processed by the third thread, is not 0, this indicates that the data set of that block is associated with the data set of the block currently being processed. During MVCC verification, the data sets of the associated blocks can then be called directly to help the block currently being processed complete MVCC verification.
  • Block A is any one of the P blocks. After MVCC verification has been performed on block A and before block A is written into the write set cache, if a block B is found whose data set is associated with the data set of block A, block A can increase the reference count of block B by 1, which indicates a data association between block A and block B. At the same time, block A can learn from the reference count of block B that the data set of block B is associated with its own data set. Block B can be a block before block A; in the embodiments of the present application, only two blocks are used as an example for description.
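  • The reference-count update on write can be sketched as below. The source does not say how two data sets are judged to be associated; this sketch assumes, purely for illustration, that they are associated when they share a key. The class `WriteSetCache` and its `put` method are hypothetical names.

```python
class WriteSetCache:
    """Hypothetical in-memory write set cache keyed by BlockID."""

    def __init__(self):
        self.entries = {}  # block_id -> {"data": dict, "ref_count": int}

    def put(self, block_id, data):
        # Before storing the new block, increase the reference count of every
        # earlier cached block whose data set shares a key with it
        # (shared keys are an assumed stand-in for "associated").
        for earlier_id, entry in self.entries.items():
            if earlier_id < block_id and entry["data"].keys() & data.keys():
                entry["ref_count"] += 1
        self.entries[block_id] = {"data": dict(data), "ref_count": 0}


cache = WriteSetCache()
cache.put(1, {"alice": 10})
cache.put(2, {"carol": 5})
cache.put(3, {"alice": 8})  # shares "alice" with block 1
```

  • After these calls only block 1 carries a non-zero reference count, because block 3 depends on it.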
  • Performing MVCC verification on the P blocks may include the following steps: based on the data set corresponding to each block, determining the data version number corresponding to each piece of transaction information in the block, and comparing that data version number with a preset data version number; if they are the same, MVCC verification of the block succeeds; otherwise, MVCC verification of the block fails. The block can be any one of the P blocks.
  • The preset data version number may be set by the user or by system default, which is not limited here. For example, it can be set by background staff to a value such as 100 or 200.
  • If MVCC verification of a transaction succeeds, the transaction is valid, and the block corresponding to the transaction can be stored in the local chain. For the result data of a successful verification, the second queue can generate a third task that informs the write set database that verification succeeded, and the result data is stored and updated in the write set database.
  • The following steps may be further included: determining that the data version number of the read set database, at the time block A accesses its corresponding data set from the read set database, is a first version number; if a block E associated with block A is determined according to the first version number, incrementing the reference count of block E by 1; performing MVCC verification on block A; and, when MVCC verification of block A is completed, decrementing the reference count of block E by 1.
  • The association between blocks can be established based on the database version number: blocks with data associations can be placed in the same database in advance, or the database version number can be written into the multiple blocks that have data associations.
  • The first version number may be the data version number of the read set database corresponding to block A, and the associated block E may be determined according to that data version number.
  • For example, when block 100, whose BlockID is 100, accesses data from the database, the version number of the database is 60; block 100 may then be associated with block 61, which corresponds to database version number 61. In subsequent steps, the reference count of block 61 can be incremented by 1, and it is decremented by 1 only after block 100 completes MVCC verification. At that point the association between block 100 and block 61 is released, that is, block 100 and block 61 are no longer associated.
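  • The increment-then-decrement pattern around verification resembles a scoped reference, which can be sketched with a context manager. The helper name `pinned` and the dictionary of reference counts are assumptions; how the associated block is derived from the database version number is not modeled here and is passed in directly.

```python
from contextlib import contextmanager


@contextmanager
def pinned(ref_counts, block_id):
    """Hold a reference on block_id for the duration of the with-body."""
    ref_counts[block_id] = ref_counts.get(block_id, 0) + 1
    try:
        yield
    finally:
        ref_counts[block_id] -= 1  # released even if verification raises


ref_counts = {61: 0}
observed = []
with pinned(ref_counts, 61):
    # MVCC verification of block 100 would run here.
    observed.append(ref_counts[61])
observed.append(ref_counts[61])
```

  • Using `try`/`finally` keeps the count balanced even when verification fails partway, so a block is not left permanently referenced in the cache.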
  • If it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, a third information instruction is sent to the third thread. The third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when execution of any one of the Q third tasks is completed, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • The current BlockID is the BlockID of the block currently being processed by the third thread.
  • While the second thread handles the P blocks, the third thread can concurrently process the Q blocks whose BlockIDs precede those of the P blocks; the Q blocks are a batch of blocks that the second thread finished processing before the P blocks.
  • The third information instruction can be set by the user or by system default, which is not limited here. When the write set database acquires the Q third tasks in the second queue, the server can be triggered to send the third information instruction to the third thread.
  • Each time the third thread finishes processing a block, it can update the current BlockID in the write set cache to the BlockID of that block. This identifies the number of the block currently processed and informs the other threads of it, which helps the threads cooperate. In this way, the current BlockID in the write set cache is kept up to date.
  • the data set includes: a reference count
  • the above-mentioned method may further include the following steps: if the reference count corresponding to block A is not 0, determining that there is a block whose data set is associated with the data set of block A, wherein block A is any one of the P blocks; if the reference count corresponding to block A is 0, determining that no block has a data set associated with the data set of block A, and, after the current BlockID is updated to the BlockID of block A, deleting the data set corresponding to block A from the write set cache.
  • the above-mentioned data set may include at least one of the following: the BlockID, the version number, the data key-value pairs and the reference count corresponding to the BlockID, etc., which are not limited here; the BlockID may be the number of the block to be processed, and the version number may be the database version number.
  • the reference count of block A can be monitored. If the reference count is not 0, it means that after the MVCC verification by the second thread there may still be data associations between other blocks and the data set of block A, so block A can be retained; otherwise, if the reference count corresponding to block A is 0, no block is associated with the data set of block A. Because each block may in general contain thousands of transactions, the memory space required may be large; therefore, to save cache memory, after the current BlockID is updated to the BlockID of block A, the data set corresponding to block A can be deleted from the write set cache.
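The retain-or-delete rule above can be sketched as a single step of the third thread. This is a hedged sketch only: the `WriteSetCache` class, its `entries` layout, the `commit_block` function and the use of a plain `dict` as the write set database are all assumptions made for illustration.

```python
class WriteSetCache:
    """Hypothetical write set cache holding the current BlockID and per-block data sets."""

    def __init__(self):
        self.current_block_id = 0
        self.entries = {}  # BlockID -> {"data": dict, "ref_count": int}

    def put(self, block_id, data, ref_count=0):
        self.entries[block_id] = {"data": data, "ref_count": ref_count}


def commit_block(cache, database, block_id):
    """Persist one block, advance the current BlockID, and drop the entry if unreferenced.

    Returns True if the data set was removed from the cache, False if it was retained.
    """
    entry = cache.entries[block_id]
    database.update(entry["data"])      # write the data set to the write set database
    cache.current_block_id = block_id   # tell the other threads which block was processed
    if entry["ref_count"] == 0:
        del cache.entries[block_id]     # no other block depends on this data set
        return True
    return False                        # still referenced, so keep it cached
```

For example, committing a block with a reference count of 0 removes it from the cache at once, while a block with a reference count of 1 stays cached until the referencing block finishes its verification.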
  • the method may further include the following steps: if a cache reclamation mechanism is triggered when a task of the first thread, the second thread or the third thread is executed, determining the multiple blocks that exist in the write set cache and the multiple reference counts corresponding to the multiple blocks, where each block corresponds to one reference count; if the BlockID corresponding to block C is smaller than the current BlockID and the reference count corresponding to block C is 0, deleting block C from the write set cache, wherein block C is any one of the multiple blocks.
  • a cache recycling mechanism can also be set in the above server.
  • the cache recycling mechanism can be set by the system by default or by the user, which is not limited here; when any thread triggers the cache recycling mechanism, all the blocks in the write set cache and the reference count corresponding to each block can be determined, and the data sets of the blocks stored in the write set cache can be processed according to the reference counts.
  • if the BlockID corresponding to the above block C is smaller than the current BlockID, it means that block C has completed processing by the above three threads. If the reference count of block C is 0, no block is associated with the data set of block C, so block C can be deleted from the above write set cache, which saves memory space in the write set cache and improves the efficiency of subsequent data acquisition from it.
  • the following step may be further included: if the data set of block C is associated with the data set corresponding to block D, then after block C is deleted, the reference count corresponding to block D is decremented by 1, where block C and block D are any two blocks in the write set cache.
  • the increments and decrements of the above reference counts are in fact block-to-block operations. For example, if block C finds during MVCC verification that there is an association between the data set of block D and block C, block C can increment the reference count corresponding to block D by 1. In this way, in the subsequent process, block C can obtain the reference counts corresponding to other blocks: a count that is not 0 indicates an association, and a count of 0 indicates no association. The reference counts of other blocks obtained by each block are relative to that block. As another example, if the reference count of a block is 5, there may be 5 blocks associated with it.
  • if the reference count of block C is 0 but block C is associated with the data set of block D, that is, block C found during its MVCC verification that the data of block D is related to its own and has already used the data set corresponding to block D, then after block C is deleted the reference count corresponding to block D can be decremented by 1.
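The cache reclamation rule described above (delete block C when its BlockID is smaller than the current BlockID and its reference count is 0, then decrement the count of any block D it was associated with) can be sketched as below. The `reclaim` function, the `entries` layout and the `pins` field recording which blocks an entry holds references on are assumptions of this sketch; the source does not say how associations are stored. Visiting blocks from newest to oldest is also an assumption, chosen so that releasing a reference can make an older block reclaimable in the same pass.

```python
def reclaim(entries, current_block_id):
    """Delete fully processed, unreferenced blocks and release the references they held.

    entries maps BlockID -> {"ref_count": int, "pins": set of BlockIDs this block references}.
    Returns the list of deleted BlockIDs in deletion order.
    """
    removed = []
    # sorted() builds a separate list, so deleting from entries while looping is safe.
    for block_id in sorted(entries, reverse=True):
        entry = entries[block_id]
        if block_id < current_block_id and entry["ref_count"] == 0:
            del entries[block_id]
            removed.append(block_id)
            for pinned_id in entry["pins"]:
                if pinned_id in entries:
                    entries[pinned_id]["ref_count"] -= 1  # block D loses one reference
    return removed
```

With a current BlockID of 61, a cache holding block 60 (count 0, associated with block 58) and block 58 (count 1) would first drop block 60 and then, because the count of block 58 has fallen to 0, drop block 58 as well.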
  • the data processing method described in the embodiment of the present application is applied to the server and can send a first information instruction to the first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, where N is a positive integer less than or equal to M and M is a positive integer; if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, a second information instruction is sent to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N; if it is determined that the current task of the third thread is to obtain the Q third tasks in the second queue through the write set database, a third information instruction is sent to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; when any one of the Q third tasks completes, the current BlockID is updated to the BlockID corresponding to the block of that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • the reading and writing of the database can be separated into a read-set database and a write-set database, and the concept of caching is introduced; further, when each thread is running, it can access different intervals correspondingly, realizing a multi-threaded database.
  • FIG. 2 is an exemplary flowchart of a data processing method disclosed in an embodiment of the present application, applied to a server, which, as shown in the figure, may include the read set database, write set cache, first queue, second queue, MVCC module and write set database shown in FIG. 1A.
  • the read set database is processing block 66, whose BlockID is 66; at the same time, the write set cache is processing block 57, block 58, block 59, block 60 and block 61, whose BlockIDs are 57, 58, 59, 60 and 61, respectively; the first queue is processing blocks 63, 64 and 65, whose BlockIDs are 63, 64 and 65, respectively; the second queue is processing blocks 59, 60 and 61, whose BlockIDs are 59, 60 and 61, respectively; the MVCC module is processing block 62, whose BlockID is 62; and the write set database is processing block 58, whose BlockID is 58.
  • the above-mentioned first thread can read the data set corresponding to block 66 from the read set database and then generate, through the first queue, three first tasks from the data sets corresponding to block 63, block 64 and block 65.
  • the second thread can perform MVCC verification on block 62; at this time, the data sets corresponding to block 57, block 58, block 59, block 60 and block 61 are being written in the write set cache.
  • the second queue can hold blocks 59, 60 and 61, which have been verified by the MVCC module; at this time, if the execution of the third task corresponding to block 58 is completed, the data set corresponding to block 58 can be written into the write set database.
  • the previously processed block is block 57, and the current BlockID (57) is updated to the BlockID of block 58, which is 58.
  • the reading and writing of the database are separated into a read set database and a write set database, and the concept of a cache is introduced; further, each thread can access a different interval while running, which realizes multi-threaded database access and helps reduce the access pressure on the database; at the same time, when MVCC verification is performed, the cache can be accessed directly to obtain the data set corresponding to a block, without frequent access to the database, which helps improve the verification efficiency of MVCC; in addition, the above operations involve only CPU operations, which helps improve the performance of the blockchain.
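The three-thread arrangement summarized above can be sketched with two FIFO queues. This is an illustrative sketch under stated assumptions, not the disclosed implementation: `run_pipeline`, the `SENTINEL` end-of-stream marker, and the use of plain dictionaries for the write set cache and write set database are hypothetical, and the MVCC verification itself is left as a placeholder comment.

```python
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker, not part of the source


def run_pipeline(blocks):
    """Run a reader, a verifier and a writer thread over (BlockID, data set) pairs."""
    first_queue = queue.Queue()
    second_queue = queue.Queue()
    write_set_cache = {}
    write_set_db = {}
    state = {"current_block_id": None}

    def first_thread():
        # Stands in for reading data sets from the read set database.
        for block_id, data in blocks:
            first_queue.put((block_id, data))
        first_queue.put(SENTINEL)

    def second_thread():
        while True:
            task = first_queue.get()
            if task is SENTINEL:
                second_queue.put(SENTINEL)
                break
            block_id, data = task
            # MVCC verification would run here, reading only from the cache.
            write_set_cache[block_id] = data
            second_queue.put(block_id)  # write event becomes a task for the writer

    def third_thread():
        while True:
            block_id = second_queue.get()
            if block_id is SENTINEL:
                break
            write_set_db.update(write_set_cache[block_id])
            state["current_block_id"] = block_id

    threads = [threading.Thread(target=t) for t in (first_thread, second_thread, third_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return write_set_db, state["current_block_id"]
```

Because each queue has a single producer and a single consumer, blocks leave the pipeline in the order they entered it, which is what allows the current BlockID to advance monotonically in this sketch.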
  • FIG. 3 is an exemplary flowchart of a data processing method disclosed in an embodiment of the present application, applied to a server, and the data processing method may include the following steps:
  • the queue generates N first tasks with data sets corresponding to N blocks before the Mth block.
  • a second information instruction is sent to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N.
  • the current BlockID is the BlockID corresponding to the block of any third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • if the cache recycling mechanism is triggered when a task of the first thread, the second thread or the third thread is executed, the multiple blocks that exist in the write set cache and the multiple reference counts corresponding to the multiple blocks are determined, where each block corresponds to one reference count.
  • if the BlockID corresponding to block C is smaller than the current BlockID and the reference count corresponding to block C is 0, block C is deleted from the write set cache, wherein block C is any one of the multiple blocks.
  • the reference count corresponding to the block D is decremented by 1, and the block C and the block D are any two blocks in the write set cache.
  • the data processing method described in the embodiment of the present application is applied to the server and can send a first information instruction to the first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, where N is a positive integer less than or equal to M and M is a positive integer; if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, a second information instruction is sent to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N; if it is determined that the current task of the third thread is to obtain the Q third tasks in the second queue through the write set database, a third information instruction is sent to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID; when any one of the Q third tasks completes, the current BlockID is updated to the BlockID corresponding to the block of that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block; if the cache recycling mechanism is triggered when the tasks of the first thread, the second thread or the third thread
  • the reading and writing of the database can be separated into a read-set database and a write-set database, and the concept of caching is introduced; further, when each thread is running, it can access different intervals correspondingly, realizing a multi-threaded database.
  • FIG. 4 is a schematic structural diagram of a server provided by an embodiment of the present application. As shown in FIG. 4, it includes a processor, a communication interface, a memory, and one or more programs.
  • the processor, the communication interface and the memory are interconnected, wherein the memory is used to store a computer program, the computer program includes program instructions, and the processor is configured to invoke the program instructions of the one or more programs to perform the following steps:
  • the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
  • if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, a second information instruction is sent to the second thread;
  • the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N;
  • if it is determined that the current task of the third thread is to obtain the Q third tasks in the second queue through the write set database, a third information instruction is sent to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; when any one of the Q third tasks completes, the current BlockID is updated to the BlockID corresponding to the block of that third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • the server described in the embodiment of the present application can send a first information instruction to the first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
  • the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks; and instruct the second thread to write the data sets corresponding to the P blocks into the write set cache respectively, and generate P write events; Synchronize P write events to the second queue, and generate P third tasks, where P is a positive integer less than or equal to N; if it is determined that the current task of the third thread
  • the reading and writing of the database can be separated into a read-set database and a write-set database, and the concept of caching is introduced; further, when each thread is running, it can access different intervals correspondingly, realizing multi-threaded database access, which is conducive to reducing the access pressure of the database; at the same time, when MVCC verification is performed, the above cache can be directly accessed without frequent access to the database, which is conducive to improving the verification efficiency of MVCC; in addition, the above operations only involve CPU operations, which is conducive to improving the performance of the blockchain.
  • the data set includes: a reference count
  • the program is used to perform instructions for the following steps:
  • if the reference count corresponding to block A is 0, it is determined that no block has a data set associated with the data set of block A, and after the current BlockID is updated to the BlockID of block A, the data set corresponding to block A is deleted from the write set cache.
  • the program is used to execute the instructions of the following steps:
  • before block A is written into the write set cache, if the data set of a block B is associated with the data set of block A, the reference count in the data set of block B in the write set cache is incremented by 1 through block A, wherein block B is any block written into the write set cache before block A.
  • the program is used to execute instructions for the following steps:
  • each block corresponds to a reference count
  • if the BlockID corresponding to block C is smaller than the current BlockID and the reference count corresponding to block C is 0, block C is deleted from the write set cache, wherein block C is any one of the multiple blocks.
  • the program is also used to execute instructions for the following steps:
  • the reference count corresponding to block D is decremented by 1, and block C and block D are any two blocks in the write set cache.
  • the data set further includes: a data version number, and in the aspect of performing the MVCC verification on the P blocks, the program is used to execute the instructions of the following steps:
  • the server includes corresponding hardware structures and/or software modules for executing each function.
  • the present application can be implemented in hardware or in the form of a combination of hardware and computer software, in combination with the units and algorithm steps of each example described in the embodiments provided herein. Whether a function is performed by hardware or computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functionality using different methods for each particular application, but such implementations should not be considered beyond the scope of this application.
  • the server may be divided into functional units according to the foregoing method examples.
  • each functional unit may be divided corresponding to each function, or two or more functions may be integrated into one processing unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units. It should be noted that the division of units in the embodiments of the present application is schematic, and is only a logical function division, and other division methods may be used in actual implementation.
  • FIG. 5 is a schematic structural diagram of a data processing apparatus disclosed in an embodiment of the present application, applied to a server, and the apparatus includes: a first sending unit 501, a second sending unit 502, and a third sending unit 503, wherein,
  • the first sending unit 501 is configured to send a first information instruction to the first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
  • the second sending unit 502 is configured to, if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, send a second information instruction to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, wherein P is a positive integer less than or equal to N;
  • the third sending unit 503 is configured to, if it is determined that the current task of the third thread is to obtain the Q third tasks in the second queue through the write set database, send a third information instruction to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; when any one of the Q third tasks completes, the current BlockID is updated to the BlockID corresponding to the block of that third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • the data processing apparatus described in the embodiment of the present application is applied to the server and can send a first information instruction to the first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through the first queue from the data sets corresponding to the N blocks before the Mth block, where N is a positive integer less than or equal to M and M is a positive integer; if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, a second information instruction is sent to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache respectively and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N; if it is determined that the current task of the third thread is to obtain the Q third tasks in the second queue through the write set database, a third information instruction is sent to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; when any one of the Q third tasks completes, the current BlockID is updated to the BlockID corresponding to the block of that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  • the reading and writing of the database can be separated into a read-set database and a write-set database, and the concept of caching is introduced; further, when each thread is running, it can access different intervals correspondingly, realizing a multi-threaded database.
  • the data set further includes: a data version number
  • the second sending unit 502 is specifically configured to: determine that, when block A accesses its corresponding data set from the read set database, the data version number of the read set database is the first version number; determine, according to the first version number, the block E associated with block A, and add 1 to the reference count of block E; perform the MVCC verification on block A; and, when the MVCC verification of block A is complete, decrement the reference count of block E by 1.
  • Embodiments of the present application further provide a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program enables a computer to execute some or all of the steps of any data processing method described in the foregoing method embodiments.
  • the computer program may include program instructions, which, when executed by the processor, cause the processor to execute part or all of the steps of the above method, which will not be repeated here.
  • the storage medium involved in this application such as a computer-readable storage medium, may be non-volatile or volatile.
  • the embodiments of the present application further provide a computer program product
  • the computer program product includes a non-transitory computer-readable storage medium storing a computer program
  • the computer program is operable to cause a computer to execute some or all of the steps of any data processing method described in the foregoing method embodiments.
  • the computer program product may be a software installation package.
  • the disclosed apparatus may be implemented in other manners.
  • the apparatus embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and there may be other division methods in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, and may be in electrical or other forms.
  • the units described as separate components may or may not be physically separated, and components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware, and can also be implemented in the form of software program modules.
  • the integrated unit if implemented in the form of a software program module and sold or used as a stand-alone product, may be stored in a computer readable memory.
  • the technical solution of the present application can be embodied in the form of a software product in essence, or the part that contributes to the prior art, or all or part of the technical solution, and the computer software product is stored in a memory.
  • a computer device which may be a personal computer, a server, or a network device, etc.
  • the aforementioned memory includes: U disk, read-only memory (ROM), random access memory (random access memory, RAM), mobile hard disk, magnetic disk or optical disk and other media that can store program codes.

Abstract

The present application relates to the field of artificial intelligence and blockchain technology, and in particular to a data processing method and device, and a storage medium. The method and the device are applied to a server. In the method, the reading and writing of a database are split into a read set database and a write set database respectively, and the concept of a cache is introduced; thus, when each thread is running, it may access a different interval, thereby realizing multi-threaded database access and helping to reduce database access pressure; meanwhile, when performing an MVCC verification, the cache can be accessed directly without frequently accessing the database, thereby helping to improve the verification efficiency of the MVCC; in addition, the operations described above involve only CPU operations, thereby improving the performance of the blockchain.

Description

Data processing method, device and storage medium
This application claims priority to the Chinese patent application No. 202010913919.5, entitled "Data Processing Method, Device and Storage Medium", filed with the China Patent Office on September 3, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of blockchain technology, and in particular to a data processing method, device and storage medium.
Background
Blockchain is a new application model of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. The inventor realized that in blockchain applications, when concurrent verification is implemented through Multi-Version Concurrency Control (MVCC), the database often has to be read or written multiple times, which puts excessive access pressure on the database and in turn leads to low MVCC verification efficiency.
Summary
Embodiments of the present application provide a data processing method, device and storage medium, which help improve the verification efficiency of MVCC.
A first aspect of the embodiments of the present application provides a data processing method, applied to a server, including:
sending a first information instruction to a first thread, where the first information instruction is used to instruct the first thread, if it reads the data set corresponding to the Mth block from the read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the Mth block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
if it is determined that the current task of a second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, where the second information instruction is used to instruct the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache respectively and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, wherein P is a positive integer less than or equal to N;
if it is determined that the current task of a third thread is to obtain the Q third tasks in the second queue through the write set database, sending a third information instruction to the third thread, where the third information instruction is used to instruct the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; when any one of the Q third tasks completes, updating the current BlockID to the BlockID corresponding to the block of that third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
A second aspect of the embodiments of the present application provides a data processing apparatus, applied to a server, the apparatus including a first sending unit, a second sending unit, and a third sending unit, where:
the first sending unit is configured to send a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from a read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
the second sending unit is configured to, if it is determined that the current task of a second thread is to obtain P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, send a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
the third sending unit is configured to, if it is determined that the current task of a third thread is to obtain, through a write set database, Q third tasks in the second queue, send a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; and when any one of the Q third tasks finishes executing, to update the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
A third aspect of the embodiments of the present application provides a server. The server includes a processor, a communication interface, a memory, and one or more programs. The processor, the communication interface, and the memory are connected to one another; the memory is configured to store a computer program that includes program instructions; and the processor is configured to invoke the program instructions to perform the following method:
sending a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from a read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
if it is determined that the current task of a second thread is to obtain P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
if it is determined that the current task of a third thread is to obtain, through a write set database, Q third tasks in the second queue, sending a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; and when any one of the Q third tasks finishes executing, updating the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium that stores a computer program for electronic data exchange, where the computer program causes a computer to perform the following method:
sending a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from a read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
if it is determined that the current task of a second thread is to obtain P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
if it is determined that the current task of a third thread is to obtain, through a write set database, Q third tasks in the second queue, sending a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and obtain the current BlockID in the write set cache; and when any one of the Q third tasks finishes executing, updating the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
By implementing the embodiments of the present application, database reads and writes can be separated into a read set database and a write set database, and a cache is introduced. As a result, each of the above threads can access a different region while running, which achieves multi-threaded database access and helps reduce access pressure on the database. In addition, MVCC verification can access the cache directly instead of frequently accessing the database, which helps improve MVCC verification efficiency. Moreover, the above operation involves only CPU operations, which helps improve blockchain performance.
Description of the Drawings
FIG. 1A is a schematic diagram of a system architecture for data processing according to an embodiment of the present application;
FIG. 1B is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of a data processing method according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of a server according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without creative effort fall within the protection scope of the present application.
For a better understanding of the embodiments of the present application, the method to which they apply is introduced below.
The present application may relate to the field of artificial intelligence and may be applied to blockchain-based data processing scenarios. For example, it may be applied to medical data processing in digital healthcare: medical data such as personal health records, prescriptions, and examination reports can be stored in a blockchain, and the medical data in that blockchain can then be processed.
The server mentioned in the embodiments of the present application may be an independent server, or may be a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a Content Delivery Network (CDN), and big data and artificial intelligence platforms. For example, the server may include, but is not limited to, a back-end server, a component server, a cloud server, a data processing system server, or a data processing software server. These are examples only and are not exhaustive.
Refer to FIG. 1A, which is a schematic diagram of a system architecture for data processing according to an embodiment of the present application.
The embodiments of the present application are applied to a server on which a blockchain network can be deployed. Blockchain is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
The embodiments of the present application may specifically be applied to the FiMax blockchain platform, a blockchain network built on the S3C framework. S3C is a framework system composed of a blockchain solution module, a blockchain kernel module, a blockchain privacy protection module, and a blockchain network management module.
The system architecture may include a read set database, a write set database, a first queue, a second queue, a write set cache, and a Multi-Version Concurrency Control (MVCC) module.
The MVCC module is used to verify the version of the transaction information in a block. A block may be the basic unit of a blockchain network, and each block consists of a block header and a block body: the block header stores structured data, and the block body is generally a tree structure that records the transaction information of the block. The MVCC module mainly verifies whether the data version corresponding to the transaction information is correct. If the read set version used by the current transaction is not equal to the read set version in the current database, the transaction is considered stale and is marked as an erroneous transaction.
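The version check described above can be sketched in Python as follows. This is only an illustrative sketch: the function name `validate_transaction` and the dictionary layout (key to version number) are assumptions made for the example and do not appear in the application.

```python
from typing import Dict


def validate_transaction(read_set: Dict[str, int],
                         db_versions: Dict[str, int]) -> bool:
    """Return True if every key read by the transaction still has the
    same version in the database; otherwise the transaction is stale.

    Both arguments map a key to a version number (an assumed layout).
    """
    for key, version_read in read_set.items():
        # A missing key or a different version means the read is outdated.
        if db_versions.get(key) != version_read:
            return False
    return True
```

A transaction for which this function returns False would be marked as erroneous, in line with the paragraph above.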
In addition, in the embodiments of the present application, the read and write parts of the database can be separated and the database can be accessed by multiple threads (two or more). The write set data in the database is stored in the write set cache, so MVCC version verification can access the write set cache directly without waiting for the database to populate the cache. This helps improve database write performance and avoids data blocking during writes.
The data processing method may use three threads: the read set database is used to complete the task of the first thread, the MVCC module is used to complete the task of the second thread, and the write set database is used to complete the task of the third thread. The three threads can run at the same time and process the data of multiple blocks concurrently. The read set and write set operations can be regarded as IO operations; once database reads and writes are separated, MVCC verification can be performed by the CPU. Running these operations on three concurrent threads separates IO operations from CPU operations, which helps improve the performance of the blockchain network.
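The three-thread arrangement can be sketched as a small producer/consumer pipeline. The sketch below assumes Python's standard `queue.Queue` for the first and second queues and plain dictionaries for the databases and the write set cache; the function name `run_pipeline`, the sentinel object, and the omitted validation step are all hypothetical simplifications rather than the application's implementation.

```python
import queue
import threading

_DONE = object()  # hypothetical end-of-stream marker


def run_pipeline(read_set_db):
    """read_set_db maps BlockID -> data set; returns the write set database."""
    first_queue = queue.Queue()
    second_queue = queue.Queue()
    write_set_cache = {}
    write_set_db = {}

    def reader():
        # First thread (IO): read blocks in ascending BlockID order.
        for block_id in sorted(read_set_db):
            first_queue.put((block_id, read_set_db[block_id]))
        first_queue.put(_DONE)

    def validator():
        # Second thread (CPU): MVCC verification would happen here.
        while True:
            task = first_queue.get()
            if task is _DONE:
                second_queue.put(_DONE)
                return
            block_id, data_set = task
            write_set_cache[block_id] = data_set  # write to the cache
            second_queue.put(block_id)            # write event -> third task

    def writer():
        # Third thread (IO): persist cached data sets to the write set database.
        while True:
            block_id = second_queue.get()
            if block_id is _DONE:
                return
            write_set_db[block_id] = write_set_cache[block_id]

    threads = [threading.Thread(target=fn) for fn in (reader, validator, writer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return write_set_db
```

Because each queue is first-in, first-out and each stage has a single worker, blocks reach the write set database in ascending BlockID order in this sketch.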
Refer to FIG. 1B, which is a schematic flowchart of a data processing method according to an embodiment of the present application. The method is applied to a server and includes the following steps:
101. Send a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from the read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer.
The read set database may correspond to the read set database in the system architecture diagram shown in FIG. 1A. The database that stores the information corresponding to blocks may be split into a read set database and a write set database so that different threads can access them at different times. This avoids frequent access to a single database and reduces database access pressure.
The first information instruction may be set by the user or by system default, which is not limited here. The first information instruction instructs the first thread to read the data set corresponding to a block from the read set database.
The M-th block may be any block on which the first thread needs MVCC verification to be performed within a given time period. The data set corresponding to the M-th block can be read from the read set database. The data set may include at least one of the following: a BlockID, a data version number, the data key-value pairs corresponding to the BlockID, a reference count, and so on, which is not limited here. The data version number may be the database version number corresponding to the block.
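One possible in-memory shape for such a data set is sketched below. The class name `BlockDataSet` and the field names and types are assumptions chosen for illustration; the application only lists the kinds of items a data set may contain.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class BlockDataSet:
    """Hypothetical container for the items a block's data set may hold."""
    block_id: int                      # identifies the block
    data_version: int                  # database version number for the block
    key_values: Dict[str, str] = field(default_factory=dict)  # key-value pairs
    ref_count: int = 0                 # number of later blocks related to this one
```

For example, `BlockDataSet(block_id=7, data_version=100)` creates a data set for block 7 with no key-value pairs and a reference count of 0.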
Because the blocks in a blockchain network are arranged in a chain, each block can be numbered, and each BlockID corresponds to one block. BlockIDs may be arranged in ascending order, and the smaller the BlockID, the earlier the block is processed by the first thread. For example, for the M-th block the BlockID may be M; the BlockID identifies the block.
While the first thread is working, the second thread and the third thread can process data in parallel. The first, second, and third threads together form one data processing flow; each thread may handle different blocks, but the processing steps are the same. For example, consider 7 blocks with BlockIDs 1, 2, 3, 4, 5, 6, and 7. In a first time period, if the first thread is processing blocks 6 and 7, the second thread can simultaneously process blocks 3, 4, and 5, and the third thread can simultaneously process blocks 1 and 2. The smaller a block's BlockID, the earlier it is processed. That is, in the time period before the first time period, the first thread was processing blocks 3, 4, and 5, the second thread may have been processing block 2, and the third thread may have been processing block 1, and so on.
After the first thread reads the data set corresponding to a block, it can place the block in the first queue. The first queue is used to store blocks and can provide the functions needed to execute tasks, such as returning results and retrying tasks. For example, after the first queue receives a block sent by the first thread, it can return a result message to the subsequent thread, that is, the second thread, to inform it that the block has been stored, so that the second thread can carry out subsequent functions such as MVCC verification of the block. A first task is generated each time the first thread places a block in the first queue, and each block corresponds to one first task.
Because the first thread works concurrently with the other threads while it processes the M-th block, and each thread may run at a different speed, all blocks in the first queue at that moment are blocks that the first thread has already finished processing. The first queue therefore holds the N blocks before the M-th block, which the first thread sent to the first queue after processing them, where N is a positive integer less than or equal to M.
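The relationship between the N first tasks held in the first queue and the P first tasks taken by the second thread can be sketched as follows. The classes `FirstTask` and `FirstQueue` and their methods are hypothetical names for this example; result return and task retry are omitted.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FirstTask:
    block_id: int
    data_set: dict


class FirstQueue:
    """Hypothetical first queue: one first task per block placed in it."""

    def __init__(self):
        self._tasks = deque()

    def put_block(self, block_id, data_set):
        # Placing a block in the queue generates one first task for it.
        task = FirstTask(block_id, data_set)
        self._tasks.append(task)
        return task

    def take(self, p):
        # Hand over up to p of the oldest first tasks (P <= N).
        count = min(p, len(self._tasks))
        return [self._tasks.popleft() for _ in range(count)]

    def __len__(self):
        return len(self._tasks)
```

With three blocks queued (N = 3), `take(2)` returns the first tasks for the two oldest blocks and leaves one task in the queue.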
102. If it is determined that the current task of the second thread is to obtain P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, send a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N.
Multi-Version Concurrency Control (MVCC) verification is a concurrency control method used to verify whether the data version of the transaction data included in a block is correct. That is, it checks whether the read set of each transaction in the block is consistent with the version in the current database ledger (in other words, unchanged). If there is no change, the modifications to the data in the transaction's write set are valid, the transaction is marked as valid, and the transaction's write set is updated to the specific database (for example, the write set database).
The second information instruction may be set by the user or by system default, which is not limited here. After detecting a preset trigger condition, the server can send the second information instruction to the second thread; the instruction instructs the second thread to perform MVCC verification on blocks. The preset trigger condition may be set by the user or by system default, which is not limited here. For example, the second thread may be triggered to start running and perform its function when its current task is to obtain the first tasks generated by the first queue, or when the first thread receives the first information instruction.
The second thread can obtain blocks directly from the first queue, perform MVCC verification on them, and write them into the write set cache once verification is complete. The write set cache may be set by the user or by system default, which is not limited here. The write set cache can store information such as the data sets corresponding to blocks. In addition, whether verification of a block succeeds or fails, a verification result can be generated and written into the write set database to record the verification result for that block.
The second queue may be set by the user or by system default, which is not limited here. The second queue may have the same structure as the first queue. It can store blocks that the second thread has finished verifying and can provide blocks to the write set cache. In a specific implementation, the second queue generates a corresponding third task for each block. The third task informs the write set database that the previous thread has finished processing and provides it with information such as the block's data set, so that the write set database can update that information.
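The second thread's handling of a batch of first tasks (verify, write to the write set cache, emit a write event that becomes a third task) can be sketched as below. The function name `handle_first_tasks`, the `validate` callback, and the `("write", block_id)` event tuple are assumptions for illustration only.

```python
from collections import deque


def handle_first_tasks(tasks, validate, write_set_cache, second_queue):
    """Process (block_id, data_set) pairs taken from the first queue.

    `validate` is a caller-supplied function standing in for MVCC
    verification; it returns True or False for a data set.
    Returns a dict of verification results keyed by BlockID.
    """
    results = {}
    for block_id, data_set in tasks:
        results[block_id] = validate(data_set)     # MVCC verification result
        write_set_cache[block_id] = data_set       # write data set to the cache
        second_queue.append(("write", block_id))   # write event -> third task
    return results
```

In this sketch the second queue is a `collections.deque`; each appended event stands for one third task that the third thread would later execute against the write set database.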
其中,如图1A所示的MVCC模块可用于实现上述对P个区块进行MVCC验证,每一区块可打包有至少一个交易,每一交易对应有交易信息,在实现MVCC验证时,最小单位可为交易信息。Among them, the MVCC module shown in Figure 1A can be used to realize the above-mentioned MVCC verification of P blocks. Each block can be packaged with at least one transaction, and each transaction corresponds to transaction information. When implementing MVCC verification, the minimum unit Can be transaction information.
在一种可能的示例中,若所述区块A对应的引用计数不为0,还可包括如下步骤:在将所述区块A写入到所述写集缓存之前,若存在区块B的数据集与所述区块A的数据集相关联,则通过所述区块A将所述写集缓存中所述区块B的数据集中的引用计数加1,其中,所述区块B为在所述区块A之前写入所述写集缓存中的任意一个区块。In a possible example, if the reference count corresponding to the block A is not 0, the following step may be further included: before writing the block A to the write set cache, if there is a block B The data set is associated with the data set of the block A, then the reference count in the data set of the block B in the write set cache is increased by 1 through the block A, wherein the block B Any block in the write set cache is written before the block A.
其中,在进行MVCC验证时,若当前处理的区块中的交易信息与之前处理过的区块的写集中打包的交易信息相关联,则之前处理的区块可对当前处理的区块产生影响;例如,若该当前区块中的一笔交易信息在进行MVCC验证时,可能需要使用之前处理的其他区块中的交易信息作为参考数据,以验证该目标区块的交易情况(例如:余额信息、消费信息等等);如此,在进行MVCC验证时,由于该区块相关联的其他区块可能会对其的MVCC验证的过程产生影响;因此,可在进行MVCC验证之前,对该当前区块与之前的多个区块进行关联性判断。Among them, when MVCC verification is performed, if the transaction information in the currently processed block is associated with the transaction information packaged in the write set of the previously processed block, the previously processed block can have an impact on the currently processed block. ; For example, if a transaction information in the current block is being verified by MVCC, it may be necessary to use the transaction information in other previously processed blocks as reference data to verify the transaction status of the target block (for example: balance information, consumption information, etc.); in this way, when MVCC verification is performed, other blocks associated with this block may affect the process of its MVCC verification; therefore, before performing MVCC verification, the current The block is associated with multiple previous blocks to be judged.
其中,可通过数据集中的引用计数实现数据关联性的判断,该引用计数用于判断写集缓存中是否存在与其对应的区块有关联数据的区块。Wherein, the judgment of data association can be realized by the reference count in the data set, and the reference count is used to judge whether there is a block with associated data with its corresponding block in the write set cache.
具体实现中,若第二线程在处理当前处理区块时,在第二线程进行MVCC验证或者第 三线程处理的区块中存在任意一个区块对应的引用计数不为0,则表明该区块与当前处理的区块分别对应的数据集之间存在关联,则在MVCC验证时,可直接调用关联的区块中的数据集,以帮助当前处理的区块实现MVCC验证。In the specific implementation, if the second thread processes the current processing block, the second thread performs MVCC verification or the reference count corresponding to any block in the block processed by the third thread is not 0, it indicates that the block is If there is an association between the datasets corresponding to the blocks currently being processed, during MVCC verification, the datasets in the associated blocks can be directly called to help the blocks currently being processed realize MVCC verification.
In addition, block A is any one of the P blocks. If, after MVCC verification of block A and before block A is written into the write set cache, the data set of a block B is found to be associated with the data set of block A, block A may increment the reference count of block B by 1. This records the data association between block A and block B; when MVCC verification is performed again in a later pass, block A can learn from block B's reference count that block B is associated with its own data set. Block B may be any block preceding block A; the embodiments of the present application use only two blocks as an example.
Optionally, performing MVCC verification on the P blocks may include the following steps: based on the data set corresponding to each block, determine the data version number corresponding to each item of transaction information in the block, and compare that data version number with a preset data version number; if they are the same, MVCC verification of the block succeeds; otherwise, MVCC verification of the block fails. The block may be any one of the P blocks.
The preset data version number may be set by the user or by system default, which is not limited here. It may, for example, be set by back-office staff to 100 or 200. If MVCC verification of a transaction succeeds, the transaction is valid and the block corresponding to the transaction may be stored in the local chain. For the result data of the successful verification, a third task may be generated through the second queue to inform the write set database that verification succeeded, and the result data is stored and updated in the write set database.
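The version comparison described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the `Transaction` class, its field names, and the rule that a block passes only when every one of its transactions matches the preset version are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    tx_id: str         # hypothetical transaction identifier
    data_version: int  # data version number determined for this transaction


def mvcc_validate_block(transactions, preset_version):
    """Return True only if every transaction's data version equals the preset version.

    Assumption: a block fails as soon as one of its transactions mismatches.
    """
    return all(tx.data_version == preset_version for tx in transactions)


# Two illustrative blocks, using the example preset version 100 from the text.
block_ok = [Transaction("tx1", 100), Transaction("tx2", 100)]
block_bad = [Transaction("tx1", 100), Transaction("tx2", 99)]
```

With a preset version of 100, `block_ok` passes verification and `block_bad` fails because one transaction carries version 99.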
In a possible example, performing MVCC verification on the P blocks may further include the following steps: determining that, when block A accesses its corresponding data set from the read set database, the data version number of the read set database is a first version number; determining, according to the first version number, a block E associated with block A, and incrementing the reference count of block E by 1; performing the MVCC verification on block A; and, when the MVCC verification of block A is complete, decrementing the reference count of block E by 1.
The association between blocks may be established based on the database version number: blocks with data associations may be placed in the same database in advance, or the database version number may be written into the multiple blocks that have data associations.
The first version number may be the data version number of the read set database corresponding to block A, and the associated block E may be determined according to that data version number.
For example, the association between blocks may be established based on the database version number. Suppose block 100 (BlockID 100) accesses data from the database while the database version number is 60; block 100 then becomes associated with block 61, which corresponds to database version number 61. In a subsequent step the reference count of block 61 is incremented by 1, and it is decremented by 1 only after block 100 has completed the MVCC verification. At that point the association between block 100 and block 61 is released, that is, block 100 and block 61 are no longer associated.
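The increment-verify-decrement sequence in this example can be sketched as below. It is an illustrative sketch under stated assumptions: the `ref_counts` dictionary stands in for the reference counts held in the write set cache, `verify` is a hypothetical callback that performs the actual MVCC check, and the caller is assumed to have already resolved the associated BlockID from the database version number.

```python
def verify_with_reference(ref_counts, block_id, associated_id, verify):
    """Hold a reference on the associated block while block_id is verified.

    ref_counts: dict mapping BlockID -> reference count (assumed cache structure).
    verify:     hypothetical callable that runs MVCC verification for block_id.
    """
    ref_counts[associated_id] += 1      # e.g. block 100 references block 61
    try:
        return verify(block_id)
    finally:
        ref_counts[associated_id] -= 1  # association released after verification


# Example from the text: block 100 is associated with block 61.
ref_counts = {61: 0}
seen_during_verification = []


def fake_verify(block_id):
    # Record the reference count observed while verification is in progress.
    seen_during_verification.append(ref_counts[61])
    return True


result = verify_with_reference(ref_counts, 100, 61, fake_verify)
```

The `try`/`finally` form is a design choice of the sketch: it guarantees the reference is dropped even if verification raises, so a failed verification cannot leave block 61 permanently retained.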
103. If it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, send a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when any one of the Q third tasks has been executed, update the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks preceding the P blocks, and each third task corresponds to one block.
The current BlockID is the BlockID of the block currently being processed by the third thread.
While the second thread processes the P blocks, the third thread may concurrently process the Q blocks whose BlockIDs precede those of the P blocks; these Q blocks are a batch that the second thread finished processing before the P blocks.
The third information instruction may be set by the user or by system default, which is not limited here. When the write set database obtains the Q third tasks in the second queue, the server may be triggered to send the third information instruction to the third thread.
Each time the third thread finishes processing a block, it may update the current BlockID in the write set cache to the BlockID of the block just processed. This identifies the block number currently being processed and informs the other threads of it, which helps the threads cooperate.
When the third task corresponding to any block has been executed, that is, after the data set corresponding to that block has been written into the write set database, the current BlockID in the write set cache may be updated.
In a possible example, the data set includes a reference count, and the method may further include the following steps: if the reference count corresponding to block A is not 0, determining that there is a block whose data set is associated with the data set of block A, where block A is any one of the P blocks; if the reference count corresponding to block A is 0, determining that no block's data set is associated with the data set of block A, and, after the current BlockID has been updated to the BlockID of block A, deleting the data set corresponding to block A from the write set cache.
The data set may include at least one of the following: a BlockID, a version number, the data key-value pairs corresponding to the BlockID, a reference count, and so on, which is not limited here. The BlockID may be understood as the number of the block currently being processed in a given thread, and the version number may be the database version number.
In a specific implementation, the reference count of block A may have changed after block A was processed by the second thread, so the reference count of block A may be monitored once block A enters the third thread. If the reference count is not 0, other blocks may have data associated with the data set of block A after the MVCC verification in the second thread, and block A may be retained. Conversely, if the reference count of block A is 0, no block is associated with the data set of block A. Because a block may generally contain thousands of transactions and therefore occupy considerable memory, the data set corresponding to block A may be deleted from the write set cache after the current BlockID has been updated to the BlockID of block A, in order to save cache memory.
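The retain-or-delete decision made in the third thread can be illustrated with a small sketch. The `DataSet` fields follow the items listed above (BlockID, version number, key-value pairs, reference count); the class name, the `state` dictionary holding the current BlockID, and the function name are assumptions of the example rather than names from the application.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    block_id: int
    version: int                            # database version number
    kv: dict = field(default_factory=dict)  # data key-value pairs of the block
    ref_count: int = 0                      # number of blocks associated with this one


def finish_third_task(cache, state, block_id):
    """Update the current BlockID, then drop the block's data set if unreferenced.

    Returns True if the data set was deleted from the write set cache.
    """
    state["current_block_id"] = block_id
    if cache[block_id].ref_count == 0:
        del cache[block_id]  # no associated block: free the cache memory
        return True
    return False             # still referenced: keep it in the cache


# Block 58 is unreferenced; block 59 is still referenced by one other block.
cache = {58: DataSet(58, 57), 59: DataSet(59, 58, ref_count=1)}
state = {"current_block_id": 57}
deleted_58 = finish_third_task(cache, state, 58)
deleted_59 = finish_third_task(cache, state, 59)
```

In this run block 58 is removed immediately, while block 59 stays in the cache until its reference count drops to 0.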
Optionally, the method may further include the following steps: if a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determining the multiple blocks present in the write set cache and the multiple reference counts corresponding to them, each block corresponding to one reference count; if there is a block C whose BlockID is smaller than the current BlockID and whose reference count is 0, deleting block C from the write set cache, where block C is any one of the multiple blocks.
To save memory in the cache, a cache reclamation mechanism may also be provided in the server; it may be set by system default or by the user, which is not limited here. Once the mechanism is triggered in any thread, all blocks in the write set cache and the reference count of each block may be determined, and the data sets of the blocks stored in the write set cache may be processed according to the reference counts.
In a specific implementation, if the BlockID of block C is smaller than the current BlockID, block C has already been processed by all three threads. If the reference count of block C is also 0, no block is associated with the data set of block C, so block C may be deleted from the write set cache. This saves memory in the write set cache and helps improve the efficiency of subsequent reads from it.
In a possible example, the method may further include the following step: if the data set of block C is associated with the data set corresponding to block D, decrementing the reference count corresponding to block D by 1 after block C is deleted, where block C and block D are any two blocks in the write set cache.
The increments and decrements of the reference count are block-to-block operations. For example, if block C finds during MVCC verification that the data set of block D is associated with its own data set, block C may increment the reference count of block D by 1. In later processing, block C can read the reference counts of other blocks: a nonzero count indicates an association, and a count of 0 indicates none. The reference count that a block reads for another block is relative to the reading block. As a further example, a block with a reference count of 5 may have 5 blocks associated with it.
In a specific implementation, suppose the reference count of block C is 0 but block D is associated with the data set of block C; that is, block C has found that block D's data is associated with its own, and block C has already used the data set of block D during its MVCC verification. Then, after block C is deleted, the reference count corresponding to block D may be decremented by 1.
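The reclamation rule for block C and block D can be sketched as one pass over the write set cache. This is a hedged illustration: the `ref_counts` dictionary, the `links` dictionary recording which blocks each block has referenced, and the single ascending pass are all assumptions of the sketch, since the application does not fix a data layout or a traversal order.

```python
def reclaim(ref_counts, links, current_block_id):
    """Delete fully processed, unreferenced blocks from the cache.

    ref_counts: dict BlockID -> reference count (assumed cache view).
    links:      dict BlockID -> list of BlockIDs that block has referenced
                (hypothetical bookkeeping used to find block D for block C).
    Returns the list of deleted BlockIDs.
    """
    removed = []
    for block_id in sorted(ref_counts):  # sorted() copies the keys, so deleting is safe
        if block_id < current_block_id and ref_counts[block_id] == 0:
            del ref_counts[block_id]     # block C is deleted
            removed.append(block_id)
            for referenced in links.pop(block_id, []):
                if referenced in ref_counts:
                    ref_counts[referenced] -= 1  # block D's count is decremented by 1
    return removed


# Block 59 (C) has referenced block 58 (D); the current BlockID is 60.
ref_counts = {58: 1, 59: 0, 60: 0}
links = {59: [58]}
first_pass = reclaim(ref_counts, links, 60)
second_pass = reclaim(ref_counts, links, 60)
```

In this sketch the first pass deletes only block 59 and lowers block 58's count to 0, so block 58 is collected on the next pass; block 60 is never deleted because its BlockID is not smaller than the current BlockID.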
It can be seen that the data processing method described in the embodiments of the present application is applied to a server and proceeds as follows. A first information instruction may be sent to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks preceding the M-th block, where N is a positive integer less than or equal to M and M is a positive integer. If it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks preceding the N blocks, a second information instruction is sent to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N. If it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, a third information instruction is sent to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when any one of the Q third tasks has been executed, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks preceding the P blocks, and each third task corresponds to one block. In this way, database reads and writes are separated into a read set database and a write set database, and a cache is introduced. Each thread can then access a different region while running, which realizes multi-threaded database access and helps reduce the access pressure on the database. During MVCC verification the cache can be accessed directly, without frequent database access, which helps improve MVCC verification efficiency. In addition, the above operations involve only CPU operations, which helps improve blockchain performance.
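The three-thread flow summarized above can be sketched with two FIFO queues from the Python standard library. This is only a structural illustration under stated assumptions: the function and variable names are hypothetical, the data sets are toy dictionaries, the MVCC verification step is reduced to a placeholder comment, and a sentinel object is used to stop the threads, none of which is specified by the application.

```python
import queue
import threading


def run_pipeline(block_ids):
    """Run a toy read / verify-and-cache / commit pipeline over block_ids."""
    first_queue, second_queue = queue.Queue(), queue.Queue()
    write_set_cache, write_set_db = {}, {}
    state = {"current_block_id": None}
    done = object()  # sentinel marking the end of the stream (an assumption)

    def first_thread():
        # Reads each block's data set and turns it into a "first task".
        for block_id in block_ids:
            first_queue.put({"block_id": block_id, "kv": {f"key{block_id}": block_id}})
        first_queue.put(done)

    def second_thread():
        # Takes first tasks, would run MVCC verification here (placeholder),
        # writes the data set to the cache and emits a "third task".
        while (task := first_queue.get()) is not done:
            write_set_cache[task["block_id"]] = task
            second_queue.put(task["block_id"])
        second_queue.put(done)

    def third_thread():
        # Commits cached data sets to the write set database and
        # updates the current BlockID after each commit.
        while (block_id := second_queue.get()) is not done:
            write_set_db[block_id] = write_set_cache[block_id]
            state["current_block_id"] = block_id

    threads = [threading.Thread(target=fn)
               for fn in (first_thread, second_thread, third_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return write_set_db, state["current_block_id"]


committed, current_block_id = run_pipeline([57, 58, 59])
```

Because each queue has a single producer and a single consumer, blocks are committed in BlockID order, and the current BlockID ends at the last block committed.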
Consistent with the above, refer to FIG. 2, which is an example flowchart of a data processing method disclosed in an embodiment of the present application, applied to a server. As shown in the figure, it may include the read set database, write set cache, first queue, second queue, MVCC module, and write set database shown in FIG. 1A.
In the figure, the read set database is processing block 66 (BlockID 66). Likewise, the write set cache is processing blocks 57, 58, 59, 60, and 61 (BlockIDs 57, 58, 59, 60, and 61); the first queue is processing blocks 63, 64, and 65 (BlockIDs 63, 64, and 65); the second queue is processing blocks 59, 60, and 61 (BlockIDs 59, 60, and 61); the MVCC module is processing block 62 (BlockID 62); and the write set database is processing block 58 (BlockID 58).
The first thread may read the data set corresponding to block 66 from the read set database and then generate, through the first queue, three first tasks from the data sets corresponding to blocks 63, 64, and 65. At the same time, the second thread may perform MVCC verification on block 62. Meanwhile, the data sets corresponding to blocks 57, 58, 59, 60, and 61 are being written into the write set cache, generating five write events; the five write events are synchronized to the second queue to generate five third tasks, which inform the second queue of the blocks currently being written and of the block data sets that need to be written into the write set database. In the third thread, the second queue may hold blocks 59, 60, and 61, whose verification by the MVCC module is complete. Once the third task corresponding to block 58 is executed, the data set corresponding to block 58 is written into the write set database; since the previously processed block was block 57, the current BlockID is updated from 57 to the BlockID of block 58, namely 58. In this way, database reads and writes are separated into a read set database and a write set database, and a cache is introduced. Each thread can then access a different region while running, which realizes multi-threaded database access and helps reduce the access pressure on the database. During MVCC verification the cache can be accessed directly to obtain the data set corresponding to a block, without frequent database access, which helps improve MVCC verification efficiency. In addition, the above operations involve only CPU operations, which helps improve blockchain performance.
Consistent with the above, refer to FIG. 3, which is an example flowchart of a data processing method disclosed in an embodiment of the present application, applied to a server. The data processing method may include the following steps:
301. Send a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks preceding the M-th block.
302. If it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks preceding the N blocks, send a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N.
303. If it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, send a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when any one of the Q third tasks has been executed, update the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks preceding the P blocks, and each third task corresponds to one block.
304. If a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determine the multiple blocks present in the write set cache and the multiple reference counts corresponding to them, each block corresponding to one reference count.
305. If there is a block C whose BlockID is smaller than the current BlockID and whose reference count is 0, delete block C from the write set cache, where block C is any one of the multiple blocks.
306. If the data set of block C is associated with the data set corresponding to block D, decrement the reference count corresponding to block D by 1 after block C is deleted, where block C and block D are any two blocks in the write set cache.
For a detailed description of steps 301 to 306, refer to the corresponding description of the data processing method shown in FIG. 1B; details are not repeated here.
It can be seen that the data processing method described in the embodiments of the present application is applied to a server and proceeds as follows. A first information instruction may be sent to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks preceding the M-th block, where N is a positive integer less than or equal to M and M is a positive integer. If it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks preceding the N blocks, a second information instruction is sent to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N. If it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, a third information instruction is sent to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; when any one of the Q third tasks has been executed, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks preceding the P blocks, and each third task corresponds to one block. If a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, the multiple blocks present in the write set cache and the multiple reference counts corresponding to them are determined, each block corresponding to one reference count. If there is a block C whose BlockID is smaller than the current BlockID and whose reference count is 0, block C is deleted from the write set cache, where block C is any one of the multiple blocks. If the data set of block C is associated with the data set corresponding to block D, the reference count corresponding to block D is decremented by 1 after block C is deleted, where block C and block D are any two blocks in the write set cache. In this way, database reads and writes are separated into a read set database and a write set database, and a cache is introduced. Each thread can then access a different region while running, which realizes multi-threaded database access and helps reduce the access pressure on the database. During MVCC verification the cache can be accessed directly, without frequent database access, which helps improve MVCC verification efficiency. In addition, the above operations involve only CPU operations, which helps improve blockchain performance. Furthermore, a cache reclamation mechanism may be provided; whichever thread triggers it, the blocks in the write set cache can be processed, which helps save memory in the write set cache.
Consistent with the above, refer to FIG. 4, which is a schematic structural diagram of a server provided by an embodiment of the present application. As shown in FIG. 4, the server includes a processor, a communication interface, a memory, and one or more programs. The processor, the communication interface, and the memory are interconnected; the memory is used to store a computer program that includes program instructions, and the processor is configured to invoke the program instructions. The one or more programs include instructions for performing the following steps:
sending a first information instruction to a first thread, where the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from the read set database, to generate, through the first queue, N first tasks from the data sets corresponding to the N blocks preceding the M-th block, where N is a positive integer less than or equal to M and M is a positive integer;
if it is determined that the current task of the second thread is to obtain the P first tasks generated through the first queue from the data sets corresponding to the P blocks preceding the N blocks, sending a second information instruction to the second thread, where the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into the write set cache and generate P write events, and to synchronize the P write events to the second queue to generate P third tasks, where P is a positive integer less than or equal to N;
if it is determined that the current task of the third thread is to obtain, through the write set database, Q third tasks in the second queue, sending a third information instruction to the third thread, where the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache; and, when any one of the Q third tasks has been executed, updating the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks preceding the P blocks, and each third task corresponds to one block.
It can be seen that the server described in this embodiment of the present application can send a first information instruction to a first thread, the first information instruction instructing the first thread, if it reads the data set corresponding to the M-th block from the read-set database, to generate, through a first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer. If it is determined that the current task of a second thread is to obtain the P first tasks generated, through the first queue, from the data sets corresponding to the P blocks before the N blocks, the server sends a second information instruction to the second thread, instructing it to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write-set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N. If it is determined that the current task of a third thread is to obtain, through the write-set database, Q third tasks in the second queue, the server sends a third information instruction to the third thread, instructing it to execute the Q third tasks and to obtain the current BlockID in the write-set cache; when any one of the Q third tasks finishes executing, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block. In this way, database reads and writes are separated into a read-set database and a write-set database, and a cache is introduced. Each thread can therefore access a different region while running, which realizes multi-threaded database access and helps reduce the access pressure on the database. In addition, MVCC verification can read directly from the cache instead of accessing the database frequently, which helps improve the efficiency of MVCC verification. Furthermore, the above operations involve only CPU operations, which helps improve the performance of the blockchain.
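The three-thread arrangement summarized above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the implementation of the present application: the function name `run_pipeline`, the `SENTINEL` end marker, the use of in-memory dictionaries in place of the read-set and write-set databases, and the placeholder validity check standing in for MVCC verification are all hypothetical.

```python
import queue
import threading

SENTINEL = None  # hypothetical end-of-stream marker, not part of the source


def run_pipeline(read_set_db):
    """Pass blocks through reader -> validator -> writer threads."""
    first_queue = queue.Queue()    # holds first tasks (data sets to validate)
    second_queue = queue.Queue()   # holds third tasks (write events to commit)
    write_set_cache = {}           # block_id -> data set
    write_set_db = {}              # stand-in for the write-set database
    state = {"current_block_id": 0}

    def first_thread():
        # Read each block's data set and enqueue it as a first task.
        for block_id in sorted(read_set_db):
            first_queue.put((block_id, read_set_db[block_id]))
        first_queue.put(SENTINEL)

    def second_thread():
        # Validate each block, cache its data set, and emit a write event.
        while (task := first_queue.get()) is not SENTINEL:
            block_id, data_set = task
            if data_set is not None:  # placeholder for MVCC verification
                write_set_cache[block_id] = data_set
                second_queue.put(block_id)
        second_queue.put(SENTINEL)

    def third_thread():
        # Persist each cached block and advance the current BlockID.
        while (block_id := second_queue.get()) is not SENTINEL:
            write_set_db[block_id] = write_set_cache[block_id]
            state["current_block_id"] = block_id

    threads = [threading.Thread(target=fn)
               for fn in (first_thread, second_thread, third_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return write_set_db, state["current_block_id"]
```

In this sketch each queue has exactly one producer and one consumer, so blocks reach the writer in BlockID order and the current BlockID only moves forward.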
In one possible example, the data set includes a reference count, and the program includes instructions for performing the following steps:
if the reference count corresponding to block A is not 0, determining that the data set of some block is associated with the data set of block A, where block A is any one of the P blocks;
if the reference count corresponding to block A is 0, determining that no block's data set is associated with the data set of block A, and, after the current BlockID has been updated to the BlockID of block A, deleting the data set corresponding to block A from the write-set cache.
In one possible example, if the reference count corresponding to block A is not 0, the program includes instructions for performing the following step:
before block A is written into the write-set cache, if the data set of a block B is associated with the data set of block A, incrementing by 1, through block A, the reference count in the data set of block B in the write-set cache, where block B is any block written into the write-set cache before block A.
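The reference-count handling in the two examples above can be sketched as follows. This is a hedged illustration only: the class names `CachedBlock` and `WriteSetCache`, the `depends_on` field, and the method names `add` and `on_committed` are assumptions introduced for the example and do not appear in the present application.

```python
from dataclasses import dataclass, field


@dataclass
class CachedBlock:
    data_set: dict
    ref_count: int = 0                         # later blocks associated with this one
    depends_on: list = field(default_factory=list)


class WriteSetCache:
    def __init__(self):
        self.blocks = {}           # block_id -> CachedBlock
        self.current_block_id = 0

    def add(self, block_id, data_set, depends_on=()):
        # Before caching block A, add 1 to the count of each cached block B
        # whose data set is associated with A's data set.
        deps = [b for b in depends_on if b in self.blocks]
        for b in deps:
            self.blocks[b].ref_count += 1
        self.blocks[block_id] = CachedBlock(data_set, 0, deps)

    def on_committed(self, block_id):
        # Update the current BlockID, then delete the block only if its
        # reference count is 0. Returns True when the block was deleted.
        self.current_block_id = block_id
        if self.blocks[block_id].ref_count == 0:
            del self.blocks[block_id]
            return True
        return False
```

For instance, if block 2 is added with `depends_on=[1]`, block 1 stays in the cache after it is committed, because its reference count is 1 rather than 0.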
In one possible example, the program includes instructions for performing the following steps:
if a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determining the multiple blocks present in the write-set cache and the multiple reference counts corresponding to those blocks, each block corresponding to one reference count;
if the BlockID corresponding to a block C is smaller than the current BlockID and the reference count corresponding to block C is 0, deleting block C from the write-set cache, where block C is any one of the multiple blocks.
In one possible example, the program further includes instructions for performing the following step:
if the data set of block C is associated with the data set corresponding to a block D, decrementing the reference count corresponding to block D by 1 after block C is deleted, where block C and block D are any two blocks in the write-set cache.
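The reclamation rule described in the two examples above can be sketched as a single pass over the cache. The function name `reclaim`, the dictionary layout, and the `depends_on` list are hypothetical choices made for illustration; the sketch also assumes that a block is only ever associated with blocks that have smaller BlockIDs.

```python
def reclaim(cache, current_block_id):
    """Delete reclaimable blocks and return their BlockIDs.

    cache maps block_id -> {"ref_count": int, "depends_on": [block_id, ...]}.
    """
    removed = []
    # Visit newer blocks first so that a decrement caused by deleting block C
    # is already applied when the older block D is examined.
    for block_id in sorted(cache, reverse=True):
        entry = cache[block_id]
        if block_id < current_block_id and entry["ref_count"] == 0:
            del cache[block_id]
            removed.append(block_id)
            for dep in entry["depends_on"]:
                if dep in cache:
                    cache[dep]["ref_count"] -= 1
    return removed
```

Visiting blocks in descending BlockID order is a design choice of this sketch: it lets a chain of associated blocks be released in one pass instead of requiring repeated reclamation rounds.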
In one possible example, the data set further includes a data version number, and, with respect to performing MVCC verification on the P blocks, the program includes instructions for performing the following steps:
determining a first version number, which is the data version number of the read-set database at the time block A accessed its corresponding data set from the read-set database;
determining, according to the first version number, a block E associated with block A, and incrementing the reference count of block E by 1;
performing the MVCC verification on block A;
when the MVCC verification of block A is complete, decrementing the reference count of block E by 1.
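The steps above amount to holding a reference on block E for the duration of the verification of block A. The following sketch shows one way this could look. The function name `mvcc_validate`, the `version_to_block` lookup, the `writes` field, and the specific check (a read is stale if any cached block wrote the same key at a higher version) are all assumptions for illustration; the present application does not specify the verification rule at this level of detail.

```python
def mvcc_validate(reads, read_version, cache, version_to_block):
    """Return True if none of the keys read by block A is stale.

    reads: {key: version seen by block A}
    cache: block_id -> {"ref_count": int, "writes": {key: version}}
    version_to_block: hypothetical lookup from a version number to block E
    """
    block_e = version_to_block.get(read_version)
    pinned = cache.get(block_e)
    if pinned is not None:
        pinned["ref_count"] += 1      # keep block E in the cache while verifying
    try:
        for key, seen_version in reads.items():
            for entry in cache.values():
                # A cached write with a higher version makes the read stale.
                if entry["writes"].get(key, seen_version) > seen_version:
                    return False
        return True
    finally:
        if pinned is not None:
            pinned["ref_count"] -= 1  # release block E once verification is done
```

The `try`/`finally` pairing ensures the count is decremented on every exit path, so a failed verification cannot leave block E permanently held in the cache.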
The foregoing mainly describes the solutions of the embodiments of the present application from the perspective of the method-side execution process. It can be understood that, to implement the above functions, the server includes hardware structures and/or software modules corresponding to each function. Those skilled in the art will readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments provided herein, the present application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments of the present application, the server may be divided into functional units according to the foregoing method examples. For example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. It should be noted that the division of units in the embodiments of the present application is schematic and is only a division by logical function; other divisions are possible in an actual implementation.
Consistent with the above, refer to FIG. 5, which is a schematic structural diagram of a data processing apparatus disclosed in an embodiment of the present application. The apparatus is applied to a server and includes a first sending unit 501, a second sending unit 502, and a third sending unit 503, wherein:
the first sending unit 501 is configured to send a first information instruction to a first thread, the first information instruction instructing the first thread, if it reads the data set corresponding to the M-th block from the read-set database, to generate, through a first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
the second sending unit 502 is configured to, if it is determined that the current task of a second thread is to obtain the P first tasks generated, through the first queue, from the data sets corresponding to the P blocks before the N blocks, send a second information instruction to the second thread, the second information instruction instructing the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write-set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
the third sending unit 503 is configured to, if it is determined that the current task of a third thread is to obtain, through the write-set database, Q third tasks in the second queue, send a third information instruction to the third thread, the third information instruction instructing the third thread to execute the Q third tasks and to obtain the current BlockID in the write-set cache; and, when any one of the Q third tasks finishes executing, to update the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
It can be seen that the data processing apparatus described in this embodiment of the present application, applied to a server, can send a first information instruction to a first thread, the first information instruction instructing the first thread, if it reads the data set corresponding to the M-th block from the read-set database, to generate, through a first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer. If it is determined that the current task of a second thread is to obtain the P first tasks generated, through the first queue, from the data sets corresponding to the P blocks before the N blocks, the apparatus sends a second information instruction to the second thread, instructing it to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write-set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N. If it is determined that the current task of a third thread is to obtain, through the write-set database, Q third tasks in the second queue, the apparatus sends a third information instruction to the third thread, instructing it to execute the Q third tasks and to obtain the current BlockID in the write-set cache; when any one of the Q third tasks finishes executing, the current BlockID is updated to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block. In this way, database reads and writes are separated into a read-set database and a write-set database, and a cache is introduced. Each thread can therefore access a different region while running, which realizes multi-threaded database access and helps reduce the access pressure on the database. In addition, MVCC verification can read directly from the cache instead of accessing the database frequently, which helps improve the efficiency of MVCC verification. Furthermore, the above operations involve only CPU operations, which helps improve the performance of the blockchain.
In one possible example, the data set further includes a data version number, and, with respect to performing MVCC verification on the P blocks, the second sending unit 502 is specifically configured to: determine a first version number, which is the data version number of the read-set database at the time block A accessed its corresponding data set from the read-set database; determine, according to the first version number, a block E associated with block A, and increment the reference count of block E by 1; perform the MVCC verification on block A; and, when the MVCC verification of block A is complete, decrement the reference count of block E by 1.
An embodiment of the present application further provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program for electronic data exchange, and the computer program causes a computer to perform some or all of the steps of any data processing method described in the foregoing method embodiments. Optionally, the computer program may include program instructions that, when executed by a processor, cause the processor to perform some or all of the steps of the above method; details are not repeated here.
Optionally, the storage medium involved in the present application, such as a computer-readable storage medium, may be non-volatile or volatile.
An embodiment of the present application further provides a computer program product, the computer program product including a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform some or all of the steps of any data processing method described in the foregoing method embodiments. The computer program product may be a software installation package.
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative. The division of the units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software program module.
If the integrated unit is implemented in the form of a software program module and sold or used as a stand-alone product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.
Those of ordinary skill in the art can understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable memory, and the memory may include a flash drive, ROM, RAM, a magnetic disk, an optical disc, or the like.
The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the descriptions of the above embodiments are only intended to help in understanding the method of the present application and its core idea. At the same time, a person of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the present application.

Claims (20)

  1. A data processing method, applied to a server, comprising:
    sending a first information instruction to a first thread, the first information instruction instructing the first thread, if it reads the data set corresponding to the M-th block from the read-set database, to generate, through a first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
    if it is determined that the current task of a second thread is to obtain the P first tasks generated, through the first queue, from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, the second information instruction instructing the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write-set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
    if it is determined that the current task of a third thread is to obtain, through the write-set database, Q third tasks in the second queue, sending a third information instruction to the third thread, the third information instruction instructing the third thread to execute the Q third tasks and to obtain the current BlockID in the write-set cache; and, when any one of the Q third tasks finishes executing, updating the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  2. The method according to claim 1, wherein the data set includes a reference count, and the method further comprises:
    if the reference count corresponding to block A is not 0, determining that the data set of some block is associated with the data set of block A, where block A is any one of the P blocks;
    if the reference count corresponding to block A is 0, determining that no block's data set is associated with the data set of block A, and, after the current BlockID has been updated to the BlockID of block A, deleting the data set corresponding to block A from the write-set cache.
  3. The method according to claim 1 or 2, wherein, if the reference count corresponding to block A is not 0, the method further comprises:
    before block A is written into the write-set cache, if the data set of a block B is associated with the data set of block A, incrementing by 1, through block A, the reference count in the data set of block B in the write-set cache, where block B is any block written into the write-set cache before block A.
  4. The method according to claim 1, wherein the method further comprises:
    if a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determining the multiple blocks present in the write-set cache and the multiple reference counts corresponding to those blocks, each block corresponding to one reference count;
    if the BlockID corresponding to a block C is smaller than the current BlockID and the reference count corresponding to block C is 0, deleting block C from the write-set cache, where block C is any one of the multiple blocks.
  5. The method according to claim 4, wherein the method further comprises:
    if the data set of block C is associated with the data set corresponding to a block D, decrementing the reference count corresponding to block D by 1 after block C is deleted, where block C and block D are any two blocks in the write-set cache.
  6. The method according to claim 1 or 2, wherein the data set further includes a data version number, and performing MVCC verification on the P blocks comprises:
    determining a first version number, which is the data version number of the read-set database at the time block A accessed its corresponding data set from the read-set database;
    determining, according to the first version number, a block E associated with block A, and incrementing the reference count of block E by 1;
    performing the MVCC verification on block A;
    when the MVCC verification of block A is complete, decrementing the reference count of block E by 1.
  7. A data processing apparatus, applied to a server, the apparatus comprising a first sending unit, a second sending unit, and a third sending unit, wherein:
    the first sending unit is configured to send a first information instruction to a first thread, the first information instruction instructing the first thread, if it reads the data set corresponding to the M-th block from the read-set database, to generate, through a first queue, N first tasks from the data sets corresponding to the N blocks before the M-th block, where N is a positive integer less than or equal to M, and M is a positive integer;
    the second sending unit is configured to, if it is determined that the current task of a second thread is to obtain the P first tasks generated, through the first queue, from the data sets corresponding to the P blocks before the N blocks, send a second information instruction to the second thread, the second information instruction instructing the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write-set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, where P is a positive integer less than or equal to N;
    the third sending unit is configured to, if it is determined that the current task of a third thread is to obtain, through the write-set database, Q third tasks in the second queue, send a third information instruction to the third thread, the third information instruction instructing the third thread to execute the Q third tasks and to obtain the current BlockID in the write-set cache; and, when any one of the Q third tasks finishes executing, to update the current BlockID to the BlockID of the block corresponding to that third task, where Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  8. The apparatus according to claim 7, wherein the data set further includes a data version number, and, with respect to performing MVCC verification on the P blocks, the second sending unit is specifically configured to:
    determine a first version number, which is the data version number of the read-set database at the time block A accessed its corresponding data set from the read-set database;
    determine, according to the first version number, a block E associated with block A, and increment the reference count of block E by 1;
    perform the MVCC verification on block A;
    when the MVCC verification of block A is complete, decrement the reference count of block E by 1.
  9. A server, comprising a processor, a communication interface, a memory, and one or more programs, wherein the processor, the communication interface, and the memory are connected to one another, the memory is configured to store a computer program comprising program instructions, and the processor is configured to invoke the program instructions to perform the following method:
    sending a first information instruction to a first thread, wherein the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from a read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
    if it is determined that the current task of a second thread is to acquire the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, wherein the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, wherein P is a positive integer less than or equal to N;
    if it is determined that the current task of a third thread is to obtain, through a write set database, the Q third tasks in the second queue, sending a third information instruction to the third thread, wherein the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache, and, when any one of the Q third tasks has finished executing, to update the current BlockID to the BlockID of the block corresponding to that third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  10. The server according to claim 9, wherein the data set comprises a reference count, and the processor is further configured to perform:
    if the reference count corresponding to the block A is not 0, determining that there exists a block whose data set is associated with the data set of the block A, wherein the block A is any one of the P blocks;
    if the reference count corresponding to the block A is 0, determining that no block has a data set associated with the data set of the block A, and, after the current BlockID has been updated to the BlockID of the block A, deleting the data set corresponding to the block A from the write set cache.
  11. The server according to claim 9 or 10, wherein, if the reference count corresponding to the block A is not 0, the processor is further configured to perform:
    before the block A is written into the write set cache, if the data set of a block B is associated with the data set of the block A, incrementing, by way of the block A, the reference count in the data set of the block B in the write set cache by 1, wherein the block B is any block written into the write set cache before the block A.
  12. The server according to claim 9, wherein the processor is further configured to perform:
    if a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determining the plurality of blocks present in the write set cache and the plurality of reference counts corresponding to the plurality of blocks, each block corresponding to one reference count;
    if the BlockID corresponding to a block C is less than the current BlockID and the reference count corresponding to the block C is 0, deleting the block C from the write set cache, wherein the block C is any one of the plurality of blocks.
  13. The server according to claim 12, wherein the processor is further configured to perform:
    if the data set of the block C is associated with the data set corresponding to a block D, decrementing the reference count corresponding to the block D by 1 after the block C is deleted, wherein the block C and the block D are any two blocks in the write set cache.
  14. The server according to claim 9 or 10, wherein the data set further comprises a data version number, and performing the MVCC verification on the P blocks comprises:
    determining that, when the block A accesses its corresponding data set from the read set database, the data version number of the read set database is a first version number;
    determining, according to the first version number, a block E associated with the block A, and incrementing the reference count of the block E by 1;
    performing the MVCC verification on the block A; and
    when the MVCC verification of the block A is complete, decrementing the reference count of the block E by 1.
  15. A computer-readable storage medium storing a computer program, wherein the computer program comprises program instructions which, when executed by a processor, cause the processor to perform the following method:
    sending a first information instruction to a first thread, wherein the first information instruction instructs the first thread, if it reads the data set corresponding to the M-th block from a read set database, to generate N first tasks through a first queue from the data sets corresponding to the N blocks before the M-th block, wherein N is a positive integer less than or equal to M, and M is a positive integer;
    if it is determined that the current task of a second thread is to acquire the P first tasks generated through the first queue from the data sets corresponding to the P blocks before the N blocks, sending a second information instruction to the second thread, wherein the second information instruction instructs the second thread to perform MVCC verification on the P blocks, to write the data sets corresponding to the P blocks into a write set cache and generate P write events, and to synchronize the P write events to a second queue to generate P third tasks, wherein P is a positive integer less than or equal to N;
    if it is determined that the current task of a third thread is to obtain, through a write set database, the Q third tasks in the second queue, sending a third information instruction to the third thread, wherein the third information instruction instructs the third thread to execute the Q third tasks and to obtain the current BlockID in the write set cache, and, when any one of the Q third tasks has finished executing, to update the current BlockID to the BlockID of the block corresponding to that third task, wherein Q is a positive integer less than or equal to P, the Q third tasks correspond to the Q blocks before the P blocks, and each third task corresponds to one block.
  16. The computer-readable storage medium according to claim 15, wherein the data set comprises a reference count, and the program instructions, when executed by the processor, further cause the processor to perform:
    if the reference count corresponding to the block A is not 0, determining that there exists a block whose data set is associated with the data set of the block A, wherein the block A is any one of the P blocks;
    if the reference count corresponding to the block A is 0, determining that no block has a data set associated with the data set of the block A, and, after the current BlockID has been updated to the BlockID of the block A, deleting the data set corresponding to the block A from the write set cache.
  17. The computer-readable storage medium according to claim 15 or 16, wherein, if the reference count corresponding to the block A is not 0, the program instructions, when executed by the processor, further cause the processor to perform:
    before the block A is written into the write set cache, if the data set of a block B is associated with the data set of the block A, incrementing, by way of the block A, the reference count in the data set of the block B in the write set cache by 1, wherein the block B is any block written into the write set cache before the block A.
  18. The computer-readable storage medium according to claim 15, wherein the program instructions, when executed by the processor, further cause the processor to perform:
    if a cache reclamation mechanism is triggered while a task of the first thread, the second thread, or the third thread is being executed, determining the plurality of blocks present in the write set cache and the plurality of reference counts corresponding to the plurality of blocks, each block corresponding to one reference count;
    if the BlockID corresponding to a block C is less than the current BlockID and the reference count corresponding to the block C is 0, deleting the block C from the write set cache, wherein the block C is any one of the plurality of blocks.
  19. The computer-readable storage medium according to claim 18, wherein the program instructions, when executed by the processor, further cause the processor to perform:
    if the data set of the block C is associated with the data set corresponding to a block D, decrementing the reference count corresponding to the block D by 1 after the block C is deleted, wherein the block C and the block D are any two blocks in the write set cache.
  20. The computer-readable storage medium according to claim 15 or 16, wherein the data set further comprises a data version number, and performing the MVCC verification on the P blocks comprises:
    determining that, when the block A accesses its corresponding data set from the read set database, the data version number of the read set database is a first version number;
    determining, according to the first version number, a block E associated with the block A, and incrementing the reference count of the block E by 1;
    performing the MVCC verification on the block A; and
    when the MVCC verification of the block A is complete, decrementing the reference count of the block E by 1.
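
The reference-counted reclamation of the write set cache recited in claims 16 to 19 can be illustrated with a minimal sketch. This is not the applicant's implementation: the names `CachedBlock`, `WriteSetCache`, `put`, `mark_persisted`, and `reclaim` are hypothetical, the "association" between data sets is modeled simply as a list of earlier BlockIDs, and the repeated sweep inside `reclaim` is an added assumption (the claims only state that the reference count of block D is decremented after block C is deleted).

```python
from dataclasses import dataclass, field


@dataclass
class CachedBlock:
    block_id: int
    write_set: dict
    ref_count: int = 0                              # later blocks associated with this block
    depends_on: list = field(default_factory=list)  # earlier cached BlockIDs this block is associated with


class WriteSetCache:
    """Toy in-memory write set cache keyed by BlockID (illustrative only)."""

    def __init__(self):
        self.blocks = {}
        self.current_block_id = 0  # BlockID of the latest block whose third task completed

    def put(self, block_id, write_set, depends_on=()):
        # Claim 17: before block A is written, increment the reference count
        # of every earlier cached block B whose data set is associated with A.
        deps = [b for b in depends_on if b in self.blocks]
        for b in deps:
            self.blocks[b].ref_count += 1
        self.blocks[block_id] = CachedBlock(block_id, dict(write_set), 0, deps)

    def mark_persisted(self, block_id):
        # Claim 15: when a third task completes, update the current BlockID.
        self.current_block_id = max(self.current_block_id, block_id)

    def reclaim(self):
        # Claim 18: delete block C if its BlockID is less than the current
        # BlockID and its reference count is 0.
        # Claim 19: after deleting C, decrement the count of each associated block D.
        # Assumption: sweep repeatedly so blocks released by a deletion are
        # also collected in the same reclamation pass.
        removed = []
        progress = True
        while progress:
            progress = False
            for bid in sorted(self.blocks):  # sorted() copies the keys, so deleting is safe
                blk = self.blocks[bid]
                if bid < self.current_block_id and blk.ref_count == 0:
                    del self.blocks[bid]
                    for d in blk.depends_on:
                        if d in self.blocks:
                            self.blocks[d].ref_count -= 1
                    removed.append(bid)
                    progress = True
        return removed
```

In this sketch a block stays in the cache as long as a later block still references it, even if it has already been persisted; it becomes eligible for deletion only once every dependent block has itself been removed.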
PCT/CN2021/109268 2020-09-03 2021-07-29 Data processing method and device, and storage medium WO2022048358A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010913919.5A CN111984421B (en) 2020-09-03 2020-09-03 Data processing method, device and storage medium
CN202010913919.5 2020-09-03

Publications (1)

Publication Number Publication Date
WO2022048358A1 true WO2022048358A1 (en) 2022-03-10

Family

ID=73447432

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/109268 WO2022048358A1 (en) 2020-09-03 2021-07-29 Data processing method and device, and storage medium

Country Status (2)

Country Link
CN (1) CN111984421B (en)
WO (1) WO2022048358A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984421B (en) * 2020-09-03 2022-09-16 深圳壹账通智能科技有限公司 Data processing method, device and storage medium
CN113505000B (en) * 2021-09-08 2021-12-21 广东卓启云链科技有限公司 Multithreading processing method, device, system and storage medium in block chain
CN114297109B (en) * 2021-12-28 2024-05-24 中汽创智科技有限公司 Data processing method and device based on subscription and release modes, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190288850A1 (en) * 2016-08-12 2019-09-19 ALTR Solutions, Inc. Decentralized database optimizations
CN111241061A (en) * 2020-01-09 2020-06-05 平安科技(深圳)有限公司 Writing method of state database, data processing device and storage medium
CN111414389A (en) * 2020-03-19 2020-07-14 北京字节跳动网络技术有限公司 Data processing method and device, electronic equipment and storage medium
CN111984421A (en) * 2020-09-03 2020-11-24 深圳壹账通智能科技有限公司 Data processing method, device and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170289134A1 (en) * 2016-03-30 2017-10-05 Ping Identity Corporation Methods and apparatus for assessing authentication risk and implementing single sign on (sso) using a distributed consensus database
CN106598549B (en) * 2016-12-08 2019-02-01 天津米游科技有限公司 A kind of intelligent contract system and implementation method based on block chain
US10708250B2 (en) * 2018-07-11 2020-07-07 Americorps Investments Llc Blockchain operating system
CN109271258B (en) * 2018-08-28 2020-11-17 百度在线网络技术(北京)有限公司 Method, device, terminal and storage medium for realizing re-entry of read-write lock
CN109471734A (en) * 2018-10-27 2019-03-15 哈尔滨工业大学(威海) A kind of novel cache optimization multithreading Deterministic Methods
CN109493223B (en) * 2018-11-07 2021-12-21 联动优势科技有限公司 Accounting method and device
CN109933632B (en) * 2019-04-04 2021-04-27 杭州数梦工场科技有限公司 Data migration method, device and equipment for database
CN110245006B (en) * 2019-05-07 2023-05-02 深圳壹账通智能科技有限公司 Method, device, equipment and storage medium for processing block chain transaction

Also Published As

Publication number Publication date
CN111984421B (en) 2022-09-16
CN111984421A (en) 2020-11-24

Similar Documents

Publication Publication Date Title
WO2022048358A1 (en) Data processing method and device, and storage medium
US20200241613A1 (en) Persistent reservations for virtual disk using multiple targets
US10375167B2 (en) Low latency RDMA-based distributed storage
US11422907B2 (en) Disconnected operation for systems utilizing cloud storage
EP2877942B1 (en) Automatic transaction retry after session failure
US7783601B2 (en) Replicating and sharing data between heterogeneous data systems
US9990225B2 (en) Relaxing transaction serializability with statement-based data replication
WO2017066110A1 (en) Distributed self-directed rdma-based b-tree key-value manager
WO2020181810A1 (en) Data processing method and apparatus applied to multi-level caching in cluster
US9589153B2 (en) Securing integrity and consistency of a cloud storage service with efficient client operations
US20220138056A1 (en) Non-Blocking Backup in a Log Replay Node for Tertiary Initialization
US20150277966A1 (en) Transaction system
US10031948B1 (en) Idempotence service
US20240134758A1 (en) Smart coalescing in data management systems
AU2016373662B2 (en) High throughput, high reliability data processing system
US20230124036A1 (en) In-place garbage collection for state machine replication
US11886439B1 (en) Asynchronous change data capture for direct external transmission
US11438415B2 (en) Managing hash tables in a storage system
US11593030B2 (en) Cross-stream transactions in a streaming data storage system
US11947568B1 (en) Working set ratio estimations of data items in a sliding time window for dynamically allocating computing resources for the data items
US11874796B1 (en) Efficient garbage collection in optimistic multi-writer database systems
US11243930B2 (en) System and method for scalable and space efficient hardening of a fixed sized hash table over an unreliable tier
EP3391223B1 (en) High throughput, high reliability data processing system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21863426; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established (Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 30/06/2023))
122 Ep: pct application non-entry in european phase (Ref document number: 21863426; Country of ref document: EP; Kind code of ref document: A1)