CN112486932A - Data concurrent writing method and distributed data concurrent writing system - Google Patents

Info

Publication number
CN112486932A
Authority
CN
China
Prior art keywords
write
version number
target data
data
client
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011431452.7A
Other languages
Chinese (zh)
Inventor
黎海兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10: File systems; File servers
    • G06F16/17: Details of further file system functions
    • G06F16/176: Support for shared access to files; File sharing support
    • G06F16/1767: Concurrency control, e.g. optimistic or pessimistic approaches

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a data concurrent writing method and a distributed data concurrent writing system. The control node verifies a first write request sent by a client and generates a first version number corresponding to the first write request; when verification succeeds, it selects a preset number of storage nodes and sends the address information of those storage nodes, together with the first version number, to the client. The client sends second write requests to the preset number of storage nodes. Each storage node receives the second write request and judges whether the first version number is higher than the local version number of the target data; if so, it executes the corresponding write operation and returns write result information to the client. The control node updates the metadata information corresponding to the target data based on the write result information returned by the client. The embodiment can ensure the consistency of the copies of the file data when multiple clients concurrently write the same target data, improving the reliability of the data.

Description

Data concurrent writing method and distributed data concurrent writing system
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data concurrent writing method and a distributed data concurrent writing system.
Background
A distributed file storage system is one in which the physical storage resources managed by the file system are not necessarily attached directly to the local node but are connected to the node (which may be understood simply as a computer) through a computer network; alternatively, it is a complete hierarchical file system formed by combining several different logical disk partitions or volume labels. A distributed file storage system spreads a large amount of data across different nodes for storage, thereby greatly reducing the risk of data loss.
In the existing approach, a client performs a single data write as follows: the client sends a write request for target data to the control node; the control node receives and verifies the write request and, when verification succeeds, returns to the client the address information of the storage node for the data to be written; the client receives the address information and sends a data write request containing the data to be written to that storage node; the storage node receives the data write request, executes the corresponding write operation, and returns write result information indicating write success or write failure to the client; the client receives the write result information returned by the storage node and sends it to the control node; and when the control node determines that the write succeeded, it updates the metadata information corresponding to the target data.
In practical applications, multiple copies of file data are generally saved to ensure the reliability of the data. When the data writing method described above is used and multiple clients perform write operations on the same target data, differences in network delay and in the actual service conditions of the storage nodes holding the copies can leave the copies of the target data inconsistent with one another. For example, client A and client B write the same target data at the same time: for copy 1 of the target data, client A's write is executed first and client B's write later; for copy 2, client B's write is executed first and client A's write later. Copy 1 then holds client B's data while copy 2 holds client A's data, so the two copies of the same target data are inconsistent, which in turn affects the reliability of the data. For this reason, distributed file storage systems allow only serial writes, not concurrent writes, to the same file.
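The divergence described above can be reproduced in a few lines. The following is a minimal illustrative sketch, not part of the original disclosure: each copy is modelled as a plain dictionary that keeps whatever write arrives last, and the names `apply_writes`, `copy1`, and `copy2` are hypothetical.

```python
def apply_writes(arrival_order):
    """Apply writes in arrival order with no version check (last arrival wins)."""
    replica = {}
    for _client, payload in arrival_order:
        replica["target"] = payload  # unconditional overwrite
    return replica


write_a = ("A", "data-from-A")
write_b = ("B", "data-from-B")

# Network delay makes the two copies observe the writes in opposite orders.
copy1 = apply_writes([write_a, write_b])  # A first, B later
copy2 = apply_writes([write_b, write_a])  # B first, A later

print(copy1["target"], copy2["target"])
```

Because neither copy has any way to tell which write is newer, each simply ends up with the write that happened to arrive last.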
Disclosure of Invention
The embodiment of the invention aims to provide a data concurrent writing method and a distributed data concurrent writing system, which are used for solving the problem that data reliability is influenced by inconsistent data caused by the fact that a plurality of clients perform writing operation on the same file. The specific technical scheme is as follows:
In a first aspect, an embodiment of the present invention provides a data concurrent writing method, which is applied to a control node in a distributed file storage system, where the distributed file storage system includes a control node and a storage node, the method comprising:
verifying a first write request for target data sent by a client, and generating a version number corresponding to the first write request to obtain a first version number, where the first version number increases as the number of writes to the target data increases, and the first write request corresponds to data to be written;
when the first write request is successfully verified, selecting a preset number of storage nodes for the data to be written to obtain address information corresponding to the preset number of storage nodes;
sending the address information corresponding to the storage nodes with the preset number and the first version number to the client, so that the client sends a second write request to the storage nodes with the preset number, and the storage nodes with the preset number execute corresponding write operation according to the second write request and return write result information to the client; the second write request comprises the data to be written and the first version number;
updating metadata information corresponding to the target data based on the writing result information returned by the client; the metadata information corresponding to the target data comprises the size of the target data and information of a storage node in which the target data is written.
Optionally, the step of generating a version number corresponding to the first write request to obtain a first version number includes:
generating a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
Optionally, the step of selecting a preset number of storage nodes for the data to be written includes:
selecting a preset number of storage nodes for the data to be written according to the size of the data to be written corresponding to the first write request and the size of the remaining space of each storage node.
Optionally, the method further includes:
receiving self-state information sent by a storage node, wherein the self-state information of the storage node comprises: the node state of the storage node, and the size of the remaining space of the storage node.
In a second aspect, an embodiment of the present invention provides a data concurrent writing method, which is applied to a storage node in a distributed file storage system, where the distributed file storage system includes: a control node and a storage node, the method comprising:
receiving a second write request sent by a client, wherein the second write request comprises a first version number corresponding to a first write request aiming at target data, and the first version number is increased along with the increase of the write times of the target data;
judging whether the first version number is higher than a local version number of the target data, where the local version number of the target data is the version number carried in the second write request with which the target data was most recently written locally;
and if the first version number is higher than the local version number of the target data, executing the write operation corresponding to the second write request, and returning write result information indicating successful write to the client, so that the client sends the write result information to a control node, and the control node updates metadata information corresponding to the target data.
Optionally, the method further includes:
and if the first version number is not higher than the local version number of the target data, rejecting the write operation corresponding to the second write request, and returning write result information indicating write failure to the client, so that the client sends the write result information to a control node.
Optionally, the second write request further includes data to be written corresponding to the first write request for the target data; the step of executing the write operation corresponding to the second write request includes:
updating local data of the target data by using the data to be written;
and updating the local version number of the target data by using the first version number.
Optionally, the method further includes:
sending own state information to the control node, wherein the own state information comprises: the node status and the size of the remaining space.
In a third aspect, an embodiment of the present invention provides a distributed data concurrent writing system, where the distributed data concurrent writing system includes: a control node and a storage node;
the control node is used for verifying a first write-in request aiming at target data sent by a client and generating a first version number corresponding to the first write-in request, the first write-in request corresponds to corresponding data to be written in, when the first write-in request is verified successfully, a preset number of storage nodes are selected for the data to be written in to obtain address information corresponding to the preset number of storage nodes, the address information corresponding to the preset number of storage nodes and the first version number are sent to the client, and metadata information corresponding to the target data is updated based on write-in result information returned by the client; the first version number is increased along with the increase of the writing times of target data, and metadata information corresponding to the target data comprises the size of the target data and information of a storage node in which the target data is written;
the storage node is configured to receive a second write request sent by a client, determine whether a first version number included in the second write request is higher than a local version number of the target data, execute a write operation corresponding to the second write request when it is determined that the first version number is higher than the local version number of the target data, and return write result information indicating that the write is successful to the client; the second write request comprises the data to be written and the first version number corresponding to the first write request for the target data, and the local version number of the target data is the version number carried in the second write request with which the target data was most recently written locally.
Optionally, the system further comprises a client;
the client is used for sending a first write-in request aiming at target data to the control node, and receiving address information corresponding to a preset number of storage nodes returned by the control node and the first version number; and sending the second write request to a preset number of storage nodes, receiving write result information returned by the preset number of storage nodes, and sending the write result information to the control node.
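The client's role in this exchange can be sketched as follows. This is an illustrative outline only: the disclosure does not specify a transport, so the three callables (`request_write`, `send_write`, `report_result`) and their signatures are hypothetical stand-ins for whatever RPC mechanism a real system would use.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical transport callables; a real system would use RPC or similar.
RequestWrite = Callable[[str, str, int], Tuple[List[str], int]]
SendWrite = Callable[[str, str, bytes, int], bool]
ReportResult = Callable[[str, int, Dict[str, bool]], None]


def client_write(
    client_id: str,
    target: str,
    data: bytes,
    request_write: RequestWrite,
    send_write: SendWrite,
    report_result: ReportResult,
) -> Dict[str, bool]:
    # First write request: the control node verifies it and returns the
    # selected storage node addresses plus the first version number.
    addresses, version = request_write(client_id, target, len(data))
    # Second write request: data and version number go to every selected node.
    results = {addr: send_write(addr, target, data, version) for addr in addresses}
    # Forward the write result information to the control node.
    report_result(target, version, results)
    return results


# Usage with in-memory fakes (node-c rejects the write).
reported = []
results = client_write(
    "client-1",
    "report.txt",
    b"hello",
    request_write=lambda cid, tgt, size: (["node-a", "node-b", "node-c"], 1),
    send_write=lambda addr, tgt, data, ver: addr != "node-c",
    report_result=lambda tgt, ver, res: reported.append((tgt, ver, res)),
)
```

Passing the transport in as callables keeps the sketch independent of any particular networking library.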
Optionally, the control node is specifically configured to:
generating a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
Optionally, the storage node is further configured to:
and when the first version number is judged to be not higher than the local version number of the target data, rejecting the write operation corresponding to the second write request, and returning write result information indicating write failure to the client.
Optionally, the storage node is specifically configured to:
updating local data of the target data by using the data to be written;
and updating the local version number of the target data by using the first version number.
Optionally, the storage node is further configured to:
sending own state information to the control node, wherein the own state information comprises: the node status and the size of the remaining space.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and the computer program, when executed by a processor, implements the method steps of the data concurrent writing method according to the first aspect.
In a fifth aspect, the present invention provides a computer-readable storage medium, in which a computer program is stored, and the computer program, when executed by a processor, implements the method steps of the data concurrent writing method according to the second aspect.
The embodiment of the invention has the following beneficial effects:
According to the data concurrent writing method and the distributed data concurrent writing system provided by the embodiment of the invention, the control node generates a corresponding first version number for each first write request for target data sent by a client. The first version number increases as the number of writes to the target data increases; that is, for the same target data, version numbers are unique and increasing. When a storage node receives a second write request, it can therefore decide from the first version number whether to execute the write operation. This ensures that the copies of the file data remain consistent when multiple clients concurrently write the same target data, solves the inconsistency among copies that such concurrent writes would otherwise cause, and improves the reliability of the data. It further allows multiple clients in the distributed file storage system to write the same target data at the same time.
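The effect can be illustrated with a small sketch in which two copies receive the same two writes in opposite orders. The sketch is hypothetical rather than taken from the disclosure: `apply_versioned` is an invented name, and it assumes that a copy which has never been written has local version number 0 and that issued version numbers start at 1.

```python
def apply_versioned(arrival_order):
    """Apply writes in arrival order, accepting only strictly higher versions."""
    replica = {"version": 0, "data": None}  # assumption: unwritten copy is version 0
    for version, payload in arrival_order:
        if version > replica["version"]:
            replica["version"] = version
            replica["data"] = payload
        # otherwise the write is rejected and the copy is left unchanged
    return replica


write_a = (1, "data-from-A")  # control node issued version 1 to client A
write_b = (2, "data-from-B")  # and version 2 to client B

copy1 = apply_versioned([write_a, write_b])  # A arrives first, then B
copy2 = apply_versioned([write_b, write_a])  # B arrives first; A is then rejected

print(copy1 == copy2)
```

Whatever the arrival order, both copies end up holding the write with the highest version number.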
Of course, not all of the advantages described above need to be achieved at the same time in the practice of any one product or method of the invention.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other embodiments can be obtained by using the drawings without creative efforts.
Fig. 1 is a schematic flowchart of a data concurrent writing method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another data concurrent writing method according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of another data concurrent writing method according to an embodiment of the present invention;
Fig. 4 is a flowchart of a data writing embodiment according to the present invention;
Fig. 5 is an interaction diagram of a data concurrent writing method according to an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of a data concurrent writing system according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
A distributed file storage system has redundancy: the failure of some nodes does not affect the normal operation of the system as a whole, and even if the data stored on a failed computer is damaged, the damaged data can be recovered from other nodes. A typical distributed file storage system generally includes a control node (also called a metadata server) and storage nodes (also called data servers). The main functions of the control node are: managing the directory tree; managing the metadata information of files uploaded by clients (also called users), such as file directories, file names, file upload times, file sizes, and the storage nodes on which the file data is distributed; and managing all storage nodes in the distributed file storage system. The main function of a storage node is to store the file data uploaded by clients.
In order to solve the problem that data inconsistency in multiple copies corresponding to target data may be caused when multiple clients perform write operations on the same target data in the prior art, an embodiment of the present invention provides a data concurrent write method, which is applied to a control node in a distributed file storage system, where the distributed file storage system includes: a control node and a storage node, the method may comprise:
verifying a first write request for target data sent by a client, and generating a version number corresponding to the first write request to obtain a first version number, where the first version number increases as the number of writes to the target data increases, and the first write request corresponds to data to be written;
when the first write request is successfully verified, selecting a preset number of storage nodes for the data to be written to obtain address information corresponding to the preset number of storage nodes;
sending the address information corresponding to the storage nodes with the preset number and the first version number to the client, so that the client sends a second write request to the storage nodes with the preset number, and the storage nodes with the preset number execute corresponding write operation according to the second write request and return write result information to the client; the second write request comprises the data to be written and the first version number;
updating metadata information corresponding to the target data based on the writing result information returned by the client; the metadata information corresponding to the target data comprises the size of the target data and information of a storage node in which the target data is written.
According to the data concurrent writing method provided by the embodiment of the invention, the control node generates a corresponding first version number for each first write request for target data sent by a client. The first version number increases as the number of writes to the target data increases; that is, for the same target data, the first version number is unique and increasing. When a storage node receives a second write request, it can therefore decide from the first version number whether to execute the write operation. This ensures that the copies of the file data remain consistent when multiple clients concurrently write the same target data, solves the inconsistency among copies that such concurrent writes would otherwise cause, and improves the reliability of the data. It further allows multiple clients in the distributed file storage system to write the same target data at the same time.
The following describes a data concurrent writing method provided by an embodiment of the present invention in detail:
In the embodiment of the invention, the data concurrent writing method is applied to a distributed file storage system, which may include a control node and a plurality of storage nodes. For example, the control node and the storage nodes may be data storage modules deployed on different servers.
As shown in fig. 1, an embodiment of the present invention provides a data concurrent writing method, which is applied to a control node in a distributed file storage system, and the method may include the following steps:
S101, verifying a first write request for target data sent by a client, and generating a version number corresponding to the first write request to obtain a first version number.
The first write request may be a write request for the target data sent by any client; in the embodiment of the present invention, multiple clients may concurrently perform write operations on the target data. The control node may verify the write requests for the same target data received from one or more clients and generate a corresponding version number for each, thereby obtaining the first version number corresponding to the first write request.
The first version number increases as the number of writes to the target data increases, and the first write request corresponds to data to be written. For example, when the control node receives the third write request for the target data, it may generate version number V3; when it receives the fifth write request for the target data, it may generate version number V5; and so on.
In the embodiment of the invention, for the same target data, the version number corresponding to each data write is unique and strictly increasing, so the version number of the target data rises with every write, which avoids disordered target data when multiple clients write the same target data at the same time.
As an optional implementation manner of the embodiment of the present invention, the first write request for the target data sent by the client may include information such as the identifier of the client and the size of the data to be written corresponding to the first write request. The control node's verification of the first write request may then be a verification of the client's permissions: for example, when a first write request for target data is received from a client, the control node may determine whether that client has permission to perform a write operation on the target data. The control node may also verify the size of the data to be written corresponding to the first write request.
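One possible shape for this verification is sketched below. The disclosure does not say how permissions are stored or what makes a data size valid, so the permission table `WRITE_PERMISSIONS`, the limit `MAX_WRITE_SIZE`, and the request fields are all assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FirstWriteRequest:
    client_id: str   # identifier of the client
    target: str      # name of the target data
    data_size: int   # size in bytes of the data to be written


# Hypothetical permission table and size limit, for illustration only.
WRITE_PERMISSIONS = {"report.txt": {"client-1", "client-2"}}
MAX_WRITE_SIZE = 64 * 1024 * 1024


def verify_first_write(req: FirstWriteRequest) -> bool:
    """Return True only if the client may write the target and the size is sane."""
    allowed_clients = WRITE_PERMISSIONS.get(req.target, set())
    if req.client_id not in allowed_clients:
        return False
    return 0 < req.data_size <= MAX_WRITE_SIZE


ok = verify_first_write(FirstWriteRequest("client-1", "report.txt", 1024))
denied = verify_first_write(FirstWriteRequest("client-9", "report.txt", 1024))
```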
As an optional implementation manner of the embodiment of the present invention, the step of generating a version number corresponding to the first write request to obtain the first version number may include:
generating a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
In the embodiment of the present invention, when the control node receives a first write request for target data sent by a client, it may generate for that request a first version number higher than the version number of the most recent previous first write request for the target data, so that, for the same target data, the version number corresponding to each data write is unique and strictly increasing.
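A minimal sketch of such a generator is given below, assuming a single control node process that keeps one counter per target and guards it with a lock so that concurrent first write requests never receive the same number. The class name `VersionAllocator` and the choice to start at 1 are illustrative assumptions, not details from the disclosure.

```python
import threading
from collections import defaultdict


class VersionAllocator:
    """Issues unique, strictly increasing version numbers per target."""

    def __init__(self):
        self._lock = threading.Lock()
        self._latest = defaultdict(int)  # target -> last issued version number

    def next_version(self, target: str) -> int:
        # The lock makes read-increment-return atomic across request threads.
        with self._lock:
            self._latest[target] += 1
            return self._latest[target]


allocator = VersionAllocator()
versions = [allocator.next_version("report.txt") for _ in range(3)]
print(versions)
```

Each target has its own counter, so writes to different targets do not affect one another's version numbers.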
S102, when the first write-in request is successfully verified, selecting a preset number of storage nodes for data to be written in, and obtaining address information corresponding to the preset number of storage nodes.
In the embodiment of the present invention, the control node may store the managed directory tree of the files, the metadata information of the files, storage node information, and the like. As noted above, the metadata information may include the size of a file and the storage nodes on which its data is stored; the storage node information may include the size of the remaining space of each storage node, whether that remaining space is available, and the like.
When the control node successfully verifies the first write request, a preset number of storage nodes can be selected for the data to be written of the target data, and address information corresponding to the preset number of storage nodes is obtained. The preset number can be set by a person skilled in the art according to actual requirements, for example, in order to ensure the reliability of data, the preset number can be set to 3, and then, when the client needs to perform write operation on target data, 3 storage nodes can be selected for the data to be written.
As an optional implementation manner of the embodiment of the present invention, the step of selecting a preset number of storage nodes for data to be written may include:
and selecting a preset number of storage nodes for the data to be written according to the size of the corresponding data to be written corresponding to the first write request and the size of the residual space of each storage node.
According to the embodiment of the invention, the control node can sort the storage nodes capable of storing the data to be written, based on the size of the data to be written corresponding to the first write request and the size of the remaining space of each storage node, to obtain a storage node sequence, and then select a preset number of storage nodes from that sequence.
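The selection step might look like the following sketch. The disclosure does not state the sort order or what happens when too few nodes qualify; ordering by descending remaining space, skipping nodes marked unhealthy, and raising an error are assumptions made here, and `NodeInfo` and `select_storage_nodes` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NodeInfo:
    address: str
    free_bytes: int       # size of the remaining space
    healthy: bool = True  # assumption: unhealthy nodes are never selected


def select_storage_nodes(
    nodes: List[NodeInfo], data_size: int, preset_number: int = 3
) -> List[str]:
    # Keep only nodes that can actually hold the data to be written.
    candidates = [n for n in nodes if n.healthy and n.free_bytes >= data_size]
    if len(candidates) < preset_number:
        raise RuntimeError("not enough storage nodes with sufficient free space")
    # Assumed ordering: most remaining space first.
    candidates.sort(key=lambda n: n.free_bytes, reverse=True)
    return [n.address for n in candidates[:preset_number]]


nodes = [
    NodeInfo("10.0.0.1:9000", 500),
    NodeInfo("10.0.0.2:9000", 50),
    NodeInfo("10.0.0.3:9000", 900),
    NodeInfo("10.0.0.4:9000", 700),
    NodeInfo("10.0.0.5:9000", 800, healthy=False),
]
selected = select_storage_nodes(nodes, data_size=100)
```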
S103, sending the address information and the first version number corresponding to the storage nodes with the preset number to the client, so that the client sends the second write-in request to the storage nodes with the preset number, and the storage nodes with the preset number respectively execute corresponding write-in operation according to the second write-in request and return the write-in result information to the client.
The control node sends address information and the first version number corresponding to the storage nodes with the preset number to the client, so that the client sends a second write request containing data to be written and the first version number to the storage nodes with the preset number, the storage nodes with the preset number execute corresponding write operation according to the second write request respectively, write result information is returned to the client, and the client forwards the write result information to the control node.
S104, updating the metadata information corresponding to the target data based on the write result information returned by the client.
The control node receives the write result information returned by the client and judges whether it indicates write success or write failure. If it indicates write success, the control node updates the metadata information corresponding to the target data, which includes the size of the target data and information on the storage nodes to which the target data was written. If it indicates write failure, the metadata information corresponding to the target data is not updated.
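The update rule can be sketched as follows. The disclosure does not say how results from several storage nodes are combined, so this sketch assumes that the write counts as successful if at least one node reports success and that only the successful nodes are recorded; `update_metadata` and the dictionary layout are hypothetical.

```python
from typing import Dict


def update_metadata(
    metadata: Dict[str, dict],
    target: str,
    data_size: int,
    write_results: Dict[str, bool],
) -> bool:
    """Record size and written nodes on success; leave metadata alone on failure."""
    written_nodes = sorted(addr for addr, ok in write_results.items() if ok)
    if not written_nodes:
        return False  # write failure: metadata is not updated
    metadata[target] = {"size": data_size, "nodes": written_nodes}
    return True


metadata: Dict[str, dict] = {}
updated = update_metadata(
    metadata, "report.txt", 5, {"node-a": True, "node-b": True, "node-c": False}
)
failed = update_metadata(metadata, "other.txt", 5, {"node-a": False})
```

A stricter policy, such as requiring every selected node to succeed, would change only the condition on `written_nodes`.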
As an optional implementation manner of the embodiment of the present invention, the control node may further receive self-state information sent by the storage node, where the self-state information of the storage node may include: the node state of the storage node, and the size of the remaining space of the storage node. For example, the state of the storage node may include information about whether the storage node is working normally, the normal speed of data writing, and the like, and the size of the remaining space of the storage node is the size of the space that can be currently used by the storage node.
In the embodiment of the invention, the control node can receive the self state information sent by the storage node, and then the state information of the storage node stored in the control node is updated, so that the storage node to be written is selected for the data to be written more accurately.
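As a rough sketch of how reported state could feed node selection, the function below filters nodes by health and remaining space. The text only says that the selection uses the node state and remaining space; ranking candidates by most free space first, the `healthy` and `free` keys, and the function name are assumptions made for illustration.

```python
def select_storage_nodes(node_states, data_size, count):
    """Pick `count` storage nodes able to hold `data_size` bytes.

    `node_states` maps a node name to its last reported state, e.g.
    {"C": {"healthy": True, "free": 100}}. The ordering policy (largest
    remaining space first, name as tie-breaker) is an assumption.
    """
    candidates = [
        name
        for name, state in node_states.items()
        if state["healthy"] and state["free"] >= data_size
    ]
    if len(candidates) < count:
        raise ValueError("not enough storage nodes with sufficient free space")
    candidates.sort(key=lambda name: (-node_states[name]["free"], name))
    return candidates[:count]
```

Because the control node works from the most recently reported state, stale reports would lead to poorer choices, which is why the storage nodes refresh this information.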
As shown in fig. 2, another data concurrent writing method provided in an embodiment of the present invention is applied to a storage node in a distributed file storage system, where the storage node may be one of a preset number of storage nodes selected by a control node for data to be written, and the method may include the following steps:
S201, receiving a second write request sent by the client.
The second write request may include a first version number corresponding to the first write request for the target data, where the first version number increases with an increase in the number of times of writing the target data.
S202, judging whether the first version number is higher than the local version number of the target data.
The storage node receives the second write request sent by the client, which contains the first version number corresponding to the first write request for the target data, and then compares the first version number with the local version number of the target data. The local version number of the target data may be the version number carried in the second write request by which the target data was most recently written locally. If the first version number is higher than the local version number of the target data, the second write request is the latest data write request for the target data, so step S203 is executed.
S203, if the first version number is higher than the local version number of the target data, executing the write operation corresponding to the second write request, and returning write result information indicating successful write to the client, so that the client sends the write result information to the control node, and the control node updates the metadata information corresponding to the target data.
That is, when the first version number is higher than the local version number of the target data, the storage node executes the write operation corresponding to the second write request, updates the local target data, and returns write result information indicating successful write to the client. The client forwards the write result information to the control node, and the control node updates the metadata information corresponding to the target data.
As in the control-node method above, because the first version number is unique and increasing for the same target data, the storage node can determine from it whether to execute the data write operation when it receives the second write request. This keeps the multiple replicas of the file data consistent when multiple clients write the same target data concurrently, improves data reliability, and allows multiple clients in the distributed file storage system to write the same target data at the same time.
On the basis of the embodiment shown in fig. 2, as shown in fig. 3, another data concurrent writing method provided by the embodiment of the present invention is applied to a storage node in a distributed file storage system, and the method may include the following steps:
S301, receiving a second write request sent by the client.
The second write request may include a first version number corresponding to the first write request for the target data, where the first version number increases with an increase in the number of times of writing the target data.
S302, whether the first version number is higher than the local version number of the target data is judged.
The local version number of the target data may be the version number carried in the second write request by which the target data was most recently written locally.
S303, if the first version number is higher than the local version number of the target data, executing the write operation corresponding to the second write request, and returning write result information indicating successful write to the client, so that the client sends the write result information to the control node, and the control node updates the metadata information corresponding to the target data.
Steps S301 to S303 may be implemented in the same way as steps S201 to S203 and are not described again here.
S304, if the first version number is not higher than the local version number of the target data, the write operation corresponding to the second write request is rejected, and write result information indicating write failure is returned to the client, so that the client sends the write result information to the control node.
If the first version number is not higher than the local version number of the target data, the locally stored target data is already the latest version. The storage node therefore refuses to update the target data, rejects the write operation corresponding to the second write request, and returns write result information indicating write failure to the client, so that the client sends the write result information to the control node.
In the embodiment of the invention, when the first version number is judged to be not higher than the local version number of the target data, the write operation corresponding to the second write request is rejected, so that the consistency of a plurality of copy file data when a plurality of clients write the same target data simultaneously is ensured, the problem of inconsistency of the plurality of copy file data when the plurality of clients write the same target data simultaneously is solved, and the reliability of the data is improved.
As an optional implementation manner of the embodiment of the present invention, as shown in fig. 4, the second write request received by the storage node may further include data to be written corresponding to the first write request for the target data, and correspondingly, when it is determined that the first version number is higher than the local version number of the target data, the step of executing the write operation corresponding to the second write request may include:
S3031, updating the local data of the target data by using the data to be written.
S3032, the local version number of the target data is updated using the first version number.
When the storage node determines that the first version number is higher than the local version number of the target data, it updates the local data of the target data by using the data to be written contained in the received second write request, and updates the local version number of the target data by using the first version number contained in that request. In other words, the data to be written that the client supplied for the first write request, together with the first version number that the control node generated for that request, overwrite the local data and the local version number of the locally stored target data. This updates the target data and completes the write operation corresponding to the second write request.
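The storage-node behavior in steps S302 to S304 and S3031 to S3032 can be sketched as below. This is an illustrative model under stated assumptions: the `Replica` and `StorageNode` classes and their method names are hypothetical, and treating a target that has never been written as having local version 0 is an assumption, since the text does not specify an initial value.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Local copy of one target on a storage node (illustrative structure)."""
    data: bytes = b""
    version: int = 0  # assumed initial version for a target never written locally


class StorageNode:
    def __init__(self):
        self._replicas = {}  # target name -> Replica

    def handle_second_write(self, target, first_version, payload):
        """Return True if the write was executed, False if it was rejected."""
        replica = self._replicas.setdefault(target, Replica())
        if first_version <= replica.version:
            # S304: version is not higher than the local one, so reject.
            return False
        replica.data = payload           # S3031: overwrite the local data
        replica.version = first_version  # S3032: overwrite the local version number
        return True

    def local_version(self, target):
        return self._replicas[target].version if target in self._replicas else 0
```

Note that a request carrying a version equal to the local version is also rejected, because the condition is "not higher than" rather than "lower than".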
As an optional implementation manner of the embodiment of the present invention, the storage node may further send its own state information to the control node, where the own state information of the storage node may include: node status and remaining space size. For example, the state of the node may include information about whether the storage node is working normally, the normal speed of data writing, and the like, and the size of the remaining space is the size of the space that can be currently used by the storage node.
In the embodiment of the invention, the storage node can send the state information of the storage node to the control node at regular time or periodically, so that the control node can update the stored state information of the storage node, and the storage node to be written is selected for the data to be written more accurately.
For example, suppose two clients A and B write the same target data concurrently. The control node receives the first write requests sent by client A and client B for the same target data and verifies each of them. It also generates a version number for the write request of client A, denoted V1, and a version number for the write request of client B, denoted V2, where V2 is higher than V1. When both write requests are verified successfully, the control node selects three storage nodes C, D and E for the data to be written of each request, sends the address information of storage nodes C, D and E together with the version number V1 to client A, and sends the address information of storage nodes C, D and E together with the version number V2 to client B.
The client a receives the address information and the version number V1 corresponding to the storage nodes C, D and E returned by the control node, and sends a second write request to the storage nodes C, D and E, respectively, where the second write request carries the version number V1 and corresponding data to be written.
The client B receives the address information and the version number V2 corresponding to the storage nodes C, D and E returned by the control node, and sends a second write request to the storage nodes C, D and E, respectively, where the second write request carries the version number V2 and corresponding data to be written.
Taking storage node C as an example, suppose the local version number of the target data is V0. If storage node C first receives the second write request sent by client A, it determines that the version number V1 carried by that request is higher than the local version number V0, updates the local data to the data to be written carried in client A's second write request, and updates the local version number from V0 to V1. When storage node C then receives the second write request sent by client B, it determines that the version number V2 carried by that request is higher than the local version number V1, updates the local data to the data to be written carried in client B's second write request, and updates the local version number from V1 to V2.
Alternatively, if storage node C first receives the second write request sent by client B, it determines that the version number V2 carried by that request is higher than the local version number V0, updates the local data to the data to be written carried in client B's second write request, and updates the local version number from V0 to V2. When storage node C then receives the second write request sent by client A, it determines that the version number V1 carried by that request is not higher than the local version number V2, so the local data is not updated and the local version number remains V2.
The storage nodes D and E execute the same operation, so that no matter the storage nodes C, D and E receive the second write request sent by the client a first or the second write request sent by the client B first, the version numbers corresponding to the data finally stored in the storage nodes C, D and E are all V2, which ensures the consistency of multiple copy file data when multiple clients write the same target data concurrently, solves the problem of inconsistency of multiple copy file data when multiple clients write the same target data concurrently, and improves the reliability of data.
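The order-independence claimed in this example can be checked with a small simulation. The sketch below assumes integer version numbers with V0 = 0, V1 = 1 and V2 = 2; the function name and the payload strings are illustrative only.

```python
from itertools import permutations


def apply_requests(requests, initial_version=0):
    """Apply (version, data) requests in arrival order on one storage node.

    A request is executed only if its version is higher than the current
    local version; otherwise it is rejected and the state is unchanged.
    Returns the final (local_version, local_data).
    """
    version, data = initial_version, None
    for req_version, req_data in requests:
        if req_version > version:
            version, data = req_version, req_data
    return version, data


request_a = (1, "data from A")  # carries V1
request_b = (2, "data from B")  # carries V2, which is higher than V1

# Every arrival order (A then B, B then A) ends in the same final state.
final_states = {apply_requests(order) for order in permutations([request_a, request_b])}
```

Since each of the storage nodes C, D and E applies the same rule independently, each converges to version V2 and client B's data even if the two requests reach different nodes in different orders.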
Fig. 5 is an interaction schematic diagram of a data concurrent writing method according to an embodiment of the present invention.
The client sends a first write request for target data to the control node.
The control node verifies the first write request for the target data sent by the client and generates a first version number corresponding to the first write request, where the first write request corresponds to data to be written. When the first write request is verified successfully, the control node selects a preset number of storage nodes for the data to be written, obtains the address information corresponding to those storage nodes, and sends the address information and the first version number to the client.
The client receives the address information corresponding to the preset number of storage nodes and the first version number sent by the control node, and sends a second write request to each of the preset number of storage nodes.
Each storage node receives the second write request sent by the client, judges whether the first version number contained in the second write request is higher than the local version number of the target data, executes or rejects the write operation corresponding to the second write request accordingly, and returns write result information to the client.
The client receives the write result information returned by the storage nodes and returns it to the control node.
The control node updates the metadata information corresponding to the target data based on the write result information returned by the client.
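The interaction in fig. 5 can be sketched end to end as follows. This is a simplified single-process model, not the patented implementation: all class and function names are hypothetical, the verification step is reduced to a placeholder check, node selection simply takes the first few addresses, and treating the write as successful only when every selected storage node reports success is an assumption, since the text does not say how per-node results are combined.

```python
class ControlNode:
    def __init__(self, node_addresses, replicas=3):
        self.node_addresses = list(node_addresses)
        self.replicas = replicas
        self.versions = {}  # target -> last issued first version number
        self.metadata = {}  # target -> {"size": int, "nodes": [addresses]}

    def handle_first_write(self, target, data):
        if not target:  # placeholder for the verification step (assumption)
            raise ValueError("first write request failed verification")
        version = self.versions.get(target, 0) + 1
        self.versions[target] = version
        return self.node_addresses[: self.replicas], version

    def handle_write_result(self, target, size, results):
        # Assumption: success means every selected node executed the write.
        if results and all(results.values()):
            self.metadata[target] = {"size": size, "nodes": list(results)}


class StorageNode:
    def __init__(self):
        self.data = {}
        self.versions = {}

    def handle_second_write(self, target, version, data):
        if version > self.versions.get(target, 0):
            self.data[target] = data
            self.versions[target] = version
            return True
        return False


def client_write(control, nodes, target, data):
    """Run one full write: first request, second requests, result report."""
    addresses, version = control.handle_first_write(target, data)
    results = {
        addr: nodes[addr].handle_second_write(target, version, data)
        for addr in addresses
    }
    control.handle_write_result(target, len(data), results)
    return version, results
```

In this model the client never talks to a storage node before it has obtained a version number, which is what lets each storage node order competing writes without coordinating with the others.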
Corresponding to the method embodiment, the embodiment of the invention also provides a corresponding system embodiment.
As shown in fig. 6, an embodiment of the present invention provides a distributed data concurrent writing system, where the distributed data concurrent writing system includes: a control node and a storage node;
the control node 401 is configured to verify a first write request for target data sent by the client 402, generate a first version number corresponding to the first write request, where the first write request corresponds to corresponding data to be written, select a preset number of storage nodes 403 for the data to be written when the first write request is successfully verified, obtain address information corresponding to the preset number of storage nodes, send the address information corresponding to the preset number of storage nodes and the first version number to the client 402, and update metadata information corresponding to the target data based on write result information returned by the client 402; the first version number increases with the increase of the writing times of the target data, and the metadata information corresponding to the target data comprises the size of the target data and the information of the storage node in which the target data is written.
The storage node 403 is configured to receive a second write request sent by the client 402, determine whether the first version number included in the second write request is higher than the local version number of the target data, execute the write operation corresponding to the second write request when the first version number is determined to be higher than the local version number of the target data, and return write result information indicating successful write to the client 402; the second write request comprises the first version number corresponding to the first write request for the target data and the data to be written, and the local version number of the target data is the version number carried in the second write request by which the target data was most recently written locally.
In the distributed data concurrent writing system provided by the embodiment of the invention, the control node generates a first version number for each first write request for the target data sent by a client. The first version number increases with the number of writes to the target data; that is, for the same target data it is unique and increasing. When a storage node receives the second write request, it can therefore determine from the first version number whether to execute the data write operation. This keeps the multiple replicas of the file data consistent when multiple clients write the same target data concurrently, improves data reliability, and allows multiple clients in the distributed file storage system to write the same target data at the same time.
Optionally, the system further includes a client 402:
the client 402 is configured to send a first write request for target data to the control node 401, and receive address information corresponding to a preset number of storage nodes returned by the control node 401 and a first version number; and sending the second write request to the storage nodes 403 with the preset number, receiving write result information returned by the storage nodes with the preset number, and sending the write result information to the control node 401.
Optionally, the control node 401 is specifically configured to:
generate a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
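One way to obtain version numbers that are unique and increasing per target, even when first write requests arrive concurrently, is a counter guarded by a lock. The sketch below is an assumption about how this could be done; the text does not prescribe a mechanism, and the `VersionAllocator` name and integer versions are illustrative.

```python
import threading
from collections import defaultdict


class VersionAllocator:
    """Issues a strictly increasing version number per target (illustrative)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._last = defaultdict(int)  # target -> last issued version

    def next_version(self, target):
        # The lock makes read-increment-write atomic, so two concurrent
        # requests for the same target can never receive the same number.
        with self._lock:
            self._last[target] += 1
            return self._last[target]
```

Each target has its own counter, so writes to different targets do not affect one another's version sequences.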
Optionally, the storage node 403 is further configured to:
and when the first version number is judged not to be higher than the local version number of the target data, rejecting the write operation corresponding to the second write request, and returning write result information indicating write failure to the client 402.
Optionally, the storage node 403 is specifically configured to:
updating local data of the target data by using the data to be written;
the local version number of the target data is updated using the first version number.
Optionally, the storage node 403 is further configured to:
sending own state information to the control node 401, where the own state information includes: node status and remaining space size.
In another embodiment of the present invention, a computer-readable storage medium is further provided, in which a computer program is stored, and the computer program, when executed by a processor, implements the steps of any one of the above-mentioned data concurrent writing methods to achieve the same effects.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
All the embodiments in the present specification are described in a related manner, and the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only for the preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.

Claims (16)

1. A data concurrent writing method is applied to a control node in a distributed file storage system, and the distributed file storage system comprises: a control node and a storage node, the method comprising:
verifying a first write request for target data sent by a client, and generating a version number corresponding to the first write request to obtain a first version number, wherein the first version number increases as the number of writes to the target data increases, and the first write request corresponds to data to be written;
when the first write request is successfully verified, selecting a preset number of storage nodes for the data to be written to obtain address information corresponding to the preset number of storage nodes;
sending the address information corresponding to the storage nodes with the preset number and the first version number to the client, so that the client sends a second write request to the storage nodes with the preset number, and the storage nodes with the preset number execute corresponding write operation according to the second write request and return write result information to the client; the second write request comprises the data to be written and the first version number;
updating metadata information corresponding to the target data based on the writing result information returned by the client; the metadata information corresponding to the target data comprises the size of the target data and information of a storage node in which the target data is written.
2. The method according to claim 1, wherein the step of generating a version number corresponding to the first write request to obtain a first version number comprises:
generating a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
3. The method according to any one of claims 1 or 2, wherein the step of selecting a predetermined number of storage nodes for the data to be written comprises:
and selecting a preset number of storage nodes for the data to be written according to the size of the corresponding data to be written corresponding to the first write request and the size of the residual space of each storage node.
4. The method of claim 1, further comprising:
receiving self-state information sent by a storage node, wherein the self-state information of the storage node comprises: the node state of the storage node, and the size of the remaining space of the storage node.
5. A data concurrent writing method, applied to a storage node in a distributed file storage system, the distributed file storage system comprising: a control node and a storage node, the method comprising:
receiving a second write request sent by a client, wherein the second write request comprises a first version number corresponding to a first write request for target data, and the first version number increases as the number of writes to the target data increases;
judging whether the first version number is higher than a local version number of the target data, wherein the local version number of the target data is the version number carried in the second write request by which the target data was most recently written locally;
and if the first version number is higher than the local version number of the target data, executing the write operation corresponding to the second write request, and returning write result information indicating successful write to the client, so that the client sends the write result information to a control node, and the control node updates metadata information corresponding to the target data.
6. The method of claim 5, further comprising:
and if the first version number is not higher than the local version number of the target data, rejecting the write operation corresponding to the second write request, and returning write result information indicating write failure to the client, so that the client sends the write result information to a control node.
7. The method according to any one of claims 5 or 6, wherein the second write request further includes data to be written corresponding to the first write request for the target data; the step of executing the write operation corresponding to the second write request includes:
updating local data of the target data by using the data to be written;
and updating the local version number of the target data by using the first version number.
8. The method of claim 5, further comprising:
sending own state information to the control node, wherein the own state information comprises: the node status and the size of the remaining space.
9. A distributed data concurrent write system, comprising: a control node and a storage node;
the control node is used for verifying a first write-in request aiming at target data sent by a client and generating a first version number corresponding to the first write-in request, the first write-in request corresponds to corresponding data to be written in, when the first write-in request is verified successfully, a preset number of storage nodes are selected for the data to be written in to obtain address information corresponding to the preset number of storage nodes, the address information corresponding to the preset number of storage nodes and the first version number are sent to the client, and metadata information corresponding to the target data is updated based on write-in result information returned by the client; the first version number is increased along with the increase of the writing times of target data, and metadata information corresponding to the target data comprises the size of the target data and information of a storage node in which the target data is written;
the storage node is configured to receive a second write request sent by a client, determine whether a first version number included in the second write request is higher than a local version number of the target data, execute a write operation corresponding to the second write request when it is determined that the first version number is higher than the local version number of the target data, and return write result information indicating that the write is successful to the client; the second write request comprises the first version number corresponding to a first write request for target data and the data to be written, and the local version number of the target data is the version number carried in the second write request by which the target data was most recently written locally.
10. The system of claim 9, further comprising a client;
the client is used for sending a first write-in request aiming at target data to the control node, and receiving address information corresponding to a preset number of storage nodes returned by the control node and the first version number; and sending the second write request to a preset number of storage nodes, receiving write result information returned by the preset number of storage nodes, and sending the write result information to the control node.
11. The system according to any of claims 9 or 10, wherein the control node is specifically configured to:
generating a first version number that is higher than the version number corresponding to the most recent previous first write request for the target data.
12. The system of any of claims 9 or 10, wherein the storage node is further configured to:
and when the first version number is judged to be not higher than the local version number of the target data, rejecting the write operation corresponding to the second write request, and returning write result information indicating write failure to the client.
13. The system according to any of claims 9 or 10, wherein the storage node is specifically configured to:
updating local data of the target data by using the data to be written;
and updating the local version number of the target data by using the first version number.
14. The system of any of claims 9 or 10, wherein the storage node is further configured to:
sending own state information to the control node, wherein the own state information comprises: the node status and the size of the remaining space.
15. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 4.
16. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any of the claims 5-8.
CN202011431452.7A 2020-12-09 2020-12-09 Data concurrent writing method and distributed data concurrent writing system Pending CN112486932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011431452.7A CN112486932A (en) 2020-12-09 2020-12-09 Data concurrent writing method and distributed data concurrent writing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011431452.7A CN112486932A (en) 2020-12-09 2020-12-09 Data concurrent writing method and distributed data concurrent writing system

Publications (1)

Publication Number Publication Date
CN112486932A true CN112486932A (en) 2021-03-12

Family

ID=74941030

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011431452.7A Pending CN112486932A (en) 2020-12-09 2020-12-09 Data concurrent writing method and distributed data concurrent writing system

Country Status (1)

Country Link
CN (1) CN112486932A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113778331A (en) * 2021-08-12 2021-12-10 联想凌拓科技有限公司 Data processing method, main node and storage medium
CN113778331B (en) * 2021-08-12 2024-06-07 联想凌拓科技有限公司 Data processing method, master node and storage medium

Similar Documents

Publication Publication Date Title
US10642694B2 (en) Monitoring containers in a distributed computing system
US10896102B2 (en) Implementing secure communication in a distributed computing system
CN102317923B (en) Storage system
US9411685B2 (en) Parity chunk operating method and data server apparatus for supporting the same in distributed raid system
US9607001B2 (en) Automated failover of a metadata node in a distributed file system
US20130282668A1 (en) Automatic repair of corrupt hbases
CN109831540B (en) Distributed storage method and device, electronic equipment and storage medium
CN102938784A (en) Method and system used for data storage and used in distributed storage system
CN108733311B (en) Method and apparatus for managing storage system
US8745342B2 (en) Computer system for controlling backups using wide area network
CN103597463A (en) Automatic configuration of a recovery service
CN104067240A (en) Block level storage
US7849355B2 (en) Distributed object sharing system and method thereof
CN113268472B (en) Distributed data storage system and method
CN112486932A (en) Data concurrent writing method and distributed data concurrent writing system
CN112579550B (en) Metadata information synchronization method and system of distributed file system
CN114138192A (en) Storage node online upgrading method, device, system and storage medium
US10180787B2 (en) Dispersed storage write process with lock/persist
CN111752892B (en) Distributed file system and implementation method, management system, equipment and medium thereof
CN110119388B (en) File reading and writing method, device, system, equipment and computer readable storage medium
CN106991121B (en) Super-fusion data storage method and system
CN117376364A (en) Data processing method and related equipment
CN109154880B (en) Consistent storage data in a decentralized storage network
CN112486942B (en) Multi-copy storage method and multi-copy storage system for file data
CN114528139A (en) Method, device, electronic equipment and medium for data processing and node deployment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination