WO2020253407A1 - Method and apparatus for performing write operations and read operations - Google Patents

Method and apparatus for performing write operations and read operations

Info

Publication number
WO2020253407A1
Authority
WO
WIPO (PCT)
Prior art keywords
client
storage resource
data
server
read
Prior art date
Application number
PCT/CN2020/088787
Other languages
English (en)
French (fr)
Inventor
罗四维 (Luo Siwei)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2020253407A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Definitions

  • This application relates to the field of communication technology, and in particular to a method and device for performing write operations and read operations.
  • In the prior art, the system architecture is generally as shown in FIG. 1: the client forwards the IO request to the cluster server through a coordination node, which handles the processing of the IO request.
  • The specific steps are shown in FIG. 2.
  • The client sends a write request to the corresponding coordination node, and after receiving the write request the coordination node asks the cluster management node to process it.
  • The cluster management node determines a partition for processing the write request according to the current system load, where the partition includes three servers.
  • The coordination node allocates a data write address for the client according to the partition allocated by the cluster management node and the data information contained in the write request, and notifies the client of the partition and the write address.
  • The client writes the data information contained in the write request to the corresponding locations on the three servers in the partition, and after completing the write, notifies the coordination node of the end position of this data write.
  • The main problem with this multi-node cluster operation method in a distributed storage system is that the client and the cluster servers must forward messages through the coordination node, and the many interaction steps are prone to delay.
  • Because the coordination node can serve only a limited number of clients at a time, it easily becomes a system bottleneck, and a failure of the coordination node affects the normal operation of system services.
  • The present application therefore provides a method and device for performing write operations and read operations, so as to avoid the prior-art problems of forwarding messages through a coordination node, which involves many interactions and is prone to delay and system bottlenecks.
  • an embodiment of the present application provides a method for performing a write operation.
  • A server receives a first write request sent by a first client; the first write request includes an identifier of the first client and first data to be written. The server receives a second write request sent by a second client; the second write request includes an identifier of the second client and second data to be written. According to the identifier of the first client and the stored correspondence between client identifiers and allocated storage resources, the server determines that the storage resource allocated to the first client is a first storage resource. According to the identifier of the second client and the same stored correspondence, the server determines that the storage resource allocated to the second client is a second storage resource, whose physical address is different from the physical address of the first storage resource. The server then stores the first data to be written into the first storage resource and the second data to be written into the second storage resource, and creates a correspondence between each client's identifier and the physical address where its storage resource is located.
  • When the distributed storage system performs a write operation, after the server receives a write request sent by a client, it determines the storage resource allocated to that client based on the stored correspondence between client identifiers and allocated storage resources and the client identifier included in the write request; the data to be written included in the write request is then stored in the corresponding storage resource.
  • In other words, the server itself determines the location at which to store the data to be written and stores the data there. No coordination node is needed to determine the storage location and notify the client of it.
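The write path described above can be sketched as follows. This is a minimal illustrative model, not the patent's implementation: the class and method names, and the use of an in-memory byte buffer as a "storage resource", are all assumptions for clarity. The point it demonstrates is that the server alone resolves the client identifier to a storage resource and records the resulting physical address, with no coordination node.

```python
# Hypothetical sketch of the coordination-free write path: the server
# looks up (or creates) the storage resource allocated to the client,
# appends the data there, and records the client-id -> address mapping.

class StorageServer:
    def __init__(self):
        # client id -> bytearray acting as that client's storage resource
        self.resources = {}
        # client id -> list of (start, length) physical-address records
        self.address_map = {}

    def handle_write(self, client_id, data):
        # Determine the storage resource allocated to this client from
        # the stored correspondence; allocate one on first contact.
        resource = self.resources.setdefault(client_id, bytearray())
        start = len(resource)
        resource.extend(data)
        # Create the correspondence between the client identifier and
        # the physical address where the data was stored.
        self.address_map.setdefault(client_id, []).append((start, len(data)))
        return start, len(data)

server = StorageServer()
pos1 = server.handle_write("client-1", b"first data")
pos2 = server.handle_write("client-2", b"second data")
```

Because each client writes only into its own resource, two clients writing concurrently never contend for the same write position, which is what removes the need for a coordinating allocator.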
  • After the server creates the correspondence between the identifier of the first client and the physical address of the first storage resource, and between the identifier of the second client and the physical address of the second storage resource, the server sends the first correspondence to the first client and the second correspondence to the second client.
  • The server sends each client the correspondence between its identifier and the physical address of its storage resource so that the client can determine the starting position of the data to be read when it later needs to send a read request to the server.
  • the server receives a first read request sent by the first client, and the first read request includes the identification of the first client, the starting position and the length of the first data to be read;
  • the server receives a second read request sent by the second client, and the second read request includes the identification of the second client, the start position and the length of the second data to be read;
  • The server determines the first data to be read according to the first read request and sends it to the first client, and determines the second data to be read according to the second read request and sends it to the second client.
  • After the server receives a read request, it determines the data to be read according to the client identifier and the starting position and length carried in the read request, reads the corresponding data, and sends the read data to the client.
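The read path above reduces to a lookup plus a slice. The sketch below is an assumption-laden illustration (the `handle_read` function and the dict-of-bytes layout are invented for this example): the server resolves the client identifier to its storage resource, then returns the bytes at the requested starting position and length.

```python
# Illustrative read path: resolve the client's storage resource from
# the stored correspondence, then slice out [start, start + length).

def handle_read(resources, client_id, start, length):
    resource = resources[client_id]  # client id -> stored bytes
    return bytes(resource[start:start + length])

resources = {"client-1": bytearray(b"hello distributed storage")}
data = handle_read(resources, "client-1", 6, 11)
```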
  • Before the server stores the first data to be written into the first storage resource and the second data to be written into the second storage resource: if the server determines that the remaining storage space in the first storage resource is smaller than the size of the first data to be written, it continues to allocate at least one third storage resource to the first client and records the correspondence between the identifier of the first client and the identifier of the at least one third storage resource; if the server determines that the remaining storage space in the second storage resource is smaller than the size of the second data to be written, it continues to allocate at least one fourth storage resource to the second client and records the correspondence between the identifier of the second client and the identifier of the at least one fourth storage resource. Storing the data then means that the server stores part of the first data to be written in the first storage resource and the remaining part in the at least one third storage resource, and likewise for the second data to be written.
  • In other words, before storing data the server first determines whether the remaining space of the current storage resource is sufficient. If it is, the server stores the data in that resource directly; if not, it creates a new storage resource. The server can therefore create storage resources on demand according to actual conditions, effectively reducing memory usage.
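A sketch of that capacity check follows. The fixed resource size, the `store` function, and the list-of-bytearrays chain are illustrative assumptions; the mechanism shown is the one described above: write what fits, then allocate a further resource for the remainder and record it against the same client identifier.

```python
# Sketch of overflow handling: check remaining space before writing;
# if the data does not fit, split it across a newly allocated resource.

RESOURCE_SIZE = 8  # bytes per storage resource (illustrative value)

def store(client_resources, client_id, data):
    chain = client_resources.setdefault(client_id, [bytearray()])
    current = chain[-1]
    remaining = RESOURCE_SIZE - len(current)
    if remaining < len(data):
        # Store the part that fits, then allocate a new resource for
        # the remaining part and record it for the same client.
        current.extend(data[:remaining])
        chain.append(bytearray(data[remaining:]))
    else:
        current.extend(data)

client_resources = {}
store(client_resources, "client-1", b"abcdef")  # fits in resource 1
store(client_resources, "client-1", b"ghijkl")  # spills into resource 2
```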
  • an embodiment of the present application also provides a method for performing a read operation.
  • The server receives a first read request sent by a first client; the first read request includes the identifier of the first client and the starting position and length of the first data to be read.
  • The server receives a second read request sent by a second client; the second read request includes the identifier of the second client and the starting position and length of the second data to be read.
  • The server determines the first data to be read according to the first read request and sends it to the first client, and determines the second data to be read according to the second read request and sends it to the second client.
  • When the distributed storage system performs a read operation, after the server receives the read request sent by the client, it determines the data to be read according to the client identifier and information such as the starting position and length in the read request, reads the corresponding data, and sends the read data to the client.
  • The server determining the first data to be read according to the first read request and the second data to be read according to the second read request includes: the server determines, according to the identifier of the first client, that the storage resource allocated to the first client is the first storage resource, and determines, according to the identifier of the second client, that the storage resource allocated to the second client is the second storage resource; the server then determines the first data to be read from the first storage resource according to the correspondence between the identifier of the first client and the physical address of the first storage resource, together with the starting position and length of the first data to be read, and determines the second data to be read from the second storage resource according to the correspondence between the identifier of the second client and the physical address of the second storage resource, together with the starting position and length of the second data to be read.
  • After the server receives a read request, it determines the data to be read according to the client identifier and the starting position and length carried in the read request, reads the corresponding data, and sends the read data to the client.
  • The server determining the first storage resource allocated to the first client according to the identifier of the first client, and the second storage resource allocated to the second client according to the identifier of the second client, includes: the server determines the first storage resource according to the identifier of the first client and the stored correspondence between client identifiers and allocated storage resources, and determines the second storage resource according to the identifier of the second client and the same stored correspondence.
  • This provides a way for the server, after receiving a read request, to determine the storage resources allocated to the client according to the client's identifier, namely by combining that identifier with the stored correspondence between client identifiers and allocated storage resources.
  • an embodiment of the present application also provides a device for performing a write operation.
  • the device may be a server in a distributed storage system.
  • the device may be used to execute the first aspect and any possible implementation of the first aspect.
  • the apparatus may include modules or units for performing the operations in the first aspect or any possible implementation of the first aspect.
  • it includes a processing unit and a communication unit.
  • an embodiment of the present application also provides a device for performing a read operation.
  • the device may be a client in a distributed storage system.
  • The device may be used to perform the operations in the second aspect or any possible implementation of the second aspect.
  • the apparatus may include modules or units for performing each operation in the foregoing second aspect or any possible implementation of the second aspect.
  • it includes a processing unit and a communication unit.
  • an embodiment of the present application also provides a distributed storage system, including the server of the third aspect and the client of the fourth aspect.
  • The embodiments of the present application provide a chip system including a processor and, optionally, a memory. The memory is used to store a computer program, and the processor is used to call and run the computer program from the memory, so that a communication device installed with the chip system executes any method of the first aspect or any possible implementation of the first aspect, and/or any method of the second aspect or any possible implementation of the second aspect.
  • The embodiments of the present application provide a computer program product comprising computer program code. When the computer program code is run by the communication unit, processing unit, transceiver, or processor of a communication device, the communication device performs any method of the first aspect or any possible implementation of the first aspect, and/or any method of the second aspect or any possible implementation of the second aspect.
  • an embodiment of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores a program.
  • The program enables a communication device (for example, a server in a distributed storage system) to execute any method of the first aspect or any possible implementation of the first aspect, and/or enables a communication device (for example, a client in a distributed storage system) installed with a chip system to execute any method of the second aspect or any possible implementation of the second aspect.
  • FIG. 1 is a schematic diagram of a system architecture of a distributed storage system for multi-node operation in the prior art
  • FIG. 2 is a schematic diagram of performing a write operation in the prior art
  • FIG. 3a is a schematic diagram of a system for performing a write operation provided by this application.
  • 3b and 3c are schematic diagrams of the first storage method for performing write operations provided by this application.
  • FIG. 4 is a schematic flowchart of performing a write operation provided by this application
  • FIG. 5 is a schematic diagram of a failure in the data writing process provided by this application
  • FIG. 6a is a schematic diagram of the first reading situation for performing a read operation provided by this application.
  • FIG. 6b is a schematic diagram of a second reading situation for performing a read operation provided by this application.
  • FIG. 7 is a schematic flowchart of performing a read operation provided by this application.
  • FIG. 8 is a schematic diagram of the first device for performing a write operation provided by this application.
  • FIG. 9 is a schematic diagram of a second device for performing a write operation provided by this application.
  • FIG. 10 is a schematic diagram of a method for performing a write operation provided by this application.
  • FIG. 11 is a schematic diagram of the first device for performing a read operation provided by this application.
  • FIG. 12 is a schematic diagram of the second device for performing a read operation provided by this application.
  • FIG. 13 is a schematic diagram of a method for performing a read operation provided by this application.
  • A method for performing a write operation in a distributed storage system is generally: at least one client that needs to perform a write operation requests the corresponding coordination node to allocate a server for processing.
  • The coordination node determines, for each client that needs to perform a write operation, the server that will process it, and allocates the starting position of this data write.
  • Each client that needs to perform a write operation writes data according to the server allocated by the corresponding coordination node and the allocated starting position of this data write, and after completing the write informs the coordination node of the end position of this data write. In this way, the write operation is realized through the intermediate coordination and information transfer of the coordination node.
  • With this approach, the client and the cluster servers must use the coordination node to forward messages, and the many interaction steps are prone to delay.
  • Because the coordination node can serve only a limited number of clients at a time, it easily becomes a system bottleneck, and a failure of the coordination node affects the normal operation of system services.
  • For example, suppose client 1, client 2, and client 3 need to perform write operations simultaneously and each requests the coordination node to allocate a server for processing. Because the coordination node must allocate write locations to the clients one by one, some clients are inevitably assigned write locations first while others keep waiting for the coordination node, so some clients experience delayed write operations. This phenomenon becomes more pronounced as the number of clients requesting write operations increases.
  • an embodiment of the present application provides a method for performing a write operation. It can be applied to any distributed storage system that needs to perform write operations.
  • The basic idea of the embodiments of the present application is to remove, when performing a write operation in a distributed storage system, the steps of forwarding messages through the coordination node and allocating write locations through the coordination node. That is, in the embodiments of the present application the coordination node is no longer needed in the overall system architecture, which minimizes the interaction steps and the number of message forwards.
  • Write operations can be sent directly to the server allocated by the cluster management node, without waiting for a coordination node to allocate the starting position of the write, which effectively reduces waiting delay.
  • A distributed storage system to which this embodiment of the application can be applied may specifically include clients (client 1 to client 7), a cluster management node, and servers (server 1, server 2, and server 3).
  • Client 1 is configured to send a write request to the cluster management node.
  • The cluster management node allocates a server (such as server 2) to client 1 according to the current processing resources of servers 1 to 3, such as their current availability and storage resources, and sends the identity of the allocated server 2 to client 1.
  • Client 1 can then send a write request (including the data to be written) directly to server 2, without any intermediate forwarding.
  • Server 2 first determines whether storage resources have been allocated to client 1. If not, it directly allocates a storage resource to client 1 and then stores the data to be written in that resource; if storage resources have already been allocated to client 1 in advance, it stores the data to be written in the free area of those resources.
  • Server 2 may also carry the storage address information of the data to be written in the write request response message fed back to client 1.
  • After server 2 successfully writes the data to be written, it may also create the correspondence between the identifier of client 1 and the physical address to which the data was written.
  • Subsequently, client 1 may send a read request to the server, where the read request includes information such as the client identifier and the starting position and length of the data to be read. After receiving the read request, server 2 finds the storage address information corresponding to the read request according to the client identifier in the request and the created correspondence between client identifiers and the physical addresses to which data was written, then reads the corresponding data using the starting position and length of the data to be read, and sends the read data to client 1.
  • In other words, the client can directly contact the cluster management node and then request the server allocated by the cluster management node to perform the IO processing.
  • multiple clients may need to perform IO processing at the same time in the distributed storage system.
  • FIG. 3a there are currently 7 clients that need to perform IO processing.
  • client 1, client 2, client 3, and client 5 need to perform read operations in the IO processing.
  • client 4, Client 6, and Client 7 need to perform write operations in IO processing.
  • Client 1, client 2, client 3, and client 5 may directly send a server allocation request for the read operation to the cluster management node. They then determine the server that will execute the read request according to the allocation information sent by the cluster management node, and send read requests to the allocated server.
  • Client 4, client 6, and client 7 may directly send a server allocation request for the write operation to the cluster management node, then determine the server that will execute the write request according to the allocation information sent by the cluster management node, and send write requests to that server.
  • That is, the client determines the server that performs IO processing from the allocation information sent by the cluster management node.
  • The allocation information includes the IP address of the server, so that the client can determine the server for IO processing based on that IP address.
  • The server performing the read operation for client 1 and client 2 is server 1.
  • the server performing the read operation corresponding to the client 3 and the client 5 is the server 2.
  • the server performing the read operation corresponding to the client 4, the client 6 and the client 7 is the server 3.
  • After server 1 receives the read requests sent by client 1 and client 2, it determines the data corresponding to each read request and returns the data to client 1 and client 2 respectively.
  • Similarly, server 2 determines the data corresponding to the read requests it receives and returns the data to client 3 and client 5.
  • After server 3 receives the write requests sent by client 4, client 6, and client 7, it stores the data carried in each write request in the corresponding location and returns the execution result to client 4, client 6, and client 7.
  • a node refers to a device in a distributed storage system.
  • the nodes can be divided into storage nodes and access nodes.
  • the storage nodes are used to store data, and the access nodes are used to access data in the storage nodes.
  • the node may be a client, a cluster management node, or a server in the system shown in FIG. 3a.
  • When selecting a storage node, a distributed hash table (DHT) method is usually used for routing, although this embodiment of the application is not limited thereto.
  • In the DHT method, the hash ring is evenly divided into several parts; each part is called a partition, and each partition corresponds to a section of storage space of a set size. Understandably, the more partitions there are, the smaller the storage space corresponding to each partition; the fewer the partitions, the larger the storage space corresponding to each partition.
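The ring division above can be sketched as follows. This is a generic DHT-style routing illustration, not the patent's routing scheme: the partition count, the choice of MD5, and the 32-bit ring width are all assumptions for the example.

```python
# Minimal hash-ring partitioning sketch: hash a key onto a fixed-size
# ring, then map the ring position to one of the evenly sized partitions.

import hashlib

NUM_PARTITIONS = 16  # more partitions -> less storage space per partition

def partition_for(key: str) -> int:
    h = int(hashlib.md5(key.encode()).hexdigest(), 16)
    ring_position = h % (2 ** 32)          # position on a 32-bit ring
    # Evenly divide the ring: each partition covers 2**32 / NUM_PARTITIONS.
    return ring_position * NUM_PARTITIONS // (2 ** 32)

p = partition_for("client-1")
```

The same key always hashes to the same partition, which is what lets any node route a request without consulting a coordinator.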
  • the cluster management node is used to manage the distribution of partitions in the distributed storage system, and provides partition change management and cluster management, such as capacity expansion, shrinkage, and upgrades.
  • The cluster management node may allocate a corresponding partition to the client to process the read operation or write operation, and the partition may belong to different servers.
  • In a distributed storage system, to better improve data reliability, a partition can generally correspond to multiple servers, and during a write operation the client determines that the write succeeded only after receiving success messages from all the servers in the partition. In practical applications, a partition corresponds to at most three servers, so the data can be stored on the three corresponding servers in the partition to ensure its reliability and availability.
  • A server in a partition mainly has two statuses: OK and UNOK. If the server can work normally or is working normally, its status is OK; if the server has failed and is in the process of recovering from the failure, its status is UNOK. After data recovery is completed, the cluster management node updates the status of a UNOK server back to OK.
  • In the prior art, the coordination node is used to forward the IO request sent by the client to the cluster management node, to receive the server ID allocated by the cluster management node and the address allocated for the IO request, and to notify the client of the server ID and the allocated address.
  • When the client needs to perform a write operation, it directly sends a server allocation request for the write operation to the cluster management node.
  • After the cluster management node receives the client's server allocation request for the write operation, it returns appropriate partition information to the client according to the system load.
  • The partition information includes a partition ID, so that the client can determine the IP addresses of the servers in the partition corresponding to that partition ID according to the correspondence between partition IDs and the IP addresses of the servers in each partition.
  • Alternatively, after receiving the client's server allocation request for the write operation, the cluster management node may directly return an appropriate server ID to the client according to the system load, so that the client can determine the server that performs the write operation from the server ID.
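The client-side lookup just described amounts to a table keyed by partition ID. In this sketch, the IP addresses are the example values used later in the text, while the table structure and function name are assumptions for illustration.

```python
# Sketch of the partition-ID -> server-IP lookup (Table 1 style):
# the client maps the partition ID returned by the cluster management
# node to the IP addresses of the servers in that partition.

PARTITION_TABLE = {
    "partition 1": ["34.144.246.240", "46.150.246.240", "36.116.246.240"],
}

def servers_for_partition(partition_id):
    return PARTITION_TABLE[partition_id]

targets = servers_for_partition("partition 1")
```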
  • Multiple servers may be set up in one partition.
  • The client may send the write operation to multiple servers so that the data stored on the multiple servers stays consistent. Then, when a server fails or data is damaged, the client can continue to perform read operations and repair data through the other servers in the partition.
  • the client identifier that sends the request to the cluster management node to allocate the server for the write operation is client 1
  • the cluster management node sends If the partition ID contained in the partition information for the client 1 is partition 1, then through the content shown in Table 1, it can be determined that the IP address of the server performing the write operation of the client 1 is 34.144.246.240 (according to the IP address
  • the server can be determined to be server 1), 46.150.246.240 (the server can be determined to be server 2 based on the IP address), 36.116.246.240 (the server can be determined to be server 3 based on the IP address).
  • After client 1 determines that the IP addresses of the servers performing the write operation are 34.144.246.240, 46.150.246.240, and 36.116.246.240, it can send write requests to the servers at those addresses, that is, to server 1, server 2, and server 3.
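The lookup-and-fan-out described above can be sketched roughly as follows. The table contents mirror the Table 1 example, but the function names and record shapes are illustrative assumptions, not part of the patent.

```python
# Hypothetical sketch: resolve a partition ID to its server IPs (as in
# Table 1), then fan the same write request out to every server in the
# partition so all replicas stay consistent.

PARTITION_TABLE = {
    "partition1": ["34.144.246.240",   # server 1
                   "46.150.246.240",   # server 2
                   "36.116.246.240"],  # server 3
}

def servers_for_partition(partition_id):
    """Return the IP addresses of all servers in the given partition."""
    return PARTITION_TABLE[partition_id]

def send_write(client_id, partition_id, data):
    """Build one write request per server in the partition."""
    requests = []
    for ip in servers_for_partition(partition_id):
        requests.append({"to": ip, "client": client_id, "data": data})
    return requests

reqs = send_write("client1", "partition1", b"payload")
assert [r["to"] for r in reqs] == PARTITION_TABLE["partition1"]
```

Sending the identical request to every server in the partition is what lets any surviving replica serve reads and repair data after a failure.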
  • The client sends a write operation request to the server, and when the server writes data according to the received write operation request, it needs to determine the location where the data is to be written.
  • The correspondence between client identifiers and storage resources is stored in the server. Therefore, after the server receives a write operation request that carries the client identifier and the data to be written, the server can determine, according to this correspondence, the storage resource corresponding to the client identifier carried in the request. The server then writes the data to be written into that storage resource.
  • For example, if the correspondence between client identifiers and storage resources is as shown in Table 2 below, and the identifier of the client that sends the write operation request to the server is client 1, then from the content shown in Table 2 it can be determined that the storage resource corresponding to client 1 in the server is storage resource 1.
  • If client 2 also sends a write operation request to the server, it can be determined from the content shown in Table 2 that the storage resources corresponding to client 2 in the server are storage resource 2 and storage resource 3.
  • Because the write operation performed in the embodiment of the present application continues from the previously stored data, the server may determine the starting position of this write operation according to the end position of the client's last write. That is, if the server determines that the end position of client 2's last write is in storage resource 3, the server determines storage resource 3 as the storage resource for data writing of this write operation.
  • Each storage resource is filled with data as completely as possible. When a client corresponds to multiple storage resources in the server, one storage resource is filled before the next storage resource is used.
  • Therefore, the server can also determine which of storage resources 2 and 3 corresponding to client 2 is not yet full of data in order to determine the storage resource for this write operation. Assuming that storage resource 2 corresponding to client 2 is full and storage resource 3 still has space remaining, the server determines storage resource 3 as the storage resource for data writing of this write operation.
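The fill-in-order rule above can be sketched as a simple scan over a client's resources; the 60M fixed size comes from the later example, while the function and variable names are illustrative assumptions.

```python
# Illustrative sketch: a client's storage resources are filled in
# allocation order, so the next write continues in the first resource
# that still has free space.

RESOURCE_SIZE = 60  # fixed size per storage resource (MB), per the example

def pick_append_resource(resources):
    """resources: list of (name, used_mb) in allocation order.

    Returns the first resource that is not yet full, or None if the
    server must allocate a new one.
    """
    for name, used in resources:
        if used < RESOURCE_SIZE:
            return name
    return None

# client 2 owns resource 2 (full) and resource 3 (space remaining)
client2 = [("storage_resource_2", 60), ("storage_resource_3", 25)]
assert pick_append_resource(client2) == "storage_resource_3"
```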
  • The correspondence between client identifiers and storage resources can also be maintained by means of logs; for example, a hierarchical log (Log) is used to organize multiple write requests that write to the same log.
  • The structure of the hierarchical Log may be a Logical-Log + Physical-Log structure.
  • The Logical-Log in this architecture may be a chain structure, used to manage the read-write relationships among the multiple requests on the server.
  • The Physical-Log in this architecture manages the reads and writes of a specific request with a log structure.
  • The Physical-Log space in the embodiment of the present application may be thinly provisioned (Thin allocation).
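The patent gives no field layout for the hierarchical log, but the idea of a Logical-Log chain ordering per-request Physical-Logs might be sketched as follows. Every class and field name here is an assumption for illustration only.

```python
# Assumed sketch of the Logical-Log + Physical-Log structure: the
# Logical-Log is a chain whose entries each reference a Physical-Log
# describing where one request's data was written.

class PhysicalLog:
    """Log-structured record of a single request's write."""
    def __init__(self, client_id, offset, length):
        self.client_id = client_id
        self.offset = offset    # physical start of this request's data
        self.length = length    # bytes written by this request

class LogicalLog:
    """Chain that orders the requests handled by one server."""
    def __init__(self):
        self.entries = []       # chained in arrival order

    def append(self, plog):
        self.entries.append(plog)

log = LogicalLog()
log.append(PhysicalLog("client1", 0, 20))
log.append(PhysicalLog("client2", 20, 40))
assert [e.client_id for e in log.entries] == ["client1", "client2"]
```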
  • Storage method 1: Write data to the determined storage resource first, and when that storage resource is found to be full, create a new storage resource to continue storing the data.
  • In that case, the server allocates another storage resource to the client, and the not-yet-stored portion of the data to be written is stored in the newly allocated storage resource in data order. If, during this process, the newly allocated storage resource also becomes full before the data to be written has been completely stored, the server allocates yet another storage resource for the client, and so on until the data to be written is completely stored. Each time the server creates a storage resource in this process, it records the correspondence between the client identifier and the created storage resource and updates the correspondence between client identifiers and storage resources stored in the server.
  • For example, suppose the storage resource corresponding to client 1 in the server is storage resource 1, each storage resource created by the server has a fixed size of 60M, the current remaining space in storage resource 1 is 20M, and the size of the data to be written is 90M. The server stores the data to be written in storage resource 1 in data order, following the end position of the last write. Because storage resource 1 can hold only 20M of the data to be written, the server allocates another storage resource, such as storage resource 6, to client 1, and the not-yet-stored data is stored there in data order. When storage resource 6 is also full, the server allocates a further storage resource for the client, such as storage resource 7, and stores the remaining 10M of the data to be written in storage resource 7.
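Storage method 1 can be sketched as a spill loop using the numbers from the example (60M resources, 20M free, 90M to write). Resource names and the function signature are illustrative assumptions.

```python
# Sketch of Storage method 1: fill the current resource, then allocate
# fixed-size resources one at a time until all data is placed.

RESOURCE_SIZE = 60  # fixed size of each created resource (MB)

def spill_write(free_in_current, data_size, next_id):
    """Return [(resource_name, mb_written)] in data order.

    free_in_current: MB left in the client's current resource.
    next_id: id to use for the first newly allocated resource.
    """
    placements = []
    if free_in_current:
        written = min(free_in_current, data_size)
        placements.append(("storage_resource_1", written))
        data_size -= written
    while data_size > 0:  # keep allocating until everything is stored
        written = min(RESOURCE_SIZE, data_size)
        placements.append(("storage_resource_%d" % next_id, written))
        data_size -= written
        next_id += 1
    return placements

assert spill_write(20, 90, 6) == [
    ("storage_resource_1", 20),
    ("storage_resource_6", 60),
    ("storage_resource_7", 10),
]
```

Each new placement would also require recording the client-to-resource correspondence, which the sketch omits.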
  • Storage method 2: When writing the client's data, the server first determines whether the remaining space in the storage resource corresponding to the write operation request is sufficient for the data to be written. If it is, the data is written directly; if not, at least one new storage resource is created for data storage.
  • For example, suppose again that the storage resource corresponding to client 1 in the server is storage resource 1, each storage resource created by the server has a fixed size of 60M, the current remaining space in storage resource 1 is 20M, and the size of the data to be written is 90M. The server determines that the remaining space in the storage resource corresponding to this write operation request is insufficient to store the data to be written, and can therefore determine that two new storage resources, such as storage resource 6 and storage resource 7, need to be created. Storage resource 1, storage resource 6, and storage resource 7 may then store data simultaneously: the first 20M of the data to be written is stored in the remaining space of storage resource 1, the 21st to 80th M is stored in storage resource 6, and the remaining 10M is stored in storage resource 7.
  • For each new storage resource it creates, the server records the correspondence between the client identifier and the created storage resource and updates the correspondence between client identifiers and storage resources stored in the server.
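Unlike method 1, Storage method 2 computes the number of extra resources up front, which is what allows the resources to be written concurrently. A minimal sketch of that calculation, with assumed names:

```python
# Sketch of Storage method 2: decide in advance how many new fixed-size
# resources are needed, so existing and new resources can be written
# to simultaneously.
import math

RESOURCE_SIZE = 60  # fixed size of each created resource (MB)

def resources_needed(free_in_current, data_size):
    """Number of new resources to create before writing begins."""
    overflow = max(0, data_size - free_in_current)
    return math.ceil(overflow / RESOURCE_SIZE)

# 90M to write with 20M free -> 70M overflow -> 2 new resources (6 and 7)
assert resources_needed(20, 90) == 2
# fits entirely in the current resource -> nothing to create
assert resources_needed(60, 50) == 0
```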
  • After the server stores the data to be written in the corresponding storage resource, to facilitate later read operation requests from the client, the server also creates the correspondence between the client's identifier and the physical address of the written data and sends this correspondence to the client. Thus, when the client makes a read operation request to the server, it can determine the starting position of the data to be read according to the correspondence between its identifier and the physical address of the written data.
  • the cluster management node determines and returns appropriate partition information to the client 1 according to the system load.
  • the partition information includes a partition ID.
  • Client 1 determines the partition routing information corresponding to the partition ID in the received partition information according to the correspondence between partition IDs and partition routing information.
  • Client 1 determines the three servers corresponding to the partition routing information and determines them as the servers for performing the write operation. Assume the three servers are server 1, server 2, and server 3.
  • the client 1 requests the three servers (server 1, server 2, and server 3) corresponding to the partition to perform write operations concurrently.
  • S405: Any one of server 1, server 2, and server 3 judges whether this is the first time it performs a write operation for client 1. If so, it allocates a first storage resource to client 1 and saves the correspondence between the identifier of client 1 and the identifier of the storage resource; if not, it determines the first storage resource allocated to the client according to the correspondence between storage resource identifiers and client identifiers.
  • The server determines whether the size of the remaining storage space in the first storage resource is less than the size of the data to be written. If so, it allocates at least one second storage resource for client 1 and records the correspondence between the client's identifier and the identifier of the at least one second storage resource. The size of the data to be written is determined according to the information in the write operation request.
  • The server stores the data to be written into the first storage resource and creates a correspondence between the identifier of client 1 and the physical address of the data to be written. If a second storage resource was allocated, the server stores part of the data to be written in the first storage resource and the remaining part in the second storage resource.
  • the server returns the execution result to the client 1 after completing the data storage.
  • the execution result also includes the correspondence between the identification of the client 1 and the physical address of the data to be written.
  • S409: Client 1 receives the execution results returned by the servers. If the execution results returned by the three servers in the partition are received within a predetermined time period and all of them indicate success, client 1 determines that the write operation succeeded.
  • In the above example, client 1 sends write requests to the three servers in the partition. If another client 2 also needs to send write requests to the three servers in the partition, it can proceed synchronously according to the above steps.
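The success check in S409 can be sketched as follows: the write counts as successful only if every server in the partition replies "success" within the deadline. The record shapes and names are illustrative assumptions.

```python
# Sketch of S409: all three replicas must report success within the
# predetermined time period for the write to be considered successful.

def write_succeeded(results, expected_servers, deadline):
    """results: {server: (arrival_time, status)} for replies received."""
    if set(results) != set(expected_servers):
        return False  # some server never replied
    return all(t <= deadline and status == "success"
               for t, status in results.values())

servers = ["server1", "server2", "server3"]
ok = {"server1": (0.4, "success"),
      "server2": (0.7, "success"),
      "server3": (0.9, "success")}
assert write_succeeded(ok, servers, deadline=1.0)

# a reply arriving after the deadline means the write did not succeed
late = dict(ok, server3=(1.5, "success"))
assert not write_succeeded(late, servers, deadline=1.0)
```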
  • the failure recovery processing method may be as shown in Figure 5.
  • Assume the failure scenario is that server 3 fails. Because in this scenario the failure occurs only in the data writing stage, when client 1 sends a write operation request and does not receive a response from server 3, it performs a certain number of retry operations (assume 5 retries with an interval of 1s).
  • If the retries all fail, client 1 can determine that this data write failed (refer to step 1, step 2, and step 3 for details).
  • The cluster management node updates the state of the partition involved in this write operation and the state of the servers corresponding to the partition. That is, the servers under the partition cannot currently continue to undertake write operation services and must wait for server 3 to go through the failure recovery process (refer to step 4 for details).
  • After the failed server 3 recovers from the failure state to the normal working state, it initiates a failure recovery task to server 1 and server 2 in the current partition (refer to step 5 for details).
  • the main server 1 in the partition initiates a request for obtaining metadata to all servers in the OK state in the partition.
  • The servers under the partition (server 1, server 2, and server 3) negotiate the smallest data_length and determine the data corresponding to the smallest data_length.
  • the main server 1 under the partition writes the data corresponding to the minimum data_length to the other servers 2 and 3 under the partition (refer to step 6, step 7 and step 8 for details).
  • After server 1 in the partition receives the write success information returned by all other servers in the partition, it determines that the fault recovery task is completed, and the cluster management node updates the server status information in the partition (refer to step 9 for details).
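The negotiation in steps 5 to 8 can be sketched as taking the minimum data_length across the replicas and rewriting the data up to that length to the others. The dictionary shape and names are illustrative assumptions.

```python
# Sketch of the recovery negotiation: the partition's servers agree on
# the smallest data_length among their replicas, and the main server
# rewrites the data up to that length so all replicas match again.

def negotiate_min_length(replica_lengths):
    """replica_lengths: {server: data_length}; returns the agreed length."""
    return min(replica_lengths.values())

# server 3 fell behind while it was failed
lengths = {"server1": 120, "server2": 120, "server3": 95}
agreed = negotiate_min_length(lengths)
assert agreed == 95
# server 1 would then write data[:agreed] to servers 2 and 3
```

Negotiating down to the minimum guarantees every replica holds identical data before the partition resumes serving writes.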
  • When the client needs to perform a read operation, it can send a request to the cluster management node to allocate a server for the read operation.
  • the server request for allocating the read operation includes the physical address of the data to be read.
  • After receiving the client's request to allocate a server for the read operation, the cluster management node determines the partition for processing the read request according to the physical address included in the request, and then returns appropriate partition information to the client.
  • The partition information includes a partition ID, so that the client can determine, according to the correspondence between partition IDs and the IP addresses of the servers in each partition, the IP addresses of the servers in the partition corresponding to the partition ID allocated by the cluster management node.
  • For example, suppose the correspondence between partition IDs and the IP addresses of the servers in each partition is as shown in Table 1 above, the identifier of the client that sends the server-allocation request for the read operation to the cluster management node is client 1, and the partition ID contained in the partition information the cluster management node returns to client 1 is partition 1. Through the content shown in Table 1, it can be determined that the IP addresses of the servers in partition 1 are 34.144.246.240 (from which the server can be determined to be server 1), 46.150.246.240 (server 2), and 36.116.246.240 (server 3).
  • Client 1 then only needs to randomly select one server from server 1, server 2, and server 3, determine it as the server performing the read operation, and send a read operation request to the selected server. The read request includes information such as the client identifier and the starting position and length of the data to be read.
  • Alternatively, after receiving the client's request to allocate a server for the read operation, the cluster management node may directly return a corresponding server ID in the partition to the client, so that the client can determine the server that performs the read operation according to the server ID.
  • The client sends a read operation request to the server, and when the server reads data according to the received read operation request, it needs to determine the data to be read.
  • the server determines the data to be read from the corresponding storage resources.
  • Reading situation 1: If the server determines, according to the starting position and length in the read operation request, that the data to be read is in a single storage resource, the server locates the data to be read in that storage resource and sends it to the client.
  • For example, according to the starting position 65M of the data to be read, the server can determine that the starting position is in storage resource 6; according to the length 20M to be read, the server can determine that the end position of the data to be read is also in storage resource 6. As shown in FIG. 6a, the server sends the corresponding data to be read in storage resource 6 to client 1.
  • Reading situation 2: If the server determines, according to the starting position and length in the read operation request, that the data to be read spans multiple storage resources, the server locates the data to be read in the corresponding storage resources, integrates the data read from the multiple storage resources in data order, and then sends it to the client.
  • For example, according to the starting position 41M of the data to be read, the server can determine that the starting position is in storage resource 1; according to the length 130M to be read, the server can determine that the end position of the data to be read is in storage resource 7. As shown in FIG. 6b, part of the data to be read is stored in storage resource 1, part in storage resource 6, and part in storage resource 7. Therefore, the server integrates, in order, the corresponding data to be read in storage resource 1, storage resource 6, and storage resource 7, and sends it to client 1.
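The mapping from a (start, length) read to per-resource segments can be sketched as below, using the examples' numbers (fixed 60M resources). The resource layout for the client is assumed: resources 1, 6, and 7 hold the client's data back to back.

```python
# Sketch of both reading situations: split a read at `start` with
# `length` bytes across fixed-size storage resources laid out in order.

RESOURCE_SIZE = 60  # MB per resource, per the examples
LAYOUT = ["storage_resource_1", "storage_resource_6", "storage_resource_7"]

def read_segments(start, length):
    """Return [(resource, offset_in_resource, size)] covering the read."""
    segments = []
    pos, end = start, start + length
    while pos < end:
        idx = pos // RESOURCE_SIZE          # which resource holds `pos`
        offset = pos % RESOURCE_SIZE        # position inside it
        take = min(RESOURCE_SIZE - offset, end - pos)
        segments.append((LAYOUT[idx], offset, take))
        pos += take
    return segments

# Situation 1: start 65M, length 20M lies entirely in resource 6
assert read_segments(65, 20) == [("storage_resource_6", 5, 20)]

# Situation 2: start 41M, length 130M spans resources 1, 6, and 7,
# and the pieces are returned in data order for integration
assert read_segments(41, 130) == [
    ("storage_resource_1", 41, 19),
    ("storage_resource_6", 0, 60),
    ("storage_resource_7", 0, 51),
]
```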
  • The process of performing the read operation provided by the embodiment of the present application can be as shown in FIG. 7.
  • Client 1 determines the partition information of the server from which the data is to be read based on the correspondence between the client's identifier and the physical address of the data written to the server.
  • the partition information includes a partition ID.
  • Client 1 determines the partition routing information corresponding to the partition ID in the partition information according to the correspondence between partition IDs and partition routing information.
  • Client 1 determines the three servers corresponding to the partition routing information and randomly selects one of them as the server performing the read operation.
  • the three servers are server 1, server 2, and server 3.
  • the client 1 sends a read request to a server that is determined to perform a read operation, where the read request includes the client identifier, the starting position and length of the data to be read, and other information.
  • S704 The server receives the read request sent by the client, and determines data corresponding to the read request according to the read request.
  • S705 The server sends the data to the client.
  • S706 The client terminal 1 reads the data sent by the server.
  • In the above example, client 1 randomly selects a server from the three servers in the partition to perform the read operation. If another client 2 also needs to read data, it likewise randomly selects a server from the three servers in the partition.
  • the embodiment of the present application also proposes the following three situations for fault repair processing to ensure the normal progress of the data reading process:
  • Case 1 No fault recovery task is being executed in the partition. In this case, the failed server has not yet started the recovery process.
  • In this case, the read request triggers the master server in the partition to perform a data negotiation process (the same negotiation of the minimum data_length as in the failure handling flow during the write operation; for the specific operations, refer to step 5 to step 7 in Figure 5).
  • Situation 2 A fault recovery task is being executed in the partition.
  • In this case, the server first returns a BUSY indication to the upper layer, and the read request is responded to only after the failure recovery task completes.
  • Case 3 The fault recovery task in the partition has been completed, and the status of the failed server becomes OK. In this case, the read request sent by the client can be responded normally.
  • The above-described implementation devices include hardware structures and/or software modules corresponding to the respective functions.
  • The present invention can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is executed by hardware or by computer-software-driven hardware depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered as going beyond the scope of the present invention.
  • an embodiment of the present invention is a server that performs a write operation.
  • the server includes at least a processor 800 and a memory 801.
  • the memory 801 stores a program 802.
  • the processor 800, the memory 801 and the communication interface are connected through a system bus and complete mutual communication.
  • The processor 800 is a single-core or multi-core central processing unit, or an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the memory 801 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one hard disk memory.
  • the memory 801 is used to store computer execution instructions.
  • the program 802 may be included in the computer execution instruction.
  • the processor 800 runs the program 802 to execute the method flow of S405-S408 shown in FIG. 4.
  • the present invention provides a server for performing a write operation, and the server includes:
  • Receiving module 900: configured to receive a first write request sent by a first client, the first write request including the first client's identifier and first data to be written; and to receive a second write request sent by a second client, the second write request including the second client's identifier and second data to be written;
  • Processing module 901: configured to determine, according to the identifier of the first client and the stored correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource; to determine, according to the identifier of the second client and the stored correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is a second storage resource, where the physical address of the second storage resource is different from the physical address of the first storage resource; and to store the first data to be written in the first storage resource and the second data to be written in the second storage resource;
  • Creation module 902: configured to create a correspondence between the identifier of the first client and the physical address where the first storage resource is located, and a correspondence between the identifier of the second client and the physical address where the second storage resource is located.
  • the functions of the receiving module 900, the processing module 901, and the creating module 902 shown in FIG. 9 may be executed by the processor 800 running the program 802, or executed by the processor 800 alone.
  • The embodiment of the present invention also provides a method for performing a write operation. Because this method corresponds to the server for performing a write operation introduced in the embodiment of the present invention, and the principle by which the method solves the problem is similar to that of the server, the implementation of this method can refer to the implementation of the server, and repetitions are not described again.
  • an embodiment of the present invention also provides a method for performing a write operation, and the method includes:
  • Step 1000 The server receives a first write request sent by a first client, where the first write request includes an identifier of the first client and first data to be written;
  • Step 1001 The server receives a second write request sent by a second client, where the second write request includes an identifier of the second client and second data to be written;
  • Step 1002 The server determines that the storage resource allocated to the first client is the first storage resource according to the identification of the first client and the correspondence between the stored identification of the client and the allocated storage resource;
  • Step 1003: The server determines, according to the identifier of the second client and the stored correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is a second storage resource, where the physical address of the second storage resource is different from the physical address of the first storage resource;
  • Step 1004 The server stores the first data to be written into the first storage resource, and stores the second data to be written into the second storage resource;
  • Step 1005: The server creates a correspondence between the identifier of the first client and the physical address where the first storage resource is located, and a correspondence between the identifier of the second client and the physical address where the second storage resource is located.
  • an embodiment of the present invention is a server that performs a read operation.
  • the server includes at least a processor 1100 and a memory 1101.
  • a program 1102 is stored in the memory 1101.
  • the processor 1100, the memory 1101, and the communication interface are connected through a system bus and complete mutual communication.
  • The processor 1100 is a single-core or multi-core central processing unit, or an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention.
  • the memory 1101 may be a high-speed RAM memory or a non-volatile memory (non-volatile memory), such as at least one hard disk memory.
  • the memory 1101 is used to store computer execution instructions.
  • the computer executable instruction may include a program 1102. When the server is running, the processor 1100 runs the program 1102 to execute the method flow of S704-S706 shown in FIG. 7.
  • a server that performs a read operation includes:
  • Receiving module 1200: configured to receive a first read request sent by a first client, where the first read request includes the identifier of the first client and the starting position and length of the first data to be read; and to receive a second read request sent by a second client, where the second read request includes the identifier of the second client and the starting position and length of the second data to be read;
  • Processing module 1201: configured to determine the first data to be read according to the first read request and send it to the first client, and to determine the second data to be read according to the second read request and send it to the second client.
  • the functions of the receiving module 1200 and the processing module 1201 shown in FIG. 12 may be executed by the processor 1100 running the program 1102, or executed by the processor 1100 alone.
  • The embodiment of the present invention also provides a method for performing a read operation. Because this method corresponds to the server for performing a read operation introduced in the embodiment of the present invention, and the principle by which the method solves the problem is similar to that of the server, the implementation of this method can refer to the implementation of the server, and repetitions are not described again.
  • an embodiment of the present invention also provides a method for performing a read operation, and the method includes:
  • Step 1300 The server receives a first read request sent by a first client, where the first read request includes the identifier of the first client, the starting position and the length of the first data to be read;
  • Step 1301 The server receives a second read request sent by the second client, where the second read request includes the identifier of the second client, the starting position and the length of the second data to be read;
  • Step 1302: The server determines the first data to be read according to the first read request and sends it to the first client, and determines the second data to be read according to the second read request and sends it to the second client.
  • Various aspects of the methods for performing write operations and read operations provided by the embodiments of the present invention may also be implemented in the form of a program product, which includes program code. When the program code runs on a computer device, it causes the computer device to execute the steps in the methods for performing write operations and read operations according to the various exemplary embodiments of the present invention described in this specification.
  • the program product can use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • The readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • The program product for performing write operations and read operations may adopt a portable compact disk read-only memory (CD-ROM), include program code, and run on a server device. However, the program product of the present invention is not limited to this.
  • The readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and readable program code is carried therein. This propagated data signal can take many forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • The readable signal medium may also be any readable medium other than a readable storage medium, and the readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
  • the program code contained on the readable medium can be transmitted by any suitable medium, including, but not limited to, wireless, wired, optical cable, RF, etc., or any suitable combination of the above.
  • the program code used to perform the operations of the present invention can be written in any combination of one or more programming languages.
  • the programming languages include object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can execute entirely on the user's computing device, partly on the user's device, as an independent software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device.
  • for the method for performing a write operation, the embodiment of the present application further provides a storage medium readable by a computing device, that is, one whose content is not lost after a power failure.
  • the storage medium stores a software program including program code; when the program code runs on a computing device and the software program is read and executed by one or more processors, it can implement any of the above solutions for performing a write operation of the embodiments of the present application.
  • for the method for performing a read operation, the embodiment of the present application also provides a storage medium readable by a computing device, that is, one whose content is not lost after a power failure.
  • the storage medium stores a software program including program code; when the program code runs on a computing device and the software program is read and executed by one or more processors, it can implement any of the above solutions for performing a read operation of the embodiments of the present application.
  • this application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium, for use by or in connection with an instruction execution system.
  • a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, transmit, or transfer a program for use by an instruction execution system, apparatus, or device, or in connection with an instruction execution system, apparatus, or device.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application provides a method and apparatus for performing write operations and read operations. The method includes: a server receives a write request sent by a client, where the write request contains an identifier of the client and data to be written; after determining, according to the client's identifier and a saved correspondence between client identifiers and allocated storage resources, the storage resource allocated to the client, the server stores the data to be written into the storage resource, and the server creates a correspondence between the client's identifier and the physical address at which the storage resource is located. In this method, the server determines the location at which the data to be written is stored and stores the data there; no coordinator node is needed to determine the location or to notify the client of it, which reduces the interaction flow of the write procedure and effectively improves write efficiency.

Description

Method and apparatus for performing write operations and read operations. Technical Field
The present application relates to the field of communication technologies, and in particular, to a method and apparatus for performing write operations and read operations.
Background
Distributed storage systems often need to implement multi-node cluster operations, for example allowing multiple clients to read and write the same piece of data at the same time. In the prior art, when a distributed storage system performs multi-node cluster operations, the system architecture is generally as shown in Figure 1: a client forwards IO requests to the cluster servers through a coordinator node, which handles the processing of the IO requests. When the client performs a write operation, the specific steps are as shown in Figure 2: the client sends a write request to its coordinator node; after receiving the write request, the coordinator node requests from the cluster management node a partition for processing the write request. The cluster management node determines, according to the current system load, the partition used to process this write request, where the partition comprises three servers. Then, according to the partition allocated by the cluster management node and the data information contained in the write request, the coordinator node allocates to the client an address for writing the data and notifies the client of the partition and the write address. According to the partition and write address notified by the coordinator node, the client writes the data information contained in the write request to the corresponding locations on the three servers in the partition, and after completing the write, notifies the coordinator node of the end position of this write.
The main problem with the above method for implementing multi-node cluster operations is that messages between clients and the cluster servers must be forwarded through a coordinator node, which involves many interaction steps and easily introduces latency. Because the number of clients a coordinator node can serve at any one moment is limited, it easily becomes a system bottleneck, and if the coordinator node fails, normal operation of system services is affected. Moreover, when many clients perform IO operations at the same time, the coordinator node must allocate write positions one by one, so it may be unable to process a client's IO request in time, delaying the client's IO requests.
Summary
The present application provides a method and apparatus for performing write operations and read operations, so as to avoid the prior-art problems that forwarding messages through a coordinator node involves many interaction steps, easily introduces latency, and easily causes a system bottleneck.
In a first aspect, an embodiment of the present application provides a method for performing a write operation: a server receives a first write request sent by a first client, where the first write request contains an identifier of the first client and first data to be written; the server receives a second write request sent by a second client, where the second write request contains an identifier of the second client and second data to be written; the server determines, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource; the server determines, according to the identifier of the second client and the saved correspondence, that the storage resource allocated to the second client is a second storage resource, where the physical address at which the second storage resource is located differs from the physical address at which the first storage resource is located; the server stores the first data to be written into the first storage resource and stores the second data to be written into the second storage resource; and the server creates a correspondence between the identifier of the first client and the physical address at which the first storage resource is located, and a correspondence between the identifier of the second client and the physical address at which the second storage resource is located.
Based on this solution, when the distributed storage system performs a write operation, after receiving the write request sent by a client, the server determines the storage resource allocated to the client according to the saved correspondence between client identifiers and allocated storage resources and the client identifier contained in the write request, and stores the data to be written into the corresponding storage resource. During the write operation, the server determines the location at which the data is stored and stores the data there; no coordinator node is needed to determine the location or to notify the client of it. This reduces message passing, mitigates the prior-art problems of a write procedure with many interaction steps that easily introduces latency and of the coordinator node becoming a system bottleneck because it can serve only a limited number of clients at a time, and therefore effectively improves write efficiency.
In a possible implementation, after the server creates the correspondence between the identifier of the first client and the physical address at which the first storage resource is located and the correspondence between the identifier of the second client and the physical address at which the second storage resource is located, the server sends the first correspondence to the first client and the second correspondence to the second client.
Based on this solution, the server sends the client the correspondence between the client's identifier and the physical address at which the corresponding storage resource is located, so that the client can determine the starting position of the data to be read when it needs to send a read request to the server.
Correspondingly, the server receives a first read request sent by the first client, where the first read request contains the identifier of the first client and the starting position and length of first data to be read; the server receives a second read request sent by the second client, where the second read request contains the identifier of the second client and the starting position and length of second data to be read; the server determines the first data to be read according to the first read request and sends it to the first client, and determines the second data to be read according to the second read request and sends it to the second client.
Based on this solution, after receiving a read request, the server determines the data to be read according to the client identifier and the starting position and length of the data to be read contained in the request, reads the corresponding data, and sends it to the client.
In a possible implementation, before the server stores the first data to be written into the first storage resource and the second data to be written into the second storage resource: if the server determines that the remaining storage space in the first storage resource is smaller than the size of the first data to be written, it allocates at least one third storage resource to the first client and records the correspondence between the identifier of the first client and the identifier of the at least one third storage resource; if the server determines that the remaining storage space in the second storage resource is smaller than the size of the second data to be written, it allocates at least one fourth storage resource to the second client and records the correspondence between the identifier of the second client and the identifier of the at least one fourth storage resource. Storing the first data into the first storage resource and the second data into the second storage resource then includes: the server stores part of the first data to be written into the first storage resource and the remaining part into the third storage resource; and the server stores part of the second data to be written into the second storage resource and the remaining part into the fourth storage resource.
Based on this solution, before storing data, the server first judges whether the remaining space of the current storage resource is sufficient to store the data; if so, the data is stored directly in that storage resource; if not, a new storage resource is created. The server can thus create storage resources on demand according to the actual situation, effectively reducing memory usage.
In a second aspect, an embodiment of the present application further provides a method for performing a read operation: a server receives a first read request sent by a first client, where the first read request contains an identifier of the first client and the starting position and length of first data to be read; the server receives a second read request sent by a second client, where the second read request contains an identifier of the second client and the starting position and length of second data to be read; the server determines the first data to be read according to the first read request and sends it to the first client, and determines the second data to be read according to the second read request and sends it to the second client.
Based on this solution, when the distributed storage system performs a read operation, after receiving the read request sent by a client, the server reads the corresponding data according to the client identifier and the starting position and length of the data to be read in the read request, and sends the read data to the client.
In a possible implementation, the server determining the first data to be read according to the first read request and the second data to be read according to the second read request includes: the server determines, according to the identifier of the first client, that the storage resource allocated to the first client is a first storage resource; the server determines, according to the identifier of the second client, that the storage resource allocated to the second client is a second storage resource; the server determines the first data to be read from the first storage resource according to the correspondence between the identifier of the first client and the physical address at which the first storage resource is located and the starting position and length of the first data to be read; and the server determines the second data to be read from the second storage resource according to the correspondence between the identifier of the second client and the physical address at which the second storage resource is located and the starting position and length of the second data to be read.
Based on this solution, after receiving a read request, the server determines the data to be read according to the client identifier and the starting position and length of the data to be read, reads the corresponding data, and sends it to the client.
In a possible implementation, the server determining, according to the identifier of the first client, the first storage resource allocated to the first client, and determining, according to the identifier of the second client, the second storage resource allocated to the second client includes: the server determines, according to the identifier of the first client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is the first storage resource; and the server determines, according to the identifier of the second client and the saved correspondence, that the storage resource allocated to the second client is the second storage resource.
Based on this solution, a method is provided by which the server, after receiving a read request, determines the storage resource allocated to the client according to the client's identifier, namely according to the client's identifier and the saved correspondence between client identifiers and allocated storage resources.
In a third aspect, an embodiment of the present application further provides an apparatus for performing a write operation. The apparatus may be a server in a distributed storage system and may be used to perform the operations in the first aspect and any possible implementation of the first aspect. For example, the apparatus may include modules or units for performing the respective operations in the first aspect or any possible implementation thereof, for example a processing unit and a communication unit.
In a fourth aspect, an embodiment of the present application further provides an apparatus for performing a read operation. The apparatus may be a client in a distributed storage system and may be used to perform the operations in the second aspect and any possible implementation of the second aspect. For example, the apparatus may include modules or units for performing the respective operations in the second aspect or any possible implementation thereof, for example a processing unit and a communication unit.
In a fifth aspect, an embodiment of the present application further provides a distributed storage system, including the server of the third aspect and the client of the fourth aspect.
In a sixth aspect, an embodiment of the present application provides a chip system, including a processor and, optionally, a memory; the memory stores a computer program, and the processor calls and runs the computer program from the memory, so that a communication device equipped with the chip system executes any method of the first aspect or any possible implementation of the first aspect, and/or executes any method of the second aspect or any possible implementation of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer program product, including computer program code; when the computer program code is run by a communication unit, processing unit, transceiver, or processor of a communication device, the communication device executes any method of the first aspect or any possible implementation of the first aspect, and/or a communication device equipped with the chip system executes any method of the second aspect or any possible implementation of the second aspect.
In an eighth aspect, an embodiment of the present application provides a computer-readable storage medium storing a program that causes a communication device (for example, a server in a distributed storage system) to execute any method of the first aspect or any possible implementation of the first aspect, and/or causes a communication device equipped with the chip system (for example, a client in a distributed storage system) to execute any method of the second aspect or any possible implementation of the second aspect.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the system architecture of a prior-art distributed storage system performing multi-node operations;
Figure 2 is a schematic diagram of performing a write operation in the prior art;
Figure 3a is a schematic diagram of a system for performing a write operation provided by the present application;
Figures 3b and 3c are schematic diagrams of a first storage manner for performing a write operation provided by the present application;
Figure 4 is a schematic flowchart of performing a write operation provided by the present application;
Figure 5 is a schematic diagram of a failure occurring during data writing provided by the present application;
Figure 6a is a schematic diagram of a first read case of performing a read operation provided by the present application;
Figure 6b is a schematic diagram of a second read case of performing a read operation provided by the present application;
Figure 7 is a schematic flowchart of performing a read operation provided by the present application;
Figure 8 is a schematic diagram of a first apparatus for performing a write operation provided by the present application;
Figure 9 is a schematic diagram of a second apparatus for performing a write operation provided by the present application;
Figure 10 is a schematic diagram of a method for performing a write operation provided by the present application;
Figure 11 is a schematic diagram of a first apparatus for performing a read operation provided by the present application;
Figure 12 is a schematic diagram of a second apparatus for performing a read operation provided by the present application;
Figure 13 is a schematic diagram of a method for performing a read operation provided by the present application.
Detailed Description
At present, the method for performing a write operation in a distributed storage system is generally as follows: at least one client that needs to perform a write operation requests its coordinator node to allocate a server for processing. The coordinator node determines, for each client that needs to write, the server that will process the request and allocates the starting position of this data write. Each client that needs to write then writes data according to the server and starting write position allocated by its coordinator node, and after the write completes, notifies the coordinator node of the end position of this write. The write operation is thus accomplished through the intermediate coordination and message passing of the coordinator node.
However, the main problem with the above method for implementing multi-node cluster operations is that messages between clients and the cluster servers must be forwarded through a coordinator node, which involves many interaction steps and easily introduces latency. Because the number of clients a coordinator node can serve at any one moment is limited, it easily becomes a system bottleneck, and if the coordinator node fails, normal operation of system services is affected. Moreover, when many clients perform IO operations at the same time, the coordinator node must allocate write positions one by one, so it may be unable to process a client's IO request in time, delaying the client's IO requests.
For example, in Figure 1 above, if client 1, client 2, and client 3 need to perform write operations at the same time, each of them requests the coordinator node to allocate a server for processing. Because the coordinator node must allocate write positions for the clients one by one, inevitably some clients are allocated a write position first while others keep waiting for the coordinator node to allocate one, delaying some clients' write operations. This phenomenon becomes more pronounced as the number of clients requesting write operations increases.
To solve this problem, an embodiment of the present application provides a method for performing a write operation, applicable to any distributed storage system that needs to perform write operations. The basic idea of the embodiments of the present application is to remove, when performing a write operation in a distributed storage system, the steps of forwarding messages through a coordinator node and allocating the write position through a coordinator node. That is, the overall system architecture in the embodiments no longer requires a coordinator node, minimizing the interaction steps and the number of message forwards. At the same time, a write operation according to the embodiments can be performed directly on a server allocated by the cluster management node, without waiting for a coordinator node to allocate the starting write position, effectively reducing waiting latency.
First, the scenarios to which the embodiments of the present application can be applied are introduced. The embodiments can be applied in any communication system with a storage function, such as a distributed storage system. As shown in Figure 3a, a distributed storage system to which the embodiments can be applied may specifically include clients (client 1, client 2, ..., client 7), a cluster management node, and servers (server 1, server 2, and server 3). A client (for example, client 1) sends a write request to the cluster management node; the cluster management node allocates a server (for example, server 2) to client 1 according to the current processing resources of servers 1 to 3, such as how busy each server currently is and its storage resources, and sends the identifier of the allocated server 2 to client 1. Client 1 can then send the write request (including the data to be written) directly to server 2, avoiding the prior-art need to additionally send the request to a coordinator node, which easily becomes a bottleneck. Server 2 first judges whether a storage resource has already been allocated to client 1; if not, it first allocates a storage resource to client 1 and then stores the data to be written into the allocated storage resource; if a storage resource has already been allocated in advance, it stores the data to be written in the free area of that storage resource. Optionally, server 2 may also carry the storage address information of the data to be written in the write-request response message returned to client 1. In addition, after successfully writing the data, server 2 may create a correspondence between the identifier of client 1 and the physical address at which the data was written.
Of course, client 1 may subsequently send a read request to the server, where the read request contains the client identifier and information such as the starting position and length of the data to be read. After receiving the read request, server 2 finds the storage address information corresponding to the read request according to the client identifier in the request and the created correspondence between the client identifier and the physical address of the written data, then reads the corresponding data in combination with the starting position and length, and sends the read data to client 1.
When IO is processed through the distributed storage system shown in Figure 3a, a client can send its IO request directly to the cluster management node and thus request the server allocated by the cluster management node to execute the IO processing.
Further, when the distributed storage system processes IO, multiple clients in the distributed storage system may need to execute IO processing at the same time. For example, in Figure 3a there are currently seven clients needing IO processing: client 1, client 2, client 3, and client 5 need to perform read operations, while client 4, client 6, and client 7 need to perform write operations.
In this embodiment of the present application, client 1, client 2, client 3, and client 5 can directly send requests to the cluster management node for allocation of servers to perform read operations. Then, client 1, client 2, client 3, and client 5 determine the servers that will execute the read requests according to the allocation information sent by the cluster management node, and send read requests to the allocated servers. Client 4, client 6, and client 7 can directly send requests to the cluster management node for allocation of servers to perform write operations; then, client 4, client 6, and client 7 determine the servers that will execute the write requests according to the allocation information sent by the cluster management node, and send write requests to those servers.
A client determines the server for IO processing from the allocation information sent by the cluster management node; for example, the allocation information contains the server's IP address, from which the client can determine the server that will perform the IO processing. As can be seen in Figure 3a, the server performing read operations for client 1 and client 2 is server 1; the server performing read operations for client 3 and client 5 is server 2; and the server performing write operations for client 4, client 6, and client 7 is server 3.
Thus, after receiving the read requests sent by client 1 and client 2, server 1 determines the data corresponding to the read requests and returns the data to client 1 and client 2 respectively. Likewise, after receiving the read requests sent by client 3 and client 5, server 2 determines the corresponding data and returns it to client 3 and client 5. After receiving the write requests sent by client 4, client 6, and client 7, server 3 stores the data contained in the write requests at the corresponding locations and returns the execution results to client 4, client 6, and client 7.
Some terms used in the embodiments of the present application are explained below to facilitate understanding.
1) Node: a device in the distributed storage system. Nodes can be divided into storage nodes and access nodes; a storage node is used to store data, and an access node is used to access data in storage nodes. Specifically, a node may be a client, the cluster management node, or a server in the system shown in Figure 3a above.
To ensure that data is stored evenly across the storage nodes, routing is usually performed by means of a distributed hash table (DHT) when selecting a storage node, though the embodiments of the present application are not limited to this; any possible routing scheme used in storage systems may be adopted in the technical solutions of the embodiments. Under the distributed hash table scheme, the hash ring is divided evenly into a number of parts, each called a partition, and each partition corresponds to a segment of storage space of a set size. Understandably, the more partitions there are, the smaller the storage space corresponding to each partition; the fewer partitions there are, the larger the storage space corresponding to each partition.
2) Cluster management node: manages the allocation of partitions in the distributed storage system and provides partition change management and cluster management functions such as scale-out, scale-in, and upgrade.
In the embodiments of the present invention, when a client applies to perform a write or read operation, the cluster management node can allocate to the client the partition used to process the read or write operation; the partition may belong to different servers.
In a distributed storage system, to better improve data reliability, a partition generally corresponds to multiple servers, and during a write operation the client confirms that the write succeeded only after receiving success messages from all servers in the partition. In practical applications, the number of servers a partition can belong to is at most three, so that the data can be stored on the three servers corresponding to the partition to ensure data reliability and availability.
A server in a partition is mainly in one of two states: OK and UNOK. If the server can work normally or is working normally, its state is OK; if the server has failed or is in the process of fault recovery, its state is UNOK. For a server whose state is UNOK, the cluster management node updates its state to OK after data recovery completes.
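The all-servers-must-acknowledge rule and the OK/UNOK state bookkeeping described above can be sketched as follows. This is a minimal illustration; the class and method names are invented, not part of the patent:

```python
OK, UNOK = "OK", "UNOK"

class Partition:
    def __init__(self, servers):
        # every server in the partition starts in the OK state
        self.state = {s: OK for s in servers}

    def mark_failed(self, server):
        self.state[server] = UNOK

    def mark_recovered(self, server):
        # the cluster management node sets the state back to OK only
        # after data recovery has completed
        self.state[server] = OK

    def write_succeeded(self, acks):
        # a write counts as successful only when every server in the
        # partition acknowledged with success
        return all(acks.get(s) == "success" for s in self.state)
```

The design choice here mirrors the text: availability during recovery is traded for consistency, since a single missing acknowledgement fails the whole write.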
3) Coordinator node: in the prior art, forwards the IO request sent by a client to the cluster management node, then receives the server ID allocated by the cluster management node, performs operations such as address allocation for the IO request, and notifies the client of the server ID and the allocated address.
Unless stated otherwise, the ordinal terms "first", "second", "third", "fourth", and the like mentioned in the embodiments of the present application are used to distinguish multiple objects and do not limit the order, timing, priority, or importance of those objects.
In addition, the terms "comprise" and "have" in the embodiments, claims, and drawings of the present application are not exclusive. For example, a process, method, system, product, or device comprising a series of steps or modules is not limited to the listed steps or modules and may include steps or modules not listed.
Following the introduction of the above application scenarios of the embodiments of the present application, the specific process by which a client performs a write operation to a server is described below.
When a client needs to perform a write operation, it sends a request directly to the cluster management node for allocation of a server for the write. After receiving this request, the cluster management node returns suitable partition information to the client according to the system load. The partition information contains a partition ID, so that the client can determine, according to the correspondence between partition IDs and the IP addresses of the servers in each partition, the IP addresses of the servers in the partition identified by the partition ID in the allocated partition information.
Optionally, when only one server is set per partition in the distributed storage system, the cluster management node, after receiving the client's server-allocation request for a write, can directly return a suitable server ID according to the system load, so that the client can determine the server for the write from the server ID.
In general, to ensure data reliability and availability, the distributed storage system may set multiple servers in one partition; when the client performs a write operation, it may send the write to multiple servers so that the data stored on them remains consistent. Then, when a server fails or data is damaged, the client can continue read operations and perform data repair through the other servers in the partition.
For example, suppose the correspondence between partition IDs and the IP addresses of the servers in each partition is as shown in Table 1 below, the client sending the server-allocation request to the cluster management node is client 1, and the partition information sent by the cluster management node to client 1 contains partition ID "partition 1". From Table 1 it can be determined that the IP addresses of the servers executing client 1's write operation are 34.144.246.240 (identifiable from the IP address as server 1), 46.150.246.240 (server 2), and 36.116.246.240 (server 3).
[Table 1. Correspondence between partition IDs and the IP addresses of the servers in a partition; rendered as image PCTCN2020088787-appb-000001 in the original.]
Because the partition ID corresponds to three servers, after client 1 determines that the IP addresses of the servers executing the write operation are 34.144.246.240, 46.150.246.240, and 36.116.246.240, it can send write requests to the servers at those IP addresses, that is, to server 1, server 2, and server 3.
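The lookup from a partition ID to the IP addresses of its servers can be sketched as a small routing table. The variable and function names are hypothetical; the IP addresses mirror the Table 1 example:

```python
# hypothetical routing table mirroring the Table 1 example
PARTITION_ROUTES = {
    "partition-1": ["34.144.246.240", "46.150.246.240", "36.116.246.240"],
}

def servers_for_partition(partition_id: str) -> list:
    """Resolve the partition ID returned by the cluster management node
    into the IP addresses of that partition's servers."""
    return PARTITION_ROUTES[partition_id]
```

With this table in hand, the client addresses the three servers directly, which is exactly what removes the coordinator node from the write path.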
Further, in the embodiments of the present application, the client sends a write request to a server, and when the server writes data according to the received write request, it must determine the corresponding position for the write. The server stores the correspondence between client identifiers and storage resources. After receiving a write request carrying the client identifier and the data to be written, the server can determine, according to that correspondence, the storage resource corresponding to the client identifier carried in the received write request, and then write the data to be written carried in the write request into the corresponding storage resource.
For example, suppose the correspondence between client identifiers and storage resources is as shown in Table 2 below, and the client sending the write request to the server is client 1. From Table 2 it can be determined that the storage resource corresponding to client 1 on the server is storage resource 1.
If client 2 also sends a write request to the server, from Table 2 it can be determined that the storage resources corresponding to client 2 on the server are storage resource 2 and storage resource 3. Since the write operation performed in the embodiments appends to previously stored data, the server can determine the starting position of this write from the end position of the client's previous write. That is, if the server determines that the end position of client 2's previous write lies in storage resource 3, the server selects storage resource 3 as the storage resource for this write.
Further, to avoid wasting storage resources when writing data, the embodiments ensure that each storage resource being written is filled as fully as possible. That is, if a client corresponds to multiple storage resources on the server, one storage resource is filled before another is occupied.
Therefore, when client 2 sends a write request, the server can also determine the storage resource for this write by checking which of storage resource 2 and storage resource 3 corresponding to client 2 is not yet full. Assuming the space of storage resource 2 corresponding to client 2 is full and storage resource 3 still has free space, the server selects storage resource 3 as the storage resource for this write.
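Because earlier resources are always filled before later ones, selecting the storage resource for the next append reduces to finding the first resource that is not yet full. A minimal sketch, assuming a simple list representation and the 60 MB fixed capacity of the running example (both illustrative, not from the patent):

```python
def pick_write_resource(resources, capacity=60):
    """Return the name of the first storage resource of a client that
    still has free space; `resources` is an ordered list of
    (name, used_mb) pairs in the order data was stored."""
    for name, used in resources:
        if used < capacity:
            return name
    return None  # all full: the server must allocate a new resource
```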
[Table 2. Correspondence between client identifiers and storage resources; rendered as images PCTCN2020088787-appb-000002 and PCTCN2020088787-appb-000003 in the original.]
Further, in the embodiments of the present application, the correspondence between client identifiers and storage resources can also be determined by means of logs; for example, a hierarchical log organizes multiple write requests into the same log. The structure of the hierarchical log may adopt a Logical-Log plus Physical-Log architecture, where the Logical-Log can be a chained structure used to manage the read/write relationships among multiple requests on a server, and the Physical-Log manages the reads and writes of a specific request in log-structured form. The Physical-Log space in the embodiments can use thin allocation. With this hierarchical log organization, transparent, high-performance, mutually isolated IO service can be better provided to multiple upper-layer requests, and in failure scenarios, fault detection and parallel recovery can be performed separately according to the Logical-Log structure, greatly improving the reliability of the distributed storage system.
When the server writes the data to be written carried in the write request into the corresponding storage resource, several storage methods are possible, introduced below.
Storage method 1: write data into the determined storage resource first, and when that resource is found to be full and cannot store more, create a new storage resource to continue storing the data.
Specifically, if storage resource 1 has been filled but the data to be written has not been completely stored, the server allocates another storage resource to the client, then stores the not-yet-stored data, in order, into the newly allocated resource. If during storage the newly allocated resource also fills up before the data is completely stored, the server allocates yet another storage resource for the data, until the data to be written has been stored successfully. In this process, each time the server creates a storage resource, it must record the correspondence between the client's identifier and the created storage resource and update the correspondence between client identifiers and storage resources saved on the server.
As shown in Figure 3b, suppose client 1's storage resource on the server is storage resource 1, each storage resource the server creates has a fixed size of 60 MB, storage resource 1 currently has 20 MB remaining, and the data to be written is 90 MB. The server therefore stores the data to be written contained in the write request, in order, into storage resource 1 starting right after the end position of the previous write. Because storage resource 1 can hold only 20 MB of the data and the data is not yet completely stored, the server allocates another storage resource to the client, for example storage resource 6, and stores the not-yet-stored data into it in order. Because storage resource 6 can hold only 60 MB of the data and the data is still not completely stored, the server allocates another storage resource, for example storage resource 7, and stores the remaining 10 MB of the data in storage resource 7.
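Storage method 1 can be sketched as the following append-and-overflow loop. This is a simplified model in which each resource is just a used-byte count; the function name and representation are assumptions for illustration:

```python
def append_overflow(resources, data_size, capacity=60):
    """Storage method 1: keep appending to the last resource; allocate a
    new fixed-size resource each time the current one fills up.
    Returns the placements as (resource_index, amount) in write order."""
    placements = []
    if not resources:
        resources.append(0)  # first write: allocate the initial resource
    idx = len(resources) - 1
    remaining = data_size
    while remaining > 0:
        free = capacity - resources[idx]
        if free == 0:
            resources.append(0)       # allocate a fresh resource
            idx = len(resources) - 1
            free = capacity
        chunk = min(free, remaining)  # fill the current resource first
        resources[idx] += chunk
        placements.append((idx, chunk))
        remaining -= chunk
    return placements
```

Running the Figure 3b example (20 MB free, 90 MB to write) yields 20 MB in the current resource, 60 MB in the next, and 10 MB in a third, matching the text.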
Storage method 2: when data is to be written to the determined server, first determine whether the remaining space in the storage resource corresponding to this write request is sufficient for the data to be written. If so, write the data directly; if not, create at least one additional storage resource for storing the data.
For example, as shown in Figure 3c, suppose client 1's storage resource on the server is storage resource 1, each storage resource the server creates has a fixed size of 60 MB, storage resource 1 currently has 20 MB remaining, and the data to be written is 90 MB; the server then determines that the remaining space in the storage resource corresponding to this write request is insufficient to store the data to be written.
Therefore, according to the size of the data to be written and the remaining space in storage resource 1, the server can determine that two new storage resources need to be created, for example storage resource 6 and storage resource 7. To speed up the write operation, storage resource 1, storage resource 6, and storage resource 7 can store data concurrently in this embodiment: the first 20 MB of the data to be written is stored in the remaining space of storage resource 1, MB 21 through MB 80 of the data is stored in storage resource 6, and the remaining 10 MB is stored in storage resource 7.
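Storage method 2 plans the whole write up front so the resources can be filled concurrently. A minimal sketch, with illustrative names and the 60 MB capacity of the running example:

```python
import math

def plan_write(free_in_current, data_size, capacity=60):
    """Storage method 2: decide up front how many new resources are
    needed, then split the data into per-resource chunks so that all
    resources can store their chunk concurrently."""
    overflow = max(0, data_size - free_in_current)
    new_resources = math.ceil(overflow / capacity) if overflow else 0
    chunks = []
    if free_in_current:
        # the current resource takes as much as it can hold
        chunks.append(min(free_in_current, data_size))
    for _ in range(new_resources):
        chunks.append(min(capacity, overflow))
        overflow -= chunks[-1]
    return new_resources, chunks
```

For the Figure 3c example (20 MB free, 90 MB of data) this plans 2 new resources and chunks of 20, 60, and 10 MB, matching the text.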
In this process, for each new storage resource it creates, the server must record the correspondence between the client's identifier and the created storage resource and update the correspondence between client identifiers and storage resources saved on the server. Further, after the server stores the data to be written into the corresponding storage resources, to facilitate the client's read requests, the server also creates the correspondence between the client's identifier and the physical address of the written data and sends that correspondence to the client. The client can thus determine the starting position of the data to be read, according to the correspondence between the client's identifier and the physical address of the written data, when sending a read request to the server.
The flow of performing a write operation provided by the embodiments of the present application can be as shown in Figure 4; here each partition is assumed to correspond to three servers. The specific steps are as follows:
S400: Client 1 applies to the cluster management node for partition information.
S401: The cluster management node determines suitable partition information according to the system load and returns it to client 1, where the partition information contains a partition ID.
S402: Client 1 determines the partition routing information corresponding to the partition ID in the received partition information, according to the correspondence between partition IDs and partition routing information.
S403: According to the partition routing information, client 1 determines the three servers corresponding to the partition routing information and takes them as the servers for the write operation. Assume the three servers are server 1, server 2, and server 3.
S404: According to the partition routing information, client 1 requests the three servers of the partition (server 1, server 2, and server 3) to execute the write operation concurrently.
S405: Any one of server 1, server 2, and server 3 judges whether this is the first time it executes a write operation sent by client 1; if so, it allocates a first storage resource to client 1 and saves the correspondence between client 1's identifier and the identifier of the storage resource; if not, it determines the first storage resource already allocated to the client according to the correspondence between storage-resource identifiers and client identifiers.
S406: The server judges whether the remaining storage space in the first storage resource is smaller than the size of the data to be written; if so, it allocates at least one second storage resource to client 1 and records the correspondence between the client's identifier and the identifier of the at least one second storage resource. The size of the data to be written is determined from the information in the write request.
S407: The server stores the data to be written into the first storage resource and creates the correspondence between client 1's identifier and the physical address of the written data.
If the remaining storage space in the first storage resource is smaller than the size of the data to be written and the server has allocated at least one second storage resource to client 1, the server stores part of the data to be written into the first storage resource and the remaining part into the second storage resource.
S408: After completing the storage, the server returns the execution result to client 1, where the execution result also contains the correspondence between client 1's identifier and the physical address of the written data.
S409: Client 1 receives the execution results returned by the servers; if within a predetermined time it receives the execution results returned by the three servers in the partition and all the results are success, it determines that this write operation succeeded.
Further, while client 1 sends write requests to the three servers in the partition, if another client 2 also needs to send write requests to the three servers in the partition, it can operate concurrently following the above steps.
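The client-side success check of S409, where all three servers must return success within the predetermined time, can be sketched as follows (function and parameter names are illustrative assumptions):

```python
def commit_write(results, expected_servers=3):
    """S409 check: the client declares the write successful only if,
    within the predetermined time, a result arrived from every server
    in the partition and every result is success.
    `results` maps server name -> result received before the deadline."""
    return (len(results) == expected_servers
            and all(r == "success" for r in results.values()))
```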
Optionally, if a failure occurs during data writing, the fault-recovery handling can be as shown in Figure 5.
Assume the failure scenario is that server 3 fails. Because in the failure scenario the failure of the write can only occur during the data-writing phase, when client 1 issues a write request and receives no response from server 3, it performs a certain number of retries (assume 5 retries at 1-second intervals).
When client 1 still fails to receive a response from server 3 after the predetermined number of retries, client 1 can determine that this data write failed (see step 1, step 2, and step 3 for details).
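The retry behavior described above can be sketched as a small loop. The function name and the injectable `sleep` parameter (used so the logic can be exercised without real delays) are assumptions; the 5-retry/1-second budget follows the text's assumption:

```python
import time

def write_with_retry(send, retries=5, interval=1.0, sleep=time.sleep):
    """Resend the write until a response arrives or the retry budget is
    exhausted. `send` returns a response, or None when the server does
    not answer within this attempt."""
    resp = send()
    for _ in range(retries):
        if resp is not None:
            return resp
        sleep(interval)   # wait the retry interval before trying again
        resp = send()
    return resp  # still None => the client treats this write as failed
```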
At this point, the cluster management node updates the state of the partition involved in this write operation and the states of the servers corresponding to the partition. That is, the servers under this partition currently cannot continue to accept write-operation traffic and must wait for server 3 to go through the fault-recovery procedure (see step 4 for details).
After the failed server 3 recovers from the fault state to the normal working state, it initiates a fault-recovery task toward server 1 and server 2 in the current partition (see step 5 for details). At this point, the primary server 1 of the partition sends requests to obtain metadata to all servers in the partition whose state is OK. The servers of the partition (server 1, server 2, and server 3) negotiate the minimum data_length and determine the data corresponding to the minimum data_length.
According to the negotiated minimum data_length, the primary server 1 of the partition writes the data corresponding to the minimum data_length into the other servers 2 and 3 of the partition (see step 6, step 7, and step 8 for details).
When server 1 in the partition receives the write-success messages returned by all the other servers (server 2 and server 3) in the partition, the fault-recovery task is determined to be complete, and the cluster management node updates the server state information within the partition (see step 9 for details).
The specific implementation process by which the client performs a read operation to the server is described next.
When a client needs to perform a read operation, it can send a request to the cluster management node for allocation of a server for the read, where the request contains the physical address of the data to be read. After receiving the request, the cluster management node determines the partition that will process the read request according to the physical address contained in the request, and returns suitable partition information to the client.
The partition information contains a partition ID, so that the client can determine, according to the correspondence between partition IDs and the IP addresses of the servers in each partition, the IP addresses of the servers in the partition identified by the partition ID in the allocated partition information.
For example, suppose the correspondence between partition IDs and the IP addresses of the servers in each partition is as shown in Table 1 above, the client sending the server-allocation request to the cluster management node is client 1, and the partition information sent by the cluster management node to client 1 contains partition ID "partition 1". From Table 1 it can be determined that the IP addresses of the servers corresponding to partition 1 are 34.144.246.240 (identifiable from the IP address as server 1), 46.150.246.240 (server 2), and 36.116.246.240 (server 3). Because the data stored on the servers under the same routing information is consistent, client 1 only needs to select one of server 1, server 2, and server 3 at random as the server to perform the read operation, and sends the read request to the selected server. The read request contains the client identifier and information such as the starting position and length of the data to be read.
Optionally, when only one server is set per partition in the distributed storage system, the cluster management node, after receiving the client's server-allocation request for a read, can directly return the server ID of the corresponding partition to the client, from which the client can determine the server for the read operation.
Further, in the embodiments of the present application, the client sends a read request to a server, and when the server reads data according to the received read request, it must determine the data to be read.
Specifically, after receiving from the client the client identifier and the starting position and length of the data to be read, there are several cases in which the server determines the data to be read from the corresponding storage resources, introduced separately below.
Read case 1: if the server determines from the starting position and length in the read request that the data to be read lies within a single storage resource, the server determines the data to be read within that storage resource and sends the determined data to the client.
For example, as shown in Figure 6a, suppose the read request sent by client 1 contains a starting position of 65 MB and a length of 20 MB, each storage resource on the server is 60 MB, and the order in which client 1's storage resources store data is storage resource 1, storage resource 6, storage resource 7.
Therefore, from the starting position of 65 MB the server can determine that the start of the data to be read lies in storage resource 6. From the length of 20 MB the server can further determine that the end of the data to be read also lies in storage resource 6; as shown in Figure 6a, the server sends the corresponding data to be read in storage resource 6 to client 1.
Read case 2: if the server determines from the starting position and length in the read request that the data to be read spans multiple storage resources, the server determines the data to be read from each of the corresponding storage resources separately, assembles the data read from the multiple storage resources in order, and sends it to the client.
For example, as shown in Figure 6b, suppose the read request sent by client 1 contains a starting position of 41 MB and a length of 130 MB, each storage resource on the server is 60 MB, and the order in which client 1's storage resources store data is storage resource 1, storage resource 6, storage resource 7.
Therefore, from the starting position of 41 MB the server can determine that the start of the data to be read lies in storage resource 1. From the length of 130 MB the server can further determine that the end of the data to be read lies in storage resource 7; as shown in Figure 6b, part of the data to be read is stored in storage resource 1, part in storage resource 6, and part in storage resource 7. The server therefore assembles, in order, the corresponding data to be read from storage resource 1, storage resource 6, and storage resource 7, and sends it to client 1.
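Mapping a logical byte range onto the ordered, fixed-size storage resources it touches can be sketched as follows. This is an illustrative model using zero-based offsets (the figures count megabytes from 1), with invented names:

```python
def resources_for_read(start, length, capacity=60):
    """Split a logical range [start, start+length) over consecutive
    fixed-size resources; returns (resource_index, offset, size) spans
    in read order, where index 0 is the client's first resource."""
    spans = []
    end = start + length
    pos = start
    while pos < end:
        idx = pos // capacity            # which resource this byte falls in
        offset = pos % capacity          # offset within that resource
        size = min(capacity - offset, end - pos)
        spans.append((idx, offset, size))
        pos += size
    return spans
```

A single-span result corresponds to read case 1; multiple spans correspond to read case 2, where the server concatenates the pieces in order before replying.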
The flow of performing a read operation provided by the embodiments of the present application can be as shown in Figure 7; here each partition is assumed to correspond to three servers. The specific steps are as follows.
S700: Client 1 determines the partition information of the server where the data to be read resides, according to the correspondence between the client's identifier and the physical address of the data written to the server.
The partition information contains a partition ID.
S701: Client 1 determines the partition routing information corresponding to the partition ID in the partition information, according to the correspondence between partition IDs and partition routing information.
S702: According to the partition routing information, client 1 determines the three servers corresponding to the partition routing information and selects one of the three servers at random as the server for the read operation.
Assume the three servers are server 1, server 2, and server 3.
S703: Client 1 sends a read request to the server determined for the read operation, where the read request contains the client identifier and information such as the starting position and length of the data to be read.
S704: The server receives the read request sent by the client and determines the data corresponding to the read request according to the read request.
S705: The server sends the data to the client.
S706: Client 1 reads the data sent by the server.
Further, while client 1 randomly selects one of the three servers in the partition for a read operation, if another client 2 also needs to randomly select one of the three servers in the partition for a read operation, it can operate concurrently following the above steps.
Further, if a failure occurs during data reading, the embodiments of the present application also handle fault repair in the following three cases, to ensure the data-reading process proceeds normally:
Case 1: no fault-recovery task is executing in the partition. In this case, the failed server has not yet started the recovery procedure.
At this point, the read request triggers the primary server of the partition to perform a round of data negotiation (the same negotiation of the minimum data_length as in the failure-handling flow of the write operation above; for the specific operations see steps 5 to 7 in Figure 5). After the negotiation ends, the storage resource corresponding to the read request becomes read-only and can then satisfy foreground read requests.
Case 2: a fault-recovery task is executing in the partition. In this case, upon receiving a read request sent by a client, the server first returns a BUSY indication to the upper layer; the read request is answered only after the fault-recovery task completes.
Case 3: the fault-recovery task in the partition has completed and the failed server's state has become OK. In this case, read requests sent by clients can be answered normally.
From the above description of the solution of the present application, it can be understood that, to realize the above functions, each of the above devices includes corresponding hardware structures and/or software modules for performing the respective functions. Those skilled in the art should readily appreciate that, in combination with the units and algorithm steps of the examples described in the embodiments disclosed herein, the present invention can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the particular application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementation should not be considered beyond the scope of the present invention.
As shown in Figure 8, an embodiment of the present invention provides a server for performing a write operation. The server includes at least a processor 800 and a memory 801. The memory 801 stores a program 802. The processor 800, the memory 801, and a communication interface are connected through a system bus and communicate with one another.
The processor 800 is a single-core or multi-core central processing unit, or an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention. The memory 801 may be a high-speed RAM memory or a non-volatile memory, for example at least one hard-disk memory. The memory 801 is used to store computer-executable instructions; specifically, the computer-executable instructions may include the program 802. When the server runs, the processor 800 runs the program 802 to execute the method flow of S405-S408 shown in Figure 4.
As shown in Figure 9, the present invention provides a server for performing a write operation, the server including:
a receiving module 900, configured to receive a first write request sent by a first client, where the first write request contains an identifier of the first client and first data to be written; and to receive a second write request sent by a second client, where the second write request contains an identifier of the second client and second data to be written;
a processing module 901, configured to determine, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource; to determine, according to the identifier of the second client and the saved correspondence, that the storage resource allocated to the second client is a second storage resource, where the physical address at which the second storage resource is located differs from the physical address at which the first storage resource is located; and to store the first data to be written into the first storage resource and the second data to be written into the second storage resource;
a creating module 902, configured to create a correspondence between the identifier of the first client and the physical address at which the first storage resource is located, and a correspondence between the identifier of the second client and the physical address at which the second storage resource is located.
The functions of the receiving module 900, the processing module 901, and the creating module 902 shown in Figure 9 above may be performed by the processor 800 running the program 802, or by the processor 800 alone.
Based on the same inventive concept, an embodiment of the present invention further provides a method for performing a write operation. Since this method corresponds to the server for performing write operations introduced in the embodiments of the present invention, and the principle by which the method solves the problem is similar to that of the server, the implementation of the method can refer to the implementation of the server in the embodiments of the present invention, and repeated parts are not described again.
As shown in Figure 10, an embodiment of the present invention further provides a method for performing a write operation, the method including:
Step 1000: a server receives a first write request sent by a first client, where the first write request contains an identifier of the first client and first data to be written;
Step 1001: the server receives a second write request sent by a second client, where the second write request contains an identifier of the second client and second data to be written;
Step 1002: the server determines, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource;
Step 1003: the server determines, according to the identifier of the second client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is a second storage resource, where the physical address at which the second storage resource is located differs from the physical address at which the first storage resource is located;
Step 1004: the server stores the first data to be written into the first storage resource and stores the second data to be written into the second storage resource;
Step 1005: the server creates a correspondence between the identifier of the first client and the physical address at which the first storage resource is located, and a correspondence between the identifier of the second client and the physical address at which the second storage resource is located.
As shown in Figure 11, an embodiment of the present invention provides a server for performing a read operation. The server includes at least a processor 1100 and a memory 1101. The memory 1101 stores a program 1102. The processor 1100, the memory 1101, and a communication interface are connected through a system bus and communicate with one another.
The processor 1100 is a single-core or multi-core central processing unit, or an application-specific integrated circuit, or one or more integrated circuits configured to implement the embodiments of the present invention. The memory 1101 may be a high-speed RAM memory or a non-volatile memory, for example at least one hard-disk memory. The memory 1101 is used to store computer-executable instructions; specifically, the computer-executable instructions may include the program 1102. When the server runs, the processor 1100 runs the program 1102 to execute the method flow of S704-S706 shown in Figure 7.
As shown in Figure 12, a server for performing a read operation includes:
a receiving module 1200, configured to receive a first read request sent by a first client, where the first read request contains an identifier of the first client and the starting position and length of first data to be read; and to receive a second read request sent by a second client, where the second read request contains an identifier of the second client and the starting position and length of second data to be read;
a processing module 1201, configured to determine the first data to be read according to the first read request and send the first data to be read to the first client, and to determine the second data to be read according to the second read request and send the second data to be read to the second client.
The functions of the receiving module 1200 and the processing module 1201 shown in Figure 12 above may be performed by the processor 1100 running the program 1102, or by the processor 1100 alone.
Based on the same inventive concept, an embodiment of the present invention further provides a method for performing a read operation. Since this method corresponds to the server for performing read operations introduced in the embodiments of the present invention, and the principle by which the method solves the problem is similar to that of the server, the implementation of the method can refer to the implementation of the server in the embodiments of the present invention, and repeated parts are not described again.
As shown in Figure 13, an embodiment of the present invention further provides a method for performing a read operation, the method including:
Step 1300: a server receives a first read request sent by a first client, where the first read request contains an identifier of the first client and the starting position and length of first data to be read;
Step 1301: the server receives a second read request sent by a second client, where the second read request contains an identifier of the second client and the starting position and length of second data to be read;
Step 1302: the server determines the first data to be read according to the first read request and sends the first data to be read to the first client, and determines the second data to be read according to the second read request and sends the second data to be read to the second client.
In some possible implementations, the various aspects of the methods for performing write operations and read operations provided by the embodiments of the present invention may also be implemented in the form of a program product, which includes program code; when the program code runs on a computer device, the program code causes the computer device to execute the steps of the methods for performing write operations and read operations according to the various exemplary embodiments of the present invention described in this specification.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example but not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples (a non-exhaustive list) of readable storage media include: an electrical connection with one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
A program product for performing write and read operations according to an implementation of the present invention may employ a portable compact disc read-only memory (CD-ROM), include program code, and run on a server device. However, the program product of the present invention is not limited to this; in this document, a readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
A readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A readable signal medium may also be any readable medium other than a readable storage medium; the readable medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
Program code contained on a readable medium may be transmitted by any suitable medium, including but not limited to wireless, wired, optical cable, RF, and the like, or any suitable combination of the above.
Program code for carrying out the operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on a remote computing device or server. In scenarios involving a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device.
For the method for performing a write operation, the embodiments of the present application further provide a storage medium readable by a computing device, that is, one whose content is not lost after a power failure. The storage medium stores a software program including program code; when the program code runs on a computing device and the software program is read and executed by one or more processors, it can implement any of the above solutions for performing a write operation of the embodiments of the present application.
For the method for performing a read operation, the embodiments of the present application further provide a storage medium readable by a computing device, that is, one whose content is not lost after a power failure. The storage medium stores a software program including program code; when the program code runs on a computing device and the software program is read and executed by one or more processors, it can implement any of the above solutions for performing a read operation of the embodiments of the present application.
The present application is described above with reference to block diagrams and/or flowcharts illustrating methods, apparatuses (systems), and/or computer program products according to embodiments of the present application. It should be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer or special-purpose computer and/or other programmable data processing apparatus to produce a machine, such that the instructions executed via the computer processor and/or other programmable data processing apparatus create means for implementing the functions/acts specified in the block diagram and/or flowchart blocks.
Correspondingly, the present application may also be implemented in hardware and/or software (including firmware, resident software, microcode, and the like). Further, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium, for use by or in connection with an instruction execution system. In the context of the present application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, transmit, or transfer a program for use by an instruction execution system, apparatus, or device, or in connection with an instruction execution system, apparatus, or device.
Although the present application has been described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations can be made without departing from the spirit and scope of the application. Accordingly, the specification and drawings are merely exemplary illustrations of the application as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations, or equivalents within the scope of the application. Obviously, those skilled in the art can make various changes and variations to the application without departing from its scope. Thus, if these modifications and variations of the present application fall within the scope of the claims of the application and their equivalent technologies, the present application is also intended to include these changes and variations.

Claims (17)

  1. A method for performing a write operation, comprising:
    a server receiving a first write request sent by a first client, the first write request containing an identifier of the first client and first data to be written;
    the server receiving a second write request sent by a second client, the second write request containing an identifier of the second client and second data to be written;
    the server determining, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource;
    the server determining, according to the identifier of the second client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is a second storage resource, wherein the physical address at which the second storage resource is located differs from the physical address at which the first storage resource is located;
    the server storing the first data to be written into the first storage resource and storing the second data to be written into the second storage resource; and
    the server creating a correspondence between the identifier of the first client and the physical address at which the first storage resource is located, and a correspondence between the identifier of the second client and the physical address at which the second storage resource is located.
  2. The method according to claim 1, wherein after the server creates the correspondence between the identifier of the first client and the physical address at which the first storage resource is located and the correspondence between the identifier of the second client and the physical address at which the second storage resource is located, the method further comprises:
    the server sending the correspondence between the identifier of the first client and the physical address at which the first storage resource is located to the first client, and sending the correspondence between the identifier of the second client and the physical address at which the second storage resource is located to the second client.
  3. The method according to claim 2, further comprising:
    the server receiving a first read request sent by the first client, the first read request containing the identifier of the first client and a starting position and length of first data to be read;
    the server receiving a second read request sent by the second client, the second read request containing the identifier of the second client and a starting position and length of second data to be read; and
    the server determining the first data to be read according to the first read request and sending the first data to be read to the first client, and determining the second data to be read according to the second read request and sending the second data to be read to the second client.
  4. The method according to claim 1, wherein before the server stores the first data to be written into the first storage resource and the second data to be written into the second storage resource, the method further comprises:
    if the server determines that the remaining storage space in the first storage resource is smaller than the size of the first data to be written, allocating at least one third storage resource to the first client and recording a correspondence between the identifier of the first client and the identifier of the at least one third storage resource; and
    if the server determines that the remaining storage space in the second storage resource is smaller than the size of the second data to be written, allocating at least one fourth storage resource to the second client and recording a correspondence between the identifier of the second client and the identifier of the at least one fourth storage resource;
    wherein storing the first data to be written into the first storage resource and the second data to be written into the second storage resource comprises:
    the server storing part of the first data to be written into the first storage resource and storing the remaining part of the first data to be written into the third storage resource; and
    the server storing part of the second data to be written into the second storage resource and storing the remaining part of the second data to be written into the fourth storage resource.
  5. A method for performing a read operation, comprising:
    a server receiving a first read request sent by a first client, the first read request containing an identifier of the first client and a starting position and length of first data to be read;
    the server receiving a second read request sent by a second client, the second read request containing an identifier of the second client and a starting position and length of second data to be read; and
    the server determining the first data to be read according to the first read request and sending the first data to be read to the first client, and determining the second data to be read according to the second read request and sending the second data to be read to the second client.
  6. The method according to claim 5, wherein the server determining the first data to be read according to the first read request and the second data to be read according to the second read request comprises:
    the server determining, according to the identifier of the first client, that the storage resource allocated to the first client is a first storage resource; and the server determining, according to the identifier of the second client, that the storage resource allocated to the second client is a second storage resource;
    the server determining the first data to be read from the first storage resource according to a correspondence between the identifier of the first client and the physical address at which the first storage resource is located and the starting position and length of the first data to be read; and
    the server determining the second data to be read from the second storage resource according to a correspondence between the identifier of the second client and the physical address at which the second storage resource is located and the starting position and length of the second data to be read.
  7. The method according to claim 6, wherein the server determining, according to the identifier of the first client, the first storage resource allocated to the first client and determining, according to the identifier of the second client, the second storage resource allocated to the second client comprises:
    the server determining, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is the first storage resource; and
    the server determining, according to the identifier of the second client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is the second storage resource.
  8. A server, comprising a processing unit and a communication unit;
    the communication unit being configured to receive a first write request sent by a first client, the first write request containing an identifier of the first client and first data to be written, and to receive a second write request sent by a second client, the second write request containing an identifier of the second client and second data to be written;
    the processing unit being configured to: determine, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is a first storage resource; determine, according to the identifier of the second client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is a second storage resource, wherein the physical address at which the second storage resource is located differs from the physical address at which the first storage resource is located; store the first data to be written into the first storage resource and store the second data to be written into the second storage resource; and create a correspondence between the identifier of the first client and the physical address at which the first storage resource is located, and a correspondence between the identifier of the second client and the physical address at which the second storage resource is located.
  9. The server according to claim 8, wherein the communication unit is further configured to:
    send the correspondence between the identifier of the first client and the physical address at which the first storage resource is located to the first client, and send the correspondence between the identifier of the second client and the physical address at which the second storage resource is located to the second client.
  10. The server according to claim 9, wherein the communication unit is further configured to:
    receive a first read request sent by the first client, the first read request containing the identifier of the first client and a starting position and length of first data to be read, and receive a second read request sent by the second client, the second read request containing the identifier of the second client and a starting position and length of second data to be read;
    the processing unit being further configured to determine the first data to be read according to the first read request and send the first data to be read to the first client, and to determine the second data to be read according to the second read request and send the second data to be read to the second client.
  11. The server according to claim 8, wherein the processing unit is further configured to:
    if it is determined that the remaining storage space in the first storage resource is smaller than the size of the first data to be written, allocate at least one third storage resource to the first client and record a correspondence between the identifier of the first client and the identifier of the at least one third storage resource; and
    if it is determined that the remaining storage space in the second storage resource is smaller than the size of the second data to be written, allocate at least one fourth storage resource to the second client and record a correspondence between the identifier of the second client and the identifier of the at least one fourth storage resource;
    wherein, when storing the first data to be written into the first storage resource and the second data to be written into the second storage resource, the processing unit is specifically configured to:
    store part of the first data to be written into the first storage resource and store the remaining part of the first data to be written into the third storage resource; and
    store part of the second data to be written into the second storage resource and store the remaining part of the second data to be written into the fourth storage resource.
  12. A server, comprising a processing unit and a communication unit;
    the communication unit being configured to receive a first read request sent by a first client, the first read request containing an identifier of the first client and a starting position and length of first data to be read, and to receive a second read request sent by a second client, the second read request containing an identifier of the second client and a starting position and length of second data to be read;
    the processing unit being configured to determine the first data to be read according to the first read request and send the first data to be read to the first client, and to determine the second data to be read according to the second read request and send the second data to be read to the second client.
  13. The server according to claim 12, wherein the processing unit is specifically configured to:
    determine, according to the identifier of the first client, that the storage resource allocated to the first client is a first storage resource, and determine, according to the identifier of the second client, that the storage resource allocated to the second client is a second storage resource;
    determine the first data to be read from the first storage resource according to a correspondence between the identifier of the first client and the physical address at which the first storage resource is located and the starting position and length of the first data to be read; and
    determine the second data to be read from the second storage resource according to a correspondence between the identifier of the second client and the physical address at which the second storage resource is located and the starting position and length of the second data to be read.
  14. The server according to claim 13, wherein the processing unit is specifically configured to:
    determine, according to the identifier of the first client and a saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the first client is the first storage resource; and
    determine, according to the identifier of the second client and the saved correspondence between client identifiers and allocated storage resources, that the storage resource allocated to the second client is the second storage resource.
  15. A distributed storage system, comprising the server for performing write operations according to any one of claims 8 to 11 and the server for performing read operations according to any one of claims 12 to 14.
  16. A computer-readable storage medium storing computer-executable instructions, the computer-executable instructions being used to cause a computer to execute the method according to any one of claims 1 to 7.
  17. A computer program product containing computer-executable instructions, the computer-executable instructions being used to cause a computer to execute the method according to any one of claims 1 to 7.
PCT/CN2020/088787 2019-06-18 2020-05-06 Method and apparatus for performing write operation and read operation WO2020253407A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910528337.2A CN112099728B (zh) 2019-06-18 2019-06-18 Method and apparatus for performing write operation and read operation
CN201910528337.2 2019-06-18

Publications (1)

Publication Number Publication Date
WO2020253407A1 true WO2020253407A1 (zh) 2020-12-24

Family

ID=73748430

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/088787 WO2020253407A1 (zh) 2019-06-18 2020-05-06 Method and apparatus for performing write operation and read operation

Country Status (2)

Country Link
CN (1) CN112099728B (zh)
WO (1) WO2020253407A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113010821A (zh) * 2021-04-14 2021-06-22 北京字节跳动网络技术有限公司 Page loading method, apparatus, device, and storage medium
CN115495008A (zh) * 2021-06-18 2022-12-20 华为技术有限公司 Data management method, storage space management method, and apparatus
CN114448781B (zh) * 2021-12-22 2024-06-07 天翼云科技有限公司 Data processing system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101291205B (zh) * 2008-06-16 2011-05-11 杭州华三通信技术有限公司 Method, system, and mirror server for transmitting backup data
CN103873504A (zh) * 2012-12-12 2014-06-18 鸿富锦精密工业(深圳)有限公司 System and method for storing data in blocks on distributed servers
CN104994135A (zh) * 2015-05-25 2015-10-21 华为技术有限公司 Method and apparatus for converging SAN and NAS storage architectures in a storage system
CN107948233A (zh) * 2016-10-13 2018-04-20 华为技术有限公司 Method for processing a write request or read request, switch, and control node

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100100439A1 (en) * 2008-06-12 2010-04-22 Dawn Jutla Multi-platform system apparatus for interoperable, multimedia-accessible and convertible structured and unstructured wikis, wiki user networks, and other user-generated content repositories
CN102035865B (zh) * 2009-09-30 2013-04-17 阿里巴巴集团控股有限公司 Data storage and data addressing method, system, and device
CN102882983B (zh) * 2012-10-22 2015-06-10 南京云创存储科技有限公司 Fast data storage method for improving concurrent access performance in a cloud storage system
CN103503414B (zh) * 2012-12-31 2016-03-09 华为技术有限公司 Cluster system converging computing and storage
CN107426321A (zh) * 2017-07-31 2017-12-01 郑州云海信息技术有限公司 Quota allocation method and apparatus for a distributed storage system
CN107632791A (zh) * 2017-10-10 2018-01-26 郑州云海信息技术有限公司 Storage space allocation method and system
CN107888657B (zh) * 2017-10-11 2020-11-06 上海交通大学 Low-latency distributed storage system


Also Published As

Publication number Publication date
CN112099728B (zh) 2022-09-16
CN112099728A (zh) 2020-12-18

Similar Documents

Publication Publication Date Title
WO2020253407A1 (zh) 一种执行写操作、读操作的方法及装置
CN108647104B (zh) 请求处理方法、服务器及计算机可读存储介质
US9003002B2 (en) Efficient port management for a distributed network address translation
JP6275119B2 (ja) メモリ要素の割当てのために一方向リンク付けリストを区分化するシステム及び方法
JP7467593B2 (ja) リソース割振り方法、記憶デバイス、および記憶システム
CN106936931B (zh) 分布式锁的实现方法、相关设备及***
CN111338806B (zh) 一种业务控制方法及装置
US20220114145A1 (en) Resource Lock Management Method And Apparatus
WO2014183531A1 (zh) 一种分配远程内存的方法及装置
US11438423B1 (en) Method, device, and program product for transmitting data between multiple processes
CN107920101B (zh) 一种文件访问方法、装置、***及电子设备
CN112286688A (zh) 一种内存管理和使用方法、装置、设备和介质
CN112052230A (zh) 多机房数据同步方法、计算设备及存储介质
CN110162395B (zh) 一种内存分配的方法及装置
CN107329798B (zh) 数据复制的方法、装置和虚拟化***
US11144207B2 (en) Accelerating memory compression of a physically scattered buffer
CN117407159A (zh) 内存空间的管理方法及装置、设备、存储介质
CN112596669A (zh) 一种基于分布式存储的数据处理方法及装置
CN109478151B (zh) 网络可访问数据卷修改
TW202315360A (zh) 微服務分配方法、電子設備及儲存介質
CN112559164A (zh) 一种资源共享方法及装置
CN113691465A (zh) 一种数据的传输方法、智能网卡、计算设备及存储介质
CN112929459B (zh) 一种边缘***及数据操作请求的处理方法
CN114253733B (zh) 一种内存管理方法、装置、计算机设备和存储介质
CN117873694A (zh) 堆空间分配方法、装置、电子设备和存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20827012

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20827012

Country of ref document: EP

Kind code of ref document: A1