CN109462642B - Data processing method and device - Google Patents


Info

Publication number
CN109462642B
CN109462642B
Authority
CN
China
Prior art keywords
storage
multicast
storage nodes
data
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811283336.8A
Other languages
Chinese (zh)
Other versions
CN109462642A (en)
Inventor
张天洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Technologies Co Ltd Chengdu Branch
Original Assignee
New H3C Technologies Co Ltd Chengdu Branch
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Technologies Co Ltd Chengdu Branch
Priority to CN201811283336.8A
Publication of CN109462642A
Application granted
Publication of CN109462642B

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H04L 67/1097 - Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 61/00 - Network arrangements, protocols or services for addressing or naming
    • H04L 61/50 - Address allocation
    • H04L 61/5069 - Address allocation for group communication, multicast communication or broadcast communication

Abstract

The disclosure provides a data processing method and device, and relates to the field of internet technologies. The method is applied to a computing node that communicates with a storage cluster through a switch, and includes the following steps: calculating at least two storage nodes in the storage cluster that correspond to data to be stored; obtaining a corresponding multicast address according to the at least two storage nodes; and encapsulating the data to be stored in a multicast message, taking the multicast address as the destination address of the multicast message, sending the multicast message to the switch, and sending the multicast message, through the switch, to the at least two storage nodes in the storage cluster corresponding to the multicast address so as to store the data to be stored. Data processing performance is thereby improved.

Description

Data processing method and device
Technical Field
The present disclosure relates to the field of internet technologies, and in particular, to a data processing method and apparatus.
Background
With the rapid development of internet technology, the volume of data in various scenarios is growing larger and larger, and accordingly large amounts of data need to be stored. For example, in a distributed storage system, multiple copies of the same data are often stored in order to ensure and improve the reliability of data storage.
Disclosure of Invention
In view of the above, the present disclosure provides a data processing method and apparatus.
In a first aspect, the present disclosure provides a data processing method applied to a computing node in communication with a storage cluster through a switch, the method including:
calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
obtaining corresponding multicast addresses according to the at least two storage nodes;
and encapsulating data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, sending the multicast message to the switch, and sending the multicast message to at least two storage nodes corresponding to the multicast address in the storage cluster through the switch so as to store the data to be stored.
Optionally, the step of calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster includes:
acquiring topology information of storage nodes in the storage cluster and storage configuration information in the storage cluster;
and calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster according to the topology information and the storage configuration information.
Optionally, the step of obtaining the corresponding multicast address according to the at least two storage nodes includes:
and inquiring whether a preset multicast address data table has multicast addresses corresponding to the at least two storage nodes, and if so, acquiring the multicast address.
Optionally, the method further comprises:
if the multicast address corresponding to the at least two storage nodes does not exist, obtaining a number group corresponding to the at least two storage nodes according to the preset number of each storage node in the storage cluster, wherein the numbers corresponding to the at least two storage nodes in the number group are sequentially arranged according to the size;
acquiring a full-order relation of numbering groups respectively corresponding to the storage nodes with the same number as the at least two storage nodes in the storage cluster;
mapping a full-order relation of a preset multicast address field and a full-order relation of a number group corresponding to each storage node with the same number as the at least two storage nodes in the storage cluster to obtain multicast addresses corresponding to the at least two storage nodes;
and adding the multicast addresses corresponding to the at least two storage nodes to the multicast address data table.
Optionally, the method further comprises:
and judging whether the storage confirmation information sent by each of the at least two storage nodes is received, and if the storage confirmation information sent by each of the at least two storage nodes is received, judging that the storage of the data to be stored is finished.
In a second aspect, the present disclosure provides a data processing method applied to a storage cluster in communication with a compute node through a switch, the method including:
acquiring an Address Resolution Protocol (ARP) table from the switch;
obtaining the IP address and the port information of each storage node in the storage cluster from the ARP table;
acquiring multicast addresses corresponding to at least two storage nodes in the storage cluster;
and sending each multicast address and the IP address and the port information corresponding to the multicast address to the switch.
In a third aspect, the present disclosure provides a data processing method applied to a distributed storage system, where the distributed storage system includes a computing node, a switch, and a storage cluster, where the storage cluster includes a plurality of storage nodes, and the method includes:
the computing node encapsulates data to be stored in a multicast message and sends the multicast message to the switch, wherein the multicast address of the multicast message corresponds to at least two storage nodes in the storage cluster;
the switch receives the multicast message sent by the computing node, searches for at least two destination ports according to the multicast address in the multicast message, copies the multicast message and sends the copied multicast message to corresponding storage nodes through the at least two destination ports respectively;
and after receiving the multicast message, the storage node stores the data to be stored in the multicast message.
Optionally, a multicast address data table is preset in the computing node; the step that the computing node encapsulates the data to be stored in the multicast message and sends the multicast message to the switch comprises the following steps:
calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
inquiring whether a preset multicast address data table has multicast addresses corresponding to the at least two storage nodes, and if the multicast addresses corresponding to the at least two storage nodes exist, obtaining the multicast address;
encapsulating data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, and sending the multicast message to the switch;
the multicast address data table establishes multicast addresses corresponding to combinations of at least two storage nodes in the storage cluster according to the maximum number of storage nodes that the storage cluster can include; in the process of expanding or reducing the storage nodes in the storage cluster, corresponding information is added to or deleted from the multicast address data table, and the result of the expansion or reduction of the storage nodes is updated to the switch.
In a fourth aspect, the present disclosure provides a data processing apparatus applied to a compute node communicating with a storage cluster through a switch, the data processing apparatus including:
the computing module is used for computing to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
an address obtaining module, configured to obtain a corresponding multicast address according to the at least two storage nodes;
and the message processing module is used for encapsulating the data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, sending the multicast message to the switch, and sending the multicast message to at least two storage nodes corresponding to the multicast address in the storage cluster through the switch so as to store the data to be stored.
Optionally, the calculation module obtains at least two storage nodes corresponding to the data to be stored in the storage cluster by calculating according to the following steps:
acquiring topology information of storage nodes in the storage cluster and storage configuration information in the storage cluster;
and calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster according to the topology information and the storage configuration information.
Optionally, the address obtaining module obtains the corresponding multicast address according to the at least two storage nodes by:
and inquiring whether a preset multicast address data table has multicast addresses corresponding to the at least two storage nodes, and if so, acquiring the multicast address.
Optionally, the address obtaining module is further configured to, when there is no multicast address corresponding to the at least two storage nodes, perform the following steps:
obtaining a number group corresponding to the at least two storage nodes according to the preset number of each storage node in the storage cluster, wherein the numbers corresponding to the at least two storage nodes in the number group are sequentially arranged according to the size;
acquiring a full-order relation of numbering groups respectively corresponding to the storage nodes with the same number as the at least two storage nodes in the storage cluster;
mapping a full-order relation of a preset multicast address field and a full-order relation of a number group corresponding to each storage node with the same number as the at least two storage nodes in the storage cluster to obtain multicast addresses corresponding to the at least two storage nodes;
and adding the multicast addresses corresponding to the at least two storage nodes to the multicast address data table.
In a fifth aspect, the present disclosure provides a data processing apparatus applied to a storage cluster communicating with a compute node through a switch, the data processing apparatus comprising:
the table item acquisition module is used for acquiring an Address Resolution Protocol (ARP) table from the switch;
the information acquisition module is used for acquiring the IP address and the port information of each storage node in the storage cluster from the ARP table and acquiring the multicast address corresponding to at least two storage nodes in the storage cluster;
and the information sending module is used for sending each multicast address and the IP address and the port information corresponding to the multicast address to the switch.
In a sixth aspect, the present disclosure provides a compute node, including: a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, implements the above data processing method performed by the computing node.
In a seventh aspect, the present disclosure provides a computer-readable storage medium, where the computer-readable storage medium includes a computer program which, when running, controls the computing node where the computer-readable storage medium is located to execute the data processing method performed by the computing node.
According to the data processing method and device, the multi-copy storage processing of the data to be stored is realized in the form of the multicast message, the network utilization rate of data transmission is optimized, the network forwarding times are reduced, the time delay is reduced, and the data processing performance is improved.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
To more clearly illustrate the technical solutions of the present disclosure, the drawings needed for the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present disclosure, and therefore should not be considered as limiting the scope, and those skilled in the art can also obtain other related drawings based on the drawings without inventive efforts.
Fig. 1 is a block schematic diagram of a distributed storage system provided by the present disclosure.
Fig. 2 is a schematic diagram of a multi-copy storage principle provided by the present disclosure.
Fig. 3 is a schematic diagram of network logic of a multi-copy storage according to the present disclosure.
Fig. 4 is a block diagram of a computing node according to the present disclosure.
Fig. 5 is a timing diagram of a data processing method provided by the present disclosure.
Fig. 6 is a schematic flowchart of a compute node according to the present disclosure.
Fig. 7 is a schematic flow chart of a storage cluster according to the present disclosure.
Fig. 8 is a schematic diagram of another network logic of a multi-copy storage provided by the present disclosure.
Fig. 9 is a block diagram illustrating functional modules of a data processing apparatus applied to a compute node according to the present disclosure.
Fig. 10 is a block diagram illustrating functional modules of a data processing apparatus applied to a storage cluster according to the present disclosure.
Icon: 10-a compute node; 11-a memory; 12-a processor; 13-a network module; 14-a data processing device; 141-a calculation module; 142-an address acquisition module; 143-a message processing module; 20-a switch; 30-a storage cluster; 31-a storage node; 311-table entry obtaining module; 312-an information acquisition module; 313-information sending module.
Detailed Description
The distributed storage system stores data dispersedly on a plurality of independent storage servers, adopts an expandable system architecture, and shares the storage load among multiple storage servers, which not only improves the reliability, availability and access efficiency of storage, but is also easy to expand and suits a wide range of application scenarios. A distributed storage system can interconnect a large number of storage servers through a network and use storage software to organize the storage devices, such as disks, on the storage servers into a virtual storage resource pool, providing storage services externally as a single storage cluster. Distributed storage systems have great advantages in expandability, flexibility, cost-effectiveness and the like. In particular, with the development of cloud computing, distributed storage systems have developed rapidly because they closely match the requirements of cloud computing for expandability, flexibility and manageability. According to market research and forecasts, distributed storage systems will maintain a high compound growth rate in the future.
Referring to fig. 1, an exemplary networking architecture of a distributed storage system is provided in the present disclosure. The distributed storage system may include a computing node 10, a switch 20, and a storage cluster 30, where the storage cluster 30 includes a plurality of storage nodes 31, and each storage node 31 includes a storage device. The computing node 10 is connected to the switch 20, and reads and writes data in the storage cluster 30 using a network. The storage cluster 30 is connected to the switch 20, and each storage node 31 in the storage cluster 30 communicates through a network and forms a unified storage resource pool to provide storage service for the computing node 10. The storage device included in each storage node 31 may be a magnetic disk, but is not limited thereto.
In the present disclosure, the computing node 10 may be a device with computing power and data storage requirements, such as a personal computer or a server. There may be one or more computing nodes 10 in the distributed storage system. The switch 20 may be an Ethernet switch and may provide gigabit/10-gigabit Ethernet (GE/10GE) ports.
In distributed storage, in order to achieve reliability and high durability of data storage, commonly used data storage technologies include multi-copy techniques, erasure coding techniques, and the like. The multi-copy (replica) technique mainly uses multiple copies of the same data to achieve reliable and durable data storage; for example, copy storage may be realized by a Redundant Array of Independent Disks (RAID), and 2 copies, 3 copies, 4 copies and so on can be configured flexibly. The copies of the same data are kept strongly consistent.
Referring to fig. 2, the present disclosure illustrates the storage principle by taking 3-copy storage as an example. Fig. 2 shows 5 pieces of data, namely data 1, data 2, data 3, data 4 and data 5, denoted 1, 2, 3, 4 and 5 in fig. 2; each piece of data has 3 copies, stored on different disks of different storage nodes 31. With this storage mode, even if two storage nodes 31 in the storage cluster 30 fail, the data can still be obtained from the remaining storage node 31, so the storage cluster 30 as a whole loses no data, ensuring the reliability and high durability of data storage.
It has been found that in the multi-copy scheme, replication is generally performed by the storage cluster 30. For example, in a 3-copy scheme, the compute node 10 writes only 1 copy to the storage cluster 30, and the other 2 copies are generated by the storage cluster 30. For instance, the storage cluster 30 may implement multi-copy replication using a Primary-Secondary copy control protocol. In the Primary-Secondary protocol, replicas are divided into two categories: exactly one replica is the Primary copy, and all the others are Secondary copies. The Primary copy is responsible for completing the replication to all Secondary copies.
Referring to fig. 3, the present disclosure takes 3 copies as an example to illustrate how multi-copy storage is implemented with a centralized copy control protocol. For convenience of description, the storage nodes 31 include storage node 1 to storage node N, where N is an integer greater than 3, the 3 copies are located on storage node 1, storage node 2 and storage node 3 respectively, and the data to be stored is data D.
As shown in fig. 3, the process by which the computing node 10 stores data includes: the computing node 10 writes data D to a disk of storage node 2, which holds the Primary copy. Storage node 2, holding the Primary copy, then sends data D to a disk of storage node 1 and a disk of storage node 3, respectively. After the disks of storage node 1 and storage node 3 complete the writing of data D, they send write completion messages to storage node 2. After receiving the write completion messages from storage node 1 and storage node 3 and completing its own writing of data D, storage node 2 sends a data D write completion message to the computing node 10.
The determination by the computing node 10 of the storage location of the Primary copy of data D, and the determination by the Primary copy of the storage locations of the other copies, may be implemented in various ways, for example by calculation with a Distributed Hash Table (DHT) algorithm or with the CRUSH (Controlled Replication Under Scalable Hashing) data distribution algorithm adopted in the open-source distributed storage software Ceph. To distribute data uniformly across different disks, the data sent by the computing node 10 can be split into slices of a certain size, each slice is stored as multiple copies, and different slices can be placed on different disk groups according to the algorithm.
As can be seen from the analysis, in the multi-copy replication scheme shown in fig. 3, after the computing node 10 sends the data through the switch 20 to storage node 2 of the storage cluster 30, storage node 2, as the Primary copy node, further sends the data through the switch 20 to storage node 1 and storage node 3, where the Secondary copies are located. After storage nodes 1 and 3 complete the data writing, they need to send write completion messages to storage node 2 through the switch 20. In this process, the switch 20 needs to perform network forwarding multiple times, which occupies more network resources and results in lower network utilization and larger data transmission delay.
In view of this, the present disclosure provides a data processing method and apparatus, when a computing node 10 writes data, multiple copies of the data to be stored are stored in a multicast packet manner, and under the constraints of fixed copy number and strong data consistency, the network utilization rate of data transmission is optimized, the network forwarding times are reduced, and the read-write time delay is reduced, so as to improve the data processing performance.
The defects of the above solutions were found by the inventor through practice and research; therefore, the discovery of the above problems and the solutions proposed below for these problems should be regarded as the inventor's contribution to the present disclosure.
The technical solutions in the present disclosure will be described clearly and completely with reference to the accompanying drawings in the present disclosure, and it is to be understood that the described embodiments are only a part of the embodiments of the present disclosure, and not all of the embodiments. The components of the present disclosure, as generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
Fig. 4 is a block diagram of a computing node 10 provided by the present disclosure. The computing node 10 in the present disclosure may be a device with data storage requirements, such as a server, a personal computer, a tablet, a smartphone, or the like. As shown in fig. 4, the computing node 10 includes: memory 11, processor 12, network module 13 and data processing device 14.
The memory 11, the processor 12 and the network module 13 are electrically connected to each other directly or indirectly to realize data transmission or interaction. For example, the components may be electrically connected to each other via one or more communication buses or signal lines. The memory 11 stores a data processing device 14, the data processing device 14 includes at least one functional module which can be stored in the memory 11 in a form of software or firmware (firmware), and the processor 12 executes various functional applications and data processing by executing the functional module stored in the memory 11 in a form of software or hardware, that is, implements the data processing method executed by the computing node 10 in the present disclosure.
The Memory 11 may be, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a Programmable Read-Only Memory (PROM), an Erasable Programmable Read-Only Memory (EPROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a magnetic disk, a solid state disk, or the like. The memory 11 is used to store a program, and the processor 12 executes the program after receiving an execution instruction.
The processor 12 may be an integrated circuit chip having data processing capabilities. The Processor 12 may be a general-purpose Processor including a Central Processing Unit (CPU), a Network Processor (NP), and the like. The various methods, steps and logic blocks disclosed in this disclosure may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The network module 13 is configured to establish a communication connection between the computing node 10 and an external communication terminal through a network, so as to implement transceiving operations of network signals and data. The network signal may include a wireless signal or a wired signal.
It will be appreciated that the configuration shown in fig. 4 is merely illustrative and that the computing node 10 may also include more or fewer components than shown in fig. 4, or have a different configuration than shown in fig. 4. The components shown in fig. 4 may be implemented in hardware, software, or a combination thereof.
On the basis, the present disclosure further provides a computer-readable storage medium, where the computer-readable storage medium includes a computer program, and the computer program controls, when running, the computing node 10 where the computer-readable storage medium is located to execute the data processing method on the computing node 10 side.
In the present disclosure, the storage nodes 31 in the storage cluster 30 may also include a structure similar to the computing node 10 shown in fig. 4. Correspondingly, the present disclosure also provides a computer-readable storage medium, which includes a computer program, and when the computer program runs, the computer program controls the storage node 31 in the storage cluster 30 where the computer-readable storage medium is located to execute the data processing method on the storage cluster 30 side.
Referring to fig. 5 in conjunction, the present disclosure provides a data processing method that may be performed by the distributed storage system of fig. 1, which includes computing nodes 10, switches 20, and storage clusters 30. The method comprises the following steps.
In step S11, the computing node 10 encapsulates the data to be stored in a multicast packet and sends the multicast packet to the switch 20.
The multicast address of the multicast packet corresponds to at least two storage nodes 31 in the storage cluster 30, that is, the at least two storage nodes 31 form a multicast group as a multicast member, and after the computing node sends the storage data to the switch 20 by using the multicast address of the multicast group, the switch 20 copies the sent storage data to all multicast members.
Referring to fig. 6, optionally, the computing node 10 encapsulates the data to be stored in the multicast packet through the following steps S111 to S113 and sends the multicast packet to the switch 20, which is described in detail below.
Step S111, calculating to obtain at least two storage nodes 31 corresponding to the data to be stored in the storage cluster 30.
In order to obtain the specific position at which the data to be stored is written in the storage cluster 30, the computing node 10 needs to obtain the topology information of the storage nodes 31 in the storage cluster 30 and the storage configuration information in the storage cluster 30, so as to obtain, according to the topology information and the storage configuration information, the at least two storage nodes 31 in the storage cluster 30 that correspond to the data to be stored.
The specific location where the data to be stored is written into the storage cluster 30 refers to which storage nodes 31 in the storage cluster 30 the data to be stored is written, and may further include which storage devices of these storage nodes 31 are written, such as which specific disks. The topology information of the storage node 31 may include names of all storage nodes 31 included in the storage cluster 30, Internet Protocol (IP) addresses, disks included in each storage node 31, fault domain division of the storage nodes 31, and the like. The storage configuration information may include the number of copies or erasure code configuration employed by the data to be stored, etc.
After the computing node 10 acquires the topology information of the storage node 31 in the storage cluster 30 and the storage configuration information in the storage cluster 30, the specific location of the data to be stored in the storage cluster 30 may be calculated. For example, the computing node 10 may compute a specific location of the data to be stored in the storage cluster 30 through a DHT algorithm, a CRUSH algorithm, or the like. Thereby obtaining at least two storage nodes corresponding to the data to be stored in the storage cluster 30.
As an alternative implementation, the computing node 10 may interact with the storage cluster 30 as follows: Client software is installed in the computing node 10. The Client software communicates with the Monitor software (monitoring software) of the storage cluster 30 to obtain the topology information and the storage configuration information of the storage cluster 30. For convenience of description, the topology information and the configuration information are collectively referred to as the Cluster Map in this disclosure. After the Client software obtains the Cluster Map, the specific position of the data to be stored in the storage cluster 30 is calculated through a copy calculation algorithm F, such as a DHT algorithm or the CRUSH algorithm. Taking 3 copies as an example, for certain data D to be stored, if the result of the copy calculation algorithm F is F(D) = (P1, P2, P3), where P1, P2 and P3 represent 3 different storage devices, then, in combination with the Cluster Map, the storage nodes H1, H2 and H3 on which P1, P2 and P3 are located can be determined.
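For illustration only, the following Python sketch shows a simplified, hash-based stand-in for such a copy calculation algorithm F; the function name place_copies, the shape of cluster_map, and the MD5-based ranking are assumptions made for the example and are not the DHT or CRUSH algorithm described above.

```python
import hashlib

def place_copies(data_id: str, cluster_map: dict, copies: int = 3):
    """Toy stand-in for a placement function F: rank every storage device by a
    hash of (data_id, device) and keep the first `copies` devices that sit on
    distinct storage nodes, so the copies end up on different hosts."""
    ranked = sorted(
        cluster_map,
        key=lambda dev: hashlib.md5(f"{data_id}:{dev}".encode()).hexdigest(),
    )
    devices, hosts = [], []
    for dev in ranked:
        host = cluster_map[dev]
        if host not in hosts:          # one copy per storage node
            devices.append(dev)
            hosts.append(host)
        if len(devices) == copies:
            break
    return devices, hosts

# cluster_map: device name -> storage node holding it (illustrative values)
cluster_map = {"P1": "H1", "P2": "H1", "P3": "H2", "P4": "H3", "P5": "H4"}
devices, hosts = place_copies("data-D", cluster_map)
print(devices, hosts)   # three devices on three distinct storage nodes
```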
Step S112, obtaining the corresponding multicast address according to the at least two storage nodes 31.
There are many alternative ways to obtain the multicast address. For example, the correspondence between a multicast address and a "combination" of two or more storage nodes in the storage cluster 30 may be stored, for example in a multicast address data table. Accordingly, the computing node 10 obtains the multicast address corresponding to the at least two storage nodes 31 by querying whether a preset multicast address data table contains a multicast address corresponding to the at least two storage nodes 31 and, if so, obtaining that multicast address.
For another example, the number of each storage node 31 in the storage cluster 30 may be set in advance, and if there are N storage nodes 31, the storage nodes 31 may be numbered sequentially from 1 to N according to the size of the IP address of each storage node 31. According to the preset number of each storage node 31 in the storage cluster 30, the number group corresponding to the at least two storage nodes 31 can be obtained.
Taking a 3-copy scenario as an example, if the three storage nodes 31 corresponding to the data to be stored are H1, H2 and H3, respectively, then the triple (H1, H2, H3) corresponds to a number group consisting of any 3 of the numbers 1 to N. If storage node H1 corresponds to number 5, storage node H2 corresponds to number 7, and storage node H3 corresponds to number 3, then the number group corresponding to (H1, H2, H3) is (5, 7, 3), the number group corresponding to (H3, H1, H2) is (3, 5, 7), and so on.
Since a combination of storage nodes 31 has no inherent order in the distributed storage system, the order of the numbers within a combination of the same storage nodes 31 does not matter. In order that different arrangements of the same storage nodes 31 correspond to the same number group, the triples formed by any arrangement of H1, H2 and H3, such as (H1, H2, H3) and (H3, H1, H2), correspond to one number group; optionally, in the present disclosure, the numbers in a number group are arranged in ascending order and the parentheses are replaced by angle brackets. Thus, the triples formed by any arrangement of H1, H2 and H3, such as (H1, H2, H3) and (H3, H1, H2), all correspond to the unique number group <3, 5, 7>.
After each number group is set in the above manner, the size of each number group may be compared, so as to obtain the full-order relationship of the number groups corresponding to the storage nodes 31 in the storage cluster 30, which are the same in number as the at least two storage nodes 31. The multicast address segments can be flexibly selected, and the selected multicast address segments naturally form a full-order relationship. Mapping the full-order relationship of the preset multicast address segment and the full-order relationship of the number group corresponding to each storage node 31 with the same number as the at least two storage nodes 31 in the storage cluster 30, so as to obtain the multicast addresses corresponding to the at least two storage nodes 31.
Assuming that addresses from 225.0.0.1 to 225.255.255.255 are selected as usable multicast address segments, the following description will be made for an alternative implementation of obtaining multicast addresses according to full-order relationship mapping, taking a 3-copy scenario as an example.
The number groups corresponding to any 3 storage nodes 31 may be compared in size in lexicographic order, forming the following full-order relationship from small to large:
<1,2,3> < <1,2,4> < … < <1,2,N> < <1,3,4> < <1,3,5> < … < <N-2,N-1,N>
the selected multicast address segments naturally form the following full-order relationship:
225.0.0.1 < 225.0.0.2 < 225.0.0.3 < … < 225.255.255.255
Mapping these two full-order relations to each other one by one yields a mapping from storage node 31 triples to multicast addresses. The mapping from a storage node 31 triple to a multicast address can be chosen flexibly; for example, since the number of number groups is smaller than the number of multicast addresses, each number group can be mapped one by one to a multicast address, thereby obtaining the corresponding multicast address.
One alternative implementation of mapping the number groups onto multicast addresses is listed in this disclosure, as follows.
Assume that the number group is < N1, N2, N3>, wherein N1< N2< N3, and the way of mapping < N1, N2, N3> to the multicast address includes:
the position of < N1, N2, N3> in the lexicographic order is determined. The location can be determined by the following several steps.
The number of number groups of the form <P, X, Y> is counted, where P satisfies 1 ≤ P ≤ N1-1 and P < X < Y ≤ N. Denoting the number of such groups for a given P by N(P), then

N(P) = (N-P)(N-P-1)/2

so that the sum N(1) + N(2) + … + N(N1-1) can be calculated.
The number of number groups of the form <N1, Q, X> is counted, where Q satisfies N1 < Q < N2 and Q < X ≤ N. Denoting the number of such groups for a given Q by N(N1, Q), then N(N1, Q) = N - Q, so the sum N(N1, N1+1) + … + N(N1, N2-1) can be calculated.
The number of number groups of the form <N1, N2, R> is counted, where R satisfies N2 < R < N3. Similarly, this count is N3 - N2 - 1.
The sum of the counts obtained in the above steps is recorded as M; then M number groups precede <N1, N2, N3>, so the position of <N1, N2, N3> is numbered M + 1.
After the position number of <N1, N2, N3> is obtained, the multicast address 225.X.Y.Z corresponding to (M+1) can be calculated by successive division, where the values of X, Y and Z are determined as follows.
Divide (M+1) by 256, recording the quotient as K and the remainder as L, i.e. (M+1) = K × 256 + L. Then X = 0, Y = K, and Z = L.
In the case where the value of N is less than or equal to 32, the number of number groups is not more than 4960, so the value of K is necessarily less than 255. If the number of storage nodes 31 in the storage cluster 30 exceeds 32, or the number of copies exceeds 4, M may take a larger value, for example more than 65535, in which case K would be greater than 255; K can then be divided by 256 again to determine the values of X and Y.
After determining X, Y and the value of Z, any set of numbers can be mapped to a unique multicast address.
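Purely as an illustration of the counting and division steps above (a sketch, not part of the disclosure; the function names are ours, and the address derivation assumes the K < 255 case discussed above), the mapping can be written as:

```python
from math import comb

def rank_of_triple(n1: int, n2: int, n3: int, n: int) -> int:
    """1-based lexicographic position of the sorted number group <n1, n2, n3>
    among all C(n, 3) sorted triples drawn from 1..n."""
    assert 1 <= n1 < n2 < n3 <= n
    m = 0
    for p in range(1, n1):          # groups <P, X, Y> with P < n1
        m += comb(n - p, 2)
    for q in range(n1 + 1, n2):     # groups <n1, Q, X> with n1 < Q < n2
        m += n - q
    m += n3 - n2 - 1                # groups <n1, n2, R> with n2 < R < n3
    return m + 1                    # this triple follows the m smaller ones

def triple_to_multicast(n1: int, n2: int, n3: int, n: int) -> str:
    """Map the sorted triple to an address: position = K * 256 + L,
    then the address is 225.0.K.L as described in the text."""
    k, l = divmod(rank_of_triple(n1, n2, n3, n), 256)
    return f"225.0.{k}.{l}"

print(triple_to_multicast(1, 2, 3, 16))   # first group <1,2,3> -> 225.0.0.1
print(triple_to_multicast(3, 5, 7, 16))   # address for the number group <3,5,7>
```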
It should be understood that the multicast address mapping rule may be pre-stored in the computing node 10, and the computing node 10 may, each time, calculate the corresponding multicast address for the at least two storage nodes 31 corresponding to the data to be stored according to the pre-stored multicast address mapping rule.
For example, when the addresses from 225.0.0.1 to 225.255.255.255 are selected as the usable multicast address segment and the size of the storage cluster is not larger than 255 nodes, in a 3-copy scenario the multicast address may be 225.A.B.C, where A, B and C each lie in [1, 255]. In this case, H1 may be mapped to A, H2 to B, and H3 to C, so that different triples (H1, H2, H3) are mapped to different multicast addresses, which meets the mapping requirement. It should be understood that the mapping may also be implemented in other ways, which need not be described in detail in the present disclosure.
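A minimal sketch of this direct mapping (assuming node numbers in 1-255 and sorting the triple first so that any ordering of the same combination yields the same address, consistent with the number-group convention above):

```python
def triple_to_multicast_small(h1: int, h2: int, h3: int) -> str:
    """Direct mapping for clusters of at most 255 storage nodes: the three
    node numbers become the last three octets of the address."""
    a, b, c = sorted((h1, h2, h3))
    return f"225.{a}.{b}.{c}"

print(triple_to_multicast_small(7, 3, 5))   # -> 225.3.5.7
```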
To reduce the amount of calculation, the computing node 10 may also store, for each combination of storage nodes 31, the result obtained by the first calculation in a data table, such as the multicast address data table, so that in subsequent processing the corresponding multicast address can be obtained by table lookup rather than being recalculated every time.
In this disclosure, the structure of the multicast address data table is not limited. For example, for a 3-copy scenario, the multicast address data table may be a quadruple table whose structure is: storage node 1, storage node 2, storage node 3, multicast address. Assuming that there are 16 storage nodes 31 in the storage cluster 30, each storage node 31 is numbered 1-16 according to its IP address and occupies 1 byte, and the multicast address occupies 4 bytes, each quadruple therefore occupies 7 bytes in total. The whole multicast address data table thus occupies 16 × 16 × 16 × 7 = 28672 bytes, which is not a large amount of storage space, and the table can be kept in the memory of the computing node 10 for use.
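As an illustrative sketch only (the disclosure fixes the quadruple layout, not any code), the table can be held in memory as a map keyed by the canonical number group, reusing triple_to_multicast from the earlier sketch:

```python
multicast_table = {}   # (n1, n2, n3) -> multicast address, filled on first use

def get_multicast_address(nodes, n):
    """Look the address up in the table; compute and cache it only on a miss."""
    key = tuple(sorted(nodes))                       # canonical number group
    if key not in multicast_table:
        # triple_to_multicast is the mapping sketched earlier in this section
        multicast_table[key] = triple_to_multicast(*key, n)
    return multicast_table[key]

print(get_multicast_address((5, 7, 3), 16))   # same result for any ordering of 3, 5, 7
```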
Step S113, encapsulating the data to be stored in a multicast packet, using the multicast address as a destination address of the multicast packet, and sending the multicast packet to the switch 20.
In step S12, the switch 20 receives the multicast packet sent by the computing node 10, finds at least two destination ports according to the multicast address in the multicast packet, copies the multicast packet, and sends the copied multicast packet to the corresponding storage nodes 31 through the at least two destination ports, so as to store the data to be stored.
In the present disclosure, both the computing node 10 and the storage Cluster 30 may compute the multicast address corresponding to any group of storage nodes 31 by using the same method through the Cluster Map. In order to associate each multicast address with a "combination" of each storage node 31 on the switch 20 side, the storage cluster 30 may optionally process the correspondence between each multicast address and the "combination" of each storage node 31 and transmit the correspondence to the switch 20. Accordingly, the switch 20 may obtain the multicast addresses corresponding to the at least two storage nodes 31 by searching the corresponding relationship.
Referring to fig. 7, the storage cluster 30 may obtain the correspondence relationship between the multicast addresses and the "combinations" of the storage nodes 31 through the following steps.
Step S131, an Address Resolution Protocol (ARP) table is obtained from the switch 20.
The storage cluster 30 may obtain the ARP table of the switch 20 through the Netconf protocol, for example by means of the Monitor software of the storage cluster 30.
Step S132 obtains the IP address and the port information of each storage node 31 in the storage cluster 30 from the ARP table.
The ARP table includes the IP address, the port information, and the like of each storage node 31 in the storage Cluster 30, and the port of the switch 20 corresponding to any storage node 31 can be obtained by comparing the Cluster Map.
Step S133 obtains multicast addresses corresponding to at least two storage nodes 31 in the storage cluster 30.
Through the Cluster Map, the storage cluster 30 calculates the multicast address corresponding to any group of storage nodes 31 by the same method as the computing node 10, and, by combining the switch 20 port corresponding to each storage node 31 obtained in step S132, obtains the correspondence between each multicast address and the IP address and port information.
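Schematically (data shapes only; the Netconf exchange is not shown and the names are assumptions for the example), steps S132 and S133 amount to joining the ARP-derived node-to-port map with the computed group-to-address mapping:

```python
def build_switch_config(arp_entries, node_groups, group_to_addr):
    """Combine ARP information with the computed multicast addresses.

    arp_entries:   {node_name: (ip, switch_port)} derived from the ARP table
    node_groups:   iterable of storage-node combinations, e.g. ("H1", "H2", "H3")
    group_to_addr: function mapping a node combination to its multicast address
    Returns {multicast_address: [(ip, port), ...]}, the correspondence that is
    pushed to the switch in step S134.
    """
    config = {}
    for group in node_groups:
        config[group_to_addr(group)] = [arp_entries[node] for node in group]
    return config
```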
Step S134, sending each multicast address and its corresponding IP address and port information to the switch 20.
The storage cluster 30 may configure the switch 20 through the Monitor software using the Netconf protocol: it sends a configuration command to the switch 20, sends each multicast address and its corresponding IP address and port information to the switch 20, and adds the ports connected to the corresponding storage nodes 31 to the corresponding multicast groups.
With the above configuration, after the switch 20 receives a multicast packet from the computing node 10 whose destination address is a multicast address, it knows, from the correspondence between the multicast address and the ports, through which ports the multicast packet should be sent to the corresponding storage nodes 31.
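Conceptually (this is an assumed model for illustration, not switch firmware), the installed correspondence acts as a map from multicast address to member ports, which the switch consults to replicate the packet:

```python
# Assumed in-switch view after the configuration of step S134: each multicast
# address is associated with the ports leading to its member storage nodes.
multicast_ports = {
    "225.0.0.1": [7, 12, 23],   # example ports toward storage nodes H1, H2, H3
}

def forward_multicast(dst_addr, packet, send_on_port):
    """Copy the packet once onto every member port of the multicast group.

    send_on_port(port, packet) is a placeholder for the actual egress path.
    """
    for port in multicast_ports.get(dst_addr, []):
        send_on_port(port, packet)
```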
Step S13, after receiving the multicast packet, the storage node 31 stores the data to be stored in the multicast packet.
In step S14, after a storage node 31 finishes storing, it sends storage confirmation information (the write completion message described above) to the computing node 10. The computing node 10 determines whether the storage confirmation information sent by each of the at least two storage nodes 31 has been received, and if so, determines that the storage of the data to be stored is complete. This ensures strong consistency of the data stored on each storage node 31.
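The waiting logic on the computing node 10 side can be sketched as follows (an illustration under stated assumptions: recv_ack is a hypothetical callback that returns the identity of the node whose confirmation just arrived, or None on timeout):

```python
import time

def wait_for_all_acks(expected_nodes, recv_ack, timeout_s=5.0):
    """Treat the write as complete only when a storage confirmation has been
    received from every node holding a copy, keeping the copies strongly
    consistent."""
    pending = set(expected_nodes)
    deadline = time.monotonic() + timeout_s
    while pending:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        node = recv_ack(remaining)      # blocks for at most `remaining` seconds
        if node in pending:
            pending.discard(node)
    return not pending                  # True only if every copy was acknowledged
```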
On this basis, considering the high flexibility of the storage cluster 30, the number of storage nodes 31 may be expanded or reduced. To avoid large changes to the correspondences in the multicast address data table when storage nodes 31 are added or removed, the maximum storage scale that the storage cluster 30 may reach in the future can be estimated at the initial stage. For example, if there are currently 8 storage nodes 31 in the storage cluster 30 and it is expected that the storage cluster 30 may reach 32 or 64 storage nodes 31 in the future, the multicast address data table may be established according to the maximum number of storage nodes 31, for example 64 storage nodes 31. In this case, the multicast address data table occupies 64 × 64 × 64 × 7 ≈ 1.8 MB, a small amount of memory. With this setting, during the expansion or reduction of the original 8 storage nodes 31, the Monitor software of the storage cluster 30 only needs to add or delete the changed entries in the multicast address data table according to the Cluster Map information and update the changed result to the switch 20.
Referring to fig. 8, taking as an example that the storage nodes 31 include storage node 1 to storage node N, where N is an integer greater than 3, that the 3 copies are located on storage node 1, storage node 2 and storage node 3 respectively, and that the data to be stored is data D, the implementation principle of storing data by multicast packets with the data processing method of the present disclosure is illustrated.
By adopting the data processing method in the disclosure, after the computing node 10 encapsulates the data D to be stored in a multicast message and sends the multicast message to the switch 20, the switch 20 finds the ports corresponding to the storage node 1, the storage node 2 and the storage node 3 according to the multicast address, copies the multicast message and sends the copied multicast message to the storage node 1, the storage node 2 and the storage node 3 through the found ports respectively for storage. After the storage nodes 1, 2 and 3 finish the storage, the storage nodes directly send storage confirmation information to the computing node 10, and after the computing node 10 receives the storage confirmation information sent by the storage nodes 1, 2 and 3, the data storage is judged to be finished.
Compared with the mode shown in fig. 3, by adopting the scheme shown in fig. 8 in the present disclosure, the storage nodes 1, 2 and 3 do not need to perform multiple network forwarding as shown in fig. 3.
As can be seen from the comparative analysis, the scheme shown in fig. 8 can greatly improve the network utilization compared to the scheme shown in fig. 3. In an N-copy scheme, the scheme shown in fig. 3 needs to transmit (2N-1) data writes on the storage-cluster-facing network of the switch 20, whereas the scheme shown in fig. 8 only needs to transmit N data writes on that network. Moreover, in the scheme shown in fig. 3 the write completion messages of the Secondary copies are forwarded to the Primary copy, which sends a message to the computing node 10 only after receiving all of them, whereas in the scheme shown in fig. 8 all copies send confirmation messages directly to the computing node 10, thereby reducing the network delay.
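As a quick check of these figures (the per-hop counting below is our reading of "the network of the switch 20 facing the storage cluster 30", where each secondary write crosses that side of the switch twice):

```python
def cluster_side_data_writes(copies: int, multicast: bool) -> int:
    """Data transmissions on the storage-cluster-facing links for one write:
    with multicast each copy crosses once (N); with primary-secondary
    replication the primary's copy crosses once and every other copy crosses
    twice (primary -> switch, switch -> secondary), i.e. 1 + 2*(N-1) = 2N - 1."""
    return copies if multicast else 1 + 2 * (copies - 1)

for n in (2, 3, 4):
    print(n, cluster_side_data_writes(n, False), cluster_side_data_writes(n, True))
# e.g. 3 copies: 5 transmissions without multicast vs 3 with multicast
```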
Referring to fig. 9, the present disclosure further provides a data processing apparatus 14, which is applied to a computing node 10 communicating with a storage cluster 30 through a switch 20, where the data processing apparatus 14 includes a computing module 141, an address obtaining module 142, and a message processing module 143.
The calculating module 141 is configured to calculate at least two storage nodes 31 corresponding to data to be stored in the storage cluster 30.
For the implementation of the calculating module 141, reference may be made to the related description of step S111 in fig. 6, which is not repeated herein.
The address obtaining module 142 is configured to obtain the corresponding multicast address according to the at least two storage nodes 31.
As for the implementation of the address obtaining module 142, reference may be made to the related description of step S112 in fig. 6, which is not described herein again.
The message processing module 143 is configured to encapsulate data to be stored in a multicast message, use the multicast address as a destination address of the multicast message, send the multicast message to the switch 20, and send the multicast message to at least two storage nodes 31 corresponding to the multicast address in the storage cluster 30 through the switch 20, so as to store the data to be stored.
As for the implementation of the message processing module 143, reference may be made to the related description of step S113 in fig. 6, which is not described herein again.
Optionally, the calculation module 141 calculates to obtain at least two storage nodes 31 corresponding to data to be stored in the storage cluster 30 by: acquiring topology information of the storage nodes 31 in the storage cluster 30 and storage configuration information in the storage cluster 30; and calculating to obtain at least two storage nodes 31 corresponding to the data to be stored in the storage cluster 30 according to the topology information and the storage configuration information.
Optionally, the address obtaining module 142 obtains the corresponding multicast address according to the at least two storage nodes 31 by: and inquiring whether a preset multicast address data table has multicast addresses corresponding to the at least two storage nodes 31, and if so, acquiring the multicast address.
Optionally, the address obtaining module 142 is further configured to, when there is no multicast address corresponding to the at least two storage nodes 31, perform the following steps: obtaining number groups corresponding to the at least two storage nodes 31 according to preset numbers of the storage nodes 31 in the storage cluster 30, wherein the numbers corresponding to the at least two storage nodes 31 in the number groups are sequentially arranged according to the size; acquiring a full-order relationship of numbering groups respectively corresponding to the storage nodes 31 with the same number as the at least two storage nodes 31 in the storage cluster 30; mapping a full-order relation of a preset multicast address segment and a full-order relation of a number group respectively corresponding to each storage node 31 with the same number as the at least two storage nodes 31 in the storage cluster 30 to obtain multicast addresses corresponding to the at least two storage nodes 31; adding multicast addresses corresponding to the at least two storage nodes 31 to the multicast address data table.
Referring to fig. 10 in combination, the present disclosure provides a data processing apparatus applied to a storage cluster 30 in communication with a computing node 10 through a switch 20, where the data processing apparatus includes an entry obtaining module 311, an information obtaining module 312, and an information sending module 313.
The table entry obtaining module 311 is configured to obtain an address resolution protocol ARP table from the switch 20.
For the implementation of the table entry obtaining module 311, reference may be made to the related description of step S131 in fig. 7, which is not described herein again.
The information obtaining module 312 is configured to obtain the IP address and the port information of each storage node 31 in the storage cluster 30 from the ARP table, and obtain the multicast address corresponding to at least two storage nodes 31 in the storage cluster 30.
As for the implementation of the information obtaining module 312, reference may be made to the related description of step S132 and step S133 in fig. 7, which is not described herein again.
The information sending module 313 is configured to send each multicast address and its corresponding IP address and port information to the switch 20.
As for the implementation of the information sending module 313, reference may be made to the related description of step S134 in fig. 7, which is not described herein again.
In the present disclosure, the implementation principle of the data processing apparatus is similar to that of the data processing method, and corresponding contents may refer to the foregoing method embodiment, and therefore, details are not described herein.
According to the data processing method and device, the multi-copy storage processing of the data to be stored is realized in the form of the multicast message, the network utilization rate of data transmission is optimized, the network forwarding times are reduced, the time delay is reduced, and the data processing performance is improved.
In the several embodiments provided in the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus and method embodiments described above are illustrative only, as the flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present disclosure may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, an electronic device, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing describes only optional embodiments of the present disclosure and is not intended to limit the disclosure; various modifications and variations may be made by those skilled in the art. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present disclosure shall fall within the protection scope of the present disclosure.

Claims (13)

1. A data processing method applied to a compute node in communication with a storage cluster through a switch, the method comprising:
calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
obtaining a corresponding multicast address according to the at least two storage nodes;
and encapsulating data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, sending the multicast message to the switch, and sending the multicast message to at least two storage nodes corresponding to the multicast address in the storage cluster through the switch so as to store the data to be stored.
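By way of illustration, the following Python sketch shows one possible realization of the compute-node flow in claim 1. The hash-based placement function, the two example entries in the multicast address data table, and the UDP port are assumptions chosen for the sketch; the claim itself does not prescribe them, and a real implementation would derive placement from cluster topology and storage configuration as in claim 2.

```python
import hashlib
import socket
import struct

def compute_storage_nodes(object_name, node_ids, replicas=3):
    # Hypothetical hash-based placement: pick `replicas` consecutive nodes
    # starting from a hash of the object name.
    start = int(hashlib.md5(object_name.encode()).hexdigest(), 16) % len(node_ids)
    return tuple(sorted(node_ids[(start + i) % len(node_ids)] for i in range(replicas)))

# Illustrative preset multicast address data table (only two example groups shown):
# node group -> multicast address.
MULTICAST_TABLE = {
    (1, 2, 3): "239.1.0.1",
    (1, 2, 4): "239.1.0.2",
}

def store_via_multicast(object_name, payload, node_ids, port=5000):
    nodes = compute_storage_nodes(object_name, node_ids)
    group = MULTICAST_TABLE[nodes]          # multicast address for this node group
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, struct.pack("b", 1))
    sock.sendto(payload, (group, port))     # the switch replicates one copy per member
    sock.close()
```

The single sendto() call is the point of the scheme: the compute node emits one message, and fan-out to the replica set happens in the switch rather than on the host network interface.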
2. The data processing method according to claim 1, wherein the step of calculating at least two storage nodes corresponding to the data to be stored in the storage cluster comprises:
acquiring topology information of storage nodes in the storage cluster and storage configuration information in the storage cluster;
and calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster according to the topology information and the storage configuration information.
3. The data processing method according to claim 1, wherein the step of obtaining the corresponding multicast address according to the at least two storage nodes comprises:
and querying whether a preset multicast address data table contains a multicast address corresponding to the at least two storage nodes, and if so, acquiring the multicast address.
4. The data processing method of claim 3, wherein the method further comprises:
if no multicast address corresponding to the at least two storage nodes exists, obtaining a number group corresponding to the at least two storage nodes according to the preset number of each storage node in the storage cluster, wherein the numbers corresponding to the at least two storage nodes in the number group are arranged in order of their values;
acquiring a total-order relation of the number groups respectively corresponding to each group of storage nodes in the storage cluster that contains the same number of storage nodes as the at least two storage nodes;
mapping a total-order relation of a preset multicast address segment to the total-order relation of the number groups corresponding to each group of storage nodes in the storage cluster that contains the same number of storage nodes as the at least two storage nodes, so as to obtain the multicast address corresponding to the at least two storage nodes;
and adding the multicast address corresponding to the at least two storage nodes to the multicast address data table.
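The mapping in claim 4 can be pictured as ranking every group of k node numbers in lexicographic (total) order and turning the rank into an offset within a preset multicast address segment. The sketch below assumes an administratively scoped IPv4 base address and a small cluster purely for illustration; the actual address segment and numbering scheme are left open by the claim.

```python
from itertools import combinations
import ipaddress

def build_multicast_table(node_numbers, k, base_group="239.1.0.1"):
    # combinations() over the ascending node numbers yields every k-node group
    # in lexicographic (total) order; the group's rank is added to the base address.
    base = int(ipaddress.IPv4Address(base_group))
    return {group: str(ipaddress.IPv4Address(base + rank))
            for rank, group in enumerate(combinations(sorted(node_numbers), k))}

table = build_multicast_table(range(1, 7), k=3)   # 6 nodes, 3 replicas -> C(6,3) = 20 entries
print(table[(1, 2, 4)])                           # 239.1.0.2
```

Because the order of the groups is fully determined by the node numbers, the compute node and the storage cluster can derive identical tables independently.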
5. The data processing method of claim 1, wherein the method further comprises:
and determining whether storage confirmation information sent by each of the at least two storage nodes has been received, and if the storage confirmation information sent by each of the at least two storage nodes has been received, determining that storage of the data to be stored is complete.
6. A data processing method applied to a storage cluster in communication with a compute node through a switch, the method comprising:
acquiring an Address Resolution Protocol (ARP) table from the switch;
obtaining the IP address and the port information of each storage node in the storage cluster from the ARP table;
acquiring multicast addresses corresponding to at least two storage nodes in the storage cluster;
and sending each multicast address, together with the IP addresses and port information corresponding to that multicast address, to the switch.
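A minimal sketch of the storage-cluster side of claim 6, assuming the ARP table has already been retrieved from the switch as (IP, MAC, port) entries and that some management channel, represented here by a hypothetical send_to_switch callback, is available for delivering the mapping; the entry format and the callback are assumptions, not a real switch API.

```python
def build_group_memberships(arp_table, node_ips, multicast_table):
    # arp_table: list of (ip, mac, port) entries obtained from the switch
    # node_ips: storage node number -> IP address of that node
    # multicast_table: (node number group) -> multicast address
    ip_to_port = {ip: port for ip, _mac, port in arp_table}
    return {group: [(node_ips[n], ip_to_port[node_ips[n]]) for n in nodes]
            for nodes, group in multicast_table.items()}

def publish_to_switch(memberships, send_to_switch):
    # deliver each multicast address with its member IP addresses and ports
    for group, members in memberships.items():
        send_to_switch({"group": group, "members": members})
```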
7. A data processing method is applied to a distributed storage system, wherein the distributed storage system comprises a computing node, a switch and a storage cluster, the storage cluster comprises a plurality of storage nodes, and the method comprises the following steps:
the computing node encapsulates data to be stored in a multicast message and sends the multicast message to the switch, wherein the multicast address of the multicast message corresponds to at least two storage nodes in the storage cluster;
the switch receives the multicast message sent by the computing node, looks up at least two destination ports according to the multicast address in the multicast message, replicates the multicast message, and sends the replicated multicast messages to the corresponding storage nodes through the at least two destination ports respectively;
and after receiving the multicast message, the storage node stores the data to be stored in the multicast message.
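For the switch-side step of claim 7, the replication amounts to a lookup in a multicast forwarding table followed by one copy of the message per destination port. The sketch below is a software stand-in for what a switching chip performs in hardware; the table layout and the transmit function are illustrative assumptions.

```python
def forward_multicast(packet, multicast_fib, transmit):
    # multicast_fib: multicast destination address -> list of egress ports
    for port in multicast_fib.get(packet["dst"], []):
        transmit(port, dict(packet))    # one replicated copy per destination port
```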
8. The data processing method according to claim 7, wherein a multicast address data table is preset in the computing node, and the step in which the computing node encapsulates the data to be stored in the multicast message and sends the multicast message to the switch comprises:
calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
querying whether the preset multicast address data table contains a multicast address corresponding to the at least two storage nodes, and if so, obtaining the multicast address;
encapsulating data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, and sending the multicast message to the switch;
wherein multicast addresses corresponding to at least two storage nodes in the storage cluster are established in the multicast address data table according to the maximum number of storage nodes that the storage cluster can contain; during capacity expansion or capacity reduction of the storage nodes in the storage cluster, corresponding information is added to or deleted from the multicast address data table, and the capacity expansion or capacity reduction result of the storage nodes is updated to the switch.
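One way to maintain such a table is sketched below, under the assumption that it is pre-built for the maximum cluster size and then filtered to the nodes actually present. The add_node and remove_node methods model capacity expansion and reduction, and a hypothetical notify_switch callback stands in for updating the switch; none of these names come from the claims.

```python
class MulticastAddressTable:
    def __init__(self, max_nodes, replicas, build_table, notify_switch):
        # build_table: e.g. the build_multicast_table() sketch shown after claim 4
        self.full_table = build_table(range(1, max_nodes + 1), replicas)
        self.active = set()
        self.notify_switch = notify_switch

    def entries(self):
        # only groups whose members are all present in the cluster are usable
        return {group: addr for group, addr in self.full_table.items()
                if set(group) <= self.active}

    def add_node(self, node):       # capacity expansion (scale-out)
        self.active.add(node)
        self.notify_switch("expand", node, self.entries())

    def remove_node(self, node):    # capacity reduction (scale-in)
        self.active.discard(node)
        self.notify_switch("reduce", node, self.entries())
```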
9. A data processing apparatus applied to a compute node in communication with a storage cluster through a switch, the data processing apparatus comprising:
the computing module is used for computing to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster;
an address obtaining module, configured to obtain a corresponding multicast address according to the at least two storage nodes;
and the message processing module is used for encapsulating the data to be stored in a multicast message, taking the multicast address as a destination address of the multicast message, sending the multicast message to the switch, and sending the multicast message to at least two storage nodes corresponding to the multicast address in the storage cluster through the switch so as to store the data to be stored.
10. The data processing apparatus according to claim 9, wherein the calculation module calculates at least two storage nodes corresponding to data to be stored in the storage cluster by:
acquiring topology information of storage nodes in the storage cluster and storage configuration information in the storage cluster;
and calculating to obtain at least two storage nodes corresponding to the data to be stored in the storage cluster according to the topology information and the storage configuration information.
11. The data processing apparatus according to claim 9, wherein the address obtaining module obtains the corresponding multicast address from the at least two storage nodes by:
and querying whether a preset multicast address data table contains a multicast address corresponding to the at least two storage nodes, and if so, acquiring the multicast address.
12. The data processing apparatus according to claim 11, wherein the address obtaining module is further configured to, when there is no multicast address corresponding to the at least two storage nodes, perform the following steps:
obtaining a number group corresponding to the at least two storage nodes according to the preset number of each storage node in the storage cluster, wherein the numbers corresponding to the at least two storage nodes in the number group are arranged in order of their values;
acquiring a total-order relation of the number groups respectively corresponding to each group of storage nodes in the storage cluster that contains the same number of storage nodes as the at least two storage nodes;
mapping a total-order relation of a preset multicast address segment to the total-order relation of the number groups corresponding to each group of storage nodes in the storage cluster that contains the same number of storage nodes as the at least two storage nodes, so as to obtain the multicast address corresponding to the at least two storage nodes;
and adding the multicast address corresponding to the at least two storage nodes to the multicast address data table.
13. A data processing apparatus applied to a storage cluster communicating with a compute node through a switch, the data processing apparatus comprising:
the table item acquisition module is used for acquiring an Address Resolution Protocol (ARP) table from the switch;
the information acquisition module is used for acquiring the IP address and the port information of each storage node in the storage cluster from the ARP table and acquiring the multicast address corresponding to at least two storage nodes in the storage cluster;
and the information sending module is used for sending each multicast address and the IP address and the port information corresponding to the multicast address to the switch.
CN201811283336.8A 2018-10-30 2018-10-30 Data processing method and device Active CN109462642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811283336.8A CN109462642B (en) 2018-10-30 2018-10-30 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811283336.8A CN109462642B (en) 2018-10-30 2018-10-30 Data processing method and device

Publications (2)

Publication Number Publication Date
CN109462642A (en) 2019-03-12
CN109462642B (en) 2021-06-08

Family

ID=65609009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811283336.8A Active CN109462642B (en) 2018-10-30 2018-10-30 Data processing method and device

Country Status (1)

Country Link
CN (1) CN109462642B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111026720B (en) * 2019-12-20 2023-05-12 深信服科技股份有限公司 File processing method, system and related equipment

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101521631A (en) * 2009-04-14 2009-09-02 华为技术有限公司 Treatment method, equipment and system for VPLS network messages
CN102377646A (en) * 2010-08-12 2012-03-14 盛科网络(苏州)有限公司 Forwarding chip, network switching system and multicast implementation method
CN102143215A (en) * 2011-01-20 2011-08-03 中国人民解放军理工大学 Network-based PB level cloud storage system and processing method thereof
CN104871483A (en) * 2012-10-10 2015-08-26 瑞典爱立信有限公司 IP multicast service join process for MPLS-based virtual private cloud networking
CN104919760A (en) * 2012-11-12 2015-09-16 阿尔卡特朗讯公司 Virtual chassis system control protocols
CN103841037A (en) * 2012-11-21 2014-06-04 华为技术有限公司 Multicast packet forwarding method and devices
CN105827451A (en) * 2016-04-12 2016-08-03 浙江宇视科技有限公司 Method and device for automatically configuring whole network controllable multicast

Also Published As

Publication number Publication date
CN109462642A (en) 2019-03-12

Similar Documents

Publication Publication Date Title
US11354039B2 (en) Tenant-level sharding of disks with tenant-specific storage modules to enable policies per tenant in a distributed storage system
JP4806203B2 (en) Routing in peer-to-peer networks
US9489827B2 (en) System and method for distributing content in a video surveillance network
US20150312335A1 (en) Peer-to-peer architecture for processing big data
US8924513B2 (en) Storage system
WO2010138668A2 (en) Swarm-based synchronization over a network of object stores
CN108173691B (en) Cross-device aggregation method and device
US20150234846A1 (en) Partitioning file system namespace
US10728335B2 (en) Data processing method, storage system, and switching device
US10067719B1 (en) Methods and systems for storing and accessing data in a distributed data storage system
CN110825698B (en) Metadata management method and related device
WO2015196686A1 (en) Data storage method and data storage management server
CN109918021B (en) Data processing method and device
US9451024B2 (en) Self-organizing disk (SoD)
US20160259836A1 (en) Parallel asynchronous data replication
Shen et al. A proximity-aware interest-clustered P2P file sharing system
CN109462642B (en) Data processing method and device
KR20130118088A (en) Distributed file system having multi mds architecture and method for processing data using the same
Yu et al. Granary: A sharing oriented distributed storage system
WO2020119699A1 (en) Resource publishing method and apparatus in internet of things, device, and storage medium
CN110110004B (en) Data operation method, device and storage medium
US10996864B1 (en) Aggregating ALUA statuses from multiple arrays
JP6036302B2 (en) Information processing apparatus, information processing system, information processing method, and information processing program
CN117176796A (en) Message pushing method, device, computer equipment and storage medium
CN106155574B (en) Method and device for constructing expandable storage device and expanded storage device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant