CN113127267B - Strong-consistency multi-copy data access response method in distributed storage environment

Info

Publication number: CN113127267B
Application number: CN202110488820.XA
Authority: CN
Inventors: 孙胜耀; 李华英; 杨颖辉; 王仙吉; 张少辉
Original assignee: Zhengzhou Normal University
Current assignee: Zhengzhou Normal University
Priority date: 2021-04-30
Filing date: 2021-04-30
Publication date: 2023-02-17
Anticipated expiration: 2041-04-30
Also published as: CN113127267A

Abstract

A strong-consistency multi-copy data access response method in a distributed storage environment. Based on the service capability and activity of nodes in a dynamic distributed storage environment and on the user's expectation for the accessed data, the method selects a number of replica nodes with strong capability and high activity to respond. This replaces the traditional response mode in which all replicas participate without distinction, and thereby reduces the access waiting time for multi-copy data and the communication overhead of the system. First, the service capability of each node is represented by a credit currency value: the characteristics of the node that affect data access are evaluated for their influence on data access capability, and the combined influence is marked by issuing credit currency. Next, the activity of each node is measured and marked by its circulating credit currency value, based on the services the node provides and obtains per unit time. Finally, the copies that need to participate in the response are determined from the node activity, the node credit currency value, and the user's expectation of data freshness.

Description

Strong-consistency multi-copy data access response method in distributed storage environment

Technical Field

The invention relates to a distributed storage technology, in particular to a strong-consistency multi-copy data access response method in a dynamic distributed storage environment.

Background

Distributed computing and distributed storage are now widely used in real-world applications. These applications typically use multi-copy (replication) techniques to improve system availability and data access efficiency. System performance is closely tied to the number of copies: in general, the more copies there are, the higher the availability, scalability, and access efficiency of the system. However, although more copies can improve performance, they also add maintenance overhead. For example, when one copy is updated, all other copies of the same data must also be updated promptly, otherwise data inconsistency degrades the availability of the system. Clearly, the more copies there are, the higher the cost of consistency maintenance.

To deal with the multi-copy consistency problem, a large number of consistency maintenance schemes have been proposed. These schemes generally fall into three types: push mode, polling mode, and hybrid mode. In a push-based scheme, update messages are pushed to the other copies in some form, such as a "heartbeat"; in a polling-based scheme, a replica node actively polls a service node to obtain updates; and the hybrid mode combines pushing and polling to bring the copies up to date. Although these schemes can effectively guarantee the consistency of the accessed data, in a dynamic distributed storage environment the freshness of the accessed data cannot be fully guaranteed, because nodes and data change frequently.

To address strong consistency of accessed data in a dynamic distributed environment, a distributed system applies some strategy to how the copies respond. In a distributed database, for example, all participants in an atomic transaction are typically managed in a 2PC (two-phase commit) fashion to commit or abort the transaction: when data is accessed, the correct data access service is provided if and only if all replica nodes agree to commit the data access request; otherwise the access transaction is aborted. Although 2PC effectively ensures the correctness of the data read, the response is made only after every replica node has confirmed that there is no error, which undoubtedly increases the communication overhead of the system. To better balance system performance, distributed storage systems in practical use today (such as Taobao's OceanBase database and Microsoft's SQL Server database) adopt Paxos-based solutions to meet the requirement of strong consistency of data access. Such a scheme can agree to serve a request by committing only the value agreed upon by a majority of copies, without waiting for every copy to respond. When Paxos responds to a data access, the replica nodes first go through multiple rounds of negotiation and then take the value adopted by the majority of nodes as the response data. In this mode all replica nodes participate in the data response service on an equal footing, and the data communication overhead is effectively reduced. However, this approach does not consider whether a subset of nodes could negotiate in place of all nodes. If the availability of cloud storage is not affected, having a subset of nodes serve in place of all nodes can further reduce the communication overhead of the system.
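To make the contrast between the two response modes concrete, the following minimal sketch counts how many replicas must answer before a request can be served. The function name `responders_required` and the scheme labels are illustrative assumptions, not part of the patent, and the sketch deliberately ignores the multiple negotiation rounds that Paxos itself requires.

```python
def responders_required(n_replicas: int, scheme: str) -> int:
    """Minimum number of replicas that must answer before a request is served."""
    if n_replicas < 1:
        raise ValueError("need at least one replica")
    if scheme == "2pc":
        # Two-phase commit: every participant must agree.
        return n_replicas
    if scheme == "majority":
        # Majority-based agreement: a strict majority is enough.
        return n_replicas // 2 + 1
    raise ValueError(f"unknown scheme: {scheme}")


for n in (3, 5, 7):
    print(n, responders_required(n, "2pc"), responders_required(n, "majority"))
```

The method described in this document aims to go below the majority count by letting only the replicas most likely to hold the newest copy respond.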

In fact, no distributed storage system, and dynamic distributed storage in particular, can guarantee that accessed data is 100% consistent. Whenever users access data, they expect that the data they obtain is not erroneous; that is, a user accesses data with a certain expectation, for example 99.99999999%, that it is the correct data. Therefore, to respond effectively to the user's strong-consistency expectation for accessed data in a distributed application environment, it is necessary to design a strong-consistency multi-copy data access response method that reduces the communication overhead of the system while still meeting the user's expectation for the accessed data.

Disclosure of Invention

To address the shortcomings of the prior art, the invention provides a strong-consistency multi-copy data access response method in a distributed storage environment. It designs an on-demand data response scheme in which, according to the user's expectation of the freshness of the accessed data in a dynamic distributed storage environment, only some of the replica nodes participate in the response service, thereby reducing the access waiting time for multi-copy data and the communication overhead of the system.

The main technical concept of the technical solution of the invention is as follows:

According to the service capability and activity of the nodes in the dynamic distributed storage environment and the user's expectation for the accessed data, a number of replica nodes with strong capability and high activity are selected to respond. This replaces the traditional response mode in which all replicas participate without distinction, and achieves the aims of reducing the access waiting time for multi-copy data and reducing the communication overhead of the system.

The invention consists of three main parts: characterizing the service capability of a node with credit currency, measuring the activity of a node with the currency transaction amount, and determining the copies that need to participate in a response. First, based on the characteristics of a node that affect data access, the influence of these characteristics on data access capability is evaluated with a model similar to the multivariate discriminant z-score model, and the combined influence is marked by issuing credit currency. Next, based on the services that the node provides and obtains per unit time, the activity of the node is marked by its circulating credit currency value. Finally, the copies that need to participate in the response are determined from the node activity, the node credit currency, and the user's expectation of data freshness.

The idea of using credit currency to represent the service capability of the node is as follows:

First, the factors through which nodes in a dynamic distributed environment affect data access are evaluated objectively using a method similar to the multivariate discriminant z-score. The degree to which each factor influences data access is distinguished by a weight. The "Zeta" value of the node is then obtained with the multivariate discriminant z-score method and is marked in the form of issued currency.

The idea of measuring the node liveness by using the currency transaction amount is as follows:

If a node successfully provides a data access service to other nodes, it charges those nodes credit currency; if a node successfully requests a data access service from other nodes, it pays credit currency to those nodes. The invention calls the sum of the two the circulating credit currency (liquidity) of the node and uses it to indicate node activity.

The idea of determining the copies that participate in the response is as follows:

First, the invention obtains all replica nodes of the data to be accessed and, from the credit activity of each node, derives the relationship between credit activity and the probability that a copy is the newest copy. It then obtains the number of copies that need to participate in the data response from the user's expectation of the freshness of the accessed data and a probability formula. The probability that the copy on a node receives updates in time is closely related, and directly proportional, to the node's data access capability and its activity. To express this proportional relationship, the product of a node's credit currency and its activity is called its credit activity. When the responding replica nodes are selected, the set of replica nodes is sorted by credit activity; then, according to the number of copies required, the copies on the nodes with the highest credit activity are selected from the sorted set to participate in the data response.

The invention has the beneficial effects that:

1. The invention provides a selective response method for multi-copy data access in a dynamic distributed storage environment, which reduces the access waiting time for multi-copy data and the communication overhead of the system while still meeting the user's expectation for the accessed data. In a dynamic distributed storage environment, an on-demand response scheme is adopted according to the user's expectation of the freshness of the accessed data, and only some of the replica nodes participate in the response service. This avoids the waiting time incurred when every copy must participate in a data access under strong consistency, and reduces the communication overhead of the system.

2. The invention represents the service capability of a node in the form of credit currency. The service capability of the node is evaluated from its multiple attributes using a multivariate discriminant z-score method, and the service willingness of the node is expressed intuitively by its credit currency value.

3. The invention uses circulating funds (liquidity) to characterize how actively a node participates in services. Depending on whether the node provides or requests services, its service activity is represented intuitively by the amount of circulating credit currency, which in turn measures the probability that the copy on the node is the newest copy.

4. The invention obtains the number of copies to be accessed and the nodes that participate in the response from the relationship between credit activity and the probability that the copy on a node is the newest copy. Based on this relationship, nodes with high service capability and high activity are selected to participate in the response service, which reduces the communication overhead of the system.

Drawings

FIG. 1 is a general flow chart of a multi-copy data access response method according to the present invention;

FIG. 2 is a flow chart of characterizing the service capability of a node using credit currency;

FIG. 3 is a flow chart of measuring node activity using the currency transaction amount;

FIG. 4 is a flow chart of determining the copies that need to participate in a response.

Detailed Description

The technical solution of the present invention is further described in detail below by means of specific embodiments and with reference to the accompanying drawings.

Example 1

Referring to FIG. 1, the strong-consistency multi-copy data access response method in a distributed storage environment of the invention selects a number of replica nodes with strong capability and high activity to respond, according to the service capability and activity of the nodes in the dynamic distributed storage environment and the user's expectation for the accessed data. This replaces the traditional response mode in which all replicas participate without distinction, and reduces the access waiting time for multi-copy data and the communication overhead of the system. The method includes the following steps:

first, the service capability of each node is represented by a credit currency value: the characteristics of the node that affect data access are evaluated for their influence on data access capability, and the combined influence is marked by issuing credit currency;

next, based on the services that the node provides and obtains per unit time, the activity of the node is measured and marked by its circulating credit currency value (the currency transaction amount);

finally, the copies that need to participate in the response are determined from the node activity, the node credit currency value, and the user's expectation of data freshness.

Example 2

The strong-consistency multi-copy data access response method of this embodiment differs from embodiment 1 in the following. Referring to FIG. 2, the factors through which nodes in the dynamic distributed environment affect data access are evaluated objectively using a model similar to the multivariate discriminant z-score model; the degree to which each factor influences data access is distinguished by a weight; the "Zeta" value of the node is then obtained with the multivariate discriminant z-score method and marked in the form of issued currency, so that the service capability of the node is represented by a credit currency value.

Example 3

The strong-consistency multi-copy data access response method of this embodiment differs from embodiments 1 and 2 in that, referring to the flow chart in FIG. 3, node activity is measured by the currency transaction amount: if a node successfully provides a data access service to other nodes, it charges those nodes credit currency; if a node successfully requests a data access service from other nodes, it pays credit currency to those nodes; the sum of the two is called the circulating credit currency (liquidity) of the node and is used to indicate node activity.

Example 4

The strong-consistency multi-copy data access response method of this embodiment differs from embodiment 3 in the steps by which the copies participating in the response are determined from the node activity, the node credit currency value, and the user's expectation of data freshness.

Referring to FIG. 4, the copies that need to participate in the response are determined as follows:

first, all replica nodes of the data to be accessed are obtained, and the relationship between credit activity and the probability that a copy is the newest copy is derived from the credit activity of each node;

then, the number of copies that need to participate in the data response is obtained from the user's expectation of the freshness of the accessed data and a probability formula.

Example 5

The strong-consistency multi-copy data access response method of this embodiment differs from embodiment 4 in the following. The probability that the copy on a node is updated in time is closely related, and directly proportional, to the node's data access capability and its activity. To express this proportional relationship, the product of the node credit currency and the node activity is called the credit activity. When the responding replica nodes are selected, the set of replica nodes is sorted by credit activity; then, according to the number of copies required, the copies on the nodes with the highest credit activity are selected from the sorted set to participate in the data response.

Example 6

Referring to FIGS. 1-4, the strong-consistency multi-copy data access response method in a distributed storage environment of the invention selects a number of replica nodes with strong capability and high activity to respond, according to the service capability and activity of the nodes in the dynamic distributed storage environment and the user's expectation for the accessed data. This replaces the traditional response mode in which all replicas participate without distinction, so as to reduce the access waiting time for multi-copy data and the communication overhead of the system.

Referring to FIG. 1, the invention consists of three main parts: characterizing the service capability of a node with credit currency, measuring the activity of a node with the currency transaction amount, and determining the copies that need to participate in a response. First, based on the characteristics of a node that affect data access, the influence of these characteristics on data access capability is evaluated with a model similar to the multivariate discriminant z-score model, and the combined influence is marked by issuing credit currency. Next, based on the services that the node provides and obtains per unit time, the activity of the node is marked by its circulating credit currency value. Finally, the copies that need to participate in the response are determined from the node activity, the node credit currency, and the user's expectation of data freshness.

Some parameters involved in the present invention are:

The nodes in the dynamic distributed storage are denoted $\{Pn_1, Pn_2, Pn_3, \ldots, Pn_N\}$.

1. Time period $U_T$: a user-defined time constant that represents one periodic time unit, for example 1 minute.

2. Node available computing power $Cu_i$: the computing resources available to node $Pn_i$ in unit time $T_i$, expressed as:

[formula not reproduced in this text]

where $Cu_{i,used}$ is the computing resource of $Pn_i$ already in use and $Cu_{i,all}$ is the total computing resource of $Pn_i$. The larger $Cu_i$ is, the more capable the node is of handling data accesses.

3. Node available storage capacity $Ns_i$: the storage resources available to node $Pn_i$ in unit time $T_i$, expressed as:

[formula not reproduced in this text]

where $Ns_{i,used}$ is the storage resource of $Pn_i$ already in use and $Ns_{i,all}$ is the total storage resource of $Pn_i$. The larger $Ns_i$ is, the greater the node's data storage capacity and the more data can be stored on the node.

4. Node available bandwidth $Nw_i$: the bandwidth resources available to node $Pn_i$ in unit time $T_i$, expressed as:

[formula not reproduced in this text]

where $Nw_{i,used}$ is the bandwidth of $Pn_i$ already occupied and $Nw_{i,all}$ is the total bandwidth of $Pn_i$. The larger $Nw_i$ is, the stronger the node's communication capability and the shorter the latency when processing data accesses.

5. Node load $Lq_i$: the load of node $Pn_i$ in unit time $T_i$, expressed as:

[formula not reproduced in this text]

where $V_{i,f}$ is the volume of requests forwarded through $Pn_i$ per unit time; $f_{j,c}$ is the number of requests per unit time for data $f_j$ on $Pn_i$; $M$ is the number of data items on $Pn_i$; $\sum_{j=1}^{M} f_{j,c}$ is the total number of requests per unit time for all data on $Pn_i$; $\eta$ ($\eta \ge 1$) is a weight expressing that the node's load is influenced more strongly by accesses to its local copies; and $V_{i,max}$ is the maximum number of requests that $Pn_i$ can normally respond to per unit time. The larger $Lq_i$ is, the less able the node is to handle data accesses in a timely manner.

6. Active neighbor ratio $An_i$: the activity coefficient of node $Pn_i$ with respect to the nodes connected to it in unit time $T_i$, expressed as:

[formula not reproduced in this text]

where $N_{conn}$ is the number of neighbor nodes of $Pn_i$ in unit time $T_i$ (that is, the number of nodes directly connected to it), and $N_{av}$ is the average number of directly connected nodes per node in the P2P network environment. The larger $An_i$ is, the more data access service the node is involved in.

7. Average data request delay rate $Da_i$: the average delay of node $Pn_i$ when providing data access service, expressed as:

[formula not reproduced in this text]

where $t_j$ is the access delay of request task $j$, $K$ is the number of request tasks, and $Da_{av}$ is the average delay of all nodes in the P2P network. Clearly, the smaller $Da_i$ is, the more promptly requests for data on the node are answered.

8. Node continuous service duration $St_i$: the length of time for which node $Pn_i$ has continuously served in the P2P network, expressed as:

$$St_i = (n + 1) \times U_T, \quad n \in \mathbb{N}$$

where $n$ is a natural number that is incremented by 1 each time a period $U_T$ elapses. When $Pn_i$ rejoins the P2P system after leaving, $St_i = 0$. The larger $St_i$ is, the longer the node has continuously provided data access service.

9. User data freshness expectation $Fe_f$: the user's expectation, when requesting data $f$, that the data obtained is the newest data. The value is a probability, for example $Fe_f = 99.9999\%$.

The idea of using credit currency to represent the service capability of a node:

Referring to FIG. 2, the steps for characterizing the service capability of a node with credit currency are as follows:

1. Each node adaptively acquires the attributes that affect data access.

The attributes evaluated by the invention are: node available computing power $Cu_i$, node available storage capacity $Ns_i$, node available bandwidth $Nw_i$, node load $Lq_i$, active neighbor ratio $An_i$, average data request delay rate $Da_i$, and node continuous service duration $St_i$.

2. The attributes are evaluated objectively according to their influence on the data access service.

The invention scores the attributes using a method similar to the multivariate discriminant z-score; the scoring rules are as follows:

[scoring table not reproduced in this text]

Note: each score value is only one scoring example set by the invention; in practical applications, other scoring methods (such as expert valuation) may be used according to the actual situation.

3. Weights are assigned to the attributes under evaluation.

Following the z-score method, a different weight is assigned to each of the above attributes to distinguish how strongly it influences data access; the weights are assigned as follows:

[weight table not reproduced in this text]

4. The "Zeta" value $Zeta_i$ of each node is obtained with the multivariate discriminant z-score method. The calculation formula is as follows:

[formula not reproduced in this text]

5. The service capability of the node is expressed in credit currency. Let $Cm_i$ denote the credit currency of node $Pn_i$; it is issued according to the following rule:

$$Cm_i = \beta \times Zeta_i, \quad \beta > 0 \tag{3}$$

where $\beta$ is a user-defined constant that can be adjusted according to the actual situation; in the invention, $\beta = 1$. According to formula (3), the larger the $Cm_i$ of node $Pn_i$, the stronger the service willingness of the node, and vice versa.

Note: each node computes its own $Zeta_i$ value from its attribute values in a decentralized, adaptive manner, evaluating itself once in each period $U_T$ and then issuing credit currency to itself according to the result. Issued currency does not accumulate: the currency issued in period $U_T$ is not carried over to period $U_{T+1}$. At each issuance, the currency from the previous period is cleared before new currency is issued.
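Steps 1 to 5 can be illustrated with a minimal sketch. Because the scoring table, the weight table, and the $Zeta_i$ formula are not reproduced in this text, the sketch assumes that $Zeta_i$ is a weighted linear combination of per-attribute scores, which is the usual form of a z-score discriminant model. The function names, score values, and weights below are hypothetical; only $Cm_i = \beta \times Zeta_i$ with $\beta > 0$ comes from formula (3).

```python
from typing import Mapping

# The seven attributes named in step 1.
ATTRIBUTES = ("Cu", "Ns", "Nw", "Lq", "An", "Da", "St")


def zeta(scores: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Weighted sum of attribute scores (ASSUMED form of the Zeta formula)."""
    missing = [a for a in ATTRIBUTES if a not in scores or a not in weights]
    if missing:
        raise KeyError(f"missing attributes: {missing}")
    return sum(weights[a] * scores[a] for a in ATTRIBUTES)


def issue_credit(zeta_value: float, beta: float = 1.0) -> float:
    """Formula (3): Cm_i = beta * Zeta_i, with beta > 0."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return beta * zeta_value


# Hypothetical scores and weights; the patent's own tables are not available here.
scores = {"Cu": 4, "Ns": 3, "Nw": 5, "Lq": 2, "An": 3, "Da": 4, "St": 1}
weights = {"Cu": 0.2, "Ns": 0.1, "Nw": 0.2, "Lq": 0.15,
           "An": 0.1, "Da": 0.15, "St": 0.1}

# Recomputed from scratch in every period U_T, so nothing accumulates.
cm = issue_credit(zeta(scores, weights))
print(round(cm, 2))
```

Recomputing `cm` from the current scores in each period, rather than adding to a running balance, mirrors the rule that issued currency is cleared before each new issuance.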

The idea of measuring node activity by the currency transaction amount:

Referring to FIG. 3, the steps for measuring node activity by the currency transaction amount are as follows:

6. Earn credit currency.

When node $Pn_i$ provides a data access service to other nodes (here this covers not only access to copies stored on the node but also data requests forwarded by the node), it charges the node requesting the service $\delta_i$ units of credit currency. Let $ECm_i$ denote the credit currency earned; each time $Pn_i$ earns credit currency, it is updated as follows:

$$ECm_i = ECm_i + \delta_i, \quad \delta_i \ge 1 \tag{4}$$

where $\delta_i$ is a constant, for example $\delta_i = 1$; the node collects this amount from the other party according to the actual service provided.

7. Pay credit currency.

When node $Pn_i$ requests data access from other nodes (here this covers not only copies that the node itself needs to access but also data that the node needs to forward), it pays the node providing the service $\varepsilon_i$ units of credit currency. Let $CCm_i$ denote the credit currency paid; each time $Pn_i$ pays credit currency, it is updated as follows:

$$CCm_i = CCm_i + \varepsilon_i, \quad \varepsilon_i \ge 1 \tag{5}$$

where $\varepsilon_i$ is a constant, for example $\varepsilon_i = 1$; the node pays $\varepsilon_i$ units of currency to the other party according to the actual service received.

8. Calculate the circulating funds (liquidity) of the node.

The invention calls the sum of the currency earned and the currency paid the circulating credit currency of the node. Let $Wf_i$ denote the liquidity of node $Pn_i$; according to formulas (4) and (5), the liquidity formula is:

$$Wf_i = |ECm_i| + |CCm_i| \tag{6}$$

According to formula (6), the larger the $Wf_i$ of node $Pn_i$, the more services the node participates in and the higher the probability that the copy on the node is the newest copy.
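The bookkeeping in steps 6 to 8 can be sketched as a small per-node ledger. The class name `CreditLedger` and its method names are hypothetical; the update rules follow formulas (4), (5), and (6) as stated in the text.

```python
from dataclasses import dataclass


@dataclass
class CreditLedger:
    """Credit flow of one node Pn_i (hypothetical helper, not from the patent)."""

    earned: float = 0.0  # ECm_i, updated by formula (4)
    paid: float = 0.0    # CCm_i, updated by formula (5)

    def earn(self, delta: float = 1.0) -> None:
        """The node served or forwarded a request: ECm_i = ECm_i + delta."""
        if delta < 1:
            raise ValueError("delta must be >= 1")
        self.earned += delta

    def pay(self, epsilon: float = 1.0) -> None:
        """The node requested a service: CCm_i = CCm_i + epsilon."""
        if epsilon < 1:
            raise ValueError("epsilon must be >= 1")
        self.paid += epsilon

    @property
    def liquidity(self) -> float:
        """Formula (6): Wf_i = |ECm_i| + |CCm_i|."""
        return abs(self.earned) + abs(self.paid)


ledger = CreditLedger()
ledger.earn()      # served one request
ledger.earn(2.0)   # served a more expensive request
ledger.pay()       # requested data from another node
print(ledger.liquidity)
```

Both earning and paying raise the liquidity, so a node that only consumes services still counts as active under this measure.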

9. Calculate the credit activity of the node.

Let $NCA_i$ denote the credit activity of node $Pn_i$. As the product of the node's credit currency and its activity, it is calculated by the following formula:

$$NCA_i = Cm_i \times Wf_i \tag{7}$$

Note: because of the nature of dynamic distributed storage, in this step each node adaptively computes its own circulating funds and its own credit activity in each period.
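As a small worked example of step 9, the sketch below computes the credit activity as the product of a node's credit currency and its liquidity, following the definition of credit activity given in this document. The node names and numeric values are hypothetical.

```python
def credit_activity(cm: float, wf: float) -> float:
    """Credit activity as the product Cm_i * Wf_i."""
    return cm * wf


# Hypothetical per-period values (Cm_i, Wf_i) for three nodes.
nodes = {"Pn_1": (3.4, 12.0), "Pn_2": (2.0, 30.0), "Pn_3": (4.1, 0.0)}

nca = {name: credit_activity(cm, wf) for name, (cm, wf) in nodes.items()}
for name, value in nca.items():
    print(name, value)
```

In this example the most capable node, `Pn_3`, has zero credit activity because it took part in no transactions during the period, which shows how the product form penalizes idle nodes.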

The idea of determining the copies that participate in the response:

First, the invention obtains all replica nodes of the data to be accessed and, from the credit activity of each node, derives the relationship between credit activity and the probability that a copy is the newest copy. It then obtains the number of copies that need to participate in the data response from the user's expectation of the freshness of the accessed data and a probability formula. The probability that the copy on a node receives updates in time is closely related, and directly proportional, to the node's data access capability and its activity. To express this proportional relationship, the product of a node's credit currency and its activity is called its credit activity. When the responding replica nodes are selected, the set of replica nodes is sorted by credit activity; then, according to the number of copies required, the copies on the nodes with the highest credit activity are selected from the sorted set to participate in the data response.

Referring to FIG. 4, the steps for determining the copies that participate in the response are as follows:

10. Determine the set of all copies of the requested data $f$. Assuming that the number of copies of $f$ is $n$, and letting $Re_f$ denote the replica set:

$$Re_f = \{Re_{f,1}, Re_{f,2}, Re_{f,3}, \ldots, Re_{f,n}\}$$

11. Obtain the set of nodes on which the copies in $Re_f$ reside. Assuming that the copies are stored on different nodes, and letting $Pn_f$ denote this node set:

$$Pn_f = \{Pn_{f,1}, Pn_{f,2}, Pn_{f,3}, \ldots, Pn_{f,n}\}$$

12. Obtain the credit activity of each node in $Pn_f$. Letting $NCA_f$ denote the set of node credit activities for this node set:

$$NCA_f = \{NCA_{f,1}, NCA_{f,2}, NCA_{f,3}, \ldots, NCA_{f,n}\}$$

13. Calculate the probability that each copy of data $f$ is the newest copy.

According to formulas (3) and (6), the larger the $NCA_i$ of node $Pn_i$, the stronger the node's service willingness and the more frequently it participates in services. Strong service willingness means that the node can provide more data access service to other nodes; high activity means a higher probability that the copy on the node is the newest copy. That is, the credit activity of a node is directly proportional to the probability that the copy on it is up to date. Letting $P_i$ denote the freshness probability of a copy, the relationship between the probability that a copy is the newest copy and the credit activity can be expressed by the following formula:

$$f(P_i) = \alpha \times NCA_{f,i}, \quad \alpha > 0 \tag{8}$$

where $\alpha$ is a constant expressing that the freshness probability and the node credit activity differ by a constant factor. In the invention, $\alpha = 1$; that is, the probability that a copy is the newest copy can be replaced by the ratio of the node's credit activity within the replica node set. The probability that each copy of data $f$ is the newest copy can then be expressed by formula (9):

[formula (9) not reproduced in this text]
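Formula (9) itself is not available in this text. One possible reading of "the ratio of the node's credit activity within the replica node set" is each node's share of the total credit activity, and the sketch below implements only that assumed reading. The function name `freshness_probabilities` and the sample values are hypothetical.

```python
from typing import List, Sequence


def freshness_probabilities(nca: Sequence[float]) -> List[float]:
    """Each node's share of total credit activity (ASSUMED reading of formula (9))."""
    if any(x < 0 for x in nca):
        raise ValueError("credit activity must be non-negative")
    total = sum(nca)
    if total <= 0:
        raise ValueError("at least one node must have positive credit activity")
    return [x / total for x in nca]


# Hypothetical credit activities NCA_{f,1..3} of the nodes holding copies of f.
probs = freshness_probabilities([60.0, 30.0, 10.0])
print(probs)
```

Under this reading the probabilities sum to 1 across the replica set; a different normalization, for example dividing by the largest credit activity, would give different values, so the choice should be checked against the original formula.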

14. Calculate how many copies are needed to meet the user's expectation of data freshness.

Treating the event that the copy on a node is the newest copy as an independent event, the probability that the newest copy can be obtained from $k'$ given copies is expressed by formula (10):

[formula (10) not reproduced in this text]

That is, when access to data $f$ is requested, having $k'$ copies respond yields a data freshness of $Fe'_f$. Therefore, if the user's expectation of the freshness of the accessed data is $Fe_f$, then according to formula (10) the number of copies that need to respond can be obtained from formula (11):

[formula (11) not reproduced in this text]
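Since formulas (10) and (11) are not available in this text, the following sketch shows one way step 14 could work under the stated independence assumption: the chance that at least one of the first $k'$ copies is the newest is taken as $1 - \prod_{i=1}^{k'} (1 - P_i)$, and the smallest $k'$ that reaches $Fe_f$ is returned. This formula, the function name `copies_needed`, and the sample probabilities are assumptions, not the patent's own definitions.

```python
from typing import Optional, Sequence


def copies_needed(probs: Sequence[float], expectation: float) -> Optional[int]:
    """Smallest k' with 1 - prod(1 - P_i) >= expectation, or None if unreachable.

    `probs` must already be in the order in which copies would be consulted.
    The at-least-one-fresh formula is an ASSUMPTION based on independence.
    """
    if not 0.0 <= expectation <= 1.0:
        raise ValueError("expectation must be a probability")
    miss = 1.0  # probability that none of the copies seen so far is the newest
    for k, p in enumerate(probs, start=1):
        if not 0.0 <= p <= 1.0:
            raise ValueError("each P_i must be a probability")
        miss *= 1.0 - p
        if 1.0 - miss >= expectation:
            return k
    return None


# Hypothetical freshness probabilities, highest first.
print(copies_needed([0.9, 0.8, 0.7], 0.97))
print(copies_needed([0.5, 0.5], 0.9))
```

The function returns `None` when even all copies together fall short of the target, in which case a caller would presumably fall back to having every copy respond.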

15. Determine the copies that need to respond. Because a copy on a node with high credit activity is more likely to be the newest copy, when data access is requested the invention first sorts the node set $Pn_f$ in descending order of $NCA_f$, and the copies on the first k nodes of the sorted node set respond.
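The selection in step 15 can be sketched in Python as a sort followed by a slice. The function name, the dictionary representation of the node set, and the node identifiers are assumptions made for illustration.

```python
def select_responders(node_activity, k):
    """Return the ids of the k nodes with the highest credit activity.

    node_activity: dict mapping a node id to its credit activity NCA.
    """
    ranked = sorted(node_activity, key=node_activity.get, reverse=True)
    return ranked[:k]


# Illustrative replica node set for some data f.
responders = select_responders({"n1": 3.0, "n2": 6.0, "n3": 1.0}, 2)
```

Only the copies on the returned nodes take part in the response, which is how the method avoids contacting every replica.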

Claims

1. A strong-consistency multi-copy data access response method in a distributed storage environment, characterized in that: according to the service capability and the activity of the nodes in a dynamic distributed storage environment and the user's expectation for the accessed data, a number of replica nodes with strong capability and high activity are selected to respond, replacing the traditional data response mode in which all replicas participate in service without distinction, so as to reduce the waiting time of multi-copy data access and the communication overhead of the system; the method comprises the following steps:

then, according to the service that a node provides and obtains per unit time, the activity of the node is measured and marked by a liquidity credit currency value;

finally, the copies that need to participate in the response are determined according to the node activity, the node credit currency value, and the user's expectation of data freshness, with the following steps:

2. A strongly consistent multi-copy data access response method in a distributed storage environment as claimed in claim 1, wherein: the factors through which nodes in a dynamic distributed environment influence data access are objectively evaluated using a multivariate discriminant z-score model; the degrees of influence of different factors on data access are distinguished by weights; the "Zeta" value of the node is then obtained by the multivariate discriminant z-score method and marked in the form of issued credit currency.

3. A strongly consistent multi-copy data access response method in a distributed storage environment according to claim 1 or 2, wherein: node activity is measured by the currency transaction amount: if a node successfully provides data access service to other nodes, it collects credit currency from those nodes; if a node successfully requests data access service from other nodes, it pays credit currency to those nodes; the sum of the two is called the liquidity credit currency value and is used to indicate node activity.

4. A strongly-consistent multi-copy data access response method in a distributed storage environment as claimed in claim 3, wherein: because the probability that the copy on a node is updated in time is closely related, and directly proportional, to the node's data access capability and its activity, the product of the node's credit currency and the node's activity is called the credit activity in order to reflect this proportional relationship; when the responding replica nodes are selected, the replica node set is sorted according to the credit activity of the nodes; then, according to the number of copies that need to be accessed, the copies on the nodes with the highest credit activity are selected from the sorted set to participate in the data response.

5. A strongly consistent multi-copy data access response method in a distributed storage environment according to claim 1, 2 or 4, wherein: the service capability of the node is characterized by credit currency as follows:

1) The node adaptively acquires the attributes that influence data access;

2) The attributes are objectively evaluated according to their influence on the data access service and scored by the multivariate discriminant z-score method;

3) Weights are assigned to the examined attributes:

according to the factors through which nodes in the dynamic distributed environment influence data access and the z-score method, different weights are assigned to the influencing factors to distinguish the influence of each attribute on data access; the weight distribution is shown in the following formula:

4) The "$Zeta_i$" value of each node is obtained according to the multivariate discriminant z-score method; the calculation formula is as follows:

5) The service capability of the node is represented by credit currency:

let $Cm_i$ denote the credit currency of node $Pn_i$; it is calculated as follows:

$$Cm_i = \beta \times Zeta_i, \quad \beta > 0 \tag{3}$$

where $\beta$ is a user-defined constant that can be adjusted according to the actual situation; in the present invention, $\beta = 1$. According to formula (3), the larger the $Cm_i$ of node $Pn_i$, the stronger the service willingness of the node, and vice versa;

in formula (1), $Cu_i$ is the available computing power of the node, $Ns_i$ its available storage capacity, $Nw_i$ its available bandwidth, $Lq_i$ the node load, $An_i$ the active neighbor ratio, $Da_i$ the average data request latency, and $St_i$ the duration of continuous service of the node.
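A minimal Python sketch of steps 3) to 5), assuming that the Zeta value is a weighted linear combination of the seven attributes, in the manner of a multivariate discriminant z-score model. The weights below are placeholders chosen for illustration and are not the values of formula (1); the negative signs for load and latency, the assumption that attributes are already scaled to comparable ranges, and the function names are likewise hypothetical.

```python
ATTRIBUTES = ("Cu", "Ns", "Nw", "Lq", "An", "Da", "St")

# Placeholder weights (assumption): NOT taken from formula (1).
# Load (Lq) and latency (Da) are weighted negatively here on the
# assumption that larger values reduce service capability.
WEIGHTS = {"Cu": 0.20, "Ns": 0.15, "Nw": 0.20, "Lq": -0.10,
           "An": 0.15, "Da": -0.10, "St": 0.10}


def zeta(attrs, weights=WEIGHTS):
    """Weighted sum of the node attributes (assumed form of Zeta_i)."""
    return sum(weights[name] * attrs[name] for name in ATTRIBUTES)


def credit_currency(attrs, beta=1.0):
    """Cm_i = beta * Zeta_i, as in formula (3)."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    return beta * zeta(attrs)


# Illustrative node whose attributes are all scaled to 1.0.
node = {name: 1.0 for name in ATTRIBUTES}
cm = credit_currency(node)
```

With β = 1 the credit currency equals the Zeta value, so any change to the weights shifts a node's credit currency directly.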

6. A strongly-consistent multi-copy data access response method in a distributed storage environment as claimed in claim 5, wherein: the steps of measuring the node activity by using the currency transaction amount are as follows:

1) Earning credit currency:

when node $Pn_i$ provides data access service to other nodes, it receives $\delta_i$ units of credit currency from the node requesting the service; let $ECm_i$ denote the credit currency earned; each time $Pn_i$ earns credit currency, the value is updated as follows:

$$ECm_i = ECm_i + \delta_i, \quad \delta_i \geq 1 \tag{4}$$

where $\delta_i$ is a constant, set to $\delta_i = 1$; the currency may also be collected from the other party according to the actual service situation;

2) Paying credit currency:

when node $Pn_i$ requests data access from other nodes, it pays $\varepsilon_i$ units of credit currency to the node providing the data service; let $CCm_i$ denote the credit currency paid; each time $Pn_i$ pays credit currency, the value is updated as follows:

$$CCm_i = CCm_i + \varepsilon_i, \quad \varepsilon_i \geq 1 \tag{5}$$

where $\varepsilon_i$ is a constant, set to $\varepsilon_i = 1$; the node may also pay $\varepsilon_i$ units of currency to the other party according to the actual service situation;

3) Calculating the node's liquidity credit currency:

the sum of the currency earned and the currency paid is called the node's liquidity credit currency; let $Wf_i$ denote the liquidity of node $Pn_i$; according to formulas (4) and (5), the liquidity formula is:

$$Wf_i = |ECm_i| + |CCm_i| \tag{6}$$

according to formula (6), the larger the $Wf_i$ of node $Pn_i$, the more services the node participates in and the higher the probability that the copy on the node is the latest copy;

4) In each period, the node adaptively computes its own liquidity credit currency and node credit activity; let $NCA_i$ denote the credit activity of node $Pn_i$, which is calculated by the following formula:

$$NCA_i = Cm_i \times Wf_i \tag{7}$$
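The bookkeeping in steps 1) to 4) can be sketched in Python as a per-node ledger. This is an illustrative sketch only: the class and function names are hypothetical, and the credit activity is taken to be the product of credit currency and liquidity, as claim 4 describes it.

```python
from dataclasses import dataclass


@dataclass
class NodeLedger:
    cm: float          # credit currency Cm_i from formula (3)
    ecm: float = 0.0   # credit currency earned, ECm_i
    ccm: float = 0.0   # credit currency paid, CCm_i

    def earn(self, delta=1.0):
        """Formula (4): add delta after providing a service."""
        if delta < 1:
            raise ValueError("delta must be >= 1")
        self.ecm += delta

    def pay(self, epsilon=1.0):
        """Formula (5): add epsilon after requesting a service."""
        if epsilon < 1:
            raise ValueError("epsilon must be >= 1")
        self.ccm += epsilon

    def liquidity(self):
        """Formula (6): Wf_i = |ECm_i| + |CCm_i|."""
        return abs(self.ecm) + abs(self.ccm)

    def credit_activity(self):
        """Assumed product form: NCA_i = Cm_i * Wf_i."""
        return self.cm * self.liquidity()


def serve(provider, requester, amount=1.0):
    """One successful data access: the provider earns, the requester pays."""
    provider.earn(amount)
    requester.pay(amount)


a, b = NodeLedger(cm=2.0), NodeLedger(cm=1.0)
serve(a, b)
serve(a, b)
serve(b, a)
```

After these three exchanges both nodes have the same liquidity of 3, but node `a` has the higher credit activity because its credit currency is larger.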

7. A strongly-consistent multi-copy data access response method in a distributed storage environment as claimed in claim 6, wherein: the steps of determining the copies that participate in the response are as follows:

1) Determining the set of all copies of the requested data f:

let the number of copies of f be n, and let $Re_f$ denote the replica set; then:

$$Re_f = \{Re_{f,i} \mid Re_{f,1}, Re_{f,2}, Re_{f,3}, \ldots, Re_{f,n}\};$$

2) Obtaining the set of nodes on which $Re_f$ resides:

assuming that these copies are stored on different nodes, let $Pn_f$ denote the node set; then:

$$Pn_f = \{Pn_{f,i} \mid Pn_{f,1}, Pn_{f,2}, Pn_{f,3}, \ldots, Pn_{f,n}\};$$

3) Obtaining the credit activity of the nodes in the set $Pn_f$:

let $NCA_f$ denote the set of node credit activities of the node set; then:

$$NCA_f = \{NCA_{f,i} \mid NCA_{f,1}, NCA_{f,2}, NCA_{f,3}, \ldots, NCA_{f,n}\};$$

4) Calculate the probability that each copy in data f is the most recent copy:

according to formulas (3) and (6), the larger the credit activity $NCA_i$ of node $Pn_i$, the stronger the node's willingness to serve and the more frequently it participates in service; let $P_i$ denote the freshness probability of the replica data; the relationship between the probability that a replica is the newest replica and the credit activity is expressed by the following formula:

$$f(P_i) = \alpha \times NCA_{f,i}, \quad \alpha > 0 \tag{8}$$

where $\alpha$ is a constant, indicating that the freshness probability and the node credit activity differ by a constant factor; let $\alpha = 1$; that is, the probability that a replica is the latest replica is replaced by the node's share of the credit activity within the replica node set; the probability that each copy of data f is the latest copy can then be expressed by the following formula (9):

5) Calculating how many copies are needed to satisfy the user's expectation of data freshness:

that is, when access to data f is requested, having k' copies respond yields the data freshness $Fe'_f$; if the user's expectation of the freshness of the accessed data is $Fe_f$, then according to formula (10) the number of copies that need to respond can be obtained by the following formula (11):

6) Determining the copy that needs to respond:

since a copy on a node with high credit activity is more likely to be the newest copy, when data access is requested, the node set $Pn_f$ is first sorted in descending order of $NCA_f$, and the copies on the first k nodes of the sorted node set respond.