CN118113526A - Distributed data storage planning method and system for improving disaster recovery capacity of data center - Google Patents

Info

Publication number
CN118113526A
Authority
CN
China
Prior art keywords
data
backup
center
scheme
data storage
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410392022.0A
Other languages
Chinese (zh)
Inventor
张腾
谢作斌
怀丹阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Ai Rui Good Technology Co ltd
Original Assignee
Shenzhen Ai Rui Good Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Ai Rui Good Technology Co ltd filed Critical Shenzhen Ai Rui Good Technology Co ltd
Priority to CN202410392022.0A priority Critical patent/CN118113526A/en
Publication of CN118113526A publication Critical patent/CN118113526A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a distributed data storage planning method and a system for improving disaster recovery capacity of a data center, which relate to the technical field of data storage management and comprise the following steps: determining an importance weight for each piece of data in the data center; acquiring all data storage nodes in a data center; calling historical operation data of a data center; determining risk correlation among all data storage nodes; dividing a backup storage space in each data center; and determining a data backup scheme of the data center, and carrying out data backup of the data storage nodes. The invention has the advantages that: the method can ensure that when the actual disaster causes the failure of part of data storage nodes, the data and the system are quickly and accurately recovered, the loss caused by the disaster is reduced, the availability and disaster recovery capability of the data are greatly improved, and the service interruption risk caused by the node failure is reduced.

Description

Distributed data storage planning method and system for improving disaster recovery capacity of data center
Technical Field
The invention relates to the technical field of data storage management, in particular to a distributed data storage planning method and system for improving disaster recovery capacity of a data center.
Background
As data centers continue to scale up, the security and reliability of data become particularly important. Traditional data center storage architectures are often prone to data loss and difficult recovery, and cannot meet the data security requirements of modern enterprises. A new distributed data storage planning method and system is therefore needed to improve the disaster recovery capability of a data center.
Disclosure of Invention
In order to solve the above technical problems, a distributed data storage planning method and system for improving the disaster recovery capability of a data center are provided. They can effectively improve the disaster recovery capability of the data center, ensure the integrity and availability of data, and reduce the loss caused by disasters, thereby addressing the problem that traditional data center storage architectures are prone to data loss and difficult recovery and cannot meet the data security requirements of modern enterprises.
In order to achieve the above purpose, the invention adopts the following technical scheme:
A distributed data storage planning method for improving disaster recovery capacity of a data center comprises the following steps:
analyzing the data in the data center based on preset data classification logic, and determining the importance weight of each piece of data in the data center;
acquiring all data storage nodes in the data center;
retrieving historical operation data of the data center;
determining the risk correlation between all data storage nodes based on the historical operation data of the data center;
dividing a backup storage space in each data center;
determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes, and backing up the data of the data storage nodes accordingly.
Preferably, the data classification logic specifically includes:
attaching a weight reference value to each piece of data based on the data type;
setting a data analysis period;
acquiring the number of times each piece of data was accessed in the data analysis period closest to the current moment, and recording it as the data access analysis count;
determining a data access frequency index of each piece of data based on the data access analysis counts of all the data;
calculating the importance weight of each piece of data using a data weight formula, based on the weight reference value and the data access frequency index of each piece of data;
the data weight formula is specifically:
$Z_i = P_i \times J_i$
where $Z_i$ is the importance weight of the ith piece of data in the data center, $P_i$ is the data access frequency index of the ith piece of data in the data center, and $J_i$ is the weight reference value of the ith piece of data in the data center.
Preferably, the determining the data access frequency index of each data based on the data access analysis times of all the data specifically includes:
calculating the data access frequency index of each data by adopting an access frequency index calculation formula based on the data access analysis times of all the data;
the access frequency index calculation formula specifically comprises:
where $P_i$ is the number of data access analysis times of the ith data of the data center, and $S_i$ is the total number of data stored in the data center.
Preferably, the determining the risk correlation between all the data storage nodes based on the historical operation data of the data center specifically includes:
determining, in the historical operation data of the data center, the operation data in which a storage node failed, and recording it as fault operation data;
randomly combining all data storage nodes in the system in pairs to obtain a plurality of data storage node groups;
calculating the risk correlation between the two data storage nodes in each data storage node group using a correlation algorithm.
Preferably, the correlation algorithm specifically includes:
marking the two data storage nodes in the data storage node group as $M_j$ and $M_k$ respectively;
screening, from the fault operation data, the fault operation data in which $M_j$ failed, as first fault operation data;
screening, from the fault operation data, the fault operation data in which $M_k$ failed, as second fault operation data;
taking the intersection of the first fault operation data and the second fault operation data as third fault operation data;
taking the union of the first fault operation data and the second fault operation data as fourth fault operation data;
determining the total number of the first, second, third and fourth fault operation data, respectively;
calculating the risk correlation between $M_j$ and $M_k$ through a correlation calculation formula;
the correlation calculation formula is specifically:
where $X_{jk}$ is the risk correlation between $M_j$ and $M_k$, $N_0$ is the total number of all fault operation data, $N_j$ is the number of first fault operation data, $N_k$ is the number of second fault operation data, $N_{j \cap k}$ is the number of third fault operation data, and $N_{j \cup k}$ is the number of fourth fault operation data.
Preferably, the determining the data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes specifically includes:
determining the storage node corresponding to each piece of data in the data center;
determining the size of each piece of data;
taking the risk correlation between the storage node corresponding to each piece of data and each data storage node as the backup risk degree between that data and each data storage node;
constructing a data backup storage constraint condition;
generating a plurality of preliminary data backup schemes based on the data backup storage constraint condition;
constructing a scheme evaluation model, wherein the scheme evaluation model takes as input the backup node assigned to each piece of stored data in a preliminary data backup scheme, and outputs the scheme rationality of that preliminary data backup scheme;
calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model;
selecting the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage nodes;
the mathematical expression of the data backup storage constraint condition is:
in the mathematical expression of the data backup storage constraint condition, the quantities are: the total number of pieces of data for which the jth data storage node serves as the backup node; the size of the ith piece of data for which the jth data storage node serves as the backup node; and the size of the backup storage space of the jth data storage node;
the scheme evaluation model is specifically:
in the scheme evaluation model, $H$ is the scheme rationality of the preliminary data backup scheme, and $X_i$ is the backup risk degree of the backup node corresponding to the ith piece of data in the preliminary data backup scheme.
Furthermore, a distributed data storage planning system for improving disaster recovery capability of a data center is provided, which is used for implementing the above distributed data storage planning method for improving disaster recovery capability of the data center, the system comprising:
The data classification module is used for analyzing the data in the data center based on preset data classification logic and determining the importance weight of each piece of data in the data center;
The data center analysis module is used for acquiring all data storage nodes in the data center, retrieving historical operation data of the data center, and determining the risk correlation between all the data storage nodes based on the historical operation data of the data center;
The storage planning module is electrically connected with the data classification module and the data center analysis module, and is used for determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes, and for backing up the data of the data storage nodes.
Optionally, the data classification module includes:
a reference assignment unit for attaching a weight reference value to each data based on the data type;
The access analysis unit is used for setting a data analysis period, acquiring the accessed times of each data in the data analysis period closest to the current moment, recording the accessed times as data access analysis times, and determining the data access frequency index of each data based on the data access analysis times of all the data;
And the comprehensive weight analysis unit is used for calculating the importance weight of each piece of data using the data weight formula, based on the weight reference value and the data access frequency index of each piece of data.
Optionally, the data center analysis module includes:
The fault extraction unit is used for determining the operation data of the storage node faults in the historical operation data of the data center, and recording the operation data as fault operation data;
The node combination unit is used for carrying out random pairwise combination on all the data storage nodes in the system to obtain a plurality of data storage node groups;
and the correlation calculation unit is used for calculating the risk correlation degree between the two data storage nodes in each data storage node group by adopting a correlation algorithm.
Optionally, the storage planning module includes:
The data storage analysis unit is used for determining storage nodes corresponding to each data in the data center, determining the size of each data, and taking the risk correlation degree between the storage nodes corresponding to each data and each data storage node as the backup risk degree between the data and each data storage node;
The preliminary scheme generation unit is used for constructing the data backup storage constraint condition and generating a plurality of preliminary data backup schemes based on the data backup storage constraint condition;
the model building unit is used for building a scheme evaluation model;
And the data backup planning unit is used for calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model, and screening out the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage node.
Compared with the prior art, the invention has the beneficial effects that:
The invention provides a distributed data storage planning scheme for improving disaster recovery capacity of a data center. Data are assigned different importance weights according to their value and importance, and corresponding backup storage strategies are formulated for data with different importance weights. Redundant backup storage space is added in the distributed storage system to hold the data backups, so that when some nodes fail, the backup nodes can continue to provide service and the availability of the system is maintained. When a disaster causes some data storage nodes to fail, the data and the system can be recovered quickly and accurately, which reduces the loss caused by the disaster, improves the availability and disaster recovery capability of the data, and reduces the risk of service interruption caused by node failure.
Drawings
FIG. 1 is a flow chart of a distributed data storage planning method for improving disaster recovery capacity of a data center according to the present disclosure;
FIG. 2 is a flow chart of a method of data classification logic in the present approach;
FIG. 3 is a flow chart of a method for determining risk correlation among all data storage nodes in the present approach;
FIG. 4 is a flow chart of a method of the correlation algorithm in the present solution;
FIG. 5 is a flow chart of a method for determining a data backup scheme for a data center in the present scheme;
FIG. 6 is a block diagram of a distributed data storage planning system for improving disaster recovery capability of a data center according to the present disclosure.
Detailed Description
The following description is presented to enable one of ordinary skill in the art to make and use the invention. The preferred embodiments in the following description are by way of example only and other obvious variations will occur to those skilled in the art.
Referring to FIG. 1, a distributed data storage planning method for improving disaster recovery capability of a data center includes:
analyzing the data in the data center based on preset data classification logic, and determining the importance weight of each piece of data in the data center;
acquiring all data storage nodes in the data center;
retrieving historical operation data of the data center;
determining the risk correlation between all data storage nodes based on the historical operation data of the data center;
dividing a backup storage space in each data center;
determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes, and backing up the data of the data storage nodes accordingly.
The scheme assigns different importance weights to data according to the value and importance of the data, and formulates corresponding backup storage strategies for data with different importance weights. Redundant backup storage space is also added in the distributed storage system to hold the data backups, so that when some nodes fail, the backup nodes can continue to provide service and the availability of the system is maintained.
Referring to FIG. 2, the data classification logic is specifically:
attaching a weight reference value to each piece of data based on the data type;
setting a data analysis period;
acquiring the number of times each piece of data was accessed in the data analysis period closest to the current moment, and recording it as the data access analysis count;
determining a data access frequency index of each piece of data based on the data access analysis counts of all the data;
calculating the importance weight of each piece of data using a data weight formula, based on the weight reference value and the data access frequency index of each piece of data;
the data weight formula is specifically:
$Z_i = P_i \times J_i$
where $Z_i$ is the importance weight of the ith piece of data in the data center, $P_i$ is the data access frequency index of the ith piece of data in the data center, and $J_i$ is the weight reference value of the ith piece of data in the data center.
Based on the data access analysis times of all the data, determining the data access frequency index of each data specifically comprises:
calculating the data access frequency index of each data by adopting an access frequency index calculation formula based on the data access analysis times of all the data;
The access frequency index calculation formula specifically comprises:
where $P_i$ is the number of data access analysis times of the ith data of the data center, and $S_i$ is the total number of data stored in the data center.
In this scheme, when the data is classified, the weight reference value of each piece of data is first determined from the attributes of the data together with the data classification rules of the data center; the importance weight of the data is then determined by combining this reference value with the number of times the data has recently been accessed.
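The classification logic above can be sketched in Python as follows. This is only an illustrative sketch: the access frequency index formula is not reproduced in this text, so the code assumes it is each item's access count divided by the mean access count over all items. The `TYPE_REFERENCE` table, the `DataItem` class, and the function names are hypothetical and do not come from the patent; only the final product $Z_i = P_i \times J_i$ follows the data weight formula.

```python
from dataclasses import dataclass

# Hypothetical weight reference values (J_i) per data type; real values would
# come from the data center's own classification rules.
TYPE_REFERENCE = {"transaction": 5.0, "log": 1.0, "media": 2.0}


@dataclass
class DataItem:
    name: str
    data_type: str
    access_count: int  # accesses during the most recent analysis period


def access_frequency_index(items):
    """ASSUMED form of P_i: access count divided by the mean access count."""
    total = sum(item.access_count for item in items)
    if total == 0:
        return {item.name: 0.0 for item in items}
    mean = total / len(items)
    return {item.name: item.access_count / mean for item in items}


def importance_weights(items):
    """Z_i = P_i * J_i, following the data weight formula."""
    p = access_frequency_index(items)
    return {
        item.name: p[item.name] * TYPE_REFERENCE[item.data_type]
        for item in items
    }


items = [
    DataItem("orders.db", "transaction", 30),
    DataItem("app.log", "log", 60),
    DataItem("promo.mp4", "media", 0),
]
weights = importance_weights(items)
```

Under this assumed index, a rarely accessed item of a high-value type and a frequently accessed item of a low-value type can end up with comparable importance weights, which is the trade-off the product form expresses.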
Referring to FIG. 3, determining the risk correlation between all data storage nodes based on the historical operation data of the data center specifically includes:
determining, in the historical operation data of the data center, the operation data in which a storage node failed, and recording it as fault operation data;
randomly combining all data storage nodes in the system in pairs to obtain a plurality of data storage node groups;
calculating the risk correlation between the two data storage nodes in each data storage node group using a correlation algorithm.
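A minimal Python sketch of the first two steps, extracting fault operation data and forming node groups, might look like the following. The record layout (`event`, `failed_nodes`) and the function names are hypothetical, and the sketch assumes that the pairwise combination is meant to cover every unordered pair of nodes.

```python
from itertools import combinations

# Hypothetical operation-history records; the field names are illustrative only.
history = [
    {"event": "fault", "failed_nodes": ["A", "B"]},
    {"event": "maintenance", "failed_nodes": []},
    {"event": "fault", "failed_nodes": ["C"]},
]


def extract_fault_records(history):
    """Keep only records in which a storage node failed (fault operation data)."""
    return [set(rec["failed_nodes"]) for rec in history if rec["event"] == "fault"]


def node_groups(nodes):
    """Form two-node groups; ASSUMES every unordered pair should be covered."""
    return list(combinations(nodes, 2))


fault_records = extract_fault_records(history)
groups = node_groups(["A", "B", "C"])
```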
Referring to FIG. 4, the correlation algorithm specifically includes:
marking the two data storage nodes in the data storage node group as $M_j$ and $M_k$ respectively;
screening, from the fault operation data, the fault operation data in which $M_j$ failed, as first fault operation data;
screening, from the fault operation data, the fault operation data in which $M_k$ failed, as second fault operation data;
taking the intersection of the first fault operation data and the second fault operation data as third fault operation data;
taking the union of the first fault operation data and the second fault operation data as fourth fault operation data;
determining the total number of the first, second, third and fourth fault operation data, respectively;
calculating the risk correlation between $M_j$ and $M_k$ through a correlation calculation formula;
the correlation calculation formula is specifically:
where $X_{jk}$ is the risk correlation between $M_j$ and $M_k$, $N_0$ is the total number of all fault operation data, $N_j$ is the number of first fault operation data, $N_k$ is the number of second fault operation data, $N_{j \cap k}$ is the number of third fault operation data, and $N_{j \cup k}$ is the number of fourth fault operation data.
It can be understood that, because data storage nodes in a data center may share the same underlying equipment and line connections, data storage nodes with a high degree of correlation usually fail at the same time. When data is backed up, the data stored in a data storage node therefore needs to be backed up to a data storage node that has a low correlation with it, so that when the original data storage node fails, the backup data storage node is unaffected and the backup data can be retrieved to keep the system operating normally.
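The set operations of the correlation algorithm can be illustrated with a short Python sketch. The correlation calculation formula itself is not reproduced in this text, so the sketch substitutes a Jaccard-style ratio $N_{j \cap k} / N_{j \cup k}$ as a stand-in; this is an assumption, not the patented formula, and it does not use $N_0$, $N_j$ or $N_k$ individually. Each fault record is modelled, hypothetically, as the set of nodes that failed in one event.

```python
def risk_correlation(fault_records, node_j, node_k):
    """Co-failure ratio between two nodes (ASSUMED Jaccard-style stand-in)."""
    # First and second fault operation data: events in which each node failed.
    first = {i for i, failed in enumerate(fault_records) if node_j in failed}
    second = {i for i, failed in enumerate(fault_records) if node_k in failed}
    third = first & second   # intersection: both nodes failed together
    fourth = first | second  # union: at least one of the two nodes failed
    if not fourth:
        return 0.0  # neither node has any recorded fault
    return len(third) / len(fourth)


# Each record is the set of nodes that failed in one fault event (sample data).
fault_records = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]

x_ab = risk_correlation(fault_records, "A", "B")
x_ac = risk_correlation(fault_records, "A", "C")
```

With this sample data, nodes A and B fail together in two of the three events that involve either of them, while A and C never fail together, so C would be the less risky backup target for data held on A.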
Referring to FIG. 5, determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes specifically includes:
determining the storage node corresponding to each piece of data in the data center;
determining the size of each piece of data;
taking the risk correlation between the storage node corresponding to each piece of data and each data storage node as the backup risk degree between that data and each data storage node;
constructing a data backup storage constraint condition;
generating a plurality of preliminary data backup schemes based on the data backup storage constraint condition;
constructing a scheme evaluation model, wherein the scheme evaluation model takes as input the backup node assigned to each piece of stored data in a preliminary data backup scheme, and outputs the scheme rationality of that preliminary data backup scheme;
calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model;
selecting the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage nodes;
the mathematical expression of the data backup storage constraint condition is:
in the mathematical expression of the data backup storage constraint condition, the quantities are: the total number of pieces of data for which the jth data storage node serves as the backup node; the size of the ith piece of data for which the jth data storage node serves as the backup node; and the size of the backup storage space of the jth data storage node;
the scheme evaluation model is specifically:
in the scheme evaluation model, $H$ is the scheme rationality of the preliminary data backup scheme, and $X_i$ is the backup risk degree of the backup node corresponding to the ith piece of data in the preliminary data backup scheme.
It can be understood that, when data backup is performed, the backup data assigned to each backup storage space must be smaller than that backup space, and more important data should be backed up to data storage nodes with a lower corresponding backup risk degree. The backup storage scheme is planned based on the data backup storage constraint condition and the scheme evaluation model, and the data is stored across a plurality of nodes in a distributed manner, thereby achieving differentiated storage management for data of different levels and improving the availability and disaster recovery capability of the data.
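The planning step can be sketched in Python as below. Neither the constraint expression nor the scheme evaluation model is reproduced in this text, so the sketch makes two labelled assumptions: the constraint is that the total size of the data backed up to a node does not exceed that node's backup space, and the scheme rationality is $H = \sum_i Z_i X_i$, the importance-weighted sum of backup risk degrees. All names (`plan_backup`, `capacity`, and so on) are hypothetical, and the exhaustive enumeration is only practical for very small examples.

```python
from itertools import product


def plan_backup(data, nodes, capacity, corr, weight):
    """Pick the feasible backup assignment with the smallest H.

    data:     {name: (home_node, size)}
    capacity: {node: backup storage space}
    corr:     {(node_a, node_b): risk correlation}
    weight:   {name: importance weight Z_i}
    ASSUMPTION: H = sum of Z_i * X_i over all data items.
    """
    def risk(a, b):
        return corr.get((a, b), corr.get((b, a), 0.0))

    names = list(data)
    # A data item is never backed up to the node that already stores it.
    candidates = [[n for n in nodes if n != data[d][0]] for d in names]
    best, best_h = None, float("inf")
    for choice in product(*candidates):  # each tuple is one preliminary scheme
        used = {n: 0 for n in nodes}
        for d, n in zip(names, choice):
            used[n] += data[d][1]
        if any(used[n] > capacity[n] for n in nodes):
            continue  # violates the assumed backup space constraint
        h = sum(weight[d] * risk(data[d][0], n) for d, n in zip(names, choice))
        if h < best_h:
            best, best_h = dict(zip(names, choice)), h
    return best, best_h


nodes = ["A", "B", "C"]
corr = {("A", "B"): 0.8, ("A", "C"): 0.1, ("B", "C"): 0.2}
data = {"d1": ("A", 4), "d2": ("A", 4)}
capacity = {"A": 10, "B": 10, "C": 4}
weight = {"d1": 5.0, "d2": 1.0}

scheme, h = plan_backup(data, nodes, capacity, corr, weight)
```

In this example node C has the lowest risk correlation with node A but only enough backup space for one item, so the more important item `d1` is placed on C and `d2` falls back to B.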
Further, referring to FIG. 6, based on the same inventive concept as the above distributed data storage planning method for improving disaster recovery capability of a data center, the present disclosure further provides a distributed data storage planning system for improving disaster recovery capability of a data center, including:
The data classification module is used for analyzing the data in the data center based on preset data classification logic and determining the importance weight of each piece of data in the data center;
The data center analysis module is used for acquiring all data storage nodes in the data center, retrieving historical operation data of the data center, and determining the risk correlation between all the data storage nodes based on the historical operation data of the data center;
The storage planning module is electrically connected with the data classification module and the data center analysis module, and is used for determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes, and for backing up the data of the data storage nodes.
The data classification module comprises:
A reference assignment unit for attaching a weight reference value to each data based on the data type;
The access analysis unit is used for setting a data analysis period, acquiring the accessed times of each data in the data analysis period closest to the current moment, recording the accessed times as data access analysis times, and determining the data access frequency index of each data based on the data access analysis times of all the data;
And the comprehensive weight analysis unit is used for calculating the importance weight of each piece of data using the data weight formula, based on the weight reference value and the data access frequency index of each piece of data.
The data center analysis module comprises:
The fault extraction unit is used for determining the operation data of the storage node faults in the historical operation data of the data center and recording the operation data as fault operation data;
The node combination unit is used for carrying out random pairwise combination on all the data storage nodes in the system to obtain a plurality of data storage node groups;
and the correlation calculation unit is used for calculating the risk correlation degree between the two data storage nodes in each data storage node group by adopting a correlation algorithm.
The storage planning module comprises:
The data storage analysis unit is used for determining storage nodes corresponding to each data in the data center, determining the size of each data, and taking the risk correlation degree between the storage nodes corresponding to each data and each data storage node as the backup risk degree between the data and each data storage node;
The preliminary scheme generation unit is used for constructing the data backup storage constraint condition and generating a plurality of preliminary data backup schemes based on the data backup storage constraint condition;
the model building unit is used for building a scheme evaluation model;
And the data backup planning unit is used for calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model, and screening out the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage node.
The use process of the distributed data storage planning system for improving the disaster recovery capability of the data center is as follows:
Step one: the reference assignment unit attaches a weight reference value to each data based on the data type;
Step two: the access analysis unit sets a data analysis period, acquires the accessed times of each data in the data analysis period closest to the current moment, marks the accessed times as data access analysis times, and determines the data access frequency index of each data based on the data access analysis times of all the data;
Step three: the comprehensive weight analysis unit calculates the importance weight of each piece of data using the data weight formula, based on the weight reference value and the data access frequency index of each piece of data;
Step four: the fault extraction unit determines the operation data of the storage node faults in the historical operation data of the data center, and records the operation data as fault operation data;
Step five: the node combination unit performs random pairwise combination on all data storage nodes in the system to obtain a plurality of data storage node groups;
Step six: the correlation calculation unit calculates risk correlation between two data storage nodes in each data storage node group by adopting a correlation algorithm.
Step seven: the data storage analysis unit determines storage nodes corresponding to each data in the data center, determines the size of each data, and takes the risk correlation degree between the storage nodes corresponding to each data and each data storage node as the backup risk degree between the data and each data storage node;
Step eight: the preliminary scheme generation unit constructs the data backup storage constraint condition and generates a plurality of preliminary data backup schemes based on the data backup storage constraint condition;
Step nine: the model construction unit constructs a scheme evaluation model;
Step ten: the data backup planning unit calculates the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model, and selects the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage nodes.
In summary, the invention has the advantages that: the method can ensure that when the actual disaster causes the failure of part of data storage nodes, the data and the system are quickly and accurately recovered, the loss caused by the disaster is reduced, the availability and disaster recovery capability of the data are greatly improved, and the service interruption risk caused by the node failure is reduced.
The foregoing has shown and described the basic principles, principal features and advantages of the invention. It will be understood by those skilled in the art that the present invention is not limited to the embodiments described above; the above embodiments and descriptions merely illustrate the principles of the present invention, and various changes and modifications may be made without departing from its spirit and scope. The scope of the invention is defined by the appended claims and their equivalents.

Claims (10)

1. A distributed data storage planning method for improving disaster recovery capacity of a data center, characterized by comprising the following steps:
analyzing the data in the data center based on preset data classification logic, and determining the importance weight of each piece of data in the data center;
acquiring all data storage nodes in the data center;
retrieving historical operation data of the data center;
determining the risk correlation between all data storage nodes based on the historical operation data of the data center;
dividing a backup storage space in each data center;
determining a data backup scheme of the data center based on the importance weight of the data stored in each data storage node and the risk correlation between the data storage nodes, and backing up the data of the data storage nodes accordingly.
2. The distributed data storage planning method for improving the disaster recovery capacity of a data center according to claim 1, wherein the data classification logic specifically comprises:
assigning a weight reference value to each piece of data based on its data type;
setting a data analysis period;
acquiring the number of times each piece of data was accessed in the data analysis period closest to the current moment, and recording it as the data access analysis times;
determining a data access frequency index for each piece of data based on the data access analysis times of all the data;
calculating the importance weight of each piece of data using a data weight formula, based on the weight reference value and the data access frequency index of that piece of data;
the data weight formula is specifically:
$Z_i = P_i \times J_i$
wherein $Z_i$ is the importance weight of the i-th piece of data in the data center, $P_i$ is the data access frequency index of the i-th piece of data in the data center, and $J_i$ is the weight reference value of the i-th piece of data in the data center.
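As an illustration of the data weight formula in claim 2, the following minimal Python sketch multiplies a data access frequency index by a weight reference value. The class name `DataItem`, the item names, and all numeric values are hypothetical and chosen only for the example; the access frequency index is taken as a given input here, since its formula is not reproduced in the text.

```python
from dataclasses import dataclass


@dataclass
class DataItem:
    """Hypothetical container for the two inputs of the data weight formula."""
    name: str
    reference_weight: float        # J_i, assigned from the data type
    access_frequency_index: float  # P_i, assumed to be computed elsewhere


def importance_weight(item: DataItem) -> float:
    """Return Z_i = P_i * J_i for one piece of data."""
    return item.access_frequency_index * item.reference_weight


# Illustrative values only; they do not come from the patent.
items = [
    DataItem("orders_db", reference_weight=3.0, access_frequency_index=1.5),
    DataItem("audit_logs", reference_weight=2.0, access_frequency_index=0.5),
]
weights = {item.name: importance_weight(item) for item in items}
```

Because the weight is a plain product, data that is both of a high-value type and frequently accessed ranks highest.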
3. The distributed data storage planning method for improving the disaster recovery capacity of a data center according to claim 2, wherein determining the data access frequency index of each piece of data based on the data access analysis times of all the data specifically comprises:
calculating the data access frequency index of each piece of data using an access frequency index calculation formula, based on the data access analysis times of all the data;
the access frequency index calculation formula specifically comprises:
wherein $P_i$ is the number of data access analysis times of the i-th piece of data in the data center, and $S_i$ is the total number of pieces of data stored in the data center.
4. The distributed data storage planning method for improving the disaster recovery capacity of a data center according to claim 3, wherein determining the risk correlation among all data storage nodes based on the historical operation data of the data center specifically comprises:
identifying, in the historical operation data of the data center, the operation data in which a storage node fault occurred, and recording it as fault operation data;
combining all data storage nodes in the system pairwise at random to obtain a plurality of data storage node groups;
and calculating the risk correlation between the two data storage nodes in each data storage node group using a correlation algorithm.
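The first two steps of claim 4 can be sketched in Python as below. The record layout (a dictionary with `id` and `failed_nodes` keys) and the node names are assumptions made for the example; the claim does not specify how historical operation data is represented. The sketch also reads "pairwise combination" as enumerating every unordered pair of nodes, which is one possible interpretation.

```python
from itertools import combinations

# Hypothetical historical operation records; each lists the nodes that
# failed during that record (an empty list means no storage node fault).
history = [
    {"id": 1, "failed_nodes": []},
    {"id": 2, "failed_nodes": ["A", "B"]},
    {"id": 3, "failed_nodes": ["C"]},
]
nodes = ["A", "B", "C"]

# Fault operation data: records in which at least one storage node failed.
fault_records = [record for record in history if record["failed_nodes"]]

# Data storage node groups: every unordered pair of storage nodes.
node_groups = list(combinations(nodes, 2))
```

The order in which pairs are produced does not affect the result, since each group is evaluated independently.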
5. The distributed data storage planning method for improving the disaster recovery capacity of a data center according to claim 4, wherein the correlation algorithm specifically comprises:
marking the two data storage nodes in the data storage node group as $M_j$ and $M_k$ respectively;
screening, from the fault operation data, the fault operation data in which $M_j$ failed, as first fault operation data;
screening, from the fault operation data, the fault operation data in which $M_k$ failed, as second fault operation data;
taking the intersection of the first fault operation data and the second fault operation data as third fault operation data;
taking the union of the first fault operation data and the second fault operation data as fourth fault operation data;
determining the total number of the first, second, third, and fourth fault operation data respectively;
calculating the risk correlation between $M_j$ and $M_k$ using a correlation calculation formula;
the correlation calculation formula specifically comprises the following steps:
wherein $X_{jk}$ is the risk correlation between $M_j$ and $M_k$, $N_0$ is the total number of all fault operation data, $N_j$ is the number of first fault operation data, $N_k$ is the number of second fault operation data, $N_{j \cap k}$ is the number of third fault operation data, and $N_{j \cup k}$ is the number of fourth fault operation data.
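The set operations in claim 5 can be sketched in Python as follows. The record layout and node names are hypothetical. Because the correlation calculation formula itself is not reproduced in the text, the function `jaccard_stand_in` uses the Jaccard ratio (intersection count divided by union count) purely as a placeholder; it is not the patent's formula and ignores $N_0$, $N_j$, and $N_k$.

```python
def fault_counts(fault_records, node_j, node_k):
    """Count the first to fourth fault operation data for one node pair."""
    first = {r["id"] for r in fault_records if node_j in r["failed_nodes"]}
    second = {r["id"] for r in fault_records if node_k in r["failed_nodes"]}
    return {
        "N_0": len(fault_records),         # all fault operation data
        "N_j": len(first),                 # records where node_j failed
        "N_k": len(second),                # records where node_k failed
        "N_j_and_k": len(first & second),  # intersection (third)
        "N_j_or_k": len(first | second),   # union (fourth)
    }


def jaccard_stand_in(counts):
    """Placeholder correlation: NOT the patent's formula, only a stand-in."""
    if counts["N_j_or_k"] == 0:
        return 0.0
    return counts["N_j_and_k"] / counts["N_j_or_k"]


# Hypothetical fault operation data.
fault_records = [
    {"id": 1, "failed_nodes": ["A", "B"]},
    {"id": 2, "failed_nodes": ["A"]},
    {"id": 3, "failed_nodes": ["B", "C"]},
    {"id": 4, "failed_nodes": ["A", "B"]},
]
counts = fault_counts(fault_records, "A", "B")
correlation = jaccard_stand_in(counts)
```

With these records, nodes A and B each fail in three records, two of which are shared, so the five counts fully describe how often the pair fails together.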
6. The distributed data storage planning method for improving the disaster recovery capacity of a data center according to claim 5, wherein determining the data backup scheme of the data center based on the importance weight of the data stored on each data storage node and the risk correlation between the data storage nodes specifically comprises:
determining the storage node corresponding to each piece of data in the data center;
determining the size of each piece of data;
taking the risk correlation between the storage node corresponding to each piece of data and each data storage node as the backup risk degree between that piece of data and each data storage node;
constructing a data backup storage constraint;
generating a plurality of preliminary data backup schemes based on the data backup storage constraint;
building a scheme evaluation model, wherein the scheme evaluation model takes as input the backup node assigned to each piece of stored data in a preliminary data backup scheme, and outputs the scheme rationality of that preliminary data backup scheme;
calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model;
screening out the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage nodes;
the mathematical expression of the data backup storage constraint is as follows:
in the mathematical expression of the data backup storage constraint, the three quantities are, respectively: the total number of pieces of data that use the j-th data storage node as their backup node; the size of the i-th piece of data that uses the j-th data storage node as its backup node; and the backup storage space size of the j-th data storage node;
the scheme evaluation model specifically comprises the following steps:
in the scheme evaluation model, $H$ is the scheme rationality of the preliminary data backup scheme, and $X_i$ is the backup risk degree of the backup node corresponding to the i-th piece of data in the preliminary data backup scheme.
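The selection procedure of claim 6 can be sketched in Python as a brute-force search. Several parts are assumptions, since neither the constraint expression nor the evaluation model is reproduced in the text: the constraint is read as "the total size of data backed up to a node must not exceed that node's backup space", the scheme rationality $H$ is taken as the plain sum of the backup risk degrees $X_i$, and a piece of data is never backed up to its own storage node. All names and numbers are hypothetical.

```python
from itertools import product

# Hypothetical inputs: item -> (home storage node, size).
data = {"d1": ("A", 4), "d2": ("B", 3)}
backup_capacity = {"A": 5, "B": 5, "C": 4}  # backup space per node
risk = {                                    # risk correlation per node pair
    frozenset(("A", "B")): 0.5,
    frozenset(("A", "C")): 0.1,
    frozenset(("B", "C")): 0.3,
}


def backup_risk(item, node):
    """Backup risk degree X_i: correlation between home node and backup node."""
    home, _size = data[item]
    return risk[frozenset((home, node))]


def satisfies_capacity(scheme):
    """Assumed constraint: data backed up to a node must fit its backup space."""
    used = {node: 0 for node in backup_capacity}
    for item, node in scheme.items():
        used[node] += data[item][1]
    return all(used[node] <= backup_capacity[node] for node in used)


def candidate_schemes():
    """Yield every assignment of items to non-home nodes that fits capacity."""
    items = list(data)
    choices = [[n for n in backup_capacity if n != data[i][0]] for i in items]
    for combo in product(*choices):
        scheme = dict(zip(items, combo))
        if satisfies_capacity(scheme):
            yield scheme


def rationality(scheme):
    """Assumed evaluation model: H is the sum of the backup risk degrees."""
    return sum(backup_risk(item, node) for item, node in scheme.items())


best = min(candidate_schemes(), key=rationality)
```

In this example the lowest-risk assignment would place both items on node C, but that exceeds C's backup space, so the search settles on backing up `d1` to C and `d2` to A. Exhaustive enumeration grows exponentially with the number of items; it is used here only to keep the sketch short.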
7. A distributed data storage planning system for improving the disaster recovery capacity of a data center, for implementing the distributed data storage planning method for improving the disaster recovery capacity of a data center according to any one of claims 1 to 6, the system comprising:
a data classification module, used for analyzing the data in the data center based on preset data classification logic and determining the importance weight of each piece of data in the data center;
a data center analysis module, used for acquiring all data storage nodes in the data center, retrieving historical operation data of the data center, and determining the risk correlation among all the data storage nodes based on the historical operation data of the data center;
a storage planning module, electrically connected with the data classification module and the data center analysis module, and used for determining a data backup scheme of the data center and carrying out data backup of the data storage nodes based on the importance weight of the data stored on each data storage node and the risk correlation between the data storage nodes.
8. The distributed data storage planning system for improving the disaster recovery capacity of a data center of claim 7, wherein the data classification module comprises:
a reference assignment unit, used for assigning a weight reference value to each piece of data based on its data type;
an access analysis unit, used for setting a data analysis period, acquiring the number of times each piece of data was accessed in the data analysis period closest to the current moment, recording it as the data access analysis times, and determining the data access frequency index of each piece of data based on the data access analysis times of all the data;
and a comprehensive weight analysis unit, used for calculating the importance weight of each piece of data using a data weight formula, based on the weight reference value and the data access frequency index of that piece of data.
9. The distributed data storage planning system for improving the disaster recovery capacity of a data center of claim 7, wherein the data center analysis module comprises:
a fault extraction unit, used for identifying, in the historical operation data of the data center, the operation data in which a storage node fault occurred, and recording it as fault operation data;
a node combination unit, used for combining all the data storage nodes in the system pairwise at random to obtain a plurality of data storage node groups;
and a correlation calculation unit, used for calculating the risk correlation between the two data storage nodes in each data storage node group using a correlation algorithm.
10. The distributed data storage planning system for improving the disaster recovery capacity of a data center of claim 7, wherein the storage planning module comprises:
a data storage analysis unit, used for determining the storage node corresponding to each piece of data in the data center, determining the size of each piece of data, and taking the risk correlation between the storage node corresponding to each piece of data and each data storage node as the backup risk degree between that piece of data and each data storage node;
a preliminary scheme generation unit, used for constructing the data backup storage constraint and generating a plurality of preliminary data backup schemes based on the data backup storage constraint;
a model building unit, used for building a scheme evaluation model;
and a data backup planning unit, used for calculating the scheme rationality of each preliminary data backup scheme based on the scheme evaluation model, and screening out the preliminary data backup scheme with the minimum scheme rationality as the data backup scheme of the data storage nodes.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410392022.0A CN118113526A (en) 2024-04-02 2024-04-02 Distributed data storage planning method and system for improving disaster recovery capacity of data center

Publications (1)

Publication Number Publication Date
CN118113526A true CN118113526A (en) 2024-05-31

Family

ID=91215518

Country Status (1)

Country Link
CN (1) CN118113526A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination