CN109960610B - Data backup method based on policy splitting

Info

Publication number: CN109960610B
Application number: CN201910147338.2A
Authority: CN
Inventors: 程华平
Original assignee: Shanghai Eisoo Information Technology Co Ltd
Current assignee: Shanghai Eisoo Information Technology Co Ltd
Priority date: 2019-02-27
Filing date: 2019-02-27
Publication date: 2023-06-06
Anticipated expiration: 2039-02-27
Also published as: CN109960610A

Abstract

The invention relates to a data backup method based on policy splitting, which comprises a data source grouping process and a data source scheduling process, so that backup resources are evenly distributed across different computing nodes or storage nodes for data splitting. Compared with the prior art, the method improves the backup performance of the virtualization platform while reducing the impact on the platform caused by backup resource usage being overly concentrated.

Description

Data backup method based on policy splitting

Technical Field

The invention relates to a technology of accelerating virtualized backup, in particular to a data backup method based on policy splitting.

Background

The virtualization platform is mainly responsible for virtualizing hardware resources and centrally managing virtual resources, business resources and user resources. It uses technologies such as virtual computing, virtual storage and virtual networking to virtualize computing, storage and network resources.

When the resources of a virtualization platform are backed up, the data to be backed up comes from different computing nodes and storage nodes. When the backup tasks run in multiple threads or processes and the computing node or storage node for each task is selected at random, the tasks can become overly concentrated on a single computing node or storage node. The network IO, disk IO and CPU usage of that node then rise until they become a bottleneck, which degrades the backup performance of the task as a whole.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a data backup method based on policy splitting.

The aim of the invention can be achieved by the following technical scheme:

A data backup method based on policy splitting includes a data source grouping process and a data source scheduling process, so that backup resources are evenly distributed across different computing nodes or storage nodes for data splitting, maximizing backup performance.

Preferably, the data source grouping process specifically includes:

step 101), calling an API of the virtualization platform and acquiring, in turn, the attributes of each backup data source;

step 102), selecting the corresponding policy mode according to the configuration attribute: a host policy or a storage policy;

step 103), if a host policy is selected: classifying according to the position of the computing node where the data source is located;

step 104), if a storage policy is selected: classifying according to the storage nodes where the data sources are located.

Preferably, the attributes of the backup data source in step 101) include storage locations and computing node locations.

Preferably, the classifying in step 103) is specifically: data sources with the same computing node locations are assigned to the same data source container, and different computing node containers form a data source group.

Preferably, the classifying in step 104) is specifically: data sources with the same storage location are allocated to the same data source container, and different storage node containers form a data source group.
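
The grouping in steps 102) to 104) can be illustrated with a minimal C++ sketch. It assumes the attributes from step 101) have already been fetched; the names `DataSource`, `Policy`, `classify` and `sample_sources`, as well as the sample node names, are hypothetical and do not come from the patent text.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical attribute record for one backup data source (e.g. a VM).
struct DataSource {
    std::string name;          // data source identifier
    std::string compute_node;  // computing node location
    std::string storage_node;  // storage location
};

enum class Policy { Host, Storage };

// A data source group: one container per node, keyed by node name.
using DataSourceGroup = std::map<std::string, std::vector<DataSource>>;

// Steps 102)-104): pick the grouping key from the policy, then place each
// data source in the container of the node it lives on.
DataSourceGroup classify(const std::vector<DataSource>& sources, Policy policy) {
    DataSourceGroup group;
    for (const DataSource& ds : sources) {
        const std::string& key =
            (policy == Policy::Host) ? ds.compute_node : ds.storage_node;
        assert(!key.empty() && "data source is missing a location attribute");
        group[key].push_back(ds);
    }
    return group;
}

// Sample inventory used for illustration only.
std::vector<DataSource> sample_sources() {
    return {
        {"vm1", "host-a", "store-1"},
        {"vm2", "host-a", "store-2"},
        {"vm3", "host-b", "store-1"},
        {"vm4", "host-c", "store-1"},
    };
}
```

With this sample inventory, the host policy yields three containers (one per computing node) and the storage policy yields two (one per storage node).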

Preferably, the data source scheduling process specifically includes:

step 201), each data source container has an attribute, the backup number BN, which records how many data sources of that container are currently being backed up by subtasks;

step 202), a scheduler consists of N sub-tasks, and each sub-task is responsible for processing backup work of a data source;

step 203), the subtask Tn of the scheduler applies for the data source to the data source group Gn;

step 204), the data source group Gn finds the data source container Cn with the smallest BN value, takes a data source dn out of Cn, and adds 1 to the BN value of Cn;

step 205), the data source group returns the searched data source dn to the subtask Tn;

step 206), after the subtask Tn finishes the data source dn backup, notifying the data source group that the data source dn backup is finished;

step 207), the data source group finds the data source container Cn to which the data source dn belongs and subtracts 1 from the BN value of Cn;

step 208), after the subtask Tn completes the backup, it repeats steps 203), 204), 205), 206) and 207) until no data source is available, at which point the subtask exits;

step 209), after all the subtasks have exited, the scheduler has completed the backup task of the data source set.
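
The BN bookkeeping in steps 203) to 207) can be sketched in C++ as follows. This is a single-threaded illustration under stated assumptions, not the patented implementation: the names `Container`, `Lease`, `apply`, `release` and `sample_group` are hypothetical, and skipping containers that have no data source left is an assumption the text does not spell out.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <utility>
#include <vector>

// One container per computing node or storage node.
struct Container {
    std::string node;                 // node this container represents
    std::deque<std::string> pending;  // data sources not yet handed out
    int bn = 0;                       // BN: data sources currently in backup
};

// A handed-out data source plus the index of the container it came from.
struct Lease {
    std::string data_source;
    std::size_t container = 0;
};

class DataSourceGroup {
public:
    explicit DataSourceGroup(std::vector<Container> containers)
        : containers_(std::move(containers)) {}

    // Steps 203)-205): choose the non-empty container with the smallest BN,
    // take one data source from it, and add 1 to its BN.
    // Returns false when no data source is available (step 208 exit).
    bool apply(Lease& out) {
        const std::size_t none = containers_.size();
        std::size_t best = none;
        for (std::size_t i = 0; i < containers_.size(); ++i) {
            if (containers_[i].pending.empty()) continue;  // assumption: skip empty
            if (best == none || containers_[i].bn < containers_[best].bn) best = i;
        }
        if (best == none) return false;
        Container& c = containers_[best];
        out.data_source = c.pending.front();
        out.container = best;
        c.pending.pop_front();
        ++c.bn;
        return true;
    }

    // Steps 206)-207): the backup finished, so subtract 1 from the BN.
    void release(const Lease& lease) {
        Container& c = containers_.at(lease.container);
        assert(c.bn > 0);
        --c.bn;
    }

    int bn(std::size_t i) const { return containers_.at(i).bn; }

private:
    std::vector<Container> containers_;
};

// Sample group for illustration: three data sources on one node, one on another.
DataSourceGroup sample_group() {
    std::vector<Container> cs;
    cs.push_back({"host-a", {"vm1", "vm2", "vm3"}, 0});
    cs.push_back({"host-b", {"vm4"}, 0});
    return DataSourceGroup(std::move(cs));
}
```

With the sample group, three consecutive `apply` calls hand out vm1 (host-a), then vm4 (host-b, whose BN is still 0), then vm2 (host-a, because host-b has nothing left), so concurrent backups are spread over both nodes for as long as both still have pending data sources.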

Compared with the prior art, the policy splitting method is suitable for backing up ESXi, FusionCompute and Inspur (Langchao) virtualization platforms. It improves the backup performance of the virtualization platform by splitting the data, and at the same time reduces the impact on the platform caused by backup resource usage being overly concentrated.

Drawings

FIG. 1 is a schematic diagram of a data source grouping scheme;

FIG. 2 is a schematic diagram of a data source scheduling scheme.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the present invention without making any inventive effort, shall fall within the scope of the present invention.

The data backup method based on policy splitting comprises a data source grouping process and a data source scheduling process. It effectively solves the problem of backup resources being overly concentrated on a single node: backup resources are evenly distributed across different computing nodes or storage nodes for data splitting, and backup performance is improved to the greatest extent.

As shown in fig. 1, the data source grouping process specifically includes:

The attributes of the backup data source in step 101) include storage locations and computing node locations. The classification in the step 103) is specifically as follows: data sources with the same computing node locations are assigned to the same data source container, and different computing node containers form a data source group. The classifying in step 104) specifically includes: data sources with the same storage location are allocated to the same data source container, and different storage node containers form a data source group.

As shown in fig. 2, the data source scheduling process specifically includes:

1. The invention is implemented in C++, so a C++ runtime library must be installed on the backup node. A Windows environment requires the VC++ runtime libraries; a Linux environment requires a compatible glibc version.

2. The C++ implementation is divided into several modules: a data source module, a strategy module, a data source group module and a scheduler module.

3. The data source module implements the following function: GetDataSouceInfo, which acquires the attributes of a data source.

4. The strategy module implements the following functions: CreateStrategy, which creates a host policy or a storage policy; ClassifyDataSouce, which uses the policy to classify data sources according to their attributes.

5. The data source group module implements the following functions: ApplyDataSouce, which applies for a data source from the data source group and adds 1 to the BN value of the data source container associated with that data source; FreeDataSouce, which releases the data source after it has been backed up and subtracts 1 from the BN value of the associated data source container.

6. The scheduler module implements the following functions: CreateTasks, which creates a specified number of subtasks; TaskRun, which runs the backup flow of a subtask.

7. Given a data source cluster { vm1, vm2, …, vmn }, the data source attributes { attr1, attr2, …, attrn } are obtained using GetDataSouceInfo.

8. CreateStrategy is called to create a policy mode strategy_x according to the configuration information.

9. The ClassifyDataSouce interface of strategy_x is called to classify { vm1, vm2, …, vmn } according to { attr1, attr2, …, attrn }, generating a data source group DG1 composed of a set of data source containers { dc1, dc2, …, dcn } of different types.

10. The scheduler module generates the corresponding number of subtasks { Task1, Task2, …, TaskN } based on the configuration information.

11. The scheduler module controls the backup flow of each subtask through TaskRun.

12. Any subtask obtains a data source from the data source group DG1 through the interface ApplyDataSouce.

13. After backing up a data source, each subtask calls FreeDataSouce to release its hold on the data source.

14. When TaskRun has finished for all subtasks, the backup task is complete.
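
The flow in items 7 to 14 can be tied together in a short multithreaded C++ sketch. Only the function names (GetDataSouceInfo, ClassifyDataSouce, ApplyDataSouce, FreeDataSouce, CreateTasks, TaskRun) come from the description; their signatures, the `Attr` and `Container` types, the in-memory inventory, the mutex-based locking and the decision to have CreateTasks also wait for the subtasks are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

struct Attr { std::string host; std::string storage; };  // data source attributes
struct Container { std::vector<std::string> pending; int bn = 0; };

// Data source module. A real backup node would query the virtualization
// platform API here; this stub reads a hypothetical in-memory inventory.
Attr GetDataSouceInfo(const std::string& vm) {
    static const std::map<std::string, Attr> inventory = {
        {"vm1", {"host-a", "store-1"}}, {"vm2", {"host-a", "store-2"}},
        {"vm3", {"host-b", "store-1"}}, {"vm4", {"host-b", "store-2"}},
        {"vm5", {"host-c", "store-1"}}};
    return inventory.at(vm);
}

// Data source group module, guarded by a mutex because subtasks run in parallel.
class DataSourceGroup {
public:
    // Strategy step: host policy groups by computing node, otherwise by storage.
    // Call this before any subtask starts.
    void ClassifyDataSouce(const std::vector<std::string>& vms, bool host_policy) {
        for (const std::string& vm : vms) {
            Attr a = GetDataSouceInfo(vm);
            containers_[host_policy ? a.host : a.storage].pending.push_back(vm);
        }
    }
    // Hands out a data source from the non-empty container with the minimum BN.
    bool ApplyDataSouce(std::string& vm, std::string& key) {
        std::lock_guard<std::mutex> lock(mu_);
        Container* best = nullptr;
        for (auto& kv : containers_) {
            Container& c = kv.second;
            if (c.pending.empty()) continue;
            if (!best || c.bn < best->bn) { best = &c; key = kv.first; }
        }
        if (!best) return false;  // nothing left: the subtask should exit
        vm = best->pending.back();
        best->pending.pop_back();
        ++best->bn;
        return true;
    }
    // Releases a finished data source by decrementing its container's BN.
    void FreeDataSouce(const std::string& key) {
        std::lock_guard<std::mutex> lock(mu_);
        Container& c = containers_.at(key);
        assert(c.bn > 0);
        --c.bn;
    }
private:
    std::mutex mu_;
    std::map<std::string, Container> containers_;
};

// Scheduler module: one subtask loops until the group has nothing left.
void TaskRun(DataSourceGroup& group, std::vector<std::string>& done,
             std::mutex& done_mu) {
    std::string vm, key;
    while (group.ApplyDataSouce(vm, key)) {
        {   // Placeholder for the actual backup of `vm`.
            std::lock_guard<std::mutex> lock(done_mu);
            done.push_back(vm);
        }
        group.FreeDataSouce(key);
    }
}

// Creates n subtasks, waits for all of them, and returns what was backed up.
std::vector<std::string> CreateTasks(DataSourceGroup& group, std::size_t n) {
    std::vector<std::string> done;
    std::mutex done_mu;
    std::vector<std::thread> tasks;
    for (std::size_t i = 0; i < n; ++i)
        tasks.emplace_back(TaskRun, std::ref(group), std::ref(done),
                           std::ref(done_mu));
    for (std::thread& t : tasks) t.join();
    return done;
}
```

A caller would build a `DataSourceGroup`, call `ClassifyDataSouce` with the list of data sources and the chosen policy, and then call `CreateTasks` with the configured subtask count. Every data source is backed up exactly once, although the order in which the subtasks process them is not deterministic.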

While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the invention is defined by the claims.

Claims

1. The data backup method based on policy splitting is characterized by comprising a data source grouping process and a data source scheduling process, so that backup resources are evenly distributed to different computing nodes or storage nodes to perform data splitting, and the backup performance is improved to the greatest extent;

the data source grouping process specifically comprises the following steps:

step 104), if a storage policy is selected: classifying according to storage nodes where data sources are located;

the data source scheduling process specifically comprises the following steps:

step 208), after the sub-task Tn completes the backup, continuing to execute steps 203), 204), 205), 206), 207) until the sub-task execution is exited when no data source is available;

2. The method of claim 1, wherein the attributes of the backup data source in step 101) include storage locations and computing node locations.

3. The method for policy-based data backup according to claim 1, wherein the classifying in step 103) is specifically: data sources with the same computing node locations are assigned to the same data source container, and different computing node containers form a data source group.

4. The method for backup of data based on policy splitting according to claim 1, wherein the classifying in step 104) is specifically: data sources with the same storage location are allocated to the same data source container, and different storage node containers form a data source group.