CN114115744A - Control method and device for data recovery task, electronic equipment and storage medium - Google Patents

Control method and device for data recovery task, electronic equipment and storage medium

Publication number
CN114115744A
Authority
CN
China
Prior art keywords
task
tokens
storage system
determining
amount
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111441758.5A
Other languages
Chinese (zh)
Inventor
秦璇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN202111441758.5A
Publication of CN114115744A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/0223 User address space allocation, e.g. contiguous or non contiguous base addressing
    • G06F12/023 Free address space management
    • G06F12/0253 Garbage collection, i.e. reclamation of unreferenced memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and a device for controlling a data recovery task, electronic equipment and a storage medium. The method comprises the following steps: acquiring a first data recovery task which is being executed in a storage system in a current detection period, and determining a first task amount corresponding to the first data recovery task; inquiring the storage capacity of the storage system in the current detection period, and determining a task quantity threshold corresponding to the storage capacity; when the first task quantity is lower than the task quantity threshold value, determining a newly added second task quantity according to a difference value between the task quantity threshold value and the first task quantity; and acquiring the target token quantity meeting the second task quantity from the token bucket, and distributing the tokens of the target token quantity to a second data recovery task corresponding to the second task quantity. When the executed task amount is lower than the task amount threshold corresponding to the current storage amount, the data recovery speed is increased by increasing the task amount, and meanwhile, the number of tokens in the token bucket is dynamically set according to the business pressure level, so that the balance between business processing and data recovery can be realized.

Description

Control method and device for data recovery task, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for controlling a data recovery task, an electronic device, and a storage medium.
Background
A distributed storage system that adopts a redirect-on-write algorithm can redirect each write to a different storage unit, which improves write performance. However, when the same data is modified, deleted, or rewritten many times, each version lands at a different location on the disk, so the storage system accumulates many copies of duplicate and invalid data that occupy storage space. After running for a long time the storage system fills up quickly, most of what it stores is duplicate or invalid data, the utilization of storage space is low, and the life cycle of the storage system is short.
In the prior art, this problem is addressed by promptly releasing the storage space occupied by duplicate and invalid data, that is, by deleting that data from the disk completely. This technique is called data reclamation (garbage collection, GC), and is referred to below as data recovery. It classifies duplicate and invalid data as garbage and then uses read, write, and delete operations inside the distributed storage system to identify, split, and delete the garbage.
In the course of making the invention, the inventor found that reclaiming garbage data consumes the bandwidth and traffic of the OSD. When there is too much garbage to reclaim, the large volume of internal read, write, and delete operations competes with normal services for bandwidth and traffic, which reduces the efficiency with which the equipment processes services.
Disclosure of Invention
In order to solve the technical problems or at least partially solve the technical problems, the application provides a method and a device for controlling a data recovery task, an electronic device and a storage medium.
According to an aspect of an embodiment of the present application, there is provided a method for controlling a data recovery task, which is applied to a distributed storage system, the method including:
acquiring a first data recovery task which is being executed in a storage system in a current detection period, and determining a first task amount corresponding to the first data recovery task;
inquiring the storage capacity of the storage system in the current detection period, and determining a task quantity threshold corresponding to the storage capacity;
under the condition that the first task quantity is lower than the task quantity threshold value, determining a newly added second task quantity according to a difference value between the task quantity threshold value and the first task quantity;
and acquiring the target token quantity meeting the second task quantity from a token bucket, and distributing the tokens of the target token quantity to a second data recovery task corresponding to the second task quantity, wherein the token quantity in the token bucket is determined according to the service pressure level of the storage system in the current detection period.
Further, the determining a task amount threshold corresponding to the storage amount includes:
acquiring an upper limit of storage amount corresponding to the storage system;
calculating the ratio of the storage amount to the storage amount upper limit, and determining a target ratio range in which the ratio is located;
and determining a task quantity threshold corresponding to the target ratio range based on a first corresponding relation between a preset ratio range and the task quantity threshold.
Further, before obtaining a target number of tokens from a token bucket that satisfies the second task amount, the method further comprises:
detecting the service pressure level of the storage system in the current detection period;
and under the condition that the service pressure level is less than a preset level, adding tokens into the token bucket.
Further, the detecting the service pressure level of the storage system in the current detection period includes:
detecting the time delay of the service flow of the storage system in the current detection period;
and acquiring a time delay range in which the time delay is positioned, and determining the service pressure grade according to the time delay range.
Further, adding a new token to the token bucket when the traffic pressure level is less than a preset level includes:
under the condition that the time delay is smaller than a time delay threshold value, determining that the service pressure grade is smaller than a preset grade, and acquiring a second corresponding relation between a preset time delay range and the number of tokens;
determining the number of newly added tokens corresponding to the time delay based on the second corresponding relation;
and distributing the newly added tokens to the token bucket according to the number of the newly added tokens.
Further, after the target token number is allocated to the second data recovery task corresponding to the second task amount, the method further includes:
acquiring a service to be processed, and determining whether the token bucket has residual tokens;
and under the condition that the residual tokens exist in the token bucket, distributing the residual tokens to the to-be-processed service so as to enable the storage system to process the to-be-processed service.
Further, the method further comprises:
under the condition that no residual token exists in the token bucket, inquiring the waiting time corresponding to the service to be processed;
and controlling the storage system to process the service to be processed under the condition of reaching the waiting time.
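The handling of a to-be-processed service described in the last two items can be sketched as follows. This is a minimal illustration rather than the application's implementation: the function name `schedule_service`, the assumption that a service consumes exactly one remaining token, and the representation of the waiting time in seconds are all hypothetical.

```python
from typing import Tuple


def schedule_service(remaining_tokens: int, waiting_time_s: float) -> Tuple[int, float]:
    """Return (tokens left in the bucket, delay before the service is processed).

    Assumption: one remaining token is enough to admit one service.
    """
    if remaining_tokens > 0:
        # A remaining token is handed to the service, which is processed at once.
        return remaining_tokens - 1, 0.0
    # No token is left: the service is processed once its waiting time is reached.
    return 0, waiting_time_s
```

With two tokens left the service runs immediately and one token remains; with an empty bucket the service is deferred by its waiting time.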
According to another aspect of the embodiments of the present application, there is also provided a control apparatus for a data recovery task, the control apparatus being deployed in a distributed storage system, the control apparatus including:
an acquisition module, configured to acquire a first data recovery task that is being executed in the storage system in a current detection period, and to determine a first task amount corresponding to the first data recovery task;
the query module is used for querying the stored data volume of the storage system in the current detection period and determining a task volume threshold corresponding to the stored data volume;
the determining module is used for determining a newly added second task amount according to a difference value between the task amount threshold and the first task amount under the condition that the first task amount is lower than the task amount threshold;
and the distribution module is used for acquiring the number of target tokens meeting the second task quantity from a token bucket and distributing the tokens of the number of target tokens to a second data recovery task corresponding to the second task quantity, wherein the number of tokens in the token bucket is determined according to the service pressure level of the storage system in the current detection period.
Further, the determining module is further configured to obtain an upper storage limit corresponding to the storage system; calculating the ratio of the storage amount to the storage amount upper limit, and determining a target ratio range in which the ratio is located; and determining a task quantity threshold corresponding to the target ratio range based on a first corresponding relation between a preset ratio range and the task quantity threshold.
Further, the control device for the data recovery task further includes:
the detection module is used for detecting the service pressure grade of the storage system in the current detection period;
and the distribution module is used for adding tokens into the token bucket under the condition that the service pressure level is less than a preset level.
Further, the detection module is further configured to detect a time delay of the traffic flow of the storage system in a current detection period; and acquiring a time delay range in which the time delay is positioned, and determining the service pressure grade according to the time delay range.
Further, the distribution module is further configured to determine that the service pressure level is smaller than a preset level and obtain a second corresponding relationship between a preset time delay range and the number of tokens, when the time delay is smaller than a time delay threshold; determining the number of newly added tokens corresponding to the time delay based on the second corresponding relation; and distributing the newly added tokens to the token bucket according to the number of the newly added tokens.
Further, the control device for the data recovery task further includes:
the query module is used for acquiring the service to be processed and determining whether the token bucket has residual tokens; and under the condition that the residual tokens exist in the token bucket, distributing the residual tokens to the to-be-processed service so as to enable the storage system to process the to-be-processed service.
Further, the query module is further configured to query a waiting time corresponding to the to-be-processed service when no remaining token exists in the token bucket; and controlling the storage system to process the service to be processed under the condition of reaching the waiting time.
According to another aspect of the embodiments of the present application, there is also provided a storage medium including a stored program that executes the above steps when the program is executed.
According to another aspect of the embodiments of the present application, there is also provided an electronic apparatus, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; wherein: a memory for storing a computer program; a processor for executing the steps of the method by running the program stored in the memory.
Embodiments of the present application also provide a computer program product containing instructions, which when run on a computer, cause the computer to perform the steps of the above method.
Compared with the prior art, the technical solution provided by the embodiments of the application has the following advantages. The task amount threshold of the data recovery task is determined from the storage amount of the device. When the amount of tasks currently being executed is lower than that threshold, the task amount is increased, which improves data recovery efficiency, accelerates the release of storage space, and preserves the service efficiency of the device. At the same time, the number of tokens in the token bucket is set dynamically according to the service pressure level, so that service processing and data recovery are balanced and data is reclaimed as quickly as possible while service performance is preserved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and together with the description, serve to explain the principles of the application.
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Those skilled in the art can obtain other drawings from these drawings without inventive effort.
Fig. 1 is a flowchart of a method for controlling a data recovery task according to an embodiment of the present application;
Fig. 2 is a flowchart of a method for controlling a data recovery task according to another embodiment of the present application;
Fig. 3 is a block diagram of a control device for a data recovery task according to an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
To make the purpose, technical solutions, and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, but not all, embodiments of the present application; the illustrative embodiments and their descriptions explain the present application and do not limit it. All other embodiments that a person skilled in the art can derive from the embodiments given herein without creative effort fall within the protection scope of the present application.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another similar entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
At present, data recovery technology mainly aims to reclaim garbage data on the disk in time, and garbage is mostly reclaimed at a fixed rate, without dynamic regulation or flow control. Under this strategy the GC reclaims at the same rate no matter how much garbage there is. (1) If the rate is set low and the amount of garbage is large, reclamation cannot keep up; while services are running, the disk space still fills up quickly and the services can no longer read or write. (2) If the rate is set high, GC traffic, which consists of internal read, write, and delete operations on the disk, consumes the disk traffic and bandwidth that the client's service traffic also needs. Resource preemption then occurs: GC traffic affects the client's service traffic and may degrade service performance, and in severe cases the disk has no resources left to process service traffic, so service throughput drops to zero.
Based on this, the embodiment of the application provides a method and a device for controlling a data recovery task, an electronic device and a storage medium. The method provided by the embodiment of the invention can be applied to any required electronic equipment, for example, the electronic equipment can be electronic equipment such as a server and a terminal, and the method is not particularly limited herein, and is hereinafter simply referred to as electronic equipment for convenience in description.
According to an aspect of embodiments of the present application, a method embodiment of a method for controlling a data recovery task is provided. Fig. 1 is a flowchart of a method for controlling a data recovery task according to an embodiment of the present application, and as shown in fig. 1, the method includes:
Step S11, acquiring a first data recovery task currently executed in the storage system in the current detection period, and determining a first task amount corresponding to the first data recovery task.
The method provided by the embodiment of the application is applied to a distributed storage system, wherein a timer is arranged in the storage system, and the data recovery task is periodically triggered through the timer. When the timer triggers the data recovery task each time, the storage system queries a first data recovery task which is being executed in the current detection period, counts the task amount corresponding to the first data recovery task, and determines the task amount as the first task amount.
Step S12, inquiring the storage capacity of the storage system in the current detection period, and determining a task quantity threshold corresponding to the storage capacity.
In the embodiment of the application, the amount of duplicate and invalid data grows as the amount of stored data grows, so more data recovery tasks need to be started to reclaim that data in time. The task amount threshold is therefore determined from the storage amount of the storage system, so that the number of data recovery tasks matches the amount of duplicate and invalid data as closely as possible. In addition, the more tasks there are, the larger the data recovery traffic and the faster the recovery rate. As a result, invalid/duplicate data objects in the storage system are reclaimed quickly and storage space is released in time to store more valid data.
Specifically, in step S12, determining a task amount threshold corresponding to the storage amount includes the following steps A1-A3:
Step A1, acquiring the upper limit of the storage amount corresponding to the storage system.
Step A2, calculating the ratio of the storage amount to the storage amount upper limit, and determining the target ratio range in which the ratio is located.
Step A3, based on the first corresponding relationship between the preset ratio range and the task amount threshold, determining the task amount threshold corresponding to the target ratio range.
In the embodiment of the application, the storage amount upper limit corresponding to the storage system is firstly obtained, then the ratio of the storage amount to the storage amount upper limit is calculated, then the target ratio range where the ratio is located is determined, meanwhile, the corresponding relation between the preset ratio range and the task amount threshold is obtained, and then the task amount threshold corresponding to the target ratio range is determined based on the corresponding relation.
As an example, as shown in Table 1, the task amount threshold n is set in segments according to the ratio range, and each task reclaims only one invalid/duplicate data object at a time. n is the maximum number of tasks that can currently run at the same time: even if n or more data objects are waiting for recovery, at most n can be reclaimed simultaneously. When m (m < n) invalid/duplicate data objects are waiting for recovery, only m objects are being reclaimed at the same time and the remaining n-m tasks are idle. The number of concurrent tasks in Table 1 grows by a multiple of b from one segment to the next, where b is the threshold of the lowest segment: the larger the storage ratio, the more duplicate and invalid data there is, and the more tasks must be started to reclaim it in time.
Preset ratio range a      Task amount threshold n
a < 20%                   n = b
20% ≤ a < 30%             n = 2 × b
30% ≤ a < 40%             n = 3 × b
40% ≤ a < 50%             n = 4 × b
50% ≤ a < 60%             n = 5 × b
60% ≤ a < 70%             n = 6 × b
70% ≤ a < 80%             n = 7 × b
TABLE 1
In the embodiment of the application, when the usage rate of the storage space is gradually increased, the amount of the stored data is increased, and the amount of invalid/repeated data is also increased, so that the task amount threshold of the data recovery task is increased, invalid data can be timely recovered, and the storage space is quickly released.
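Steps A1-A3 and the lookup in Table 1 can be sketched as below. This is a minimal sketch under stated assumptions: the names `RATIO_BOUNDS` and `task_threshold` are hypothetical, and because Table 1 lists no row for ratios of 80% or more, the sketch simply reuses the last listed row for them, which is an assumption and not something the application states.

```python
from bisect import bisect_right

# Lower bounds of the 20%-70% rows of Table 1, as fractions of the upper limit.
RATIO_BOUNDS = [0.2, 0.3, 0.4, 0.5, 0.6, 0.7]


def task_threshold(stored: int, capacity: int, b: int) -> int:
    """Return the task amount threshold n for the current storage ratio a."""
    if capacity <= 0 or b <= 0:
        raise ValueError("capacity and b must be positive")
    ratio = stored / capacity  # step A2: storage amount / storage amount upper limit
    # bisect_right counts how many lower bounds are <= ratio, so
    # a < 20% gives 0 (n = b) and 70% <= a < 80% gives 6 (n = 7 * b).
    # Assumption: ratios of 80% or more reuse the last row of Table 1.
    multiplier = bisect_right(RATIO_BOUNDS, ratio) + 1
    return multiplier * b  # step A3: threshold for the target ratio range
```

For instance, with b = 4 a system that is 25% full gets a threshold of 8 concurrent tasks, and one that is 75% full gets 28.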
Step S13, in the case that the first task amount is lower than the task amount threshold, determining a second task amount to be newly added according to a difference between the task amount threshold and the first task amount.
In the embodiment of the application, after the task amount threshold n is determined, the first task amount is compared with n. If the first task amount is lower than the threshold, the second task amount to be added is determined to be n-m, where m is the number of data objects currently being reclaimed (the first task amount). Since each task reclaims only one invalid/duplicate data object at a time, n-m is the number of tasks that can still be added; the idle tasks are thus put to use promptly, which improves data recovery efficiency.
In addition, a result is obtained after the second task amount is added. If the result shows that adding the task amount failed, no data is currently waiting for recovery and the process ends directly; if the result shows that adding the task amount succeeded, the count of data objects currently being reclaimed is increased accordingly.
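The calculation in step S13 can be illustrated with a short sketch. The function name `plan_new_tasks` and the `waiting` argument (the number of data objects queued for recovery) are hypothetical; returning 0 when nothing is waiting stands in for the "failed" result described above.

```python
def plan_new_tasks(threshold: int, running: int, waiting: int) -> int:
    """Return how many data recovery tasks to add in this detection period.

    threshold -- task amount threshold n
    running   -- first task amount m (objects being reclaimed now)
    waiting   -- objects still waiting for recovery (assumed input)
    """
    if running >= threshold:
        return 0  # the first task amount is not below the threshold
    idle_slots = threshold - running  # the second task amount n - m
    # One task reclaims one object, so never add more tasks than waiting objects.
    return min(idle_slots, max(waiting, 0))
```

With n = 8, m = 3 and ten objects waiting, five tasks are added; with only two objects waiting, two tasks are added and three task slots stay idle.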
Step S14, obtaining a target token number satisfying the second task amount from the token bucket, and allocating the tokens of the target token number to the second data recovery task corresponding to the second task amount, where the token number in the token bucket is determined according to the traffic pressure level of the storage system in the current detection period.
In the embodiment of the application, after the newly added second task amount is determined, the target token number matched with the second task amount is obtained from the token bucket, and the tokens of the target token number are distributed to the second data recovery task corresponding to the second task amount, so that the second data recovery task performs data recovery operations according to the distributed tokens. The number of tokens in the token bucket depends on the service pressure level of the storage system in the current detection period: the higher the service pressure level, the fewer tokens the bucket holds, and the lower the service pressure level, the more tokens it holds.
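Step S14 can be sketched with a simple counter-based token bucket. The class `TokenBucket` and the function `start_tasks` are hypothetical names, and two behaviours are assumptions that the application does not spell out: each data recovery task needs one token, and when the bucket holds fewer tokens than requested only the funded tasks are started.

```python
class TokenBucket:
    """Counter of tokens shared by data recovery tasks (hypothetical class)."""

    def __init__(self, tokens: int = 0) -> None:
        self.tokens = tokens

    def add(self, count: int) -> None:
        """Add tokens, for example after a low service pressure level is detected."""
        self.tokens += max(count, 0)

    def acquire(self, wanted: int) -> int:
        """Take up to `wanted` tokens and return the number actually granted."""
        granted = min(max(wanted, 0), self.tokens)
        self.tokens -= granted
        return granted


def start_tasks(bucket: TokenBucket, second_task_amount: int) -> int:
    """Return how many new data recovery tasks receive a token."""
    # Assumption: one token per task; tasks without a token are not started.
    return bucket.acquire(second_task_amount)
```

Because the bucket is refilled according to the service pressure level, a busy system leaves few tokens and therefore starts few additional recovery tasks, even when the threshold would allow more.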
In the embodiment of the application, after all invalid/duplicate data in the current detection period has been reclaimed, the timer is restarted and control enters the next detection period. When the timer fires and data waiting for recovery is detected again, the flow starts again and continues until all data objects waiting for recovery have been reclaimed.
According to the method and the device, the task amount threshold of the data recovery task is determined from the storage amount of the device. When the amount of tasks currently being executed is lower than that threshold, the task amount is increased, which improves data recovery efficiency, accelerates the release of storage space, and preserves the service efficiency of the device. At the same time, the number of tokens in the token bucket is set dynamically according to the service pressure level, so that service processing and data recovery are balanced and data is reclaimed as quickly as possible while service performance is preserved.
In the embodiment of the present application, before obtaining the target number of tokens satisfying the second task amount from the token bucket, as shown in fig. 2, the method includes the following steps:
Step S21, detecting the service pressure level of the storage system in the current detection period.
In the embodiment of the application, the detection of the traffic pressure level of the storage system in the current detection period comprises the following steps B1-B2:
Step B1, detecting the time delay of the service flow of the storage system in the current detection period.
Step B2, acquiring the time delay range in which the time delay is located, and determining the service pressure level according to the time delay range.
In the embodiment of the present application, the time delay t of the service traffic reflects how busy the storage system currently is and how much service pressure it is under. The data recovery traffic is increased or decreased on this basis (that is, the second task amount of data recovery tasks is increased or decreased accordingly), which ensures that service performance does not degrade while data is reclaimed as fast as possible.
As an example, when the service pressure level of the storage system is level one, 4 × b tokens (b is an integer greater than 0) may be added to the token bucket; when the service pressure level is level two, 2 × b tokens are added; when it is level three, only b tokens are added; and when it is level four, no tokens are added.
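The example above amounts to a small lookup from pressure level to token count, sketched here. The names `TOKENS_PER_LEVEL` and `tokens_to_add` are hypothetical; the multipliers 4, 2, 1 and 0 are the ones given in the example.

```python
# Multiplier of b added to the token bucket for each service pressure level.
TOKENS_PER_LEVEL = {1: 4, 2: 2, 3: 1, 4: 0}


def tokens_to_add(pressure_level: int, b: int) -> int:
    """Return the number of tokens to add for the given service pressure level."""
    if b <= 0:
        raise ValueError("b must be an integer greater than 0")
    if pressure_level not in TOKENS_PER_LEVEL:
        raise ValueError("pressure level must be 1, 2, 3 or 4")
    return TOKENS_PER_LEVEL[pressure_level] * b
```

With b = 5, level one adds 20 tokens and level four adds none, so data recovery is throttled hardest when the service pressure is highest.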
Step S22: add tokens to the token bucket when the service pressure level is less than the preset level.
In the embodiment of the present application, adding tokens to the token bucket in step S22 when the service pressure level is less than the preset level includes the following steps C1-C3:
Step C1: when the time delay is smaller than the time delay threshold, determine that the service pressure level is less than the preset level, and obtain a second correspondence between preset time delay ranges and numbers of tokens.
Step C2: determine the number of newly added tokens corresponding to the time delay based on the second correspondence.
Step C3: add the new tokens to the token bucket according to the number of newly added tokens.
With the method provided by the embodiment of the present application, the service pressure level of the current system is judged from the op latency of the OSD. When the service pressure level is low, the garbage collection (GC) traffic is increased, so that garbage data on the disk is recovered in time and disk space is released quickly; when the service pressure level is high, the GC traffic is reduced, so that disk bandwidth is not excessively occupied and service performance is ensured.
As an example, if the time delay t < 0.5 s, the service pressure level of the storage system is level one (lowest). If 0.5 s ≤ t < 1 s, the service pressure level is level two; if 1 s ≤ t < 1.5 s, the service pressure level is level three; if t ≥ 1.5 s, the service pressure level is level four (highest), and it is not appropriate to process more data recovery tasks at this time.
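The mapping from time delay to service pressure level in this example can be sketched as follows. This is an illustrative sketch only: the function name `pressure_level` is an assumption, and the last range is read as t ≥ 1.5 s, which is what the three preceding ranges imply.

```python
def pressure_level(delay_s: float) -> int:
    """Map a service-traffic time delay in seconds to a pressure level 1..4."""
    if delay_s < 0:
        raise ValueError("delay must be non-negative")
    if delay_s < 0.5:
        return 1  # lowest pressure
    if delay_s < 1.0:
        return 2
    if delay_s < 1.5:
        return 3
    return 4      # highest pressure: no further recovery tasks should be added
```

Each boundary value belongs to the higher level, so a delay of exactly 0.5 s is level two and a delay of exactly 1.5 s is level four.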
In an embodiment of the present application, after the tokens of the target token number are allocated to the second data recovery task corresponding to the second task amount, the method further includes the following steps D1-D2:
Step D1: obtain the pending service and determine whether the token bucket has remaining tokens.
Step D2: when remaining tokens exist in the token bucket, allocate the remaining tokens to the pending service so that the storage system processes the pending service.
In the embodiment of the present application, after the token adding process is finished, it is determined whether the token bucket currently has remaining tokens; if it does, the currently waiting service is issued to the storage system for processing.
In an embodiment of the present application, the method further includes the following steps E1-E2:
Step E1: when no remaining tokens exist in the token bucket, query the waiting time corresponding to the pending service.
Step E2: when the waiting time is reached, control the storage system to process the pending service.
In the embodiment of the present application, if no service is currently waiting, all garbage recovery tasks have been issued to the storage system and the process ends. If a service is waiting but no tokens remain in the token bucket, it is determined whether the pending service has reached its waiting time: if not, it continues to wait; if so, it is forcibly issued to the storage system for processing.
With the method provided by the embodiment of the present application, when no tokens remain in the token bucket, the added waiting time lets the storage system process other services first during that period, which relieves service pressure to some extent. At the same time, forcing issuance once the waiting time is reached avoids the situation in which the data traffic of garbage recovery tasks cannot be issued for a long time and garbage data cannot be recovered in time.
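Steps D1-D2 and E1-E2 can be sketched together as one dispatch pass over the pending services. This is a hedged illustration rather than the method of the application: the names `PendingService` and `dispatch`, the per-service `max_wait` field, and the rule that one service consumes one token are all assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass
class PendingService:
    name: str
    enqueued_at: float  # time the service started waiting, in seconds
    max_wait: float     # waiting time after which issuance is forced, in seconds


def dispatch(pending, tokens_left, now):
    """Return (issued, still_waiting, tokens_left) after one dispatch pass.

    Assumption: each pending service consumes exactly one token.
    """
    issued, still_waiting = [], []
    for svc in pending:
        if tokens_left > 0:
            # Step D2: a remaining token exists, so the service is issued.
            tokens_left -= 1
            issued.append(svc.name)
        elif now - svc.enqueued_at >= svc.max_wait:
            # Step E2: no token, but the waiting time is reached; force issuance.
            issued.append(svc.name)
        else:
            # No token and waiting time not reached; keep waiting.
            still_waiting.append(svc.name)
    return issued, still_waiting, tokens_left
```

A caller would run such a pass once per detection period after tokens have been added, carrying the services in `still_waiting` over to the next period.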
This flow judges the service pressure of the current system from the time delay of the data traffic of the storage system. When the service pressure is low, the data recovery traffic is increased, so that garbage data on the disk is recovered in time and disk space is released quickly; when the service pressure is high, the data recovery traffic is reduced, so that disk bandwidth is not excessively occupied and service performance is ensured.
Fig. 3 is a block diagram of a control device for a data recovery task according to an embodiment of the present application. The device may be implemented as part or all of an electronic device through software, hardware, or a combination of the two. As shown in Fig. 3, the device includes:
An obtaining module 31, configured to obtain a first data recovery task that is being executed in the storage system in a current detection period, and determine a first task amount corresponding to the first data recovery task.
A query module 32, configured to query the stored data amount of the storage system in the current detection period, and determine a task amount threshold corresponding to the stored data amount.
A determining module 33, configured to determine, when the first task amount does not satisfy the task amount threshold, a newly added second task amount according to a difference between the task amount threshold and the first task amount.
An allocating module 34, configured to obtain a target token number satisfying the second task amount from the token bucket, and allocate the tokens of the target token number to the second data recovery task corresponding to the second task amount, where the number of tokens in the token bucket is determined according to the service pressure level of the storage system in the current detection period.
In this embodiment of the present application, the determining module 33 is configured to obtain the upper limit of the storage amount corresponding to the storage system, calculate the ratio of the storage amount to that upper limit, determine the target ratio range in which the ratio falls, and determine the task amount threshold corresponding to the target ratio range based on a first correspondence between preset ratio ranges and task amount thresholds.
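The threshold lookup and the computation of the second task amount can be sketched as below. The ratio boundaries and threshold values are hypothetical placeholders, since this passage does not specify the first correspondence; the function names are likewise assumptions made for illustration.

```python
import bisect

# Hypothetical first correspondence: ratio ranges [0, 0.5), [0.5, 0.7),
# [0.7, 0.9) and [0.9, 1.0] map to task amount thresholds 2, 4, 8 and 16.
RATIO_UPPER_BOUNDS = [0.5, 0.7, 0.9]
TASK_THRESHOLDS = [2, 4, 8, 16]


def task_threshold(stored_amount: float, upper_limit: float) -> int:
    """Look up the task amount threshold for the current storage ratio."""
    if upper_limit <= 0:
        raise ValueError("upper_limit must be positive")
    ratio = stored_amount / upper_limit
    # bisect_right places a ratio equal to a boundary in the higher range.
    index = bisect.bisect_right(RATIO_UPPER_BOUNDS, ratio)
    return TASK_THRESHOLDS[index]


def second_task_amount(first_task_amount: int, threshold: int) -> int:
    """Newly added task amount: threshold minus current amount, never negative."""
    return max(threshold - first_task_amount, 0)
```

Under these placeholder values, a system that is 80% full with 3 recovery tasks running would have a threshold of 8 and a second task amount of 5.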
In an embodiment of the present application, the control device for the data recovery task further includes:
A detection module, configured to detect the service pressure level of the storage system in the current detection period;
An allocation module, configured to add tokens to the token bucket when the service pressure level is less than the preset level.
In the embodiment of the present application, the detection module is configured to detect the time delay of the service traffic of the storage system in the current detection period, obtain the time delay range in which the time delay falls, and determine the service pressure level according to that range.
In the embodiment of the present application, the allocation module is configured to: when the time delay is smaller than the time delay threshold, determine that the service pressure level is less than the preset level and obtain a second correspondence between preset time delay ranges and numbers of tokens; determine the number of newly added tokens corresponding to the time delay based on the second correspondence; and add the new tokens to the token bucket according to the number of newly added tokens.
In this embodiment of the present application, the control device for the data recovery task further includes a query module, configured to obtain the pending service and determine whether the token bucket has remaining tokens, and, when remaining tokens exist in the token bucket, allocate the remaining tokens to the pending service so that the storage system processes the pending service.
In the embodiment of the present application, the query module is further configured to query the waiting time corresponding to the pending service when no remaining tokens exist in the token bucket, and to control the storage system to process the pending service when the waiting time is reached.
An embodiment of the present application further provides an electronic device. As shown in Fig. 4, the electronic device may include a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, where the processor 1501, the communication interface 1502 and the memory 1503 communicate with each other through the communication bus 1504.
The memory 1503 is configured to store a computer program;
The processor 1501 is configured to implement the steps of the above embodiments when executing the computer program stored in the memory 1503.
The communication bus mentioned for the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, the bus is drawn as a single thick line in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the terminal and other equipment.
The Memory may include a Random Access Memory (RAM) or a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. Optionally, the memory may also be at least one memory device located remotely from the processor.
The processor may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In another embodiment provided by the present application, a computer-readable storage medium is further provided, in which instructions are stored, and when the instructions are executed on a computer, the computer is caused to execute the method for controlling a data recovery task in any one of the above embodiments.
In yet another embodiment provided by the present application, there is also provided a computer program product containing instructions which, when run on a computer, cause the computer to perform the method for controlling a data recovery task as described in any of the above embodiments.
The above embodiments may be implemented wholly or partially by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented wholly or partially in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a network of computers, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wire (e.g., coaxial cable, optical fiber, digital subscriber line) or wirelessly (e.g., infrared, radio, microwave). The computer-readable storage medium may be any available medium that a computer can access, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., floppy disk, hard disk, magnetic tape), an optical medium (e.g., DVD), or a semiconductor medium (e.g., Solid State Disk), among others.
The above description is only for the preferred embodiment of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application are included in the protection scope of the present application.
The above description is merely exemplary of the present application and is presented to enable those skilled in the art to understand and practice the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A control method for a data recovery task, applied to a distributed storage system, characterized by comprising the following steps:
acquiring a first data recovery task which is being executed in a storage system in a current detection period, and determining a first task amount corresponding to the first data recovery task;
querying the storage amount of the storage system in the current detection period, and determining a task quantity threshold corresponding to the storage amount;
under the condition that the first task quantity is lower than the task quantity threshold value, determining a newly added second task quantity according to a difference value between the task quantity threshold value and the first task quantity;
and acquiring the target token quantity meeting the second task quantity from a token bucket, and distributing the tokens of the target token quantity to a second data recovery task corresponding to the second task quantity, wherein the token quantity in the token bucket is determined according to the service pressure level of the storage system in the current detection period.
2. The method of claim 1, wherein determining the task quantity threshold corresponding to the storage amount comprises:
acquiring an upper limit of storage amount corresponding to the storage system;
calculating the ratio of the storage amount to the storage amount upper limit, and determining a target ratio range in which the ratio is located;
and determining a task quantity threshold corresponding to the target ratio range based on a first corresponding relation between a preset ratio range and the task quantity threshold.
3. The method of claim 1, wherein prior to obtaining a target number of tokens from a token bucket that satisfies the second task amount, the method further comprises:
detecting the service pressure level of the storage system in the current detection period;
and under the condition that the service pressure level is less than a preset level, adding tokens into the token bucket.
4. The method of claim 3, wherein the detecting the traffic pressure level of the storage system in a current detection period comprises:
detecting the time delay of the service flow of the storage system in the current detection period;
and acquiring a time delay range in which the time delay is positioned, and determining the service pressure grade according to the time delay range.
5. The method of claim 4, wherein adding new tokens to the token bucket if the traffic pressure level is less than a preset level comprises:
under the condition that the time delay is smaller than a time delay threshold value, determining that the service pressure grade is smaller than a preset grade, and acquiring a second corresponding relation between a preset time delay range and the number of tokens;
determining the number of newly added tokens corresponding to the time delay based on the second corresponding relation;
and distributing the newly added tokens to the token bucket according to the number of the newly added tokens.
6. The method of claim 1, wherein after assigning the target number of tokens to a second data reclamation task corresponding to the second amount of tasks, the method further comprises:
acquiring a service to be processed, and determining whether the token bucket has residual tokens;
and under the condition that the residual tokens exist in the token bucket, distributing the residual tokens to the to-be-processed service so as to enable the storage system to process the to-be-processed service.
7. The method of claim 6, further comprising:
under the condition that no residual token exists in the token bucket, inquiring the waiting time corresponding to the service to be processed;
and controlling the storage system to process the service to be processed under the condition of reaching the waiting time.
8. A control apparatus for a data recovery task, the control apparatus being deployed in a distributed storage system, the control apparatus comprising:
an acquisition module, used for acquiring a first data recovery task which is being executed in the storage system in a current detection period and determining a first task amount corresponding to the first data recovery task;
the query module is used for querying the stored data volume of the storage system in the current detection period and determining a task volume threshold corresponding to the stored data volume;
the determining module is used for determining a newly added second task amount according to a difference value between the task amount threshold and the first task amount under the condition that the first task amount does not meet the task amount threshold;
and the distribution module is used for acquiring the number of target tokens meeting the second task quantity from a token bucket and distributing the tokens of the number of target tokens to a second data recovery task corresponding to the second task quantity, wherein the number of tokens in the token bucket is determined according to the service pressure level of the storage system in the current detection period.
9. A storage medium, characterized in that the storage medium comprises a stored program, wherein the program is operative to perform the method steps of any of the preceding claims 1 to 7.
10. An electronic device, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus; wherein:
a memory for storing a computer program;
a processor for performing the method steps of any of claims 1-7 by executing a program stored on a memory.
CN202111441758.5A 2021-11-30 2021-11-30 Control method and device for data recovery task, electronic equipment and storage medium Pending CN114115744A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111441758.5A CN114115744A (en) 2021-11-30 2021-11-30 Control method and device for data recovery task, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114115744A true CN114115744A (en) 2022-03-01

Family

ID=80368027

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111441758.5A Pending CN114115744A (en) 2021-11-30 2021-11-30 Control method and device for data recovery task, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114115744A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190207887A1 (en) * 2017-12-28 2019-07-04 Facebook, Inc. Techniques for message deduplication
CN109977032A (en) * 2017-12-28 2019-07-05 北京忆恒创源科技有限公司 Junk data recycling and control method and its device
CN108804043A (en) * 2018-06-26 2018-11-13 郑州云海信息技术有限公司 Distributed block storage system bandwidth traffic control method, device, equipment and medium
CN109213555A (en) * 2018-08-16 2019-01-15 北京交通大学 A kind of resource dynamic dispatching method of Virtual desktop cloud
CN112350953A (en) * 2019-08-07 2021-02-09 亿度慧达教育科技(北京)有限公司 Flow limiting method and device, electronic equipment and computer readable storage medium
CN111813342A (en) * 2020-07-14 2020-10-23 济南浪潮数据技术有限公司 Data recovery method, device, equipment and computer readable storage medium
CN112162937A (en) * 2020-09-30 2021-01-01 深圳市时创意电子有限公司 Data recovery method and device for memory chip, computer equipment and storage medium
CN112306415A (en) * 2020-11-02 2021-02-02 成都佰维存储科技有限公司 GC flow control method and device, computer readable storage medium and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117376373A (en) * 2023-12-07 2024-01-09 新华三技术有限公司 Metadata operation request processing method, device, equipment and storage medium
CN117376373B (en) * 2023-12-07 2024-02-23 新华三技术有限公司 Metadata operation request processing method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination