CN113434344A - File storage method and device, computing equipment and computer storage medium - Google Patents

Info

Publication number
CN113434344A
Authority
CN
China
Prior art keywords
backup
file
file data
backup server
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110830984.6A
Other languages
Chinese (zh)
Inventor
刘禹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
MIGU Digital Media Co Ltd
Original Assignee
Migu Cultural Technology Co Ltd
China Mobile Communications Group Co Ltd
MIGU Digital Media Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Migu Cultural Technology Co Ltd, China Mobile Communications Group Co Ltd, MIGU Digital Media Co Ltd
Priority to CN202110830984.6A
Publication of CN113434344A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/07 Responding to the occurrence of a fault, e.g. fault tolerance
    • G06F11/14 Error detection or correction of the data by redundancy in operation
    • G06F11/1402 Saving, restoring, recovering or retrying
    • G06F11/1446 Point-in-time backing up or restoration of persistent data
    • G06F11/1448 Management of the data involved in backup or backup restore
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Quality & Reliability (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a file storage method comprising the following steps: monitoring performance indicators of a backup server in real time, and calculating a health score of the backup server from the performance indicators, wherein the performance indicators comprise one or more of CPU utilization, memory utilization, network utilization, and disk utilization; determining the number of backup threads to be started according to the health score of the backup server; and starting that number of backup threads, acquiring backup file data from network storage, and backing up the backup file data to the backup server. The file storage method significantly reduces the pressure on the metadata database of the backup server and greatly improves backup efficiency.

Description

File storage method and device, computing equipment and computer storage medium
Technical Field
The invention relates to the field of file storage, in particular to a file storage method, a file storage device, computing equipment and a computer storage medium.
Background
With the development of the internet industry and the rise of mobile terminal technology, business data such as pictures, electronic books, and electronic documents has grown exponentially. It has become normal for an enterprise to hold hundreds of millions or even billions of small files of service data (a small file generally refers to a single file with a size of several tens of kilobytes (KB) to several megabytes (MB)). Enterprises also depend on their data ever more strongly, so effective offline backup of massive numbers of small files has become a difficult problem that urgently needs to be solved.
In the prior art, massive numbers of small files are mainly backed up as follows: backup client software is installed on a production host with a network file system, and is used to initiate the subsequent backup strategy and transmit backup data; the backup server creates a standard file backup strategy comprising full backup and incremental backup; when a full backup is initiated, the backup client scans all files in the directory to be backed up and then backs them up one by one in a single thread; each time a file is successfully backed up, the backup server creates a piece of metadata in its metadata database, the metadata comprising the backup file name, the backup client, the backup time, and an md5 check code.
However, in practice, the inventor found that this backup scheme has the following disadvantages when the number of files grows from millions to hundreds of millions or even billions: data must be transmitted twice for each backup; a single thread backs up the files one by one, so the backup takes too long; and one piece of metadata is generated in the metadata database of the backup server for every file backed up, which increases the pressure on the backup server and reduces backup efficiency.
Disclosure of Invention
In view of the above, the present invention has been made to provide a file storage method, apparatus, computing device and computer storage medium that overcome or at least partially address the above-mentioned problems.
According to an aspect of the present invention, there is provided a file storage method including:
monitoring the performance index of the backup server in real time, and calculating the health score of the backup server according to the performance index; wherein the performance indicators comprise one or more of the following indicators: CPU utilization rate, memory utilization rate, network utilization rate and disk utilization rate;
determining the number of backup threads needing to be started according to the health score of the backup server;
and starting the corresponding number of backup threads according to the number of backup threads to be started, acquiring the backup file data from the network storage, and backing up the backup file data to the backup server.
Optionally, the file storage method further includes: mounting the network storage on the backup server in read-only mode using optimized mounting parameters.
Optionally, monitoring the performance index of the backup server in real time further includes: reading kernel files of the backup server through an acquisition thread to obtain the performance index of the backup server.
Optionally, backing up the backup file data to the backup server further includes: grouping the backup file data to obtain a plurality of backup file data groups; each backup thread compresses and packages the backup file data groups one at a time, each group into a single file, and backs that file up to the backup server.
Optionally, grouping the backup file data to obtain a plurality of backup file data groups further includes: creating a data index table and a tag table, scanning the current file system, and cutting the data index table into a plurality of sub data index tables to obtain the plurality of backup file data groups.
Optionally, compressing and packaging the backup file data groups into single files and backing them up to the backup server further includes: checking the tag table, determining the backup file data groups that need to be backed up, and, based on the sub data index table of each such group, compressing and packaging the groups one by one into single files that are backed up to the backup server.
Optionally, the backup file data is small file data, and the network storage is NAS storage.
According to another aspect of the present invention, there is provided a file storage apparatus including:
the monitoring module is suitable for monitoring the performance index of the backup server in real time and calculating the health score of the backup server according to the performance index; wherein the performance indicators comprise one or more of the following indicators: CPU utilization rate, memory utilization rate, network utilization rate and disk utilization rate;
the determining module is suitable for determining the number of backup threads needing to be started according to the health score of the backup server; and
the starting module is adapted to start the corresponding number of backup threads according to the number of backup threads to be started, acquire the backup file data from the network storage, and back up the backup file data to the backup server.
Optionally, the monitoring module is further adapted to: mount the network storage on the backup server in read-only mode using optimized mounting parameters.
Optionally, the monitoring module is further adapted to: read kernel files of the backup server through an acquisition thread to obtain the performance index of the backup server.
Optionally, the starting module is further adapted to: group the backup file data to obtain a plurality of backup file data groups; each backup thread compresses and packages the backup file data groups one at a time, each group into a single file, and backs that file up to the backup server.
Optionally, the starting module is further adapted to: create a data index table and a tag table, scan the current file system, and cut the data index table into a plurality of sub data index tables to obtain the plurality of backup file data groups.
Optionally, the starting module is further adapted to: check the tag table, determine the backup file data groups that need to be backed up, and, based on the sub data index table of each such group, compress and package the groups one by one into single files that are backed up to the backup server.
Optionally, the backup file data is small file data, and the network storage is NAS storage.
According to yet another aspect of the present invention, there is provided a computing device comprising: the system comprises a processor, a memory, a communication interface and a communication bus, wherein the processor, the memory and the communication interface complete mutual communication through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction enables the processor to execute the operation corresponding to the file storage method.
According to still another aspect of the present invention, a computer storage medium is provided, in which at least one executable instruction is stored, and the executable instruction causes a processor to execute operations corresponding to the file storage method.
According to the file storage method, the file storage device, the computing equipment and the storage medium, performance indicators of the backup server are monitored in real time, and a health score of the backup server is calculated from the performance indicators, wherein the performance indicators comprise one or more of CPU utilization, memory utilization, network utilization, and disk utilization; the number of backup threads to be started is determined according to the health score of the backup server; and that number of backup threads is started, the backup file data is acquired from the network storage, and the backup file data is backed up to the backup server. The file storage method significantly reduces the pressure on the metadata database of the backup server and greatly improves backup efficiency.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic flowchart illustrating a file storage method according to an embodiment of the present invention;
FIG. 2 illustrates a scheduling flow diagram of a dynamic scheduler of a file storage method according to an embodiment of the present invention;
FIG. 3 illustrates a flow diagram of a dynamically packaged backup of a file storage method according to an embodiment of the invention;
FIG. 4 is a flowchart illustrating a file storage method according to a second embodiment of the present invention;
FIG. 5 is a functional block diagram of a file storage apparatus according to a third embodiment of the present invention; and
FIG. 6 is a schematic structural diagram of a computing device according to a fifth embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention can be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
Example one
Fig. 1 is a flowchart illustrating a file storage method according to an embodiment of the present invention.
As shown in fig. 1, the method includes:
step S110, monitoring the performance index of the backup server in real time, and calculating the health score of the backup server according to the performance index; wherein the performance indicators comprise one or more of the following indicators: CPU utilization, memory utilization, network utilization, and disk utilization.
The invention uses a backup scheduling algorithm developed by the inventor to monitor the performance indicators of the backup server in real time and to calculate the health score of the backup server from them. The health score is calculated from one or more of the CPU utilization, memory utilization, network utilization, and disk utilization of the backup server.
When the backup operation is initiated, the dynamic scheduler respectively acquires the CPU utilization rate, the memory utilization rate, the network utilization rate and the disk utilization rate through the acquisition thread.
The CPU utilization may be obtained as follows: reading the CPU kernel file, that is, the chip performance index file (/proc/cpuinfo) on the backup server, and calculating the CPU utilization of the current backup server by the formula 100 * (threadCpuTime2 - threadCpuTime1) / (totalCpuTime2 - totalCpuTime1), where threadCpuTime2 - threadCpuTime1 represents the difference in CPU time occupied by the running process, and totalCpuTime2 - totalCpuTime1 represents the difference in total CPU time.
The memory utilization may be obtained as follows: reading the memory kernel file, that is, the memory performance index file (/proc/meminfo) on the backup server, and calculating the memory utilization of the current backup server by the formula 100 * (MemTotal - MemFree) / MemTotal, where MemTotal represents the total memory capacity and MemFree represents the free memory capacity.
The network utilization may be obtained as follows: reading the network kernel file, that is, the network card performance index file (/proc/net/dev) on the backup server, and acquiring the network utilization through the network card performance index acquisition instruction (sar -n DEV 1).
The disk utilization may be obtained as follows: reading the input/output (IO) kernel file, that is, the disk performance index file (/proc/disks) on the backup server, and acquiring the disk utilization through the disk performance index acquisition instruction (iostat -x 1).
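The two formulas above can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: the function names and the sample text are hypothetical, the CPU function takes its two samples as plain numbers without assuming which kernel file they come from, and returning 0.0 when no time has elapsed is an assumed fallback.

```python
def cpu_usage_percent(thread_t1, thread_t2, total_t1, total_t2):
    """100 * (threadCpuTime2 - threadCpuTime1) / (totalCpuTime2 - totalCpuTime1)."""
    total_delta = total_t2 - total_t1
    if total_delta <= 0:
        return 0.0  # assumed fallback: no CPU time elapsed between the samples
    return 100.0 * (thread_t2 - thread_t1) / total_delta


def parse_meminfo(text):
    """Parse 'Key:   value kB' lines (the /proc/meminfo layout) into a dict of ints."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def memory_usage_percent(meminfo_text):
    """100 * (MemTotal - MemFree) / MemTotal."""
    info = parse_meminfo(meminfo_text)
    return 100.0 * (info["MemTotal"] - info["MemFree"]) / info["MemTotal"]


# Hypothetical sample in the /proc/meminfo layout, used instead of the real file
# so that the sketch runs on any platform.
SAMPLE_MEMINFO = """MemTotal:       16000000 kB
MemFree:         4000000 kB
Buffers:          200000 kB
"""

print(cpu_usage_percent(100, 150, 1000, 1200))
print(memory_usage_percent(SAMPLE_MEMINFO))
```

On a Linux backup server the sample text would be replaced by the contents of /proc/meminfo read by the acquisition thread.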
Step S120, determining the number of backup threads needing to be started according to the health score of the backup server.
After the dynamic scheduler obtains the 4 performance indexes, the health score of the backup server is calculated based on the backup server scoring algorithm.
Specifically, the scoring algorithm for the health score of the backup server is as follows:
S = 100 - (Σcpu + Σmem + Σnet + Σdisk) * 100 / 4
where Σcpu represents the CPU utilization of the backup server, Σmem represents the memory utilization of the backup server, Σnet represents the network utilization of the backup server, and Σdisk represents the disk utilization of the backup server.
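As a worked illustration of the scoring formula, the sketch below assumes that each utilization is passed as a fraction between 0 and 1, which is the scale on which S falls between 0 and 100; the source does not state the scale explicitly, and the function name is hypothetical.

```python
def health_score(cpu, mem, net, disk):
    """S = 100 - (cpu + mem + net + disk) * 100 / 4, with inputs as fractions in [0, 1]."""
    for name, value in (("cpu", cpu), ("mem", mem), ("net", net), ("disk", disk)):
        if not 0.0 <= value <= 1.0:
            # Assumption: utilizations are fractions, not percentages.
            raise ValueError(f"{name} utilization must be between 0 and 1, got {value}")
    return 100.0 - (cpu + mem + net + disk) * 100.0 / 4.0


# An idle server scores 100, a saturated one scores 0.
print(health_score(0.0, 0.0, 0.0, 0.0))
print(health_score(0.25, 0.5, 0.125, 0.125))
```

The score is simply 100 minus the mean utilization in percent, so each of the four indicators carries equal weight.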
Specifically, FIG. 2 shows a scheduling flowchart of the dynamic scheduler of the file storage method according to this embodiment. As shown in FIG. 2, the health score of the backup server is evaluated to determine the number of backup threads to be started. When the health score is lower than 70 points, the dynamic scheduler does not initiate any backup thread and enters a waiting state; preferably, the health score is obtained again every 5 minutes, although the specific interval is not limited. When the health score is between 70 and 80 points, the dynamic scheduler initiates 1 backup thread and the backup is performed by a single thread; when the health score is between 80 and 90 points, the dynamic scheduler initiates 2 backup threads; when the health score is between 90 and 100 points, the dynamic scheduler initiates 4 backup threads. After a backup thread finishes its backup task, it notifies the dynamic scheduler that the task is complete and automatically releases its resources. The correspondence between the health score and the number of backup threads initiated may be set by a person skilled in the art according to the specific situation, and is not limited herein.
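The score-to-thread mapping described above can be sketched as a small lookup. The text does not say which band a boundary score such as exactly 80 belongs to, so treating each lower bound as inclusive is an assumption, and the function name is hypothetical.

```python
def backup_thread_count(score):
    """Map a health score (0-100) to the number of backup threads to start."""
    if score < 70:
        return 0  # scheduler waits and re-checks the score later
    if score < 80:
        return 1  # single-thread backup
    if score < 90:
        return 2
    return 4


for sample in (65, 75, 85, 95):
    print(sample, backup_thread_count(sample))
```

Because the correspondence is configurable, the thresholds could equally be held in a table and tuned per deployment.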
Step S130, according to the number of backup threads to be started, starting a corresponding number of backup threads, acquiring backup file data from the network storage, and backing up the backup file data to the backup server.
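One way to start the chosen number of backup threads is a thread pool, sketched below. Using `concurrent.futures.ThreadPoolExecutor` is an illustrative choice rather than something the source specifies, and `run_backup_threads` and the `backup_group` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_backup_threads(groups, thread_count, backup_group):
    """Back up every group using at most `thread_count` worker threads.

    `backup_group` is a caller-supplied function that backs up one group and
    returns a result; results are returned in the same order as `groups`.
    """
    if thread_count <= 0:
        return []  # health score too low: nothing is started
    with ThreadPoolExecutor(max_workers=thread_count) as pool:
        return list(pool.map(backup_group, groups))


# Stand-in backup function that only reports what it would do.
print(run_backup_threads(["group-0", "group-1", "group-2"], 2,
                         lambda g: f"backed up {g}"))
```

Leaving the `with` block waits for all workers to finish and releases the threads, which loosely corresponds to the automatic release of resources after a backup task completes.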
FIG. 3 shows a flowchart of the dynamic packaged backup of the file storage method according to an embodiment of the invention. As shown in FIG. 3, a corresponding number of backup threads are started according to the number of backup threads to be started. After a backup thread is started, it first creates a data index table and a tag table in a local database; the database may be an embedded database and is not particularly limited. The data index table is used for storing file metadata, and its columns can be indexed; preferably, the file name and md5 value columns are indexed. The tag table is used for recording information about the file data groups that have already been backed up.
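A minimal sketch of the two tables, assuming SQLite as the embedded local database. The source names no database product, and every table and column name here (`data_index`, `tag`, `group_id`, `archive_name`, `finished_at`) is hypothetical.

```python
import sqlite3


def create_tables(conn):
    """Create the data index table (file metadata) and the tag table (finished groups)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS data_index ("
        "file_name TEXT NOT NULL, md5 TEXT NOT NULL)"
    )
    # Index the file name and md5 columns, as the text prefers.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_data_index_name ON data_index (file_name)")
    conn.execute("CREATE INDEX IF NOT EXISTS idx_data_index_md5 ON data_index (md5)")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tag ("
        "group_id INTEGER PRIMARY KEY, archive_name TEXT NOT NULL, finished_at TEXT NOT NULL)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
create_tables(conn)
print([row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")])
```

A real deployment would presumably use an on-disk database file so that the tag table survives a restart of the backup job.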
Secondly, the backup thread starts a scanning thread, which scans the current file system, acquires the file name and md5 value of each file, and stores them in the data index table. Next, the backup thread starts a cutting thread, which cuts the data index table into a plurality of sub data index tables to obtain a plurality of backup file data groups. Optionally, the data index table is cut into sub data index tables in units of 2 million files; the number of files in each sub data index table may be set by a person skilled in the art according to the specific situation. Alternatively, different sub data index tables may contain different numbers of files.
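The scan and cut steps can be sketched as two plain functions. The names `scan_files` and `split_index` are hypothetical, the index is held as an in-memory list rather than a database table to keep the example short, and each file is read whole, which is only reasonable for small files.

```python
import hashlib
import os


def scan_files(root):
    """Walk `root` and return (path, md5 hex digest) for every regular file found."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                digest = hashlib.md5(handle.read()).hexdigest()
            entries.append((path, digest))
    return entries


def split_index(entries, group_size):
    """Cut the index into consecutive groups of at most `group_size` entries."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return [entries[i:i + group_size] for i in range(0, len(entries), group_size)]


# Five index entries cut into groups of two; the text suggests 2 million per group.
print(split_index(["f1", "f2", "f3", "f4", "f5"], 2))
```

Only the last group may be smaller than `group_size`, which is consistent with sub data index tables being allowed to hold different numbers of files.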
After the plurality of backup file data groups are obtained, they are compared with the information recorded in the tag table about the file data groups that have already been backed up, and the backup file data groups that still need to be backed up are determined from this comparison. Based on the sub data index table of each group that needs to be backed up, the files in that group are compressed and packaged one by one into a single tar.
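The tag check and the packaging step might look like the sketch below. The gzip-compressed tar format is an assumption (the source is cut off after "tar."), and `groups_to_back_up` and `pack_group` are hypothetical names.

```python
import tarfile


def groups_to_back_up(group_ids, finished_ids):
    """Return the groups not yet recorded in the tag table, preserving order."""
    finished = set(finished_ids)
    return [group_id for group_id in group_ids if group_id not in finished]


def pack_group(file_paths, archive_path):
    """Compress the files of one group, one by one, into a single archive."""
    with tarfile.open(archive_path, "w:gz") as archive:  # assumed format: tar + gzip
        for path in file_paths:
            archive.add(path)
    return archive_path


# Groups 1 and 3 are already recorded as finished, so only 0 and 2 remain.
print(groups_to_back_up([0, 1, 2, 3], [1, 3]))
```

Since each group becomes one archive, the backup server stores one metadata record per archive instead of one per small file, which is the effect the method relies on to relieve the metadata database.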
Optionally, the backup file data is small file data, and the network storage is an NAS storage.
Small file data is defined relative to large file data. A small file is generally no larger than 64M, and typical small files include JPG files, txt files, ordinary doc files, html files, and the like.
Note that 64M is a relative value. A file boundary is predetermined as a certain data size: with a 64M boundary, a file larger than 64M is a large file and a file not larger than 64M is a small file, but a 32M boundary may equally be used, in which case a file larger than 32M is a large file and a file not larger than 32M is a small file. The boundary is defined according to the service scenario; a file no larger than the defined value is a small file, and otherwise it is a large file. Typically, a small file is defined as one of 64M or less.
Therefore, according to the file storage method of the embodiment, the single-thread backup is optimized into the multi-thread parallel backup; by means of dynamic compression, packaging and backup, tens of millions or even hundreds of millions of files are compressed into a plurality of files for backup, pressure of a metadata base of a backup server is remarkably relieved, and backup efficiency is greatly improved.
Example two
Fig. 4 is a flowchart illustrating a file storage method according to a second embodiment of the present invention.
As shown in fig. 4, the method includes:
step S410: and mounting the network storage to a backup server in a read-only mode by adopting the optimized mounting parameters.
The file storage method of the embodiment adopts an agentless mode for backup, a production server does not need to install a backup client, and all operations and interactions are completely performed and realized at a backup server side in the whole backup process.
When a file system needs to be backed up, the backup server mounts the file system on itself directly, in read-only mode, using optimized mounting parameters. Specifically, the mounting command may be mount -t nfs -o vers=3,soft,timeo=300,retry=3,nordirplus,intr,noacl,nocto,noatime,nodiratime 192.168.1.1:/fs01/bookstore/fs01, where vers=3 denotes the use of NFS version 3; timeo and retry respectively represent the timeout and the number of timeout retries between the NFS client and the server, and 300 milliseconds and 3 retries are the best practice; nordirplus means that the READDIRPLUS request of NFSv3 is not used, noatime means that the inode access time of a file is not updated, and nodiratime means that the inode access time of a directory is not updated; nordirplus, noatime, and nodiratime all increase the speed of NFS reads; soft means that the file system is mounted in soft-mount mode, and intr allows NFS to interrupt a file operation and return a value to the calling program; soft and intr are used to guarantee the accuracy of file reads.
Steps S420, S430, and S440 are similar to steps S110, S120, and S130, respectively, and are not described herein again.
Therefore, in the file storage method of this embodiment, only the backup server interacts with the file system and the production server does not participate; the number of network interactions in the backup process is reduced from two to one, and the backup threads are reduced from four to two. Because the number of network interactions and backup threads is reduced, backup efficiency is significantly improved.
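The mount invocation can be assembled programmatically, as in the sketch below. Several things here are assumptions: the option spellings `nocto`, `noatime`, and `nodiratime` are reconstructed from the garbled source text, the explicit `ro` option is added because the text asks for a read-only mount, and the export and mount point in the example are hypothetical. The function only builds the argument list and does not run `mount`.

```python
# Option list reconstructed from the text; spellings are assumptions where the
# source was garbled.
NFS_OPTIONS = (
    "vers=3", "soft", "timeo=300", "retry=3", "nordirplus",
    "intr", "noacl", "nocto", "noatime", "nodiratime",
)


def build_mount_argv(export, mount_point, options=NFS_OPTIONS, read_only=True):
    """Return the argv for mounting an NFS export, optionally read-only."""
    opts = list(options)
    if read_only and "ro" not in opts:
        opts.insert(0, "ro")  # "ro" is the standard read-only mount option
    return ["mount", "-t", "nfs", "-o", ",".join(opts), export, mount_point]


# Hypothetical export and mount point, for illustration only.
print(" ".join(build_mount_argv("192.168.1.1:/fs01", "/mnt/fs01")))
```

Building the argument list separately makes it easy to log or review the exact command before it is handed to something like `subprocess.run` on the backup server.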
Example three
Fig. 5 is a schematic functional structure diagram of a file storage device according to a third embodiment of the present invention. As shown in fig. 5, the apparatus includes: a monitoring module 51, a determining module 52 and a starting module 53.
The monitoring module 51 is suitable for monitoring the performance index of the backup server in real time and calculating the health score of the backup server according to the performance index; wherein the performance indicators comprise one or more of the following indicators: CPU utilization rate, memory utilization rate, network utilization rate and disk utilization rate;
the determining module 52 is adapted to determine the number of backup threads to be started according to the health score of the backup server; and
the starting module 53 is adapted to start a corresponding number of backup threads according to the number of backup threads to be started, acquire backup file data from the network storage, and back up the backup file data to the backup server.
Optionally, the monitoring module 51 is further adapted to: mount the network storage on the backup server in read-only mode using optimized mounting parameters.
Optionally, the monitoring module 51 is further adapted to: read kernel files of the backup server through an acquisition thread to obtain the performance index of the backup server.
Optionally, the starting module 53 is further adapted to: group the backup file data to obtain a plurality of backup file data groups; each backup thread compresses and packages the backup file data groups one at a time, each group into a single file, and backs that file up to the backup server.
Optionally, the starting module 53 is further adapted to: create a data index table and a tag table, scan the current file system, and cut the data index table into a plurality of sub data index tables to obtain the plurality of backup file data groups.
Optionally, the starting module 53 is further adapted to: check the tag table, determine the backup file data groups that need to be backed up, and, based on the sub data index table of each such group, compress and package the groups one by one into single files that are backed up to the backup server.
Optionally, the backup file data is small file data, and the network storage is NAS storage.
Therefore, according to the file storage device of the embodiment, the single-thread backup is optimized into the multi-thread parallel backup; by means of dynamic compression, packaging and backup, tens of millions or even hundreds of millions of files are compressed into a plurality of files for backup, pressure of a metadata database of a backup server is obviously relieved, and backup efficiency is greatly improved.
Example four
According to a fourth embodiment of the present invention, a non-volatile computer storage medium is provided, where the computer storage medium stores at least one executable instruction, and the computer executable instruction can execute the method in any of the above-mentioned method embodiments.
The executable instructions may be specifically configured to cause the processor to: monitor the performance index of the backup server in real time, and calculate the health score of the backup server according to the performance index, wherein the performance indicators comprise one or more of CPU utilization, memory utilization, network utilization, and disk utilization; determine the number of backup threads to be started according to the health score of the backup server; and start the corresponding number of backup threads, acquire the backup file data from the network storage, and back up the backup file data to the backup server.
In an alternative embodiment, the executable instructions may be specifically configured to cause the processor to: mount the network storage on the backup server in read-only mode using optimized mounting parameters.
In an alternative embodiment, the executable instructions may be specifically configured to cause the processor to: read kernel files of the backup server through an acquisition thread to obtain the performance index of the backup server.
In an alternative embodiment, the executable instructions may be specifically configured to cause the processor to: group the backup file data to obtain a plurality of backup file data groups; each backup thread compresses and packages the backup file data groups one at a time, each group into a single file, and backs that file up to the backup server.
In an alternative embodiment, the executable instructions may be specifically configured to cause the processor to: create a data index table and a tag table, scan the current file system, and cut the data index table into a plurality of sub data index tables to obtain the plurality of backup file data groups.
In an alternative embodiment, the executable instructions may be specifically configured to cause the processor to: check the tag table, determine the backup file data groups that need to be backed up, and, based on the sub data index table of each such group, compress and package the groups one by one into single files that are backed up to the backup server.
In an alternative embodiment, the backup file data is small file data, and the network storage is NAS storage.
Therefore, the file storage method of this embodiment replaces single-threaded backup with multi-threaded parallel backup. Through dynamic compression and packaging, tens of millions or even hundreds of millions of files are compressed into a far smaller number of packaged files before backup, which markedly relieves pressure on the metadata database of the backup server and greatly improves backup efficiency.
Example five
Fig. 6 is a schematic structural diagram of a computing device according to a fifth embodiment of the present invention. The embodiments of the present invention do not limit the specific implementation of the computing device.
As shown in Fig. 6, the computing device may include a processor 602, a communication interface 604, a memory 606, and a communication bus 608.
The processor 602, the communication interface 604, and the memory 606 communicate with one another via the communication bus 608. The communication interface 604 is used to communicate with network elements of other devices, such as clients or other servers. The processor 602 is configured to execute the program 610 and may specifically perform the relevant steps of the foregoing method embodiments.
In particular, program 610 may include program code comprising computer operating instructions.
The processor 602 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention. The computing device includes one or more processors, which may be of the same type, such as one or more CPUs, or of different types, such as one or more CPUs and one or more ASICs.
The memory 606 is used to store the program 610. The memory 606 may comprise high-speed RAM and may also include non-volatile memory, such as at least one disk memory.
The program 610 may specifically be configured to cause the processor 602 to perform the following operations:
monitoring performance indicators of the backup server in real time, and calculating a health score of the backup server according to the performance indicators; wherein the performance indicators comprise one or more of the following: CPU utilization, memory utilization, network utilization, and disk utilization;
determining the number of backup threads to be started according to the health score of the backup server;
and starting that number of backup threads, acquiring the backup file data from the network storage, and backing up the backup file data to the backup server.
In an alternative embodiment, the program 610 may be specifically configured to cause the processor 602 to perform the following operations:
mounting the network storage on the backup server in read-only mode using optimized mount parameters.
In an alternative embodiment, the program 610 may be specifically configured to cause the processor 602 to perform the following operations:
reading a kernel file of the backup server through an acquisition thread to obtain the performance indicators of the backup server.
In an alternative embodiment, the program 610 may be specifically configured to cause the processor 602 to perform the following operations:
grouping the backup file data to obtain a plurality of backup file data groups, where each backup thread compresses and packages the backup file data groups one at a time, each into a single file, and backs that file up to the backup server.
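The packaging step can be sketched as follows, with each group of small files compressed into one archive and several groups processed in parallel. The gzip-compressed tar format, the archive naming, and the function names `pack_group` and `backup_groups` are assumptions made for illustration; the passage does not specify a compression format.

```python
import tarfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def pack_group(group_id: int, files: list[Path], source_root: Path, dest_dir: Path) -> Path:
    """Compress one group of small files into a single .tar.gz archive."""
    archive = dest_dir / f"group_{group_id:06d}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for f in files:
            # Store paths relative to the source root so equal file names in
            # different directories do not collide inside the archive.
            tar.add(f, arcname=str(f.relative_to(source_root)))
    return archive


def backup_groups(groups: list[list[Path]], source_root: Path,
                  dest_dir: Path, threads: int) -> list[Path]:
    """Pack every group using up to `threads` worker threads; return the archives."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=max(1, threads)) as pool:
        futures = [pool.submit(pack_group, i, group, source_root, dest_dir)
                   for i, group in enumerate(groups)]
        return [f.result() for f in futures]
```

With this arrangement the backup server records one archive per group rather than one entry per small file, which is the effect on the metadata database that the embodiment describes.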
In an alternative embodiment, the program 610 may be specifically configured to cause the processor 602 to perform the following operations:
creating a data index table and a tag table, scanning the current file system, and cutting the data index table into a plurality of sub-data index tables to obtain the plurality of backup file data groups.
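Building the data index table and cutting it into sub-data index tables might look like the sketch below. It assumes the index is simply an ordered list of file paths and that groups have a fixed maximum size; `scan_file_system`, `split_index`, and `group_size` are hypothetical names not found in the source.

```python
import os
from pathlib import Path


def scan_file_system(root: Path) -> list[Path]:
    """Build the data index table: a sorted list of every file under root."""
    index = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            index.append(Path(dirpath) / name)
    index.sort()  # deterministic order, so a repeated scan yields the same groups
    return index


def split_index(index: list[Path], group_size: int) -> list[list[Path]]:
    """Cut the index into sub-index tables of at most group_size entries each."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return [index[i:i + group_size] for i in range(0, len(index), group_size)]
```

Each sub-index table then identifies one backup file data group, so a backup thread needs only the sub-index to know which files to package.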
In an alternative embodiment, the program 610 may be specifically configured to cause the processor 602 to perform the following operations:
checking the tag table, determining the backup file data groups still to be backed up, and, based on the sub-data index table of each such group, compressing and packaging the groups one at a time, each into a single file, and backing that file up to the backup server.
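One way to picture the tag table is as a record of which groups have already been backed up, so that a restarted job processes only the remaining groups. The sketch below stores it in SQLite; the schema, the `done` flag, and the function names are assumptions for illustration and do not come from the source.

```python
import sqlite3


def create_tag_table(conn: sqlite3.Connection, group_count: int) -> None:
    """Create the tag table with one row per group; existing rows are preserved."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS tag ("
        "group_id INTEGER PRIMARY KEY, done INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO tag (group_id) VALUES (?)",
        [(i,) for i in range(group_count)],
    )
    conn.commit()


def pending_groups(conn: sqlite3.Connection) -> list[int]:
    """Return the ids of the groups that still need to be backed up."""
    rows = conn.execute("SELECT group_id FROM tag WHERE done = 0 ORDER BY group_id")
    return [row[0] for row in rows]


def mark_done(conn: sqlite3.Connection, group_id: int) -> None:
    """Record that a group has been packaged and backed up."""
    conn.execute("UPDATE tag SET done = 1 WHERE group_id = ?", (group_id,))
    conn.commit()
```

In a multi-threaded job each backup thread would need its own connection or a lock around these calls, because a `sqlite3` connection is by default bound to the thread that created it.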
In an alternative embodiment, the backup file data is small file data, and the network storage is NAS storage.
Therefore, the file storage method of this embodiment replaces single-threaded backup with multi-threaded parallel backup. Through dynamic compression and packaging, tens of millions or even hundreds of millions of files are compressed into a far smaller number of packaged files before backup, which markedly relieves pressure on the metadata database of the backup server and greatly improves backup efficiency.
The algorithms or displays presented herein are not inherently related to any particular computer, virtual system, or other apparatus. Various general purpose systems may also be used with the teachings herein. The required structure for constructing such a system will be apparent from the description above. In addition, embodiments of the present invention are not directed to any particular programming language. It is appreciated that a variety of programming languages may be used to implement the teachings of the present invention as described herein, and any descriptions of specific languages are provided above to disclose the best mode of the invention.
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the embodiments are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the various inventive aspects. However, the disclosed method should not be interpreted as reflecting an intention that the invention as claimed requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
Those skilled in the art will appreciate that the modules in the device in an embodiment may be adaptively changed and disposed in one or more devices different from the embodiment. The modules or units or components of the embodiments may be combined into one module or unit or component, and furthermore they may be divided into a plurality of sub-modules or sub-units or sub-components. All of the features disclosed in this specification (including any accompanying claims, abstract and drawings), and all of the processes or elements of any method or apparatus so disclosed, may be combined in any combination, except combinations where at least some of such features and/or processes or elements are mutually exclusive. Each feature disclosed in this specification (including any accompanying claims, abstract and drawings) may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise.
Furthermore, those skilled in the art will appreciate that while some embodiments described herein include certain features that are included in other embodiments but not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
The various component embodiments of the invention may be implemented in hardware, or in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will appreciate that a microprocessor or Digital Signal Processor (DSP) may be used in practice to implement some or all of the functionality of some or all of the components according to embodiments of the present invention. The present invention may also be embodied as apparatus or device programs (e.g., computer programs and computer program products) for performing a portion or all of the methods described herein. Such programs implementing the present invention may be stored on computer-readable media or may be in the form of one or more signals. Such a signal may be downloaded from an internet website or provided on a carrier signal or in any other form.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements, and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by one and the same item of hardware. The use of the words first, second, third, and so on does not indicate any ordering; these words may be interpreted as names. The steps in the above embodiments should not be construed as limiting the order of execution unless specified otherwise.

Claims (10)

1. A file storage method, comprising:
monitoring performance indicators of a backup server in real time, and calculating a health score of the backup server according to the performance indicators; wherein the performance indicators comprise one or more of the following: CPU utilization, memory utilization, network utilization, and disk utilization;
determining the number of backup threads to be started according to the health score of the backup server; and
starting that number of backup threads, acquiring backup file data from a network storage, and backing up the backup file data to the backup server.
2. The method of claim 1, further comprising: mounting the network storage on the backup server in read-only mode using optimized mount parameters.
3. The method of claim 1, wherein monitoring the performance indicators of the backup server in real time further comprises:
reading a kernel file of the backup server through an acquisition thread to obtain the performance indicators of the backup server.
4. The method of claim 1, wherein backing up the backup file data to the backup server further comprises:
grouping the backup file data to obtain a plurality of backup file data groups; and
compressing and packaging, by each backup thread, the backup file data groups one at a time, each into a single file, and backing that file up to the backup server.
5. The method of claim 4, wherein grouping the backup file data to obtain a plurality of backup file data groups further comprises:
creating a data index table and a tag table, scanning the current file system, and cutting the data index table into a plurality of sub-data index tables to obtain the plurality of backup file data groups.
6. The method of claim 5, wherein compressing and packaging, by each backup thread, the backup file data groups one at a time, each into a single file, and backing that file up to the backup server further comprises:
checking the tag table, determining the backup file data groups still to be backed up, and, based on the sub-data index table of each such group, compressing and packaging the groups one at a time, each into a single file, and backing that file up to the backup server.
7. The method according to any one of claims 1 to 6, wherein the backup file data is small file data, and the network storage is NAS storage.
8. A file storage device, comprising:
a monitoring module, adapted to monitor performance indicators of a backup server in real time and to calculate a health score of the backup server according to the performance indicators; wherein the performance indicators comprise one or more of the following: CPU utilization, memory utilization, network utilization, and disk utilization;
a determining module, adapted to determine the number of backup threads to be started according to the health score of the backup server; and
a starting module, adapted to start that number of backup threads, acquire backup file data from a network storage, and back up the backup file data to the backup server.
9. A computing device, comprising: a processor, a memory, a communication interface, and a communication bus, wherein the processor, the memory, and the communication interface communicate with one another through the communication bus;
the memory is used for storing at least one executable instruction, and the executable instruction causes the processor to perform operations corresponding to the file storage method according to any one of claims 1-7.
10. A computer storage medium having stored therein at least one executable instruction for causing a processor to perform operations corresponding to the file storage method of any one of claims 1-7.
CN202110830984.6A 2021-07-22 2021-07-22 File storage method and device, computing equipment and computer storage medium Pending CN113434344A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110830984.6A CN113434344A (en) 2021-07-22 2021-07-22 File storage method and device, computing equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110830984.6A CN113434344A (en) 2021-07-22 2021-07-22 File storage method and device, computing equipment and computer storage medium

Publications (1)

Publication Number Publication Date
CN113434344A true CN113434344A (en) 2021-09-24

Family

ID=77761406

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110830984.6A Pending CN113434344A (en) 2021-07-22 2021-07-22 File storage method and device, computing equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113434344A (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030135650A1 (en) * 2001-12-26 2003-07-17 Hitachi, Ltd. Backup system
US20050108484A1 (en) * 2002-01-04 2005-05-19 Park Sung W. System and method for highspeed and bulk backup
CN101453489A (en) * 2008-12-17 2009-06-10 上海爱数软件有限公司 Network additive storage device, data backup and data restoration method thereof
CN102576322A (en) * 2009-08-21 2012-07-11 赛门铁克公司 Proxy backup of virtual disk image files on NAS devices
CN101706825A (en) * 2009-12-10 2010-05-12 华中科技大学 Replicated data deleting method based on file content types
CN101916296A (en) * 2010-08-29 2010-12-15 武汉天喻信息产业股份有限公司 Mass data processing method based on files
CN103095843A (en) * 2013-01-28 2013-05-08 刘海峰 Method and client of data backup based on version vectors
CN105095300A (en) * 2014-05-16 2015-11-25 阿里巴巴集团控股有限公司 Method and system for database backup
CN105183585A (en) * 2015-08-27 2015-12-23 北京金山安全软件有限公司 Data backup method and device
CN105373453A (en) * 2015-12-15 2016-03-02 中国农业银行股份有限公司 Data backup method and system
CN111651631A (en) * 2020-04-28 2020-09-11 长沙证通云计算有限公司 High-concurrency video data processing method, electronic equipment, storage medium and system
CN112882818A (en) * 2021-03-30 2021-06-01 中信银行股份有限公司 Task dynamic adjustment method, device and equipment

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘东志 等: "智慧校园构建实例详解", 31 October 2018, 天津:天津大学出版社, pages: 44 - 46 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116366709A (en) * 2023-05-31 2023-06-30 天翼云科技有限公司 Connection scheduling method and system for data transmission
CN116366709B (en) * 2023-05-31 2023-10-10 天翼云科技有限公司 Connection scheduling method and system for data transmission

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination