CN115657954B - Data processing method and device - Google Patents

Data processing method and device

Info

Publication number
CN115657954B
Authority
CN
China
Prior art keywords
data
storage
storage medium
key information
node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211356247.8A
Other languages
Chinese (zh)
Other versions
CN115657954A (en)
Inventor
梁俊刚
徐庆
元红萍
李彦秋
韩丽敏
朱明新
袁朝贵
张岩
陈炼
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunlun Digital Technology Co ltd
China National Petroleum Corp
Original Assignee
Kunlun Digital Technology Co ltd
China National Petroleum Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunlun Digital Technology Co ltd, China National Petroleum Corp
Priority to CN202211356247.8A
Publication of CN115657954A
Application granted
Publication of CN115657954B
Active
Anticipated expiration

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data processing method and device belong to the technical field of computers. The method is applied to a first storage node in a storage system, the storage system comprises a management node and a plurality of storage nodes, the first storage node is any one of the plurality of storage nodes, the first storage node comprises a first storage medium and a second storage medium, and the data access rate of the first storage medium is larger than that of the second storage medium. The first storage node receives a first storage request carrying first data from the management node; the first storage node stores first data in a first storage medium and sets a first expiration time length for the first data; when the end time of the first expiration time is reached, the first storage node stores the first data in the first storage medium to the second storage medium and deletes the first data in the first storage medium. The present application helps to reduce the storage pressure of a second storage medium (e.g., a storage disk).

Description

Data processing method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a data processing method and apparatus.
Background
A storage system typically includes a management node and a plurality of storage nodes. The management node manages the plurality of storage nodes, and each storage node includes a disk for storing data.
For example, the management node distributes data storage requests to the plurality of storage nodes, and each storage node stores data in a disk of each storage node according to the received data storage requests.
However, the above storage method tends to place heavy storage pressure on the disks.
Disclosure of Invention
The application provides a data processing method and device, which are beneficial to reducing the storage pressure of a storage disk (such as a magnetic disk) in a storage node. The technical scheme is as follows:
In a first aspect, a data processing method is provided. The method is applied to a first storage node in a storage system, where the storage system includes a management node and a plurality of storage nodes, the management node is connected to the plurality of storage nodes, the first storage node is any one of the plurality of storage nodes, the first storage node includes a first storage medium and a second storage medium, and a data access rate of the first storage medium is greater than a data access rate of the second storage medium. The method includes:
receiving a first storage request from the management node, the first storage request carrying first data;
storing the first data in the first storage medium, and setting a first expiration time for the first data;
and when the end of the first expiration time is reached, storing the first data from the first storage medium in the second storage medium, and deleting the first data from the first storage medium.
Optionally, the method further comprises:
receiving a first search request from the management node, the first search request being for searching the first data;
searching the first data in the first storage medium according to the first searching request;
and searching the first data in the second storage medium when the first data is not searched in the first storage medium.
Optionally, the method further comprises:
when the first data is not found in the first storage medium and the first data is found in the second storage medium, storing the first data in the first storage medium, and setting a second expiration time for the first data;
and deleting the first data in the first storage medium when the ending time of the second expiration time is reached.
Optionally, the storing the first data in the first storage medium includes: storing the first data and key information of the first data in the first storage medium;
wherein the first search request carries key information of the first data, and the searching the first data in the first storage medium according to the first search request includes: searching the first data in the first storage medium according to the key information of the first data.
Optionally, the first storage medium is a cache and the second storage medium is a storage disk (e.g., magnetic disk).
Optionally, the management node is configured to distribute data processing tasks to the plurality of storage nodes based on a load balancing policy, where the data processing tasks include at least one of a data storage request and a data search request.
In a second aspect, there is provided a data processing apparatus for use in a first storage node in a storage system, the storage system comprising a management node and a plurality of storage nodes, the management node being connected to the plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the first storage node comprising a first storage medium and a second storage medium, a data access rate of the first storage medium being greater than a data access rate of the second storage medium, the apparatus comprising:
a receiving module, configured to receive a first storage request from the management node, the first storage request carrying first data; and
a processing module, configured to store the first data in the first storage medium and set a first expiration time for the first data, and, when the end of the first expiration time is reached, store the first data from the first storage medium in the second storage medium and delete the first data from the first storage medium.
Optionally, the receiving module is further configured to receive a first search request from the management node, where the first search request is used to search the first data;
the processing module is further configured to search the first data in the first storage medium according to the first search request; and searching the first data in the second storage medium when the first data is not searched in the first storage medium.
Optionally, the processing module is further configured to store the first data in the first storage medium and set a second expiration time for the first data when the first data is not found in the first storage medium and the first data is found in the second storage medium;
The processing module is further configured to delete the first data in the first storage medium when the end time of the second expiration duration is reached.
Optionally, the processing module is configured to store the first data and key information of the first data in the first storage medium;
the first search request carries key information of the first data, and the processing module is used for searching the first data in the first storage medium according to the key information of the first data.
Optionally, the first storage medium is a cache and the second storage medium is a storage disk (e.g., magnetic disk).
Optionally, the management node is configured to distribute data processing tasks to the plurality of storage nodes based on a load balancing policy, where the data processing tasks include at least one of a data storage request and a data search request.
The technical solutions provided by this application bring the following beneficial effects:
After the first storage node receives the first storage request carrying the first data, the first storage node first stores the first data in its first storage medium rather than directly in its second storage medium, so that the first storage medium shares part of the storage pressure and the storage pressure on the second storage medium is reduced.
In addition, after the first storage node receives a first search request for searching the first data, the first storage node searches the first data in the first storage medium first, and when the first data is not searched in the first storage medium, the first storage node searches the first data in the second storage medium. Because the data access rate of the first storage medium is greater than that of the second storage medium, the first storage node searches the first data in the first storage medium first, and therefore the searching efficiency of the first data is improved.
Further, when the first storage node does not find the first data in the first storage medium but finds it in the second storage medium, the first storage node stores the first data in the first storage medium. In this way, data that is accessed more often is kept in the first storage medium, which has the higher data access rate, and this helps improve data search efficiency.
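The lookup behavior described above can be sketched as follows. This is a minimal illustration, not the claimed implementation: the class name `TwoTierLookup`, the use of in-memory dictionaries to stand in for the cache and the storage disk, the injectable `clock`, and the polling-style `expire` method are all assumptions made for the example.

```python
import time


class TwoTierLookup:
    """Hypothetical sketch: dicts stand in for the first and second storage media."""

    def __init__(self, second_ttl, clock=time.monotonic):
        self.first = {}    # key information -> (data, expiry deadline)
        self.second = {}   # key information -> data
        self.second_ttl = second_ttl  # assumed "second expiration time", in seconds
        self.clock = clock

    def lookup(self, key):
        entry = self.first.get(key)
        if entry is not None:
            return entry[0]  # found in the faster first storage medium
        data = self.second.get(key)
        if data is not None:
            # Found only in the slower medium: store it in the first medium
            # again and set the second expiration time for it.
            self.first[key] = (data, self.clock() + self.second_ttl)
        return data

    def expire(self):
        """Delete entries from the first medium whose expiration time has ended."""
        now = self.clock()
        expired = [k for k, (_, deadline) in self.first.items() if deadline <= now]
        for key in expired:
            del self.first[key]  # the data still exists in the second medium
```

Because the data promoted by `lookup` already exists in the second storage medium, `expire` only deletes it from the first storage medium and does not write it back.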
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a storage system according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a storage node according to an embodiment of the present application;
FIG. 3 is a flow chart of a data processing method according to an embodiment of the present application;
FIG. 4 is a flow chart of another data processing method provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present application more apparent, the present application will be described in further detail below with reference to the accompanying drawings, wherein it is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
Storage nodes in a storage system typically include disks, and the storage nodes store data in the disks of the storage nodes according to received data storage requests, which tends to result in a greater disk storage pressure.
The embodiment of the application provides a data processing scheme applied to a storage node. After receiving a storage request carrying data, the storage node stores the data in a first storage medium (e.g. a cache) of the storage node, sets an expiration time for the data, stores the data to a second storage medium (e.g. a disk) of the storage node when the expiration time is reached, and deletes the data in the first storage medium. By this way of data processing, the storage node causes the first storage medium to share a portion of the storage pressure, thereby reducing the storage pressure of the second storage medium.
The technical scheme of the application is described below with reference to the accompanying drawings. First, an application scenario of the present application is described.
The application scenario of the present application provides a storage system that can handle data processing tasks from a user device (also referred to as a client), the data processing tasks including at least one of data storage requests and data search requests. Optionally, the storage system includes a management node and a plurality of storage nodes. The storage system interacts with the user device through the management node: the management node may receive data processing tasks from the user device and distribute them to the plurality of storage nodes, and the storage nodes perform data processing according to the data processing tasks. For example, FIG. 1 shows a schematic diagram of a storage system provided in an embodiment of the present application. The storage system includes a management node and a plurality of storage nodes 1 to n, and the management node is communicatively connected to each of the storage nodes 1 to n. The management node is configured to distribute data processing tasks to the storage nodes 1 to n, and each storage node is configured to perform data processing according to the data processing tasks distributed by the management node. Optionally, the management node distributes data processing tasks to the storage nodes 1 to n based on a load balancing policy. At least two of the storage nodes 1 to n may also be communicatively connected so that they can communicate with each other.
In an embodiment of the present application, the data processing task includes at least one of a data storage request and a data search request. The management node distributes the data storage requests to the storage nodes 1 to n based on the load balancing policy, and the storage nodes store the data carried by the data storage requests sent by the management node in the storage nodes. The management node may determine a storage node where the data to be searched is located, and send a data search request for searching the data to be searched to the storage node, where the storage node searches the data to be searched according to the data search request.
Optionally, each storage node has a plurality of slots, and each slot has a number. Slot numbers on different storage nodes are different, so different storage nodes may correspond to different slot number ranges. Each data storage request carries key information of the data to be stored. For each data storage request, the management node performs a check calculation on the key information carried by the request using a check algorithm, determines the storage node corresponding to the request according to the check calculation result and the slot number range of each storage node, and sends the request to that storage node. In this way, the management node distributes data storage requests to the storage nodes 1 to n based on a load balancing policy. Similarly, each data search request carries key information of the data to be searched. For each data search request, the management node performs the check calculation on the key information carried by the request, determines the storage node where the data to be searched is located according to the check calculation result and the slot number range of each storage node, and sends the request to that storage node. The check algorithm is, for example, a cyclic redundancy check (CRC) algorithm. Slot numbers range from 0 to 16383. The management node computes the remainder of the check calculation result divided by 16384, which is a value from 0 to 16383, and determines the storage node for the corresponding data processing task (such as a data storage request or a data search request) according to the remainder and the slot number range of each storage node.
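The slot-based routing above can be illustrated with a short sketch. The text only says that the check algorithm is, for example, a CRC algorithm; the specific 16-bit CRC used here (Python's `binascii.crc_hqx` with an initial value of 0), the node names, and the slot ranges in `SLOT_RANGES` are assumptions chosen for the example.

```python
import binascii

SLOT_COUNT = 16384  # slot numbers range from 0 to 16383

# Hypothetical inclusive slot number ranges for three storage nodes.
SLOT_RANGES = {
    "storage-node-1": (0, 5460),
    "storage-node-2": (5461, 10922),
    "storage-node-3": (10923, 16383),
}


def slot_for(key_info: str) -> int:
    """Check calculation on the key information, reduced modulo 16384."""
    checksum = binascii.crc_hqx(key_info.encode("utf-8"), 0)  # assumed CRC variant
    return checksum % SLOT_COUNT


def node_for(key_info: str) -> str:
    """Return the storage node whose slot number range contains the slot."""
    slot = slot_for(key_info)
    for node, (low, high) in SLOT_RANGES.items():
        if low <= slot <= high:
            return node
    raise LookupError(f"no storage node owns slot {slot}")
```

Because the slot depends only on the key information, a storage request and a later search request carrying the same key information are routed to the same storage node.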
Referring to FIG. 2, a schematic diagram of a storage node according to an embodiment of the present application is shown. The storage node is any one of the storage nodes 1 to n. The storage node includes a first storage medium and a second storage medium, and the data access rate of the first storage medium is greater than that of the second storage medium. The first storage medium may be a cache and the second storage medium may be a storage disk. For example, the first storage medium includes, but is not limited to, a Caffeine cache or a Redis cache, and the second storage medium includes, but is not limited to, a mechanical hard disk, a magnetic disk, a solid state drive, or a shingled magnetic recording hard disk. The Caffeine cache achieves a high hit rate and low resource consumption through its cache eviction algorithm. For example, as shown in FIG. 2, after any of the storage nodes 1 to n receives a data storage request, the storage node first stores the data carried by the request in its first storage medium and sets an expiration duration for the data; when the end time of the expiration duration is reached, the storage node stores the data in its second storage medium. After a storage node receives a data search request, it first searches its first storage medium for the corresponding data, and if the data is not found there, it searches its second storage medium.
As shown in FIG. 2, the storage node further includes a processor. The processor is communicatively connected to the first storage medium and to the second storage medium and can access both. The data access rate of the first storage medium may be the rate at which the processor accesses the first storage medium, and the data access rate of the second storage medium may be the rate at which the processor accesses the second storage medium. Data processing operations (e.g., storage operations and lookup operations) occurring in the first storage medium and the second storage medium may be performed by the processor included in the storage node, which is not limited by the embodiments of the present application.
Optionally, the storage system shown in FIG. 1 is a redis-cluster storage system. A redis-cluster storage system is a decentralized storage system in which the storage nodes are communicatively connected to exchange information. The redis-cluster storage system adopts a master-slave mode: the storage nodes include a master storage node and at least one slave storage node corresponding to the master storage node, and the master storage node and its slave storage nodes can store the same data. For example, after the master storage node stores data according to a data storage request sent by the management node, the master storage node sends the data to its slave storage nodes so that the slave storage nodes also store the data. In the redis-cluster storage system, if a master storage node goes down, the master storage node opens its fuse, and the master storage node and the functions that depend on it cannot run normally. The management node can then select a slave storage node corresponding to the master storage node to complete the relevant data processing, so that the redis-cluster storage system still provides normal data processing when the master storage node is down. For example, each slave storage node periodically sends a heartbeat signal to the master storage node and determines whether it can communicate normally with the master storage node according to whether a response is received. If more than half of the slave storage nodes corresponding to a master storage node cannot communicate normally with it, the master storage node is considered to be down. After the master storage node opens the fuse, it can check whether it has returned to normal after a specified duration. If the master storage node has returned to normal, it closes the fuse; if not, it keeps the fuse open. To check whether it has returned to normal, the master storage node sets its fuse to the half-open state and detects whether it can respond normally to the heartbeat signals it receives. If it can respond normally, the master storage node determines that it has returned to normal; if it cannot, the master storage node determines that it has not returned to normal.
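The failure detection and fuse handling described above can be sketched as follows. The names `FuseState`, `master_is_down`, and `Fuse` are hypothetical, and the sketch leaves out the heartbeat transport and the timing of the specified duration; it only models the majority rule and the open, half-open, and closed transitions.

```python
from enum import Enum


class FuseState(Enum):
    CLOSED = "closed"        # master storage node is serving normally
    OPEN = "open"            # master storage node is treated as down
    HALF_OPEN = "half-open"  # master storage node is probing its own recovery


def master_is_down(heartbeat_ok: list[bool]) -> bool:
    """heartbeat_ok[i] is True if slave i received a response from the master.

    The master is considered down when more than half of its slaves
    cannot communicate with it normally.
    """
    failures = sum(1 for ok in heartbeat_ok if not ok)
    return failures > len(heartbeat_ok) / 2


class Fuse:
    def __init__(self):
        self.state = FuseState.CLOSED

    def trip(self):
        """Open the fuse when the master storage node goes down."""
        self.state = FuseState.OPEN

    def probe(self, responds_to_heartbeat: bool) -> FuseState:
        """After the specified duration, half-open the fuse and test recovery."""
        self.state = FuseState.HALF_OPEN
        if responds_to_heartbeat:
            self.state = FuseState.CLOSED  # recovered: close the fuse
        else:
            self.state = FuseState.OPEN    # not recovered: keep the fuse open
        return self.state
```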
The above is an introduction to the application scenario of the present application, and the data processing method of the present application is described below.
Referring to FIG. 3, a flowchart of a data processing method according to an embodiment of the present application is shown. The data processing method is applied to a first storage node in a storage system. For example, as shown in FIG. 1, the first storage node is any storage node in the storage system (e.g., storage node 1). FIG. 3 mainly describes the data storage process. Referring to FIG. 3, the method includes the following steps S301 to S303.
S301, a first storage node receives a first storage request from a management node, wherein the first storage request carries first data.
The management node may send the first storage request to the first storage node, and the first storage node correspondingly receives the first storage request from the management node. The first storage request carries the first data and may also carry key information of the first data. The key information of the first data includes, for example, the type of the first data, the identifier of the first data, and a dynamic factor of the deployment environment of the storage system corresponding to the first data (that is, the storage system that stores, or is to store, the first data). Different deployment environments have different dynamic factors, so the same data has different key information in different storage systems.
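As a small illustration of how such key information might be composed, the sketch below joins the three components into one string. The function name `build_key_info`, the colon separator, and the ordering of the components are assumptions; the text does not specify a format.

```python
def build_key_info(data_type: str, data_id: str, env_factor: str) -> str:
    """Compose key information from the data type, the data identifier,
    and the dynamic factor of the deployment environment.

    The "factor:type:id" layout is a hypothetical choice for this example.
    """
    return f"{env_factor}:{data_type}:{data_id}"
```

With this layout, the same data deployed in two environments with different dynamic factors yields two different keys, which matches the behavior described above.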
S302, the first storage node stores first data in a first storage medium, and sets a first expiration time length for the first data.
After receiving the first storage request carrying the first data, the first storage node first stores the first data in the first storage medium and sets a first expiration duration for the first data in the first storage medium. The first expiration duration may be set according to the type of the first data and the access frequency of data of that type, or may be set randomly or according to other policies, which is not limited in the embodiments of the present application. Optionally, the first storage request carries key information of the first data, and the first storage node stores the first data and the key information of the first data in the first storage medium. For example, the first storage medium includes a first correspondence, which maps each piece of data in the first storage medium to its key information, and the first storage node may store the correspondence between the first data and the key information of the first data in the first correspondence. As an example, the first correspondence is a Key-Value structure: the first storage node stores the key information of the first data as a Key in the first correspondence, and stores the first data as the Value corresponding to that Key. In this embodiment, the first storage node sets the first expiration duration on the entry in the first correspondence that associates the first data with the key information of the first data.
S303, when the ending time of the first expiration time is reached, the first storage node stores the first data in the first storage medium to the second storage medium, and deletes the first data in the first storage medium.
After setting the first expiration duration for the first data in the first storage medium, the first storage node starts a timer to count down the first expiration duration. When the first storage node determines that the end time of the first expiration duration is reached, it stores the first data from the first storage medium in the second storage medium and deletes the first data from the first storage medium. In this way, the first storage node dynamically manages the data in the first storage medium. Optionally, the first storage node transfers the correspondence between the first data and the key information of the first data from the first storage medium to the second storage medium. For example, the second storage medium includes a second correspondence, which maps each piece of data in the second storage medium to its key information; the first storage node reads the correspondence between the first data and its key information from the first correspondence and writes it into the second correspondence.
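Steps S302 and S303 can be sketched as follows. This is an illustration under stated assumptions rather than the described implementation: `TieredStore` is a hypothetical name, two dictionaries stand in for the first and second correspondences, and instead of a per-entry countdown timer the sketch records a deadline and moves expired entries when `flush_expired` is called.

```python
import time


class TieredStore:
    """Hypothetical sketch of a storage node with two storage media."""

    def __init__(self, clock=time.monotonic):
        self.first = {}    # first correspondence: key info -> (data, deadline)
        self.second = {}   # second correspondence: key info -> data
        self.clock = clock

    def put(self, key, data, ttl):
        """S302: store the data in the first medium with an expiration duration."""
        self.first[key] = (data, self.clock() + ttl)

    def flush_expired(self):
        """S303: move entries whose expiration duration has ended to the second medium."""
        now = self.clock()
        moved = []
        for key, (data, deadline) in list(self.first.items()):
            if deadline <= now:
                self.second[key] = data  # write into the second correspondence
                del self.first[key]      # delete from the first storage medium
                moved.append(key)
        return moved
```

A caller would invoke `flush_expired` periodically, for example from a background thread; a timer per entry, as the text describes, would achieve the same effect.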
In an alternative embodiment, while the end time of the first expiration duration has not been reached, the first storage node keeps the first data stored in the first storage medium. In this embodiment, the first storage node may also extend the first expiration duration of the first data. This keeps the first data in the first storage medium, which has the higher data access rate, for a longer time and thereby improves the access efficiency of the first data.
In the related art, a processor manages data in a cache using an eviction algorithm. For example, the processor manages data in the cache using a first-in-first-out (First In First Out, FIFO) algorithm, a least recently used (Least Recently Used, LRU) algorithm, a least frequently used (Least Frequently Used, LFU) algorithm, or a W-TinyLFU (Window Tiny LFU) algorithm. In the FIFO algorithm, the processor evicts data from the cache in the order in which the data entered the cache. In the LRU algorithm, the processor places data newly entering the cache at the tail of the data queue, and when the queue reaches the set length, the processor evicts data from the head of the queue; the LRU algorithm has a high probability of causing cache pollution. In the LFU algorithm, the processor records the access frequency of each piece of data in the cache when accessing it. If the access frequency of a piece of data is low within a specified duration, the processor assumes that its access frequency will remain low and evicts it. The LFU algorithm can handle cache pollution caused by bursts of cold data, but if the data access pattern changes, the LFU algorithm can reduce the cache hit rate, and it requires additional space to store the access frequencies. The W-TinyLFU algorithm is an optimization of the LFU algorithm. Under W-TinyLFU, when the processor receives an access request for a piece of data, it records the access frequency of that data, sets a standard frequency value, and keeps in the cache the data whose access frequency is greater than the standard frequency value. The processor may record the access frequencies using the Count-Min Sketch algorithm, which is a counting algorithm.
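For readers unfamiliar with Count-Min Sketch, the following minimal sketch shows the idea of approximate frequency counting in fixed space. The table dimensions, the use of salted BLAKE2b hashes to derive one index per row, and the class name are assumptions for illustration; this is not the implementation used by any particular cache library.

```python
import hashlib


class CountMinSketch:
    """Approximate counter: estimates never fall below the true count."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row: int, item: str) -> int:
        # One independent-looking hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            item.encode("utf-8"), digest_size=8, salt=row.to_bytes(2, "big")
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, item: str) -> None:
        """Record one access to the item."""
        for row in range(self.depth):
            self.table[row][self._index(row, item)] += 1

    def estimate(self, item: str) -> int:
        """Return the minimum counter across rows as the frequency estimate."""
        return min(
            self.table[row][self._index(row, item)] for row in range(self.depth)
        )
```

Hash collisions can only inflate a counter, so taking the minimum across rows gives an estimate that may overcount but never undercounts, using space that does not grow with the number of distinct items.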
In this embodiment of the present application, the first storage medium may be a cache. When a storage node stores data in its first storage medium, it sets an expiration duration for the data, and the storage node evicts data from the first storage medium according to the expiration duration of that data. This enables accurate eviction of data in the first storage medium and reduces the probability of cache pollution. Because only data that has exceeded its expiration duration is evicted and unexpired data remains in the first storage medium, the hit rate of the first storage medium can be maintained. In addition, the present application does not need to record the access frequency of data, which avoids access frequency records occupying cache space.
In summary, in the data processing method provided in the embodiment of the present application, after the first storage node receives the first storage request carrying the first data, the first storage node first stores the first data in the first storage medium and sets a first expiration time for the first data, and when the end time of the first expiration time is reached, the first storage node stores the first data in the first storage medium to the second storage medium and deletes the first data in the first storage medium. By means of the data processing mode, the first storage medium can share a part of storage pressure, and the storage pressure of the second storage medium can be reduced.
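The storage path summarized above can be sketched roughly as follows. This is an assumption-laden illustration: `StorageNode`, `handle_store`, and `tick` are hypothetical names, both storage media are stood in for by in-memory dictionaries, and a single fixed first expiration duration is used, whereas the application leaves the duration policy open.

```python
class StorageNode:
    """Toy storage node with a fast first medium and a slower second medium."""

    def __init__(self, first_ttl: float) -> None:
        self.first_ttl = first_ttl
        self.first_medium = {}   # key info -> (data, expires_at); stands in for the cache
        self.second_medium = {}  # key info -> data; stands in for the storage disk

    def handle_store(self, key_info: str, data, now: float) -> None:
        # Store the data in the first medium and set the first expiration duration.
        self.first_medium[key_info] = (data, now + self.first_ttl)

    def tick(self, now: float) -> None:
        # When the first expiration duration ends, move the data to the
        # second medium and delete it from the first medium.
        expired = [k for k, (_, exp) in self.first_medium.items() if exp <= now]
        for key_info in expired:
            data, _ = self.first_medium.pop(key_info)
            self.second_medium[key_info] = data


node = StorageNode(first_ttl=30)
node.handle_store("order:42", b"payload", now=0)
node.tick(now=10)  # not yet expired: data stays in the first medium
node.tick(now=30)  # expired: data is moved to the second medium
```

Until the expiration is reached, writes are absorbed by the first medium only, which is the sense in which it shares part of the storage pressure.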
Referring to fig. 4, a flowchart of another data processing method according to an embodiment of the present application is shown. The data processing method is applied to a first storage node in a storage system. Fig. 4 mainly describes the data lookup process. Referring to fig. 4, the method includes the following steps S401 to S405.
S401, a first storage node receives a first search request from a management node, wherein the first search request is used for searching first data.
The management node may send a first search request for searching the first data to the first storage node, and the first storage node correspondingly receives the first search request from the management node. The first search request may carry key information of the first data.
S402, the first storage node searches first data in the first storage medium according to the first search request.
After the first storage node receives a first search request for searching the first data, the first storage node first searches the first data in the first storage medium according to the first search request. Optionally, the first storage node searches the first data in the first storage medium according to the key information of the first data carried by the first search request.
As an example, the first storage medium includes a first correspondence, where the first correspondence is a correspondence between data in the first storage medium and key information thereof, and the key information corresponding to each data in the first correspondence is key information of each data, and the first storage node searches for the first data in the first correspondence according to the key information of the first data carried by the first search request. In a specific embodiment, the first storage node searches the key information of the first data in the first corresponding relation, and when the first storage node searches the key information of the first data in the first corresponding relation, the first storage node determines the data corresponding to the key information of the first data in the first corresponding relation as the first data. When the first storage node does not find the key information of the first data in the first corresponding relation, the first storage node determines that the key information of the first data and the first data are not included in the first corresponding relation, and further determines that the first data are not stored in the first storage medium.
S403, when the first data is not found in the first storage medium, the first storage node searches the second storage medium for the first data.
When the first storage node does not find the first data in the first storage medium, the first storage node searches the first data in the second storage medium according to the first search request. Optionally, the first storage node searches the first data in the second storage medium according to the key information of the first data carried by the first search request.
As an example, the second storage medium includes a second correspondence, where the second correspondence is a correspondence between data in the second storage medium and key information thereof, and the key information corresponding to each data in the second correspondence is key information of each data, and the first storage node searches for the first data in the second correspondence according to the key information of the first data carried by the first search request. In a specific embodiment, the first storage node searches the key information of the first data in the second corresponding relationship, and when the first storage node searches the key information of the first data in the second corresponding relationship, the first storage node determines the data corresponding to the key information of the first data in the second corresponding relationship as the first data. When the first storage node does not find the key information of the first data in the second corresponding relation, the first storage node determines that the key information of the first data and the first data are not included in the second corresponding relation, and further determines that the first data are not stored in the second storage medium.
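The two-level search described in S402 and S403 can be sketched as below, assuming each correspondence is modelled as a mapping from key information to data. The function name `find_data` and the returned tier labels are hypothetical and serve only to make the search order explicit.

```python
from typing import Any, Mapping, Optional, Tuple


def find_data(
    key_info: str,
    first_correspondence: Mapping[str, Any],
    second_correspondence: Mapping[str, Any],
) -> Tuple[Optional[Any], Optional[str]]:
    """Search the first correspondence, then fall back to the second one."""
    if key_info in first_correspondence:
        return first_correspondence[key_info], "first"
    if key_info in second_correspondence:
        return second_correspondence[key_info], "second"
    # The key information is in neither correspondence, so the data is not stored.
    return None, None


first = {"sensor:1": "hot"}
second = {"sensor:2": "cold"}
found_in_first = find_data("sensor:1", first, second)
found_in_second = find_data("sensor:2", first, second)
not_found = find_data("sensor:3", first, second)
```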
S404, when the first data is not found in the first storage medium and the first data is found in the second storage medium, the first storage node stores the first data in the first storage medium and sets a second expiration time for the first data.
When the first storage node does not find the first data in the first storage medium and finds the first data in the second storage medium, the first storage node stores the first data in the first storage medium and sets a second expiration time for the first data in the first storage medium. The second expiration period may be set according to the type of the first data and the access frequency of the type of data, or may be set randomly or according to other policies. The second expiration time period is greater than, less than, or equal to the first expiration time period, which is not limited in this embodiment of the present application.
Optionally, the second storage medium stores a correspondence between the first data and key information of the first data, and the first storage node reads the correspondence between the first data and the key information of the first data from the second storage medium into the first storage medium. For example, the first storage medium includes a first correspondence, which is a correspondence between data in the first storage medium and key information thereof, and the second storage medium includes a second correspondence, which is a correspondence between data in the second storage medium and key information thereof. The first storage node reads the correspondence between the first data and the key information of the first data from the second correspondence, and writes it into the first correspondence. In this embodiment, the first storage node sets the second expiration time for the correspondence, included in the first correspondence, between the first data and the key information of the first data.
S405, when the ending time of the second expiration time is reached, the first storage node deletes the first data in the first storage medium.
After the first storage node sets a second expiration period for the first data in the first storage medium, the first storage node starts a timer to count down the second expiration period. When the first storage node determines that the end time of the second expiration time period arrives, the first storage node deletes the first data in the first storage medium. By the processing mode, the first storage node can realize dynamic management of data in the first storage medium. Optionally, when the first storage node determines that the end time of the second expiration period has not arrived, the first storage node keeps the first data stored in the first storage medium.
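Steps S404 and S405 can be sketched as follows. The names `LookupNode`, `handle_lookup`, and `purge` are hypothetical, time is passed in explicitly instead of using a real timer, and the sketch assumes that the second storage medium keeps its copy of the data after the data is read into the first storage medium, which is consistent with S405 only deleting from the first storage medium but is not stated explicitly.

```python
class LookupNode:
    """Toy node that promotes data to the first medium on a miss."""

    def __init__(self, second_ttl: float, second_medium: dict) -> None:
        self.second_ttl = second_ttl
        self.first_medium = {}  # key info -> (data, expires_at)
        self.second_medium = dict(second_medium)  # key info -> data

    def handle_lookup(self, key_info: str, now: float):
        entry = self.first_medium.get(key_info)
        if entry is not None and entry[1] > now:
            return entry[0]  # found in the first medium
        data = self.second_medium.get(key_info)
        if data is not None:
            # S404: copy into the first medium and set the second expiration duration.
            self.first_medium[key_info] = (data, now + self.second_ttl)
        return data

    def purge(self, now: float) -> None:
        # S405: on expiry, delete from the first medium only; the copy in
        # the second medium is assumed to remain in place.
        expired = [k for k, (_, exp) in self.first_medium.items() if exp <= now]
        for key_info in expired:
            del self.first_medium[key_info]


lookup_node = LookupNode(second_ttl=20, second_medium={"order:42": b"payload"})
first_hit = lookup_node.handle_lookup("order:42", now=0)  # served from the second medium
cached_after_lookup = "order:42" in lookup_node.first_medium
lookup_node.purge(now=20)  # second expiration duration has ended
```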
In summary, according to the data processing method provided in the embodiment of the present application, after the first storage node receives the first search request for searching the first data, the first storage node first searches for the first data in the first storage medium, and when the first data is not found in the first storage medium, the first storage node searches for the first data in the second storage medium. Because the data access rate of the first storage medium is greater than the data access rate of the second storage medium, searching the first storage medium first helps to improve the efficiency of searching for the first data. In addition, when the first storage node does not find the first data in the first storage medium and finds the first data in the second storage medium, the first storage node stores the first data in the first storage medium, so that recently requested data is kept in the storage medium with the higher data access rate, which helps to improve the efficiency of subsequent searches.
The following are device embodiments of the present application, which may be used to perform method embodiments of the present application. For details not disclosed in the device embodiments of the present application, please refer to the method embodiments of the present application.
Referring to fig. 5, a schematic diagram of a data processing apparatus 500 according to an embodiment of the present application is shown. The data processing apparatus 500 is applied to a first storage node in a storage system, the storage system including a management node and a plurality of storage nodes, the management node being connected to the plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the first storage node including a first storage medium and a second storage medium, a data access rate of the first storage medium being greater than a data access rate of the second storage medium. The data processing apparatus 500 may implement the data processing method as shown in fig. 3 and 4. As shown in fig. 5, the data processing apparatus 500 includes: a receiving module 510 and a processing module 520.
The receiving module 510 is configured to receive a first storage request from a management node, where the first storage request carries first data. The functional implementation of the receiving module 510 may refer to the description in step S301 described above.
A processing module 520, configured to store first data in a first storage medium, and set a first expiration time period for the first data; and when the ending time of the first expiration time is reached, storing the first data in the first storage medium to the second storage medium, and deleting the first data in the first storage medium. The functional implementation of the processing module 520 may refer to the descriptions in S302 to S303 above.
Optionally, the receiving module 510 is further configured to receive a first search request from the management node, where the first search request is used to search for the first data. The function implementation of the receiving module 510 is described in step S401 above.
The processing module 520 is further configured to search the first storage medium for the first data according to the first search request; and searching the first data in the second storage medium when the first data is not searched in the first storage medium. The functional implementation of the processing module 520 may also refer to the descriptions in steps S402 to S403 described above.
Optionally, the processing module 520 is further configured to store the first data in the first storage medium and set a second expiration time for the first data when the first data is not found in the first storage medium and the first data is found in the second storage medium. The functional implementation of the processing module 520 may also refer to the description in step S404 above.
Optionally, the processing module 520 is further configured to delete the first data in the first storage medium when the end time of the second expiration time period is reached. The functional implementation of the processing module 520 may also refer to the description in step S405 above.
Optionally, the processing module 520 is configured to store the first data and the key information of the first data in the first storage medium. The first search request carries key information of the first data, and the processing module 520 is configured to search the first data in the first storage medium according to the key information of the first data.
Optionally, the first storage medium is a cache and the second storage medium is a storage disk.
In summary, in the data processing apparatus provided in the embodiment of the present application, after the receiving module receives the first storage request carrying the first data, the processing module first stores the first data in the first storage medium and sets a first expiration time for the first data, and when the ending time of the first expiration time is reached, the processing module stores the first data in the first storage medium to the second storage medium and deletes the first data in the first storage medium. By means of the data processing mode, the first storage medium can share a part of storage pressure, and the storage pressure of the second storage medium can be reduced.
The embodiment of the application also provides a data processing device, which comprises a memory and a processor; the memory is used for storing a computer program; the processor is configured to execute the computer program stored in the memory to cause the data processing device to perform all or part of the steps of the method as shown in any one of figures 2 to 3.
Embodiments of the present application provide a computer-readable storage medium having a computer program stored therein, which when executed (e.g., by a data processing apparatus, one or more processors, etc.) implements all or part of the steps of a method as shown in any of fig. 2-3.
The present application provides a computer program product comprising a program or code which, when executed (e.g. by a data processing apparatus, one or more processors, etc.), performs all or part of the steps of a method as provided by the method embodiments described above.
It should be understood that the term "at least one" in this application refers to one or more, and "a plurality" refers to two or more. The term "at least two" refers to two or more. In addition, for purposes of clarity of description, the words "first," "second," "third," and the like are used throughout this application to distinguish between identical or similar items that have substantially the same function and effect. Those skilled in the art will appreciate that the words "first," "second," "third," etc. do not limit the number and order of execution.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing description of the exemplary embodiments of the present application is not intended to limit the invention to the particular embodiments disclosed, but on the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the spirit and scope of the invention.

Claims (6)

1. A data processing method, applied to a first storage node in a storage system, the storage system including a management node and a plurality of storage nodes, the management node being connected to the plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the first storage node including a first storage medium and a second storage medium, a data access rate of the first storage medium being greater than a data access rate of the second storage medium, the method comprising:
receiving a first storage request from a management node, wherein the first storage request carries first data and key information of the first data, the key information of the first data comprises at least one of a type of the first data, an identification of the first data and a dynamic factor of a deployment environment of a storage system corresponding to the first data, each storage node is provided with a plurality of slots, each slot is provided with a number, different storage nodes correspond to different slot number ranges, the management node is used for performing verification calculation on the key information carried by the data storage request through a verification algorithm, determining the storage node corresponding to the data storage request according to a verification calculation result and the slot number range corresponding to each storage node, and sending the first storage request to the storage node;
Storing the first data and key information of the first data in the first storage medium, setting a first expiration time for the first data, wherein the first storage medium comprises a first corresponding relation, the first corresponding relation is a corresponding relation of the data in the first storage medium and the key information thereof, the key information corresponding to each data in the first corresponding relation is the key information of each data, and the first storage node stores the corresponding relation of the first data and the key information of the first data in the first corresponding relation;
when the ending time of the first expiration time is reached, storing the first data in the first storage medium to the second storage medium, deleting the first data in the first storage medium, wherein the second storage medium comprises a second corresponding relation, the second corresponding relation is the corresponding relation of the data in the second storage medium and the key information thereof, the key information corresponding to each data in the second corresponding relation is the key information of each data, and the first storage node is used for reading the corresponding relation of the first data and the key information of the first data from the first storage medium into the second storage medium;
The method further comprises the steps of:
receiving a first search request from the management node, wherein the first search request is used for searching the first data, and the first search request carries key information of the first data;
searching the first data in the first storage medium according to the key information of the first data, and determining the data corresponding to the key information of the first data in the first corresponding relation as the first data by the first storage node when the first storage node searches the key information of the first data in the first corresponding relation;
when the first data is not found in the first storage medium, the first storage node is used for searching key information of the first data in a second corresponding relation, and when the first storage node is used for searching the key information of the first data in the second corresponding relation, the first storage node determines data corresponding to the key information of the first data in the second corresponding relation as the first data.
2. The method according to claim 1, wherein the method further comprises:
When the first data is not found in the first storage medium and the first data is found in the second storage medium, storing the first data in the first storage medium, and setting a second expiration time for the first data;
and deleting the first data in the first storage medium when the ending time of the second expiration time is reached.
3. A method according to claim 1 or 2, characterized in that,
the first storage medium is a cache and the second storage medium is a storage disk.
4. A data processing apparatus for use in a first storage node in a storage system, the storage system comprising a management node and a plurality of storage nodes, the management node being coupled to the plurality of storage nodes, the first storage node being any one of the plurality of storage nodes, the first storage node comprising a first storage medium and a second storage medium, a data access rate of the first storage medium being greater than a data access rate of the second storage medium, the apparatus comprising:
a receiving module, configured to receive a first storage request from the management node, wherein the first storage request carries first data and key information of the first data, the key information of the first data comprises at least one of a type of the first data, an identification of the first data and a dynamic factor of a deployment environment of a storage system corresponding to the first data, each storage node is provided with a plurality of slots, each slot is provided with a number, different storage nodes correspond to different slot number ranges, the management node is used for performing verification calculation on the key information carried by the data storage request through a verification algorithm, determining the storage node corresponding to the data storage request according to a verification calculation result and the slot number range corresponding to each storage node, and sending the first storage request to the storage node;
The processing module is used for storing the first data and key information of the first data in the first storage medium and setting a first expiration time length for the first data; when the ending time of the first expiration time is reached, storing the first data in the first storage medium to the second storage medium, deleting the first data in the first storage medium, wherein the first storage medium comprises a first corresponding relation, the first corresponding relation is the corresponding relation of the data in the first storage medium and the key information thereof, the key information corresponding to each data in the first corresponding relation is the key information of each data, the first storage node stores the corresponding relation of the first data and the key information of the first data in the first corresponding relation, the second storage medium comprises a second corresponding relation, the second corresponding relation is the corresponding relation of the data in the second storage medium and the key information thereof, the key information corresponding to each data in the second corresponding relation is the key information of each data, and the first storage node is used for reading the corresponding relation of the first data and the key information of the first data from the first storage medium;
The receiving module is further configured to receive a first search request from the management node, where the first search request is used to search the first data, and the first search request carries key information of the first data;
the processing module is further used for searching the first data in the first storage medium according to the key information of the first data; and searching the first data in the second storage medium when the first data is not searched in the first storage medium, wherein the first storage node is used for searching the key information of the first data in the second corresponding relation, and determining the data corresponding to the key information of the first data in the second corresponding relation as the first data by the first storage node when the first storage node searches the key information of the first data in the second corresponding relation.
5. The apparatus according to claim 4, characterized in that,
the processing module is further configured to store the first data in the first storage medium and set a second expiration time for the first data when the first data is not found in the first storage medium and the first data is found in the second storage medium;
The processing module is further configured to delete the first data in the first storage medium when the end time of the second expiration duration is reached.
6. The apparatus according to claim 4 or 5, characterized in that,
the first storage medium is a cache and the second storage medium is a storage disk.
CN202211356247.8A 2022-11-01 2022-11-01 Data processing method and device Active CN115657954B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211356247.8A CN115657954B (en) 2022-11-01 2022-11-01 Data processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211356247.8A CN115657954B (en) 2022-11-01 2022-11-01 Data processing method and device

Publications (2)

Publication Number Publication Date
CN115657954A CN115657954A (en) 2023-01-31
CN115657954B true CN115657954B (en) 2023-06-20

Family

ID=84994781

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211356247.8A Active CN115657954B (en) 2022-11-01 2022-11-01 Data processing method and device

Country Status (1)

Country Link
CN (1) CN115657954B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105045535A (en) * 2015-07-22 2015-11-11 北京京东尚科信息技术有限公司 Method and system for automatically deleting expired data

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008134777A (en) * 2006-11-28 2008-06-12 Oki Electric Ind Co Ltd Caching method of file allocation table
CN104378396B (en) * 2013-08-15 2018-05-15 上海七牛信息技术有限公司 Data administrator and method
CN106843770A (en) * 2017-01-23 2017-06-13 北京思特奇信息技术股份有限公司 A kind of distributed file system small file data storage, read method and device
CN108845768A (en) * 2018-06-19 2018-11-20 郑州云海信息技术有限公司 A kind of date storage method, device, equipment and storage medium
CN109597568B (en) * 2018-09-18 2022-03-04 天津字节跳动科技有限公司 Data storage method and device, terminal equipment and storage medium
CN110489063B (en) * 2019-08-27 2023-12-19 北京奇艺世纪科技有限公司 Method and device for setting cache expiration time, electronic equipment and storage medium
CN111737254B (en) * 2020-05-29 2022-12-23 苏州浪潮智能科技有限公司 Method and device for improving application cache utilization rate

Also Published As

Publication number Publication date
CN115657954A (en) 2023-01-31

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant