WO2020094064A1 - Performance optimization method, device, apparatus, and computer-readable storage medium - Google Patents

Performance optimization method, device, apparatus, and computer-readable storage medium

Info

Publication number
WO2020094064A1
WO2020094064A1 PCT/CN2019/116024 CN2019116024W WO2020094064A1 WO 2020094064 A1 WO2020094064 A1 WO 2020094064A1 CN 2019116024 W CN2019116024 W CN 2019116024W WO 2020094064 A1 WO2020094064 A1 WO 2020094064A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
node
data node
list
client
Prior art date
Application number
PCT/CN2019/116024
Other languages
English (en)
Chinese (zh)
Inventor
胡晓东
张东涛
辛丽华
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2020094064A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/10 File systems; File servers
    • G06F 16/18 File system types
    • G06F 16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F 9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Definitions

  • The present disclosure relates to the field of communication technology, and in particular to a performance optimization method, apparatus, device, and computer-readable storage medium.
  • The Hadoop Distributed File System (HDFS) is a core component of Hadoop and is currently widely used in big data services.
  • HDFS is mainly responsible for storing file data in Hadoop.
  • Files on HDFS are stored as data blocks.
  • A data block is an abstract concept: it is the logical unit in which file storage is processed.
  • A data block usually has multiple copies to increase data security. The multiple copies of a data block are usually stored on different data nodes, which may be in the same rack or in different racks.
  • When a client in HDFS wants to read a file, it usually reads the copy of the data block stored on the data node closest to it, for example the copy on a data node in the same rack. As a result, the client always accesses the data node closest to it.
  • The data node closest to the client therefore comes under excessive pressure, while data nodes farther away carry less load, resulting in an uneven pressure distribution across HDFS.
  • Consequently, the read performance of the entire HDFS is reduced.
  • The main purpose of the present disclosure is to provide a performance optimization method, apparatus, device, and computer-readable storage medium, aiming to solve the technical problem that the pressure distribution of HDFS is uneven and the read performance of the entire HDFS is degraded because the client always reads data from the data node closest to it.
  • The performance optimization method includes the following steps: after receiving a data read request sent by a client, acquiring the data nodes where the data blocks corresponding to the data read request are located; acquiring a preset sorting strategy corresponding to the data nodes, and sorting the data nodes according to the sorting strategy to obtain a data node list; and returning the data node list to the client, so that the client determines, according to the data node list, a data node that provides the read data block service.
  • The present disclosure also provides a performance optimization apparatus, which includes: an acquisition module configured to, after receiving the data read request sent by the client, acquire the data node where the data block corresponding to the data read request is located, and to acquire the preset sorting strategy corresponding to the data nodes; a sorting module configured to sort the data nodes according to the sorting strategy to obtain a data node list; and a data return module configured to return the data node list to the client, so that the client determines, according to the data node list, the data node that provides the read data block service.
  • The present disclosure also provides a performance optimization device, which includes a memory, a processor, and a performance optimization program stored in the memory and runnable on the processor; when the performance optimization program is executed by the processor, the steps of the performance optimization method described above are implemented.
  • The present disclosure also provides a computer-readable storage medium on which a performance optimization program is stored; when the performance optimization program is executed by a processor, the steps of the performance optimization method described above are implemented.
  • FIG. 1 is a schematic structural diagram of a hardware operating environment involved in the solutions of the embodiments of the present disclosure;
  • FIG. 2 is a schematic diagram of the data reading process in HDFS;
  • FIG. 3 is a schematic flowchart of a preferred embodiment of the performance optimization method of the present disclosure;
  • FIG. 4 is a sorting diagram in which data nodes are sorted by pressure according to an embodiment of the present disclosure;
  • FIG. 5 is a sorting diagram in which data nodes are sorted by distance from the client according to an embodiment of the present disclosure;
  • FIG. 6 is a sorting diagram in which data nodes are sorted by both distance from the client and pressure according to an embodiment of the present disclosure;
  • FIG. 7 is a functional block diagram of a preferred embodiment of the performance optimization apparatus of the present disclosure.
  • To this end, the present disclosure provides the following solution: after receiving the data read request sent by the client, obtain the data node where the data block corresponding to the data read request is located; obtain the preset sorting strategy corresponding to the data nodes, and sort the data nodes according to the sorting strategy to obtain a data node list; return the data node list to the client, so that the client determines, according to the data node list, the data node that provides the read data block service. This prevents the client from always reading data blocks from the data node closest to it, reduces the pressure on that nearest data node, and avoids the problems of an uneven HDFS pressure distribution and reduced read performance of the entire HDFS.
  • As shown in FIG. 1, FIG. 1 is a schematic structural diagram of the hardware operating environment of the performance optimization device involved in the solutions of the embodiments of the present disclosure.
  • the performance optimization device in the embodiment of the present disclosure may be a PC, a server, such as a metadata server of HDFS, or a mobile terminal device such as a smart phone, tablet computer, or portable computer.
  • the performance optimization device may include: a processor 1001, such as a CPU, a network interface 1004, a user interface 1003, a memory 1005, and a communication bus 1002.
  • the communication bus 1002 is used to implement connection communication between these components.
  • the user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may further include a standard wired interface and a wireless interface.
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as a disk memory.
  • the memory 1005 may optionally be a storage device independent of the foregoing processor 1001.
  • the performance optimization device may further include a camera, an RF (Radio Frequency) circuit, a sensor, an audio circuit, a WiFi module, and so on.
  • Those skilled in the art can understand that the structure of the performance optimization device shown in FIG. 1 does not constitute a limitation on the performance optimization device, which may include more or fewer components than illustrated, combine certain components, or use a different component layout.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a performance optimization program.
  • The network interface 1004 is mainly used to connect other data nodes, name nodes, or clients; HDFS operation and maintenance personnel can trigger setting instructions through the user interface 1003; and the processor 1001 can be used to call the performance optimization program stored in the memory 1005 and perform the following operations: after receiving the data read request sent by the client, obtain the data node where the data block corresponding to the data read request is located; obtain the preset sorting strategy corresponding to the data nodes, and sort the data nodes according to the sorting strategy to obtain a data node list; return the data node list to the client, so that the client determines, according to the data node list, the data node that provides the read data block service.
  • Further, when the sorting strategy is the first sorting strategy, the step of sorting the data nodes according to the sorting strategy to obtain a data node list includes: obtaining the pressure value corresponding to each data node; determining the pressure corresponding to each data node according to its pressure value; and sorting the data nodes in ascending order of pressure to obtain the data node list.
  • Further, the step of obtaining the pressure value corresponding to the data node includes: obtaining the pressure data of the data node; obtaining the pressure data scores of the data node according to the pressure data and preset pressure data score standards; and calculating the pressure value corresponding to the data node according to the pressure data scores and the corresponding preset pressure data weight values.
  • Further, when the sorting strategy is the second sorting strategy, the step of sorting the data nodes according to the sorting strategy to obtain a data node list includes: sorting the data nodes in order of their distance from the client, from nearest to farthest, to obtain a preprocessed data node list; obtaining the pressure value corresponding to each data node and detecting whether the pressure value satisfies a preset condition; and, when it is detected that the pressure value corresponding to a data node satisfies the preset condition, moving that data node to the end of the preprocessed data node list to obtain the processed data node list.
  • Further, the step of returning the data node list to the client so that the client determines, according to the data node list, the data node that provides the read data block service includes: returning the data node list to the client, so that the client determines the data node ranked first in the data node list as the data node that provides the read data block service.
  • Further, the processor 1001 may call the performance optimization program stored in the memory 1005 to perform the following operation: after receiving a setting request for setting the sorting strategy, set the sorting strategy corresponding to the data nodes according to the setting request.
  • The various embodiments below are described with the name node, i.e. the metadata server of HDFS, as the execution subject.
  • The name node is the metadata server of HDFS and is used to manage and coordinate the work of the data nodes. Its memory stores two types of metadata for the entire HDFS: (1) the namespace of the file system, namely the file directory tree, together with the data block index, i.e. the list of data blocks corresponding to each file; and (2) the mapping between data blocks and data nodes, i.e. which data nodes each data block is stored on.
  • the data node where each data block of each file is located can be obtained from the name node.
  • Each data node corresponds to a port number and IP (Internet Protocol) address. According to the port number or IP address, a data node can be uniquely identified.
  • Arabic numerals are used here to name and distinguish different data nodes. For example, when the number of copies is 3, a data block may be stored on data node 1, data node 2, and data node 3, and this mapping relationship is saved in the name node.
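  • As an illustrative aside (not part of the patent text), the two kinds of name node metadata described above can be sketched as simple in-memory mappings; the class and field names below are hypothetical.

```python
# Minimal sketch of the name node metadata described above (hypothetical names).
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class NameNodeMetadata:
    # (1) Data block index: file path -> list of block IDs (the directory tree
    #     itself is omitted here for brevity).
    block_index: Dict[str, List[str]] = field(default_factory=dict)
    # (2) Block-to-data-node mapping: block ID -> data nodes holding a copy.
    block_locations: Dict[str, List[str]] = field(default_factory=dict)

    def data_nodes_for_file(self, path: str) -> Dict[str, List[str]]:
        """Return, for each block of the file, the data nodes that store a copy."""
        return {blk: self.block_locations.get(blk, [])
                for blk in self.block_index.get(path, [])}


# Example: one file stored as one data block with three copies, as in the description.
meta = NameNodeMetadata(
    block_index={"/data/file1": ["block-1"]},
    block_locations={"block-1": ["datanode-1", "datanode-2", "datanode-3"]},
)
print(meta.data_nodes_for_file("/data/file1"))
# {'block-1': ['datanode-1', 'datanode-2', 'datanode-3']}
```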
  • The data node is responsible for storing the actual file data blocks; it is invoked by the client and the name node, and it periodically reports the information of the data blocks it stores to the name node through heartbeats.
  • a node is usually a machine.
  • the machine that reads data or files is referred to as a client.
  • The client may be a data node, a name node, or another terminal or device such as a personal computer or smartphone. Therefore, the data nodes where the copies of a data block are located and the client may be in the same rack or in different racks, and a data node and the client may even be the same machine.
  • the process of data reading in HDFS is shown in Figure 2:
  • the client initiates a read data request to the name node.
  • The read data request may be, for example, a request to read a file.
  • The name node finds the list of data blocks corresponding to the data to be read by the client according to the data block index, then finds the data nodes where each copy of each data block is located according to the mapping between data blocks and data nodes, and returns these data nodes to the client. As shown in Figure 2, the name node returns data nodes 1, 2, and 3, where the copies of the data block are located, to the client.
  • The client determines a data node that provides the read service for it and sends a read data block request to that data node. As shown in FIG. 2, the client sends a read data block request to data node 1.
  • After receiving the read data block request, data node 1 sends the copy of the data block stored on it to the client.
  • the performance optimization method includes:
  • Step S1: after receiving the data read request sent by the client, obtain the data node where the data block corresponding to the data read request is located.
  • the client initiates a data read request to the name node.
  • the name node obtains a list of data blocks corresponding to the data to be read by the client according to the data block index.
  • For example, the data to be read by the client is divided into three data blocks for storage.
  • The obtained data block list is then data block 1, data block 2, and data block 3, and each data block has three copies.
  • The name node obtains the data nodes where each copy of each data block in the data block list is located according to the mapping between data blocks and data nodes. For example, the three copies of data block 1 are stored on data nodes 1, 2, and 3, respectively. For convenience of description, the following embodiments assume that the number of data blocks is 1 and the number of data block copies is 3.
  • Step S2: acquire a preset sorting strategy corresponding to the data nodes, and sort the data nodes according to the sorting strategy to obtain a data node list.
  • the name node is preset with a sorting strategy for sorting data nodes.
  • the sorting strategy may be a strategy for sorting according to the distance between the data node and the client, or a strategy for sorting according to the pressure of the data node.
  • The sorting strategy is obtained, and the data nodes are sorted according to the sorting strategy; the sorted data nodes form the data node list. For example, if data nodes 1, 2, and 3 are sorted, the resulting data node list may be data node 1, data node 3, data node 2.
  • Step S3: return the data node list to the client, so that the client determines the data node that provides the read data block service according to the data node list.
  • the name node returns the list of data nodes sorted according to the sorting strategy to the client.
  • The client selects a data node from the data node list, determines it as the data node that provides the read data block service, and sends a read data block request to that data node.
  • After the data node receives the read data block request, it sends the corresponding data block to the client.
  • The client may select the data node ranked first in the data node list, the data node ranked second, one of the first two data nodes, a randomly selected data node, and so on.
  • step S3 includes:
  • Step a: return the data node list to the client, so that the client determines the data node ranked first in the data node list as the data node that provides the read data block service.
  • After receiving the data node list, the client selects the data node ranked first in the data node list, determines it as the data node providing the read data block service, and sends a read data block request to that data node to obtain the data block to be read.
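  • As a purely illustrative sketch (not the actual HDFS implementation), steps S1 to S3 and step a can be pictured as follows; the function names are hypothetical and the sorting strategy is left as a pluggable function.

```python
# Illustrative sketch of steps S1-S3 on the name node side and step a on the client side.
from typing import Callable, Dict, List

SortStrategy = Callable[[List[str]], List[str]]


def handle_read_request(block_locations: Dict[str, List[str]],
                        block_id: str,
                        sort_strategy: SortStrategy) -> List[str]:
    """S1: look up the data nodes holding the block; S2: sort them; S3: return the list."""
    nodes = block_locations.get(block_id, [])
    return sort_strategy(nodes)


def client_pick_node(data_node_list: List[str]) -> str:
    """Step a: the client reads from the data node ranked first in the returned list."""
    return data_node_list[0]


# Example with a trivial strategy that keeps the original order.
locations = {"block-1": ["datanode-1", "datanode-2", "datanode-3"]}
node_list = handle_read_request(locations, "block-1", sort_strategy=list)
print("read from:", client_pick_node(node_list))   # read from: datanode-1
```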
  • Further, before step S1, the method further includes:
  • Step b: after receiving a setting request for setting the sorting strategy, set the sorting strategy corresponding to the data nodes according to the setting request.
  • A variety of sorting strategies are preset in the name node for HDFS operation and maintenance personnel to choose from.
  • The operation and maintenance personnel can also set a new sorting strategy in the name node; that is, they can set different sorting strategies according to the specific situation to cope with different HDFS operating environments.
  • After the sorting strategy has been set according to the setting request, subsequent sorting of the data nodes is performed according to that strategy.
  • the sorting strategy can be managed through the HDFS configuration file.
  • the operation and maintenance personnel can modify the configuration file in the name node or the specially set management node, such as modifying the sorting strategy of the data node in the configuration file, or setting a new sorting strategy.
  • the name node can obtain the sorting strategy from the HDFS configuration file.
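  • As a hedged illustration only: the patent does not specify a configuration format, and the property name below is hypothetical (it is not a real HDFS configuration key). The sketch simply shows reading the configured sorting strategy with a default fallback.

```python
# Hypothetical sketch of reading the sorting strategy from a configuration file.
from configparser import ConfigParser

config = ConfigParser()
config.read_string("""
[namenode]
datanode.sort.strategy = pressure
""")

# Fall back to distance-based sorting if the key is absent.
strategy = config.get("namenode", "datanode.sort.strategy", fallback="distance")
print(strategy)   # pressure
```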
  • In this embodiment, after the data read request sent by the client is received, the data node where the data block corresponding to the data read request is located is obtained; the preset sorting strategy corresponding to the data nodes is obtained, and the data nodes are sorted according to the sorting strategy to obtain a data node list; the data node list is returned to the client, so that the client determines, according to the data node list, the data node that provides the read data block service.
  • In this way, the data node closest to the client is no longer always determined as the data node providing the read data block service. This prevents the client from always reading data blocks from the nearest data node, reduces the pressure on that data node, avoids an uneven HDFS pressure distribution, and improves the read performance of the entire HDFS.
  • The second embodiment of the performance optimization method of the present disclosure is as follows.
  • In this embodiment, when the sorting strategy is the first sorting strategy, the step of sorting the data nodes according to the sorting strategy in step S2 to obtain a data node list includes:
  • Step c: obtain the pressure value corresponding to the data node.
  • After obtaining the data nodes where the data block is located, the name node first obtains the current pressure value of each data node.
  • the current pressure value of the data node can be calculated by the data node according to its current pressure data and the preset pressure value calculation method in the configuration file.
  • the name node can obtain its current pressure value from the data node.
  • the calculation method of the preset pressure value in the configuration file can be set by the operation and maintenance personnel in the name node or a specially set management node. For example, the pressure value can be obtained by directly adding each pressure data.
  • The pressure value of a data node can also be calculated by the name node, based on the current pressure data of the data node obtained from the data node and the preset pressure value calculation method in the configuration file. In this case, the name node first needs to obtain the current pressure data of the data node from the data node.
  • Pressure data includes, but is not limited to, the disk IO rate (disk read and write rate), the memory usage rate, the CPU (Central Processing Unit) usage rate, and the network IO rate (network input and output rate).
  • A data node can monitor its pressure data in real time by setting up a monitoring process. For example, the monitoring process may observe that the current disk IO rate of the data node is 100 megabits per second, the memory usage rate is 20%, the CPU usage rate is 40%, and the network IO rate is 50M per second.
  • the disk IO rate and network IO rate monitored by the monitoring process may also be in the form of a percentage, that is, the disk IO rate and network IO rate are converted into percentages, for example, the disk IO rate is 30%.
  • the data node may only add a monitoring process to monitor the pressure data after detecting that the sorting strategy of the data node in the configuration file is the first sorting strategy.
  • Step d: determine the pressure corresponding to each data node according to its pressure value, sort the data nodes in ascending order of pressure, and obtain the data node list.
  • The pressure on each data node can be determined from its pressure value, and the data nodes can then be arranged in ascending order of pressure to obtain the data node list; the data node under the least pressure is ranked first. Two cases are possible: in one, the larger the pressure value, the greater the pressure on the data node; in the other, the larger the pressure value, the smaller the pressure. Which case applies depends on the method used to calculate the pressure value.
  • In either case, the data nodes under less pressure are placed at the front of the data node list.
  • In the sorting shown in FIG. 4, the larger the pressure value of a data node, the smaller its pressure.
  • Therefore, data node 2, which has the highest pressure value, is ranked first in the data node list, and data node 1, which has the lowest pressure value, is ranked last.
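  • The following sketch is illustrative only (names and values are hypothetical) and assumes the opposite convention to FIG. 4, namely that a larger pressure value means greater pressure, so the nodes are ordered by ascending pressure value.

```python
# Sketch of the first sorting strategy: order data nodes so that the least-loaded
# node comes first (here a larger pressure value is assumed to mean greater pressure).
pressure_values = {"datanode-1": 237, "datanode-2": 95, "datanode-3": 160}

data_node_list = sorted(pressure_values, key=pressure_values.get)
print(data_node_list)   # ['datanode-2', 'datanode-3', 'datanode-1']
```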
  • step c includes:
  • Step e: obtain the pressure data of the data node.
  • the name node obtains the current pressure data of the data node from the data node.
  • Step f: obtain the pressure data scores of the data node according to the pressure data and the preset pressure data score standards.
  • the configuration file is preset with the pressure data score standard of each pressure data of the data node.
  • the pressure data score standard reflects the mapping relationship between the pressure data and the pressure data score.
  • HDFS operation and maintenance personnel can set the pressure data score standards according to the specific situation. For example, the disk IO rate score standard may be set as shown in Table 1, which reflects the mapping relationship between the disk IO rate and the disk IO rate score.
  • Table 2 is the CPU usage rate score standard
  • Table 3 is the memory usage rate score standard
  • Table 4 is the network IO rate score standard. It should be understood that the pressure data score standards are not limited to the various score standards shown in the table.
  • Table 1. Disk IO rate score standard: 0-10%: 10; 11%-20%: 9; 21%-30%: 8; 31%-40%: 7; 41%-50%: 6; 51%-60%: 5; 61%-70%: 4; 71%-80%: 3; 81%-90%: 2; 91%-100%: 1.
  • Table 2. CPU usage rate score standard: 0-10%: 10; 11%-20%: 9; 21%-30%: 8; 31%-40%: 7; 41%-50%: 6; 51%-60%: 5; 61%-70%: 4; 71%-80%: 3; 81%-90%: 2; 91%-100%: 1.
  • Table 4. Network IO rate score standard: 0-10%: 10; 11%-20%: 9; 21%-30%: 8; 31%-40%: 7; 41%-50%: 6; 51%-60%: 5; 61%-70%: 4; 71%-80%: 3; 81%-90%: 2; 91%-100%: 1.
  • The name node obtains the pressure data score standards from the configuration file and compares each item of pressure data of the data node with the corresponding score standard to obtain each pressure data score. For example, when the current disk IO rate of the data node is 20%, the CPU usage rate is 30%, the memory usage rate is 40%, and the network IO rate is 20%, then according to the score standards shown in Tables 1 to 4, the data node's disk IO rate score is 9, its CPU usage score is 8, its memory usage score is 7, and its network IO score is 9.
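  • As an illustrative sketch only (the function name is hypothetical), the score lookup in the tables above can be expressed as a simple bucketing function.

```python
# Map a pressure percentage (0-100) to a score of 10 (lightest load) down to 1
# (heaviest load), following the buckets in Tables 1, 2, and 4 above.
def pressure_score(percent: float) -> int:
    if percent <= 10:
        return 10
    # 11-20% -> 9, 21-30% -> 8, ..., 91-100% -> 1
    return max(1, 10 - int((percent - 1) // 10))


# Example from the description: disk 20%, CPU 30%, memory 40%, network 20%.
print([pressure_score(p) for p in (20, 30, 40, 20)])   # [9, 8, 7, 9]
```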
  • Step g: the pressure value corresponding to the data node is calculated according to the pressure data scores and the corresponding preset pressure data weight values.
  • the configuration file is preset with the weight value of each pressure data of the data node.
  • HDFS operation and maintenance personnel can set the weight value of each kind of pressure data according to the specific situation; for example, the disk IO rate weight value may be set to 10, the CPU usage weight value to 5, the memory usage weight value to 5, and the network IO rate weight value to 8.
  • The pressure value corresponding to the data node is obtained by multiplying each pressure data score by its corresponding pressure data weight value and summing the products. Continuing the example above, the pressure value is 9×10 + 8×5 + 7×5 + 9×8 = 237.
  • When the pressure value is calculated by the data node itself, the calculation process is the same as the calculation performed by the name node described above.
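  • A minimal sketch of this weighted sum, using the example scores and weights given above (variable names are illustrative):

```python
# Pressure value = sum of (pressure data score x its weight).
scores  = {"disk_io": 9, "cpu": 8, "memory": 7, "network_io": 9}
weights = {"disk_io": 10, "cpu": 5, "memory": 5, "network_io": 8}

pressure_value = sum(scores[k] * weights[k] for k in scores)
print(pressure_value)   # 9*10 + 8*5 + 7*5 + 9*8 = 237
```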
  • In this way, the first data node in the data node list is the data node under the least pressure. This avoids always placing the data node closest to the client at the top of the list, which would otherwise put excessive pressure on that nearest node and make the HDFS pressure distribution uneven, and it thereby improves the read performance of the entire HDFS.
  • The third embodiment of the performance optimization method of the present disclosure is as follows.
  • In this embodiment, when the sorting strategy is the second sorting strategy, the step of sorting the data nodes according to the sorting strategy in step S2 to obtain a data node list includes:
  • Step h: the data nodes are sorted in order of their distance from the client, from nearest to farthest, to obtain a preprocessed data node list.
  • When the data node and the client are on the same machine, the distance between them is the shortest; when they are on different machines in the same rack, the distance is farther; and when they are on machines in different racks, the distance is farther still.
  • The distance between each data node where the data block is located and the client may be the same or different.
  • The name node sorts the data nodes in order of their distance from the client, from near to far; two data nodes at the same distance from the client can be placed in either order in the preprocessed data node list. For example, the name node sorts data nodes 1, 2, and 3 in order of their distance from the client, from nearest to farthest, and obtains the preprocessed data node list shown in FIG. 5.
  • Step i: obtain the pressure value corresponding to each data node, and detect whether the pressure value corresponding to the data node satisfies the preset condition.
  • The process by which the name node acquires the pressure value corresponding to a data node is the same as the process described in step c of the second embodiment, and is not repeated here.
  • After the name node obtains the pressure value corresponding to each data node, it traverses the preprocessed data node list to check whether the pressure value of each data node satisfies the preset condition.
  • The preset condition can be set according to the specific situation. For example, if a larger pressure value means greater pressure on the data node, the preset condition may be that the pressure value of the data node is greater than a preset pressure value; if a larger pressure value means smaller pressure, the preset condition may be that the pressure value of the data node is less than the preset pressure value.
  • the preset pressure value can be set according to specific conditions.
  • Step j: when it is detected that the pressure value corresponding to a data node satisfies the preset condition, move that data node to the end of the preprocessed data node list to obtain the processed data node list.
  • Each data node whose pressure value satisfies the preset condition is moved to the end of the preprocessed data node list; after all data nodes have been traversed, the processed, i.e. final, data node list is obtained. At this point, the data node ranked first is under relatively low pressure and relatively close to the client. FIG. 6 shows the data node list obtained after data node 1, whose pressure value satisfies the preset condition, has been moved from the preprocessed data node list shown in FIG. 5 to the end.
  • In this embodiment, the data nodes are first sorted in order of their distance from the client, from nearest to farthest, and the data nodes that satisfy the preset condition, i.e. those whose pressure exceeds the preset pressure, are then moved behind all the other data nodes. As a result, the data node ranked first in the data node list is under relatively low pressure and relatively close to the client, which avoids always placing the data node closest to the client at the front and overloading it.
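  • The following sketch is illustrative only (distances, pressure values, and the threshold are hypothetical) and assumes that a pressure value greater than a preset threshold satisfies the preset condition.

```python
# Sketch of the second sorting strategy (steps h-j): sort by distance from the client,
# then move data nodes whose pressure value exceeds the threshold to the end of the list.
from typing import Dict, List


def sort_by_distance_then_pressure(distances: Dict[str, int],
                                   pressures: Dict[str, int],
                                   threshold: int) -> List[str]:
    # Step h: preprocessed list, nearest data node first.
    pre_list = sorted(distances, key=distances.get)
    # Steps i and j: keep lightly loaded nodes in front, move overloaded ones to the end.
    light = [n for n in pre_list if pressures[n] <= threshold]
    heavy = [n for n in pre_list if pressures[n] > threshold]
    return light + heavy


distances = {"datanode-1": 0, "datanode-2": 2, "datanode-3": 4}
pressures = {"datanode-1": 300, "datanode-2": 120, "datanode-3": 150}
print(sort_by_distance_then_pressure(distances, pressures, threshold=200))
# ['datanode-2', 'datanode-3', 'datanode-1']  (datanode-1 is nearest but overloaded)
```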
  • The fourth embodiment of the performance optimization method of the present disclosure is as follows.
  • In this embodiment, the step of sorting the data nodes according to the sorting strategy in step S2 to obtain a data node list includes:
  • Step k: randomly sort the data nodes to obtain a data node list.
  • the name node randomly sorts the data nodes to obtain a list of data nodes.
  • The random sorting method can be any method that arranges the data nodes in a random order.
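  • A minimal illustrative sketch of this embodiment (node names are hypothetical):

```python
# The data node list is simply a random permutation of the data nodes.
import random

data_nodes = ["datanode-1", "datanode-2", "datanode-3"]
data_node_list = random.sample(data_nodes, k=len(data_nodes))
print(data_node_list)
```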
  • the present disclosure also provides a performance optimization apparatus.
  • The performance optimization apparatus includes: an acquisition module 10 configured to, after receiving the data read request sent by the client, acquire the data node where the data block corresponding to the data read request is located, and to acquire the preset sorting strategy corresponding to the data nodes; a sorting module 20 configured to sort the data nodes according to the sorting strategy to obtain a data node list; and a data return module 30 configured to return the data node list to the client, so that the client determines, according to the data node list, the data node that provides the read data block service.
  • Further, when the sorting strategy is the first sorting strategy, the sorting module 20 includes: a first acquiring unit configured to acquire the pressure value corresponding to the data node; and a first sorting unit configured to determine the pressure corresponding to the data node according to the pressure value and to sort the data nodes in ascending order of pressure to obtain the data node list.
  • Further, the first acquiring unit further includes: an acquiring subunit configured to acquire the pressure data of the data node; and a calculating subunit configured to obtain the pressure data scores of the data node according to the pressure data and the preset pressure data score standards, and to calculate the pressure value corresponding to the data node according to the pressure data scores and the corresponding preset pressure data weight values.
  • Further, the sorting module 20 further includes: a second sorting unit configured to sort the data nodes in order of their distance from the client, from nearest to farthest, to obtain a preprocessed data node list; a second acquiring unit configured to acquire the pressure value corresponding to the data node; and a detecting unit configured to detect whether the pressure value corresponding to the data node satisfies the preset condition. The second sorting unit is further configured to, when it is detected that the pressure value corresponding to a data node satisfies the preset condition, move that data node to the end of the preprocessed data node list to obtain the processed data node list.
  • the sorting module 20 further includes: a third sorting unit for randomly sorting the data nodes to obtain a list of data nodes.
  • Further, the data return module 30 is further configured to return the data node list to the client, so that the client determines the data node ranked first in the data node list as the data node that provides the read data block service.
  • the performance optimization apparatus further includes: a setting module, configured to set a sorting strategy corresponding to the data node according to the setting request after receiving a setting request to set the sorting strategy.
  • an embodiment of the present disclosure also proposes a computer-readable storage medium having a performance optimization program stored on the computer-readable storage medium, where the performance optimization program is executed by a processor to implement the steps of the performance optimization method described above.
  • The methods in the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware; however, in many cases the former is the better implementation.
  • The technical solution of the present disclosure, in essence or in the part that contributes in some situations, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disk) and includes several instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present disclosure.
  • In summary, after receiving the data read request sent by the client, the present disclosure obtains the data node where the data block corresponding to the data read request is located, obtains the preset sorting strategy corresponding to the data nodes, sorts the data nodes according to the sorting strategy to obtain a data node list, and returns the data node list to the client, so that the client determines, according to the data node list, the data node that provides the read data block service.
  • In this way, the data node closest to the client is no longer always determined as the data node providing the read data block service. This prevents the client from always reading data blocks from the nearest data node, reduces the pressure on that data node, avoids an uneven HDFS pressure distribution, and improves the read performance of the entire HDFS.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed are a performance optimization method, device, apparatus, and computer-readable storage medium. The method comprises: upon receiving a data read request sent by a client, obtaining the data nodes where a data block corresponding to the data read request is located (S1); obtaining a predetermined sorting strategy corresponding to the data nodes, sorting the data nodes according to the sorting strategy, and obtaining a data node list (S2); and returning the data node list to the client, so that the client determines, according to the data node list, a data node to provide a read data block service (S3).
PCT/CN2019/116024 2018-11-07 2019-11-06 Performance optimization method, device, apparatus, and computer-readable storage medium WO2020094064A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811323508.X 2018-11-07
CN201811323508.XA CN111159131A (zh) 2018-11-07 2018-11-07 性能优化方法、装置、设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2020094064A1 (fr) 2020-05-14

Family

ID=70554758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/116024 WO2020094064A1 (fr) 2018-11-07 2019-11-06 Performance optimization method, device, apparatus, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN111159131A (fr)
WO (1) WO2020094064A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11425980B2 (en) 2020-04-01 2022-08-30 Omachron Intellectual Property Inc. Hair dryer

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112995280B (zh) * 2021-02-03 2022-04-22 北京邮电大学 面向多内容需求服务的数据分配方法和装置
CN113778346B (zh) * 2021-11-12 2022-02-11 深圳市名竹科技有限公司 数据读取方法、装置、设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156381A (zh) * 2014-03-27 2014-11-19 深圳信息职业技术学院 Hadoop分布式文件***的副本存取方法、装置和Hadoop分布式文件***
CN105550362A (zh) * 2015-12-31 2016-05-04 浙江大华技术股份有限公司 一种存储***的索引数据修复方法和存储***
US20170373977A1 (en) * 2016-06-28 2017-12-28 Paypal, Inc. Tapping network data to perform load balancing
CN108009260A (zh) * 2017-12-11 2018-05-08 西安交通大学 一种大数据存储下结合节点负载和距离的副本放置方法
US20180285167A1 (en) * 2017-04-03 2018-10-04 Ocient, Inc Database management system providing local balancing within individual cluster node

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9424272B2 (en) * 2005-01-12 2016-08-23 Wandisco, Inc. Distributed file system using consensus nodes
CN102546782B (zh) * 2011-12-28 2015-04-29 北京奇虎科技有限公司 一种分布式***及其数据操作方法

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104156381A (zh) * 2014-03-27 2014-11-19 深圳信息职业技术学院 Hadoop分布式文件***的副本存取方法、装置和Hadoop分布式文件***
CN105550362A (zh) * 2015-12-31 2016-05-04 浙江大华技术股份有限公司 一种存储***的索引数据修复方法和存储***
US20170373977A1 (en) * 2016-06-28 2017-12-28 Paypal, Inc. Tapping network data to perform load balancing
US20180285167A1 (en) * 2017-04-03 2018-10-04 Ocient, Inc Database management system providing local balancing within individual cluster node
CN108009260A (zh) * 2017-12-11 2018-05-08 西安交通大学 一种大数据存储下结合节点负载和距离的副本放置方法

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11425980B2 (en) 2020-04-01 2022-08-30 Omachron Intellectual Property Inc. Hair dryer

Also Published As

Publication number Publication date
CN111159131A (zh) 2020-05-15

Similar Documents

Publication Publication Date Title
US10862957B2 (en) Dissemination of node metrics in server clusters
CN109660607B (zh) 一种业务请求分发方法、接收方法、装置及服务器集群
AU2016382908B2 (en) Short link processing method, device and server
US11238175B2 (en) File system permission setting method and apparatus
WO2020094064A1 (fr) Performance optimization method, device, apparatus, and computer-readable storage medium
CN112860695B (zh) 监控数据查询方法、装置、设备、存储介质及程序产品
US8635250B2 (en) Methods and systems for deleting large amounts of data from a multitenant database
US20180349363A1 (en) Opportunistic gossip-type dissemination of node metrics in server clusters
US20130311742A1 (en) Image management method, mobile terminal and computer storage medium
CN109885786B (zh) 数据缓存处理方法、装置、电子设备及可读存储介质
CN106790552B (zh) 一种基于内容分发网络的内容提供***
WO2020042427A1 (fr) Procédé et appareil de rapprochement basés sur des fragments de données, dispositif informatique et support de stockage
CN109981702B (zh) 一种文件存储方法及***
US20220075757A1 (en) Data read method, data write method, and server
CN105159845A (zh) 存储器读取方法
US20200242118A1 (en) Managing persistent database result sets
CN112732756B (zh) 数据查询方法、装置、设备及存储介质
CN108512768B (zh) 一种访问量的控制方法及装置
US11683316B2 (en) Method and device for communication between microservices
US12014051B2 (en) IO path determination method and apparatus, device and readable storage medium
CN112764948A (zh) 数据发送方法、数据发送装置、计算机设备及存储介质
US20080270483A1 (en) Storage Management System
US11442632B2 (en) Rebalancing of user accounts among partitions of a storage service
CN114745275A (zh) 云服务环境中的节点更新方法、装置和计算机设备
CN114253456A (zh) 一种缓存负载均衡方法和装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19881322

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19881322

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/09/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19881322

Country of ref document: EP

Kind code of ref document: A1