CN114610441A - Method, system, device and storage medium for flink parameter optimization based on yarn scheduling

Method, system, device and storage medium for flink parameter optimization based on yarn scheduling

Info

Publication number
CN114610441A
CN114610441A
Authority
CN
China
Prior art keywords
parameter
optimization
parameters
application
taskmanager
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210168517.6A
Other languages
Chinese (zh)
Inventor
薛楚
周明伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202210168517.6A priority Critical patent/CN114610441A/en
Publication of CN114610441A publication Critical patent/CN114610441A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/455 Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • G06F 9/45558 Hypervisor-specific management and integration aspects
    • G06F 2009/4557 Distribution of virtual machine instances; Migration and load balancing
    • G06F 2009/45583 Memory management, e.g. access or allocation
    • G06F 2009/45595 Network integration; Enabling network access in virtual machine instances
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F 9/4806 Task transfer initiation or dispatching
    • G06F 9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Factory Administration (AREA)

Abstract

The application relates to a method, a system, a device and a storage medium for flink parameter optimization based on yarn scheduling, wherein the method comprises the following steps: acquiring a task request, and starting a flink application through a yarn cluster according to the task request; acquiring index parameters during the running of the flink application in real time; and dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements. Through the method and the device, the problem that parameters cannot be dynamically optimized according to operating conditions is solved, and dynamic parameter optimization according to the actual operating conditions during the running of the flink application is achieved.

Description

Method, system, device and storage medium for flink parameter optimization based on yarn scheduling
Technical Field
The present application relates to the field of stream computing technologies, and in particular, to a method, a system, a device, and a storage medium for flink parameter optimization based on yarn scheduling.
Background
Flink is a distributed processing engine that executes arbitrary streaming-data programs in a data-parallel and pipelined manner for stateful computation over unbounded and bounded data streams, and is therefore widely applied in real-time statistics, analysis and recommendation. To make full use of application resources in practice, the parameter configuration of a flink application usually needs to be optimized.
In the prior art, the parameters of a flink application are usually set manually according to a fixed configuration strategy, resulting in a static resource configuration, so the parameters cannot be adjusted and optimized according to the actual operating conditions while the flink application is running.
For the problem in the related art that parameters cannot be dynamically optimized according to operating conditions, no effective solution has yet been proposed.
Disclosure of Invention
In the present embodiment, a method, a system, a device and a storage medium for flink parameter optimization based on yarn scheduling are provided to solve the problems in the related art.
In a first aspect, in this embodiment, a method for flink parameter optimization based on yarn scheduling is provided, where the method includes:
acquiring a task request, and starting a flink application through a yarn cluster according to the task request;
acquiring index parameters during the running of the flink application in real time;
and dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements.
In some of these embodiments, the initial parameters include the number of TaskManagers, the number of slots on each TaskManager, the number of CPU cores used by each TaskManager, and the memory used by each TaskManager.
In some embodiments, the dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements includes:
when the index parameter does not meet the specification requirement, adjusting the initial parameter according to a corresponding preset adjustment strategy to obtain a first optimized parameter;
if the first optimization parameter meets the specification requirement, further optimizing the first optimization parameter to obtain a second optimization parameter;
and if the first optimization parameter does not meet the specification requirement, raising an alarm for the exception.
In some embodiments, the adjusting the initial parameter to obtain a first optimized parameter includes:
doubling the number of TaskManagers in the initial parameters, while keeping the number of slots on each TaskManager and the number of CPU cores used by each TaskManager in the initial parameters unchanged;
when the throughput in the index parameter reaches the specification requirement, obtaining the first optimization parameter;
and when the throughput in the index parameters does not meet the specification requirement and the number of TaskManagers has reached the upper limit, doubling the number of slots on each TaskManager and the number of CPU cores used by each TaskManager until the throughput reaches the specification requirement, so as to obtain the first optimization parameter.
In some embodiments, the further optimizing the first optimization parameter to obtain a second optimization parameter includes:
halving the number of TaskManagers in the first optimization parameter by bisection, and doubling the number of slots on each TaskManager and the number of CPU cores used by each TaskManager in the first optimization parameter, until the number of CPU cores would exceed an upper limit;
when the number of CPU cores would exceed the upper limit and the throughput in the index parameters still reaches the specification requirement, the yarn cluster cannot provide more application resources, and the first optimization parameter from the last adjustment before the upper limit was exceeded is taken as the second optimization parameter.
In some of these embodiments, the method further comprises:
and on the premise that the service requirement can still be met, halving the number of CPU cores used by each TaskManager in the second optimization parameter by bisection, and then adaptively adjusting the memory used by each TaskManager in the second optimization parameter so that the service requirement is met with the minimum memory, so as to obtain a third optimization parameter.
In some of these embodiments, the method further comprises:
and when the index parameters meet the specification requirements, optimizing the initial parameters according to a corresponding preset adjustment strategy to obtain fourth optimized parameters.
In some embodiments, the optimizing the initial parameter to obtain a fourth optimized parameter includes:
halving the number of TaskManagers in the initial parameters by bisection, and doubling the number of slots on each TaskManager and the number of CPU cores used by each TaskManager in the initial parameters, until the number of CPU cores would exceed an upper limit;
when the number of CPU cores would exceed the upper limit and the throughput in the index parameters still reaches the specification requirement, the yarn cluster cannot provide more application resources, and the initial parameter configuration from the last adjustment before the upper limit was exceeded is taken as the fourth optimization parameter.
In some of these embodiments, the method further comprises:
and on the premise that the service requirement can still be met, halving the number of CPU cores used by each TaskManager in the fourth optimization parameter by bisection, and then adaptively adjusting the memory used by each TaskManager in the fourth optimization parameter so that the service requirement is met with the minimum memory, so as to obtain a fifth optimization parameter.
In some of these embodiments, the method further comprises:
after each adjustment of the number of TaskManagers, comparing the memory usage rate in the index parameters with a preset value;
if the memory usage rate exceeds the preset value, doubling the memory used by each TaskManager;
and if the memory usage rate does not exceed the preset value, keeping the memory used by each TaskManager unchanged.
In some of these embodiments, the method further comprises:
periodically acquiring a training sample set from real data collected in an actual application scenario;
and performing the corresponding configuration for the training sample set by the above method, and performing target recognition according to the training sample set.
In a second aspect, in this embodiment, a flink parameter optimization system based on yarn scheduling is provided, including:
a resource application module, configured to acquire a task request and start the flink application through a yarn cluster according to the task request;
a parameter acquisition module, configured to acquire index parameters during the running of the flink application in real time;
and a dynamic optimization module, configured to dynamically optimize the initial parameters of the flink application according to a corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements.
In a third aspect, in this embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the method for flink parameter optimization based on yarn scheduling according to the first aspect.
In a fourth aspect, in the present embodiment, a storage medium is provided, on which a computer program is stored, where the program, when executed by a processor, implements the method for flink parameter optimization based on yarn scheduling according to the first aspect.
Compared with the related art, the method, system, device and storage medium for flink parameter optimization based on yarn scheduling provided by this embodiment acquire a task request and start the flink application through a yarn cluster according to the task request; acquire index parameters during the running of the flink application in real time; and dynamically optimize the initial parameters of the flink application according to a corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements. By deploying the flink application on the yarn cluster, dynamically optimizing its initial parameters according to the corresponding preset adjustment strategy, and letting the yarn cluster schedule the flink application resources, the problem that parameters cannot be dynamically optimized according to operating conditions is solved, and dynamic optimization of the operating parameters according to the actual operating conditions is realized.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a block diagram of a hardware architecture of a method for flink parameter optimization based on yarn scheduling in one embodiment;
FIG. 2 is a flow diagram of a method for flink parameter optimization based on yarn scheduling in one embodiment;
FIG. 3 is a flow chart of the corresponding preset adjustment strategy when the index parameters do not meet the specification requirement in one embodiment;
FIG. 4 is a flow chart of the preset adjustment strategy when the index parameters meet the specification requirement according to an embodiment;
FIG. 5 is a diagram illustrating the implementation of the method for flink parameter optimization based on yarn scheduling in the preferred embodiment;
FIG. 6 is a block diagram of a flink parameter optimization system based on yarn scheduling in one embodiment.
In the figure: 10. a resource application module; 20. a parameter acquisition module; 30. and a dynamic optimization module.
Detailed Description
For a clearer understanding of the objects, aspects and advantages of the present application, reference is made to the following description and accompanying drawings.
Unless defined otherwise, technical or scientific terms used herein shall have the same general meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The use of the terms "a" and "an" and "the" and similar referents in the context of this application do not denote a limitation of quantity, either in the singular or the plural. The terms "comprises," "comprising," "has," "having," and any variations thereof, as referred to in this application, are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or modules, but may include other steps or modules (elements) not listed or inherent to such process, method, article, or apparatus. Reference throughout this application to "connected," "coupled," and the like is not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. Reference to "a plurality" in this application means two or more. "and/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In general, the character "/" indicates an "or" relationship between the objects associated before and after it. The terms "first," "second," "third," and the like in this application are used for distinguishing between similar items and not necessarily for describing a particular sequential or chronological order.
The method embodiments provided in the present embodiment may be executed in a terminal, a computer, or a similar computing device. Taking execution on a terminal as an example, fig. 1 is a block diagram of a hardware structure of the terminal for the method for flink parameter optimization based on yarn scheduling according to this embodiment. As shown in fig. 1, the terminal may include one or more processors 102 (only one is shown in fig. 1) and a memory 104 for storing data, where the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA). The terminal may also include a transmission device 106 for communication functions and an input-output device 108. It will be understood by those of ordinary skill in the art that the structure shown in fig. 1 is merely an illustration and is not intended to limit the structure of the terminal described above. For example, the terminal may also include more or fewer components than shown in fig. 1, or have a different configuration than shown in fig. 1.
The memory 104 can be used for storing computer programs, for example, software programs and modules of application software, such as a computer program corresponding to the yarn scheduling-based flink parameter optimization method in the present embodiment; the processor 102 executes various functional applications and data processing by running the computer programs stored in the memory 104, so as to implement the above-mentioned method. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the terminal over a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. The network described above includes a wireless network provided by a communication provider of the terminal. In one example, the transmission device 106 includes a Network adapter (NIC) that can be connected to other Network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet in a wireless manner.
Yarn is a distributed resource management and scheduling system used to improve resource utilization in a distributed cluster environment, where the managed resources include memory, IO, network, disk, and the like. The resource scheduling granularity in Yarn is the container: according to a received resource application, the ResourceManager in Yarn runs a container on a NodeManager and starts the ApplicationMaster in that container, and the ApplicationMaster then performs the specific resource configuration for the application.
In this embodiment, a method for optimizing a flink parameter based on yarn scheduling is provided, and fig. 2 is a flowchart corresponding to this embodiment, and as shown in fig. 2, the flowchart includes the following steps:
and step S210, acquiring a task request, and starting the flink application through a yarn cluster according to the task request.
Specifically, a flink client submits a task request to the yarn cluster and applies for application resources. After the task request is obtained, the ResourceManager in the yarn cluster starts the ApplicationMaster in an allocated container, and then, according to the task request and the applied resources, the JobManager and the TaskManagers of the flink application are started via the ApplicationMaster. During actual operation, the JobManager assigns tasks to the TaskManagers, and each TaskManager executes its tasks.
It should be noted that, because the flink application is deployed on the yarn cluster and runs in virtual machine processes, a flink application for which no task resources have been applied and which has not been started does not occupy physical machine resources, so resources can be utilized more fully.
Step S220: acquiring index parameters during the running of the flink application in real time.
Specifically, the index parameters include throughput and memory usage rate, where throughput refers to the amount of data successfully transmitted per unit time, and memory usage rate refers to the memory usage on the TaskManagers of the flink application; the index parameters acquired in real time are stored in the time-series database InfluxDB.
Further, according to actual application requirements, the index parameters may also include gc statistics and the checkpoint time consumption, where the gc statistics are related to the memory usage rate and the checkpoint time consumption is related to the throughput.
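As a hedged sketch (not part of the patent text), the two core index parameters described above can be computed from periodically sampled counters; all field and function names here are illustrative assumptions, not the patent's terminology:

```python
from dataclasses import dataclass

@dataclass
class MetricSample:
    """One real-time metric sample for a TaskManager (names illustrative)."""
    records_total: int    # cumulative records successfully transmitted
    timestamp_s: float    # sample time in seconds
    heap_used_mb: float   # TaskManager heap memory in use
    heap_max_mb: float    # TaskManager heap memory limit

def throughput(prev: MetricSample, curr: MetricSample) -> float:
    """Throughput: amount of data successfully transmitted per unit time."""
    dt = curr.timestamp_s - prev.timestamp_s
    return (curr.records_total - prev.records_total) / dt

def memory_usage(sample: MetricSample) -> float:
    """Memory usage rate on the TaskManager, as a fraction in [0, 1]."""
    return sample.heap_used_mb / sample.heap_max_mb
```

Samples like these would be written to InfluxDB at each polling interval; the patent does not prescribe a concrete sampling scheme.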
Step S230: dynamically optimizing the initial parameters of the flink application according to the corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements.
Specifically, the specification requirement refers to the required throughput and is usually preset according to user requirements. Based on whether the index parameters meet the specification requirement, i.e., based on whether the real-time throughput during the running of the flink application reaches the preset throughput, the initial parameters of the flink application are dynamically optimized according to the corresponding preset adjustment strategy. The preset adjustment strategy specifically is: first adjust the initial parameters until the specification requirement is met, then adjust the parameters of the flink application in a coordinated manner so that, while the specification requirement is still met, the utilization rate of the flink application resources is gradually improved and the consumption of memory resources is reduced. The initial parameters refer to the first set of operating parameters used when the yarn cluster starts the flink application according to the task request for the first time.
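The branching of step S230 can be sketched as below, where `grow` and `shrink` stand in for the two preset adjustment strategies of FIG. 3 and FIG. 4; the function and parameter names are assumptions for illustration only:

```python
from typing import Callable, TypeVar

P = TypeVar("P")  # a bundle of flink resource parameters

def dynamic_optimize(live_throughput: float, spec_throughput: float,
                     params: P,
                     grow: Callable[[P], P],
                     shrink: Callable[[P], P]) -> P:
    """Step S230 sketch: if the live throughput already meets the spec,
    go straight to shrinking resources toward the minimum that still
    meets it; otherwise grow resources until the spec is met, then shrink."""
    if live_throughput >= spec_throughput:
        return shrink(params)
    return shrink(grow(params))
```

With toy strategies this shows only the dispatch order; the real strategies re-submit the task request to the yarn cluster at each adjustment.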
Through the above steps, this embodiment acquires a task request and starts the flink application through the yarn cluster according to the task request; acquires index parameters during the running of the flink application in real time; and dynamically optimizes the initial parameters of the flink application according to the corresponding preset adjustment strategy based on whether the index parameters meet the specification requirements.
Existing flink application resource configuration sets a fixed configuration strategy by manually predicting the resources that may be needed before the flink application runs, so the parameters cannot be adjusted during the running of the flink application according to different application environments or actual operating conditions. This embodiment provides an effective supplement to the prior art: index parameters are acquired in real time during the running of the flink application, and the initial parameters are dynamically optimized by corresponding preset adjustment strategies based on the real-time index parameters, which solves the problem in the prior art that parameters cannot be dynamically optimized according to operating conditions.
In some embodiments, the initial parameters include the number of TaskManagers, the number of slots on each TaskManager, the number of CPU cores used by each TaskManager, and the memory used by each TaskManager.
Specifically, the initial parameter setting includes four TaskManagers, with one slot allocated to each TaskManager, one CPU core used by each TaskManager, and 2 GB of memory used by each TaskManager. The yarn cluster performs the first resource configuration for the flink application according to the initial parameter setting in the task request and starts the flink application. In other embodiments, the initial parameters may also be configured according to the application environment, which is not limited herein.
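A minimal sketch of this initial configuration; the field names are illustrative assumptions, and the defaults mirror the embodiment (four TaskManagers, one slot and one CPU core each, 2 GB of memory each):

```python
from dataclasses import dataclass

@dataclass
class FlinkResourceParams:
    """Initial parameters of the flink application (field names illustrative)."""
    num_taskmanagers: int = 4     # number of TaskManagers
    slots_per_tm: int = 1         # number of slots on each TaskManager
    cores_per_tm: int = 1         # CPU cores used by each TaskManager
    memory_mb_per_tm: int = 2048  # memory used by each TaskManager (2 GB)

    def total_slots(self) -> int:
        # overall task capacity: processes (TaskManagers) x threads (slots)
        return self.num_taskmanagers * self.slots_per_tm
```

The optimization strategies below each transform a bundle like this into the next candidate configuration.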
Each TaskManager in the flink application is a JVM process in which one or more threads can execute; the slot number controls how many threads a TaskManager can accept. At runtime, the flink application distributes the managed memory across the slots. By adjusting the initial parameters in a coordinated manner, the specification requirement is met with the minimum memory, and the flink application runs in a few-processes (number of TaskManagers), many-threads (number of slots) mode, achieving reasonable, optimized resource configuration and performance.
Further, since the flink application is deployed on the yarn cluster, it runs in virtual machine processes. In the yarn scheduling environment, each TaskManager runs in one container, so the number of CPU cores used by each TaskManager can be configured through yarn.
In some embodiments, the initial parameters of the flink application are dynamically optimized according to the corresponding preset adjustment strategy based on whether the index parameters meet the specification requirement. When the index parameters do not meet the specification requirement, the corresponding preset adjustment strategy is as shown in fig. 3 and includes the following steps:
and S310, when the index parameter does not meet the specification requirement, adjusting the initial parameter according to a corresponding preset adjustment strategy.
Specifically, when the throughput in the index parameter does not reach the throughput required by the specification, the number of the taskmanagers in the initial parameter is doubled, and meanwhile, the slot number on each TaskManager in the initial parameter and the number of cpu cores used by each TaskManager are kept unchanged.
It should be noted that, in the yarn scheduling environment, each time the operating parameters are adjusted, the flink application resubmits the task request and re-applies for resources; the yarn cluster releases the previously allocated resources and re-allocates the TaskManagers and other resources to the flink application.
Further, after each adjustment of the number of TaskManagers, the memory usage rate in the index parameters is compared with a preset value;
if the memory usage rate exceeds the preset value, the memory used by each TaskManager is doubled;
and if the memory usage rate does not exceed the preset value, the memory used by each TaskManager is kept unchanged.
Specifically, the preset value of the memory usage rate is 70%, and the memory usage rate of each TaskManager changes after each adjustment of the number of TaskManagers. If the memory usage rate exceeds 70%, the memory used by each TaskManager is doubled; if it does not exceed 70%, the memory is kept unchanged. In other embodiments, the preset value may take other values, such as 60%, 80% or 90%, which is not limited herein.
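The memory rule above reduces to one small function; the 70% threshold is the embodiment's preset value, and the names are assumptions for illustration:

```python
PRESET_USAGE = 0.70  # preset memory usage rate; 60%-90% are also possible

def adjust_tm_memory(memory_mb: int, usage: float) -> int:
    """Applied after each change to the number of TaskManagers: double the
    per-TaskManager memory when usage exceeds the preset value, else keep it."""
    return memory_mb * 2 if usage > PRESET_USAGE else memory_mb
```

Note that a usage of exactly 70% does not "exceed" the preset value, so the memory is kept unchanged in that case.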
Further, the method also comprises the following steps:
and step S311, when the throughput in the index parameters meets the specification requirement, obtaining a first optimization parameter.
Specifically, if the throughput can already meet the specification requirement after the number of taskmanagers is adjusted in step S310, the adjusted initial parameter is used as the first optimization parameter.
Step S312, when the throughput in the index parameters does not meet the specification requirement and the number of the taskmanagers reaches the upper limit, the slot number of each TaskManager and the number of cpu cores used by each TaskManager are doubled until the throughput reaches the specification requirement, and a first optimization parameter is obtained.
Specifically, if only the number of taskmanagers is adjusted in step S310 above and cannot meet the specification requirement, and the number of taskmanagers has reached the upper resource limit of the yann cluster, the slot number on each TaskManager and the number of cpu cores used by each TaskManager are further doubled until the throughput reaches the specification requirement. And adjusting the initial parameters through the steps to increase the throughput during the operation, and assuming that the throughput set by a user in the specification requirement is 3000bit/s, until the throughput during the operation reaches 3000bit/s, indicating that the throughput reaches the specification requirement, so as to obtain a first optimized parameter. In practice, the throughput in the specification requirements can be set autonomously according to the user requirements.
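Steps S310 to S312 can be sketched as the following loop, where `measure` stands in for actually re-deploying the application and reading its live throughput; this is an illustrative sketch under that assumption, not the patent's implementation:

```python
def grow_until_spec(tm: int, slots: int, cores: int, tm_cap: int,
                    measure, spec: float):
    """Phase 1 doubles the TaskManager count (slots/cores unchanged)
    while below the cluster cap; once the cap would be exceeded,
    phase 2 doubles slots and CPU cores until throughput meets the spec."""
    while measure(tm, slots, cores) < spec:
        if tm * 2 <= tm_cap:
            tm *= 2              # step S310: double the TaskManagers
        else:
            slots *= 2           # step S312: double slots and CPU cores
            cores *= 2
    return tm, slots, cores      # the first optimization parameter
```

With a toy `measure` proportional to total cores this converges quickly; in practice each iteration re-submits the task request to the yarn cluster and re-reads the metrics.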
The yarn cluster resources include the number of CPU cores, memory, and other resources, and are limited by the physical machines deployed at the application site.
Step S320, if the first optimized parameter meets the specification requirement, further optimizing the first optimized parameter to obtain a second optimized parameter.
Specifically, after the above steps S310 and S312, if the throughput achieved by running with the first optimization parameter reaches the throughput set by the user in the specification requirement, the first optimization parameter meets the specification requirement and is further optimized to obtain the second optimization parameter, as follows:
halving the number of the TaskManagers in the first optimization parameter by a bisection method, and doubling the slot number of each TaskManager and the number of cpu cores used by each TaskManager in the first optimization parameter until the number of the cpu cores exceeds the upper limit;
when the number of cpu cores exceeds the upper limit and the throughput in the index parameters still meets the specification requirement, the yarn cluster cannot provide more application resources, and the first optimization parameter from the last adjustment before the upper limit was exceeded is used as the second optimization parameter.
Specifically, reducing the number of TaskManagers allows more slot threads to run simultaneously on one TaskManager, improving the resource utilization of each TaskManager. Further, after each adjustment of the number of TaskManagers, the memory usage rate in the index parameters is compared with the preset value, and the memory used by each TaskManager is then doubled or kept unchanged accordingly.
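This second-stage bisection can be sketched as follows. The parameter dictionary and the `run_and_measure` callback (returning the measured throughput and memory usage rate of the running flink application) are illustrative assumptions, not the patented code.

```python
def second_optimization(params, run_and_measure, target_throughput,
                        core_upper_limit, mem_threshold=0.7):
    """Halve the TaskManager count by bisection while doubling the
    slots and cpu cores per TaskManager; the last configuration that
    fits under the cpu-core upper limit and still meets the throughput
    specification is kept as the second optimization parameter."""
    best = dict(params)
    while (params["cores"] * 2 <= core_upper_limit
           and params["taskmanagers"] > 1):
        params = dict(params,
                      taskmanagers=params["taskmanagers"] // 2,
                      slots=params["slots"] * 2,
                      cores=params["cores"] * 2)
        throughput, mem_usage = run_and_measure(params)
        if throughput < target_throughput:
            break                              # specification no longer met
        if mem_usage > mem_threshold:          # same 70% memory rule
            params["memory_mb"] *= 2
        best = dict(params)                    # last valid configuration
    return best
```

The sketch deliberately keeps the previous valid configuration in `best`, matching the description that the parameters from before the limit was exceeded become the second optimization parameter.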
And step S330, if the first optimization parameter does not meet the specification requirement, alarming to be abnormal.
Specifically, if after the parameter adjustment in the above steps the first optimization parameter still does not meet the specification requirement, the current yarn cluster resources are insufficient: the resources of the whole yarn cluster cannot bring the actual running throughput up to the throughput set by the user in the specification requirement, and the flink application cannot apply for further resources, so an alarm is raised and an exception is thrown.
Further, the above strategy further comprises the following steps:
step S340, when the service requirement can still be met, halving the number of cpu cores used by each TaskManager in the second optimization parameter by a bisection method, and then adaptively adjusting the memory used by each TaskManager in the second optimization parameter, so that the service requirement is met with the minimum memory, thereby obtaining a third optimization parameter.
Specifically, after steps S310 to S330 the second optimization parameter already meets the specification requirement, so each TaskManager can further meet the service requirement of the actual application while using the minimum number of cpu cores. The service requirement is defined by the actual application scenario; for example, in a target-recognition scenario, service values such as recognition time, recognition rate, and recognition accuracy must not fall below preset lower bounds, and if these service values at least reach their preset lower bounds when the third optimization parameter is run, the service requirement is met. Meanwhile, the memory used by each TaskManager is increased or reduced according to the memory usage required by the user, and the resource utilization of the flink application is improved by running as many slots (threads) as possible on each TaskManager (process).
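The cpu-core halving and memory minimisation of step S340 can be sketched as follows. `meets_business_need` is a hypothetical callback that would run the application and check the service values (recognition time, rate, accuracy) against their preset lower bounds; the 512 MB memory floor is purely illustrative.

```python
def third_optimization(params, meets_business_need):
    """Step S340: halve the cpu cores per TaskManager by bisection
    while the service requirement still holds, then shrink the memory
    per TaskManager to the smallest amount that still satisfies it."""
    params = dict(params)
    while params["cores"] > 1:
        trial = dict(params, cores=params["cores"] // 2)
        if not meets_business_need(trial):
            break
        params = trial
    while params["memory_mb"] > 512:           # illustrative lower floor
        trial = dict(params, memory_mb=params["memory_mb"] // 2)
        if not meets_business_need(trial):
            break
        params = trial
    return params                              # third optimization parameter
```

Each trial configuration is only accepted when the service check still passes, so the returned parameters are the smallest that meet the service requirement under this sketch's assumptions.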
In this embodiment, when the real-time index parameters of the flink application run with the initial parameters do not meet the specification requirement, the first and second optimization parameters are obtained from the specification requirement according to the corresponding preset adjustment strategy, and the third optimization parameter is then obtained from the service requirement. The initial parameters can thus be dynamically optimized according to the index parameters during the actual operation of the application, while the third optimization parameter meets both the specification and service requirements and the yarn cluster resources are used to the maximum.
In some embodiments, based on whether the index parameter can meet the specification requirement, the initial parameter of the flink application is dynamically optimized according to a corresponding preset adjustment policy, and when the index parameter meets the specification requirement, as shown in fig. 4, the corresponding preset adjustment policy includes the following steps:
and step S410, when the index parameters meet the specification requirements, optimizing the initial parameters according to a corresponding preset adjustment strategy to obtain fourth optimized parameters.
Specifically, the number of the taskmanagers in the initial parameters is halved through a bisection method, and the slot number on each TaskManager and the number of cpu cores used by each TaskManager in the initial parameters are doubled at the same time until the number of the cpu cores exceeds the upper limit;
when the number of cpu cores exceeds the upper limit and the throughput in the index parameters still meets the specification requirement, the yarn cluster cannot provide more application resources, and the initial parameters from the last adjustment before the upper limit was exceeded are used as the fourth optimization parameter. The throughput in the index parameters meeting the specification requirement means that when the flink application is run with the optimized parameters, the actual running throughput reaches the throughput set by the user in the specification requirement.
Further, after the number of the taskmanagers is adjusted each time, the memory usage rate in the index parameter needs to be compared with a preset value, and then the memory used by each TaskManager is doubled or kept unchanged.
And step S420, under the condition that the service requirement can be met, halving the number of CPU cores used by each TaskManager in the fourth optimization parameter through a bisection method, and then adaptively adjusting the memory used by each TaskManager in the fourth optimization parameter to meet the service requirement by using the minimum memory to obtain a fifth optimization parameter.
Specifically, the service requirement is defined by the actual application scenario: it includes service values of specific services, each with a required lower or upper bound, and if these service values at least reach their preset bounds when the fifth optimization parameter is run in the actual application scenario, the service requirement is met.
In this embodiment, when the real-time index parameters of the flink application run with the initial parameters meet the specification requirement, the fourth optimization parameter is obtained from the specification requirement according to the corresponding preset adjustment strategy, and the fifth optimization parameter is then obtained from the service requirement. The initial parameters can thus be dynamically optimized according to the index parameters during the actual operation of the application, while the fifth optimization parameter meets both the specification and service requirements and the yarn cluster resources are used to the maximum.
In some of these embodiments, the method further comprises the steps of:
step S510, obtaining a training sample set at regular time according to real data collected in an actual application scene;
specifically, an upper limit is set on the training sample set so that it does not occupy excessive resources; once the sample threshold is reached, the training sample set is updated by circular coverage, so the collected training sample set can be updated in real time and better simulate the application environment.
Further, according to the different data sources of the sample set, including kafka, rabbitmq, and Hbase, sample sets from the same data source are stored in the same data-source module.
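The timed collection with circular coverage, grouped per data source, might look like the following sketch. `SampleStore` and its threshold are illustrative names; a `deque` with `maxlen` provides the circular-coverage behaviour, with new samples overwriting the oldest ones once the threshold is reached.

```python
from collections import deque

class SampleStore:
    """One bounded buffer per data source (kafka, rabbitmq, Hbase, ...);
    when a buffer reaches the sample threshold, new samples overwrite
    the oldest ones, so the training set stays current without
    unbounded resource use."""
    def __init__(self, sample_threshold=10000):
        self.threshold = sample_threshold
        self.sources = {}

    def add(self, source, sample):
        buf = self.sources.setdefault(source,
                                      deque(maxlen=self.threshold))
        buf.append(sample)   # deque drops the oldest entry when full

    def training_set(self, source):
        return list(self.sources.get(source, ()))
```

A periodic collector would simply call `add` for each incoming real-data sample, keyed by its data source.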
And S520, correspondingly configuring the training sample set through the method, and identifying the target according to the training sample set.
Specifically, sample sets such as vehicle data samples, face data samples, and Media Access Control (MAC) data samples are distinguished according to different field application environments; the training sample set is configured accordingly by the above method, and running-parameter optimization in the target recognition task is performed in the field application environment according to the training sample set.
Further, before the training samples are fully collected, target recognition is first performed with the real data while waiting for the training sample collection to complete.
With this method, the training sample set can be obtained at regular intervals from real data collected in the actual application scenario; when switching tasks between different application scenarios, only the corresponding configuration according to the training sample set is needed to perform running-parameter optimization in target recognition tasks across the different scenarios.
The present embodiment is described and illustrated below by means of preferred embodiments.
The preferred embodiment provides a method for flink parameter optimization based on yarn scheduling; fig. 5 is the corresponding implementation diagram, and the method includes the following steps:
step S610, acquiring a task request, and starting the flink application through the yarn cluster according to the task request.
And step S620, acquiring a training sample set at regular time according to the real data acquired in the actual application scene.
And step S630, correspondingly configuring the training sample set through the method.
And step S640, acquiring index parameters in real time during the running of the flink application, and storing the index parameters in the time-series database influxdb.
And step S650, dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjusting strategy based on the condition whether the index parameters can meet the specification requirements.
Specifically, the corresponding preset adjustment strategies are as exemplified in the embodiments described above.
And step S660, sending the corresponding optimized parameters obtained through dynamic optimization to the flink client, and resubmitting the task request by the flink client according to the optimized parameters.
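Steps S610 to S660 can be tied together as the following control loop. All five callbacks are hypothetical stand-ins (for yarn submission, metric sampling, influxdb persistence, the specification check, and the preset adjustment strategies); they do not correspond to an actual flink or influxdb API.

```python
def optimize_loop(submit_task, measure_metrics, store_metrics,
                  meets_spec, adjust, max_rounds=10):
    """Submit the flink task, sample its index parameters in real time,
    persist them to the time-series store, and keep resubmitting with
    adjusted parameters until the specification requirement is met."""
    params = submit_task(None)                 # initial submission (S610)
    for _ in range(max_rounds):
        metrics = measure_metrics(params)      # S640: real-time indexes
        store_metrics(metrics)                 # S640: write to influxdb
        if meets_spec(metrics):
            return params                      # optimized parameters
        params = adjust(params, metrics)       # S650: preset strategy
        params = submit_task(params)           # S660: resubmit via client
    raise RuntimeError("yarn cluster resources insufficient")
```

The `max_rounds` cap and the final exception mirror the alarm behaviour described for the case where the cluster cannot provide enough resources.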
It should be noted that the steps illustrated in the above flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is shown, in some cases the steps may be performed in a different order than described here. For example, the order of steps S610 and S620 may be interchanged: the training sample set is obtained at regular intervals first, and the flink application is then started through the yarn cluster according to the task request.
The present embodiment further provides a flink parameter optimization system based on yarn scheduling, which is used to implement the foregoing embodiments and preferred embodiments; descriptions already given are omitted for brevity. The terms "module," "unit," "subunit," and the like as used below may implement a combination of software and/or hardware for a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware, are also possible and contemplated.
Fig. 6 is a structural block diagram of the flink parameter optimization system based on yarn scheduling in this embodiment. As shown in fig. 6, the system includes a resource application module 10, a parameter acquisition module 20, and a dynamic optimization module 30, where:
the resource application module 10 is configured to obtain a task request, and start a flink application through a yarn cluster according to the task request;
the parameter acquisition module 20 is used for acquiring index parameters in the flink application running process in real time;
and the dynamic optimization module 30 is configured to dynamically optimize the initial parameter of the flink application according to a corresponding preset adjustment strategy based on whether the index parameter can meet the specification requirement.
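The three modules might be wired as in the following sketch. The class names mirror the description; the method bodies (fixed initial parameters, a simulated metric reading, a simple doubling rule) are purely illustrative placeholders, not the actual implementation.

```python
class ResourceApplicationModule:               # module 10
    def start_application(self, task_request):
        # would submit the flink application to the yarn cluster;
        # here it just returns illustrative initial parameters
        return {"taskmanagers": 2, "slots": 1,
                "cores": 1, "memory_mb": 1024}

class ParameterAcquisitionModule:              # module 20
    def collect(self, params):
        # would read real-time index parameters; simulated here
        return {"throughput": params["taskmanagers"] * 1500,
                "memory_usage": 0.5}

class DynamicOptimizationModule:               # module 30
    def optimize(self, params, metrics, target_throughput=3000):
        # one step of the preset adjustment strategy
        if metrics["throughput"] < target_throughput:
            params = dict(params,
                          taskmanagers=params["taskmanagers"] * 2)
        return params
```

In the described system these three modules would run in a loop, with module 20 feeding real-time index parameters to module 30 after each submission by module 10.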
With the system provided by this embodiment, the flink application is started through the yarn cluster according to the task request, the index parameters during the running of the flink application are acquired in real time, and the initial parameters are dynamically optimized with the corresponding preset adjustment strategies based on the real-time index parameters, solving the prior-art problem that parameters cannot be dynamically optimized according to running conditions.
The above modules may be functional modules or program modules, and may be implemented by software or hardware. For modules implemented by hardware, the above modules may be located in the same processor, or distributed across different processors in any combination.
It should be noted that, for specific examples in this embodiment, reference may be made to the examples described in the foregoing embodiments and optional implementations, and details are not described again in this embodiment.
In addition, in combination with the method for flink parameter optimization based on yarn scheduling provided in the foregoing embodiments, a storage medium may also be provided in this embodiment to implement the method. The storage medium has a computer program stored thereon; when executed by a processor, the computer program implements any of the methods for flink parameter optimization based on yarn scheduling in the above embodiments.
It should be understood that the specific embodiments described herein are merely illustrative of this application and are not intended to be limiting. All other embodiments, which can be derived by a person skilled in the art from the examples provided herein without inventive step, shall fall within the scope of protection of the present application.
It is obvious that the drawings are only examples or embodiments of the present application, and it is obvious to those skilled in the art that the present application can be applied to other similar cases according to the drawings without creative efforts. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
The term "embodiment" is used herein to mean that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is to be expressly or implicitly understood by one of ordinary skill in the art that the embodiments described in this application may be combined with other embodiments without conflict.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the patent protection. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (14)

1. A method for optimizing flink parameters based on yarn scheduling is characterized by comprising the following steps:
acquiring a task request, and starting a flink application through a yarn cluster according to the task request;
acquiring index parameters in the flink application running process in real time;
and dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjustment strategy based on the condition whether the index parameters can meet the specification requirements.
2. The method of flink parameter optimization based on yarn scheduling of claim 1, wherein the initial parameters include the number of TaskManagers, the number of slots on each TaskManager, the number of cpu cores used by each TaskManager, and the memory used by each TaskManager.
3. The method for optimizing flink parameters based on yarn scheduling as claimed in claim 1, wherein said dynamically optimizing initial parameters applied to said flink according to a corresponding preset adjustment strategy based on whether said index parameters can meet specification requirements comprises:
when the index parameter does not meet the specification requirement, adjusting the initial parameter according to a corresponding preset adjustment strategy to obtain a first optimized parameter;
if the first optimization parameter meets the specification requirement, further optimizing the first optimization parameter to obtain a second optimization parameter;
and if the first optimization parameter does not meet the specification requirement, alarming to be abnormal.
4. The method of claim 3, wherein the adjusting the initial parameter to obtain a first optimized parameter comprises:
doubling the number of the TaskManagers in the initial parameters, and keeping the slot number on each TaskManager in the initial parameters and the number of cpu cores used by each TaskManager unchanged;
when the throughput in the index parameter reaches the specification requirement, obtaining the first optimization parameter;
and when the throughput in the index parameters does not meet the specification requirement and the number of the taskmanagers reaches the upper limit, doubling the slot number of each TaskManager and the number of cpu cores used by each TaskManager until the throughput reaches the specification requirement, and obtaining the first optimization parameter.
5. The method of claim 3, wherein the further optimizing the first optimization parameter to obtain a second optimization parameter comprises:
halving the number of the TaskManagers in the first optimization parameter by a bisection method, and doubling the slot number on each TaskManager and the number of cpu cores used by each TaskManager in the first optimization parameter until the number of the cpu cores exceeds an upper limit;
when the number of the cpu cores exceeds an upper limit and the throughput in the index parameter reaches the specification requirement, the yarn cluster cannot provide more application resources, and the first optimization parameter which exceeds the upper limit last time is taken as the second optimization parameter.
6. The method of flink parameter optimization based on yarn scheduling of claim 3, further comprising:
and under the condition that the service requirement can be met, halving the number of cpu cores used by each task manager in the second optimization parameter by a bisection method, and then adaptively adjusting the memory used by each task manager in the second optimization parameter to meet the service requirement by using the minimum memory to obtain a third optimization parameter.
7. The method for optimizing flink parameters based on yarn scheduling as claimed in claim 1, wherein said dynamically optimizing initial parameters applied to said flink according to a corresponding preset adjustment policy based on whether said index parameters can meet specification requirements further comprises:
and when the index parameters meet the specification requirements, optimizing the initial parameters according to a corresponding preset adjustment strategy to obtain fourth optimized parameters.
8. The method of claim 7, wherein the optimizing the initial parameters to obtain fourth optimized parameters comprises:
halving the number of the TaskManagers in the initial parameters by a bisection method, and doubling the slot number on each TaskManager and the number of cpu cores used by each TaskManager in the initial parameters until the number of the cpu cores exceeds an upper limit;
when the number of cpu cores exceeds the upper limit and the throughput in the index parameter reaches the specification requirement, the yarn cluster cannot provide more application resources, and the initial parameter which exceeds the upper limit last time is used as the fourth optimization parameter.
9. The method of flink parameter optimization based on yarn scheduling of claim 7, further comprising:
and under the condition that the service requirement can be met, halving the number of cpu cores used by each task manager in the fourth optimization parameter by a bisection method, and then adaptively adjusting the memory used by each task manager in the fourth optimization parameter to meet the service requirement by using the minimum memory to obtain a fifth optimization parameter.
10. The method for flink parameter optimization based on yarn scheduling according to any one of claims 4, 5 or 8, further comprising:
after the number of the TaskManagers is adjusted every time, comparing the memory utilization rate in the index parameter with a preset value;
if the memory utilization rate exceeds the preset value, doubling the memory used by each TaskManager;
and if the memory utilization rate does not exceed the preset value, keeping the memory used by each TaskManager unchanged.
11. The method of flink parameter optimization based on yarn scheduling of claim 1, further comprising:
acquiring a training sample set at regular time according to real data acquired in an actual application scene;
and correspondingly configuring the training sample set by the method, and identifying the target according to the training sample set.
12. A flink parameter optimization system based on yarn scheduling, comprising:
the resource application module is used for acquiring a task request and starting the flink application through a yarn cluster according to the task request;
the parameter acquisition module is used for acquiring index parameters in the flink application running process in real time;
and the dynamic optimization module is used for dynamically optimizing the initial parameters of the flink application according to a corresponding preset adjustment strategy based on the condition that whether the index parameters can meet the specification requirements.
13. A computer device comprising a memory and a processor, wherein the memory has stored therein a computer program, and the processor is configured to execute the computer program to perform the method for flink parameter optimization based on yarn scheduling of any one of claims 1 to 11.
14. A computer-readable storage medium, having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for flink parameter optimization based on yarn scheduling of any one of claims 1 to 11.
CN202210168517.6A 2022-02-23 2022-02-23 Method, system, equipment and storage medium for flight parameter optimization based on yarn scheduling Pending CN114610441A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210168517.6A CN114610441A (en) 2022-02-23 2022-02-23 Method, system, equipment and storage medium for flight parameter optimization based on yarn scheduling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210168517.6A CN114610441A (en) 2022-02-23 2022-02-23 Method, system, equipment and storage medium for flight parameter optimization based on yarn scheduling

Publications (1)

Publication Number Publication Date
CN114610441A true CN114610441A (en) 2022-06-10

Family

ID=81859373

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210168517.6A Pending CN114610441A (en) 2022-02-23 2022-02-23 Method, system, equipment and storage medium for flight parameter optimization based on yarn scheduling

Country Status (1)

Country Link
CN (1) CN114610441A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115328667A (en) * 2022-10-18 2022-11-11 杭州比智科技有限公司 System and method for realizing task resource elastic expansion based on flink task index monitoring


Similar Documents

Publication Publication Date Title
JP7192103B2 (en) DATA PROCESSING METHOD AND APPARATUS, AND COMPUTING NODE
Rodriguez et al. Towards the deployment of a fully centralized Cloud-RAN architecture
CN110119307B (en) Data processing request processing method and device, storage medium and electronic device
CN111767146A (en) Distributed machine learning system acceleration method based on network reconfiguration
CN111258746A (en) Resource allocation method and service equipment
US20230037783A1 (en) Resource scheduling method and related apparatus
US20240202024A1 (en) Thread processing methods, scheduling component, monitoring component, server, and storage medium
CN114610441A (en) Method, system, equipment and storage medium for flight parameter optimization based on yarn scheduling
US20240152395A1 (en) Resource scheduling method and apparatus, and computing node
CN109951311B (en) Method, device, equipment and storage medium for network slice instantiation
CN112398664B (en) Main device selection method, device management method, electronic device and storage medium
WO2020108337A1 (en) Cpu resource scheduling method and electronic equipment
CN107908730B (en) Method and device for downloading data
CN113301087B (en) Resource scheduling method, device, computing equipment and medium
CN112380001A (en) Log output method, load balancing device and computer readable storage medium
CN107493485B (en) Resource control method and device and IPTV server
CN112231223A (en) Distributed automatic software testing method and system based on MQTT
CN108667920B (en) Service flow acceleration system and method for fog computing environment
CN111162942A (en) Cluster election method and system
US20080165770A1 (en) Transmitting and receiving method and apparatus in real-time system
CN111309467B (en) Task distribution method and device, electronic equipment and storage medium
CN111245794B (en) Data transmission method and device
CN110209475B (en) Data acquisition method and device
US10877800B2 (en) Method, apparatus and computer-readable medium for application scheduling
CN106470228B (en) Network communication method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination