CN113220436A - Universal batch operation execution method and device under distributed environment - Google Patents

Universal batch operation execution method and device under distributed environment

Info

Publication number
CN113220436A
CN113220436A (application CN202110588206.0A)
Authority
CN
China
Prior art keywords
scheduling
batch
batch job
scheduler
list
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110588206.0A
Other languages
Chinese (zh)
Inventor
杜海亮
李偲伟
刘洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
ICBC Technology Co Ltd
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
ICBC Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC, ICBC Technology Co Ltd filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202110588206.0A priority Critical patent/CN113220436A/en
Publication of CN113220436A publication Critical patent/CN113220436A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/48Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806Task transfer initiation or dispatching
    • G06F9/4843Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/48Indexing scheme relating to G06F9/48
    • G06F2209/486Scheduler internals
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention, which can be used in the technical field of big data, provides a method and an apparatus for executing generic batch jobs in a distributed environment. The method includes: receiving scheduling parameters and a scheduling request sent by a scheduler; generating a scheduling list according to the scheduling parameters; and, in response to the scheduling request, executing the generic batch job according to the scheduling list. The method and apparatus adopt a distinctive design that separates the scheduler from the executor, with a carefully designed scheduling arrangement: the scheduler no longer acts as the commanding "brain" of scheduling, and the executor generates its scheduling tasks automatically at run time. This effectively reduces the complexity of batch-run system design and overcomes the technical shortcomings of prior solutions.

Description

Universal batch operation execution method and device under distributed environment
Technical Field
The invention belongs to the technical field of big data, relates in particular to service invocation in distributed systems, and specifically provides a method and an apparatus for executing generic batch jobs in a distributed environment.
Background
In the prior art, application systems whose batch operations center on data processing are implemented either with stored procedures and task scheduling tied to a database system or a third-party scheduling system, or with dedicated ETL software. The resulting scheduler and executor are special-purpose, with limited generality and extensibility. Designing a simple, easy-to-use, general framework applicable to most batch execution scenarios is therefore of great significance for batch-job applications. Specifically, existing batch-operation scenarios generally follow one of two technical routes:
(1) batch-run system composed of database stored procedures and a job scheduling chain
In data-processing-centric application systems, batch runs are usually implemented by relying on the stored procedures and JOB scheduling chain of the database system itself. FIG. 1 depicts such a database-centric batch computing system, which relies on the scheduling and computing capabilities of the database.
Operation unit: the role of executor is typically fulfilled by stored procedures. Each stored procedure completes one specific piece of work and accepts parameters to provide variable computing capability; different computing units are implemented by different groups of stored procedures, and each operation unit is preset and cannot exchange its execution capability with another.
JOB chain scheduling: completes execution-order arrangement, execution control, and execution scheduling for the operation units. The execution dependencies between operation units are defined in advance by the JOB chain; once defined, scheduling proceeds at run time according to the preset logic.
The main characteristics of this technique are that the scheduler carries the task scheduling capability while each execution node completes only one independent function, and that scheduling and execution are realized by calling internal APIs (application program interfaces), leaving the two tightly coupled.
(2) Batch system composed of special scheduler and special executor
Fig. 2 illustrates a batch operation architecture with a scheduler at its core; it consists of three main parts: a scheduler, an executor, and a JOB scheduling flow configuration.
Scheduler: acts as the command center for overall batch scheduling; the execution order among jobs and which executor each job runs on must all be controlled by the scheduler. The scheduler uses dedicated storage space to hold the JOB scheduling flow configuration and is effectively the brain control center of the entire batch task;
JOB scheduling flow configuration: generally stored in a database or in files, it records each pre-designed and pre-defined JOB and the scheduling order among the JOBs; to prevent cycles among scheduled JOBs, many JOB configurations are required to form a directed acyclic graph;
Executor: the type of JOB each executor can run is preset; an executor registers itself with the scheduler at run time and executes a JOB upon receiving the scheduler's instruction.
In this scheme the scheduler and the executor communicate over proprietary interfaces, which are generally closed.
The prior art has the following disadvantages. For the two mainstream scheduling systems above, the shortcomings are analyzed as follows:
(1) batch-run system composed of database stored procedures and a job scheduling chain
This system depends heavily on the database; its main problems are:
Scheduling and execution are tightly coupled, and extensibility is poor: the scheme provides no general-purpose executor; each piece of operation logic is implemented with database-dependent technology, and the scheduling task chain can only schedule database-related tasks. It is hard to extend to non-database domains; for example, if the overall batch-run task contains non-data-processing links, other implementations must cooperate with the database system, so extensibility is poor.
The scheduling logic is complex, and translating business capability into technology is difficult: this scheduling method suits only simple scenarios; if the dependencies among scheduled jobs are complex, a large amount of scheduling-chain arrangement and design is needed, and the schedule is hard to adjust or change.
Not suitable for distributed batch computation scenarios: because the database system is centralized, the core computing capability is concentrated entirely on the database server. Computing capability can generally be raised only by upgrading the server configuration or optimizing the database system; it is hard to scale out by adding computing hosts, and a batch task is hard to spread across multiple hosts or multiple database systems.
Heterogeneous database environments are hard to schedule: because scheduling and execution cannot be separated from the database system, this mode can hardly provide unified scheduling when the batch run involves several heterogeneous database systems;
Monitoring of batch operations is difficult: different monitoring tables or log records must be designed for each batch, adding extra work for the batch designer.
(2) Batch system composed of special scheduler and special executor
Compared with a batch-run system based on database stored procedures and a job scheduling chain, a professional scheduler system has obvious advantages in handling distributed batches, supporting heterogeneous database environments, and the like, but it also has obvious problems, mainly:
The scheduler and the executor are proprietary, and openness is poor: neither side is open, which limits the scheduler's range of adaptation; the pair can only be used together, and schedulers and executors cannot be mixed and matched flexibly;
Job scheduling design is complex: because job scheduling is controlled entirely by the scheduler, in a distributed environment designers must consider scenarios such as whether certain jobs must run on the same machine, and must manually design the directed-acyclic-graph scheduling logic, which increases design difficulty;
Parallel execution control is complicated: the scheduler can only execute parallel logic along the paths of the directed acyclic graph and cannot control the degree of concurrency on each execution path, so computing resources may go underused, and the parallelism of each path can be neither displayed nor dynamically adjusted.
Business expansion is difficult: the scheduler can only run the operation types its current executors provide; complex, strongly correlated business must be split into the existing operation types, so expansion is hard, and even when a development interface is reserved, development remains difficult;
Idempotent job execution is hard to achieve, and resuming from a breakpoint after a task interruption is difficult: because job scheduling plans are distributed by the scheduler and generally take a complex tree structure, when a job fails, a new execution cannot be restored onto the original host, and resuming from the breakpoint is difficult.
Disclosure of Invention
The invention belongs to the technical field of big data, and provides a universal batch job execution method and a universal batch job execution device under a distributed environment.
In order to solve the technical problems, the invention provides the following technical scheme:
in a first aspect, the present invention provides a method for executing a universal batch job in a distributed environment, including:
receiving a scheduling parameter and a scheduling request sent by a scheduler;
generating a scheduling list according to the scheduling parameters;
and responding to the scheduling request, and executing the general batch job according to the scheduling list.
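The three claimed steps can be sketched as follows. This is a minimal illustrative sketch, not the patented implementation: the class and method names (`Executor`, `receive`, `build_schedule_list`, `run`) and the parameter shape are assumptions introduced here for clarity.

```python
from dataclasses import dataclass, field


@dataclass
class Executor:
    """Minimal sketch of the executor side of the three claimed steps."""
    schedule_list: list = field(default_factory=list)

    def receive(self, params: dict, request: str) -> None:
        # Step 1: receive scheduling parameters and a scheduling request.
        self.params, self.request = params, request

    def build_schedule_list(self) -> list:
        # Step 2: derive the scheduling list from the parameters alone;
        # the scheduler never sends a task list itself.
        self.schedule_list = [f"{self.params['step']}:{j}" for j in self.params["jobs"]]
        return self.schedule_list

    def run(self) -> list:
        # Step 3: respond to the request by executing the generated list.
        return [f"done {entry}" for entry in self.schedule_list]


ex = Executor()
ex.receive({"step": "S1", "jobs": ["load", "check"]}, "START")
ex.build_schedule_list()   # -> ['S1:load', 'S1:check']
```

The point the sketch makes is structural: the scheduler supplies only parameters and a request, and the list of work is generated on the executor side.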
In one embodiment, said executing a generic batch job according to the scheduling list in response to the scheduling request includes:
executing the generic batch job in a process or thread mode.
In one embodiment, the scheduler interface is an HTTP REST interface; the HTTP REST interface adopts an immediate-return mechanism.
In one embodiment, said executing a generic batch job according to said scheduling list in response to said scheduling request further comprises:
generating a job list according to the scheduling parameters and the SDK;
and executing the generic batch job according to the job list.
In an embodiment, the method for executing a generic batch job in a distributed environment further includes:
registering the generic batch job in the scheduler.
In a second aspect, the present invention provides a general batch job execution apparatus in a distributed environment, including:
a scheduling request receiving module, configured to receive a scheduling parameter and a scheduling request sent by a scheduler;
the scheduling list generating module is used for generating a scheduling list according to the scheduling parameters;
and the batch job execution module is used for responding to the scheduling request and executing the general batch job according to the scheduling list.
In one embodiment, the batch job execution module includes:
a batch job execution first unit for executing the generic batch job in a process or thread mode;
the scheduler interface is an HTTP REST interface; the HTTP REST interface adopts an immediate-return mechanism.
In one embodiment, the batch job execution module further includes:
the job list generating unit is used for generating a job list according to the scheduling parameters and the SDK;
a batch job execution second unit configured to execute the general batch job according to the job list;
In one embodiment, the generic batch job execution apparatus in the distributed environment further comprises:
and the batch job registration module is used for registering the general batch job in the scheduler.
In a third aspect, the present invention provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the method for performing generic batch jobs in a distributed environment when executing the program.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a generic batch job execution method in a distributed environment.
As can be seen from the above description, the method and apparatus for executing generic batch jobs in a distributed environment according to embodiments of the present invention first receive scheduling parameters and a scheduling request sent by a scheduler; then generate a scheduling list according to the scheduling parameters; and finally, in response to the scheduling request, execute the generic batch job according to the scheduling list. The invention adopts a distinctive design that separates the scheduler from the executor, with a carefully designed scheduling arrangement: the scheduler no longer plays the role of a commanding "brain" of scheduling, and the executor generates scheduling tasks automatically at run time. This effectively reduces the complexity of batch-run system design and overcomes the defects of prior solutions, specifically as follows:
(1) the scheduler controls the batch-run steps solely through four standard interfaces (start, stop, query, and test) without issuing a task list; it no longer plays the role of a scheduling brain, which reduces the complexity of the scheduling design, while the executor is highly open and supports third-party scheduling systems completing the batch-run execution;
(2) the executor automatically generates the scheduling list from key parameters passed by the scheduler and provides an open framework that can be extended with any number of tasks and jobs; tasks execute sequentially, and the parallelism of the jobs within a single task can be controlled by the implementer, so the serial and parallel logic of the whole system is under control: simple serial design guarantees the sequential logic of the business, while parallelism at key execution steps makes full use of computing resources;
(3) the executor supports launching the whole operation in thread mode or process mode to suit scenarios with different reliability requirements: thread mode suits frequently scheduled, short tasks, while process mode suits long-running tasks with high reliability requirements;
(4) in a distributed environment, a coordinator makes the start/stop interfaces of multiple executors idempotent, so the executors run correctly in the distributed environment and support high availability and load sharing;
(5) through special design, the executor supports complex application scenarios such as task re-execution, execution with new parameters, execution with old parameters, and resuming from a breakpoint.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a prior art database-centric batch computing system;
FIG. 2 is a block diagram of a prior art architecture for batch operations with a scheduler as a core;
FIG. 3 is a first flowchart illustrating a method for executing a generic batch job in a distributed environment according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an internal structure of a generic batch job execution system in a distributed environment according to an embodiment of the present invention;
FIG. 5 is an internal structural view of an actuator in an embodiment of the invention;
FIG. 6 is a first flowchart illustrating a step 300 of a method for executing a generic batch job in a distributed environment according to an embodiment of the present invention;
FIG. 7 is a second flowchart illustrating step 300 of a method for executing generic batch jobs in a distributed environment according to an embodiment of the present invention;
FIG. 8 is a flowchart illustrating a second method for executing a generic batch job in a distributed environment according to an embodiment of the present invention;
FIG. 9 is a flowchart illustrating a method for executing a generic batch job in a distributed environment according to an embodiment of the present invention;
FIG. 10 is a first block diagram illustrating a generic batch job execution device in a distributed environment according to an embodiment of the present invention;
FIG. 11 is a first block diagram illustrating the structure of the batch job execution module 30 according to an embodiment of the present invention;
FIG. 12 is a second block diagram illustrating the structure of the batch job execution module 30 according to an embodiment of the present invention;
FIG. 13 is a diagram illustrating a second exemplary architecture of a generic batch job execution device in a distributed environment according to an embodiment of the present invention;
fig. 14 is a schematic structural diagram of an electronic device in an embodiment of the invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and the above-described drawings, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
It should be noted that the embodiments and features of the embodiments in the present application may be combined with each other without conflict. The present application will be described in detail below with reference to the embodiments with reference to the attached drawings.
An embodiment of the present invention provides a specific implementation of a general batch job execution method in a distributed environment, and referring to fig. 3, the method specifically includes the following steps:
step 100: and receiving the scheduling parameters and the scheduling request sent by the scheduler.
Specifically, the executor receives the scheduling parameters and the scheduling request sent by the scheduler, and referring to fig. 4, in an embodiment of the present invention, the scheduler, the executor and the coordinator are separated compared to the conventional scheduling batch system.
Step 200: and generating a scheduling list according to the scheduling parameters.
Referring to fig. 5, the executor in the embodiment of the present invention consists of a scheduling interface module, a batch task generation module, and a batch task worker module. Specifically, for each batch STEP, a job list must be implemented in advance at development time against the SDK that the executor exposes to developers; the length of the job list may grow or shrink at run time according to the parameters passed by the scheduler, but the task types and job types it contains are designed in advance by the developer.
A STEP comprises at least one TASK, and the order of TASKs is fixed. Each TASK consists of one or more JOBs. The framework executes TASKs sequentially, while the JOBs within a single TASK execute in parallel; the JOB parallelism can be specified by a parameter at run time or adjusted automatically according to the resources of the computing host. When the JOB parallelism of a TASK is set to 1, the JOBs in that TASK execute in order of their JOB numbers.
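The STEP/TASK/JOB structure above can be sketched in a few lines. This is an illustrative sketch under assumed names (`Step`, `Task`); the patent specifies the structure, not this code: TASKs run strictly in order, JOBs inside one TASK run in parallel up to a per-task parallelism, and parallelism 1 degrades to sequential execution.

```python
from dataclasses import dataclass, field
from concurrent.futures import ThreadPoolExecutor


@dataclass
class Task:
    jobs: list           # callables, ordered by JOB number
    parallelism: int = 1


@dataclass
class Step:
    tasks: list = field(default_factory=list)

    def run(self) -> list:
        results = []
        for task in self.tasks:   # TASK order is fixed
            # JOBs of one TASK run in parallel; pool.map preserves
            # input order in the returned results.
            with ThreadPoolExecutor(max_workers=task.parallelism) as pool:
                results.extend(pool.map(lambda job: job(), task.jobs))
        return results


step = Step([Task([lambda: 1, lambda: 2], parallelism=2), Task([lambda: 3])])
step.run()   # -> [1, 2, 3]
```

Note how the whole structure is a plain ordered list of TASKs, which is what lets the scheme avoid a directed-acyclic-graph configuration.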
Step 300: and responding to the scheduling request, and executing the general batch job according to the scheduling list.
At run time, the executor receives the scheduler's START request together with the batch STEP and its parameters, forms the JOB list instance of the current batch (with the parameters instantiated), and stores it in the coordinator. It then starts a BatchWorker to execute the whole STEP; the unit of execution is a JOB, and after each JOB finishes, its result and state are stored in the coordinator. When multiple JOBs execute, TASK order is followed strictly and JOBs run in parallel on the specified number of concurrent threads. If any JOB fails, the BatchWorker records it in the coordinator and stops launching further JOBs, so the whole batch step is interrupted.
The BatchWorker supports both a process running mode and a thread running mode: process mode suits scenarios with high reliability requirements, while thread mode suits flexible, short-task scenarios. For each batch step, a timeout in seconds may be supplied at start-up; when the batch has not completed within the allowed time, the executor kills the current batch and registers it in the coordinator as a timed-out batch step.
The executor may also accept STOP requests from the scheduler to kill the batch steps being executed.
As can be seen from the above description, the method for executing generic batch jobs in a distributed environment according to the embodiment of the present invention first receives scheduling parameters and a scheduling request sent by a scheduler; then generates a scheduling list according to the scheduling parameters; and finally, in response to the scheduling request, executes the generic batch job according to the scheduling list. The invention adopts a distinctive design that separates the scheduler from the executor: the scheduler no longer plays the role of a commanding brain of scheduling, and the executor generates scheduling tasks automatically at run time, effectively reducing the complexity of batch-run system design and overcoming the defects of prior solutions.
In one embodiment, referring to fig. 6, step 300 further comprises:
step 301: executing the generic batch job in a process or thread mode.
Specifically, the scheduling interface module responds to a scheduling request from the scheduler, passes the parameters on, and calls the batch task generation module; the batch task generation module registers all tasks with the coordinator and invokes the batch task worker (BatchWorker) in process or thread mode; the BatchWorker completes the execution of the TASKs and JOBs, and the execution results are recorded in the coordinator.
In one embodiment, the scheduler interface is an HTTP REST interface; the HTTP REST interface adopts a vertical return mechanism.
Access control of the executor is completed through a standard interface; the present invention recommends an HTTP REST interface. The interface provides four general operations for each batch: start, stop, query, and test. All operations target batch steps.
1) Starting: starting the current step;
2) stop: stopping the current step;
3) query: queries the execution state of the current step;
4) test: performs no real business function; it is used only to test the availability of the interface, for example detecting the DB connection state.
The interface is implemented by the executor; the start interface supports passing different parameters to ensure that different batch steps can be invoked with different arguments. Every interface adopts an immediate-return mechanism to guarantee the reliability of the scheduler.
In one embodiment, referring to fig. 7, step 300 further comprises:
step 30A: generating a job list according to the scheduling parameters and the SDK;
step 30B: and executing the general batch operation according to the operation list.
It will be appreciated that a software development kit (SDK) is generally a collection of development tools with which software engineers build application software for a particular software package, software framework, hardware platform, or operating system; more broadly, it refers to the collection of related documents, examples, and tools that assist in developing a certain class of software. An SDK may be as simple as files that provide an application program interface (API) for a programming language, or it may include complex hardware for communicating with an embedded system. Typical tools include utilities for debugging and other purposes; SDKs also often include example code, supporting technical notes, and other documentation for basic reference.
In steps 30A and 30B, for each batch STEP, the developer implements a job list in advance during the development phase through the SDK exposed by the executor. The length of the job list can be increased or decreased at runtime according to the parameters passed in by the scheduler, but the task types and job types contained in the job list are designed in advance by the developer.
A STEP comprises at least one TASK, and the order of the TASKs is fixed. Each TASK is composed of one or more JOBs. The framework executes the TASKs sequentially, while the JOBs within a single TASK are executed in parallel; the degree of JOB parallelism can be specified by a parameter at runtime or adjusted automatically according to the resources of the computing host, and when the JOB parallelism of a TASK is set to 1, the JOBs in that TASK are executed in the order of their JOB numbers. In this way, any complex batch step can be reduced to a JOB list instead of a complex directed acyclic graph, which simplifies the implementation of batch-run business while still supporting parallelism in each processing stage for higher efficiency.
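The STEP/TASK/JOB model described above can be sketched as follows, with a STEP represented as an ordered list of TASKs and each TASK as a list of JOB callables; the names and data structures are illustrative assumptions, not the patent's actual SDK.

```python
from concurrent.futures import ThreadPoolExecutor

def run_step(tasks, parallelism=4):
    """Illustrative sketch: TASKs run strictly in order; the JOBs inside
    one TASK run in parallel; parallelism == 1 degenerates to sequential
    execution in job-number order."""
    results = []
    for jobs in tasks:                # TASKs: strictly sequential
        if parallelism == 1:
            # JOBs executed one by one, in the order of their numbers
            results.extend(job() for job in jobs)
        else:
            # JOBs of the same TASK executed concurrently
            with ThreadPoolExecutor(max_workers=parallelism) as pool:
                results.extend(pool.map(lambda j: j(), jobs))
    return results
```

Because `ThreadPoolExecutor.map` returns results in input order, the result list is deterministic even when the JOBs themselves finish out of order.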
In one embodiment, referring to fig. 8, the method for executing a generic batch job in a distributed environment further includes:
step 400: registering the generic batch job in the scheduler.
Specifically, the task generation module completes registration of all generic batch jobs on the coordinator.
To further illustrate the present solution, the present invention further provides a specific application example of the general batch job execution method in a distributed environment.
Description of terms:
operation: english is abbreviated as JOB, which is a processing work that a computer system needs to complete in order to realize a certain service capability in a distributed computing, network and storage environment, for example, some SQL statements are executed in a database system; completing the downloading, analysis and warehousing of certain files; the transfer of some data is completed from two heterogeneous databases, the job in the invention refers to the minimum unit which can be executed by a batch system, the execution of the job itself has atomicity, and the execution result of one job is either successful or failed.
Task: abbreviated TASK in English; composed of one or more jobs, which may be related to one another or entirely unrelated. The execution of a task is the execution of all the jobs it contains.
Batch: abbreviated BATCH in English; a series of computer operations performed to complete a complex service process, usually executed on a fixed cycle. For example, a banking system must assess the risk level of customers who opened accounts that day every night, and a securities system must tally every customer's daily stock holdings to compute each customer's total daily assets, a computation that must run as a daily batch. To simplify the problem, the batch of the present invention is composed of a number of batch steps that are executed in serial order.
Batch step (STEP): the minimum unit of the batch; the batch steps are executed serially to complete the execution of the whole batch. In the present invention, one batch step is formed by one task or by several tasks executed in series.
The execution of a batch step is the execution of all the tasks under it; these tasks generally have an ordering relationship, that is, the several tasks of a batch step must be executed in sequence.
Scheduler: also called batch scheduler or scheduling server, abbreviated Dispatcher; the software and hardware system set up to control the execution of a batch. It generally directs an executor to start, stop, query the status of, or test a batch execution.
Executor: also called batch executor or execution server; the key software and hardware system that carries out batch execution and state management, and usually the entity that consumes the bulk of the computing resources. In a distributed environment, several executors are usually deployed to ensure the reliability and efficiency of batch execution; their number can be designed according to the number of batch types and the operating load, and can be expanded dynamically with the system's data scale. The executor described in the present invention consists of three parts: an interface module (Interface), a batch generator (BatchInvoker), and a batch executor (BatchWorker). Referring to fig. 9, the generic batch job execution method in a distributed environment provided by this specific application example includes the following steps:
s1: the scheduler is separated from the executor.
The coordinator serves as the state recorder for batch STEP instances, and its state data is shared by all executors. The execution state of any JOB is recorded in the coordinator immediately, and when all JOBs of a single STEP have succeeded or failed, the coordinator automatically records the execution state of that batch step. When several executors start the same type of batch STEP, the coordinator checks whether the same STEP is already running; for conflicting STEPs, the coordinator reports the conflict, which is returned through the executor to the scheduler and conveyed onward to the scheduler interface.
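The coordinator's conflict check for concurrently started STEPs might look like the following sketch, where `StepRegistry` is an invented stand-in for the coordinator's shared state store (a production system would typically back this with something like ZooKeeper or etcd):

```python
import threading

class StepRegistry:
    """Illustrative sketch of the coordinator's conflict check: only one
    instance of a given batch STEP may run at a time; a second start
    attempt is reported back as a conflict."""

    def __init__(self):
        self._running = set()
        self._lock = threading.Lock()

    def try_start(self, step_name):
        with self._lock:
            if step_name in self._running:
                return False          # conflict: same STEP already running
            self._running.add(step_name)
            return True

    def finish(self, step_name, status):
        # A real coordinator would also persist `status` per STEP instance
        # so every executor can see the step's final state.
        with self._lock:
            self._running.discard(step_name)
```

An executor that receives `False` from `try_start` would return the conflict to the scheduler instead of running the step twice.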
Through a virtual IP or a particular scheduling strategy, the scheduler can dispatch work to different executors, supporting load sharing and high availability of executor work. The coordinator also records the execution state and load of each scheduler in order to coordinate the execution and hand-over of scheduling tasks, thereby realizing high availability and load sharing for batch jobs in a distributed environment.
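The scheduler-side failover across peer executors can be sketched as a simple try-next loop; the `dispatch` function and the callable executors below are illustrative assumptions standing in for virtual-IP or load-balancer mechanisms such as Nginx/LB.

```python
def dispatch(executors, request):
    """Illustrative failover sketch: try each peer executor in turn and
    fall through to the next one when a host is unreachable."""
    last_err = None
    for executor in executors:
        try:
            return executor(request)
        except ConnectionError as err:
            last_err = err            # this executor is down; try the next
    raise RuntimeError("no executor available") from last_err
```

Because every scheduling interface is idempotent, retrying the same request against another peer executor is safe.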
S2: the generic batch job is executed by the executor alone.
Specifically, the BatchInvoker itself contains a simple, easy-to-use extension-framework library, and any batch step can be defined as an extension point, as exemplified in Table 1.
TABLE 1
[Table 1: extension-point definitions; the original table was rendered as an image and its contents are not recoverable]
The scheduling unit of the scheduler is a batch STEP; any number of batch STEPs form the whole batch, and the batch STEPs are executed sequentially. The scheduler controls this sequencing of the batch STEPs, and the sequencing is transparent to the executor.
For each batch STEP, the developer implements a job list in advance during the development phase through the SDK exposed by the executor. The length of the job list can be increased or decreased at runtime according to the parameters passed in by the scheduler, but the task types and job types contained in the job list are designed in advance by the developer.
A STEP comprises at least one TASK, and the order of the TASKs is fixed. Each TASK is composed of one or more JOBs. The framework executes the TASKs sequentially, while the JOBs within a single TASK are executed in parallel; the degree of JOB parallelism can be specified by a parameter at runtime or adjusted automatically according to the resources of the computing host, and when the JOB parallelism of a TASK is set to 1, the JOBs in that TASK are executed in the order of their JOB numbers.
In this way, any complex batch step can be reduced to a JOB list instead of a complex directed acyclic graph, which simplifies the implementation of batch-run business while still supporting parallelism in each processing stage for higher efficiency.
At runtime, the executor receives the scheduler's START request together with the batch STEP and its parameters, forms a JOB list instance of the current batch (with the parameters instantiated), and stores it in the coordinator. A BatchWorker is then started to execute the complete STEP; its execution unit is a JOB, and after each JOB finishes, its result and state are stored in the coordinator. When several JOBs are executed, they follow the TASK order strictly and run in parallel up to the specified number of concurrent threads. If any JOB fails, the BatchWorker records the failure in the coordinator and stops executing all running JOBs, thereby interrupting the whole batch step.
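The fail-fast behavior described above (record every JOB's outcome, stop the step on the first failure) can be sketched as follows, with a plain dict standing in for the coordinator's JOB instance list; the function and names are illustrative, not the BatchWorker's real code.

```python
def run_jobs(jobs, record):
    """Illustrative fail-fast sketch: run jobs in order, record each
    outcome in the coordinator (here a dict keyed by job index), and
    interrupt the whole batch step on the first failure."""
    for i, job in enumerate(jobs):
        try:
            job()
            record[i] = "success"
        except Exception:
            record[i] = "failed"
            return False              # interrupt the batch step
    return True
```

Jobs after the first failure are never started, so their entries never appear in the record; this is exactly the state a later breakpoint rerun inspects.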
The BatchWorker supports both process and thread execution modes: the process mode suits scenarios with high reliability requirements, while the thread mode suits flexible, short-lived tasks. In addition, a timeout in seconds may be supplied for each batch step at start-up; when the batch is not completed within this time, the executor kills the current batch and registers it in the coordinator as a timed-out batch step.
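A minimal sketch of that timeout mechanism, assuming a daemon thread stands in for a killable worker process; the function name and return values are illustrative assumptions.

```python
import threading
import time

def run_with_timeout(step_fn, timeout_s):
    """Illustrative timeout sketch: if the batch step does not finish
    within timeout_s seconds, report it as timed out (a real executor
    would kill the worker process and record the state in the
    coordinator)."""
    done = threading.Event()

    def wrapper():
        step_fn()
        done.set()

    worker = threading.Thread(target=wrapper, daemon=True)
    worker.start()
    if done.wait(timeout_s):
        return "finished"
    return "timeout"                  # registered as a timed-out batch step
```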
The executor also accepts STOP requests from the scheduler to kill batch steps that are being executed.
Regarding the executor's idempotence and breakpoint-rerun mechanism: the executor's minimum execution unit is the JOB, and every JOB is designed to be idempotent, meaning that as long as its parameters are unchanged, it can be executed any number of times without affecting the batch-run result. Therefore, when a batch step fails, a breakpoint rerun can be achieved simply by finding the earliest failed JOB in the coordinator's JOB instance list and completing execution from there, serially across TASKs and in parallel within each TASK.
If the batch parameters have changed at rerun time, the system skips the breakpoint-rerun mechanism, regenerates the JOB list, and starts execution from the first JOB, guaranteeing the correctness of the business data on rerun.
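The rerun decision described above — resume from the earliest failed JOB when parameters are unchanged, otherwise regenerate the list and start from the first JOB — can be sketched as a small helper; the function name and record format are illustrative assumptions.

```python
def resume_point(record, params_changed):
    """Illustrative rerun sketch: given the coordinator's per-job record
    ({job_index: "success" | "failed"}), return the index to restart
    from. Changed parameters force a full restart from job 0."""
    if params_changed:
        return 0                      # regenerate list, start from 1st JOB
    failed = [i for i, status in record.items() if status == "failed"]
    if failed:
        return min(failed)            # earliest failed JOB
    return len(record)                # nothing to redo
```

This is only safe because each JOB is idempotent: re-executing the earliest failed JOB (and everything after it) cannot corrupt the results of the JOBs that already succeeded.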
As can be seen from the above description, the method for executing a generic batch job in a distributed environment according to the embodiment of the present invention first receives the scheduling parameters and the scheduling request sent by the scheduler; then generates a scheduling list according to the scheduling parameters; and finally, in response to the scheduling request, executes the generic batch job according to the scheduling list. Specifically, the method provides the following beneficial effects:
(1) Simplicity and flexibility of the scheduling interface: the executor provides a REST interface for other scheduling platforms to perform batch scheduling. Scheduling supports batch execution capabilities such as start, stop, and query, and offers rerun capability for an old batch: both forced rerun and breakpoint rerun are supported, with either a new configuration or the original one; stop supports both timeout stop and direct stop, which are difficult to implement with conventional schemes.
(2) High availability and load sharing for the batch system: the executor exposes its scheduling interfaces as REST services, and every scheduling interface is idempotent. At deployment time, multiple hosts can therefore provide load sharing and high availability: the scheduling server can point to several peer executors, and when any host fails, scheduling can be initiated from the other available hosts. Technologies such as Nginx/LB can be combined with this to achieve high availability and load sharing.
(3) Efficient batch runs through a simple concurrency design: by designing the execution order of the subtasks within each execution step and the dependencies among tasks, concurrent multithreaded execution is supported, so that computing resources such as a database or a big-data platform are used to the fullest and the overall batch-run time of the system is reduced.
(4) Strong operation, maintenance, and optimization support: the coordinator records the execution of every JOB, including elapsed time, execution counts, and results, so operation and maintenance staff can easily locate the bottleneck batch nodes and optimize the system in a targeted way.
based on the same inventive concept, the embodiment of the present application further provides a general batch job execution device in a distributed environment, which can be used to implement the method described in the foregoing embodiment, such as the following embodiments. Because the principle of the universal batch job execution device in the distributed environment for solving the problem is similar to that of the universal batch job execution method in the distributed environment, the implementation of the universal batch job execution device in the distributed environment can be implemented by the universal batch job execution method in the distributed environment, and repeated parts are not described again. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. While the system described in the embodiments below is preferably implemented in software, implementations in hardware, or a combination of software and hardware are also possible and contemplated.
An embodiment of the present invention provides a specific implementation manner of a universal batch job execution apparatus in a distributed environment, which is capable of implementing a universal batch job execution method in a distributed environment, and referring to fig. 10, the universal batch job execution apparatus in the distributed environment specifically includes the following contents:
a scheduling request receiving module 10, configured to receive a scheduling parameter and a scheduling request sent by a scheduler;
a scheduling list generating module 20, configured to generate a scheduling list according to the scheduling parameter;
and the batch job execution module 30 is configured to respond to the scheduling request and execute the general batch job according to the scheduling list.
In one embodiment, referring to fig. 11, the batch job execution module 30 includes:
a batch job execution first unit 301 for executing the generic batch job in a process or thread mode;
the scheduler interface is an HTTP REST interface; the HTTP REST interface adopts an immediate-return mechanism.
In one embodiment, referring to fig. 12, the batch job execution module 30 further includes:
a job list generating unit 30A, configured to generate a job list according to the scheduling parameter and the SDK;
a batch job execution second unit 30B for executing the general batch job according to the job list;
In an embodiment, referring to fig. 13, the generic batch job execution apparatus in a distributed environment further includes:
a batch job registration module 40 for registering the generic batch job in the scheduler.
As can be seen from the above description, the apparatus for executing a generic batch job in a distributed environment according to the embodiment of the present invention first receives the scheduling parameters and the scheduling request sent by the scheduler; then generates a scheduling list according to the scheduling parameters; and finally, in response to the scheduling request, executes the generic batch job according to the scheduling list. The invention adopts a distinctive approach of separating the scheduler from the executor, so that the scheduler no longer acts as the commanding brain of scheduling: the executor generates the scheduling tasks automatically at runtime, which effectively reduces the complexity of batch-run system design and overcomes the shortcomings of the prior-art schemes.
Referring now to FIG. 14, shown is a schematic diagram of an electronic device 600 suitable for use in implementing embodiments of the present application.
As shown in fig. 14, the electronic device 600 includes a central processing unit (CPU) 601 that can perform various appropriate operations and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data necessary for the operation of the device 600. The CPU 601, ROM 602, and RAM 603 are connected to one another via a bus 604; an input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a display such as a cathode ray tube (CRT) or liquid crystal display (LCD) and a speaker; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as necessary, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to an embodiment of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present invention includes a computer-readable storage medium on which a computer program is stored; when executed by a processor, the program implements the steps of the above generic batch job execution method in a distributed environment, the steps including:
step 100: receiving the scheduling parameters and the scheduling request sent by the scheduler;
step 200: generating a scheduling list according to the scheduling parameters;
step 300: responding to the scheduling request and executing the generic batch job according to the scheduling list.
In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 609, and/or installed from the removable medium 611.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the system embodiment, since it is substantially similar to the method embodiment, the description is simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A general batch job execution method in a distributed environment is characterized by comprising the following steps:
receiving a scheduling parameter and a scheduling request sent by a scheduler;
generating a scheduling list according to the scheduling parameters;
and responding to the scheduling request, and executing the general batch job according to the scheduling list.
2. The method of claim 1, wherein the executing the generic batch job according to the dispatch list in response to the dispatch request comprises:
executing the generic batch job in a process or thread mode.
3. The method of claim 1, wherein the scheduler interface is an HTTP REST interface; and the HTTP REST interface adopts an immediate-return mechanism.
4. The method of claim 1, wherein the executing the generic batch job according to the dispatch list in response to the dispatch request further comprises:
generating a job list according to the scheduling parameters and the SDK;
and executing the general batch operation according to the operation list.
5. The method of claim 1, further comprising:
registering the generic batch job in the scheduler.
6. A general batch job execution apparatus in a distributed environment, comprising:
a scheduling request receiving module, configured to receive a scheduling parameter and a scheduling request sent by a scheduler;
the scheduling list generating module is used for generating a scheduling list according to the scheduling parameters;
and the batch job execution module is used for responding to the scheduling request and executing the general batch job according to the scheduling list.
7. The generic batch job execution apparatus in a distributed environment according to claim 6, wherein the batch job execution module comprises:
a batch job execution first unit for executing the generic batch job in a process or thread mode;
the scheduler interface is an HTTP REST interface; and the HTTP REST interface adopts an immediate-return mechanism.
8. The generic batch job execution apparatus in a distributed environment according to claim 7, wherein the batch job execution module further comprises:
the job list generating unit is used for generating a job list according to the scheduling parameters and the SDK;
a batch job execution second unit configured to execute the general batch job according to the job list;
the general batch job execution method in the distributed environment further comprises the following steps:
and the batch job registration module is used for registering the general batch job in the scheduler.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method for generic batch job execution in a distributed environment according to any one of claims 1 to 5 when executing the program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method for generic batch job execution in a distributed environment according to any one of claims 1 to 5.
CN202110588206.0A 2021-05-28 2021-05-28 Universal batch operation execution method and device under distributed environment Pending CN113220436A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110588206.0A CN113220436A (en) 2021-05-28 2021-05-28 Universal batch operation execution method and device under distributed environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110588206.0A CN113220436A (en) 2021-05-28 2021-05-28 Universal batch operation execution method and device under distributed environment

Publications (1)

Publication Number Publication Date
CN113220436A true CN113220436A (en) 2021-08-06

Family

ID=77099603

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110588206.0A Pending CN113220436A (en) 2021-05-28 2021-05-28 Universal batch operation execution method and device under distributed environment

Country Status (1)

Country Link
CN (1) CN113220436A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116661978A (en) * 2023-08-01 2023-08-29 浙江云融创新科技有限公司 Distributed flow processing method and device and distributed business flow engine
CN116661978B (en) * 2023-08-01 2023-10-31 浙江云融创新科技有限公司 Distributed flow processing method and device and distributed business flow engine

Similar Documents

Publication Publication Date Title
Warneke et al. Exploiting dynamic resource allocation for efficient parallel data processing in the cloud
US7779298B2 (en) Distributed job manager recovery
CN111949454B (en) Database system based on micro-service component and related method
CN108429787B (en) Container deployment method and device, computer storage medium and terminal
CN112579267A (en) Decentralized big data job flow scheduling method and device
CN112256414A (en) Method and system for connecting multiple computing storage engines
US7249140B1 (en) Restartable scalable database system updates with user defined rules
CN114416849A (en) Data processing method and device, electronic equipment and storage medium
CN113220436A (en) Universal batch operation execution method and device under distributed environment
Chen et al. Pisces: optimizing multi-job application execution in mapreduce
Bausch et al. Bioopera: Cluster-aware computing
Yu et al. Testing tasks management in testing cloud environment
US20200356885A1 (en) Service management in a dbms
Cai et al. Deployment and verification of machine learning tool-chain based on kubernetes distributed clusters: This paper is submitted for possible publication in the special issue on high performance distributed computing
CN112199184A (en) Cross-language task scheduling method, device, equipment and readable storage medium
US10534640B2 (en) System and method for providing a native job control language execution engine in a rehosting platform
Bodner Elastic Query Processing on Function as a Service Platforms.
US11106395B2 (en) Application execution apparatus and application execution method
Saxena et al. Paradigm shift from monolithic to microservices
CN112711448A (en) Agent technology-based parallel component assembling and performance optimizing method
CN113485894A (en) Data acquisition method, device and equipment and readable storage medium
CN112581080A (en) Lightweight distributed workflow engine construction system
Chen et al. Fangorn: adaptive execution framework for heterogeneous workloads on shared clusters
Jeon et al. Pigout: Making multiple hadoop clusters work together
Ferikoglou Resource aware GPU scheduling in Kubernetes infrastructure

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination