CN112231086A - Production workflow description and scheduling method and device for remote sensing information product - Google Patents

Info

Publication number
CN112231086A
CN112231086A CN202011140093.XA CN202011140093A CN112231086A CN 112231086 A CN112231086 A CN 112231086A CN 202011140093 A CN202011140093 A CN 202011140093A CN 112231086 A CN112231086 A CN 112231086A
Authority
CN
China
Prior art keywords
workflow
execution
task
sub
completed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011140093.XA
Other languages
Chinese (zh)
Other versions
CN112231086B (en)
Inventor
张正
李宏益
唐娉
胡昌苗
单小军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Aerospace Information Research Institute of CAS
Original Assignee
Aerospace Information Research Institute of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Aerospace Information Research Institute of CAS
Priority to CN202011140093.XA (patent CN112231086B)
Publication of CN112231086A
Application granted
Publication of CN112231086B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/48: Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806: Task transfer initiation or dispatching
    • G06F9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The embodiment of the invention discloses a method and a device for describing and scheduling production workflow of a remote sensing information product. The method comprises the following steps: describing a search rule corresponding to an input parameter and an output parameter of each algorithm in a workflow of a remote sensing information product to be produced based on a first extensible markup language file; for each workflow in the remote sensing information product, describing the type, the space range and the time range of a production product of the workflow based on a second extensible markup language file and defining a production flow based on logic execution unit nesting; and in the process of generating the remote sensing information product, controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine.

Description

Production workflow description and scheduling method and device for remote sensing information product
Technical Field
The invention relates to the technical field of remote sensing information product production, and in particular to a method and a device for describing and scheduling a production workflow of a remote sensing information product.
Background
Remote sensing information products are an important carrier for delivering remote sensing information services to scientific research institutions, industry departments and administrative agencies, and they also mark the direction in which remote sensing technology is being industrialized. These products are derived by computation from remote sensing data or from other remote sensing information products. In the past, because the algorithm models were simple, the data volumes and the spatio-temporal ranges involved were small, the degree of industrialization was low and demand was weak, production was carried out mainly by small-scale manual operation. In recent years, the following trends have created an urgent need for a workflow description and scheduling method that can handle complex flows and large-scale automated production:
Trend one: the input of product production algorithms is increasingly complex, shifting from a single remote sensing data source to the cooperative use of multi-source remote sensing data. As algorithm models are refined, the algorithms require more and more types of input data source; as satellite sensors of the same type become more abundant, algorithm systems built on multi-source data collaboration are increasingly common. To cope with this, the input of every algorithm must be abstracted and defined uniformly when a product workflow is described, which facilitates automatic scheduling.
Trend two: product systems and varieties are increasingly rich, and the hierarchical nesting of products keeps deepening. The growing variety of data collected by sensors makes it possible to compute more types of remote sensing information product. In addition, remote sensing information products are systematic and hierarchical: products in the same field are often linked into a system in which one or more lower-level products serve as input to higher-level products, so that the highest-level products sit on top of several levels of nesting. A production workflow description and scheduling method is therefore needed that supports complex hierarchical nesting and allows the described nesting relations to be extended continuously.
Trend three: the spatio-temporal scale and the computation scale of products keep growing, and timeliness requirements keep rising. As data acquisition capacity improves and data accumulate, the spatial coverage users expect has expanded from local areas to the whole country or even the globe, and the temporal coverage has expanded to the entire period for which data exist, so the computation scale inevitably increases significantly. At the same time, high-frequency data updates impose timeliness requirements on products, which raises the computational demand further. The description and scheduling method for the production workflow should therefore adapt to the computation scale.
Trend four: remote sensing information product production is converging with high-performance computing, and production systems are becoming specialized. Conventional production mainly relies on general-purpose high-performance computing platforms and general-purpose job control programs, an approach that is increasingly unable to cope with the problems arising in production. Specialized high-performance production systems dedicated to remote sensing information products, and in particular their algorithm scheduling and job control components, are therefore becoming an important direction of development.
Existing methods mainly address how to chain the whole production flow together on a general-purpose computing platform. Some provide solutions for individual links of flow control, but none addresses the four trends above, in particular the uniform abstraction of algorithm input and the description and scheduling of complex nested flows.
Disclosure of Invention
The technical problem solved by the invention is to overcome the defects of the prior art by providing a method and a device for describing and scheduling the production workflow of remote sensing information products.
In order to solve the technical problem, an embodiment of the present invention provides a method for describing and scheduling a production workflow of a remote sensing information product, including:
describing a search rule corresponding to an input parameter and an output parameter of each algorithm in a workflow of a remote sensing information product to be produced based on a first extensible markup language file;
for each workflow in the remote sensing information product, describing the type, the space range and the time range of a production product of the workflow based on a second extensible markup language file and defining a production flow based on logic execution unit nesting;
and in the process of generating the remote sensing information product, controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine.
Optionally, describing, based on the first extensible markup language file, the search rule corresponding to the input parameter and the output parameter of the algorithm includes:
acquiring, for each input parameter of each algorithm, a data type, a parameter serial number, an identifier indicating whether the parameter is an intermediate product, a spatial resolution, a time span, a time alignment strategy, and whether the data are framed;
acquiring, for each output parameter of each algorithm, a data type, a parameter serial number, a spatial resolution, a time resolution, whether the data are framed, and a grid type;
and describing the input parameters and the output parameters based on the first extensible markup language file.
Optionally, describing, based on the second extensible markup language file, the type, the spatial range and the time range of the product of the workflow and defining the production flow based on logic execution unit nesting includes:
acquiring four logic execution units corresponding to the workflow, wherein the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
nesting the serial structure, the parallel structure and the data parallel structure mutually to generate a nested structure;
describing the atomic task and the nested structure based on the second extensible markup language file.
Optionally, controlling, based on the workflow scheduling engine and in combination with the first extensible markup language file and the second extensible markup language file, the execution process of the workflow includes:
executing an activation operation and a check operation for each logic execution unit and for the workflow based on the workflow scheduling engine.
Optionally, executing the activation operation and the check operation for each logic execution unit and for the workflow based on the workflow scheduling engine includes:
for the atomic task, searching for input data according to the type and the spatio-temporal range of the remote sensing information product, publishing the atomic task after the input data are determined, and judging the execution state of the atomic task;
for the serial structure, activating the first sub-execution unit in the serial structure, judging the execution state of the activated sub-execution unit, activating the next sub-execution unit if the activated one is completed, and determining that the serial structure is completed once all sub-execution units are completed;
for the parallel structure or the data parallel structure, activating all sub-execution units simultaneously, judging the execution states of all the sub-execution units, and determining that the structure is completed when all of them are completed;
and for the workflow, activating the first sub-execution unit in the workflow, judging the execution state of the activated sub-execution unit, activating the next sub-execution unit if the activated one is completed, and determining that the workflow is completed once all sub-execution units are completed.
Optionally, the workflow scheduling engine comprises four control components of a task publisher, a task executor, a task monitor and a task scheduler, and two message queues of a task queue and a completion queue, wherein,
in the execution process, writing the activated task information into the task queue through the task publisher;
acquiring and executing tasks from the task queue by the task executor when the computing resources allow;
tracking the execution state of the task in real time through the task monitor, and writing information of task completion into a completion queue when the task is completed;
and the task scheduler acquires the completed tasks from the completion queue, analyzes the tasks to be executed next according to the structure of the workflow and activates the tasks.
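The interaction between the four control components and the two message queues can be pictured with a minimal Python sketch. It is an illustration only, not the patented implementation: the task names, the `NEXT_TASK` table, and the merging of executor and monitor into one worker thread are assumptions made to keep the example short, and the "workflow structure" is reduced to a simple serial chain.

```python
import queue
import threading

task_queue = queue.Queue()        # activated tasks waiting for computing resources
completion_queue = queue.Queue()  # names of tasks that have finished

# Hypothetical serial workflow: each task names the task that follows it.
NEXT_TASK = {"preprocess": "compute_index", "compute_index": "mosaic", "mosaic": None}
executed = []

def publish(task_name):
    """Task publisher: write an activated task into the task queue."""
    task_queue.put(task_name)

def executor_and_monitor():
    """Task executor plus task monitor: run each task, then report its completion."""
    while True:
        task_name = task_queue.get()
        if task_name is None:            # sentinel: no more work
            break
        executed.append(task_name)       # stand-in for running the algorithm program
        completion_queue.put(task_name)  # monitor writes the completion message

def scheduler():
    """Task scheduler: read completions and activate whatever comes next."""
    while True:
        done = completion_queue.get()
        following = NEXT_TASK[done]
        if following is None:            # last task finished: stop the worker
            publish(None)
            break
        publish(following)

worker = threading.Thread(target=executor_and_monitor)
worker.start()
publish("preprocess")   # activating the workflow publishes its first task
scheduler()
worker.join()
print(executed)
```

Because the components communicate only through the two queues, more executor threads or processes could be attached to the same task queue as computing resources grow, which is one plausible way to read the adaptability to computation scale claimed for the method.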
In order to solve the above technical problem, an embodiment of the present invention further provides a device for describing and scheduling a production workflow of a remote sensing information product, including:
a first language description module, configured to describe, for each algorithm in a workflow of a remote sensing information product to be produced, a search rule corresponding to an input parameter and an output parameter of the algorithm based on a first extensible markup language file;
a second language description module, configured to describe, for each workflow in the remote sensing information product, the type, the spatial range and the time range of the product of the workflow based on a second extensible markup language file, and to define a production flow based on logic execution unit nesting;
and an execution process control module, configured to control the execution process of the workflow, based on a workflow scheduling engine and in combination with the first extensible markup language file and the second extensible markup language file, in the process of producing the remote sensing information product.
Optionally, the first language description module comprises:
an input parameter acquisition unit, configured to acquire, for each input parameter of each algorithm, a data type, a parameter serial number, an identifier indicating whether the parameter is an intermediate product, a spatial resolution, a time span, a time alignment strategy, and whether the data are framed;
an output parameter acquisition unit, configured to acquire, for each output parameter of each algorithm, a data type, a parameter serial number, a spatial resolution, a time resolution, whether the data are framed, and a grid type;
and a first language description unit, configured to describe the input parameters and the output parameters based on the first extensible markup language file.
Optionally, the second language description module comprises:
a logic unit obtaining unit, configured to obtain four logic execution units corresponding to the workflow, where the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
a nested structure generating unit configured to nest the serial structure, the parallel structure, and the data parallel structure with each other to generate a nested structure;
a second language description unit for describing the atomic task and the nested structure based on the second extensible markup language file.
Optionally, the execution process control module comprises:
an activation and check operation execution unit, configured to execute an activation operation and a check operation for each logic execution unit and for the workflow based on the workflow scheduling engine.
Optionally, the activation and check operation execution unit includes:
a first execution subunit, configured to, for the atomic task, search for input data according to the type and the spatio-temporal range of the remote sensing information product, publish the atomic task after the input data are determined, and judge the execution state of the atomic task;
a second execution subunit, configured to activate, for the serial structure, a first sub-execution unit in the serial structure, determine an execution state of the activated sub-execution unit, and if the sub-execution unit is completed, activate a next sub-execution unit until all sub-execution units are completed, and determine that the serial structure is completed;
a third execution subunit, configured to activate all the sub-execution units simultaneously for the parallel structure or the data parallel structure, determine the execution states of all the sub-execution units, and determine that the execution of the parallel structure or the data parallel structure is completed when all the sub-execution units are completed;
and the fourth execution subunit is configured to, for the workflow, activate the first sub-execution unit in the workflow, determine an execution state of the activated sub-execution unit, activate the next sub-execution unit if the sub-execution unit is completed, until all sub-execution units are completed, and determine that the workflow is completed.
Optionally, the workflow scheduling engine comprises four control components of a task publisher, a task executor, a task monitor and a task scheduler, and two message queues of a task queue and a completion queue, wherein,
in the execution process, writing the activated task information into the task queue through the task publisher;
acquiring and executing tasks from the task queue by the task executor when the computing resources allow;
tracking the execution state of the task in real time through the task monitor, and writing information of task completion into a completion queue when the task is completed;
and the task scheduler acquires the completed tasks from the completion queue, analyzes the tasks to be executed next according to the structure of the workflow and activates the tasks.
Compared with the prior art, the invention has the advantages that:
The embodiment of the invention describes the production workflow of a remote sensing information product in a multi-level nested form and provides a corresponding workflow scheduling method for that description, so as to support high-performance production of products with complex hierarchies. For describing the input parameters of the production algorithms in the workflow, a uniform and explicit algorithm parameter description is provided, based on a summary of the input data of various algorithms, so that algorithms can be scheduled automatically. The invention adapts to the computation scale: the scale of algorithm scheduling can be adjusted as computing resources expand, maintaining the utilization rate of the computing resources. It places no restriction on the technologies and tools used with the method, which makes it easy to combine with the latest high-performance computing technology.
Drawings
FIG. 1 is a flowchart illustrating steps of a method for describing and scheduling a production workflow of a remote sensing information product according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of a device for describing and scheduling a production workflow of a remote sensing information product according to an embodiment of the present invention.
Detailed Description
Example one
Referring to fig. 1, a flowchart illustrating steps of a method for describing and scheduling a production workflow of a remote sensing information product according to an embodiment of the present invention is shown, and as shown in fig. 1, the method may specifically include the following steps:
Step 101: for each algorithm in the workflow of the remote sensing information product to be produced, describing a search rule corresponding to the input parameter and the output parameter of the algorithm based on the first extensible markup language file.
In the embodiment of the invention, the first extensible markup language file is a first XML file.
In this example, for each algorithm in the workflow of the remote sensing information product to be produced, the search rule corresponding to its input parameters and output parameters is first described in a first XML file, as detailed in the following specific implementation.
In a specific implementation manner of the present invention, the step 101 may include:
Substep A1: acquiring, for each input parameter of each algorithm, a data type, a parameter serial number, an identifier indicating whether the parameter is an intermediate product, a spatial resolution, a time span, a time alignment strategy, and whether the data are framed;
Substep A2: acquiring, for each output parameter of each algorithm, a data type, a parameter serial number, a spatial resolution, a time resolution, whether the data are framed, and a grid type;
Substep A3: describing the input parameters and the output parameters based on the first extensible markup language file.
In the embodiment of the invention, a remote sensing information product production algorithm takes other remote sensing data or products as input, and the algorithms of the various levels are combined and nested to form a complete workflow. For each algorithm, the invention describes the input parameters and the output parameters in an extensible markup language (XML) file. These parameters cover data entities only, that is, each parameter represents one remote sensing data set or product. Other command-line parameters of the algorithm program do not affect the structural description or the scheduling of the workflow; they can be supplied when the algorithm program is called and are not discussed further here.
The parameters of an algorithm are divided into input parameters and output parameters. For each input parameter, the following must be specified: (1) the data type, namely the name of the data type; (2) the parameter serial number, namely the position of the parameter in the parameter order; (3) whether it is an intermediate product, namely whether the input is a product of a previous level in the workflow; (4) the spatial resolution, in meters; (5) the time resolution, given as a value and a unit, where the unit may be days, hours, minutes or seconds; (6) the required time span, namely how long a period of data is needed as input, also given as a value and a unit (days, hours, minutes or seconds); (7) the time alignment strategy, namely whether input data are searched forward, backward, or in both directions from the time point corresponding to the product; (8) whether framing is required, namely whether the data need to be divided into frames; (9) the grid type, namely the type of grid used for framing.
For each output parameter, the following must be specified: (1) the data type; (2) the parameter serial number; (3) the spatial resolution; (4) the time resolution; (5) whether the data are framed; (6) the grid type.
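As an illustration of how the required time span and the time alignment strategy could jointly determine where input data are searched, the following Python sketch computes a search window around the product time. The function name, the strategy labels (`"backward"` for earlier data, `"forward"` for later data, `"both"`), and the even split of the span in the two-sided case are assumptions; the source does not specify them.

```python
from datetime import datetime, timedelta

# Units allowed for the time span (days, hours, minutes, seconds).
UNIT_SECONDS = {"day": 86400, "hour": 3600, "minute": 60, "second": 1}

def search_window(product_time, span_value, span_unit, strategy):
    """Return (start, end) of the interval in which input data are searched."""
    span = timedelta(seconds=span_value * UNIT_SECONDS[span_unit])
    if strategy == "backward":   # assumed meaning: data earlier than the product time
        return product_time - span, product_time
    if strategy == "forward":    # assumed meaning: data later than the product time
        return product_time, product_time + span
    if strategy == "both":       # assumed: span split evenly around the product time
        return product_time - span / 2, product_time + span / 2
    raise ValueError(f"unknown time alignment strategy: {strategy}")

product_time = datetime(2020, 10, 22, 12, 0, 0)
print(search_window(product_time, 2, "day", "backward"))
print(search_window(product_time, 2, "day", "both"))
```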
After the input parameters and the output parameters are obtained, the input parameters may be described based on a first XML file, and the output parameters may be described based on the first XML file.
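A possible shape for such a first XML file, and a way to read it, is sketched below in Python. The element names, attribute names and sample values (`ndvi_composite`, `utm_tile`, and so on) are hypothetical, since the source lists the fields to be described but not the concrete XML schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical algorithm description; the tag and attribute names are assumptions.
ALGORITHM_XML = """
<algorithm name="ndvi_composite">
  <input type="surface_reflectance" order="1" intermediate="false"
         spatial_resolution_m="30" temporal_resolution="1 day"
         time_span="16 day" time_alignment="backward"
         framed="true" grid_type="utm_tile"/>
  <input type="cloud_mask" order="2" intermediate="true"
         spatial_resolution_m="30" temporal_resolution="1 day"
         time_span="16 day" time_alignment="backward"
         framed="true" grid_type="utm_tile"/>
  <output type="ndvi_16day" order="1" spatial_resolution_m="30"
          temporal_resolution="16 day" framed="true" grid_type="utm_tile"/>
</algorithm>
"""

def convert(node):
    """Turn one <input>/<output> node into a dict with typed values."""
    item = dict(node.attrib)
    item["order"] = int(item["order"])
    item["spatial_resolution_m"] = float(item["spatial_resolution_m"])
    item["framed"] = item["framed"] == "true"
    if "intermediate" in item:           # only inputs carry this flag
        item["intermediate"] = item["intermediate"] == "true"
    return item

def parse_algorithm(xml_text):
    """Read the input and output parameter descriptions of one algorithm."""
    root = ET.fromstring(xml_text)
    inputs = sorted((convert(n) for n in root.findall("input")), key=lambda p: p["order"])
    outputs = sorted((convert(n) for n in root.findall("output")), key=lambda p: p["order"])
    return {"name": root.get("name"), "inputs": inputs, "outputs": outputs}

spec = parse_algorithm(ALGORITHM_XML)
for parameter in spec["inputs"]:
    print(parameter["order"], parameter["type"], parameter["intermediate"])
```

With the parameters sorted by serial number, a scheduler could hand the located data files to the algorithm program in the order it expects.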
Step 102: for each workflow in the remote sensing information product, describing the type, the spatial range and the time range of the product of the workflow based on a second extensible markup language file, and defining the production flow based on logic execution unit nesting.
The second extensible markup language file is a second XML file.
For each workflow in the remote sensing information product, the type, the spatial range and the time range of the product of the workflow, together with the production flow defined by nesting logic execution units, may be described in the second XML file. Specifically, in the following implementation, step 102 may include:
Substep B1: acquiring four logic execution units corresponding to the workflow, wherein the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
Substep B2: nesting the serial structure, the parallel structure and the data parallel structure within one another to generate a nested structure;
Substep B3: describing the atomic task and the nested structure based on the second extensible markup language file.
In the embodiment of the invention, the four logic execution units corresponding to the workflow, namely the atomic task, the serial structure, the parallel structure and the data parallel structure, are obtained first; the serial structure, the parallel structure and the data parallel structure are then nested within one another to generate a nested structure; and the atomic tasks and the nested structure are described in the second XML file. Specifically, one workflow describes the production process of one scene of a remote sensing information product. The time range and the spatial range covered by each scene are definite, and during production the input data are searched within the spatio-temporal range corresponding to the product. Describing a workflow therefore starts with determining the type, the spatial range and the time range of the product. Each workflow is also described by an XML file: the product type is recorded in an attribute of the root node, and the spatio-temporal range covered by the product is written in a time range node and a spatial range node. The time range node contains (1) the start time and (2) the end time, each given as year, month, day, hour, minute and second. The spatial range node defines the range as a rectangular grid cell and contains (1) the longitude of the upper left corner, (2) the latitude of the upper left corner, (3) the longitude of the lower right corner and (4) the latitude of the lower right corner; if the grid cells are numbered according to a regular scheme, the grid number is recorded in an attribute of the node.
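The header part of such a workflow file (product type, time range node and spatial range node) might look as follows. This Python sketch uses hypothetical tag names, a hypothetical timestamp format and a made-up grid number; only the list of fields comes from the description above.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Hypothetical workflow header; tag names and the grid number are assumptions.
WORKFLOW_XML = """
<workflow product="ndvi_16day">
  <time_range>
    <start>2020-06-01 00:00:00</start>
    <end>2020-06-16 23:59:59</end>
  </time_range>
  <space_range grid_id="H27V05">
    <upper_left_lon>110.0</upper_left_lon>
    <upper_left_lat>40.0</upper_left_lat>
    <lower_right_lon>120.0</lower_right_lon>
    <lower_right_lat>30.0</lower_right_lat>
  </space_range>
</workflow>
"""

CORNER_TAGS = ("upper_left_lon", "upper_left_lat", "lower_right_lon", "lower_right_lat")

def parse_scope(xml_text):
    """Extract the product type and the spatio-temporal range of one workflow."""
    root = ET.fromstring(xml_text)
    time_node = root.find("time_range")
    space_node = root.find("space_range")
    fmt = "%Y-%m-%d %H:%M:%S"
    return {
        "product": root.get("product"),                 # attribute of the root node
        "start": datetime.strptime(time_node.findtext("start"), fmt),
        "end": datetime.strptime(time_node.findtext("end"), fmt),
        "bbox": tuple(float(space_node.findtext(tag)) for tag in CORNER_TAGS),
        "grid_id": space_node.get("grid_id"),           # None if the grid is unnumbered
    }

scope = parse_scope(WORKFLOW_XML)
print(scope["product"], scope["start"], scope["end"], scope["bbox"], scope["grid_id"])
```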
Step 103: in the process of producing the remote sensing information product, controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine.
After the above description based on the first XML file and the second XML file, that is, after the workflow has been abstracted as above, the execution process of the workflow may be controlled during production of the remote sensing information product by the workflow scheduling engine in combination with the two XML files, as detailed in the following specific implementation.
In another specific implementation manner of the present invention, the step 103 may include:
Substep C1: executing an activation operation and a check operation for each logic execution unit and for the workflow based on the workflow scheduling engine.
In this embodiment, the activation operation and the check operation may be performed for each logic execution unit and for the workflow by the workflow scheduling engine, as detailed in the following specific implementation.
In another specific implementation manner of the present invention, the sub-step C1 may include:
substep D1: aiming at the atomic task, input data are searched according to the type and the space-time range of the remote sensing information product, the atomic task is issued after the input data are determined, and the execution state of the atomic task is judged;
substep D2: aiming at the serial structure, activating a first sub-execution unit in the serial structure, judging the execution state of the activated sub-execution unit, if the sub-execution unit is completed, activating a next sub-execution unit until all the sub-execution units are completed, and determining that the execution of the serial structure is completed;
substep D3: aiming at the parallel structure or the data parallel structure, simultaneously activating all sub-execution units, judging the execution states of all the sub-execution units, and determining that the parallel structure or the data parallel structure is executed completely when all the sub-execution units are completed;
substep D4: and for the workflow, activating a first sub-execution unit in the workflow, judging the execution state of the activated sub-execution unit, and if the sub-execution unit is completed, activating a next sub-execution unit until all the sub-execution units are completed, and determining that the workflow is completed.
In this embodiment, once the type and the spatio-temporal range of the product are defined, the more central task is to describe the production flow of the product. The production flow of a remote sensing information product has a complex, multi-level nested structure, and its complexity keeps growing as finer and higher-level products are developed. A workflow description method is therefore needed that can describe nested structures of arbitrary complexity while keeping the execution logic concise and uniform. To address this, four logic execution units are defined on the basis of a summary and abstraction of the production flows of various remote sensing information products, and nested complex production flows are built from these four units: (1) the atomic task, (2) the serial structure, (3) the parallel structure and (4) the data parallel structure. The atomic task is the smallest execution unit, namely a production algorithm to be executed, and carries out the actual production work. The serial structure, the parallel structure and the data parallel structure are flow control units whose child logic units follow different execution orders. The logic units under a serial structure are executed one after another in order, and a unit can start only after the previous one has completed. The logic units under a parallel structure can be executed simultaneously. The logic units under a data parallel structure execute the same atomic task on different input data, and these tasks can also run simultaneously. The three types of logic unit other than the atomic task can be nested within one another, and together they describe the logical structure of the workflow.
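The way the four units compose can be shown with a small Python sketch that builds a nested flow from tagged tuples and derives two properties from the execution-order rules: the number of sequential stages (serial children add up, parallel children overlap) and the list of atomic tasks. The constructor names, the tuple representation and the sample task names are hypothetical and chosen only for illustration.

```python
def atomic(name):
    """Smallest execution unit: one production algorithm run."""
    return ("atomic", name)

def serial(*children):
    """Children run one after another, in order."""
    return ("serial", list(children))

def parallel(*children):
    """Children may run at the same time."""
    return ("parallel", list(children))

def data_parallel(task_name, inputs):
    """The same atomic task applied to different input data, run concurrently."""
    return ("data_parallel", [atomic(f"{task_name}[{item}]") for item in inputs])

def stage_count(unit):
    """Minimum number of sequential stages needed to finish the unit."""
    kind, body = unit
    if kind == "atomic":
        return 1
    counts = [stage_count(child) for child in body]
    return sum(counts) if kind == "serial" else max(counts)

def atomic_names(unit):
    """All atomic tasks contained in the unit, in description order."""
    kind, body = unit
    if kind == "atomic":
        return [body]
    return [name for child in body for name in atomic_names(child)]

# A nested flow: per-tile correction, then two product branches, then a mosaic.
flow = serial(
    data_parallel("correct", ["tile_a", "tile_b", "tile_c"]),
    parallel(atomic("ndvi"), serial(atomic("albedo"), atomic("albedo_qc"))),
    atomic("mosaic"),
)

print(stage_count(flow))   # sequential stages along the longest path
print(atomic_names(flow))
```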
The four logic execution units in the invention, together with the workflow itself, all adopt a compact and uniform scheduling interface; specifically, each implements two operations: (1) activation and (2) check. For an atomic task, activation searches for input data according to the type and space-time range of the target product (as described above, the requirement of each product on the input data is defined in an independent XML file) and issues the task once the input data are determined; check judges the execution state of the task. For the serial structure, activation activates only the first sub-execution unit; check first judges the execution state of the currently activated sub-execution unit, activates the next sub-execution unit if that unit is completed, and determines that the serial structure is completed once all sub-execution units are completed. For the parallel structure and the data parallel structure, activation activates all contained sub-execution units simultaneously; check judges the execution states of all sub-execution units and determines that the structure is completed once all of them are completed. For the entire workflow, the activation and check logic is the same as for the serial structure. Note that activation is not nested: a unit activates only the sub-execution units it directly contains.
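The uniform activation/check interface can be sketched in Python as follows. This is one possible reading of the description, not the patented implementation: the class and method names are hypothetical, the search for input data is omitted, a plain list stands in for the task queue, and "no nested activation" is interpreted as each unit calling `activate` only on its direct children, which then apply their own rule.

```python
class Unit:
    """Common scheduling interface: every unit can be activated and checked."""

    def __init__(self):
        self.state = "pending"

    def activate(self):
        raise NotImplementedError

    def check(self):
        """Return True once the unit has completed."""
        raise NotImplementedError


class AtomicTask(Unit):
    def __init__(self, name, issued):
        super().__init__()
        self.name = name
        self.issued = issued  # shared list standing in for the task queue

    def activate(self):
        # A real system would first search for input data; here we only issue.
        self.state = "running"
        self.issued.append(self.name)

    def finish(self):
        # Stand-in for the external executor reporting completion.
        self.state = "completed"

    def check(self):
        return self.state == "completed"


class SerialStructure(Unit):
    def __init__(self, children):
        super().__init__()
        self.children = children
        self.current = 0  # index of the currently activated child

    def activate(self):
        self.state = "running"
        if self.children:
            self.children[0].activate()  # only the first child

    def check(self):
        # Advance past every completed child, activating its successor.
        while self.current < len(self.children) and self.children[self.current].check():
            self.current += 1
            if self.current < len(self.children):
                self.children[self.current].activate()
        if self.current == len(self.children):
            self.state = "completed"
        return self.state == "completed"


class ParallelStructure(Unit):
    def __init__(self, children):
        super().__init__()
        self.children = children

    def activate(self):
        self.state = "running"
        for child in self.children:
            child.activate()  # all direct children at once

    def check(self):
        # Check every child (no short-circuit) so nested structures can advance.
        results = [child.check() for child in self.children]
        if all(results):
            self.state = "completed"
        return self.state == "completed"


class DataParallelStructure(ParallelStructure):
    """Same rule as the parallel structure; children differ only in input data."""


class Workflow(SerialStructure):
    """The workflow follows the same rule as the serial structure."""


issued = []
a, b, c = (AtomicTask(n, issued) for n in "abc")
wf = Workflow([a, ParallelStructure([b, c])])
wf.activate()
after_activation = list(issued)    # only "a" has been issued
a.finish()
first_check = wf.check()           # "a" is done, so the parallel block starts
after_first_check = list(issued)
b.finish()
c.finish()
second_check = wf.check()          # everything is done
```

Because every unit exposes the same two operations, the scheduler never needs to know how deeply a unit is nested; it only calls `check` on the top-level workflow.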
To implement these operations, each execution unit needs to store the following information: (1) the execution state of the unit, (2) the parent unit to which the unit belongs, (3) the previous unit under the same parent unit, (4) the next unit under the same parent unit, (5) the position of the unit in the ordering under the same parent unit, and (6) the workflow to which the unit belongs.
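As an illustration, the six pieces of stored information can be held in a small record per unit. The sketch below is an assumption about one convenient layout; the field names and the helper `link_children` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class UnitRecord:
    unit_id: str
    state: str                # execution state of the unit
    parent_id: Optional[str]  # parent unit the unit belongs to
    prev_id: Optional[str]    # previous unit under the same parent
    next_id: Optional[str]    # next unit under the same parent
    order: int                # position under the same parent
    workflow_id: str          # workflow the unit belongs to


def link_children(parent_id: str, child_ids: List[str],
                  workflow_id: str) -> Dict[str, UnitRecord]:
    """Create records for the children of one parent and link the siblings."""
    records = {}
    for i, unit_id in enumerate(child_ids):
        records[unit_id] = UnitRecord(
            unit_id=unit_id,
            state="pending",
            parent_id=parent_id,
            prev_id=child_ids[i - 1] if i > 0 else None,
            next_id=child_ids[i + 1] if i + 1 < len(child_ids) else None,
            order=i,
            workflow_id=workflow_id,
        )
    return records


records = link_children("serial-1", ["a", "b", "c"], "wf-1")
```

With the sibling links stored on each record, a serial structure can find the unit to activate next from the completed unit alone, without scanning the whole workflow.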
In another specific implementation of the present invention, the whole process of workflow execution is controlled by a workflow scheduling engine. The engine comprises four control components (a task publisher, a task executor, a task monitor and a task scheduler) and two message queues (a task queue and a completion queue). The task publisher writes the information of activated tasks into the task queue; the task executor acquires tasks from the task queue and executes them when computing resources allow; the task monitor tracks the execution state of tasks in real time and writes task completion information into the completion queue when it detects that a task has completed; the task scheduler acquires completed tasks from the completion queue, determines from the structure of the workflow which tasks are to be executed next, and activates them, thereby completing the whole task scheduling cycle. In the invention, standard Universally Unique Identifiers (UUIDs) are used as identifiers, and each message sent to the two message queues is marked with the identifiers of its workflow and task. The engine itself needs to be able to receive and record all currently scheduled workflows. The engine can run in the main thread or in an independent sub-thread of the program, the four control components each run in an independent sub-thread, and the message queues can use any message queue middleware.
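The interplay of the control components and the two queues can be sketched with Python's standard `queue`, `threading` and `uuid` modules. This is a simplified illustration under stated assumptions, not the engine of the invention: in-process queues replace message queue middleware, the workflow is reduced to a list of stages (serial across stages, parallel within a stage), the publisher is a plain function called by the scheduler rather than its own thread, and `run_engine` and `execute` are hypothetical names.

```python
import queue
import threading
import time
import uuid


def run_engine(stages, execute):
    """Run stages one after another; tasks inside a stage may overlap.

    `stages` is a list of lists of task names, `execute` is called per task.
    Returns the task names in the order the scheduler saw them complete.
    """
    stages = [stage for stage in stages if stage]
    if not stages:
        return []
    task_queue = queue.Queue()        # activated tasks awaiting execution
    completion_queue = queue.Queue()  # completion notices for the scheduler
    status, messages = {}, {}         # both keyed by task UUID
    lock = threading.Lock()
    stop = threading.Event()
    workflow_id = str(uuid.uuid4())
    finished = []

    def publish(names):
        # Task publisher: tag each message with workflow and task UUIDs.
        for name in names:
            msg = {"workflow": workflow_id, "task": str(uuid.uuid4()), "name": name}
            with lock:
                status[msg["task"]] = "published"
                messages[msg["task"]] = msg
            task_queue.put(msg)

    def executor():
        # Task executor: take tasks from the task queue and run them.
        while not stop.is_set():
            try:
                msg = task_queue.get(timeout=0.05)
            except queue.Empty:
                continue
            execute(msg["name"])
            with lock:
                status[msg["task"]] = "completed"

    def monitor():
        # Task monitor: poll task states and report each completion once.
        while not stop.is_set():
            with lock:
                for task_id, state in list(status.items()):
                    if state == "completed":
                        status[task_id] = "reported"
                        completion_queue.put(messages[task_id])
            time.sleep(0.005)

    def scheduler():
        # Task scheduler: when a stage is finished, activate the next one.
        index, remaining = 0, len(stages[0])
        publish(stages[0])
        while True:
            finished.append(completion_queue.get()["name"])
            remaining -= 1
            if remaining == 0:
                index += 1
                if index == len(stages):
                    stop.set()
                    return
                remaining = len(stages[index])
                publish(stages[index])

    threads = [threading.Thread(target=f) for f in (executor, monitor, scheduler)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return finished


order = run_engine([["a"], ["b", "c"], ["d"]], execute=lambda name: None)
```

The sketch uses a single executor thread and does not handle exceptions raised by `execute`; a production engine would need both multiple executors and failure handling.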
Example two
Referring to Fig. 2, a schematic structural diagram of a device for describing and scheduling the production workflow of a remote sensing information product according to an embodiment of the present invention is shown. The device may specifically include the following modules:
the first language description module 210 is configured to describe, for each algorithm in a workflow of a remote sensing information product to be produced, a search rule corresponding to an input parameter and an output parameter of the algorithm based on a first extensible markup language file;
the second language description module 220 is used for describing the type, the space range and the time range of a production product of the workflow and a production flow defined based on logic execution unit nesting on the basis of a second extensible markup language file aiming at each workflow in the remote sensing information product;
and the execution process control module 230 is used for controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine in the process of generating the remote sensing information product.
Optionally, the first language description module comprises:
the input parameter acquisition unit is used for acquiring, for each algorithm input parameter, the data type, the parameter serial number, an identifier of whether the parameter is an intermediate product, the spatial resolution, the time span, the time pairing strategy, and whether the data are framed;
the output parameter acquisition unit is used for acquiring the data type, the parameter serial number, the spatial resolution, the time resolution, the data framing and the grid type corresponding to each algorithm output parameter;
and the first language description unit is used for describing the input parameters and the output parameters based on the first extensible markup language file.
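A description of this kind might be produced and read back as sketched below with Python's standard `xml.etree.ElementTree` module. The disclosure does not give the XML schema in this section, so the element names, attribute names and all values (`surface_reflectance`, `nearest`, `MGRS`, and so on) are placeholders chosen for illustration.

```python
import xml.etree.ElementTree as ET


def describe_algorithm(name, inputs, outputs):
    """Build an XML element describing one algorithm's parameters.

    `inputs` and `outputs` are lists of dicts; every key becomes an attribute.
    """
    root = ET.Element("algorithm", {"name": name})
    for tag, params in (("input", inputs), ("output", outputs)):
        for param in params:
            # XML attributes must be strings, so convert every value.
            ET.SubElement(root, tag, {k: str(v) for k, v in param.items()})
    return root


ndvi = describe_algorithm(
    "ndvi",
    inputs=[{
        "index": 1, "data_type": "surface_reflectance", "intermediate": "false",
        "spatial_resolution": "30m", "time_span": "16d",
        "time_pairing": "nearest", "framed": "true",
    }],
    outputs=[{
        "index": 1, "data_type": "ndvi", "spatial_resolution": "30m",
        "temporal_resolution": "16d", "framed": "true", "grid_type": "MGRS",
    }],
)
xml_text = ET.tostring(ndvi, encoding="unicode")

# Reading the file back recovers the search rule for each input parameter.
parsed = ET.fromstring(xml_text)
first_input = parsed.find("input")
```

Keeping each algorithm's parameter description in its own XML file means a new algorithm can be registered without changing the scheduling code.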
Optionally, the second language description module comprises:
a logic unit obtaining unit, configured to obtain four logic execution units corresponding to the workflow, where the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
a nested structure generating unit configured to nest the serial structure, the parallel structure, and the data parallel structure with each other to generate a nested structure;
a second language description unit for describing the atomic task and the nested structure based on the second extensible markup language file.
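The nested production flow in the second file could be read into an in-memory structure as sketched below. The tag names (`workflow`, `serial`, `parallel`, `data_parallel`, `task`), the attributes and the example product are hypothetical, since the schema is not given in this section; the function `to_tree` is likewise an illustrative helper.

```python
import xml.etree.ElementTree as ET

# Hypothetical workflow description: product type, space range, time range,
# and a production flow built by nesting the control structures.
WORKFLOW_XML = """
<workflow product="NDVI" bbox="73,18,135,54" start="2020-01-01" end="2020-12-31">
  <data_parallel>
    <task algorithm="atmospheric_correction"/>
  </data_parallel>
  <parallel>
    <task algorithm="ndvi"/>
    <serial>
      <task algorithm="cloud_mask"/>
      <task algorithm="composite"/>
    </serial>
  </parallel>
</workflow>
"""

CONTROL_TAGS = {"workflow", "serial", "parallel", "data_parallel"}


def to_tree(elem):
    """Convert an XML element into a nested (kind, children) structure."""
    if elem.tag == "task":
        return elem.get("algorithm")  # atomic task: just the algorithm name
    if elem.tag not in CONTROL_TAGS:
        raise ValueError(f"unexpected element: {elem.tag}")
    return (elem.tag, [to_tree(child) for child in elem])


root = ET.fromstring(WORKFLOW_XML)
tree = to_tree(root)
product = root.get("product")
```

Because the control structures are ordinary nested elements, the recursion in `to_tree` handles any nesting depth with the same few lines.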
Optionally, the execution process control module comprises:
an activation check operation execution unit to execute an activation operation and a check operation for each of the logic execution units and the workflows based on the workflow scheduling engine.
Optionally, the activation checking operation performing unit includes:
the first execution subunit is used for searching input data according to the type and the space-time range of the remote sensing information product aiming at the atomic task, issuing the atomic task after determining the input data, and judging the execution state of the atomic task;
a second execution subunit, configured to activate, for the serial structure, a first sub-execution unit in the serial structure, determine an execution state of the activated sub-execution unit, and if the sub-execution unit is completed, activate a next sub-execution unit until all sub-execution units are completed, and determine that the serial structure is completed;
a third execution subunit, configured to activate all the sub-execution units simultaneously for the parallel structure or the data parallel structure, determine the execution states of all the sub-execution units, and determine that the execution of the parallel structure or the data parallel structure is completed when all the sub-execution units are completed;
and the fourth execution subunit is configured to, for the workflow, activate the first sub-execution unit in the workflow, determine an execution state of the activated sub-execution unit, activate the next sub-execution unit if the sub-execution unit is completed, until all sub-execution units are completed, and determine that the workflow is completed.
Optionally, the workflow scheduling engine comprises four control components of a task publisher, a task executor, a task monitor and a task scheduler, and two message queues of a task queue and a completion queue, wherein,
in the execution process, writing the activated task information into the task queue through the task publisher;
acquiring and executing tasks from the task queue by the task executor when the computing resources allow;
tracking the execution state of the task in real time through the task monitor, and writing information of task completion into a completion queue when the task is completed;
and the task scheduler acquires the completed tasks from the completion queue, analyzes the tasks to be executed next according to the structure of the workflow and activates the tasks.
Those skilled in the art will appreciate that those matters not described in detail in the present specification are well known in the art.

Claims (12)

1. A production workflow description and scheduling method of remote sensing information products is characterized by comprising the following steps:
describing a search rule corresponding to an input parameter and an output parameter of each algorithm in a workflow of a remote sensing information product to be produced based on a first extensible markup language file;
for each workflow in the remote sensing information product, describing the type, the space range and the time range of a production product of the workflow based on a second extensible markup language file and defining a production flow based on logic execution unit nesting;
and in the process of generating the remote sensing information product, controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine.
2. The method according to claim 1, wherein the describing the search rule corresponding to the input parameter and the output parameter of the algorithm based on the first extensible markup language file comprises:
acquiring, for each algorithm input parameter, the data type, the parameter serial number, an identifier of whether the parameter is an intermediate product, the spatial resolution, the time span, the time pairing strategy, and whether the data are framed;
acquiring data types, parameter serial numbers, spatial resolutions, time resolutions, data framing and grid types corresponding to output parameters of each algorithm;
and describing the input parameters and the output parameters based on the first extensible markup language file.
3. The method of claim 1, wherein the describing, based on the second extensible markup language file, the type, the space range and the time range of a production product of the workflow and the production flow defined based on logic execution unit nesting comprises:
acquiring four logic execution units corresponding to the workflow, wherein the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
nesting the serial structure, the parallel structure and the data parallel structure mutually to generate a nested structure;
describing the atomic task and the nested structure based on the second extensible markup language file.
4. The method of claim 3, wherein the controlling, based on a workflow scheduling engine, the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file comprises:
and executing activation operation and check operation for each logic execution unit and the workflow based on the workflow scheduling engine.
5. The method of claim 4, wherein the executing an activation operation and a check operation for each of the logic execution units and the workflow based on the workflow scheduling engine comprises:
aiming at the atomic task, input data are searched according to the type and the space-time range of the remote sensing information product, the atomic task is issued after the input data are determined, and the execution state of the atomic task is judged;
aiming at the serial structure, activating a first sub-execution unit in the serial structure, judging the execution state of the activated sub-execution unit, if the sub-execution unit is completed, activating a next sub-execution unit until all the sub-execution units are completed, and determining that the execution of the serial structure is completed;
aiming at the parallel structure or the data parallel structure, simultaneously activating all sub-execution units, judging the execution states of all the sub-execution units, and determining that the parallel structure or the data parallel structure is executed completely when all the sub-execution units are completed;
and for the workflow, activating a first sub-execution unit in the workflow, judging the execution state of the activated sub-execution unit, and if the sub-execution unit is completed, activating a next sub-execution unit until all the sub-execution units are completed, and determining that the workflow is completed.
6. The method of claim 1, wherein the workflow scheduling engine comprises four control components, a task publisher, a task executor, a task monitor, and a task scheduler, and two message queues, a task queue and a completion queue, wherein,
in the execution process, writing the activated task information into the task queue through the task publisher;
acquiring and executing tasks from the task queue by the task executor when the computing resources allow;
tracking the execution state of the task in real time through the task monitor, and writing information of task completion into a completion queue when the task is completed;
and the task scheduler acquires the completed tasks from the completion queue, analyzes the tasks to be executed next according to the structure of the workflow and activates the tasks.
7. A production workflow description and scheduling device of remote sensing information products is characterized by comprising:
the first language description module is used for describing, for each algorithm in a workflow of a remote sensing information product to be produced, a search rule corresponding to an input parameter and an output parameter of the algorithm based on a first extensible markup language file;
the second language description module is used for describing the type, the space range and the time range of a production product of the workflow and a production flow defined based on logic execution unit nesting aiming at each workflow in the remote sensing information product based on a second extensible markup language file;
and the execution process control module is used for controlling the execution process of the workflow by combining the first extensible markup language file and the second extensible markup language file based on a workflow scheduling engine in the process of generating the remote sensing information product.
8. The apparatus of claim 7, wherein the first language description module comprises:
the input parameter acquisition unit is used for acquiring, for each algorithm input parameter, the data type, the parameter serial number, an identifier of whether the parameter is an intermediate product, the spatial resolution, the time span, the time pairing strategy, and whether the data are framed;
the output parameter acquisition unit is used for acquiring the data type, the parameter serial number, the spatial resolution, the time resolution, the data framing and the grid type corresponding to each algorithm output parameter;
and the first language description unit is used for describing the input parameters and the output parameters based on the first extensible markup language file.
9. The apparatus of claim 7, wherein the second language description module comprises:
a logic unit obtaining unit, configured to obtain four logic execution units corresponding to the workflow, where the four logic execution units include: atomic tasks, serial structures, parallel structures, and data parallel structures;
a nested structure generating unit configured to nest the serial structure, the parallel structure, and the data parallel structure with each other to generate a nested structure;
a second language description unit for describing the atomic task and the nested structure based on the second extensible markup language file.
10. The apparatus of claim 9, wherein the execution process control module comprises:
an activation check operation execution unit to execute an activation operation and a check operation for each of the logic execution units and the workflows based on the workflow scheduling engine.
11. The apparatus of claim 10, wherein the activation checking operation performing unit comprises:
the first execution subunit is used for searching input data according to the type and the space-time range of the remote sensing information product aiming at the atomic task, issuing the atomic task after determining the input data, and judging the execution state of the atomic task;
a second execution subunit, configured to activate, for the serial structure, a first sub-execution unit in the serial structure, determine an execution state of the activated sub-execution unit, and if the sub-execution unit is completed, activate a next sub-execution unit until all sub-execution units are completed, and determine that the serial structure is completed;
a third execution subunit, configured to activate all the sub-execution units simultaneously for the parallel structure or the data parallel structure, determine the execution states of all the sub-execution units, and determine that the execution of the parallel structure or the data parallel structure is completed when all the sub-execution units are completed;
and the fourth execution subunit is configured to, for the workflow, activate the first sub-execution unit in the workflow, determine an execution state of the activated sub-execution unit, activate the next sub-execution unit if the sub-execution unit is completed, until all sub-execution units are completed, and determine that the workflow is completed.
12. The apparatus of claim 7, wherein the workflow scheduling engine comprises four control components, a task publisher, a task executor, a task monitor, and a task scheduler, and two message queues, a task queue and a completion queue, wherein,
in the execution process, writing the activated task information into the task queue through the task publisher;
acquiring and executing tasks from the task queue by the task executor when the computing resources allow;
tracking the execution state of the task in real time through the task monitor, and writing information of task completion into a completion queue when the task is completed;
and the task scheduler acquires the completed tasks from the completion queue, analyzes the tasks to be executed next according to the structure of the workflow and activates the tasks.
CN202011140093.XA 2020-10-22 2020-10-22 Method and device for describing and scheduling production workflow of remote sensing information product Active CN112231086B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011140093.XA CN112231086B (en) 2020-10-22 2020-10-22 Method and device for describing and scheduling production workflow of remote sensing information product

Publications (2)

Publication Number Publication Date
CN112231086A (en) 2021-01-15
CN112231086B (en) 2024-04-26

Family

ID=74109224

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011140093.XA Active CN112231086B (en) 2020-10-22 2020-10-22 Method and device for describing and scheduling production workflow of remote sensing information product

Country Status (1)

Country Link
CN (1) CN112231086B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102662725A (en) * 2012-03-15 2012-09-12 中国科学院软件研究所 Event-driven high concurrent process virtual machine realization method
CN106775632A (en) * 2016-11-21 2017-05-31 中国科学院遥感与数字地球研究所 A kind of operation flow can flexible expansion high-performance geographic information processing method and system
WO2018057799A1 (en) * 2016-09-21 2018-03-29 iUNU, LLC Horticultural care tracking, validation and verification
CN108985709A (en) * 2018-06-26 2018-12-11 中国科学院遥感与数字地球研究所 Workflow management method towards more satellite data centers collaboration Remote Sensing Products production
WO2020040763A1 (en) * 2018-08-23 2020-02-27 Siemens Aktiengesellschaft Real-time production scheduling with deep reinforcement learning and monte carlo tree search

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
HONGYI LI et al.: "A Web-Based Remote Sensing Data Processing and Production System With the Unified Integration of Multi-Disciplinary Data and Models", 《IEEE ACCESS》, pages 162961 - 162972 *
冯阳;张国强;: "基于工作流技术的遥感卫星数据接收调度***的设计与实现", 无线电工程, no. 11, pages 96 - 100 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757124A (en) * 2022-04-21 2022-07-15 哈尔滨工程大学 CFD workflow modeling method and device based on XML, computer and storage medium
CN114757124B (en) * 2022-04-21 2024-02-27 哈尔滨工程大学 CFD workflow modeling method and device based on XML, computer and storage medium

Also Published As

Publication number Publication date
CN112231086B (en) 2024-04-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant