CN117271099B

CN117271099B - Automatic space data analysis scheduling system and method based on rule base

Info

Publication number: CN117271099B
Application number: CN202311552215.XA
Authority: CN
Inventors: 李想
Original assignee: Shandong Normal University
Current assignee: Shandong Normal University
Priority date: 2023-11-21
Filing date: 2023-11-21
Publication date: 2024-01-26
Anticipated expiration: 2043-11-21
Also published as: CN117271099A

Abstract

The system comprises a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimizing module, an error processing module and a user interface module. The spatial data input module uploads spatial data; the data preprocessing module preprocesses the spatial data; the rule definition and matching module defines the rules of the rule base and matches the spatial data with the rules; the data processing automatic scheduling and optimizing module schedules spatial data processing tasks and optimizes the scheduling process; the error processing module handles scheduling anomalies; and the user interface module provides a user interface. The invention proposes a distributed optimal matching algorithm based on an improved bidirectional multiplier method for matching spatial data with rules, and an automatic scheduling algorithm based on an improved deep Q network for automatically scheduling spatial data processing tasks.

Description

Automatic space data analysis scheduling system and method based on rule base

Technical Field

The invention relates to the field of spatial data processing, optimal matching and task automatic scheduling, in particular to a spatial data analysis automatic scheduling system and method based on a rule base.

Background

Spatial data processing covers the acquisition, preprocessing, analysis, storage and visualization of spatial data. It handles different kinds of spatial data, such as Geographic Information System (GIS) data, remote sensing data and geolocation data, in order to perform various spatial data analysis tasks. It acquires many types of spatial data, including satellite images, sensor data, map data and GPS track data; derives useful information from them using geographic information analysis and spatial data analysis methods such as geospatial pattern recognition, spatial buffer analysis and map algebra operations; and applies predefined rule bases and models to the spatial data so that specific analysis tasks are performed automatically and users can carry out spatial data analysis more easily.

Optimal matching is a method for determining the best-matching rule. Its purpose is to select an appropriate rule from a rule base, taking into account the performance and constraints of the different rules, so that the spatial data analysis task is performed with the best result. A set of evaluation criteria, including the accuracy, precision, computational efficiency and memory usage of a rule, is typically used to evaluate the suitability of each rule, and the rules or models are ranked according to these criteria to determine which is most suitable for the current analysis task. The choice of optimal matching technique depends on the specific requirements of the problem, the nature of the rule base and the characteristics of the spatial data analysis task. In every case the purpose is to ensure that the rule most suitable for the current task is selected and executed to achieve the best analysis result.

Automatic task scheduling is a technology that identifies, plans and arranges the execution of different spatial data analysis tasks without manual intervention by the user. Based on a set of predefined rules and conditions, it generally performs the following steps: tasks are identified from condition triggers in the rule base; the execution order and timing of the analysis tasks are planned, which involves task dependency analysis and resource allocation; computing resources, storage resources and data access rights are allocated according to task requirements; and once the tasks have been identified, planned and allocated, they are executed automatically. The aim of automatic scheduling is to reduce the user's operating burden and to improve the efficiency and degree of automation of spatial data analysis. When needed, the user sets the rule base and task parameters, and the tasks are then executed automatically by the scheduling method. Automatic task scheduling is particularly useful for large-scale, complex spatial data analysis tasks, where it improves working efficiency and accuracy.

Disclosure of Invention

Aiming at the problems, the invention aims to provide an automatic scheduling method for spatial data analysis based on a rule base.

The aim of the invention is realized by the following technical scheme:

The rule-base-based automatic scheduling method for spatial data analysis comprises a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimizing module, an error processing module and a user interface module. The spatial data input module is used for uploading spatial data, and the data preprocessing module is used for preprocessing the spatial data. The rule definition and matching module comprises a rule base definition unit and a rule matching unit: the rule base definition unit defines the rules of the rule base, and the rule matching unit proposes a distributed optimal matching algorithm based on an improved bidirectional multiplier method to optimally match the spatial data with the rules of the rule base. The data processing automatic scheduling and optimizing module comprises an automatic scheduling unit and a scheduling optimization unit: the automatic scheduling unit proposes an automatic scheduling algorithm based on an improved deep Q network to automatically schedule spatial data processing tasks, and the scheduling optimization unit optimizes the automatic scheduling process. The error processing module handles anomalies in the scheduling process, and the user interface module provides a user interface.

Furthermore, the space data input module acquires various types of space data by using satellite remote sensing, sensors, databases and internet data sources, and uploads the space data to the rule base.

Furthermore, the data preprocessing module performs data calibration and data dimension reduction simultaneously by clearing and repairing error, missing and inconsistent information in the space data, so as to preprocess the space data.

Further, the rule base definition unit is used for defining and managing rules, guiding the system how to effectively process and analyze large-scale space data through conditions, operations and data processing steps, and classifying the rules according to different tasks, analysis types and data types so as to process the space data according to the application of the appropriate rules.

Furthermore, the rule matching unit provides a distributed optimal matching algorithm for improving the bidirectional multiplier method to optimally match the space data with rules of the rule base.

Further, the distributed optimal matching algorithm based on the improved bidirectional multiplier method is as follows. First, the neighborhood radius is determined adaptively by calculating the distances between spatial data samples; the density of each class of spatial data samples is obtained from the neighborhood radius; cluster centers are added; the current clustering effect is judged with a fuzzy clustering validity index; the optimal number of clusters and the optimal cluster centers are then selected; and finally the clustering result is optimized by minimizing a clustering objective function. The clustering result is the turning-point data set KP, whose elements are the densities of the 1st, 2nd, ..., i-th, ..., n-th turning points. To keep the algorithm adaptive, the neighborhood radius M is determined adaptively by an equation in the densities of two turning points and the Euclidean distance between them. When KP is clustered into k clusters, the cluster centers are obtained from a selected group of cluster centers and, for each turning point, the sum of its distances to the cluster centers in that group. The clustering effect is measured with a fuzzy clustering validity index, which is defined in terms of the membership of each data sample in each class, the centers of the m-th and h-th clusters, the distances between these quantities, and the greatest common divisor of two further quantities. The optimal turning region is then searched for, starting from the position of the starting point.

Then the optimal matching rule is solved. The matching objective function R between the spatial data and the rules is defined by a successive cross product operation over the rule set, where A is the set of spatial data, a is an item of spatial data in A, B is the set of rules and b is a rule in B; it uses the utility function of spatial data a and the utility function of rule b, and its solution is the optimal matching rule. The bidirectional multiplier method has an objective function BM, formed from the utility function of spatial data a and the utility function of rule b, together with an accompanying relation in which X and Y are constant matrices and c is a constant vector. To make R easier to solve, the matching objective function R is improved by the bidirectional multiplier method into an improved matching objective function. The improved function is solved by the Lagrangian multiplier method, where L is the Lagrangian function and a Lagrangian multiplier is introduced. To make the solution more accurate, a dual condition is added, which further improves the matching objective function. So that the optimal matching rule is found as fast as possible and the spatial data can be optimally matched with the rules of the rule base, the iterative process of the algorithm is also improved. The improved iteration is expressed in terms of the iteration number, the successive cross product operation at a given iteration, the matching rule that attains the minimum, a parameter that controls the convergence speed, and the matching rules at successive iterations. Solving this iteration yields the optimal matching rule.

Finally, the spatial data and the rules are optimally matched according to the optimal matching rule obtained by the solution above. First, a feature turning region data set is established, and the distance between any two feature turning regions is given. The speed and course of a feature turning region are defined using the number of turning points of a particular type in that region. The difference between one turning region and the next is then calculated and expressed as the total distance between the turning regions, which includes the turning difference between the two regions. Finally, the optimal matching rule is converted into a total distance, defined with a weight on the route distance and the distance of the turning region. In summary, the distributed optimal matching algorithm based on the improved bidirectional multiplier method first classifies the data; during classification it obtains the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers based on the sample density; it then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately; and finally it converts the optimal matching rule into a total distance over the spatial data turning regions, achieving optimal matching of the spatial data with the rules of the rule base.

Furthermore, the automatic scheduling unit provides an automatic scheduling algorithm for improving the deep Q network to automatically schedule the space data processing task.

Further, the automatic scheduling algorithm based on the improved deep Q network is as follows. The reward function is defined for a scheduling policy in terms of a normalization factor, the maximum completion time of scheduling the spatial data, and the lower bound of that maximum completion time. The cumulative reward AR combines the rewards at times t, t+1, t+2, ..., t+N using a discount factor, where n is an integer time index within this range. The Q value in the deep Q network is updated from the Q value of taking an action in state s, a learning rate that controls the step size of the Q-value update in each iteration, the reward obtained after taking the action in state s, the discount factor, the new state reached after taking the action, and the optimal scheduling action of the scheduling policy in the new state. To solve the overestimation problem of the deep Q network, the corresponding term of the update is modified into two independent Q functions: the first independent Q function is used to evaluate the best action for automatic scheduling, and the second independent Q function is used to update the Q value. Using the two independent Q-value functions alternately solves the overestimation problem of the deep Q network. To give the automatic scheduling process better adaptability, the learning rate is improved by learning-rate decay to improve algorithm performance, where the improved learning rate depends on a decay factor m and the number of iterative update steps. A further factor is added to control the magnitude of the learning-rate decay, giving the final improved learning rate. In summary, the automatic scheduling algorithm based on the improved deep Q network first decomposes the original deep Q-value function into two independent deep Q-value functions to solve the overestimation problem of the deep Q network, and then introduces learning-rate decay and a decay-magnitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive, converges better, and achieves automatic scheduling of spatial data processing tasks.
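For orientation, the quantities listed for the cumulative reward, the Q-value update and the two independent Q functions match the textbook forms of the discounted return, the Q-learning update and the double-Q target. The block below is a sketch under that assumption only: the symbols $r_t$, $\gamma$, $\alpha$, $Q_1$, $Q_2$ and $a^{*}$ are chosen here for illustration, and the patent's own formulas may differ in detail. The reward function and the learning-rate decay formula are not sketched, because their exact form cannot be inferred from the definitions.

```latex
\begin{align}
AR &= r_t + \gamma\, r_{t+1} + \gamma^{2} r_{t+2} + \dots + \gamma^{N} r_{t+N}
    = \sum_{n=0}^{N} \gamma^{n} r_{t+n}, \\
Q(s,a) &\leftarrow Q(s,a) + \alpha \bigl[\, r + \gamma \max_{a'} Q(s',a') - Q(s,a) \bigr], \\
a^{*} &= \arg\max_{a'} Q_1(s',a'), \qquad
Q_1(s,a) \leftarrow Q_1(s,a) + \alpha \bigl[\, r + \gamma\, Q_2(s',a^{*}) - Q_1(s,a) \bigr].
\end{align}
```

In the last line the action is selected with one Q function and its value is taken from the other, which is what removes the upward bias of the single $\max$ in the second line.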

Furthermore, the scheduling optimization unit adjusts the execution sequence of the tasks by dynamically scheduling the real-time space data, establishes a monitoring system to track the performance and resource utilization condition of task execution and the change of task execution time, and realizes the optimization of the automatic scheduling process.

Further, the error processing module monitors the whole data analysis and scheduling process and the input data to detect potential errors and abnormal conditions, and once errors occur, the error processing module records the types, time and related information of the errors and performs fault elimination and problem analysis so as to process and manage the errors and abnormal conditions occurring in the space data analysis and automatic scheduling process.

Furthermore, the user interface module is used for providing a visual and interactive interface for the user, so that the user can easily apply the automatic scheduling method for space data analysis to manage the space data, simultaneously, the user is allowed to input the space data to be processed and rule base rules, the user can set specific requirements of analysis tasks through the interface, the user can view the results of the analysis tasks through the user interface and present the results in the form of graphs and charts, and the visualization of the results is helpful for the user to better understand the analysis results of the space data.

The beneficial effects of the invention are as follows. The invention proposes a distributed optimal matching algorithm based on an improved bidirectional multiplier method for matching spatial data with rule base rules. This algorithm first classifies the data; during classification it obtains the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers based on the sample density; it then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately; and finally it converts the optimal matching rule into a total distance over the spatial data turning regions, achieving optimal matching of the spatial data with the rules of the rule base. The invention also proposes an automatic scheduling algorithm based on an improved deep Q network for automatically scheduling spatial data processing tasks. This algorithm first decomposes the original deep Q-value function into two independent deep Q-value functions to solve the overestimation problem of the deep Q network, and then introduces learning-rate decay and a decay-magnitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive and converges better. Together these algorithms improve the working effect of the rule-base-based automatic scheduling method for spatial data analysis and provide it with more comprehensive and accurate technical support and better decision support. The invention combines an optimal matching algorithm with a reinforcement learning algorithm to provide a convenient and efficient rule-base-based automatic scheduling method for spatial data analysis. It can also serve as a basis for development in other application fields, supports the fusion of spatial data processing, optimal matching and automatic task scheduling, can be applied in multiple industries and fields, and is of application value to the technical field of spatial data processing.

Drawings

The invention will be further described with reference to the accompanying drawings. The embodiments shown in the drawings do not limit the invention in any way, and a person of ordinary skill in the art can obtain other drawings from the following drawings without creative effort.

Fig. 1 is a schematic diagram of the structure of the present invention.

Detailed Description

The invention will be further described with reference to the following examples.

Preferably, the spatial data input module obtains various types of spatial data, such as Geographic Information System (GIS) data, meteorological data, topographic data and population data, by using satellite remote sensing, sensors, a database and an internet data source, and uploads the spatial data to a rule base.

Preferably, the data preprocessing module performs data calibration and data dimension reduction simultaneously by clearing and repairing error, missing and inconsistent information in the spatial data, so that the computational complexity is reduced and the analysis efficiency is improved, and the spatial data is preprocessed.
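The preprocessing step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: records are plain dictionaries, the field names `lat` and `lon` are hypothetical, missing values are filled with the field mean, and removing near-constant fields stands in for dimension reduction. The patent does not specify any of these choices.

```python
from statistics import mean, pvariance


def clean_records(records, lat_key="lat", lon_key="lon"):
    """Drop records with missing or impossible coordinates, then fill the
    remaining missing values with the mean of their field (assumed policy)."""
    valid = [
        dict(r) for r in records  # copy so the caller's data is not mutated
        if r.get(lat_key) is not None and r.get(lon_key) is not None
        and -90 <= r[lat_key] <= 90 and -180 <= r[lon_key] <= 180
    ]
    fields = sorted({k for r in valid for k in r})
    for field in fields:
        present = [r[field] for r in valid if r.get(field) is not None]
        fill = mean(present) if present else 0.0
        for r in valid:
            if r.get(field) is None:
                r[field] = fill
    return valid


def drop_constant_fields(records, threshold=1e-12):
    """Crude dimension reduction: remove fields whose variance is ~0."""
    if not records:
        return []
    keep = [k for k in records[0]
            if pvariance([r[k] for r in records]) > threshold]
    return [{k: r[k] for k in keep} for r in records]
```

Calling `drop_constant_fields(clean_records(raw))` performs both steps in sequence. A production system would more likely use a projection method such as PCA for the reduction step.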

Preferably, the rule base definition unit is used for defining and managing rules, guiding the system how to effectively process and analyze large-scale space data through conditions, operations and data processing step rules, classifying the rules according to different tasks, analysis types and data types, so as to process the space data according to the application of the appropriate rules, improve the efficiency of space data analysis, and ensure consistency and repeatability.
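One way to picture a rule made of a condition, an operation, and task and data-type categories is the following Python sketch. The names `Rule`, `RuleBase`, `find` and `apply` are hypothetical and chosen for illustration; the patent does not prescribe a data structure.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]


@dataclass
class Rule:
    name: str
    task: str                              # analysis task, e.g. "buffer_analysis"
    data_type: str                         # data category, e.g. "vector" or "raster"
    condition: Callable[[Record], bool]    # when the rule applies
    operation: Callable[[Record], Record]  # processing step to perform


class RuleBase:
    def __init__(self) -> None:
        self._rules: List[Rule] = []

    def add(self, rule: Rule) -> None:
        self._rules.append(rule)

    def find(self, task: str, data_type: str) -> List[Rule]:
        """Return the rules classified under the given task and data type."""
        return [r for r in self._rules
                if r.task == task and r.data_type == data_type]

    def apply(self, record: Record, task: str) -> Record:
        """Run every applicable rule whose condition holds, in insertion order."""
        out = dict(record)
        for rule in self.find(task, record["data_type"]):
            if rule.condition(out):
                out = rule.operation(out)
        return out
```

Classifying rules by task and data type keeps lookup simple and makes repeated runs over the same data reproducible, since the rules are always applied in the same order.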

Preferably, the rule matching unit proposes a distributed optimal matching algorithm for improving the bidirectional multiplier method to optimally match the space data with rules of the rule base.

Specifically, the distributed optimal matching algorithm based on the improved bidirectional multiplier method is as follows. First, the neighborhood radius is determined adaptively by calculating the distances between spatial data samples; the density of each class of spatial data samples is obtained from the neighborhood radius; cluster centers are added; the current clustering effect is judged with a fuzzy clustering validity index; the optimal number of clusters and the optimal cluster centers are then selected; and finally the clustering result is optimized by minimizing a clustering objective function. The clustering result is the turning-point data set KP, whose elements are the densities of the 1st, 2nd, ..., i-th, ..., n-th turning points. The density of a turning point is the number of adjacent turning points within its neighborhood radius, so the turning-point density depends on the neighborhood radius. To keep the algorithm adaptive, the neighborhood radius M is determined adaptively by an equation in the densities of two turning points and the Euclidean distance between them. When KP is clustered into k clusters, the cluster centers are obtained from a selected group of cluster centers and, for each turning point, the sum of its distances to the cluster centers in that group. The clustering effect is measured with a fuzzy clustering validity index, which is defined in terms of the membership of each data sample in each class, the centers of the m-th and h-th clusters, the distances between these quantities, and the greatest common divisor of two further quantities. The optimal turning region is then searched for, starting from the position of the starting point.

Then the optimal matching rule is solved. The matching objective function R between the spatial data and the rules is defined by a successive cross product operation over the rule set, where A is the set of spatial data, a is an item of spatial data in A, B is the set of rules and b is a rule in B; it uses the utility function of spatial data a and the utility function of rule b, and its solution is the optimal matching rule. The bidirectional multiplier method has an objective function BM, formed from the utility function of spatial data a and the utility function of rule b, together with an accompanying relation in which X and Y are constant matrices and c is a constant vector. To make R easier to solve, the matching objective function R is improved by the bidirectional multiplier method into an improved matching objective function. The improved function is solved by the Lagrangian multiplier method, where L is the Lagrangian function and a Lagrangian multiplier is introduced. To make the solution more accurate, a dual condition is added, which further improves the matching objective function. So that the optimal matching rule is found as fast as possible and the spatial data can be optimally matched with the rules of the rule base, the iterative process of the algorithm is also improved. The improved iteration is expressed in terms of the iteration number, the successive cross product operation at a given iteration, the matching rule that attains the minimum, a parameter that controls the convergence speed, and the matching rules at successive iterations. Solving this iteration yields the optimal matching rule.

Finally, the spatial data and the rules are optimally matched according to the optimal matching rule obtained by the solution above. First, a feature turning region data set is established, and the distance between any two feature turning regions is given. The speed and course of a feature turning region are defined using the number of turning points of a particular type in that region. The difference between one turning region and the next is then calculated and expressed as the total distance between the turning regions, which includes the turning difference between the two regions. Finally, the optimal matching rule is converted into a total distance, defined with a weight on the route distance and the distance of the turning region. In summary, the distributed optimal matching algorithm based on the improved bidirectional multiplier method first classifies the data; during classification it obtains the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers based on the sample density; it then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately; and finally it converts the optimal matching rule into a total distance over the spatial data turning regions, achieving optimal matching of the spatial data with the rules of the rule base.
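The density-driven classification stage can be illustrated with a small Python sketch. It is not the patent's formula: the radius rule (half the mean pairwise Euclidean distance) and the greedy center selection are assumptions made here so that the example is concrete, and the function names are hypothetical. The fuzzy validity index and the multiplier-based solution of the matching rule are not reproduced.

```python
from itertools import combinations
from math import dist


def neighborhood_radius(points):
    """Assumed adaptive rule: half the mean pairwise Euclidean distance.
    Requires at least two points."""
    pairs = list(combinations(points, 2))
    return 0.5 * sum(dist(p, q) for p, q in pairs) / len(pairs)


def densities(points, radius):
    """Density of a point = number of other points within `radius` of it."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def pick_centers(points, radius, k):
    """Add cluster centers one at a time: densest points first, each new
    center farther than `radius` from every center already chosen."""
    dens = densities(points, radius)
    order = sorted(range(len(points)), key=lambda i: -dens[i])
    centers = []
    for i in order:
        if all(dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers
```

In the method described above, the number of clusters `k` is not fixed in advance: it is chosen by comparing the validity index across candidate clusterings, a step this sketch leaves out.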

Preferably, the automatic scheduling unit proposes an automatic scheduling algorithm that improves the deep Q network to automatically schedule spatial data processing tasks.

Specifically, the automatic scheduling algorithm based on the improved deep Q network is as follows. The reward function is defined for a scheduling policy in terms of a normalization factor, the maximum completion time of scheduling the spatial data, and the lower bound of that maximum completion time; defined in this way, the reward function is better suited to the spatial data scheduling problem. The cumulative reward AR combines the rewards at times t, t+1, t+2, ..., t+N using a discount factor, where n is an integer time index within this range. The Q value in the deep Q network is updated from the Q value of taking an action in state s, a learning rate that controls the step size of the Q-value update in each iteration, the reward obtained after taking the action in state s, the discount factor, the new state reached after taking the action, and the optimal scheduling action of the scheduling policy in the new state. The Q value is updated through continuous iteration so as to better estimate the long-term cumulative reward of each action in each state. To solve the overestimation problem of the deep Q network, the corresponding term of the update is modified into two independent Q functions: the first independent Q function is used to evaluate the best action for automatic scheduling, and the second independent Q function is used to update the Q value. Using the two independent Q-value functions alternately solves the overestimation problem of the deep Q network. To give the automatic scheduling process better adaptability, the learning rate is improved by learning-rate decay to improve algorithm performance, where the improved learning rate depends on a decay factor m and the number of iterative update steps. A further factor is added to control the magnitude of the learning-rate decay, giving the final improved learning rate. In summary, the automatic scheduling algorithm based on the improved deep Q network first decomposes the original deep Q-value function into two independent deep Q-value functions to solve the overestimation problem of the deep Q network, and then introduces learning-rate decay and a decay-magnitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive, converges better, and achieves automatic scheduling of spatial data processing tasks.
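The two-Q-function idea can be shown with a tabular double Q-learning step in Python. This is a simplified stand-in, not the patented network: the tables replace neural networks, and the decay schedule `alpha0 / (1 + m * step) ** beta` is an assumed form in which `m` plays the role of the decay factor and the hypothetical `beta` plays the role of the decay-magnitude control factor.

```python
from collections import defaultdict


def decayed_learning_rate(alpha0, m, step, beta=1.0):
    """Assumed schedule: alpha0 / (1 + m * step) ** beta.
    `m` is the decay factor; `beta` (hypothetical) controls decay magnitude."""
    return alpha0 / (1.0 + m * step) ** beta


def double_q_update(q_sel, q_eval, s, a, r, s_next, actions, alpha, gamma):
    """One double Q-learning step on tabular Q functions.

    `q_sel` selects the best next action and is the table that gets updated;
    `q_eval` supplies the value of that action. The caller should swap the
    two tables at random between steps so that both are trained.
    """
    best = max(actions, key=lambda b: q_sel[(s_next, b)])  # select
    target = r + gamma * q_eval[(s_next, best)]            # evaluate
    q_sel[(s, a)] += alpha * (target - q_sel[(s, a)])
    return q_sel[(s, a)]


# Tables default to 0.0 for unseen (state, action) pairs.
q1 = defaultdict(float)
q2 = defaultdict(float)
```

Because the action is chosen with one table and valued with the other, a single noisy overestimate in either table no longer propagates directly into the update target.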

Preferably, the scheduling optimization unit adjusts the execution sequence of the tasks by dynamically scheduling the real-time space data, establishes a monitoring system to track the performance and resource utilization condition of task execution and the change of task execution time, and realizes the optimization of the automatic scheduling process.
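A minimal Python sketch of dynamic reordering with execution-time tracking follows. The class name `DynamicScheduler` and its methods are hypothetical, a lower number means higher priority by assumption, and real resource monitoring (CPU, memory, storage) is reduced here to wall-clock duration per task.

```python
import heapq
import time


class DynamicScheduler:
    """Priority queue whose task priorities can be changed after submission."""

    def __init__(self):
        self._heap = []
        self._ticket = 0
        self._latest = {}    # task name -> ticket of its newest entry
        self.durations = {}  # task name -> measured execution time in seconds

    def submit(self, name, priority, fn):
        """Add a task, or re-submit an existing name to change its priority."""
        self._ticket += 1
        self._latest[name] = self._ticket
        # The unique ticket breaks ties, so callables are never compared.
        heapq.heappush(self._heap, (priority, self._ticket, name, fn))

    def run_all(self):
        """Run tasks in priority order and return the execution order."""
        order = []
        while self._heap:
            _, ticket, name, fn = heapq.heappop(self._heap)
            if self._latest.get(name) != ticket:
                continue  # stale entry superseded by a later submit
            start = time.perf_counter()
            fn()
            self.durations[name] = time.perf_counter() - start
            order.append(name)
        return order
```

The recorded `durations` are the kind of signal a monitoring component could feed back into later priority decisions.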

Preferably, the error processing module monitors the whole data analysis and scheduling process, as well as the input data, to detect potential errors and abnormal conditions. When an error occurs, the module records its type, time and related information and performs troubleshooting and problem analysis, so that the errors and anomalies arising during spatial data analysis and automatic scheduling are handled and managed. For some known errors and anomalies the module attempts automatic handling: for example, if some data are missing they are filled in according to predefined rules, and if a rule match is problematic the matching rules are adjusted.
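The record-then-repair behaviour can be sketched as follows in Python. The class `ErrorHandler`, its methods and the use of fixed per-field default values as the "predefined rules" are all assumptions made for illustration.

```python
from datetime import datetime, timezone


class ErrorHandler:
    def __init__(self, defaults):
        self.defaults = defaults  # predefined fill rule: field -> default value
        self.log = []

    def record(self, kind, info):
        """Store the error type, the time it occurred and related information."""
        self.log.append({
            "type": kind,
            "time": datetime.now(timezone.utc).isoformat(),
            "info": info,
        })

    def repair(self, record):
        """Fill known-missing fields from the predefined defaults."""
        fixed = dict(record)
        for field, default in self.defaults.items():
            if fixed.get(field) is None:
                self.record("missing_value", {"field": field})
                fixed[field] = default
        return fixed

    def run(self, task_name, fn):
        """Run a task; on failure, log the exception and return None."""
        try:
            return fn()
        except Exception as exc:
            self.record(type(exc).__name__,
                        {"task": task_name, "message": str(exc)})
            return None
```

Keeping every incident in `log` supports the troubleshooting and problem analysis described above, since each entry carries the error type, a timestamp and its context.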

Preferably, the user interface module provides the user with a visual, interactive interface, so that the user can easily apply the automatic scheduling method for spatial data analysis to manage spatial data. The interface allows the user to input the spatial data to be processed and the rule base rules, such as a spatial data set, a rule base and analysis task parameters. Through the interface the user can set the specific requirements of an analysis task, including the required rules, the analysis method and the expected results. The user can also view the results of the analysis task through the interface, presented as graphs and charts; visualizing the results helps the user better understand the analysis results of the spatial data.

The invention provides a rule-base-based automatic scheduling method for spatial data analysis. By combining the spatial data input module, the data preprocessing module, the rule definition and matching module, the data processing automatic scheduling and optimizing module, the error processing module and the user interface module, it provides an automatic scheduling method for spatial data processing and analysis. A distributed optimal matching algorithm of the improved bidirectional multiplier method is proposed for matching the spatial data with the rule base rules. Its innovation is that it first classifies the data, obtaining the density of each data point region during classification by adaptively determining the neighborhood radius and gradually adding cluster centers based on sample density; it then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately; and it finally converts the optimal matching rule into the total distance in the spatial data steering region, realizing optimal matching of the spatial data with the rule base rules. An automatic scheduling algorithm of the improved deep Q network is proposed for automatically scheduling spatial data processing tasks. Its innovation is that it first decomposes the original deep Q-value function into two independent deep Q-value functions to address the overestimation problem of the deep Q network, and then introduces learning rate decay and a decay amplitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive and converges better, realizing automatic scheduling of spatial data processing tasks. The invention thereby improves the working effect of rule-base-based automatic scheduling of spatial data analysis, and provides more comprehensive and accurate technical support and better decision support for safe, scientific and efficient automatic scheduling of spatial data analysis. Because it involves both an optimal matching algorithm and a reinforcement learning algorithm, it offers a convenient and efficient scheduling method, can lay a foundation for development in other application fields and for the integration of spatial data processing, optimal matching and automatic task scheduling, can be applied in multiple industries and fields, and has important application value for the technical field of spatial data processing.
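
The density-based selection of cluster centers summarized above can be sketched in Python as follows. Several parts are assumptions made for illustration and are not taken from the invention: the neighborhood radius is taken as the mean pairwise distance, the density of a point is the number of other points within that radius, and the first center is simply the densest point. Each further center maximizes the product of the point's density and its summed distance to the centers already chosen, in the spirit of the $\arg\max(B_i \times |M_\varepsilon(kp)|)$ rule in claim 1.

```python
import math


def pairwise(points):
    """Matrix of Euclidean distances between all points."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def neighborhood_radius(dist):
    """Assumed rule: mean of all pairwise distances (not the formula of the invention)."""
    n = len(dist)
    if n < 2:
        return 0.0
    total = sum(dist[i][j] for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)


def densities(dist, radius):
    """Number of other points within the neighborhood radius of each point."""
    n = len(dist)
    return [sum(1 for j in range(n) if j != i and dist[i][j] <= radius)
            for i in range(n)]


def select_centers(points, k):
    """Return the indices of k cluster centers, added one at a time."""
    dist = pairwise(points)
    dens = densities(dist, neighborhood_radius(dist))
    centers = [max(range(len(points)), key=lambda i: dens[i])]
    while len(centers) < k:
        candidates = [i for i in range(len(points)) if i not in centers]
        # Score: (sum of distances to chosen centers) x (local density).
        centers.append(max(
            candidates,
            key=lambda i: sum(dist[i][c] for c in centers) * dens[i],
        ))
    return centers
```

Multiplying by the summed distance pushes each new center away from those already selected, while the density term keeps it inside a populated region.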

Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications can be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims

1. The space data analysis automatic dispatching system based on the rule base is characterized by comprising a space data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic dispatching and optimizing module, an error processing module and a user interface module, wherein the space data input module is used for uploading space data, the data preprocessing module is used for preprocessing the space data, the rule definition and matching module comprises a rule base definition unit and a rule matching unit, the rule base definition unit is used for defining rule base rules, the rule matching unit adopts a distributed optimal matching algorithm for improving a bidirectional multiplier method to optimally match the space data with the rule base rules, the data processing automatic dispatching and optimizing module comprises an automatic dispatching unit and a dispatching optimizing unit, the automatic dispatching unit is used for proposing an automatic dispatching algorithm for improving a depth Q network to automatically dispatch space data processing task, the dispatching optimizing unit is used for optimizing an automatic dispatching process, and the error processing module is used for processing anomalies in a dispatching process, and the user interface module is used for providing a user interface;

the distributed optimal matching algorithm of the improved bidirectional multiplier method is specifically as follows: first, the neighborhood radius is determined adaptively by calculating the distances between spatial data samples, the density of each class of spatial data samples is obtained from the neighborhood radius, cluster centers are added, the current clustering effect is evaluated with a fuzzy clustering validity index, the optimal number of clusters and the optimal cluster centers are then selected, and finally the clustering result is optimized by minimizing a clustering objective function, wherein the clustering is as follows: the spatial data set at the turning points is $KP = \{kp_1, kp_2, \dots, kp_i, \dots, kp_n\}$, where $KP$ is the data set of turning points, $kp_1$ is the density of the 1st turning point, $kp_2$ is the density of the 2nd turning point, $kp_i$ is the density of the $i$-th turning point, and $kp_n$ is the density of the $n$-th turning point; to ensure the adaptability of the algorithm, the neighborhood radius is determined adaptively by an equation in which $M$ is the neighborhood radius, $kp_i$ is the density of the $i$-th turning point, $kp_j$ is the density of the $j$-th turning point, and $d(kp_i, kp_j)$ is the Euclidean distance between $kp_i$ and $kp_j$; when $KP$ is clustered into $k$ classes, the cluster center $O_k$ is $O_k = \{kp_i \mid i = \arg\max(B_i \times |M_\varepsilon(kp)|)\}$, where $O = \{O_1, O_2, \dots, O_{k-1}\}$ is the set of already selected cluster centers and $B_i$ is the sum of the distances between $tp_i$ and the cluster centers in the set $O$; the clustering effect is measured with a fuzzy clustering validity index, in which $U_{m,i}$ is the membership degree of the $i$-th ($i = 1, 2, \dots, n$) data sample $kp_i$ in the $m$-th ($m = 1, 2, \dots, k$) class, $O_m$ is the $m$-th cluster center, $O_h$ is the $h$-th cluster center, $d(O_m, O_h)$ is the distance between $O_m$ and $O_h$, $d(kp_i, O_m)$ is the distance between $kp_i$ and $O_m$, $\max d(O_m, O_h)$ is the maximum distance between $O_m$ and $O_h$, and $\min d(O_m, O_h)$ is the minimum distance between $O_m$ and $O_h$; the optimal steering region is then searched for starting from the position of the starting point;

then the optimal matching rule is solved, wherein, in the matching objective function $R$ between the spatial data and the rules, $\Gamma$ is the continued-product operation, $\Gamma$ is the rule set, $A$ is the set of spatial data, $a$ is a spatial data item in $A$, $B$ is the set of rules, $b$ is a rule in $B$, $f_a$ is the utility function of spatial data $a$, $g_b$ is the utility function of rule $b$, and $\pi_{ab}$ is the best matching rule; the bidirectional multiplier method is $BM = \min(f_a + g_b)$ subject to $Xa + Yb = c$, where $BM$ is the objective function of the bidirectional multiplier method, $X$ and $Y$ are constant matrices, and $c$ is a constant vector; to make the matching objective function $R$ easier to solve, it is improved to $R'$ by the bidirectional multiplier method, where $R'$ is the improved matching objective function, which is solved by the Lagrange multiplier method, namely $L = R' + \sum_{a \in A} \sum_{b \in B} \lambda \, (f_a(\pi_{ab}) - g_a(\pi_a))$, where $L$ is the Lagrangian function and $\lambda$ is the Lagrange multiplier; to make the solution more accurate, a dual condition is added to further improve $R'$ to $R''$, where $R''$ is the matching objective function obtained by improving $R'$; because finding the best matching rule $\pi_{ab}$ as quickly as possible allows the spatial data to be optimally matched with the rule base rules, the iterative process of the algorithm is improved accordingly, where, in the improved iteration, $k$ is the number of iterations, $\Gamma(k+1)$ is the continued-product operation at the $(k+1)$-th iteration, the minimization is taken over the matching rule $\pi_{ab}$, $\eta$ is a parameter controlling the convergence rate, $\pi_{ab}(k)$ is the matching rule at the $k$-th iteration, and $\pi_{ab}(k+1)$ is the matching rule at the $(k+1)$-th iteration, and the iteration is solved for $\pi_{ab}(k+1)$;

finally, the spatial data are optimally matched with the rules according to the solved optimal matching rule $\pi_{ab}(k)$: a feature steering region data set $KR_{typei}$ is first established, and the distance between any feature steering regions is given in terms of the Euclidean distance between feature steering regions $i$ and $i+k$; the speed and course of the feature steering regions are then given, where $x$ is the number of turning points of a particular type in the feature steering region; the difference between a first steering region $KR_i$ and the next steering region $KR_{i+1}$ is calculated and expressed as the total distance of the steering regions, where $\theta$ is the steering difference between $KR_i$ and $KR_{i+1}$; finally, the optimal matching rule is converted into the total distance $D_{total}$ in the steering region, namely $D_{total} = D_{course}\,\omega_C + D_{distance}\,\omega_D$, where $\omega_C$ is the weight of the course distance and $\omega_D$ is the weight of the steering region distance; the distributed optimal matching algorithm of the improved bidirectional multiplier method thus first classifies the data, obtaining the density of each data point region during classification by adaptively determining the neighborhood radius and gradually adding cluster centers based on sample density, then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately, and finally converts the optimal matching rule into the total distance in the spatial data steering region, realizing optimal matching of the spatial data with the rule base rules;

the automatic scheduling unit proposes an automatic scheduling algorithm of the improved deep Q network to automatically schedule the spatial data processing tasks; the automatic scheduling algorithm of the improved deep Q network is specifically as follows: a reward function $r(\pi)$ is defined, where $\pi$ is the scheduling strategy, $\mu$ is a normalization factor, $T_{max}$ is the maximum completion time for scheduling the spatial data, and the remaining term is the lower limit of the maximum completion time for scheduling the spatial data; the cumulative reward is $AR = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots + \gamma^N r_{t+N} = \sum_{n=0}^{N} \gamma^n r_{t+n}$, where $AR$ is the cumulative reward, $r_t$ is the reward at time $t$, $r_{t+1}$ is the reward at time $t+1$, $r_{t+2}$ is the reward at time $t+2$, $r_{t+N}$ is the reward at time $t+N$, $\gamma$ is the discount factor, and $n$ is an integer time index in $[0, N]$; the update of the Q value in the deep Q network is $Q(s,a) = (1-\alpha)\,Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a')]$, where $Q(s,a)$ is the Q value of taking action $a$ in state $s$, $\alpha$ is the learning rate of the Q-value update step in each iteration, $r$ is the reward obtained after taking action $a$ in state $s$, $\gamma$ is the discount factor, $s'$ is the new state reached after taking action $a$, and $a'$ is the optimal scheduling action of the scheduling strategy in the new state $s'$; $Q(s,a)$ is improved into two independent Q-value functions, namely $Q_1(s,a) = (1-\alpha)\,Q_1(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$ and $Q_2(s,a) = (1-\alpha)\,Q_2(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$, where $Q_1(s,a)$ is the first independent Q-value function and $Q_2(s,a)$ is the second independent Q-value function; $Q_1(s,a)$ is used to evaluate the best automatic scheduling action and $Q_2(s,a)$ is used to update the Q value, and using the two independent Q-value functions interactively addresses the overestimation problem of the deep Q network; the learning rate $\alpha$ is improved to $\alpha'$ through learning rate decay to improve algorithm performance, namely $\alpha' = \alpha \cdot e^{-m \cdot epoch}$, where $\alpha'$ is the improved learning rate, $m$ is the decay factor, and $epoch$ is the number of iterative update steps; a factor $factor$ is also added to control the decay amplitude of the learning rate, namely $\alpha'' = \alpha' \cdot factor$, where $\alpha''$ is the learning rate after the factor is added; the automatic scheduling algorithm of the improved deep Q network thus first decomposes the original deep Q-value function into two independent deep Q-value functions to address the overestimation problem of the deep Q network, and then introduces learning rate decay and a decay amplitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive and converges better, realizing automatic scheduling of the spatial data processing tasks.

2. The automated scheduling system for spatial data analysis based on a rule base of claim 1, wherein the spatial data input module obtains various types of spatial data from satellite remote sensing, sensors, databases, internet data sources and uploads the spatial data to the rule base.

3. The automated scheduling system for spatial data analysis based on a rule base of claim 1, wherein the data preprocessing module preprocesses the spatial data by cleaning and repairing erroneous, missing and inconsistent information in the spatial data while performing data calibration and data dimension reduction.

4. The automatic scheduling system for spatial data analysis based on a rule base according to claim 1, wherein the rule base definition unit is used for defining and managing rules, guides the system in efficiently processing and analyzing large-scale spatial data through rules specifying conditions, operations and data processing steps, and classifies the rules according to task, analysis type and data type so that appropriate rules are applied when processing the spatial data.

5. The automatic scheduling system for spatial data analysis based on rule base according to claim 1, wherein the rule matching unit proposes a distributed optimal matching algorithm for improving the bidirectional multiplier method to optimally match the spatial data with the rule base rules.

6. The automatic scheduling system for space data analysis based on rule base according to claim 1, wherein the scheduling optimizing unit adjusts the execution sequence of tasks by dynamically scheduling real-time space data, establishes a monitoring system to track the performance and resource utilization of task execution and the change of task execution time, and optimizes the automatic scheduling process; the error processing module detects potential errors and abnormal conditions by monitoring the whole data analysis and scheduling flow and input data, and once errors occur, the error processing module records the types, time and related information of the errors and performs fault elimination and problem analysis so as to process and manage the errors and abnormal conditions occurring in the space data analysis and automatic scheduling process.

7. A method of automatic scheduling of spatial data analysis using the system of any one of claims 1-6.