CN117271099B - Automatic space data analysis scheduling system and method based on rule base - Google Patents

Automatic space data analysis scheduling system and method based on rule base

Info

Publication number
CN117271099B
CN117271099B CN202311552215.XA CN202311552215A CN117271099B CN 117271099 B CN117271099 B CN 117271099B CN 202311552215 A CN202311552215 A CN 202311552215A CN 117271099 B CN117271099 B CN 117271099B
Authority
CN
China
Prior art keywords
data
rule
matching
scheduling
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311552215.XA
Other languages
Chinese (zh)
Other versions
CN117271099A (en)
Inventor
李想
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Normal University
Original Assignee
Shandong Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Normal University
Priority to CN202311552215.XA
Publication of CN117271099A
Application granted
Publication of CN117271099B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/29 Geographical information databases
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Software Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Remote Sensing (AREA)
  • Probability & Statistics with Applications (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The system comprises a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimization module, an error processing module and a user interface module. The spatial data input module uploads spatial data; the data preprocessing module preprocesses the spatial data; the rule definition and matching module defines the rules of the rule base and matches the spatial data against those rules; the data processing automatic scheduling and optimization module schedules spatial data processing tasks and optimizes the scheduling process; the error processing module handles scheduling anomalies; and the user interface module provides a user interface. The invention proposes a distributed optimal matching algorithm based on an improved bidirectional multiplier method for matching spatial data with rules, and an automatic scheduling algorithm based on an improved deep Q network for automatically scheduling spatial data processing tasks.

Description

Automatic space data analysis scheduling system and method based on rule base
Technical Field
The invention relates to the fields of spatial data processing, optimal matching and automatic task scheduling, and in particular to an automatic spatial data analysis scheduling system and method based on a rule base.
Background
Spatial data processing covers the acquisition, preprocessing, analysis, storage and visualization of spatial data. It handles different kinds of spatial data, such as Geographic Information System (GIS) data, remote sensing data and geolocation data, in order to perform various spatial data analysis tasks. The data acquired include satellite images, sensor data, map data and GPS track data. Useful information is derived from the spatial data with geographic information analysis and spatial data analysis methods, such as geospatial pattern recognition, spatial buffer analysis and map algebra operations. Predefined rule bases and models are applied to the spatial data so that specific analysis tasks are performed automatically and users can carry out spatial data analysis more easily.
Optimal matching is a method for determining the best-matching rule. Its purpose is to select an appropriate rule from a rule base so that a spatial data analysis task is performed with the best result, taking into account the performance and constraints of the different rules. A set of evaluation criteria, including the accuracy, precision, computational efficiency and memory usage of a rule, is typically used to assess the suitability of each rule, and the rules or models are ranked against these criteria to determine which rule is most suitable for the current analysis task. The choice of optimal matching technique depends on the specific requirements of the problem, the nature of the rule base and the characteristics of the spatial data analysis task. In every case the aim is to ensure that the rule best suited to the current task is selected and executed so that the best analysis result is achieved.
Automatic task scheduling is a technology that identifies, plans and arranges the execution of different spatial data analysis tasks without manual intervention by the user. Based on a set of predefined rules and conditions, it typically performs the following steps: the task description is completed from condition triggers in the rule base; the execution order and timing of the analysis tasks are planned, which involves task dependency analysis and resource allocation; computing resources, storage resources and data access rights are allocated according to task requirements; and once the tasks have been identified, planned and allocated, they are executed automatically. The aim of automatic scheduling is to reduce the user's operational burden and to improve the efficiency and degree of automation of spatial data analysis. The user sets the rule base and task parameters when needed, and the tasks are then executed automatically by the scheduling method. Automatic task scheduling is particularly useful for large-scale, complex spatial data analysis tasks, where it improves working efficiency and accuracy.
Disclosure of Invention
In view of the above problems, the invention aims to provide an automatic scheduling method for spatial data analysis based on a rule base.
The object of the invention is achieved by the following technical solution:
The automatic spatial data analysis scheduling system based on a rule base comprises a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimization module, an error processing module and a user interface module. The spatial data input module is used for uploading spatial data. The data preprocessing module is used for preprocessing the spatial data. The rule definition and matching module comprises a rule base definition unit and a rule matching unit: the rule base definition unit defines the rules of the rule base, and the rule matching unit uses a distributed optimal matching algorithm based on an improved bidirectional multiplier method to optimally match the spatial data with the rules of the rule base. The data processing automatic scheduling and optimization module comprises an automatic scheduling unit and a scheduling optimization unit: the automatic scheduling unit uses an automatic scheduling algorithm based on an improved deep Q network to schedule spatial data processing tasks automatically, and the scheduling optimization unit optimizes the automatic scheduling process. The error processing module handles anomalies in the scheduling process, and the user interface module provides a user interface.
Further, the spatial data input module acquires various types of spatial data from satellite remote sensing, sensors, databases and internet data sources, and uploads the spatial data to the rule base.
Further, the data preprocessing module preprocesses the spatial data by clearing and repairing erroneous, missing and inconsistent information in it, while also performing data calibration and data dimension reduction.
Further, the rule base definition unit defines and manages the rules. Through conditions, operations and data processing steps, the rules tell the system how to process and analyze large-scale spatial data effectively. The rules are classified by task, analysis type and data type so that the appropriate rule can be applied when processing the spatial data.
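As an illustration of the rule structure described above, a rule can be modeled as a condition plus an ordered list of processing steps, filed under a (task, analysis type, data type) class. This is a minimal sketch: the names `Rule`, `RuleBase`, `candidates` and the sample rule are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class Rule:
    """One rule: a condition on dataset metadata plus ordered processing steps."""
    name: str
    task: str
    analysis_type: str
    data_type: str
    condition: Callable[[Dict[str, Any]], bool]
    steps: List[str] = field(default_factory=list)


class RuleBase:
    """Stores rules grouped by (task, analysis type, data type)."""

    def __init__(self) -> None:
        self._by_class: Dict[Tuple[str, str, str], List[Rule]] = {}

    def add(self, rule: Rule) -> None:
        key = (rule.task, rule.analysis_type, rule.data_type)
        self._by_class.setdefault(key, []).append(rule)

    def candidates(self, task: str, analysis_type: str, data_type: str,
                   metadata: Dict[str, Any]) -> List[Rule]:
        # Only rules of the requested class whose condition holds are returned.
        bucket = self._by_class.get((task, analysis_type, data_type), [])
        return [r for r in bucket if r.condition(metadata)]


rule_base = RuleBase()
rule_base.add(Rule(
    name="buffer_roads",  # hypothetical example rule
    task="proximity",
    analysis_type="buffer",
    data_type="vector",
    condition=lambda m: m.get("geometry") == "line",
    steps=["reproject", "buffer", "dissolve"],
))
matched = rule_base.candidates("proximity", "buffer", "vector", {"geometry": "line"})
print([r.name for r in matched])
```

Grouping by class first keeps the candidate set small before any more expensive matching step is run.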
Further, the rule matching unit uses a distributed optimal matching algorithm based on an improved bidirectional multiplier method to optimally match the spatial data with the rules of the rule base.
Further, the distributed optimal matching algorithm based on the improved bidirectional multiplier method is as follows. First, the spatial data are clustered: the neighborhood radius is determined adaptively from the distances between spatial data samples, the density of each class of spatial data samples is obtained from the neighborhood radius, cluster centers are added step by step, the current clustering is judged with a fuzzy clustering validity index, the optimal number of clusters and the optimal cluster centers are then selected, and the clustering result is finally optimized by minimizing a clustering objective function. The spatial data set of turning points, KP, consists of the densities of the 1st, 2nd, ... and n-th turning points [equation not preserved]. To keep the algorithm adaptive, the neighborhood radius M is determined adaptively by an equation [equation not preserved] in terms of the densities of two turning points and the Euclidean distance between them. When KP is clustered into k classes, each cluster center is determined [equation not preserved] from the set of cluster centers already selected and the sum of the distances between a candidate and the cluster centers in that set. The clustering effect is measured with a fuzzy clustering validity index [equation not preserved], which is defined in terms of the membership of each data sample in each class, the centers of the m-th and h-th clusters, the distance between a data sample and a cluster center, the distance between two cluster centers, and the greatest common divisor of two of these quantities. The optimal steering region is then searched for, starting from the position of the starting point.
Then the optimal matching rule is solved. The matching objective function R between the spatial data and the rules is defined [equation not preserved], where A is the set of spatial data, a is an item of spatial data in A, B is the set of rules, b is a rule in B, and the remaining terms are a successive cross product operation, the rule set, the utility function of spatial data a, the utility function of rule b, and the optimal matching rule. The bidirectional multiplier method is stated as an objective function and a constraint [equations not preserved], where BM is the bidirectional multiplier objective function, built from the utility function of spatial data a and the utility function of rule b, X and Y are constant matrices, and c is a constant vector. To make the matching objective function R easier to solve, it is first improved with the bidirectional multiplier method [equation not preserved]. The improved matching objective function is solved with the Lagrangian multiplier method [equation not preserved], where L is the Lagrangian function and the multiplier is the Lagrangian multiplier. To make the solution more accurate, a dual condition is added, which improves the matching objective function a second time [equation not preserved]. So that the optimal matching rule can be found as quickly as possible, the iterative process of the algorithm is also improved [equations not preserved]; the iteration is defined in terms of the iteration number, the successive cross product operation at a given iteration, the matching rule that minimizes the objective, a parameter that controls the convergence speed, and the matching rules at successive iterations. Solving the iteration gives the optimal matching rule [equation not preserved].
Finally, the spatial data and the rules are matched according to the optimal matching rule obtained above. A feature steering region data set is first established, and the distance between any two feature steering regions is given [equation not preserved]. The speed and course of a feature steering region are computed [equations not preserved] from the number of turning points of a particular type in the region. The difference between one steering region and the next is expressed as the total distance of the steering regions [equation not preserved], which includes the steering difference between the two regions. The optimal matching rule is then converted into a total distance [equation not preserved] that combines the weight of the route distance with the distance of the steering regions. In summary, the distributed optimal matching algorithm based on the improved bidirectional multiplier method first classifies the data, obtaining the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers according to sample density. It then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately. Finally, it converts the optimal matching rule into a total distance over the steering regions of the spatial data, which achieves optimal matching between the spatial data and the rules of the rule base.
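The clustering stage described above (a data-derived neighborhood radius, per-point density, and cluster centers added one at a time) can be sketched as follows. The patent's radius equation and validity index are not recoverable from this text, so the radius rule used here (a fixed fraction of the mean pairwise distance) and the greedy center selection are assumptions for illustration only; the function names are hypothetical.

```python
import math
from itertools import combinations


def neighborhood_radius(points, scale=0.5):
    """Assumed rule: radius = scale * mean pairwise Euclidean distance."""
    dists = [math.dist(p, q) for p, q in combinations(points, 2)]
    return scale * sum(dists) / len(dists)


def densities(points, radius):
    """Density of a point = number of other points within the radius."""
    return [
        sum(1 for j, q in enumerate(points) if j != i and math.dist(p, q) <= radius)
        for i, p in enumerate(points)
    ]


def pick_centers(points, k, radius):
    """Add centers one at a time: densest points first, skipping any point
    that lies within the radius of a center chosen earlier."""
    dens = densities(points, radius)
    order = sorted(range(len(points)), key=lambda i: -dens[i])
    centers = []
    for i in order:
        if all(math.dist(points[i], c) > radius for c in centers):
            centers.append(points[i])
        if len(centers) == k:
            break
    return centers


# Two well-separated groups of hypothetical turning points.
turning_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
radius = neighborhood_radius(turning_points)
centers = pick_centers(turning_points, k=2, radius=radius)
print(densities(turning_points, radius), centers)
```

A validity index would normally be evaluated for several values of k to choose the number of clusters; that step is omitted here because its definition is not available.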
Further, the automatic scheduling unit uses an automatic scheduling algorithm based on an improved deep Q network to schedule spatial data processing tasks automatically.
Further, the automatic scheduling algorithm based on the improved deep Q network is as follows. A reward function is defined for a scheduling policy [equation not preserved] in terms of a normalization factor, the maximum completion time of the scheduled spatial data, and the lower bound of that maximum completion time. The cumulative reward is $AR = r_t + \gamma r_{t+1} + \gamma^2 r_{t+2} + \dots + \gamma^N r_{t+N} = \sum_{n=0}^{N} \gamma^n r_{t+n}$, where AR is the cumulative reward, $r_t$, $r_{t+1}$, $r_{t+2}$ and $r_{t+N}$ are the rewards at times t, t+1, t+2 and t+N, $\gamma$ is the discount factor, and n is an integer time offset between 0 and N. The Q value in the deep Q network is updated as $Q(s,a) \leftarrow Q(s,a) + \alpha\,[\,r + \gamma \max_{a'} Q(s',a') - Q(s,a)\,]$, where $Q(s,a)$ is the Q value of taking action a in state s, $\alpha$ is the learning rate that controls the update step of the Q value in each iteration, r is the reward obtained after taking action a in state s, $\gamma$ is the discount factor, $s'$ is the new state reached after taking action a, and $a'$ is the optimal scheduling action of the scheduling policy in the new state $s'$. To solve the overestimation problem of the deep Q network, the Q function is replaced by two independent Q functions [equation not preserved], the 1st independent Q function and the 2nd independent Q function: one is used to evaluate the best automatic scheduling action and the other is used to update the Q value. Using the two independent Q value functions in alternation solves the overestimation problem of the deep Q network. So that the automatic scheduling process adapts better, the learning rate is improved with learning rate decay to raise algorithm performance [equation not preserved], where the result is the improved learning rate, m is the decay factor, and the remaining term is the number of iterative update steps. A further factor is added to control the magnitude of the learning rate decay [equation not preserved], giving the learning rate after that factor is added. In summary, the automatic scheduling algorithm based on the improved deep Q network first decomposes the original deep Q value function into two independent deep Q value functions to solve the overestimation problem of the deep Q network, and then introduces learning rate decay and a decay magnitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive, converges better, and achieves automatic scheduling of spatial data processing tasks.
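The two ideas in this passage, splitting the Q function in two and decaying the learning rate, can be tried out on a toy problem. The sketch below is a tabular stand-in rather than a deep network, the inverse-time decay `alpha0 / (1 + m * step)` is an assumed schedule (the patent's formula is not recoverable here), and the three tasks, their durations and the per-task reward are hypothetical.

```python
import random
from collections import defaultdict

DURATIONS = {"A": 3, "B": 1, "C": 2}  # hypothetical task durations


def decayed_lr(alpha0, m, step):
    """Assumed inverse-time decay: alpha0 / (1 + m * step)."""
    return alpha0 / (1.0 + m * step)


def double_q_update(q_sel, q_eval, s, a, r, s_next, next_actions, alpha, gamma):
    """q_sel picks the best next action, q_eval scores it, q_sel is updated."""
    if next_actions:
        best = max(next_actions, key=lambda x: q_sel[(s_next, x)])
        target = r + gamma * q_eval[(s_next, best)]
    else:
        target = r  # terminal state: nothing left to schedule
    q_sel[(s, a)] += alpha * (target - q_sel[(s, a)])


def step_env(remaining, task):
    """Run one task; the reward is the negative completion time of that task."""
    elapsed = sum(d for t, d in DURATIONS.items() if t not in remaining)
    return remaining - {task}, -float(elapsed + DURATIONS[task])


def train(episodes=2000, alpha0=0.5, m=0.001, gamma=1.0, eps=0.2, seed=0):
    rng = random.Random(seed)
    q1, q2 = defaultdict(float), defaultdict(float)
    step = 0
    for _ in range(episodes):
        state = frozenset(DURATIONS)
        while state:
            actions = sorted(state)
            if rng.random() < eps:  # epsilon-greedy exploration
                action = rng.choice(actions)
            else:
                action = max(actions, key=lambda x: q1[(state, x)] + q2[(state, x)])
            nxt, reward = step_env(state, action)
            # Randomly swap the roles of the two tables on every update.
            q_sel, q_eval = (q1, q2) if rng.random() < 0.5 else (q2, q1)
            double_q_update(q_sel, q_eval, state, action, reward, nxt,
                            sorted(nxt), decayed_lr(alpha0, m, step), gamma)
            state, step = nxt, step + 1
    return q1, q2


def greedy_order(q1, q2):
    """Read out the schedule that is greedy with respect to q1 + q2."""
    state, order = frozenset(DURATIONS), []
    while state:
        action = max(sorted(state), key=lambda x: q1[(state, x)] + q2[(state, x)])
        order.append(action)
        state = state - {action}
    return order


q1, q2 = train()
print(greedy_order(q1, q2))
```

With this reward the best schedule runs short tasks first, so the printed order gives a quick check on whether training has settled; a deep Q network would replace the two tables with two networks.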
Further, the scheduling optimization unit optimizes the automatic scheduling process by dynamically scheduling on real-time spatial data to adjust the execution order of tasks, and by establishing a monitoring system that tracks task execution performance, resource utilization and changes in task execution time.
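Dynamic reordering plus execution monitoring can be sketched with a priority queue whose entries may be reprioritized while tasks are waiting. The class name `DynamicScheduler`, its methods and the task names are hypothetical; the patent does not specify a data structure.

```python
import heapq
import itertools
import time


class DynamicScheduler:
    """Priority queue of tasks; lower priority value runs first."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # tie-breaker keeps heap entries comparable
        self._entries = {}
        self.metrics = []  # (task name, elapsed seconds) per executed task

    def submit(self, name, priority, fn):
        entry = [priority, next(self._counter), name, fn, True]
        self._entries[name] = entry
        heapq.heappush(self._heap, entry)

    def reprioritize(self, name, priority):
        # Mark the old entry stale and push a fresh one (lazy deletion).
        old = self._entries[name]
        old[4] = False
        self.submit(name, priority, old[3])

    def run(self):
        order = []
        while self._heap:
            _, _, name, fn, valid = heapq.heappop(self._heap)
            if not valid:
                continue
            start = time.perf_counter()
            fn()
            self.metrics.append((name, time.perf_counter() - start))
            order.append(name)
        return order


scheduler = DynamicScheduler()
scheduler.submit("tile_a", 2, lambda: None)
scheduler.submit("tile_b", 1, lambda: None)
scheduler.submit("tile_c", 3, lambda: None)
scheduler.reprioritize("tile_c", 0)  # e.g. fresh real-time data arrived for tile_c
order = scheduler.run()
print(order)  # ['tile_c', 'tile_b', 'tile_a']
```

The recorded `metrics` list is where a monitoring component could read execution times to feed back into later priority decisions.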
Further, the error processing module monitors the whole data analysis and scheduling process, as well as the input data, to detect potential errors and abnormal conditions. When an error occurs, the module records its type, time and related information and carries out troubleshooting and problem analysis, so that errors and abnormal conditions arising during spatial data analysis and automatic scheduling are handled and managed.
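Recording the type, time and related information of an error can be as simple as wrapping each task in a guard. This is a sketch under assumed names (`ErrorRecord`, `run_guarded`); the patent does not describe a record layout.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ErrorRecord:
    kind: str       # exception type name
    when: datetime  # UTC timestamp of the failure
    task: str       # task that failed
    detail: str     # exception message


def run_guarded(task_name, fn, error_log):
    """Run fn(); on failure append an ErrorRecord and return None."""
    try:
        return fn()
    except Exception as exc:
        error_log.append(ErrorRecord(
            kind=type(exc).__name__,
            when=datetime.now(timezone.utc),
            task=task_name,
            detail=str(exc),
        ))
        return None


error_log = []
ok_result = run_guarded("area_sum", lambda: 2 + 3, error_log)
bad_result = run_guarded("density", lambda: 1 / 0, error_log)
print(ok_result, bad_result, [(e.task, e.kind) for e in error_log])
```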
Further, the user interface module provides a visual, interactive interface so that the user can easily apply the automatic spatial data analysis scheduling method to manage spatial data. Through the interface the user can input the spatial data to be processed and the rules of the rule base, set the specific requirements of an analysis task, and view the results of the analysis task presented as graphs and charts. Visualizing the results helps the user understand the spatial data analysis results.
The beneficial effects of the invention are as follows. The invention proposes a distributed optimal matching algorithm based on an improved bidirectional multiplier method for matching spatial data with rules, and an automatic scheduling algorithm based on an improved deep Q network for automatically scheduling spatial data processing tasks. The matching algorithm first classifies the data, obtaining the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers according to sample density. It then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately, and finally converts the optimal matching rule into a total distance over the steering regions of the spatial data, which achieves optimal matching between the spatial data and the rules of the rule base. The scheduling algorithm first decomposes the original deep Q value function into two independent deep Q value functions to solve the overestimation problem of the deep Q network, and then introduces learning rate decay and a decay magnitude control factor to improve the learning rate, so that the automatic scheduling process is adaptive and converges better. The invention thereby provides more comprehensive and accurate technical support, and better decision support, for safe, scientific and efficient rule-base-based automatic scheduling of spatial data analysis. Because it involves both an optimal matching algorithm and a reinforcement learning algorithm, it offers a convenient and efficient rule-base-based automatic scheduling method for spatial data analysis, lays a foundation for development in other application fields and for the fusion of spatial data processing, optimal matching and automatic task scheduling, can be applied in many industries and fields, and has significant application value in the technical field of spatial data processing.
Drawings
The invention is further described with reference to the accompanying drawing. The embodiment shown in the drawing does not limit the invention in any way, and a person of ordinary skill in the art can obtain other drawings from it without inventive effort.
Fig. 1 is a schematic diagram of the structure of the present invention.
Detailed Description
The invention will be further described with reference to the following examples.
The automatic spatial data analysis scheduling system based on a rule base comprises a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimization module, an error processing module and a user interface module. The spatial data input module is used for uploading spatial data. The data preprocessing module is used for preprocessing the spatial data. The rule definition and matching module comprises a rule base definition unit and a rule matching unit: the rule base definition unit defines the rules of the rule base, and the rule matching unit uses a distributed optimal matching algorithm based on an improved bidirectional multiplier method to optimally match the spatial data with the rules of the rule base. The data processing automatic scheduling and optimization module comprises an automatic scheduling unit and a scheduling optimization unit: the automatic scheduling unit uses an automatic scheduling algorithm based on an improved deep Q network to schedule spatial data processing tasks automatically, and the scheduling optimization unit optimizes the automatic scheduling process. The error processing module handles anomalies in the scheduling process, and the user interface module provides a user interface.
Preferably, the spatial data input module acquires various types of spatial data, such as Geographic Information System (GIS) data, meteorological data, topographic data and population data, from satellite remote sensing, sensors, databases and internet data sources, and uploads the spatial data to the rule base.
Preferably, the data preprocessing module preprocesses the spatial data by clearing and repairing erroneous, missing and inconsistent information in it, while also performing data calibration and data dimension reduction, which reduces computational complexity and improves analysis efficiency.
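A minimal preprocessing pass along these lines might drop records with impossible coordinates, fill missing attribute values, and discard attributes that carry no information. The specific choices below (mean imputation, dropping constant attributes as a crude form of dimension reduction, the field names) are assumptions for illustration; calibration is not covered.

```python
from statistics import mean, pvariance


def preprocess(records, min_variance=1e-9):
    """Clean point records shaped like {'lat': ..., 'lon': ..., <attributes>}."""
    # 1. Remove erroneous records: coordinates outside the valid range.
    valid = [dict(r) for r in records
             if -90 <= r["lat"] <= 90 and -180 <= r["lon"] <= 180]
    if not valid:
        return []
    fields = [k for k in valid[0] if k not in ("lat", "lon")]
    # 2. Repair missing values with the attribute mean (assumed strategy).
    for f in fields:
        known = [r[f] for r in valid if r[f] is not None]
        fill = mean(known) if known else 0.0
        for r in valid:
            if r[f] is None:
                r[f] = fill
    # 3. Crude dimension reduction: drop attributes that never vary.
    constant = [f for f in fields
                if pvariance([r[f] for r in valid]) <= min_variance]
    for r in valid:
        for f in constant:
            del r[f]
    return valid


raw = [
    {"lat": 36.6, "lon": 117.0, "temp": 20.0, "sensor_ver": 1},
    {"lat": 36.7, "lon": 117.1, "temp": None, "sensor_ver": 1},
    {"lat": 95.0, "lon": 117.2, "temp": 30.0, "sensor_ver": 1},  # invalid latitude
    {"lat": 36.8, "lon": 117.3, "temp": 24.0, "sensor_ver": 1},
]
cleaned = preprocess(raw)
print(cleaned)
```

The input list is copied record by record, so the raw data stay available for error analysis after cleaning.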
Preferably, the rule base definition unit defines and manages the rules. Through conditions, operations and data processing steps, the rules tell the system how to process and analyze large-scale spatial data effectively. The rules are classified by task, analysis type and data type so that the appropriate rule can be applied when processing the spatial data, which improves the efficiency of spatial data analysis and ensures consistency and repeatability.
Preferably, the rule matching unit uses a distributed optimal matching algorithm based on an improved bidirectional multiplier method to optimally match the spatial data with the rules of the rule base.
Specifically, the distributed optimal matching algorithm based on the improved bidirectional multiplier method is as follows. First, the spatial data are clustered: the neighborhood radius is determined adaptively from the distances between spatial data samples, the density of each class of spatial data samples is obtained from the neighborhood radius, cluster centers are added step by step, the current clustering is judged with a fuzzy clustering validity index, the optimal number of clusters and the optimal cluster centers are then selected, and the clustering result is finally optimized by minimizing a clustering objective function. The spatial data set of turning points, KP, consists of the densities of the 1st, 2nd, ... and n-th turning points [equation not preserved]. The density of a turning point is the number of neighboring turning points within its neighborhood radius, so the turning point density depends on the neighborhood radius. To keep the algorithm adaptive, the neighborhood radius M is determined adaptively by an equation [equation not preserved] in terms of the densities of two turning points and the Euclidean distance between them. When KP is clustered into a given number of classes, each cluster center is determined [equation not preserved] from the set of cluster centers already selected and the sum of the distances between a candidate and the cluster centers in that set. The clustering effect is measured with a fuzzy clustering validity index [equation not preserved], which is defined in terms of the membership of each data sample in each class, the centers of the m-th and h-th clusters, the distance between a data sample and a cluster center, the distance between two cluster centers, and the greatest common divisor of two of these quantities. The optimal steering region is then searched for, starting from the position of the starting point.
Then the optimal matching rule is solved. The matching objective function R between the spatial data and the rules is defined [equation not preserved], where A is the set of spatial data, a is an item of spatial data in A, B is the set of rules, b is a rule in B, and the remaining terms are a successive cross product operation, the rule set, the utility function of spatial data a, the utility function of rule b, and the optimal matching rule. The bidirectional multiplier method is stated as an objective function and a constraint [equations not preserved], where BM is the bidirectional multiplier objective function, built from the utility function of spatial data a and the utility function of rule b, X and Y are constant matrices, and c is a constant vector. To make the matching objective function R easier to solve, it is first improved with the bidirectional multiplier method [equation not preserved]. The improved matching objective function is solved with the Lagrangian multiplier method [equation not preserved], where L is the Lagrangian function and the multiplier is the Lagrangian multiplier. To make the solution more accurate, a dual condition is added, which improves the matching objective function a second time [equation not preserved]. So that the optimal matching rule can be found as quickly as possible, the iterative process of the algorithm is also improved [equations not preserved]; the iteration is defined in terms of the iteration number, the successive cross product operation at a given iteration, the matching rule that minimizes the objective, a parameter that controls the convergence speed, and the matching rules at successive iterations. Solving the iteration gives the optimal matching rule [equation not preserved].
Finally, the spatial data and the rules are matched according to the optimal matching rule obtained above. A feature steering region data set is first established, and the distance between any two feature steering regions is given [equation not preserved]. The speed and course of a feature steering region are computed [equations not preserved] from the number of turning points of a particular type in the region. The difference between one steering region and the next is expressed as the total distance of the steering regions [equation not preserved], which includes the steering difference between the two regions. The optimal matching rule is then converted into a total distance [equation not preserved] that combines the weight of the route distance with the distance of the steering regions. In summary, the distributed optimal matching algorithm based on the improved bidirectional multiplier method first classifies the data, obtaining the density of each data point region by adaptively determining the neighborhood radius and gradually adding cluster centers according to sample density. It then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately. Finally, it converts the optimal matching rule into a total distance over the steering regions of the spatial data, which achieves optimal matching between the spatial data and the rules of the rule base.
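The ingredients named in the solving step (an objective built from two utility functions, constant matrices X and Y, a constant vector c, a Lagrangian multiplier, and a parameter controlling convergence speed) match the shape of the standard alternating direction method of multipliers. For orientation only, that textbook form is written out below. It is not the patent's own formulation, which is not recoverable from this text, and the symbols $f$, $g$, $\lambda$, $\rho$ and the iteration index $t$ are assumptions introduced here.

```latex
\begin{aligned}
&\min_{a,\,b}\; f(a) + g(b) \quad \text{s.t.} \quad Xa + Yb = c,\\[2pt]
&L_{\rho}(a, b, \lambda) = f(a) + g(b) + \lambda^{\top}(Xa + Yb - c)
   + \tfrac{\rho}{2}\,\lVert Xa + Yb - c \rVert_2^2,\\[2pt]
&a^{(t+1)} = \arg\min_{a}\; L_{\rho}\bigl(a,\, b^{(t)},\, \lambda^{(t)}\bigr),\\
&b^{(t+1)} = \arg\min_{b}\; L_{\rho}\bigl(a^{(t+1)},\, b,\, \lambda^{(t)}\bigr),\\
&\lambda^{(t+1)} = \lambda^{(t)} + \rho\,\bigl(Xa^{(t+1)} + Yb^{(t+1)} - c\bigr).
\end{aligned}
```

In this form the penalty parameter $\rho$ plays the role of a convergence-speed control, and the multiplier update is the dual step.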
Preferably, the automatic scheduling unit uses an improved deep Q-network automatic scheduling algorithm to schedule spatial data processing tasks automatically.
Specifically, the improved deep Q-network automatic scheduling algorithm is as follows. The reward function is [formula not reproduced], where $r(\pi)$ is the reward function, $\pi$ is the scheduling policy, $\mu$ is a normalization factor, $T_{max}$ is the maximum completion time for scheduling the spatial data, and [symbol not reproduced] is the lower bound of the maximum completion time for scheduling the spatial data; this reward function is better suited to the spatial data scheduling problem. The cumulative reward is [formula not reproduced], where $AR$ is the cumulative reward, $r_t$ is the reward at time $t$, $r_{t+1}$ is the reward at time $t+1$, $r_{t+2}$ is the reward at time $t+2$, $r_{t+N}$ is the reward at time $t+N$, $\gamma$ is the discount factor, and $n$ is an integer time index in $[0, N]$. The Q value in the deep Q network is updated as $Q(s,a) = (1-\alpha)\,Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a')]$, where $Q(s,a)$ is the Q value of taking action $a$ in state $s$, $\alpha$ is the learning rate controlling the update step of the Q value in each iteration, $r$ is the reward obtained after taking action $a$ in state $s$, $\gamma$ is the discount factor, $s'$ is the new state reached after taking action $a$, and $a'$ is the optimal scheduling action under the scheduling policy in the new state $s'$. The Q value is updated iteratively so that it better estimates the long-term cumulative reward of each action in each state. To address the overestimation problem of the deep Q network, $Q(s,a)$ is replaced by two independent Q-value functions, i.e. $Q_1(s,a) = (1-\alpha)\,Q_1(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$ and $Q_2(s,a) = (1-\alpha)\,Q_2(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$, where $Q_1(s,a)$ is the first independent Q-value function and $Q_2(s,a)$ is the second. $Q_1(s,a)$ is used to evaluate the best action for automatic scheduling, and $Q_2(s,a)$ is used to update the Q value. Using the two independent Q-value functions alternately addresses the overestimation problem of the deep Q network and makes the automatic scheduling process more adaptive. The learning rate $\alpha$ is then improved to $\alpha'$ through learning-rate decay to improve algorithm performance, i.e. $\alpha' = \alpha \cdot e^{-m \cdot epoch}$, where $\alpha'$ is the improved learning rate, $m$ is the decay factor, and $epoch$ is the number of iterative update steps. A factor is also added to control the decay amplitude of the learning rate, i.e. $\alpha'' = \alpha' \cdot factor$, where $\alpha''$ is the learning rate after the factor is added. In summary, the improved deep Q-network automatic scheduling algorithm first decomposes the original deep Q-value function into two independent Q-value functions to address the overestimation problem of the deep Q network, and then introduces learning-rate decay with a decay-amplitude control factor so that the automatic scheduling process is adaptive and converges better, thereby achieving automatic scheduling of spatial data processing tasks.
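The two-function update and the decayed learning rate can be illustrated with a small tabular sketch. It follows the update equations as they are printed in the claims, where both targets use the maximum of the first Q function over next actions; the commonly used double Q-learning formulation instead selects the next action with one function and evaluates it with the other. The dictionary-based tables stand in for the neural networks of a deep Q-network, and the state names, action names and numeric values are hypothetical.

```python
import math
from collections import defaultdict


def decayed_learning_rate(alpha, m, epoch, factor=1.0):
    """alpha'' = (alpha * exp(-m * epoch)) * factor, combining decay and amplitude factor."""
    return alpha * math.exp(-m * epoch) * factor


def update_q_pair(q1, q2, s, a, r, s_next, actions, alpha, gamma):
    """Apply both updates as printed in the source: each target uses max over a' of Q1(s', a')."""
    target = r + gamma * max(q1[(s_next, a_next)] for a_next in actions)
    q1[(s, a)] = (1 - alpha) * q1[(s, a)] + alpha * target
    q2[(s, a)] = (1 - alpha) * q2[(s, a)] + alpha * target


q1 = defaultdict(float)  # tabular stand-in for the first Q function
q2 = defaultdict(float)  # tabular stand-in for the second Q function
actions = ["run_task_A", "run_task_B"]  # hypothetical scheduling actions

alpha = decayed_learning_rate(alpha=0.5, m=0.01, epoch=0, factor=1.0)
update_q_pair(q1, q2, s="idle", a="run_task_A", r=1.0, s_next="busy",
              actions=actions, alpha=alpha, gamma=0.9)
```

Starting from empty tables, the first update moves both entries for `("idle", "run_task_A")` halfway toward the target of 1.0. Larger `epoch` values shrink the step size exponentially, and `factor` scales it further.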
Preferably, the scheduling optimization unit adjusts the execution sequence of the tasks by dynamically scheduling the real-time space data, establishes a monitoring system to track the performance and resource utilization condition of task execution and the change of task execution time, and realizes the optimization of the automatic scheduling process.
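As a rough illustration of reordering tasks from monitored execution times, the sketch below keeps a smoothed estimate of execution time per task type and sorts pending tasks so that the shortest estimated task runs first. The patent does not state which ordering policy or estimator is used, so the shortest-first policy, the exponential smoothing, and every name here (`DynamicScheduler`, `record`, `order`, the task types) are assumptions made for the example.

```python
class DynamicScheduler:
    """Reorders pending tasks by their current estimated execution time (assumed policy)."""

    def __init__(self, smoothing=0.5):
        self.smoothing = smoothing  # weight given to the newest observation
        self.estimates = {}         # task type -> estimated execution time in seconds
        self.history = []           # monitoring log of (task_id, task_type, observed_time)

    def record(self, task_id, task_type, observed_time):
        """Log an observed execution time and refresh the estimate for that task type."""
        old = self.estimates.get(task_type, observed_time)
        self.estimates[task_type] = (1 - self.smoothing) * old + self.smoothing * observed_time
        self.history.append((task_id, task_type, observed_time))

    def order(self, pending):
        """Return pending (task_id, task_type) pairs, shortest estimated time first.

        Task types with no estimate yet are placed last.
        """
        return sorted(pending, key=lambda task: self.estimates.get(task[1], float("inf")))


scheduler = DynamicScheduler()
scheduler.record("t1", "buffer", 4.0)
scheduler.record("t2", "overlay", 10.0)
scheduler.record("t3", "overlay", 2.0)
queue = scheduler.order([("t4", "overlay"), ("t5", "buffer"), ("t6", "reproject")])
```

Here the `overlay` estimate settles at 6.0 after two observations, so the `buffer` task (4.0) is scheduled first and the unseen `reproject` task last.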
Preferably, the error handling module monitors the entire data analysis and scheduling flow, together with the input data, to detect potential errors and anomalies. When an error occurs, the module records its type, time and related information and performs troubleshooting and problem analysis, so that errors and anomalies arising during spatial data analysis and automatic scheduling are handled and managed. For certain known errors and anomalies the module attempts automatic handling: for example, missing data are filled according to predefined rules, and the matching rules are adjusted when rule matching fails.
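The automatic handling of missing data can be sketched as follows. This is a minimal example assuming that records are dictionaries, that a missing value is represented by `None`, and that the predefined fill rules are a simple field-to-default mapping; the function name `handle_record`, the log entry layout and the sample field names are hypothetical.

```python
from datetime import datetime, timezone


def handle_record(record, fill_rules, error_log):
    """Fill missing fields from predefined rules and log every anomaly that is found."""
    repaired = dict(record)  # work on a copy so the input record is left unchanged
    for field, value in record.items():
        if value is None:
            entry = {
                "type": "missing_value",
                "field": field,
                "time": datetime.now(timezone.utc).isoformat(),
            }
            if field in fill_rules:
                repaired[field] = fill_rules[field]  # known case: fill automatically
                entry["action"] = "filled"
            else:
                entry["action"] = "unresolved"       # left for manual troubleshooting
            error_log.append(entry)
    return repaired


log = []
fixed = handle_record(
    {"id": 7, "elevation": None, "crs": None},
    fill_rules={"crs": "EPSG:4326"},  # hypothetical default for a missing field
    error_log=log,
)
```

The record keeps its unresolved `elevation` gap, receives the default for `crs`, and both anomalies are logged with their type and time so they can be analysed later.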
Preferably, the user interface module provides a visual, interactive interface through which the user can apply the automatic spatial data analysis scheduling method to manage spatial data. The interface lets the user input the spatial data to be processed and the rule-base rules (for example a spatial data set, a rule base and analysis task parameters) and set the specific requirements of an analysis task, including the rules to apply, the analysis method and the expected results. The user can view the results of an analysis task through the interface, presented as graphs and charts; this visualization helps the user understand the spatial data analysis results.
The invention provides a rule-base-driven method for automatically scheduling spatial data analysis. The method combines a spatial data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic scheduling and optimization module, an error handling module and a user interface module. Its first innovation is the distributed optimal matching algorithm based on the improved bidirectional multiplier method: the data are first classified, and during classification the density of each data-point region is obtained by adaptively determining the neighborhood radius and gradually adding cluster centers according to sample density; the matching objective function is then improved twice so that the optimal matching rule can be solved conveniently and accurately; finally, the optimal matching rule is converted into the total distance over the spatial data steering regions, which achieves optimal matching between the spatial data and the rule-base rules. Its second innovation is the improved deep Q-network automatic scheduling algorithm: the original deep Q-value function is first decomposed into two independent Q-value functions to address the overestimation problem of the deep Q network, and learning-rate decay with a decay-amplitude control factor is then introduced so that the automatic scheduling process is adaptive and converges better, which achieves automatic scheduling of spatial data processing tasks. Together these measures improve the working effect of the method and provide more comprehensive and accurate technical support, and better decision support, for safe, scientific and efficient automatic scheduling of spatial data analysis. Because the method draws on both an optimal matching algorithm and a reinforcement learning algorithm, it can also serve as a foundation for other application fields that combine spatial data processing, optimal matching and automatic task scheduling, and it can be applied in multiple industries and fields.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications can be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (7)

1. The space data analysis automatic dispatching system based on the rule base is characterized by comprising a space data input module, a data preprocessing module, a rule definition and matching module, a data processing automatic dispatching and optimizing module, an error processing module and a user interface module, wherein the space data input module is used for uploading space data, the data preprocessing module is used for preprocessing the space data, the rule definition and matching module comprises a rule base definition unit and a rule matching unit, the rule base definition unit is used for defining rule base rules, the rule matching unit adopts a distributed optimal matching algorithm for improving a bidirectional multiplier method to optimally match the space data with the rule base rules, the data processing automatic dispatching and optimizing module comprises an automatic dispatching unit and a dispatching optimizing unit, the automatic dispatching unit is used for proposing an automatic dispatching algorithm for improving a depth Q network to automatically dispatch space data processing task, the dispatching optimizing unit is used for optimizing an automatic dispatching process, and the error processing module is used for processing anomalies in a dispatching process, and the user interface module is used for providing a user interface;
the distributed optimal matching algorithm based on the improved bidirectional multiplier method is specifically as follows: first, the neighborhood radius is determined adaptively by calculating the distances between spatial data samples, the density of each class of spatial data samples is obtained from the neighborhood radius, cluster centers are added, the current clustering effect is judged using a fuzzy clustering validity index, the optimal number of clusters and the optimal cluster centers are then selected, and finally the clustering result is optimized by minimizing a clustering objective function, wherein the clustering proceeds as follows: for the spatial data set of turning points $KP = \{kp_1, kp_2, \dots, kp_i, \dots, kp_n\}$, where $KP$ is the data set of turning points, $kp_1$ is the density of the 1st turning point, $kp_2$ is the density of the 2nd turning point, $kp_i$ is the density of the $i$-th turning point, and $kp_n$ is the density of the $n$-th turning point, the neighborhood radius is determined adaptively, to ensure the adaptability of the algorithm, according to the equation [formula not reproduced], where $M$ is the neighborhood radius, $kp_i$ is the density of the $i$-th turning point, $kp_j$ is the density of the $j$-th turning point, and $d(kp_i, kp_j)$ is the Euclidean distance between $kp_i$ and $kp_j$; when $KP$ is clustered into $k$ classes, the cluster center $O_k$ is $O_k = \{kp_i \mid i = \arg\max(B_i \times |M_\varepsilon(kp)|)\}$, where [formula not reproduced], $O = \{O_1, O_2, \dots, O_{k-1}\}$ is the set of cluster centers already selected, and [symbol not reproduced] is the sum of the distances between $tp_i$ and the cluster centers in the set $O$; the clustering effect is measured using the fuzzy clustering validity index, which is [formula not reproduced], where $U_{m,i}$ is the membership degree of the $i$-th ($i = 1, 2, \dots, n$) data sample $kp_i$ in class $m$ ($m = 1, 2, \dots, k$), $O_m$ is the $m$-th cluster center, $O_h$ is the $h$-th cluster center, $d(O_m, O_h)$ is the distance between $O_m$ and $O_h$, $d(kp_i, O_m)$ is the distance between $kp_i$ and $O_m$, $\max d(O_m, O_h)$ is the maximum distance between $O_m$ and $O_h$, and $\min d(O_m, O_h)$ is the minimum distance between $O_m$ and $O_h$; the optimal steering region is then searched for starting from the position of the starting point;
then the optimal matching rule is solved, wherein the matching objective function between the spatial data and the rules is [formula not reproduced], where $R$ is the matching objective function between the spatial data and the rules, $\Gamma$ is the successive cross-product operation, $\Gamma$ is the rule set, $A$ is the set of spatial data, $a$ is a spatial datum in $A$, $B$ is the set of rules, $b$ is a rule in $B$, $f_a$ is the utility function of spatial datum $a$, $g_b$ is the utility function of rule $b$, and $\pi_{ab}$ is the optimal matching rule; the bidirectional multiplier method is $BM = \min(f_a + g_b)$ subject to $Xa + Yb = c$, where $BM$ is the bidirectional multiplier objective function, $X$ and $Y$ are constant matrices, and $c$ is a constant vector; to make the matching objective function $R$ easier to solve, it is improved to $R'$ by the bidirectional multiplier method, i.e. [formula not reproduced], where $R'$ is the improved matching objective function; $R'$ is solved by the Lagrangian multiplier method, i.e. $L = R' + \sum_{a \in A}\sum_{b \in B}\lambda\,(f_a(\pi_{ab}) - g_a(\pi_a))$, where $L$ is the Lagrangian function and $\lambda$ is the Lagrange multiplier; to make the solution more accurate, a dual condition is added and $R'$ is further improved to $R''$, i.e. [formula not reproduced], where $R''$ is the matching objective function obtained from $R'$; so that the optimal matching rule $\pi_{ab}$ is found as quickly as possible and the spatial data are optimally matched with the rule-base rules, the iterative process of the algorithm is improved, i.e. [formula not reproduced], where $k$ is the iteration number, $\Gamma(k+1)$ is the successive cross-product operation at the $(k+1)$-th iteration, [symbol not reproduced] finds the smallest matching rule $\pi_{ab}$, $\eta$ is a parameter controlling the convergence speed, $\pi_{ab}(k)$ is the matching rule at the $k$-th iteration, and $\pi_{ab}(k+1)$ is the matching rule at the $(k+1)$-th iteration; solving for $\pi_{ab}(k+1)$ gives [formula not reproduced];
finally, the spatial data and the rules are optimally matched according to the solved optimal matching rule $\pi_{ab}(k)$: first, a feature steering-region data set $KR_{type_i}$ is established; find [text not reproduced]; the distance between any two feature steering regions is then given by [formula not reproduced], where [symbol not reproduced] is the Euclidean distance between feature steering regions $i$ and $i+k$; the speed and course of a feature steering region are [formulas not reproduced], where $x$ is the number of turning points of a particular type in the feature steering region; the difference between the first steering region $KR_i$ and the next steering region $KR_{i+1}$ is calculated and expressed as the total distance of the steering regions, i.e. [formula not reproduced], where [definitions not reproduced] and $\theta$ is the steering difference between $KR_i$ and $KR_{i+1}$; finally, the optimal matching rule is converted into the total distance $D_{total}$ over the steering regions, i.e. $D_{total} = D_{course}\,\omega_C + D_{distance}\,\omega_D$, where $\omega_C$ is the weight of the route distance and $\omega_D$ is the weight of the steering-region distance; the distributed optimal matching algorithm based on the improved bidirectional multiplier method first classifies the data, obtaining the density of each data-point region during classification by adaptively determining the neighborhood radius and gradually adding cluster centers according to sample density, then improves the matching objective function twice so that the optimal matching rule can be solved conveniently and accurately, and finally converts the optimal matching rule into the total distance over the spatial data steering regions to achieve optimal matching between the spatial data and the rule-base rules;
the automatic scheduling unit uses an improved deep Q-network automatic scheduling algorithm to schedule spatial data processing tasks automatically; the improved deep Q-network automatic scheduling algorithm is specifically as follows: the reward function is [formula not reproduced], where $r(\pi)$ is the reward function, $\pi$ is the scheduling policy, $\mu$ is a normalization factor, $T_{max}$ is the maximum completion time for scheduling the spatial data, and [symbol not reproduced] is the lower bound of the maximum completion time for scheduling the spatial data; the cumulative reward is [formula not reproduced], where $AR$ is the cumulative reward, $r_t$ is the reward at time $t$, $r_{t+1}$ is the reward at time $t+1$, $r_{t+2}$ is the reward at time $t+2$, $r_{t+N}$ is the reward at time $t+N$, $\gamma$ is the discount factor, and $n$ is an integer time index in $[0, N]$; the Q value in the deep Q network is updated as $Q(s,a) = (1-\alpha)\,Q(s,a) + \alpha\,[r + \gamma \max_{a'} Q(s',a')]$, where $Q(s,a)$ is the Q value of taking action $a$ in state $s$, $\alpha$ is the learning rate controlling the update step of the Q value in each iteration, $r$ is the reward obtained after taking action $a$ in state $s$, $\gamma$ is the discount factor, $s'$ is the new state reached after taking action $a$, and $a'$ is the optimal scheduling action under the scheduling policy in the new state $s'$; $Q(s,a)$ is improved into two independent Q-value functions, i.e. $Q_1(s,a) = (1-\alpha)\,Q_1(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$ and $Q_2(s,a) = (1-\alpha)\,Q_2(s,a) + \alpha\,[r + \gamma \max_{a'} Q_1(s',a')]$, where $Q_1(s,a)$ is the first independent Q-value function and $Q_2(s,a)$ is the second independent Q-value function; $Q_1(s,a)$ is used to evaluate the best action for automatic scheduling and $Q_2(s,a)$ is used to update the Q value, and using the two independent Q-value functions alternately addresses the overestimation problem of the deep Q network; the learning rate $\alpha$ is improved to $\alpha'$ through learning-rate decay to improve algorithm performance, i.e. $\alpha' = \alpha \cdot e^{-m \cdot epoch}$, where $\alpha'$ is the improved learning rate, $m$ is the decay factor, and $epoch$ is the number of iterative update steps; a factor is also added to control the decay amplitude of the learning rate, i.e. $\alpha'' = \alpha' \cdot factor$, where $\alpha''$ is the learning rate after the factor is added; the improved deep Q-network automatic scheduling algorithm first decomposes the original deep Q-value function into two independent Q-value functions to address the overestimation problem of the deep Q network, and then introduces learning-rate decay with a decay-amplitude control factor so that the automatic scheduling process is adaptive and converges better, thereby achieving automatic scheduling of spatial data processing tasks.
2. The automated scheduling system for spatial data analysis based on a rule base of claim 1, wherein the spatial data input module obtains various types of spatial data from satellite remote sensing, sensors, databases, internet data sources and uploads the spatial data to the rule base.
3. The automated scheduling system for spatial data analysis based on rule base of claim 1, wherein the data preprocessing module performs preprocessing of the spatial data by cleaning and repairing information of errors, deletions and inconsistencies in the spatial data while performing data calibration and data dimension reduction.
4. The automatic scheduling system for spatial data analysis based on a rule base according to claim 1, wherein the rule base definition unit is used for defining and managing rules, guiding the system how to efficiently process and analyze large-scale spatial data through conditions, operations and data processing step rules, and classifying the rules according to different tasks, analysis types and data types to process the spatial data according to application of appropriate rules.
5. The automatic scheduling system for spatial data analysis based on rule base according to claim 1, wherein the rule matching unit proposes a distributed optimal matching algorithm for improving the bidirectional multiplier method to optimally match the spatial data with the rule base rules.
6. The automatic scheduling system for space data analysis based on rule base according to claim 1, wherein the scheduling optimizing unit adjusts the execution sequence of tasks by dynamically scheduling real-time space data, establishes a monitoring system to track the performance and resource utilization of task execution and the change of task execution time, and optimizes the automatic scheduling process; the error processing module detects potential errors and abnormal conditions by monitoring the whole data analysis and scheduling flow and input data, and once errors occur, the error processing module records the types, time and related information of the errors and performs fault elimination and problem analysis so as to process and manage the errors and abnormal conditions occurring in the space data analysis and automatic scheduling process.
7. A method of automatic scheduling of spatial data analysis using the system of any one of claims 1-6.
CN202311552215.XA 2023-11-21 2023-11-21 Automatic space data analysis scheduling system and method based on rule base Active CN117271099B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311552215.XA CN117271099B (en) 2023-11-21 2023-11-21 Automatic space data analysis scheduling system and method based on rule base

Publications (2)

Publication Number Publication Date
CN117271099A CN117271099A (en) 2023-12-22
CN117271099B true CN117271099B (en) 2024-01-26

Family

ID=89218081

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311552215.XA Active CN117271099B (en) 2023-11-21 2023-11-21 Automatic space data analysis scheduling system and method based on rule base

Country Status (1)

Country Link
CN (1) CN117271099B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3525136A1 (en) * 2018-02-08 2019-08-14 Prowler.io Limited Distributed machine learning system
CN111813982A (en) * 2020-07-23 2020-10-23 中原工学院 Data processing method and device based on subspace clustering algorithm of spectral clustering
CN111860612A (en) * 2020-06-29 2020-10-30 西南电子技术研究所(中国电子科技集团公司第十研究所) Unsupervised hyperspectral image hidden low-rank projection learning feature extraction method
CN116156421A (en) * 2023-02-22 2023-05-23 重庆邮电大学 Differentiated service transmission method based on double-layer satellite heterogeneous network
CN116450993A (en) * 2023-04-24 2023-07-18 哈尔滨工业大学 Multi-measurement vector satellite data processing method, electronic equipment and storage medium
CN116700340A (en) * 2023-06-29 2023-09-05 同济大学 Track planning method and device and unmanned aerial vehicle cluster
CN117076077A (en) * 2023-08-18 2023-11-17 上海墅字科技有限公司 Planning and scheduling optimization method based on big data analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
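A Pan-Sharpening Method with Beta-Divergence Non-Negative Matrix Factorization in Non-Subsampled Shear Transform Domain; Pan, Yuetao; REMOTE SENSING Volume 14 Issue 12; full text *
Parallel sparse subspace clustering using coordinate descent; Wu Jieqi; Li Xiaoyu; Yuan Xiaotong; Liu Qingshan; Journal of Computer Applications (No. 02); 88-92 *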

Also Published As

Publication number Publication date
CN117271099A (en) 2023-12-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant