CN117932533A

CN117932533A - Earth science multi-source data fusion method and system based on Bayesian statistics

Info

Publication number: CN117932533A
Application number: CN202410064329.8A
Authority: CN
Inventors: 田玉昆; 杜治利; 冯***; 王鹏飞; 秦会海; 王斐然
Original assignee: Command Center Of Natural Resources Comprehensive Survey Of China Geological Survey
Current assignee: Command Center Of Natural Resources Comprehensive Survey Of China Geological Survey
Priority date: 2024-01-16
Filing date: 2024-01-16
Publication date: 2024-04-26

Abstract

The invention relates to the technical field of multi-source data fusion, in particular to an earth science multi-source data fusion method and system based on Bayesian statistics. The method comprises the following steps: constructing, through a Bayesian statistical framework, a probability model that fuses the multi-source data and is used for quantitative decision-making on prediction problems and uncertainty problems in geoscience research, thereby obtaining a multi-source data fusion decision model; and inputting the mathematized prior information of the multi-source data and the observation information of the multi-source data into the multi-source data fusion decision model to obtain the decision probability for those prediction problems and uncertainty problems. The invention can effectively fuse multi-source data, convert prior information that is difficult to express mathematically into various probability distributions, improve data utilization efficiency, and improve the decision accuracy for prediction problems and uncertainty problems in earth science research.

Description

Earth science multi-source data fusion method and system based on Bayesian statistics

Technical Field

The invention relates to the technical field of multi-source data fusion, in particular to an earth science multi-source data fusion method and system based on Bayesian statistics.

Background

Geoscience is a typical multi-source data discipline, so the sources of prior information are also diverse. Taking geophysical data inversion as an example, besides defining the distribution type of the observed data, constraints can be introduced from the experience of geologists, geological horizon information, drilling and logging data, data on different geophysical parameters, and the like. These data not only come from diverse sources but also vary widely in data type. For example, lithology data are usually categorical data, drilling and logging data are usually continuous data, and the experience of geologists is harder still to express mathematically. How this information is integrated statistically determines how well prediction problems and uncertainty problems in earth science can be solved.

In the prior art, each kind of data is usually processed separately and then interpreted manually, which is time-consuming and labor-intensive and does not yield good results.

Disclosure of Invention

The invention aims to provide a Bayesian statistics-based geoscience multi-source data fusion method to solve the technical problems that in the prior art, manual interpretation is time-consuming and labor-consuming, and a good result cannot be obtained.

In order to solve the technical problems, the invention specifically provides the following technical scheme:

A Bayesian statistics-based geoscience multi-source data fusion method comprises the following steps:

determining multi-source data from the geological, geophysical, geochemical and remote-sensing fields of geoscience research, wherein the data types of the multi-source data comprise a continuous data type, a categorical data type and an empirical data type;

acquiring prior information of the multi-source data and observation information of the multi-source data;

constructing, through a Bayesian statistical framework, a probability model that fuses the multi-source data and is used for quantitative decision-making on the prediction problems and uncertainty problems in geoscience research, thereby obtaining a multi-source data fusion decision model that characterizes the mapping from the prior information and the observation information of the multi-source data to the decision probability of those prediction problems and uncertainty problems;

mathematizing the prior information of the multi-source data (expressing it in mathematical form) through a prior model chosen according to the data type of the multi-source data;

and inputting the mathematized prior information of the multi-source data and the observation information of the multi-source data into the multi-source data fusion decision model to obtain the decision probability for the prediction problems and uncertainty problems in geoscience research.

As a preferable scheme of the invention, the construction method of the multi-source data fusion decision model comprises the following steps:

The multi-source data are fused through their prior distributions to obtain the prior distribution of the multi-source data, which is: p(m) = p_1(m) p_2(m) … p_n(m), where p(m) is the prior distribution of the multi-source data, p_1(m) is the prior distribution of the 1st data in the multi-source data, p_2(m) is the prior distribution of the 2nd data in the multi-source data, p_n(m) is the prior distribution of the n-th data in the multi-source data, and m is the identifier of the multi-source data;

Obtaining a multi-source data fusion decision model according to prior distribution of multi-source data by a Bayesian theorem formula, wherein the multi-source data fusion decision model is as follows:

p(m|d) = const * p(d|m) p(m);

wherein p(m|d) is the posterior distribution representing the decision probability of the prediction problems and uncertainty problems in geoscience research, p(d|m) is the likelihood function representing the linear or nonlinear relationship between the multi-source data and the observation information, p(m) is the prior distribution of the multi-source data, d is the identifier of the observation information, m is the identifier of the multi-source data, and const is a constant coefficient.
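The product-of-priors fusion and the Bayes update above can be illustrated numerically. The following is a minimal sketch, not the patented implementation: it assumes a scalar m on a uniform grid, two Gaussian priors, and a Gaussian likelihood with d = m plus noise. The function names `gaussian_pdf` and `fuse_on_grid` and all numeric values are hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Normal density, evaluated element-wise."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def fuse_on_grid(m_grid, priors, likelihood):
    """Return the normalized posterior p(m|d) sampled on a uniform 1-D grid."""
    prior = np.prod(np.vstack(priors), axis=0)        # p(m) = p_1(m) p_2(m) ... p_n(m)
    unnormalized = likelihood * prior                 # p(d|m) p(m)
    dm = m_grid[1] - m_grid[0]
    return unnormalized / (unnormalized.sum() * dm)   # const = 1 / integral

m_grid = np.linspace(-5.0, 15.0, 2001)                # grid step 0.01
priors = [gaussian_pdf(m_grid, 2.0, 1.0),             # prior from source 1 (assumed)
          gaussian_pdf(m_grid, 4.0, 1.0)]             # prior from source 2 (assumed)
d_obs = 5.0
likelihood = gaussian_pdf(d_obs, m_grid, 1.0)         # assumes d = m + Gaussian noise
posterior = fuse_on_grid(m_grid, priors, likelihood)
m_map = m_grid[np.argmax(posterior)]                  # most probable m on the grid
```

In this Gaussian special case the posterior mean has the closed form (2 + 4 + 5) / 3, so the grid maximum should fall next to 11/3, which gives a quick sanity check on the normalization constant.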

As a preferable scheme of the invention, the decision probability is obtained by carrying out weighted average on the posterior distribution through the decision weight of prior information of the multi-source data and the decision weight of observation information of the multi-source data.

As a preferred embodiment of the present invention, the method for determining the decision weight of the prior information of the multi-source data includes:

Calculating the dispersion between the prior information of the multi-source data and the observation information of the multi-source data, and using the dispersion to quantify the instability of each kind of data in the multi-source data, wherein the dispersion is calculated with a similarity function and the instability is:

K_i = Dis(m_{old,i}, m_{new,i});

wherein K_i is the instability of the i-th data in the multi-source data, m_{old,i} is the prior information of the i-th data in the multi-source data, m_{new,i} is the observation information of the i-th data in the multi-source data, and Dis(m_{old,i}, m_{new,i}) is the Euclidean distance between m_{old,i} and m_{new,i};

Performing min-max normalization on the instability to map it linearly onto the interval [0, 1], wherein the normalized K_i is: K_i' = (K_i - minK) / (maxK - minK), wherein minK is the minimum value and maxK is the maximum value of the instability of the n data in the multi-source data;

according to the instability of the multi-source data, constructing decision weights of prior information of the multi-source data which are adaptively adjusted according to the instability, wherein the decision weights of the prior information of the multi-source data are as follows:

W_i = -(1 - K_i')^r * log K_i';

wherein W_i is the decision weight of the prior information of the i-th data in the multi-source data, K_i' is the instability of the i-th data in the multi-source data after min-max normalization, i is a counting variable, and r is a constant exponent greater than 1.
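The chain from instability to prior decision weight can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the logarithm is taken as the natural logarithm (the text does not give a base), r = 2 is an arbitrary choice, and the small clipping constant `eps` is added here only because log K_i' is undefined when the normalized instability is exactly 0. The name `prior_decision_weights` is hypothetical.

```python
import numpy as np

def prior_decision_weights(m_old, m_new, r=2.0, eps=1e-6):
    """Compute K_i, normalized K_i' and W_i for n data sources.

    m_old, m_new: arrays of shape (n_sources, n_features) holding the prior
    and the observation information of each source.
    """
    m_old = np.asarray(m_old, dtype=float)
    m_new = np.asarray(m_new, dtype=float)
    k = np.linalg.norm(m_old - m_new, axis=1)        # K_i: Euclidean distance
    if k.max() == k.min():
        raise ValueError("min-max normalization needs at least two distinct K_i")
    k_norm = (k - k.min()) / (k.max() - k.min())     # K_i' in [0, 1]
    k_safe = np.clip(k_norm, eps, 1.0)               # assumption: avoid log(0)
    w = -(1.0 - k_safe) ** r * np.log(k_safe)        # W_i = -(1 - K_i')^r * log K_i'
    return k, k_norm, w

m_old = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
m_new = [[3.0, 4.0], [6.0, 8.0], [0.0, 1.0]]
k, k_norm, w = prior_decision_weights(m_old, m_new)
```

With these sample values the distances are 5, 10 and 1, so the second source is the least stable and receives weight 0, while the third source is the most stable and receives the largest weight. The size of that largest weight depends on `eps`, which is a choice made for this sketch only.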

As a preferred embodiment of the present invention, the method for determining decision weights of observation information of multi-source data includes:

Obtaining the decision weight of the observation information of the multi-source data according to the decision weight of the prior information of the multi-source data;

the decision weight of the observation information of the multi-source data is as follows:

M_i = 1 - W_i;

wherein M_i is the decision weight of the observation information of the i-th data in the multi-source data, W_i is the decision weight of the prior information of the i-th data in the multi-source data, and i is the counting variable.

As a preferred embodiment of the present invention, the method for mathematizing the prior information of the multi-source data includes:

when the data type of the prior information of the multi-source data is the continuous data type, mathematizing the prior information by approximating it, according to its distribution type, as a Gaussian distribution or a Cauchy distribution;

when the data type of the prior information of the multi-source data is the categorical data type, numbering and classifying the categories with a Markov random field, establishing a category random field, and mathematizing the prior information through a Gibbs distribution;

and when the data type of the prior information of the multi-source data is the empirical data type, converting the prior information into the form of a count matrix and then mathematizing it with the transition probability matrix of a Markov process.
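The three type-dependent mathematizations above can each be sketched in a few lines. The example below is illustrative only and rests on assumptions not stated in the text: the Gibbs distribution uses a Potts-style energy on a short 1-D chain (the patent does not specify the energy function or the neighborhood), and the count matrix is assumed to have no all-zero rows. All function names and numeric values are hypothetical.

```python
import itertools
import math
import numpy as np

def gaussian_prior(x, mean, std):
    """Continuous data: Gaussian prior density."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2.0 * math.pi))

def cauchy_prior(x, loc, scale):
    """Continuous data: heavier-tailed Cauchy prior density."""
    return 1.0 / (math.pi * scale * (1.0 + ((x - loc) / scale) ** 2))

def gibbs_prior(n_sites, n_classes, beta):
    """Categorical data: Gibbs distribution over label sequences on a 1-D chain.

    Energy (assumed Potts form) = beta * number of neighboring sites whose
    class numbers differ, so smoother label sequences get higher probability.
    """
    configs = list(itertools.product(range(n_classes), repeat=n_sites))
    energies = np.array([beta * sum(a != b for a, b in zip(c[:-1], c[1:]))
                         for c in configs], dtype=float)
    weights = np.exp(-energies)
    return dict(zip(configs, weights / weights.sum()))

def transition_matrix(counts):
    """Empirical data: row-normalize a count matrix into transition probabilities."""
    counts = np.asarray(counts, dtype=float)
    return counts / counts.sum(axis=1, keepdims=True)   # assumes no all-zero row

p_labels = gibbs_prior(n_sites=3, n_classes=2, beta=1.0)
# Hypothetical counts of how often class i is observed to be followed by class j.
P = transition_matrix([[8, 2], [3, 9]])
```

Each row of `P` sums to 1 and can be read as the probability of moving from one category to the next, which is how tallied expert experience becomes a usable prior.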

As a preferred scheme of the invention, the invention provides a Bayesian statistics-based geoscience multi-source data fusion system, which is applied to the Bayesian statistics-based geoscience multi-source data fusion method, the system comprising:

The data preprocessing module is used for determining multi-source data from the geological, geophysical, geochemical and remote-sensing fields of geoscience research, wherein the data types of the multi-source data comprise a continuous data type, a categorical data type and an empirical data type;

The data acquisition module is used for acquiring prior information of the multi-source data and observation information of the multi-source data;

The model building module is used for constructing, through a Bayesian statistical framework, a probability model that fuses the multi-source data and is used for quantitative decision-making on the prediction problems and uncertainty problems in geoscience research, thereby obtaining a multi-source data fusion decision model that characterizes the mapping from the prior information and the observation information of the multi-source data to the decision probability of those prediction problems and uncertainty problems;

The mathematical processing module is used for mathematizing the prior information of the multi-source data through a prior model chosen according to the data type of the multi-source data;

The decision output module is used for inputting the mathematized prior information of the multi-source data and the observation information of the multi-source data into the multi-source data fusion decision model to obtain the decision probability for the prediction problems and uncertainty problems in geoscience research.
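The type-dependent routing performed by the mathematical processing module could be organized as a simple lookup. This is a hypothetical sketch: the names `DataType`, `Source`, `PRIOR_MODEL` and `select_prior_model` are illustrative and do not come from the patent. The three example sources follow the data categories named in the description (logging, lithology, geological knowledge).

```python
from dataclasses import dataclass
from enum import Enum

class DataType(Enum):
    CONTINUOUS = "continuous"
    CATEGORICAL = "categorical"
    EMPIRICAL = "empirical"

@dataclass
class Source:
    name: str
    data_type: DataType

# Prior model used to mathematize each data type, as listed in the method.
PRIOR_MODEL = {
    DataType.CONTINUOUS: "Gaussian or Cauchy distribution",
    DataType.CATEGORICAL: "Gibbs distribution on a Markov random field",
    DataType.EMPIRICAL: "Markov transition probability matrix",
}

def select_prior_model(source: Source) -> str:
    """Return the name of the prior model for a source's data type."""
    return PRIOR_MODEL[source.data_type]

sources = [
    Source("drilling and logging data", DataType.CONTINUOUS),
    Source("lithology data", DataType.CATEGORICAL),
    Source("geological knowledge", DataType.EMPIRICAL),
]
plan = {s.name: select_prior_model(s) for s in sources}
```

Keeping the mapping in one table means a new data type only needs one new entry, which is one possible way to keep the preprocessing and mathematization modules loosely coupled.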

As a preferred scheme of the invention, the model building module builds the multi-source data fusion decision model as follows:

p(m|d) = const * p(d|m) p(m);

As a preferable scheme of the invention, the model building module obtains the decision probability by carrying out weighted average on the posterior distribution through the decision weight of prior information of the multi-source data and the decision weight of observation information of the multi-source data.

As a preferred solution of the present invention, the mathematical processing module mathematizes the prior information of the multi-source data, including:

Compared with the prior art, the invention has the following beneficial effects:

The invention introduces a Bayesian statistical method into the fusion of geoscience multi-source data. It can effectively fuse the multi-source data, convert prior information that is difficult to express mathematically into various probability distributions, improve data utilization efficiency, and improve the decision accuracy for prediction problems and uncertainty problems in geoscience research.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It will be apparent to those of ordinary skill in the art that the drawings in the following description are exemplary only and that other implementations can be obtained from the extensions of the drawings provided without inventive effort.

FIG. 1 is a flowchart of a method for geoscience multisource data fusion provided by an embodiment of the present invention;

FIG. 2 is a block diagram of a geoscience multi-source data fusion system provided by an embodiment of the invention.

Detailed Description

The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

As shown in FIG. 1, the invention provides a Bayesian statistics-based geoscience multi-source data fusion method, which comprises the following steps:

In order to realize quantitative decision-making on the prediction problems and uncertainty problems in geoscience research based on multi-source data, a probability model is constructed using Bayesian statistics, and data from different sources (namely multi-source data from the geological, geophysical, geochemical and remote-sensing fields) are fused. The probability model formed from the Bayesian formula (namely the multi-source data fusion decision model) is used to calculate and analyze the posterior probability distribution and obtain the decision probability, thereby solving the prediction problems and uncertainty problems in geoscience.

The multi-source data fusion decision model obtains the decision probability from the prior information and the observation information of the multi-source data. It mathematizes the experience of geologists (prior information) and combines it with the current actual conditions (observation information), which improves data utilization efficiency, promotes fusion of the multi-source data, improves the stability of Bayesian classification, and enhances decision accuracy.

Furthermore, in the multi-source data fusion decision model established on the Bayesian formula, the decision power (the ability to influence the posterior distribution) of the prior information and of the observation information of the multi-source data is quantified adaptively, so that the accuracy of the multi-source data fusion decision model is maintained and its robustness is enhanced.

Specifically, the decision power of the prior information of the multi-source data is related to the temporal dispersion of the multi-source data, which reflects how stable the data are. The poorer the stability (the higher the temporal dispersion and the higher the instability value), the less reliable the prior information, and the smaller its ability to influence the posterior distribution, that is, its decision power, should be. Conversely, the better the stability (the lower the temporal dispersion and the lower the instability value), the more reliable the prior information, and the greater its decision power should be.

In summary, the invention adaptively assigns decision power to the prior information of the multi-source data according to the stability of the data: the higher the instability, the smaller the decision weight and the smaller the ability to influence the decision result, which safeguards the accuracy of the decision probability in the multi-source data fusion decision model.

Similarly, the more unstable the multi-source data, the less the decision should depend on the prior information and the more it should rely on the current observation information in order to obtain the best decision accuracy.

By adaptively adjusting the decision weights during multi-source data fusion, the invention realizes quantitative decision-making that fuses multi-source data from the geological, geophysical, geochemical and remote-sensing fields while ensuring the accuracy of that quantitative decision-making.

Multisource data includes logging information, lithology information, geological knowledge, and the like, wherein lithology data is typically of the categorical data type, drilling logging data is typically of the continuous data type, and geological knowledge is typically of the empirical data type.

In order to realize quantitative decision-making on the prediction problems and uncertainty problems in geoscience research based on multi-source data, a probability model is constructed using Bayesian statistics, data from different sources (namely multi-source data from the geological, geophysical, geochemical and remote-sensing fields) are fused, and the probability model formed from the Bayesian formula (namely the multi-source data fusion decision model) is used to calculate and analyze the posterior probability distribution and obtain the decision probability, thereby solving the prediction problems and uncertainty problems in geoscience. This specifically comprises the following:

The construction method of the multi-source data fusion decision model comprises the following steps:

p(m|d) = const * p(d|m) p(m);

In essence, d is the observation information, and d can be related to m through a linear or nonlinear relation. Denoting the operator of this relation by G, the observation data are in general d = Gm. p(m|d) expresses the posterior distribution of the decision probability of the prediction problems and uncertainty problems in geoscience research given the observation data d, that is, the distribution of m that is to be calculated.
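For the special case in which the operator G is a matrix and both the prior and the observation noise are Gaussian, the posterior has a well-known closed form. The sketch below illustrates that special case only; the Gaussian assumptions, the covariance matrices `C_m` and `C_d`, the prior mean `m0`, and the function name are all hypothetical and not taken from the patent, which also allows nonlinear relations.

```python
import numpy as np

def linear_gaussian_posterior(G, d, m0, C_m, C_d):
    """Posterior mean and covariance of m for d = G m + noise.

    Assumes prior m ~ N(m0, C_m) and noise ~ N(0, C_d).
    """
    G = np.asarray(G, dtype=float)
    d = np.asarray(d, dtype=float)
    m0 = np.asarray(m0, dtype=float)
    C_m_inv = np.linalg.inv(np.asarray(C_m, dtype=float))
    C_d_inv = np.linalg.inv(np.asarray(C_d, dtype=float))
    # Posterior precision is the sum of the data precision and the prior precision.
    post_cov = np.linalg.inv(G.T @ C_d_inv @ G + C_m_inv)
    post_mean = post_cov @ (G.T @ C_d_inv @ d + C_m_inv @ m0)
    return post_mean, post_cov

G = np.eye(2)                      # identity operator: d observes m directly
d = np.array([2.0, 4.0])
m0 = np.zeros(2)
mean, cov = linear_gaussian_posterior(G, d, m0, np.eye(2), np.eye(2))
```

With equal prior and noise variances the posterior mean lies halfway between the prior mean and the observation, and the posterior variance is halved, which matches the intuition that the two information sources are weighted by their reliability.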

And carrying out weighted average on the posterior distribution through the decision weight of the prior information of the multi-source data and the decision weight of the observation information of the multi-source data to obtain the decision probability.

In the multi-source data fusion decision model established on the Bayesian formula, the decision power (the ability to influence the posterior distribution) of the prior information and of the observation information of the multi-source data is quantified adaptively, so that the accuracy of the multi-source data fusion decision model is maintained and its robustness is enhanced. This specifically comprises the following:

the method for determining the decision weight of the prior information of the multi-source data comprises the following steps:

K_i = Dis(m_{old,i}, m_{new,i});

W_i = -(1 - K_i')^r * log K_i';

The method for determining the decision weight of the observation information of the multi-source data comprises the following steps:

M_i = 1 - W_i;

In order to achieve the fusion of the multi-source data, the invention mathematizes the multi-source data, specifically as follows:

The method for mathematizing the prior information of the multi-source data comprises:

As shown in FIG. 2, the present invention provides a Bayesian statistics-based geoscience multi-source data fusion system, which is applied to the Bayesian statistics-based geoscience multi-source data fusion method, and the geoscience multi-source data fusion system includes:

The model building module builds the multi-source data fusion decision model as follows:

p(m|d) = const * p(d|m) p(m);

The model building module obtains the decision probability by carrying out weighted average on the posterior distribution through the decision weight of prior information of the multi-source data and the decision weight of observation information of the multi-source data.

The mathematical processing module mathematizes the prior information of the multi-source data, including:

The above embodiments are only exemplary embodiments of the present application and are not intended to limit the present application, the scope of which is defined by the claims. Various modifications and equivalent arrangements of this application will occur to those skilled in the art, and are intended to be within the spirit and scope of the application.

Claims

1. The geoscience multi-source data fusion method based on Bayesian statistics is characterized by comprising the following steps of:

2. The Bayesian statistics-based geoscience multi-source data fusion method according to claim 1, wherein the construction method of the multi-source data fusion decision model comprises:

p(m|d) = const * p(d|m) p(m);

3. The Bayesian statistics-based geoscience multi-source data fusion method according to claim 1, wherein the decision probability is obtained by taking a weighted average of the posterior distribution using the decision weight of the prior information of the multi-source data and the decision weight of the observation information of the multi-source data.

4. The Bayesian statistics-based geoscience multi-source data fusion method according to claim 1, wherein the method for determining the decision weight of the prior information of the multi-source data comprises:

K_i = Dis(m_{old,i}, m_{new,i});

W_i = -(1 - K_i')^r * log K_i';

5. The Bayesian statistics-based geoscience multi-source data fusion method according to claim 1, wherein the method for determining the decision weight of the observation information of the multi-source data comprises:

M_i = 1 - W_i;

6. The Bayesian statistics-based geoscience multi-source data fusion method according to claim 1, wherein the method for mathematizing the prior information of the multi-source data comprises:

7. A Bayesian statistics-based geoscience multi-source data fusion system, characterized in that it is applied to the Bayesian statistics-based geoscience multi-source data fusion method according to any one of claims 1-6, said geoscience multi-source data fusion system comprising:

8. A Bayesian statistics-based geoscience multi-source data fusion system according to claim 1, wherein the model building module builds the multi-source data fusion decision model as follows:

p(m|d) = const * p(d|m) p(m);

9. A Bayesian statistics-based geoscience multi-source data fusion system according to claim 1, wherein the model building module obtains the decision probability by taking a weighted average of the posterior distribution using the decision weight of the prior information of the multi-source data and the decision weight of the observation information of the multi-source data.

10. A Bayesian statistics-based geoscience multi-source data fusion system according to claim 1, wherein the mathematical processing module mathematizes the prior information of the multi-source data, including: