CN114429140A - Case cause identification method and system for causal inference based on related graph information - Google Patents

Case cause identification method and system for causal inference based on related graph information

Info

Publication number
CN114429140A
CN114429140A (application CN202210178807.9A)
Authority
CN
China
Prior art keywords
causal
case
graph
strength
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210178807.9A
Other languages
Chinese (zh)
Inventor
李玉军
郭润东
贲晛烨
胡伟凤
赵思文
刘保臣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University filed Critical Shandong University
Priority to CN202210178807.9A priority Critical patent/CN114429140A/en
Publication of CN114429140A publication Critical patent/CN114429140A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Systems or methods specially adapted for specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/18Legal services; Handling legal documents

Abstract

The invention provides a case cause identification method and system that perform causal inference based on related graph information. The method comprises: obtaining the fact description of a case; constructing a causal graph from the fact description; performing causal discovery on the constructed causal graph with the GFCI algorithm and sampling it to obtain causal subgraphs; and denoising the sampled causal subgraphs and adding them to the loss to obtain the case cause identification result. Constructing the causal graph comprises obtaining the keywords of the case with the KeyBERT algorithm and clustering them. By identifying case causes through causal inference over a causal graph, the invention makes full use of the unstructured information in case fact descriptions, better distinguishes the similarities and differences between cases, effectively alleviates different judgments for cases of the same type, and improves the accuracy of case cause identification; the model has few parameters, trains quickly, is easy to deploy and can be implemented rapidly.

Description

Case cause identification method and system for causal inference based on related graph information
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a case cause identification method and system for causal inference based on related graph information.
Background
As research on legal documents deepens, research on judicial documents is also growing. The case cause is formed by the court summarizing the nature of the legal relationship involved in a case, and in scenarios with practical application requirements, such as case cause prediction or legal provision recommendation, the case cause often has to be determined from the description text of a judicial document.
Since the structure of judicial documents is similar to that of ordinary documents, much previous research on documents can be migrated directly to research on judicial documents. At present, however, when determining the case cause, a professional usually reads the judicial document manually and analyzes its description to determine the corresponding case cause. Such manual methods are inefficient and easily affected by differences in professional skill, so the accuracy of the results cannot be guaranteed, and determination of the case cause through automatic analysis has not yet been achieved.
Moreover, besides some structured data, there is much unstructured information in case fact descriptions waiting to be mined beyond manual extraction, of which two items are key: the behavior of the case subject and the characteristics of the case subject. Mining these two kinds of information from the unstructured text makes the case cause identification results more convincing. In addition, traditional deep neural networks have been found to have drawbacks: they often produce erroneous results when facing similar case facts and small-sample problems, because they learn mainly from features and probabilities and the learned features may be wrong.
Therefore, a case cause identification method and system for causal inference based on related graph information are needed.
Disclosure of Invention
In order to solve the above problems, the present invention proposes a case cause identification method and system for causal inference based on related graph information.
Interpretation of terms:
KeyBERT: proposed by Prafull et al. in 2019, KeyBERT uses BERT embeddings and simple cosine similarity to automatically extract the keywords and key phrases that best describe a short text corpus, which saves the time of manual keyword annotation and requires no prior knowledge (a usage sketch follows these definitions).
GFCI: the GFCI algorithm was proposed by Ogarrio et al. in 2016. It combines the constraint-based causal discovery algorithm FCI (Fast Causal Inference) with the score-based algorithm FGES: within the FCI framework, the search is initialized with FGES to improve accuracy and efficiency.
BIC: based on Bayesian decision theory, an important component of subjective Bayesian inference theory. Under incomplete information, subjective probabilities are estimated for the partially unknown states, the probabilities of occurrence are then revised with the Bayes formula, and finally the optimal decision is made using the expected values and the revised probabilities.
PAG: a hybrid graph that contains the common features of all directed acyclic graphs (DAGs) representing the same conditional independence relationships among the measured variables. In other words, the PAG covers all the possible valid DAGs for the raw data.
BiLSTM: a bidirectional long short-term memory network, a basic model.
BiLSTM-Att model: BiLSTM with an attention mechanism, an existing neural network architecture.
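For illustration only (not part of the original description), a minimal keyword-extraction sketch of the kind described above, using the open-source keybert package, might look as follows; the example document, model choice and parameter values are assumptions rather than values specified by the invention.

    # Illustrative sketch only: extract case keywords with the open-source KeyBERT package.
    # The document text and all parameter values are assumptions, not taken from the patent.
    from keybert import KeyBERT

    doc = "The defendant drank heavily, then drove a car and caused a traffic accident ..."  # hypothetical fact description
    kw_model = KeyBERT()  # uses a default sentence-transformers BERT model for document and word embeddings
    keywords = kw_model.extract_keywords(
        doc,
        keyphrase_ngram_range=(1, 2),  # consider 1-grams and 2-grams
        top_n=5,                       # keep the p phrases most similar to the document
    )
    print(keywords)                    # list of (phrase, cosine similarity) pairs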
According to some embodiments, the invention adopts the following technical scheme:
a case identification method for causal inference based on correlation diagram information comprises the following steps:
acquiring fact description of a case, including a case main body, a case scenario and a case result;
extracting key information according to the fact description of the case;
constructing a causal graph by using a GFCI algorithm and sampling to obtain a causal graph, wherein the causal graph comprises key words of cases obtained by using a KeyBERT algorithm and clustering the key words of the cases;
estimating the causal strength of the edges of the causal graph, and obtaining the causal strength of the causal graph by combining a BIC algorithm to obtain a judgment result;
constructing auxiliary loss by using the causal strength of the causal graph, and improving the performance of the BilSTM-Att model to obtain a trained BilSTM-Att model;
and acquiring the fact description of the case to be determined, and inputting the fact description into the trained BilSTM-Att model to obtain a case determination result.
Further, obtaining the keywords of the case through the KeyBERT algorithm comprises extracting the document embedding with BERT to obtain a document-level vector representation, extracting word vectors for the N-gram words, and finding the words most similar to the document by cosine similarity.
Further, the clustering of the keywords of the cases comprises dividing the keywords of the cases into K groups, randomly selecting K objects as initial clustering centers, and calculating the distance between each object and each clustering center to allocate each object to the nearest clustering center.
Further, the constructing the causal graph further comprises judging whether to establish the edge and the type of the edge through causal inference.
Further, estimating the causal strength of the edges of the causal subgraphs and combining them with the BIC algorithm to obtain the causal strength of the causal graph and a decision result comprises the following steps:
estimating the causal strength of each edge of each causal subgraph with ATE (Average Treatment Effect) to obtain the edge strength ψ^{G_q}(T_j→Y_i);
evaluating the quality of each causal subgraph with BIC to obtain BIC(G_q, X);
calculating the total causal strength by combining the quality BIC(G_q, X) of each causal subgraph with the edge strengths ψ^{G_q}(T_j→Y_i) (the combining formula appears in the original as an image);
wherein Y_i indicates that the case involves case cause c_i, and ψ^{G_q}(T_j→Y_i) is the causal strength of T_j→Y_i in causal graph G_q, which is 0 if T_j→Y_i is not present in G_q;
and calculating the score of each case cause from the total causal strength and sending the scores to a random forest model to judge which case cause the causal graph corresponds to.
Further, calculating the score of each case cause from the total causal strength comprises: for the case cause denoted by Y_i, the score S(Y_i) is computed from the total causal strengths (the scoring formula appears in the original as an image), where τ(T_j) indicates whether T_j is present in the case and Tr(Y_i) is the treatment set of Y_i.
Further, constructing an auxiliary loss with the causal strength of the causal graph to improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model comprises:
performing word embedding on the fact description of the case;
inputting the embedding result into the BiLSTM;
in the loss calculation stage, introducing an auxiliary loss: the causal strengths corresponding to the causal subgraphs obtained earlier are introduced to assist the attention part so as to obtain a better effect. The auxiliary term L_cons appears in the original as an image, and the total loss is
L = L_cross + α·L_cons
where a_i is the attention weight of each word, g_i is the normalized causal strength, and L_cross is the cross-entropy loss. The auxiliary loss enables the attention to achieve a better effect, thereby improving the effect of the model.
A case cause identification system for causal inference based on related graph information comprises:
a data acquisition module configured to obtain the fact description of the case, including the case subject, the case circumstances and the case result;
a key information extraction module configured to obtain the keywords of the case through the KeyBERT algorithm and cluster the keywords;
a sampling module configured to construct a causal graph with the GFCI algorithm and sample it to obtain causal subgraphs;
a noise reduction module configured to construct an auxiliary loss with the causal strength of the causal graph, improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model;
and a case cause identification module configured to obtain the fact description of the case to be identified and input it into the trained BiLSTM-Att model to obtain the case cause identification result.
A computer-readable storage medium having stored therein a plurality of instructions adapted to be loaded by a processor of a terminal device and to execute said method of case identification for causal inference based on correlation map information.
A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; the computer readable storage medium stores instructions adapted to be loaded by a processor and to perform the case identification method for causal inference based on correlation map information.
Compared with the prior art, the invention has the following beneficial effects:
the invention provides a method for causal inference by using a causal graph to identify cases, which fully utilizes unstructured information in case fact description, better distinguishes the similarity and difference of different cases, effectively solves the situation of different case judgment of the same type, improves the accuracy of case identification, has less model parameter quantity, high training speed, is convenient to deploy and can be quickly realized.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the application and, together with the description, serve to explain the application and are not intended to limit the application.
FIG. 1 is a block flow diagram of the overall framework of the case cause identification method for causal inference based on related graph information;
FIG. 2 is a schematic flow diagram of the case cause identification method for causal inference based on related graph information;
FIG. 3 is a schematic diagram of the BiLSTM-Att model used for case cause identification.
Detailed Description
The invention is further described with reference to the following figures and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of example embodiments according to the present application. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, and it should be understood that when the terms "comprises" and/or "comprising" are used in this specification, they specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof, unless the context clearly indicates otherwise.
Example 1
As shown in fig. 2, a case cause identification method for causal inference based on related graph information comprises:
S1: obtaining the fact description of a case;
S2: extracting key information from the fact description of the case, which comprises obtaining the keywords of the case with the KeyBERT algorithm and clustering the keywords.
S3: constructing a causal graph with the GFCI algorithm and sampling it to obtain causal subgraphs;
S4: estimating the causal strength of the edges of the causal subgraphs and combining them with the BIC algorithm to obtain the causal strength of the causal graph and a decision result;
S5: constructing an auxiliary loss with the causal strength of the causal graph to improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model;
S6: obtaining the fact description of the case to be identified and inputting it into the trained BiLSTM-Att model to obtain the case cause identification result; S6 is shown in fig. 3.
Fig. 1 is a schematic flow diagram of the overall framework of the case cause identification method for causal inference based on related graph information. In the first step, the most relevant keywords are extracted for the case causes y1 and y2 and clustered into A, B, C and D. In the second step a causal graph is established, i.e. it is determined whether an edge needs to be established between each pair of nodes and what type the edge is. In the third step the causal graph is sampled according to the edge types to obtain causal subgraphs, and in the fourth step the strength of the edges is evaluated for each causal subgraph. In the fifth step the quality of each causal subgraph is evaluated with BIC, and the causal strengths of the edges of the causal graph are calculated by combining the quality of each causal subgraph with the edge strengths within it. A score can be calculated for each case cause from the causal graph and used with a random forest algorithm to determine the final case cause. In this method, however, the score is not used to judge the result directly; instead, the causal strengths of the causal graph are integrated, in combination with an attention mechanism, into the training of a neural network to improve the result of the neural network.
Example 2
The case cause identification method for causal inference based on related graph information according to embodiment 1 differs in that:
the specific implementation process of S1 is as follows:
the fact description of the case is obtained, and various facts and specific plots comprising the case are formed, including the case body, generally an individual or an organization. There are also the behaviour and the various plots of the subject of the case, and the consequences of the harm caused, i.e. the fact that the object and the specific object that the behaviour is harmful to, and the severity of, the cause and effect relationship between the behaviour and the result. There is also the motivation and purpose of the case.
Example 3
The case cause identification method for causal inference based on related graph information according to embodiment 1 differs in that:
In S2, for structured text such as judicial documents, the KeyBERT algorithm is used to extract the key information in the description: a document embedding is extracted with BERT to obtain a document-level vector representation, word vectors are then extracted for the N-gram words, and cosine similarity is used to find the words most similar to the document. The algorithm thereby calculates the importance of each word to the case. Meanwhile, in order to distinguish different case causes, the p words most important to the case cause, i.e. the p words with the highest cosine similarity, are selected and aggregated into q classes of keywords with similar characteristics; the words are clustered with the K-means clustering algorithm. The specific implementation process of S2 is as follows:
S21: extracting a document vector (embedding) from the fact description of the case with BERT to obtain a document-level representation;
S22: extracting word vectors for the N-gram words/phrases through an N-gram language model. An n-gram refers to n words that occur consecutively in a text; an n-gram model is a probabilistic language model based on an (n-1)-order Markov chain that infers the structure of a sentence from the probability of n consecutive words. Taking the text "I am a good person" as an example, it is first segmented into the tokens "I", "am", "a", "good person"; for a bigram model, combining adjacent tokens in order yields "I am", "am a", "a good person".
S23: searching the most similar words/phrases to the document by using the cosine similarity, and defining the searched most similar words/phrases as the words which can describe the whole document most; the method specifically comprises the following steps:
the cosine similarity is obtained by calculating two(Vector)Angle of (2)CosineValue to evaluate themDegree of similarity. For a word vector, it represents the semantic similarity between two words. For both word vectors A, B, the remaining chord similarity calculation formula is:
Figure RE-GDA0003570568370000061
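The two steps above (forming N-grams and ranking them by cosine similarity against the document vector) can be illustrated with a short sketch; the token list and vectors are placeholders, not data from the patent.

    import numpy as np

    def ngrams(tokens, n):
        # contiguous n-grams from a pre-segmented token list (cf. the bigram example in S22)
        return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

    def cosine_similarity(a, b):
        # cos(theta) = (A . B) / (||A|| * ||B||)
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    tokens = ["I", "am", "a", "good person"]
    print(ngrams(tokens, 2))           # ['I am', 'am a', 'a good person']

    doc_vec = np.random.rand(768)      # placeholder document embedding
    word_vec = np.random.rand(768)     # placeholder word/phrase embedding
    print(cosine_similarity(doc_vec, word_vec))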
s24: selecting p most important words for each case, and clustering the words into q classes by adopting a K mean algorithm;
The data D = {x_1, x_2, …, x_i, …, x_p} are first randomly divided into q groups C = {C_1, C_2, …, C_i, …, C_q}, where x_1, x_2, …, x_p are the p keywords with the highest cosine similarity obtained in step S23;
q objects are randomly selected as the initial cluster centers {u_1, u_2, …, u_j, …, u_q}, and the distance d_ij between each object x_i and each cluster center u_j is calculated as
d_ij = ‖x_i - u_j‖ = sqrt( Σ_k (x_ik - u_jk)² )
Each object is assigned to the cluster center u_j closest to it; a cluster center together with the objects assigned to it represents a cluster. After all samples have been assigned, the cluster center of each cluster is recalculated from the objects currently in the cluster as
u_j = (1 / |C_j|) Σ_{x_i ∈ C_j} x_i
This process is repeated until some termination condition is met. The termination condition may be that no (or a minimum number of) objects are reassigned to different clusters, that no (or a minimum number of) cluster centers change again, or that the sum of squared errors is locally minimal.
Example 4
The case cause identification method for causal inference based on related graph information according to embodiment 1 differs in that:
the q-type key and the M-type table together construct a causal graph. All nodes are binary, when the graph is applied to a case, 1 indicates that the element exists in the case, and 0 does not. Unlike the extraction of manual features, the automatic extraction of keywords may not be complete, which can lead to unobservable confusion in causal discovery. The specific implementation process of S3 is as follows:
s31: taking the extracted q-type keywords and all case relations as nodes of a causal graph, judging whether causal relations exist among the nodes or not, and if yes, establishing edge; besides the causal relationships, some practical situations are also considered. First, the identification of case routing is certainly described based on the fact that case routing is the result of final identification, so that there is no possibility of edges pointing from case routing to other nodes. While time may be considered, for a causal relationship, the cause is certainly in front of the result, and the fact descriptions in the company document are usually written in chronological order, so that the chronological order of the descriptions can be considered as a time constraint for filtering the noise edges. In most cases, if the factor A appears after B, the edge from A to B will not be allowed. However, this does not affect the edge from B to A, so the chronological order is not a sufficient condition for causality.
FCI takes sample data and optional background knowledge as input and is guaranteed to output the Markov equivalence class representing the true causal DAG; it is divided into two phases, adjacency and orientation. It does not rely on the assumption that there are no latent confounders, but it performs relatively poorly, especially on real data. FGES, on the other hand, greedily searches over potential DAGs and outputs the highest-scoring graph it finds. It is fast and accurate when its requirements are met, but it is based on the premise that no latent confounders exist.
GFCI takes the output of FGES as an initial graph and refines it with the adjacency phase of FCI, in which some adjacencies are removed by conditional independence tests. A similar procedure is used in the orientation phase: the FGES orientations are taken as initialization and FCI orientations are further applied. The proof of Ogarrio et al. (2016) guarantees that the GFCI algorithm outputs a PAG (partial ancestral graph) representing the true causal DAG. The output PAG contains all the valid DAG instances that can be derived from the raw data.
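The two practical constraints described in S31 (no edge may point away from the case cause, and an edge is disallowed when its supposed cause appears only after its effect in the description) could be applied as a filter over candidate edges before or after causal discovery. The sketch below illustrates only those stated rules; run_gfci is a hypothetical wrapper, since the patent does not name a specific GFCI implementation.

    # Illustrative edge filter based only on the constraints stated in S31.
    # first_mention[node]: position at which the node's keyword first appears in the
    # fact description (assumed preprocessing output); case_causes: set of case-cause nodes.
    def edge_allowed(edge, first_mention, case_causes):
        src, dst = edge
        if src in case_causes:
            return False                 # the case cause is the final result: no outgoing edges
        if src in first_mention and dst in first_mention:
            if first_mention[src] > first_mention[dst]:
                return False             # a cause should not appear only after its effect
        return True

    # candidate_edges = run_gfci(binary_case_matrix)   # hypothetical GFCI wrapper
    # edges = [e for e in candidate_edges if edge_allowed(e, first_mention, case_causes)]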
S32: determining the type of edge: for node pairs for which a relationship has been determined, we also need to determine what type of causal relationship is exactly, and the specific relationship is shown in table 1, where table 1 is the type of all edges in the PAG; the type of the edge is determined according to the relationship type.
TABLE 1 (shown in the original as an image) lists the types of all edges in the PAG, namely →, ↔, o→ and o-o.
S33: sampling to obtain causal subgraphs: because of the uncertainty of the relationships, the causal graph itself is uncertain, so it is sampled to extract every possible causal graph, also called a causal subgraph. Different edge types are sampled in different ways. Specifically, of the four edge types, → and ↔ are determined: in every sampled graph an edge of type → is retained, while an edge of type ↔ is not retained. An edge of type o→ has two possible interpretations, so when sampled it is retained half of the time and discarded half of the time. Likewise, an edge of type o-o has three possibilities, each taken with probability 1/3.
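A sketch of this sampling rule; the PAG edge encoding and the graph representation are assumptions chosen for illustration.

    import random

    def sample_subgraph(pag_edges, rng=random):
        """pag_edges: list of (src, dst, kind) with kind in {'-->', '<->', 'o->', 'o-o'}.
        Returns the directed edges of one sampled causal subgraph."""
        sampled = []
        for src, dst, kind in pag_edges:
            if kind == '-->':                # determined causal edge: always retained
                sampled.append((src, dst))
            elif kind == '<->':              # determined non-causal (confounded) edge: never retained
                continue
            elif kind == 'o->':              # two possibilities: retained half of the time
                if rng.random() < 0.5:
                    sampled.append((src, dst))
            elif kind == 'o-o':              # three possibilities, 1/3 each
                r = rng.random()
                if r < 1 / 3:
                    sampled.append((src, dst))
                elif r < 2 / 3:
                    sampled.append((dst, src))
                # else: no edge in this sample
        return sampled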
Example 5
The case cause identification method for causal inference based on related graph information according to embodiment 1 differs in that:
the specific implementation process of S4 is as follows:
s41: estimation of causal strength is performed for each causal graph, and for edges (edge) in the causal graph, ATE (average Treatment effect) is used
Figure RE-GDA0003570568370000087
As the strength of the node T to node Y edge in graph G, it is evaluated using PSM (proportionality Score matching);
ATE evaluates the average outcome of individuals under an intervention, i.e. the difference between the observed outcome and the counterfactual outcome of an individual i under the intervention state.
The principle is that, for an edge T → Y, if the intervention changes T from 0 to 1, the expected change in Y is:
ψ_{T,Y} = E[Y | do(T=1)] - E[Y | do(T=0)]
where E denotes expectation and do(T=1) means that the intervention sets T to 1.
Propensity score matching (PSM) is a statistical method used mainly to reduce the effects of data bias and confounding variables, so that the groups being compared start from the same baseline. The two methods are combined to evaluate the causal strength; the estimator appears in the original as an image, in which the matched index denotes the most similar example to i in the opposite group, L is the likelihood function, and t_i, y_i, z_i are the values of the intervention, the outcome and the confounding factors of i.
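A minimal sketch of ATE estimation via propensity score matching for one edge T → Y, following the usual recipe (logistic-regression propensity scores, nearest-neighbour matching in the opposite group). The estimator actually used appears in the original only as an image, so this is an assumption-level illustration rather than the patented formula.

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def ate_psm(t, y, z):
        """t: binary intervention (0/1), y: outcome, z: confounders (n x d).
        Assumes both intervention groups are non-empty."""
        t = np.asarray(t)
        y = np.asarray(y, dtype=float)
        ps = LogisticRegression(max_iter=1000).fit(z, t).predict_proba(z)[:, 1]  # propensity scores
        effects = []
        for i in range(len(t)):
            opposite = np.where(t != t[i])[0]                       # candidates from the other group
            j = opposite[np.argmin(np.abs(ps[opposite] - ps[i]))]   # most similar example to i
            effects.append(y[i] - y[j] if t[i] == 1 else y[j] - y[i])
        return float(np.mean(effects))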
s42: and evaluating the quality of each causal subgraph on the basis of obtaining the causal strength of each causal subgraph, wherein BIC is used for evaluation.
BIC is a criterion for measuring the goodness of fit of a statistical model: BIC = k·ln(n) - 2·ln(L), where k is the number of model parameters, n is the number of samples and L is the likelihood function.
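This criterion can be computed directly from the quantities just named (a trivial sketch; the log-likelihood itself depends on the fitted model):

    import math

    def bic(k: int, n: int, log_likelihood: float) -> float:
        # BIC = k*ln(n) - 2*ln(L); lower values indicate a better fit/complexity trade-off
        return k * math.log(n) - 2.0 * log_likelihood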
The total causal strength is then calculated by combining the quality BIC(G_q, X) of each causal subgraph with the causal strengths ψ^{G_q}(T_j→Y_i) of its edges (the combining formula appears in the original as an image),
where Y_i indicates that the case involves case cause c_i, and ψ^{G_q}(T_j→Y_i) is the causal strength of T_j → Y_i in causal graph G_q, which is 0 if the edge T_j → Y_i is not present in G_q.
S43: the score of each case cause is calculated from the causal strengths of the causal graph and sent to a random forest model to judge which case cause the causal graph corresponds to.
For the case cause denoted by Y_i, the score S(Y_i) is computed from the total causal strengths (the scoring formula appears in the original as an image), where τ(T_j) indicates whether T_j is present in the case and Tr(Y_i) is the treatment set of Y_i.
A random forest model is a classifier consisting of multiple decision trees whose output class is the mode of the classes output by the individual trees. The algorithm proceeds as follows (a scikit-learn sketch is given after this list):
(1) The number of training cases (samples) is denoted N and the number of features M.
(2) The number of input features m used to make the decision at a node of a decision tree is determined, where m should be much smaller than M.
(3) N samples are drawn with replacement from the N training cases to form a training set (i.e. bootstrap sampling), and the cases that were not drawn are used for prediction to evaluate the error.
(4) For each node, m features are randomly selected, and the decision at each node of the decision tree is determined on the basis of these features; the optimal split is calculated from the m features.
(5) Each tree is grown fully without pruning (pruning may be adopted after an ordinary tree classifier has been built).
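A sketch of feeding the per-case-cause scores to a random forest with scikit-learn; the data shapes and values are assumptions used only to make the example self-contained.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # Hypothetical training data: one row per case, one column per case-cause score S(Y_i);
    # labels are the true case causes of the training cases.
    scores = np.random.rand(200, 5)
    labels = np.random.randint(0, 5, size=200)

    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(scores, labels)
    predicted_cause = clf.predict(scores[:1])   # predicted case cause for a new case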
Example 6
The case cause identification method for causal inference based on related graph information according to embodiment 1 differs in that:
next, building a neural network and constructing an auxiliary loss using causal strength to achieve an improvement in model performance will be described. The specific implementation process of S5 is as follows:
s51: embedding words into the fact description of the case obtained in the step S1;
s52, inputting the imbedding result in the S51 into the BilSTM;
s53, in the loss calculating stage, introducing auxiliary loss, and introducing the causality to an auxiliary attention part by utilizing the causality corresponding to the causality graph obtained previously so as to obtain a better effect; the method comprises the following steps:
Figure RE-GDA0003570568370000101
L=Lcross+αLcons
wherein a isiIs the weight of each word in the attribute, giIs a normalized value of causal intensity, LcrossIt is the value of the cross entropy loss. The auxiliary loss can make our attention achieve better effect, thereby improving the effect of the model.
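A compact PyTorch sketch of a BiLSTM-attention classifier with the auxiliary term added to the cross-entropy loss. Because L_cons appears in the original only as an image, a mean-squared penalty pulling the attention weights a_i toward the normalized causal strengths g_i is assumed here purely for illustration; dimensions and hyper-parameters are likewise assumptions.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BiLSTMAtt(nn.Module):
        def __init__(self, vocab_size, emb_dim=128, hidden=128, num_causes=10):
            super().__init__()
            self.emb = nn.Embedding(vocab_size, emb_dim)
            self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
            self.att = nn.Linear(2 * hidden, 1)
            self.out = nn.Linear(2 * hidden, num_causes)

        def forward(self, tokens):                                # tokens: (batch, seq_len) word ids
            h, _ = self.lstm(self.emb(tokens))                    # (batch, seq_len, 2*hidden)
            a = torch.softmax(self.att(h).squeeze(-1), dim=-1)    # attention weights a_i
            ctx = torch.bmm(a.unsqueeze(1), h).squeeze(1)         # attention-weighted sentence vector
            return self.out(ctx), a

    def combined_loss(logits, target, a, g, alpha=0.1):
        # L = L_cross + alpha * L_cons; the MSE form of L_cons between attention weights
        # and normalized causal strengths is an assumption made for this sketch.
        l_cross = F.cross_entropy(logits, target)
        l_cons = F.mse_loss(a, g)
        return l_cross + alpha * l_cons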
Example 7
A case cause identification system for causal inference based on related graph information comprises:
a data acquisition module configured to obtain the fact description of the case, including the case subject, the case circumstances and the case result;
a key information extraction module configured to obtain the keywords of the case through the KeyBERT algorithm and cluster the keywords;
a sampling module configured to construct a causal graph with the GFCI algorithm and sample it to obtain causal subgraphs;
a noise reduction module configured to construct an auxiliary loss with the causal strength of the causal graph, improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model;
and a case cause identification module configured to obtain the fact description of the case to be identified and input it into the trained BiLSTM-Att model to obtain the case cause identification result.
Example 8
A computer-readable storage medium, wherein a plurality of instructions are stored, the instructions are suitable for being loaded by a processor of a terminal device and executing a case identification method for causal inference based on correlation diagram information provided by the embodiment.
Example 9
A terminal device comprising a processor and a computer readable storage medium, the processor for implementing instructions; the computer readable storage medium is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the case identification method for causal inference based on the correlation diagram information provided by the embodiment.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only a preferred embodiment of the present application and is not intended to limit the present application, and various modifications and changes may be made to the present application by those skilled in the art. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present application shall be included in the protection scope of the present application.
Although the embodiments of the present invention have been described with reference to the accompanying drawings, it is not intended to limit the scope of the present invention, and it should be understood by those skilled in the art that various modifications and variations can be made without inventive efforts by those skilled in the art based on the technical solution of the present invention.

Claims (10)

1. A case cause identification method for causal inference based on related graph information, characterized by comprising the following steps:
obtaining the fact description of a case, including the case subject, the case circumstances and the case result;
extracting key information from the fact description of the case;
constructing a causal graph with the GFCI algorithm and sampling it to obtain causal subgraphs, wherein constructing the causal graph comprises obtaining the keywords of the case with the KeyBERT algorithm and clustering the keywords;
estimating the causal strength of the edges of the causal subgraphs and combining them with the BIC algorithm to obtain the causal strength of the causal graph and a decision result;
constructing an auxiliary loss with the causal strength of the causal graph to improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model;
and obtaining the fact description of the case to be determined and inputting it into the trained BiLSTM-Att model to obtain the case cause identification result.
2. The case cause identification method for causal inference based on related graph information as claimed in claim 1, wherein obtaining the keywords of the case through the KeyBERT algorithm comprises extracting the document embedding with BERT to obtain a document-level vector representation, extracting word vectors for the N-gram words, and finding the words most similar to the document by cosine similarity.
3. The case identification method according to claim 1, wherein the clustering of case keywords comprises dividing case keywords into K groups, randomly selecting K objects as initial clustering centers, and calculating the distance between each object and each clustering center to assign each object to the nearest clustering center.
4. The method of claim 1, wherein the constructing the causal graph further comprises determining whether to establish the edge and the type of the edge by causal inference.
5. The case cause identification method for causal inference based on related graph information as claimed in claim 1, wherein estimating the causal strength of the edges of the causal subgraphs and combining them with the BIC algorithm to obtain the causal strength of the causal graph and a decision result comprises:
estimating the causal strength of each edge of each causal subgraph with ATE to obtain the edge strength ψ^{G_q}(T_j→Y_i);
evaluating the quality of each causal subgraph with BIC to obtain BIC(G_q, X);
calculating the total causal strength by combining the quality BIC(G_q, X) of each causal subgraph with the edge strengths ψ^{G_q}(T_j→Y_i) (the combining formula appears in the original as an image);
wherein Y_i indicates that the case involves case cause c_i, and ψ^{G_q}(T_j→Y_i) is the causal strength of T_j→Y_i in causal graph G_q, which is 0 if T_j→Y_i is not present in G_q;
and calculating the score of each case cause from the total causal strength and sending the scores to a random forest model to judge which case cause the causal graph corresponds to.
6. The case cause identification method for causal inference based on related graph information as claimed in claim 1, wherein calculating the score of each case cause from the total causal strength comprises: for the case cause denoted by Y_i, computing the score S(Y_i) from the total causal strengths (the scoring formula appears in the original as an image), where τ(T_j) indicates whether T_j is present in the case and Tr(Y_i) is the treatment set of Y_i.
7. The method of claim 1, wherein constructing an auxiliary loss with the causal strength of the causal graph to improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model comprises:
performing word embedding on the fact description of the case;
inputting the embedding result into the BiLSTM;
in the loss calculation stage, introducing an auxiliary loss by introducing the causal strengths corresponding to the previously obtained causal subgraphs to assist the attention part; the auxiliary term L_cons appears in the original as an image, and the total loss is
L = L_cross + α·L_cons
where a_i is the attention weight of each word, g_i is the normalized causal strength, and L_cross is the cross-entropy loss.
8. A case cause identification system for causal inference based on related graph information, characterized by comprising:
a data acquisition module configured to obtain the fact description of the case, including the case subject, the case circumstances and the case result;
a key information extraction module configured to obtain the keywords of the case through the KeyBERT algorithm and cluster the keywords;
a sampling module configured to construct a causal graph with the GFCI algorithm and sample it to obtain causal subgraphs;
a noise reduction module configured to construct an auxiliary loss with the causal strength of the causal graph, improve the performance of the BiLSTM-Att model and obtain a trained BiLSTM-Att model;
and a case cause identification module configured to obtain the fact description of the case to be identified and input it into the trained BiLSTM-Att model to obtain the case cause identification result.
9. A computer-readable storage medium having stored thereon a plurality of instructions adapted to be loaded by a processor of a terminal device and to execute a case identification method for causal inference based on correlation map information as claimed in any of claims 1 to 7.
10. A terminal device comprising a processor and a computer readable storage medium, the processor being configured to implement instructions; the computer readable storage medium storing instructions adapted to be loaded by a processor and to perform a case identification method for causal inference based on correlation map information as claimed in any of claims 1 to 7.
CN202210178807.9A 2022-02-25 2022-02-25 Case cause identification method and system for causal inference based on related graph information Pending CN114429140A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210178807.9A CN114429140A (en) 2022-02-25 2022-02-25 Case cause identification method and system for causal inference based on related graph information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210178807.9A CN114429140A (en) 2022-02-25 2022-02-25 Case cause identification method and system for causal inference based on related graph information

Publications (1)

Publication Number Publication Date
CN114429140A true CN114429140A (en) 2022-05-03

Family

ID=81313561

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210178807.9A Pending CN114429140A (en) 2022-02-25 2022-02-25 Case cause identification method and system for causal inference based on related graph information

Country Status (1)

Country Link
CN (1) CN114429140A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116151485A (en) * 2023-04-18 2023-05-23 中国传媒大学 Method and system for predicting inverse facts and evaluating effects


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination