CN116090522A - Causal relation discovery method and system for missing data set based on causal feedback - Google Patents


Info

Publication number
CN116090522A
CN116090522A (application number CN202310364531.8A)
Authority
CN
China
Prior art keywords
causal
model
data set
missing data
sub
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310364531.8A
Other languages
Chinese (zh)
Inventor
马从锂
黄飞虎
弋沛玉
王琳娜
彭舰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202310364531.8A
Publication of CN116090522A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The disclosure provides a causal relation discovery method and system for a missing data set based on causal feedback. The method comprises the following steps: establishing a causal relation discovery model, wherein the causal relation discovery model comprises a missing data set complement sub-model and a causal discovery sub-model, the missing data set complement sub-model is used for complementing the missing data set, and the causal discovery sub-model is used for determining an optimal causal graph corresponding to the complemented missing data set; performing joint training on the missing data set complement sub-model and the causal discovery sub-model to obtain a trained causal relation discovery model; and inputting the missing data set to be processed into the trained causal relation discovery model, which outputs the optimal causal graph corresponding to the missing data set to be processed, so that the causal relationship of the missing data set can be discovered more accurately.

Description

Causal relation discovery method and system for missing data set based on causal feedback
Technical Field
The invention relates to the field of data processing, in particular to a causal relation discovery method and system for a missing data set based on causal feedback.
Background
For a long time, discussions about causal relationships were confined to literature and philosophy, until the emergence of causal inference. Causal inference is used to reveal the inherent generation mechanisms of things and discover the laws by which they operate, and it has applications in statistics, medicine, economics, law, and many other fields. Causal relationship discovery is a very important branch of causal inference; it aims to derive a causal relationship model from data so as to reveal the inherent generation mechanism of the data.
The most direct and effective method of causal relationship discovery is the randomized controlled trial (Randomized controlled trial, RCT), which is considered the "gold standard" for causal inference in clinical studies. However, RCTs are infeasible in most cases because of high cost, ethical inadvisability, or sheer impracticality. For example, in assessing the effect of a pregnant woman's smoking on fetal development, one cannot require one group of subjects to smoke while another group does not. Researchers therefore turn their attention to observational data that is directly available. Most related studies are currently based on complete data sets, and causal relationship discovery with missing data sets is less discussed. In practice, missing data sets are ubiquitous, so causal relationship discovery on observed data sets containing missing values is critical.
In the related art, records containing missing values are deleted outright when a missing data set is processed, and only the complete data entries are used for causal discovery (list-wise deletion). To make maximum use of the data set, a test-wise deletion method has been proposed: when a conditional independence test is carried out, only the data entries unusable for that test are deleted, which ensures maximum data utilization. Both deletion strategies are simple and intuitive, but they either perform poorly or require a lot of prior knowledge.
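The two deletion strategies can be sketched in a few lines of Python (toy data and variable names are purely illustrative):

```python
import numpy as np

# Toy data set with missing values (NaN); rows are samples, columns are variables.
X = np.array([
    [1.0, 2.0, np.nan],
    [4.0, np.nan, 6.0],
    [7.0, 8.0, 9.0],
    [1.5, 2.5, 3.5],
])

# List-wise deletion: drop every record that contains any missing value.
listwise = X[~np.isnan(X).any(axis=1)]

# Test-wise deletion: for a conditional independence test involving only
# variables 0 and 1, drop only the rows where those two variables are missing.
cols = [0, 1]
testwise = X[~np.isnan(X[:, cols]).any(axis=1)][:, cols]
```

Here list-wise deletion keeps only 2 of the 4 samples, while test-wise deletion keeps 3 of them for this particular test, illustrating the higher data utilization of the second strategy.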
Therefore, it is desirable to provide a causal relationship discovery method and system for a missing data set based on causal feedback, which are used for more accurately discovering the causal relationship of the missing data set.
Disclosure of Invention
One of the embodiments of the present specification provides a causal relationship discovery method for a missing data set based on causal feedback, the method comprising: establishing a causal relation discovery model, wherein the causal relation discovery model comprises a missing data set complement sub-model and a causal discovery sub-model, the missing data set complement sub-model is used for complementing a missing data set, and the causal discovery sub-model is used for determining an optimal causal graph corresponding to the complemented missing data set; performing joint training on the missing data set complement sub-model and the causal relationship discovery sub-model to obtain a trained causal relationship discovery model; inputting the missing data set to be processed into the trained causal relation discovery model, and outputting an optimal causal graph corresponding to the missing data set to be processed by the trained causal relation discovery model.
In some embodiments, the inputs of the missing data set complement sub-model include the missing data set to be processed, a mask matrix, and a random noise matrix, wherein the random noise matrix obeys a standard normal distribution; and the output of the missing data set complement sub-model comprises the completed data set corresponding to the missing data set to be processed.
In some embodiments, the missing dataset complement sub-model includes a generator and a discriminator, and the inputs of the discriminator include a hint matrix.
In some embodiments, the causal discovery submodel includes a graph generation unit for capturing variable relationships from a complement dataset corresponding to the pending missing dataset and generating a causal graph adjacency matrix, and a graph search unit for searching a graph space for an optimal causal graph corresponding to the pending missing dataset.
In some embodiments, the graph generation unit includes an encoder consisting of a multi-layer self-attention convolutional network and a decoder comprising a single-layer neural network.
In some embodiments, the graph search unit includes a three-layer feed-forward multi-layer perceptron with a ReLU as an activation function.
In some embodiments, the performing the joint training on the missing dataset complement sub-model and the causal discovery sub-model to obtain a trained causal relationship discovery model includes: and carrying out joint training on the missing data set complement sub-model and the causal relationship discovery sub-model based on a causal characterization extraction feedback mechanism, and obtaining the trained causal relationship discovery model.
In some embodiments, the feedback mechanism based on causal characterization extraction performs joint training on the missing dataset complement sub-model and the causal discovery sub-model, and obtains the trained causal relationship discovery model, including: in the combined training process, the graph searching unit fuses the classification errors for training.
In some embodiments, the method further comprises: pruning the best causal graph corresponding to the to-be-processed missing data set output by the trained causal relationship discovery model, and determining the target best causal graph corresponding to the to-be-processed missing data set.
One of the embodiments of the present specification provides a causal relationship discovery system for a missing data set based on causal feedback, comprising: a model building module, configured to establish a causal relation discovery model, wherein the causal relation discovery model comprises a missing data set complement sub-model and a causal discovery sub-model, the missing data set complement sub-model is used for complementing a missing data set, and the causal discovery sub-model is used for determining an optimal causal graph corresponding to the complemented missing data set; a model training module, configured to perform joint training on the missing data set complement sub-model and the causal discovery sub-model to obtain a trained causal relation discovery model; and a causal relation discovery module, configured to input the missing data set to be processed into the trained causal relation discovery model, which outputs an optimal causal graph corresponding to the missing data set to be processed.
Compared with the prior art, the causal relation discovery method and system for the missing data set based on causal feedback provided by the specification have the following beneficial effects:
1. the missing data set is complemented, and the causal relationship of the missing data set can be found more accurately based on the complemented missing data set through the causal relationship finding model;
2. the discriminator of the missing data set complement sub-model does not judge the authenticity of the generator's output vector as a whole, but tries to judge which elements in the generated vector are real and which are generated; since a large amount of continuously missing data carries no hint information, which affects the accuracy of data completion, and the generator outputs different results each time, a hint mechanism with miss-rate difference perception is introduced, and the problem is alleviated through a hint matrix;
3. in order to exploit the mutual promotion of missingness completion and causality discovery, prior experience is moderately utilized while still completing the data and searching new graphs so as to obtain better performance; a feedback mechanism based on causal representation extraction is introduced to fuse the causality discovery result into the data completion stage;
4. in the combined training process, the graph searching unit fuses the classification errors to train so as to help the model to quickly converge and improve the stability;
5. pruning the best causal graph output by the causal relation discovery model to remove false edges in the best causal graph.
Drawings
The present specification will be further elucidated by way of example embodiments, which will be described in detail by means of the accompanying drawings. The embodiments are not limiting, in which like numerals represent like structures, wherein:
FIG. 1 is a flow chart of a causal relationship discovery method for a missing data set based on causal feedback, shown in accordance with some embodiments of the present description;
FIG. 2 is a schematic structural diagram of a causal relationship discovery model shown in accordance with some embodiments of the present description;
FIG. 3 is a diagram of all paths in an estimated causal graph according to some embodiments of the present description;
FIG. 4 is a block diagram of a missing dataset causal relationship discovery system based on causal feedback shown in some embodiments of the present description;
fig. 5 is a schematic structural diagram of an electronic device according to some embodiments of the present description.
Description of the embodiments
In order to more clearly illustrate the technical solutions of the embodiments of the present specification, the drawings that are required to be used in the description of the embodiments will be briefly described below. It is apparent that the drawings in the following description are only some examples or embodiments of the present specification, and it is possible for those of ordinary skill in the art to apply the present specification to other similar situations according to the drawings without inventive effort. Unless otherwise apparent from the context of the language or otherwise specified, like reference numerals in the figures refer to like structures or operations.
It will be appreciated that "system," "apparatus," "unit" and/or "module" as used herein is one method for distinguishing between different components, elements, parts, portions or assemblies at different levels. However, if other words can achieve the same purpose, the words can be replaced by other expressions.
As used in this specification and the claims, the terms "a," "an," "the," and/or "the" are not specific to a singular, but may include a plurality, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the steps and elements are explicitly identified, and they do not constitute an exclusive list, as other steps or elements may be included in a method or apparatus.
A flowchart is used in this specification to describe the operations performed by the system according to embodiments of the present specification. It should be appreciated that the preceding or following operations are not necessarily performed in order precisely. Rather, the steps may be processed in reverse order or simultaneously. Also, other operations may be added to or removed from these processes.
The existing causality discovery methods can be mainly divided into three categories: constraint-based methods, score-based methods, and hybrid methods. Among them, score-based methods are widely adopted. Their core idea is to define a scoring function S(G) over causal graphs G and to combine this function with a search algorithm that traverses the search space, evaluates S(G) for each candidate graph G, and finds the graph with the best score, i.e., the smallest score. The objective function of a score-based method is:

min_{G ∈ DAGs} S(G).
FIG. 1 is a flow chart of a causal relationship discovery method for a missing data set based on causal feedback, according to some embodiments of the present description. As shown in fig. 1, the causal relationship discovery method for a missing data set based on causal feedback may include the following steps.
Step 110, a causal relationship discovery model is established.
FIG. 2 is a schematic structural diagram of a causal relationship discovery model according to some embodiments of the present description. As shown in FIG. 2, the causal relationship discovery model includes a missing data set complement sub-model and a causal discovery sub-model: the missing data set complement sub-model is used for complementing a missing data set, and the causal discovery sub-model is used for determining an optimal causal graph corresponding to the complemented missing data set.
In the case of missing data, the original data set X is only observed in an incomplete form X̃. Let M be the mask matrix corresponding to X̃; M indicates the positions of the missing data in X̃. The columns x̃_i of X̃ and m_i of M (the column vectors corresponding to the i-th feature, also referred to as "the i-th column vector of X̃" and "the i-th column vector of M") are all n-dimensional vectors, where n is the number of samples in the data set X, which is also the number of samples in the data set X̃. X, X̃, and M correspond as follows:

x̃_ij = x_ij, if m_ij = 1;  x̃_ij = * (missing), if m_ij = 0,

where * represents missing data, x̃_ij represents the j-th entry of the i-th feature in the incomplete data set X̃ corresponding to the original data set X, x_ij represents the j-th entry of the i-th feature in the original data set X, and m_ij represents the j-th element of the i-th column of the mask matrix M.
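The mask-matrix notation above can be illustrated with a minimal Python sketch (data and names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Incomplete data set X_tilde: NaN marks missing entries.
X_tilde = np.array([
    [0.5, np.nan],
    [np.nan, 1.2],
    [2.0, 3.0],
])

# Mask matrix M: 1 where the entry is observed, 0 where it is missing.
M = (~np.isnan(X_tilde)).astype(float)

# Random noise matrix Z drawn from the standard normal distribution;
# missing entries of X_tilde are replaced by the corresponding noise,
# so that no NaN is ever fed into a network.
Z = rng.standard_normal(X_tilde.shape)
X_input = np.where(M == 1, X_tilde, Z)
```

Observed entries of `X_input` are identical to those of `X_tilde`; only the masked positions contain noise.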
The causal relationship discovery model completes the missing data based on the distribution of the missing data set estimated by a generative adversarial network (Generative Adversarial Network, GAN). Since the GAN cannot accept NaN (Not a Number) inputs, where NaN describes an entry that is not a valid value and does not belong to any meaningful class, a random noise matrix Z obeying the standard normal distribution is provided, and the missing entries of X̃ are replaced by the corresponding entries of Z:

X̃ ← M ⊙ X̃ + (1 − M) ⊙ Z,

where ⊙ represents element-wise multiplication. The generator G takes the noise-filled data set X̃, the mask matrix M, and the random noise matrix Z as input:

X̄ = G(X̃, M, Z),

where X̄ is the output data set of the generator G: for each entry x̃_ij of X̃, the generator generates a corresponding estimate x̄_ij. In order not to change the observed true values, only the missing elements of X̃ are replaced, and the new data set obtained is denoted X̂:

X̂ = M ⊙ X̃ + (1 − M) ⊙ X̄.

In the data completion scenario there are no real/fake labels, so the discriminator does not judge whether the vector generated by the generator is real or fake as a whole, but tries to judge which elements in X̂ are real and which are generated. A large amount of continuously missing data does not carry any hint information, which affects the accuracy of data completion, and the GAN outputs different results each time; a hint mechanism with miss-rate difference perception is therefore introduced to alleviate this problem, and its realization depends on a hint matrix H. The hint matrix H is input to the discriminator network, so that the discriminator can obtain partial information about the mask matrix M, which provides a hint for it. The hint matrix H is generated as follows:

h_ij = b_ij · m_ij + 0.5 · (1 − b_ij),  b_ij ~ Bernoulli(p_i),

where h_ij represents the j-th element of the i-th feature column of the hint matrix H, Bernoulli(p_i) is the Bernoulli distribution with probability p_i, and p_i is the missing rate of the i-th feature of the data set X̃. The hint mechanism of miss-rate difference perception enables the discriminator to pay more attention to the features with higher missing rates. The discriminator D receives X̂ and H as input and outputs a prediction M̂ of the mask matrix M:

M̂ = D(X̂, H).

Passing H to the discriminator D lets it know the answers to most of the questions: 0 is missing data and 1 is real data, so the discriminator learns the known answers and pays attention to the unknown answers (represented by 0.5); that is, the entries where H takes the value 0.5 are exactly the ones the discriminator needs to distinguish. Through iteration, the discriminator eventually learns the distribution of the data.
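The hint mechanism can be sketched as follows, assuming the per-feature Bernoulli parameter equals that feature's missing rate (all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(42)

# Mask matrix for 6 samples x 3 features (1 = observed, 0 = missing).
M = rng.integers(0, 2, size=(6, 3)).astype(float)

# Per-feature missing rate p_j: fraction of missing entries in each column.
p = 1.0 - M.mean(axis=0)

# Hint matrix H: where B = 1 the true mask value is revealed to the
# discriminator; where B = 0 the entry is left undetermined (0.5).
B = (rng.random(M.shape) < p).astype(float)  # B_ij ~ Bernoulli(p_j)
H = B * M + 0.5 * (1.0 - B)
```

Every entry of `H` is 0, 1, or 0.5, and the 0.5 entries are exactly those the discriminator must classify on its own.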
The generator loss function is as follows:

L_G = Σ_{i=1}^d [ − Σ_{j: m_ij = 0} log( m̂_ij ) + α_i · L_rec( x_i, x̄_i ) ],

where m̂_i is the i-th column vector of the output data set M̂ of the discriminator D, x̄_i is the i-th column vector of the output data set X̄ of the generator G, x_i is the i-th column vector of the original data set X, and n is the number of samples in the data set X. The first term is the adversarial term over the missing entries. The reconstruction term L_rec( x_i, x̄_i ) is taken as the squared error (1/n) Σ_{j: m_ij = 1} ( x_ij − x̄_ij )² for a continuous feature and as the cross-entropy −(1/n) Σ_{j: m_ij = 1} x_ij log( x̄_ij ) for a discrete feature. α_i is a weight whose value can change the importance of the reconstruction term of the i-th feature in the overall loss function: the larger the value, the greater its impact on the overall loss function. In some embodiments, α_i can be the missing rate of the i-th feature.
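A minimal sketch of a generator loss of this kind, assuming a GAIN-style combination of an adversarial term on missing entries and a masked squared-error reconstruction term for continuous features, with the per-feature weight set to the missing rate (the function and its weighting are illustrative, not the exact loss of the disclosure):

```python
import numpy as np

def generator_loss(M, M_hat, X, X_bar, eps=1e-8):
    """Adversarial term over missing entries + weighted masked reconstruction.

    M     : mask matrix (1 = observed, 0 = missing)
    M_hat : discriminator's prediction of M for the imputed data
    X     : observed data (values at M == 0 are ignored)
    X_bar : generator output
    """
    # Adversarial part: push the discriminator to believe generated entries are real.
    adv = -np.sum((1.0 - M) * np.log(M_hat + eps))
    # Reconstruction on observed (continuous) entries, weighted per feature
    # by its missing rate alpha_j.
    alpha = 1.0 - M.mean(axis=0)
    rec = np.sum(alpha * np.sum(M * (X - X_bar) ** 2, axis=0) / M.shape[0])
    return adv + rec

M = np.array([[1.0, 0.0], [1.0, 1.0]])
M_hat = np.array([[0.9, 0.6], [0.8, 0.7]])
X = np.array([[1.0, 0.0], [2.0, 3.0]])
X_bar = np.array([[1.1, 0.5], [2.0, 2.5]])
loss = generator_loss(M, M_hat, X, X_bar)
```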
In some embodiments, the loss function of the discriminator is the cross-entropy between the mask matrix M and its prediction M̂, computed over the entries that the hint matrix leaves undetermined:

L_D = − Σ_{(i,j): h_ij = 0.5} [ m_ij log( m̂_ij ) + (1 − m_ij) log( 1 − m̂_ij ) ].
in some embodiments, the causal discovery submodel may include a graph generation unit, which may be based on an encoder-decoder model, responsible for capturing variable relationships from input data and generating a causal graph adjacency matrix, and a graph search unit, which may utilize the exploring-utilizing capabilities of Actor-Critic, in conjunction with a custom reward function, to search for an optimal causal graph in graph space. An Actor-Critic algorithm is used to explore the optimal causal graph, which can be selected over a continuous action space. Regarding the data set and the potential causal graph as the state and the action of the model respectively, the Actor is composed of an encoder-decoder of the graph generating unit, critic is a graph searching unit, and the graph searching unit is a three-layer feedforward multi-layer perceptron taking ReLU (Rectified Linear Unit) as an activation function.
In some embodiments, the graph generation unit may include an encoder-decoder architecture. The encoder consists of a multi-layer self-attention convolutional network; combined with the decoder, the self-attention network can find causal relationships between variables. Inspired by work on combinatorial optimization, in particular research on Pointer networks, the input to the encoder is formed by randomly drawing m samples from X̂ and reshaping them into a d × m matrix S, i.e., S = (s_1, s_2, …, s_d)^T, where s_i is the vector formed by concatenating the i-th elements of the m drawn samples. Thus the d nodes are distributed in an m-dimensional space, which facilitates capturing the causal relationships among the d nodes. The output of the encoder is denoted enc ∈ R^{d × d_e}, where d_e is the dimension of the encoder.
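The construction of the encoder input can be sketched as follows (shapes and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)

n, d, m = 100, 4, 8              # samples, variables, drawn batch size
X_hat = rng.normal(size=(n, d))  # completed data set

# Randomly draw m samples and transpose: each of the d nodes becomes
# an m-dimensional vector s_i (its values across the drawn samples).
idx = rng.choice(n, size=m, replace=False)
S = X_hat[idx].T                 # shape (d, m): one row per node
```

Each row of `S` is one node's representation over the drawn mini-batch, which is what the self-attention encoder consumes.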
The decoder comprises only a single-layer neural network; by computing, for every two different encoding results enc_i and enc_j, a pairwise score, it generates the adjacency matrix A:

A_ij = I( σ( u^T tanh( W_1 enc_i + W_2 enc_j ) ) − τ > 0 ),

where A_ij is the element in the i-th row and j-th column of the adjacency matrix A; u, W_1, and W_2 are all trainable parameters, and d_h is the dimension of the hidden layer in the decoder, so the outputs of the encoder are mapped into the real space R^{d_h}. σ(·) is the sigmoid function, which maps the result into (0, 1). I(·) is an indicator function that equals 1 when its argument is greater than 0, in which case A_ij = 1. τ is a hyperparameter used to limit the number of edges of the directed acyclic graph (DAG, directed acyclic graph): the larger τ is, the fewer edges the graph has.

In some embodiments, to avoid introducing self-loops, every diagonal element A_ii is forcibly set to 0.
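A sketch of the single-layer decoder in the stated form, assuming the pair score is passed through a sigmoid and thresholded by the hyperparameter tau (weights and dimensions are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

d, d_e, d_h = 4, 6, 5
enc = rng.normal(size=(d, d_e))  # encoder output, one row per node
W1 = rng.normal(size=(d_h, d_e))
W2 = rng.normal(size=(d_h, d_e))
u = rng.normal(size=d_h)
tau = 0.5                        # edge-limiting hyperparameter

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Score every ordered pair (i, j) with a single-layer network and threshold.
scores = np.array([[u @ np.tanh(W1 @ enc[i] + W2 @ enc[j]) for j in range(d)]
                   for i in range(d)])
A = (sigmoid(scores) - tau > 0).astype(int)
np.fill_diagonal(A, 0)           # forbid self-loops
```

Raising `tau` makes fewer pair scores clear the threshold, hence fewer edges.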
In some embodiments, the graph search unit uses eBIC (extended Bayesian information criterion) as the scoring function. For a given causal graph G, the general form of eBIC is defined as follows:

eBIC_γ(G) = −2 ln L(G; θ̂) + |θ̂| ln n + 2γ ln N(|θ̂|),  0 ≤ γ ≤ 1,

where E is the edge set of G, ln L(G; θ̂) denotes the maximum log-likelihood of the model G, θ̂ is the parameter of the maximum log-likelihood, N(|θ̂|) is the number of candidate models with |θ̂| parameters, and the strength of the extended penalty is controlled by the parameter γ.

Let the regression model of the i-th variable on its parent set pa(x_i) in G be:

x̂_i^(k) = f_i( pa(x_i)^(k) ),

where x̂_i^(k) is the estimate of x_i under the model f_i, pa(x_i)^(k) denotes the parents of x_i in the k-th observed sample, and x_i^(k) denotes the i-th element of the k-th observed sample. The regression model f_i can be flexibly selected as needed. Let RSS_i = Σ_k ( x_i^(k) − x̂_i^(k) )² be the residual sum of squares of the i-th variable. eBIC can then be expressed as follows:

eBIC_γ(G) = Σ_{i=1}^d n ln( RSS_i / n ) + |E| ln n + 2γ |E| ln d,

where the first term is equivalent to the maximum log-likelihood (up to an additive constant, under a Gaussian noise assumption).
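A BIC-style scoring of a candidate graph, with ordinary least squares as the regression model f_i, can be sketched as follows (this sketch uses the RSS-based likelihood term plus an edge-count penalty; the exact eBIC penalty of the disclosure may differ):

```python
import numpy as np

def bic_score(X, A):
    """Score a candidate graph: regress each variable on its parents in A
    (A[j, i] = 1 means j -> i) and combine the residual sums of squares
    with an edge-count penalty; smaller is better."""
    n, d = X.shape
    total = 0.0
    for i in range(d):
        parents = np.flatnonzero(A[:, i])
        if parents.size == 0:
            resid = X[:, i] - X[:, i].mean()
        else:
            P = np.column_stack([X[:, parents], np.ones(n)])
            beta, *_ = np.linalg.lstsq(P, X[:, i], rcond=None)
            resid = X[:, i] - P @ beta
        rss = float(resid @ resid)
        total += n * np.log(rss / n + 1e-12)
    return total + A.sum() * np.log(n)

rng = np.random.default_rng(3)
n = 200
x0 = rng.normal(size=n)
x1 = 2.0 * x0 + 0.1 * rng.normal(size=n)   # ground truth: x0 -> x1
X = np.column_stack([x0, x1])

A_true = np.array([[0, 1], [0, 0]])
A_empty = np.zeros((2, 2), dtype=int)
```

On this toy data the graph containing the true edge scores strictly better (lower) than the empty graph, despite its extra edge penalty.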
The exploration-exploitation learning paradigm also has some limitations. Actor-Critic learns continuously from attempts in the solution space, from which it obtains returns. However, since the solution space is typically very large, a large amount of exploration is required to learn effectively, which increases training time. Furthermore, the exploration strategy is often random during learning, which can easily lead to unstable results if exploration is inadequate. In some embodiments, classification errors are fused into the causal graph search to help the Actor-Critic converge quickly and improve stability. Given that the data set dimension may be large, there may be a large number of causal relationships to predict, which affects the performance of the classifier; part of the classifier output is therefore randomly dropped.
Since the causal graph has no self-loops, A_ii is always 0, i.e., the diagonal entries need not be predicted. The training label of the classifier is therefore

y = vstack( a_1', a_2', …, a_d' ),

where vstack(·) represents the longitudinal stacking of column vectors, and a_i' represents the new vector obtained by removing the i-th element (the diagonal entry) from the i-th column vector of the adjacency matrix A. The output layer of the classifier is computed as follows:

ŷ = q ⊙ σ( W_c h + b_c ),

where W_c and b_c are the trainable parameters of the penultimate layer, h is the vector representation of the penultimate layer, q is a loss-indication vector, and σ is the sigmoid function. The classification loss function is as follows:

L_cls = − Σ_k q_k [ y_k ln ŷ_k + (1 − y_k) ln( 1 − ŷ_k ) ] + λ ‖θ_cls‖²,

where λ is a regularization coefficient, q_k ∈ {0, 1} is the k-th element of the loss-indication vector q with q_k ~ Bernoulli(1 − p), p is the drop rate, y_k is the k-th element of the training label y of the classifier, ŷ_k is the k-th element of the output data of the classifier, and θ_cls denotes the classifier parameters. Since a part of the output nodes are dropped, the values of the dropped nodes should be ignored when computing the loss, which is why the formula above is multiplied by q_k.
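The masked classification loss can be sketched as follows (drop rate and names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

def masked_bce(y, y_hat, q, lam=1e-3, theta_norm2=0.0, eps=1e-8):
    """Binary cross-entropy over the kept outputs only (q_k = 1),
    plus an L2 regularization term on the classifier parameters."""
    ce = -(q * (y * np.log(y_hat + eps) + (1 - y) * np.log(1 - y_hat + eps))).sum()
    return ce + lam * theta_norm2

y = np.array([1.0, 0.0, 1.0, 0.0])          # off-diagonal adjacency labels
y_hat = np.array([0.9, 0.2, 0.6, 0.4])      # classifier predictions
q = (rng.random(4) < 0.75).astype(float)    # keep each output with prob 1 - p, p = 0.25
loss = masked_bce(y, y_hat, q)
```

Dropped outputs (where `q` is 0) contribute nothing to the loss, matching the multiplication by q_k in the formula.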
In some embodiments, to enable the derivation of a directed acyclic graph with a continuous optimization method, the causal graph acyclicity constraint is used: a directed graph G with adjacency matrix A is acyclic if and only if

h(A) = tr( e^{A ⊙ A} ) − d = 0,

where e^{A ⊙ A} is the matrix exponential of A ⊙ A and tr(·) is the trace of a matrix.
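The acyclicity measure can be computed directly with a matrix exponential (a short sketch; `scipy` is assumed to be available):

```python
import numpy as np
from scipy.linalg import expm

def h(A):
    """Trace-exponential acyclicity measure: tr(e^{A o A}) - d.
    Zero exactly when the graph given by A is acyclic; positive otherwise."""
    d = A.shape[0]
    return np.trace(expm(A * A)) - d

dag = np.array([[0.0, 1.0], [0.0, 0.0]])  # 0 -> 1, acyclic
cyc = np.array([[0.0, 1.0], [1.0, 0.0]])  # 0 <-> 1, a 2-cycle
```

For the acyclic graph the measure is 0, while the 2-cycle yields a strictly positive value, so the measure can serve as a differentiable penalty.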
In some embodiments, the custom reward function is defined as follows:

R(G) = − [ S(G) + λ_1 ( 1 − I_DAG(G) ) + λ_2 h(A) ],

where S(G) is the eBIC score, λ_1 and λ_2 are penalty factors, and I_DAG(·) is an indicator function: I_DAG(G) indicates whether G is a directed acyclic graph, taking the value 1 if G is a directed acyclic graph and 0 if it is not. It can be proved that h(A) is always non-negative; when h(A) is small, G may still contain cycles, so the indicator-function constraint is added. Meanwhile, the acyclicity constraint h(A) is also multiplied by the penalty factor λ_2 to ensure that it is as small as possible. The goal is to maximize the reward function, whose objective function is:

max_G E[ R(G) ].

When λ_1 and λ_2 take appropriate values, this objective is equivalent to the score-based objective min_{G ∈ DAGs} S(G) given above. Combining the reward function, the following optimization target is finally obtained:

J(ψ) = E_{A ~ π(·|s, ψ)} [ R(G(A)) ],

where π represents the action policy and ψ denotes the trainable parameters in the graph generation model. The gradient is computed with a Monte Carlo policy gradient algorithm: m samples are randomly selected from the data set and used as one round (episode) to estimate the gradient and update the parameters.
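The reward combination described above can be sketched as follows (the penalty factors are illustrative placeholders, not values from the disclosure):

```python
def reward(score, is_dag, h_val, lam1=10.0, lam2=5.0):
    """R = -(S(G) + lam1 * (1 - I_DAG) + lam2 * h(A)).

    score : eBIC score of the candidate graph (smaller is better)
    is_dag: whether the candidate graph is acyclic
    h_val : value of the acyclicity measure h(A), non-negative
    """
    return -(score + lam1 * (0.0 if is_dag else 1.0) + lam2 * h_val)

# A DAG with the same raw score as a cyclic graph receives a strictly
# larger reward, steering the search toward acyclic candidates.
r_dag = reward(score=3.2, is_dag=True, h_val=0.0)
r_cyc = reward(score=3.2, is_dag=False, h_val=1.1)
```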
Step 120, performing joint training on the missing data set complement sub-model and the causal discovery sub-model to obtain a trained causal relationship discovery model.
To take advantage of the mutual promotion of missingness completion and causal discovery, previous experience is moderately exploited to obtain better performance while still completing the data and searching new graphs. As shown in fig. 1, a feedback mechanism based on causal representation extraction (CRE) fuses the results of causal relationship discovery into the data completion phase. In an initial study, an attempt was made to feed the adjacency matrix B into the GAN as a supplementary feature, but experiments showed that this did not promote the performance of the model at all. The reason is that the simple relationship expression of an adjacency matrix can only express whether a causal relationship exists between variables; it cannot express the relevance between nodes at a deeper level: for example, it can express neither indirect causal effects between nodes nor causal strength.
In some embodiments, one-hot encoding is used to generate a unique initial representation for each node. When updating the node vector representation, CRE makes good use of the acyclic nature of the causal graph. The node vector representation update formula is as follows:
(The update formula is given as an image in the source.) Each node's representation is computed from its own embedding vector and the representations of all of its immediate parent nodes, attenuated by a hyperparameter called the information decay rate. To visit each causal path while avoiding duplicate computation, the node vectors are updated with a depth-first search. To prevent learned erroneous causal relationships from affecting subsequent training, erroneous directed edges are removed through independence tests after the representations are computed.
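A minimal sketch of the depth-first node-representation update, under the assumption that each node adds the decay-weighted representations of its immediate parents to its own embedding (the exact aggregation rule is given only as an image in the source); memoization ensures each node is computed once even when it lies on many causal paths.

```python
import numpy as np

def update_node_reprs(adj, emb, gamma):
    """Update node vectors along a DAG by depth-first traversal.

    adj[i][j] == 1 means a directed edge i -> j; `emb` holds the one-hot
    (or learned) embedding of each node; `gamma` is the information decay
    rate. Each node's representation is its own embedding plus the
    gamma-decayed representations of its immediate parents.
    """
    n = len(adj)
    h = [None] * n  # memo table: one representation per node

    def dfs(i):
        if h[i] is not None:          # already computed via another path
            return h[i]
        parents = [p for p in range(n) if adj[p][i]]
        h[i] = emb[i] + gamma * sum((dfs(p) for p in parents),
                                    np.zeros_like(emb[i]))
        return h[i]

    for i in range(n):
        dfs(i)
    return np.stack(h)

# chain 0 -> 1 -> 2 with one-hot initial embeddings and decay rate 0.5
adj = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
H = update_node_reprs(adj, np.eye(3), gamma=0.5)
```

Because the causal graph is acyclic, the recursion always terminates, which is the acyclicity property the CRE relies on.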
The fusion formulas (given as images in the source) involve intermediate variables, fusion feature parameters, the causal characterization of the current feedback period together with its transpose and its dimension, and three trainable parameter matrices; positions corresponding to missing entries skip the operation. A complete pass from data completion to causal discovery is referred to as a feedback period: one of the omitted symbols denotes the causal characterization under the t-th feedback period, and another denotes the fused feature matrix of the t-th feedback period. The new features are then concatenated with the original features and fed into the generator network. In the first feedback period, the causal characterization is a randomly initialized matrix.
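The final splicing step, in which the fused causal features are concatenated to the masked original features before entering the generator, can be sketched as follows. The fusion that produces the per-variable feature matrix is given only as images in the source, so it is taken here as an input; the zeroing of missing entries is how this sketch realizes "the missing part skips the operation".

```python
import numpy as np

def build_generator_input(x, mask, node_feats):
    """Concatenate per-variable causal features to the masked data.

    x          : (batch, d) data with missing entries (NaN)
    mask       : (batch, d) with 1 = observed, 0 = missing
    node_feats : (d, k) fused causal characterization, one row per variable
    """
    x_obs = np.nan_to_num(x) * mask                       # missing part carries no signal
    feats = np.tile(node_feats.reshape(1, -1), (len(x), 1))  # same causal hint per row
    return np.concatenate([x_obs, feats], axis=1)         # shape (batch, d + d*k)

# toy usage: 2 samples, 2 variables, 2-dim causal feature per variable
x = np.array([[1.0, np.nan], [3.0, 4.0]])
mask = np.array([[1.0, 0.0], [1.0, 1.0]])
cre = np.array([[0.2, 0.1], [0.0, 0.3]])   # hypothetical fused feature matrix
g_in = build_generator_input(x, mask, cre)
```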
In step 130, the missing data set to be processed is input into the trained causal relationship discovery model, and the trained causal relationship discovery model outputs the best causal graph corresponding to the missing data set to be processed.
In some embodiments, because the causal discovery model searches for the highest-scoring graph, the learned policy itself is not the output; instead, the best-scoring causal graph among all causal graphs generated during training is recorded as the output. This is not the final result, however: the best-scoring causal graph is likely to contain "false edges" and requires further pruning. Although pruning could be achieved by enlarging a penalty value in the score (the relevant quantities are given as images in the source), doing so easily causes correct edges to be lost. Accordingly, pruning may proceed as follows: each edge in the best-scoring causal graph is deleted in turn while the remaining edges are kept unchanged; if the performance of the resulting graph does not degrade, or degrades within an acceptable margin, the deletion is kept, otherwise the edge is restored. The pruned best-scoring causal graph is taken as the output of the Actor-Critic.
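The pruning procedure above can be sketched as a greedy loop; the `score` function and the `tolerance` margin stand in for the graph-performance measure and the "acceptable degree" of degradation, which the source does not specify numerically.

```python
def prune_edges(edges, score, tolerance=0.0):
    """Greedy pruning of the best-scoring causal graph.

    Each directed edge is tentatively deleted; if `score` of the pruned
    graph does not degrade by more than `tolerance`, the deletion is
    kept, otherwise the edge is restored. `score` maps a set of edges
    to a fitness value (higher is better).
    """
    kept = set(edges)
    base = score(kept)
    for e in list(edges):
        trial = kept - {e}
        s = score(trial)
        if s >= base - tolerance:   # no unacceptable degradation: keep deletion
            kept, base = trial, s
        # otherwise the edge is restored (kept is left unchanged)
    return kept

# toy score: reward true edges, penalize false ones (hypothetical measure)
true_edges = {(0, 1), (1, 2)}
def score(es):
    return len(es & true_edges) - len(es - true_edges)

pruned = prune_edges([(0, 1), (1, 2), (0, 2)], score)
```

On this toy graph the false edge (0, 2) is removed while both true edges survive, mirroring the intended behavior of the pruning step.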
The performance of the causal relationship discovery model (CF-ICD) is described below by experimental data.
The completion algorithms compared against the causal relationship discovery model include List-wise Deletion (LD), Matrix Factorization (MF), Multiple Imputation by Chained Equations (MICE), and Generative Adversarial Imputation Nets (GAIN); the causal discovery algorithms include the constraint-based method PC, the score-based method GES, and the hybrid Max-Min Parents-and-Children (MMPC) method combined with an additive noise model. The evaluation index used is the Structural Hamming Distance (SHD), the most commonly used graph-model evaluation index; it reflects the minimum number of edge additions, deletions and reversals required to convert a given DAG into the true causal graph, so the smaller the SHD, the more accurate the causal graph derived by the model.
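A minimal sketch of the SHD computation on adjacency matrices, counting a reversed edge once rather than as one addition plus one deletion (a common convention; the source does not spell out its exact counting rule):

```python
import numpy as np

def shd(est, true):
    """Structural Hamming Distance between two DAG adjacency matrices.

    Counts extra edges, missing edges and reversed edges, where a
    reversal (i->j estimated while the truth has j->i) is counted once.
    """
    est, true = np.asarray(est), np.asarray(true)
    diff = np.abs(est - true)
    # a reversed edge contributes two mismatches in `diff`; count it once
    reversed_ = (est == 1) & (true == 0) & (true.T == 1)
    return int(diff.sum() - reversed_.sum())

# toy graphs: truth is the chain 0 -> 1 -> 2
true_g = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
est_g  = [[0, 0, 0], [1, 0, 1], [0, 0, 0]]   # edge 0->1 reversed, 1->2 correct
```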
Testing a causal discovery model requires a data set with a definite, authoritative causal graph, so real-world data sets without causal graphs cannot be used for evaluation. The causal relationship discovery model is therefore evaluated on simulation data and on real public data for which a causal graph is provided.
Given a generation paradigm, simulation data can be generated accordingly. Given the node count, a strictly upper-triangular adjacency matrix is generated, with its entries constrained to be non-zero. Three types of simulated data sets are generated: linear Gaussian, nonlinear Gaussian and nonlinear non-Gaussian, specified by three generation formulas (given as images in the source). The last term of each of the three formulas is the noise and the remaining omitted symbols are coefficients; the non-Gaussian noise is obtained by exponentiation, with Gaussian noise in the exponent. These three schemes are similar to the generation processes used in the NOTEARS algorithm and DAG-GNN, and the causal graphs are identifiable. Each class of data comprises 50 generated data sets of 10000 samples each, with 12 variables per sample.
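The linear-Gaussian variant of the data generation can be sketched as follows; the weight range, sign randomization and unit noise variance are assumptions, since the source fixes only the strictly upper-triangular structure and the non-zero weights.

```python
import numpy as np

def simulate_linear_gaussian(d=12, n=10000, seed=0):
    """Generate one linear-Gaussian data set from a random upper-triangular DAG.

    A strictly upper-triangular weight matrix guarantees acyclicity, and
    weights are drawn away from zero so that every edge is active.
    """
    rng = np.random.default_rng(seed)
    signs = rng.choice([-1.0, 1.0], size=(d, d))
    W = np.triu(rng.uniform(0.5, 2.0, size=(d, d)) * signs, k=1)  # strict upper triangle
    X = np.zeros((n, d))
    for j in range(d):                                  # columns follow topological order
        X[:, j] = X @ W[:, j] + rng.standard_normal(n)  # x_j = sum_i w_ij * x_i + noise
    return X, W

X, W = simulate_linear_gaussian(d=12, n=1000)
```

Because variable j depends only on variables with smaller indices, filling the columns left to right samples the structural equations in a valid topological order.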
Sachs is a protein signalling network based on protein and phospholipid expression levels and is a real benchmark data set widely used in the field of graph modelling; it contains both interventional and observational data. Since the study here is based on observational data, only observational samples are considered and interventional samples are ignored, yielding a data set of 853 samples with 11 attributes each; the corresponding causal DAG contains 17 directed edges.
In the experiments, the feedback period is set to 10 with 1000 iterations per period, the batch size is 128, the GAN learning rate is 0.001, the encoder embedding dimension is 64, the initial learning rates of the Actor and the Critic are both 0.001 with a learning-rate decay of 0.96, the acyclicity constraint coefficients λ1 and λ2 are 1.2 and 0.01 respectively, the DAG limiting parameter ϵ is 0.02, the decay rate γ is 0.3, and the dropout rate of the output layer of the classification module is 0.5. The optimizer used in the model is Adam. The deletion rates are set to 10%, 20% and 30% for each class of simulation data set. The training and test sets are split at a 4:1 ratio. Tables 1, 2 and 3 show the results of the causal relationship discovery model and the comparison models on the three data types: Table 1 compares the SHD of the models on the linear Gaussian data set, Table 2 on the nonlinear Gaussian data set, and Table 3 on the nonlinear non-Gaussian data set.
(Tables 1, 2 and 3 are given as images in the source and are omitted here.)
As can be seen from Tables 1, 2 and 3, CF-ICD achieves the lowest SHD on all simulated data sets among the compared models, reducing the average SHD on the three classes of data sets by about 25.8%, 16.5% and 15.1% respectively. This shows that combining the strong generative capacity of the GAN with the exploratory capacity of the Actor-Critic can effectively derive the potential causal patterns in missing data sets. The CRE-based causal feedback mechanism effectively exploits the mutual promotion between completion and discovery, improving the overall performance of the model, and the CRE captures more complex causal relationships to provide richer hints for completion. CF-ICD also has the lowest standard deviation of all methods and the most stable performance, an advantage attributable to the exploration module that fuses classification errors, which guides the Actor-Critic to find the best causal graph more accurately and rapidly.
Experiments were also carried out on the Sachs data set; the CRE result of the last feedback period was recorded, and the estimated causal graph and the heat map corresponding to the CRE matrix were drawn. The CRE matrix better reflects the hierarchical causal relationships among the variables. For example, comparing the CRE values of the causal relationship from PKC to P38 with that from PKC to Mek (the values are given as images in the source), it can be seen that the effect of PKC on P38 is greater than the effect of PKC on Mek. FIG. 3 shows all paths from PKC to Mek and from PKC to P38 in the estimated causal graph.
As shown in FIG. 3, PKC and Mek are indirectly causal on every path, and two of the paths contain false edges that are very likely to be rejected in the independence check. The two paths from PKC to P38, in contrast, contain no false edges, and PKC has a direct causal relationship with P38; this explains the result above. Such information can be embodied in the CRE matrix but cannot be conveyed by the adjacency matrix alone. Feeding the CRE into the completion model as hint information therefore helps the GAN generate data that better conforms to the real distribution.
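Enumerating all directed paths between two nodes, as done for the PKC–Mek and PKC–P38 analysis, can be sketched with a simple depth-first search over the estimated adjacency matrix (a generic sketch, not the authors' implementation):

```python
def all_paths(adj, src, dst):
    """Enumerate every directed path from `src` to `dst` in a DAG.

    adj[i][j] == 1 means a directed edge i -> j. Works without a visited
    set because a DAG contains no directed cycles.
    """
    paths, stack = [], [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            paths.append(path)
            continue
        for nxt, edge in enumerate(adj[node]):
            if edge:
                stack.append((nxt, path + [nxt]))
    return paths

# toy DAG with one direct edge 0 -> 3 and two indirect routes via 1 and 2
adj = [[0, 1, 1, 1],
       [0, 0, 0, 1],
       [0, 0, 0, 1],
       [0, 0, 0, 0]]
paths = all_paths(adj, 0, 3)
```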
To further verify the effectiveness of the CRE and of fusing classification errors, ablation experiments were performed on both. The model with both the CRE and the classification error removed is denoted A, the model with only the CRE removed is denoted B, and the model with only the classification error removed is denoted C. For fairness, the values of the shared parameters are kept consistent across all control groups. The experimental results are shown in Table 4.
(Table 4 is given as an image in the source and is omitted here.)
As the table shows, the CF-ICD model has almost the lowest SHD among the four compared models, indicating that fusing the CRE and the classification error significantly improves performance. In terms of the degree of improvement over A, model B (with only the CRE module removed) reduces the SHD by 1.9%, model C (with only the classification module removed) reduces it by 12.0%, and CF-ICD reduces it by 12.9%. This is sufficient to demonstrate that the CRE has a significant boosting effect on the accuracy of the estimated causal graph, while the classification error has a relatively small impact on the SHD; the classification module's contribution lies mainly in accelerating the convergence of the model and improving its stability.
FIG. 4 is a block diagram of a missing dataset causal relationship discovery system based on causal feedback, shown in accordance with some embodiments of the present description. As shown in FIG. 4, the causal relationship discovery system for a missing data set based on causal feedback may include a model building module, a model training module, and a causal discovery module.
The model building module may be configured to build a causal relationship discovery model, where the causal relationship discovery model includes a missing dataset complement sub-model and a causal discovery sub-model, the missing dataset complement sub-model is configured to complement a missing dataset, and the causal discovery sub-model is configured to determine a best causal map corresponding to the completed missing dataset.
The model training module can be used for carrying out joint training on the missing data set complement sub-model and the causal relationship discovery sub-model to obtain a trained causal relationship discovery model.
The causal relationship discovery module can be used for inputting the missing data set to be processed into a trained causal relationship discovery model, and the trained causal relationship discovery model outputs an optimal causal graph corresponding to the missing data set to be processed.
Fig. 5 is a schematic structural diagram of an electronic device according to some embodiments of the present specification. Referring to FIG. 5, a structural block diagram of an electronic device that can serve as a server or a client of the present invention is now described; it is an example of a hardware device to which aspects of the present invention can be applied. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices and other similar computing devices. The components shown here, their connections and relationships, and their functions are meant to be exemplary only and are not meant to limit implementations of the invention described and/or claimed herein.
As shown in fig. 5, the electronic device includes a computing unit that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) or a computer program loaded from a storage unit into a Random Access Memory (RAM). In the RAM, various programs and data required for the operation of the device may also be stored. The computing unit, ROM and RAM are connected to each other by a bus. An input/output (I/O) interface is also connected to the bus.
A plurality of components in an electronic device are connected to an I/O interface, comprising: an input unit, an output unit, a storage unit, and a communication unit. The input unit may be any type of device capable of inputting information to the electronic device, and may receive input numeric or character information and generate key signal inputs related to user settings and/or function controls of the electronic device. The output unit may be any type of device capable of presenting information and may include, but is not limited to, a display, speakers, video/audio output terminals, vibrators, and/or printers. Storage units may include, but are not limited to, magnetic disks, optical disks. The communication unit allows the electronic device to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, modems, network cards, infrared communication devices, wireless communication transceivers and/or chipsets, such as bluetooth (TM) devices, wiFi devices, wiMax devices, cellular communication devices, and/or the like.
The computing unit may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing units include, but are not limited to, central Processing Units (CPUs), graphics Processing Units (GPUs), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, digital Signal Processors (DSPs), and any suitable processors, controllers, microcontrollers, and the like. The computing unit performs the various methods and processes described above. For example, in some embodiments, the causal relationship discovery method for a missing data set based on causal feedback may be implemented as a computer software program, tangibly embodied on a machine-readable medium, such as a storage unit. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device via the ROM and/or the communication unit. In some embodiments, the computing unit may be configured to perform the causal relationship discovery method of the missing data set based on causal feedback by any other suitable means (e.g. by means of firmware).
While the basic concepts have been described above, it will be apparent to those skilled in the art that the foregoing detailed disclosure is by way of example only and is not intended to be limiting. Although not explicitly described herein, various modifications, improvements, and adaptations to the present disclosure may occur to one skilled in the art. Such modifications, improvements, and modifications are intended to be suggested within this specification, and therefore, such modifications, improvements, and modifications are intended to be included within the spirit and scope of the exemplary embodiments of the present invention.
Meanwhile, the specification uses specific words to describe the embodiments of the specification. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic is associated with at least one embodiment of the present description. Thus, it should be emphasized and should be appreciated that two or more references to "an embodiment" or "one embodiment" or "an alternative embodiment" in various positions in this specification are not necessarily referring to the same embodiment. Furthermore, certain features, structures, or characteristics of one or more embodiments of the present description may be combined as suitable.
Furthermore, the order in which the elements and sequences are processed, the use of numerical letters, or other designations in the description are not intended to limit the order in which the processes and methods of the description are performed unless explicitly recited in the claims. While certain presently useful inventive embodiments have been discussed in the foregoing disclosure, by way of various examples, it is to be understood that such details are merely illustrative and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements included within the spirit and scope of the embodiments of the present disclosure. For example, while the system components described above may be implemented by hardware devices, they may also be implemented solely by software solutions, such as installing the described system on an existing server or mobile device.
Likewise, it should be noted that in order to simplify the presentation disclosed in this specification and thereby aid in understanding one or more inventive embodiments, various features are sometimes grouped together in a single embodiment, figure, or description thereof. This method of disclosure, however, is not intended to imply that more features than are presented in the claims are required. Indeed, claimed subject matter may lie in less than all of the features of a single embodiment disclosed above.
In some embodiments, numbers are used to describe quantities of components and attributes; it should be understood that such numbers used in the description of embodiments are qualified in some examples by the modifiers "about", "approximately" or "substantially". Unless otherwise indicated, "about", "approximately" or "substantially" indicates that the number allows a variation of 20%. Accordingly, in some embodiments, the numerical parameters set forth in the specification and claims are approximations that may vary depending upon the desired properties sought by individual embodiments. In some embodiments, the numerical parameters should take into account the specified significant digits and employ a method of preserving the general number of digits. Although the numerical ranges and parameters set forth herein are approximations that may be employed in some embodiments to confirm the breadth of the range, in particular embodiments such numerical values are set as precisely as practicable.
Finally, it should be understood that the embodiments described in this specification are merely illustrative of the principles of the embodiments of this specification. Other variations are possible within the scope of this description. Thus, by way of example, and not limitation, alternative configurations of embodiments of the present specification may be considered as consistent with the teachings of the present specification. Accordingly, the embodiments of the present specification are not limited to only the embodiments explicitly described and depicted in the present specification.

Claims (10)

1. A causal relationship discovery method for a missing data set based on causal feedback, comprising:
establishing a causal relation discovery model, wherein the causal relation discovery model comprises a missing data set complement sub-model and a causal discovery sub-model, the missing data set complement sub-model is used for complementing a missing data set, and the causal discovery sub-model is used for determining an optimal causal graph corresponding to the complemented missing data set;
performing joint training on the missing data set complement sub-model and the causal relationship discovery sub-model to obtain a trained causal relationship discovery model;
inputting the missing data set to be processed into the trained causal relation discovery model, and outputting an optimal causal graph corresponding to the missing data set to be processed by the trained causal relation discovery model.
2. The causal relationship discovery method of a missing data set based on causal feedback of claim 1, wherein the input of the missing data set complement sub-model comprises the missing data set to be processed, a mask matrix, and a random noise matrix, wherein the random noise matrix obeys a standard normal distribution;
the output of the missing data set complement sub-model comprises the completed data set corresponding to the missing data set to be processed.
3. The causal relationship discovery method of the missing data set based on causal feedback of claim 2, wherein the missing data set complement sub-model comprises a generator and a discriminator, and wherein the input of the discriminator comprises a prompt matrix.
4. A causal relationship discovery method for a missing data set based on causal feedback according to claim 2 or 3, wherein the causal discovery sub-model comprises a graph generating unit and a graph searching unit, the graph generating unit is used for capturing variable relationships from a complement data set corresponding to the missing data set to be processed and generating a causal graph adjacency matrix, and the graph searching unit is used for searching the best causal graph corresponding to the missing data set to be processed in a graph space.
5. The causal relationship discovery method of missing data set based on causal feedback of claim 4, wherein the graph generation unit comprises an encoder and a decoder, wherein the encoder comprises a multi-layer self-attention convolutional network, and wherein the decoder comprises a single-layer neural network.
6. The causal relationship discovery method of the missing data set based on causal feedback of claim 4, wherein the graph search unit comprises a three-layer feedforward multi-layer perceptron with a ReLU as an activation function.
7. The causal relationship discovery method of claim 4, wherein the performing joint training on the missing dataset complement sub-model and the causal relationship discovery sub-model to obtain a trained causal relationship discovery model comprises:
and carrying out joint training on the missing data set complement sub-model and the causal relationship discovery sub-model based on a causal characterization extraction feedback mechanism, and obtaining the trained causal relationship discovery model.
8. The causal relationship discovery method of claim 7, wherein the causal relationship discovery model is obtained by jointly training the missing dataset complement sub-model and the causal relationship discovery sub-model by the causal relationship extraction-based feedback mechanism, and comprises:
in the combined training process, the graph searching unit fuses the classification errors for training.
9. The causal relationship discovery method of the missing data set based on causal feedback of claim 1, further comprising:
pruning the best causal graph corresponding to the to-be-processed missing data set output by the trained causal relationship discovery model, and determining the target best causal graph corresponding to the to-be-processed missing data set.
10. A causal relationship discovery system for a missing data set based on causal feedback, comprising:
the system comprises a model building module, a causal relation finding module and a causal relation finding module, wherein the causal relation finding module comprises a missing data set complement sub-model and a causal relation finding sub-model, the missing data set complement sub-model is used for complementing a missing data set, and the causal relation finding sub-model is used for determining an optimal causal diagram corresponding to the complemented missing data set;
the model training module is used for carrying out joint training on the missing data set complement sub-model and the causal relationship discovery sub-model to obtain a trained causal relationship discovery model;
and the causal relation discovery module is used for inputting the missing data set to be processed into the trained causal relation discovery model, and the trained causal relation discovery model outputs an optimal causal diagram corresponding to the missing data set to be processed.
CN202310364531.8A 2023-04-07 2023-04-07 Causal relation discovery method and system for missing data set based on causal feedback Pending CN116090522A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310364531.8A CN116090522A (en) 2023-04-07 2023-04-07 Causal relation discovery method and system for missing data set based on causal feedback

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310364531.8A CN116090522A (en) 2023-04-07 2023-04-07 Causal relation discovery method and system for missing data set based on causal feedback

Publications (1)

Publication Number Publication Date
CN116090522A true CN116090522A (en) 2023-05-09

Family

ID=86204850

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310364531.8A Pending CN116090522A (en) 2023-04-07 2023-04-07 Causal relation discovery method and system for missing data set based on causal feedback

Country Status (1)

Country Link
CN (1) CN116090522A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116977667A (en) * 2023-08-01 2023-10-31 中交第二公路勘察设计研究院有限公司 Tunnel deformation data filling method based on improved GAIN

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116977667A (en) * 2023-08-01 2023-10-31 中交第二公路勘察设计研究院有限公司 Tunnel deformation data filling method based on improved GAIN
CN116977667B (en) * 2023-08-01 2024-01-26 中交第二公路勘察设计研究院有限公司 Tunnel deformation data filling method based on improved GAIN

Similar Documents

Publication Publication Date Title
CN112529168B (en) GCN-based attribute multilayer network representation learning method
CN114048331A (en) Knowledge graph recommendation method and system based on improved KGAT model
CN113590900A (en) Sequence recommendation method fusing dynamic knowledge maps
US11334791B2 (en) Learning to search deep network architectures
CN116601626A (en) Personal knowledge graph construction method and device and related equipment
Zheng et al. Feature grouping and selection: A graph-based approach
CN116090522A (en) Causal relation discovery method and system for missing data set based on causal feedback
CN112749737A (en) Image classification method and device, electronic equipment and storage medium
CN114461929A (en) Recommendation method based on collaborative relationship graph and related device
CN113360670A (en) Knowledge graph completion method and system based on fact context
CN116992151A (en) Online course recommendation method based on double-tower graph convolution neural network
CN116308551A (en) Content recommendation method and system based on digital financial AI platform
CN112241920A (en) Investment and financing organization evaluation method, system and equipment based on graph neural network
CN116992008B (en) Knowledge graph multi-hop question-answer reasoning method, device and computer equipment
CN117312680A (en) Resource recommendation method based on user-entity sub-graph comparison learning
CN116522232A (en) Document classification method, device, equipment and storage medium
Cai et al. Semantic and correlation disentangled graph convolutions for multilabel image recognition
CN115344794A (en) Scenic spot recommendation method based on knowledge map semantic embedding
CN115238134A (en) Method and apparatus for generating a graph vector representation of a graph data structure
CN113158088A (en) Position recommendation method based on graph neural network
Baeta et al. Exploring expression-based generative adversarial networks
US11829735B2 (en) Artificial intelligence (AI) framework to identify object-relational mapping issues in real-time
US20230018525A1 (en) Artificial Intelligence (AI) Framework to Identify Object-Relational Mapping Issues in Real-Time
CN110457543B (en) Entity resolution method and system based on end-to-end multi-view matching
US20230419102A1 (en) Token synthesis for machine learning models

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20230509

RJ01 Rejection of invention patent application after publication