JP7459386B2

JP7459386B2 - Disease diagnosis prediction system based on graph neural network

Info

Publication number: JP7459386B2
Application number: JP2023536567A
Authority: JP
Inventors: ▲勁▼松李; ▲勝▼▲強▼ 池; 宇清王; 雨田; 天舒周
Original assignee: 之江実験室
Priority date: 2021-12-27
Filing date: 2022-09-05
Publication date: 2024-04-01
Anticipated expiration: 2042-09-05
Also published as: JP2024503980A; CN113990495B; WO2023124190A1; CN113990495A

Description

The present invention belongs to the field of medical and health information technology, and in particular relates to a disease diagnosis prediction system based on a graph neural network.

The medical and healthcare field has many well-organized knowledge maps, such as the International Classification of Diseases, DrugBank, clinical guidelines, and common knowledge, which carry hierarchical information consistent with human cognition as well as complex association relationships. A knowledge map is a heterogeneous graph network containing various kinds of relationships. How to use both the expert knowledge in a knowledge map and electronic medical record data, integrating knowledge and data for modeling, plays an important role in applications to disease diagnosis prediction.

Conventional methods for disease prediction based on graph neural network models lack a way to effectively fuse a medical knowledge map with electronic medical record data into a heterogeneous graph network. The main current methods fall into the following types.
(1) Data-based graph network modeling: a graph network is constructed from electronic medical record data, and a graph neural network model is used for disease prediction. This method does not make full use of existing medical knowledge sources.
(2) Stepwise modeling of knowledge representation learning and disease prediction: a medical knowledge map is learned to obtain vector representations of the knowledge, which are then fused with electronic medical record data for disease prediction. Because training is stepwise, the knowledge representation obtained is not the one best suited to disease prediction.
(3) End-to-end modeling that focuses only on the disease prediction task: a medical knowledge map and electronic medical record data are fused into a heterogeneous graph network, and a graph neural network model is used for disease prediction. Although this method resolves the deficiencies of the two methods above, the model optimizes only the disease prediction task, so the learned knowledge may be affected by noise in the data.

To address the deficiencies of the prior art, the present invention provides a disease diagnosis prediction system based on a graph neural network.

The object of the present invention is achieved by the following solution.

The present invention provides a disease diagnosis prediction system based on a graph neural network. The system comprises:
(1) a knowledge map construction module that constructs a disease-symptom knowledge map based on medical knowledge sources;
(2) a data extraction and preprocessing module that extracts, from an electronic medical record system, patient electronic medical record data that includes patient disease diagnosis and symptom data and is stored in trigram format;
(3) a disease diagnosis model construction module that performs graph neural network learning and predictive modeling on the disease-symptom knowledge map and the electronic medical record data; and
(4) a disease diagnosis model application module that uses the disease diagnosis model to predict a disease diagnosis from the input symptoms of a new patient.
The graph neural network learning and predictive modeling includes construction of a heterogeneous graph network and construction of a disease diagnosis model.
The heterogeneous graph network includes a disease-symptom subgraph, constructed by extracting disease-symptom relationships from the disease-symptom knowledge map, and a patient-symptom subgraph, constructed from the patient disease diagnosis and symptom data in trigram format.
The disease diagnosis model consists of a graph encoder and a graph decoder.
The graph encoder is implemented on the basis of a graph convolutional neural network. It takes as input the initial node embedding representations of diseases, symptoms, and patients obtained using a disease-symptom co-occurrence matrix, together with the disease-symptom adjacency matrix and the patient-symptom adjacency matrix. Nodes of different types pass information along connecting edges, and node embedding update operations yield the disease, symptom, and patient node embedding representations, which are input to the graph decoder.
The graph decoder performs multi-task learning using the node embedding representations. The multi-task learning includes the following parts a) to c):
a) Multi-label hierarchical classification for patient disease diagnosis prediction: a disease hierarchy is constructed using the hierarchical structure of diseases, the hierarchy including a disease layer, for which diagnosis prediction is required, and a disease system classification layer obtained from medical knowledge; a multi-label hierarchical classifier is constructed; and a loss function for the multi-label hierarchical classification is designed.
b) Disease contrastive learning: a disease pair system type discriminator is constructed; the distance between the two diseases in a disease pair is calculated; and a loss function for the disease contrastive learning is designed.
c) Disease-symptom relationship learning: a disease-symptom relationship learner is constructed; the probability that the disease and the symptom in a disease-symptom pair are associated is calculated; and a loss function for the disease-symptom relationship learning is designed.
The loss function of the disease diagnosis model is obtained as the sum of the loss function of the multi-label hierarchical classification, the loss function of the disease contrastive learning, and the loss function of the disease-symptom relationship learning.

Furthermore, in the knowledge map construction module, the disease-symptom knowledge map includes two node types, disease and symptom, and one relationship type, disease-symptom.

Furthermore, the heterogeneous graph network is constructed from the disease-symptom knowledge map and the electronic medical record data and includes three node types: disease, symptom, and patient. Symptoms are intermediate nodes connecting diseases and patients. The heterogeneous graph network integrates the relationship subgraph of the disease-symptom knowledge map that concerns diseases and symptoms with the relationship subgraph of the electronic medical record data that concerns patients and symptoms.

Furthermore, the heterogeneous graph network is expressed in terms of a node set, an edge set, and a relation set R. The node set is formed from D, S, and P, which are the given disease set, symptom set, and patient set, respectively, and whose sizes give the number of disease types, the number of symptom types, and the number of patients. The relation set R contains the disease-symptom relationship and the patient-symptom relationship. The disease-symptom relationships are stored in a disease-symptom adjacency matrix, and the patient-symptom relationships are stored in a patient-symptom adjacency matrix.

Furthermore, generation of the initial node embedding representations includes:
constructing a disease-symptom co-occurrence matrix, in which the entry at a given row and column is the number of patients in the electronic medical record data who were diagnosed with the disease of that row and who exhibited the symptom of that column;
row-normalizing the co-occurrence matrix, the initial embedding representation of a disease being the corresponding row of the row-normalized matrix;
column-normalizing the co-occurrence matrix, the initial embedding representation of a symptom being the corresponding column of the column-normalized matrix; and
computing the initial embedding representation of a patient by a formula that involves the number of symptoms of that patient.
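By way of non-limiting illustration, the initial-embedding procedure described above can be sketched as follows. The sketch rests on assumptions that this text does not confirm: normalization is taken to mean scaling a row or column so that it sums to 1, and the patient embedding is taken to be the average of the initial embeddings of the patient's symptoms (the text states only that the formula involves the patient's number of symptoms). The function names `row_normalize`, `col_normalize`, and `initial_embeddings` are hypothetical.

```python
def row_normalize(matrix):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


def col_normalize(matrix):
    """Scale each column to sum to 1; all-zero columns are left as zeros."""
    totals = [sum(col) for col in zip(*matrix)]
    return [[x / t if t else 0.0 for x, t in zip(row, totals)] for row in matrix]


def initial_embeddings(cooc, patient_symptoms):
    """cooc[i][j]: number of patients diagnosed with disease i who showed symptom j.
    patient_symptoms[k]: list of symptom indices recorded for patient k."""
    n_diseases = len(cooc)
    row_norm = row_normalize(cooc)
    col_norm = col_normalize(cooc)
    disease_emb = row_norm                               # disease i -> row i
    symptom_emb = [list(col) for col in zip(*col_norm)]  # symptom j -> column j
    patient_emb = []
    for symptoms in patient_symptoms:
        count = len(symptoms)
        # Assumed rule: average the embeddings of the patient's symptoms.
        patient_emb.append([
            sum(symptom_emb[j][i] for j in symptoms) / count if count else 0.0
            for i in range(n_diseases)
        ])
    return disease_emb, symptom_emb, patient_emb


# Two diseases, two symptoms, one patient who shows both symptoms.
cooc = [[2, 2], [0, 4]]
disease_emb, symptom_emb, patient_emb = initial_embeddings(cooc, [[0, 1]])
```

Under these assumptions a disease embedding has one entry per symptom type, while symptom and patient embeddings have one entry per disease type, which is why the embeddings of different node types start out with different dimensions.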

Furthermore, the initial node embedding representations of the different node types are each input into a multilayer perceptron to obtain initial embedding representations of the same dimension, which are then input to the graph encoder.
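By way of non-limiting illustration, projecting embeddings of different dimensions to a common dimension might look like the sketch below. It is an assumption-laden simplification: each node type gets a single linear layer followed by ReLU with randomly initialized weights, whereas the depth, activation, and training of the perceptron are not specified in this text. The names `make_projection`, `project`, and `hidden_dim` are hypothetical.

```python
import random


def make_projection(in_dim, out_dim, rng):
    """Random weights for one linear layer (out_dim rows, in_dim columns)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]


def project(vectors, weight):
    """Apply the linear layer followed by ReLU to every vector."""
    return [
        [max(0.0, sum(w * x for w, x in zip(row, vec))) for row in weight]
        for vec in vectors
    ]


rng = random.Random(0)
hidden_dim = 4
disease_init = [[0.5, 0.5, 0.0]]  # one entry per symptom type (3 here)
symptom_init = [[1.0, 0.0]]       # one entry per disease type (2 here)
disease_proj = project(disease_init, make_projection(3, hidden_dim, rng))
symptom_proj = project(symptom_init, make_projection(2, hidden_dim, rng))
```

After projection, both node types carry vectors of length `hidden_dim`, so they can be mixed in the same message-passing step.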

Furthermore, in the graph encoder, the node embedding representation of a disease at each layer, of a symptom at each layer, and of a patient at each layer is obtained by its own update formula. These formulas are written in terms of the following quantities:
an activation function;
the disease-symptom association weight matrix and the patient-symptom association weight matrix of each layer, obtained by training the disease diagnosis model;
the node embedding representations of the disease, the symptom, and the patient at a given layer;
the set of symptom nodes adjacent to the disease;
the set of disease nodes adjacent to the symptom;
the set of patient nodes adjacent to the symptom; and
the set of symptom nodes adjacent to the patient.
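By way of non-limiting illustration, one update step of such an encoder could be sketched as below. The exact update formulas are not reproduced in this text, so the sketch assumes a plain scheme: each node averages the embeddings of its neighbors, multiplies by the weight matrix of the relation involved, and applies ReLU; a symptom node adds the contributions from its disease neighbors and its patient neighbors. Self-connections and normalization constants are omitted. All function and parameter names are hypothetical.

```python
def matvec(weight, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def mean_of(vectors, dim):
    """Element-wise mean of a list of vectors; zeros if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def relu(vec):
    return [max(0.0, x) for x in vec]


def encoder_layer(h_d, h_s, h_p, adj_ds, adj_ps, w_ds, w_ps):
    """adj_ds[i][j] = 1 if disease i is linked to symptom j;
    adj_ps[k][j] = 1 if patient k has symptom j."""
    dim = len(h_s[0])
    # Diseases and patients aggregate from their adjacent symptom nodes.
    new_d = [relu(matvec(w_ds, mean_of([h_s[j] for j, a in enumerate(row) if a], dim)))
             for row in adj_ds]
    new_p = [relu(matvec(w_ps, mean_of([h_s[j] for j, a in enumerate(row) if a], dim)))
             for row in adj_ps]
    # Symptoms aggregate from adjacent diseases and adjacent patients.
    new_s = []
    for j in range(len(h_s)):
        from_d = mean_of([h_d[i] for i, row in enumerate(adj_ds) if row[j]], dim)
        from_p = mean_of([h_p[k] for k, row in enumerate(adj_ps) if row[j]], dim)
        mixed = [a + b for a, b in zip(matvec(w_ds, from_d), matvec(w_ps, from_p))]
        new_s.append(relu(mixed))
    return new_d, new_s, new_p


identity = [[1.0, 0.0], [0.0, 1.0]]
new_d, new_s, new_p = encoder_layer(
    h_d=[[1.0, 0.0]], h_s=[[0.0, 2.0], [4.0, 0.0]], h_p=[[1.0, 1.0]],
    adj_ds=[[1, 1]], adj_ps=[[0, 1]], w_ds=identity, w_ps=identity,
)
```

Because symptoms sit between diseases and patients, stacking two or more such steps lets information from diseases reach patients and vice versa.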

Furthermore, in the graph decoder, the multi-label hierarchical classification for patient disease diagnosis prediction includes the following.
A disease hierarchy is constructed, consisting of a disease layer, which holds the disease types, and a disease system classification layer, whose size is the number of disease system classifications.
A multi-label hierarchical classifier containing a set of binary classifiers is constructed. The node embedding representation of a patient is input to each binary classifier to obtain one predicted probability per classifier. The labels of one group of binary classifiers are the patient's disease system classifications, the labels of the other group are the patient's disease diagnoses, and each classifier has its own model parameters.
The probability that a patient develops a given disease is computed from two predicted probabilities: the probability, predicted by the binary classifier for that disease, that the patient develops the disease, and the probability, predicted by the binary classifier for the system classification to which the disease belongs, that this system classification appears in the patient.
The loss function of the multi-label hierarchical classification is defined using the true label of whether the patient develops each disease, the true label of the disease system classification corresponding to the patient's disease diagnosis, the L1 norm, and a similarity between pairs of diseases. The similarity between two diseases is computed from the true label distributions of the two diseases, which satisfy a condition expressed with the true labels of whether a patient develops each of the two diseases.
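By way of non-limiting illustration, combining the two levels of prediction might be sketched as follows. The combining formula is not reproduced in this text; the sketch assumes the product of the disease-level probability and the probability of the system classification to which the disease belongs, which is one common way to make a child prediction depend on its parent. The names `disease_probabilities` and `system_of` are hypothetical.

```python
def disease_probabilities(p_disease, p_system, system_of):
    """p_disease[i]: output of the binary classifier for disease i.
    p_system[c]: output of the binary classifier for system classification c.
    system_of[i]: index of the system classification that disease i belongs to."""
    return [p_disease[i] * p_system[system_of[i]] for i in range(len(p_disease))]


# Three diseases; the first two share system classification 0.
probs = disease_probabilities(
    p_disease=[0.5, 0.8, 0.9],
    p_system=[0.5, 0.1],
    system_of=[0, 0, 1],
)
```

In this example the third disease has a high disease-level score, but its combined probability stays low because its system classification is judged unlikely for the patient.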

Furthermore, in the graph decoder, the disease contrastive learning includes the following.
The diseases in disease set D are combined two at a time to obtain the disease pair set DD; the number of disease pairs follows from the number of disease types. Each disease pair in DD carries a label that takes one value when the two diseases belong to the same system classification and another value when they belong to different system classifications.
A disease pair system type discriminator is constructed. The node embedding representations of the two diseases in a pair are input to the discriminator, and the distance between the two diseases is computed using the L2 norm.
The loss function of the disease contrastive learning is computed from these distances and labels, where m is the lower limit of the distance between the embedding representations of diseases of different system types.
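By way of non-limiting illustration, a margin-based contrastive loss over disease pairs could be sketched as below. The loss formula is not reproduced in this text; the sketch assumes the classical contrastive form, in which pairs from the same system classification are penalized by their squared distance and pairs from different system classifications are penalized only when their distance falls below the margin m. The discriminator is taken to be the identity map, and the loss is averaged over pairs. The names `l2_distance`, `contrastive_loss`, and `system_of` are hypothetical.

```python
import math
from itertools import combinations


def l2_distance(a, b):
    """Euclidean (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(embeddings, system_of, margin):
    """embeddings[i]: node embedding of disease i.
    system_of[i]: system classification of disease i.
    margin: lower limit m on the distance between diseases of different systems."""
    pairs = list(combinations(range(len(embeddings)), 2))
    if not pairs:
        return 0.0
    total = 0.0
    for i, j in pairs:
        dist = l2_distance(embeddings[i], embeddings[j])
        if system_of[i] == system_of[j]:
            total += dist ** 2                        # pull same-system diseases together
        else:
            total += max(0.0, margin - dist) ** 2     # push different systems at least m apart
    return total / len(pairs)


loss = contrastive_loss(
    embeddings=[[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]],
    system_of=[0, 0, 1],
    margin=2.0,
)
```

With three diseases there are three pairs. The same-system pair at distance 5 contributes 25, the different-system pair at distance 1 contributes 1, and the different-system pair already farther apart than the margin contributes nothing.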

Furthermore, in the graph decoder, the disease-symptom relationship learning includes the following.
One disease from disease set D and one symptom from symptom set S are selected at a time to obtain the disease-symptom pair set DS; the number of disease-symptom pairs follows from the number of disease types and the number of symptom types. Each disease-symptom pair in DS carries a label that takes one value when the disease and the symptom are associated in the disease-symptom knowledge map and another value when they are not.
A disease-symptom relationship learner is constructed. The node embedding representations of the disease and the symptom in a pair are input to the learner, and the probability that the disease and the symptom are associated is computed using the sigmoid function.
The loss function of the disease-symptom relationship learning is computed from these probabilities and labels.
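By way of non-limiting illustration, the disease-symptom relationship learner might be sketched as follows. Only the use of the sigmoid function is stated in this text; the sketch assumes that the learner scores a pair by the dot product of the two embeddings and that the loss is the binary cross-entropy averaged over all disease-symptom pairs, with label 1 for associated pairs and 0 otherwise. The names `relation_probability` and `relation_loss` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def relation_probability(h_disease, h_symptom):
    """Assumed scorer: sigmoid of the dot product of the two embeddings."""
    return sigmoid(sum(a * b for a, b in zip(h_disease, h_symptom)))


def relation_loss(disease_emb, symptom_emb, adj_ds):
    """Binary cross-entropy over every disease-symptom pair.
    adj_ds[i][j] = 1 if disease i and symptom j are associated in the knowledge map."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    count = 0
    for i, h_d in enumerate(disease_emb):
        for j, h_s in enumerate(symptom_emb):
            p = relation_probability(h_d, h_s)
            y = adj_ds[i][j]
            total -= y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps)
            count += 1
    return total / count


p_example = relation_probability([0.0, 0.0], [1.0, 1.0])
loss_example = relation_loss([[0.0]], [[0.0]], [[1]])
```

Because the labels come from the knowledge map rather than from patient records, this term supervises the embeddings with expert knowledge independently of the diagnosis labels.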

The present invention has the following advantageous effects. It effectively integrates the expert knowledge in a knowledge map with electronic medical record data to construct a heterogeneous graph network. On this network, a graph convolutional neural network method learns both the local and the global information of the heterogeneous graph network. The disease diagnosis model can be trained end to end on both knowledge and data. In the optimization objective, supervision on knowledge relationships (the disease contrastive learning part and the disease-symptom relationship learning part) is added alongside the disease prediction task, which ensures that the disease prediction task uses the knowledge effectively and that the knowledge representation is not affected by noise in the data. For the problem that the number of diseases to predict is large while the number of patients for some diseases is limited, the multi-label hierarchical classification improves prediction for diseases with few samples.

FIG. 1 is a configuration diagram of a disease diagnosis prediction system based on a graph neural network according to an embodiment of the present invention.
FIG. 2 is a configuration diagram of a heterogeneous graph network according to an embodiment of the present invention.
FIG. 3 is a configuration diagram of a disease diagnosis model according to an embodiment of the present invention.
FIG. 4 is a schematic diagram of the hierarchical structure of diseases according to an embodiment of the present invention.

To make the above objects, features, and advantages of the present invention clearer and easier to understand, specific embodiments of the present invention are described in detail below with reference to the drawings.

Many details are set forth in the following description so that the present invention can be fully understood, but the present invention may also be practiced in forms other than those described here. Those skilled in the art can make similar extensions without departing from the spirit of the present invention. The present invention is therefore not limited to the specific embodiments disclosed below.

An embodiment of the present invention provides a disease diagnosis prediction system based on a graph neural network. As shown in FIG. 1, the system includes a knowledge map construction module, a data extraction and preprocessing module, a disease diagnosis model construction module, and a disease diagnosis model application module. The implementation of each module is described in detail below.

Knowledge map construction module: constructs a disease-symptom knowledge map based on medical knowledge sources such as SNOMED-CT and HPO. The disease-symptom knowledge map includes two node types, disease and symptom, and one relationship type, disease-symptom.
Data extraction and preprocessing module: extracts, from an electronic medical record system, patient electronic medical record data that includes patient disease diagnosis and symptom data and is stored in trigram format.
Disease diagnosis model construction module: performs graph neural network learning and predictive modeling on the disease-symptom knowledge map and the electronic medical record data.
Disease diagnosis model application module: uses the disease diagnosis model to predict a disease diagnosis from the input symptoms of a new patient.

The disease diagnosis model construction module works with a given disease set D, symptom set S, and patient set P, whose sizes give the number of disease types, the number of symptom types, and the number of patients, respectively. Disease diagnosis prediction is treated as a multi-label classification problem: given a patient's symptoms, the disease diagnosis model predicts the patient's disease diagnoses.
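By way of non-limiting illustration, treating diagnosis as a multi-label problem means that each patient's target is a vector with one 0/1 entry per disease, and several entries may be 1 at once. The sketch below shows such an encoding; the disease names and the helper `multi_hot` are hypothetical and serve only as an example.

```python
def multi_hot(diagnoses, disease_index):
    """Encode a patient's set of diagnoses as a 0/1 vector over all diseases."""
    vector = [0] * len(disease_index)
    for name in diagnoses:
        vector[disease_index[name]] = 1
    return vector


# Hypothetical disease set D with three disease types.
disease_index = {"influenza": 0, "asthma": 1, "gastritis": 2}
target = multi_hot(["asthma", "gastritis"], disease_index)
```

A patient with two concurrent diagnoses thus yields a target with two positive labels, which a single-label classifier could not represent.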

Implementation of the disease diagnosis model includes the following (1) to (6).
(1) Construction of the heterogeneous graph network
Using the disease-symptom knowledge map and the electronic medical record data, a heterogeneous graph network containing three node types (disease, symptom, and patient) is constructed. Symptoms are intermediate nodes connecting diseases and patients. The heterogeneous graph network integrates the relationship subgraph of the disease-symptom knowledge map that concerns diseases and symptoms with the relationship subgraph of the electronic medical record data that concerns patients and symptoms, and thus includes a disease-symptom subgraph and a patient-symptom subgraph.
The heterogeneous graph network may be expressed in terms of a node set, an edge set, and a relation set R. The relation set R contains the disease-symptom relationship and the patient-symptom relationship. The disease-symptom relationships are stored in a disease-symptom adjacency matrix, and the patient-symptom relationships are stored in a patient-symptom adjacency matrix.
FIG. 2 shows an example of the heterogeneous graph network structure, with four patients, four diseases, four symptoms, and the patient-symptom and disease-symptom relationships among them.

(2) Construction of the subgraphs
Disease-symptom subgraph: disease-symptom relationships are extracted from the disease-symptom knowledge map to construct the disease-symptom subgraph.
Patient-symptom subgraph: the patient-symptom subgraph is constructed from the patient disease diagnosis and symptom data in trigram format.
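By way of non-limiting illustration, the two subgraphs can be stored as adjacency matrices built from relation records. The sketch below assumes that "trigram format" means (head, relation, tail) triples, and it uses made-up relation names (`has_symptom`, `diagnosed_with`) and made-up entities; none of these names come from the source. The helper `build_adjacency` is hypothetical.

```python
def build_adjacency(triples, relation, heads, tails):
    """Return a 0/1 matrix with one row per head node and one column per tail node,
    keeping only triples whose relation matches."""
    head_index = {h: i for i, h in enumerate(heads)}
    tail_index = {t: j for j, t in enumerate(tails)}
    adjacency = [[0] * len(tails) for _ in heads]
    for head, rel, tail in triples:
        if rel == relation and head in head_index and tail in tail_index:
            adjacency[head_index[head]][tail_index[tail]] = 1
    return adjacency


diseases = ["flu", "asthma"]
symptoms = ["fever", "cough"]
patients = ["p1"]

# Triples taken from the knowledge map (disease-symptom relationships).
kg_triples = [("flu", "has_symptom", "fever"),
              ("flu", "has_symptom", "cough"),
              ("asthma", "has_symptom", "cough")]
# Triples taken from electronic medical records (symptoms and diagnoses).
ehr_triples = [("p1", "has_symptom", "fever"),
               ("p1", "diagnosed_with", "flu")]

adj_ds = build_adjacency(kg_triples, "has_symptom", diseases, symptoms)
adj_ps = build_adjacency(ehr_triples, "has_symptom", patients, symptoms)
```

Both matrices share the symptom axis, which is how symptoms act as the intermediate nodes linking diseases and patients; the diagnosis triples are not edges here and would instead serve as training labels.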

(3) Structure of the disease diagnosis model
FIG. 3 shows an example of the structure of the disease diagnosis model. The initial node embedding representations of diseases, symptoms, and patients are obtained using the disease-symptom co-occurrence matrix. The initial node embedding representations and the adjacency matrices are the inputs to the disease diagnosis model, which consists of two parts, a graph encoder and a graph decoder. The specific steps for generating the initial node embedding representations, for the graph encoder, and for the graph decoder are given in (4) to (6).

(4) Generation of the initial node embedding representations
First, a disease-symptom co-occurrence matrix is constructed. Its entry at a given row and column is the number of patients in the electronic medical record data who were diagnosed with the disease of that row and who exhibited the symptom of that column. Next, the co-occurrence matrix is row-normalized, and the initial embedding representation of a disease is the corresponding row of the row-normalized matrix; the co-occurrence matrix is also column-normalized, and the initial embedding representation of a symptom is the corresponding column of the column-normalized matrix. Finally, the initial embedding representation of a patient is computed by a formula that involves the number of symptoms of that patient.

(5) Graph encoder
First, the initial node embedding representations of the different node types are each input into a multilayer perceptron to obtain initial embedding representations of the same dimension, which are then input to the graph encoder. The graph encoder is implemented on the basis of a graph convolutional neural network.
In the graph encoder, nodes of different types can pass information along the connecting edges of the graph and thereby integrate information from nodes of other types. The node embedding representation of a disease at each layer, of a symptom at each layer, and of a patient at each layer is obtained by its own update formula. These formulas are written in terms of the following quantities:
an activation function;
the disease-symptom association weight matrix and the patient-symptom association weight matrix of each layer, obtained by training the disease diagnosis model;
the node embedding representations of the disease node, the symptom node, and the patient node at a given layer, the graph encoder having a fixed total number of layers;
the set of symptom nodes adjacent to the disease node;
the set of disease nodes adjacent to the symptom node;
the set of patient nodes adjacent to the symptom node; and
the set of symptom nodes adjacent to the patient node.
The neighbor sets linking diseases and symptoms are obtained from the disease-symptom adjacency matrix, and the neighbor sets linking patients and symptoms are obtained from the patient-symptom adjacency matrix. By repeating the node embedding update operation layer by layer, disease, symptom, and patient node embedding representations that sufficiently capture the association relationships can be obtained.

(6) Graph decoder
The node embedding representations obtained by the graph encoder are input to the graph decoder, which uses them to perform multi-task learning.

First, multi-label hierarchical classification for patient disease diagnosis prediction is performed.
To begin, as shown in FIG. 4, a disease hierarchy is constructed using the hierarchical structure of diseases. One layer holds the diseases in disease set D, that is, the diseases for which diagnosis prediction is required, with the number of disease types as described above. The other layer holds the system classifications assigned to diseases on the basis of medical knowledge, and its size is the number of disease system classifications.

Next, a multi-label hierarchical classifier containing a set of binary classifiers is constructed. The node embedding representation of a patient is input to each binary classifier to obtain one predicted probability per classifier. The labels of one group of binary classifiers are the patient's disease system classifications, the labels of the other group are the patient's disease diagnoses, and each classifier has its own model parameters.
The probability that a patient develops a given disease is then computed from two predicted probabilities: the probability, predicted by the binary classifier for that disease, that the patient develops the disease, and the probability, predicted by the binary classifier for the system classification to which the disease belongs, that this system classification appears in the patient.

Finally, the loss function of the multi-label hierarchical classification is defined using the true label of whether the patient develops each disease, the true label of the disease system classification corresponding to the patient's disease diagnosis, the L1 norm, and a similarity between pairs of diseases. The similarity between two diseases is computed from the true label distributions of the two diseases, which satisfy a condition expressed with the true labels of whether a patient develops each of the two diseases.

Second, disease contrastive learning is performed.
To begin, the diseases in disease set D are combined two at a time to obtain the disease pair set DD; the number of disease pairs follows from the number of disease types. Each disease pair in DD carries a label that takes one value when the two diseases belong to the same system classification and another value when they belong to different system classifications.

Next, a disease pair system type discriminator is constructed. The node embedding representations of the two diseases in a pair are input to the discriminator, and the distance between the two diseases is computed using the L2 norm.

Finally, the loss function of the disease contrastive learning is computed from these distances and labels, where m is the lower limit of the distance between the embedding representations of diseases of different system types.

Third, disease-symptom relationship learning is performed.
First, one disease and one symptom are selected from disease set D and symptom set S, respectively, to obtain the disease-symptom pair set DS, in which the number of disease-symptom pairs is [equation]. For any disease-symptom pair [symbol] in DS [text missing], when the disease and the symptom have no association in the disease-symptom knowledge map, [equation] holds.
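The construction of the disease-symptom pair set can be sketched as follows, assuming that every disease is paired with every symptom (so the pair count is the product of the two set sizes) and that a pair is labeled 1 when the knowledge map links it and 0 otherwise. The label values, the function name `build_disease_symptom_pairs`, and the toy data are assumptions.

```python
from itertools import product


def build_disease_symptom_pairs(diseases, symptoms, known_relations):
    """Label every (disease, symptom) pair.

    Assumption: 1 = the knowledge map links the pair, 0 = it does not.
    """
    labels = {}
    for disease, symptom in product(diseases, symptoms):
        linked = (disease, symptom) in known_relations
        labels[(disease, symptom)] = 1 if linked else 0
    return labels


# Hypothetical toy data, not taken from the patent.
toy_diseases = ["asthma", "gastritis"]
toy_symptoms = ["cough", "wheezing", "nausea"]
toy_relations = {
    ("asthma", "cough"),
    ("asthma", "wheezing"),
    ("gastritis", "nausea"),
}
toy_labels = build_disease_symptom_pairs(toy_diseases, toy_symptoms, toy_relations)
```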

Next, a disease-symptom relationship learner [symbol] is constructed, the node embedding representations [symbol] of the disease and the symptom in [symbol] are input into [symbol], and the probability [symbol] that the disease and the symptom in [symbol] are associated is obtained by [equation]; [text missing] [symbol] is obtained by [equation]; and the loss function [symbol] of the disease diagnosis model is defined as [equation].
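The following sketch illustrates this step under stated assumptions. The claims state that a sigmoid function is used for the association probability and that the loss of the disease diagnosis model is the sum of the three task losses. The dot-product scoring inside `relation_probability` and the binary cross-entropy form of `relation_loss` are assumptions, as are all function names.

```python
import math


def sigmoid(x):
    """Logistic sigmoid (simple form, adequate for moderate inputs)."""
    return 1.0 / (1.0 + math.exp(-x))


def relation_probability(disease_emb, symptom_emb):
    """Hypothetical learner: sigmoid of the embedding dot product."""
    score = sum(a * b for a, b in zip(disease_emb, symptom_emb))
    return sigmoid(score)


def relation_loss(probabilities, labels, eps=1e-12):
    """Mean binary cross-entropy over disease-symptom pairs (assumed form)."""
    total = 0.0
    for p, y in zip(probabilities, labels):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probabilities)


def total_loss(classification_loss, contrastive_loss, relation_loss_value):
    """Sum of the three task losses, following the wording of claim 1."""
    return classification_loss + contrastive_loss + relation_loss_value
```

An unweighted sum treats the three tasks equally; any per-task weighting would be a further assumption not found in the text.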

The foregoing describes only preferred embodiments of the present invention. Although the invention has been described above with reference to preferred embodiments, these embodiments do not limit the invention. A person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the methods and technical content disclosed above to make many possible variations and modifications to the technical solution, or to modify it into equivalent embodiments with equivalent changes. Accordingly, any simple modification, equivalent change, or modification made to the above embodiments on the basis of the technical idea of the present invention, without departing from the content of its technical solution, still falls within the scope of protection of the technical solution of the present invention.

Claims

A disease diagnosis prediction system based on a graph neural network, comprising:
a knowledge map construction module that constructs a disease-symptom knowledge map based on medical knowledge sources;
a data extraction and preprocessing module that extracts, from an electronic medical record system, patient electronic medical record data that includes patient disease diagnosis and symptom data and is stored in trigram format;
a disease diagnosis model construction module that performs graph neural network learning and predictive modeling on the disease-symptom knowledge map and the electronic medical record data; and
a disease diagnosis model application module that uses the disease diagnosis model to predict a disease diagnosis from the input symptoms of a new patient,
wherein the graph neural network learning and predictive modeling includes construction of a heterogeneous graph network and construction of the disease diagnosis model,
the heterogeneous graph network includes a disease-symptom subgraph constructed by extracting disease-symptom relationships from the disease-symptom knowledge map, and a patient-symptom subgraph constructed using the patient disease diagnosis and symptom data in trigram format,
the disease diagnosis model is composed of a graph encoder and a graph decoder,
the graph encoder is implemented on the basis of a graph convolutional neural network and takes as input the initial node embedding representations of diseases, symptoms, and patients obtained using a disease-symptom co-occurrence matrix, together with a disease-symptom adjacency matrix and a patient-symptom adjacency matrix; nodes of different types pass information along their connecting edges, and node embedding update operations yield the disease, symptom, and patient node embedding representations, which are input to the graph decoder,
the graph decoder performs multi-task learning using the node embedding representations, the multi-task learning including part a), multi-label hierarchical classification for patient disease diagnosis prediction; part b), disease contrastive learning; and part c), disease-symptom relationship learning,
in part a), a disease hierarchy relationship is constructed using the hierarchical structure of diseases, the disease hierarchy relationship including a disease layer for which diagnosis prediction is required and a disease system classification layer obtained from medical knowledge; the disease types of the disease layer are denoted [symbol] and the disease system classification layer is denoted [symbol], where [symbol] is the number of disease system classes; a multi-label hierarchical classifier containing [symbol] binary classifiers is constructed, the [symbol] binary classifiers being denoted [symbol] (where [symbol]) and satisfying [equation];
the node embedding representation of patient [symbol] is input into each of the [symbol] binary classifiers to obtain [symbol] predicted probabilities, denoted [symbol], each binary classifier being [equation];
the probability [symbol] that patient [symbol] develops disease [symbol] is obtained by [equation], where [symbol] is the probability, predicted by binary classifier [symbol], of whether the patient develops [symbol]; the system class of disease [symbol] is denoted [symbol]; and [symbol] is predicted by binary classifier [symbol] with respect to the patient and the disease system classification [text missing] is obtained by [equation], where [symbol] denotes the number of patients; [symbol] is the true label indicating whether patient [symbol] develops disease [symbol]; [symbol] is defined with respect to the patient [text missing]; [symbol] denotes the L1 norm; [symbol] is the similarity between disease [symbol] and disease [symbol], obtained by [equation]; [symbol] represent the true label distributions of disease [symbol] and disease [symbol], respectively, and satisfy [equation]; and [symbol] and [symbol] are the true labels indicating whether patient [symbol] develops disease [symbol] and disease [symbol], respectively,
in part b), a disease-pair system-class discriminator is constructed, the distance between the two diseases in a disease pair is calculated, and a loss function for the disease contrastive learning is designed,
in part c), a disease-symptom relationship learner is constructed, the probability that the disease and the symptom in a disease-symptom pair are associated is calculated, and a loss function for the disease-symptom relationship learning is designed, and
the loss function of the disease diagnosis model is obtained as the sum of the loss function of the multi-label hierarchical classification, the loss function of the disease contrastive learning, and the loss function of the disease-symptom relationship learning.

The disease diagnosis prediction system based on a graph neural network according to claim 1, wherein, in the knowledge map construction module, the disease-symptom knowledge map includes two node types, disease and symptom, and one relationship type, disease-symptom.

The disease diagnosis prediction system based on a graph neural network according to claim 1, wherein the heterogeneous graph network is constructed on the basis of the disease-symptom knowledge map and the electronic medical record data and includes three node types, disease, symptom, and patient; symptoms are intermediate nodes connecting diseases and patients; and the heterogeneous graph network integrates the relationship subgraph of the disease-symptom knowledge map that relates diseases and symptoms with the relationship subgraph of the electronic medical record data that relates patients and symptoms.

The disease diagnosis prediction system based on a graph neural network according to claim 1, wherein the heterogeneous graph network [symbol] is expressed as [equation]; the node set is expressed as [equation], where D, S, and P are a predetermined disease set, symptom set, and patient set, respectively, and [symbol] is expressed as [equation]; the set R includes [symbol], representing disease-symptom relationships, and [symbol], representing patient-symptom relationships; the disease-symptom relationships are stored in the disease-symptom adjacency matrix; and the patient-symptom relationships are stored in the patient-symptom adjacency matrix.

The disease diagnosis prediction system based on a graph neural network according to claim 4, wherein generation of the initial node embedding representations includes:
a process of constructing a disease-symptom co-occurrence matrix [symbol], in which the entry in row [symbol] and column [symbol] of matrix [symbol] is denoted [symbol] and represents the number of patients who, among the patients diagnosed with disease [symbol] in the electronic medical record data, developed symptom [symbol];
a process of row-normalizing [symbol] to obtain [symbol], in which the initial embedding representation of disease [symbol] is [symbol], namely row [symbol] of [symbol];
a process of column-normalizing [symbol] to obtain [symbol], in which the initial embedding representation of symptom [symbol] is [symbol], namely column [symbol] of [symbol]; and
a process of obtaining the initial embedding representation [symbol] of patient [symbol] by [equation], where [symbol] is the number of symptoms of patient [symbol].

The disease diagnosis prediction system based on a graph neural network according to claim 1, wherein the initial node embedding representations of the different node types are each input into one multilayer perceptron to obtain initial embedding representations of the same dimension, which are then input into the graph encoder.

The disease diagnosis prediction system based on a graph neural network according to claim 5, wherein, in the graph encoder,
for disease [symbol], the node embedding representation [symbol] at layer [symbol] is obtained by [equation];
for symptom [symbol], the node embedding representation [symbol] at layer [symbol] is obtained by [equation];
for patient [symbol], the node embedding representation [symbol] at layer [symbol] is obtained by [equation]; and
[symbol] is the activation function; [symbol] are, respectively, [text missing]; [symbol] are the node embedding representations of disease [symbol], symptom [symbol], and patient [symbol], respectively, at layer [symbol]; [symbol] denotes the set of symptom nodes adjacent to disease [symbol]; [symbol] denotes the set of disease nodes adjacent to symptom [symbol]; [symbol] denotes the set of patient nodes adjacent to symptom [symbol]; and [symbol] denotes the set of symptom nodes adjacent to patient [symbol].

The disease diagnosis prediction system based on a graph neural network according to claim 7, wherein, in the graph decoder, the disease contrastive learning includes:
combining the diseases in disease set D two at a time to obtain the disease pair set DD, in which the number of disease pairs is [equation], and, for any disease pair [symbol] in DD [text missing], setting [equation] when the two diseases belong to different system classes;
constructing a disease-pair system-class discriminator [symbol], inputting the node embedding representations [symbol] of the two diseases in disease pair [symbol] into [symbol], and obtaining the distance [symbol] between the two diseases by [equation], where [symbol] denotes the L2 norm; and
obtaining the loss function [symbol] of the disease contrastive learning by [equation], where m is the lower bound on the distance between the embedded representations of diseases of different system classes.

The disease diagnosis prediction system based on a graph neural network according to claim 7, wherein, in the graph decoder, the disease-symptom relationship learning includes:
selecting one disease and one symptom from disease set D and symptom set S, respectively, to obtain the disease-symptom pair set DS, in which the number of disease-symptom pairs is [equation], and, for any disease-symptom pair [symbol] in DS, setting the disease-symptom pair label to [equation] when the disease and the symptom are associated in the disease-symptom knowledge map and to [equation] when they are not;
constructing a disease-symptom relationship learner [symbol], inputting the node embedding representations [symbol] of the disease and the symptom in [symbol] into [symbol], and obtaining the probability [symbol] that the disease and the symptom in [symbol] are associated by [equation], where [symbol] denotes the sigmoid function; and
obtaining the loss function [symbol] of the disease-symptom relationship learning by [equation].