JP2013105465A

JP2013105465A - Medical synonym dictionary creating apparatus and medical synonym dictionary creating method

Info

Publication number: JP2013105465A
Application number: JP2011251137A
Authority: JP
Inventors: Kazutoyo Takada; 和豊高田; Kenji Kondo; 堅司近藤; Kazuki Kozuka; 和紀小塚
Original assignee: Panasonic Corp
Current assignee: Panasonic Corp
Priority date: 2011-11-16
Filing date: 2011-11-16
Publication date: 2013-05-30

Abstract

PROBLEM TO BE SOLVED: To provide a medical synonym dictionary creation apparatus that creates a medical synonym dictionary by correctly evaluating image similarity with respect to interpretation reports. SOLUTION: A synonym determination unit 120 (i) determines, based on an interpretation report, whether or not a keyword pair is a pair of synonyms; (ii) for each keyword constituting the keyword pair, weights each image feature quantity calculated from the medical image from which the keyword was created, based on binomial relation information that prescribes the relevance between each image feature quantity and the keyword, creates for each keyword an image feature quantity vector whose elements are the weighted image feature quantities, and compares the two image feature quantity vectors for the keyword pair to determine whether or not the keyword pair is a pair of synonyms; and (iii) when both determination results indicate synonyms, determines that the keyword pair selected by a keyword pair selection unit is a pair of synonyms.

Description

The present invention relates to a medical synonym dictionary creation apparatus and a medical synonym dictionary creation method for automatically creating a medical synonym dictionary for interpretation reports.

In recent years, the digitization of captured images and interpretation reports has progressed in the field of diagnostic imaging, making it easy for physicians to share large amounts of data. Here, an interpretation report is text information indicating the diagnosis that an interpreter made for a captured image. In other words, an interpretation report is document data describing the result of interpreting a medical image. Interpretation reports stored in a PACS (Picture Archiving and Communication System), a system for storing and communicating images, are managed by being linked to one another through common IDs and keywords, and effective secondary use of the stored past interpretation reports is in demand.

One effective secondary use of interpretation reports is text search over the reports. A general text search outputs, as search results, the interpretation reports that contain the same keyword as the search keyword. However, reports that have the same meaning but use a different notation are missed by the search. Therefore, to realize a more versatile text search, it is essential to create a synonym dictionary that links keywords having the same meaning.

As a conventional technique for creating such a synonym dictionary, Patent Document 1 proposes a method that exploits the fact that "the face image corresponding to a person's name is uniquely determined". The method extracts, from documents on the web, person names in various notations together with the face images to which those names are attached, and registers all person names attached to similar face images as synonyms (aliases). Because this method performs synonym processing based on image similarity, it can create a more accurate synonym dictionary than processing that uses text information alone.

Patent Document 1: JP 2010-128926 A

However, when the method described in Patent Document 1 is applied to interpretation reports in the field of diagnostic imaging, the image feature quantities related to each keyword differ from keyword to keyword. Consequently, simply using image similarity cannot correctly determine the synonym relationship between keywords.

For example, the keyword "clear margin" assigned to an image of a liver mass is related to image feature quantities concerning shape, such as edges, but not to image feature quantities concerning density. Conversely, the keyword "high absorption" is related to image feature quantities concerning density, but not to those concerning shape. For this reason, if the values of the shape and density feature quantities are used as they are when evaluating the synonym relationship between "clear margin" and "high absorption" by image similarity, the density feature quantities, which are unrelated to "clear margin", and the shape feature quantities, which are unrelated to "high absorption", are both included in the image similarity determination. As a result, image similarity cannot be evaluated correctly. To determine the synonym relationship between keywords in interpretation reports using images, it is therefore necessary to appropriately select, from among the image feature quantities, those related to each keyword.

The present invention has been made to solve the above problem. Its object is to provide a medical synonym dictionary creation apparatus and a medical synonym dictionary creation method that create a medical synonym dictionary after correctly evaluating image similarity with respect to interpretation reports, by selecting the image feature quantities that fit each keyword.

To solve the above problem, a medical synonym dictionary creation apparatus according to one aspect of the present invention includes: an acquisition unit that acquires a medical image and an interpretation report, which is document data describing the result of interpreting the medical image; a keyword extraction unit that refers to keyword dictionary data, in which keywords are registered that are either interpretation items (character strings indicating features of a medical image) or disease names (character strings indicating diagnosis results for a medical image), and extracts from the interpretation report acquired by the acquisition unit the keywords registered in the keyword dictionary data; a keyword pair selection unit that selects a keyword pair from the keywords extracted by the keyword extraction unit; a synonym determination unit that determines whether or not the keyword pair selected by the keyword pair selection unit is a pair of synonyms; and an output unit that outputs a keyword pair determined to be synonyms by the synonym determination unit as synonyms included in a medical synonym dictionary. The synonym determination unit (i) determines, based on the interpretation report, whether or not the keyword pair is a pair of synonyms; (ii) based on binomial relation information that predetermines the relevance between each image feature quantity extracted from a medical image and a keyword for the medical image, weights, for each keyword constituting the keyword pair, each image feature quantity calculated from the medical image from which the keyword was created, with a larger weight the higher the relevance between that image feature quantity and the keyword, thereby creating for each keyword an image feature quantity vector whose elements are the weighted image feature quantities, and compares the two image feature quantity vectors for the keyword pair to determine whether or not the keyword pair is a pair of synonyms; and (iii) when both determination results indicate synonyms, determines that the keyword pair selected by the keyword pair selection unit is a pair of synonyms.

With this configuration, two synonym determinations are performed: one based on the interpretation report and one based on the image feature quantities extracted from the medical images. In the latter, each image feature quantity extracted from a medical image is given a larger weight the higher its relevance to the keyword described in the interpretation report, and the weighted image feature quantities are then compared with each other. A medical synonym dictionary can therefore be created after correctly evaluating image similarity with respect to interpretation reports.
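The weighting and two-way decision described above can be sketched as follows. This is a minimal illustration, not the patented implementation: averaging the features over a keyword's cases, the Euclidean distance, the threshold value, and all function names are assumptions introduced here, since this passage only states that more relevant features receive larger weights and that the two vectors are compared.

```python
import math

def weighted_feature_vector(case_features, relevance):
    """Build one vector per keyword (hypothetical helper).

    case_features: list of per-case feature lists for the keyword.
    relevance: one weight per feature; larger means more relevant to the keyword.
    Assumption: features are averaged over cases before weighting.
    """
    n = len(case_features)
    means = [sum(case[d] for case in case_features) / n
             for d in range(len(relevance))]
    return [w * m for w, m in zip(relevance, means)]

def is_synonym(text_says_synonym, vec_a, vec_b, image_threshold):
    """Accept the pair only when both the text and image checks agree."""
    image_says_synonym = math.dist(vec_a, vec_b) <= image_threshold
    return text_says_synonym and image_says_synonym

# Features are [edge_sharpness, mean_density] (illustrative only).
# Both keywords are shape-related, so the density feature gets weight 0.
vec_a = weighted_feature_vector([[0.75, 0.25], [0.25, 0.75]], [1.0, 0.0])
vec_b = weighted_feature_vector([[0.5, 1.0], [0.5, 0.0]], [1.0, 0.0])
print(is_synonym(True, vec_a, vec_b, image_threshold=0.1))
```

Because the density feature is weighted to zero, the large density differences between the individual cases do not affect the comparison, which is the effect the passage attributes to relevance weighting.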

The present invention can be realized not only as a medical synonym dictionary creation apparatus including such characteristic processing units, but also as a medical synonym dictionary creation method whose steps are the processes executed by the characteristic processing units included in the apparatus. It can also be realized as a program that causes a computer to function as the characteristic processing units of the medical synonym dictionary creation apparatus, or as a program that causes a computer to execute the characteristic steps included in the medical synonym dictionary creation method. Needless to say, such a program can be distributed via a computer-readable non-volatile recording medium such as a CD-ROM (Compact Disc-Read Only Memory) or via a communication network such as the Internet.

According to the present invention, a medical synonym dictionary can be created after correctly evaluating image similarity with respect to interpretation reports.

Block diagram showing the characteristic functional configuration of the medical synonym dictionary creation apparatus in Embodiment 1 of the present invention.
Diagram showing an example of the case data stored in the case database in Embodiment 1 of the present invention.
Diagram showing an example of the keyword dictionary in Embodiment 1 of the present invention.
Flowchart showing the procedure for creating interpretation knowledge in Embodiment 1 of the present invention.
Flowchart showing the procedure for extracting image feature quantities in Embodiment 1 of the present invention.
Diagram showing an example of an interpretation report of an abdominal CT examination in Embodiment 1 of the present invention.
Diagram showing the interpretation items and disease names extracted from an interpretation report in Embodiment 1 of the present invention.
Diagram showing the interpretation items and disease names extracted from an interpretation report, together with the position and time-phase information extracted along with the interpretation items, in Embodiment 1 of the present invention.
Diagram showing the interpretation items and disease names extracted from an interpretation report, together with the position and time-phase information extracted along with the interpretation items by performing context interpretation, in Embodiment 1 of the present invention.
Diagram showing the set of data acquired for extracting interpretation knowledge in Embodiment 1 of the present invention.
Conceptual diagram of the correlation (binary) between interpretation items and image feature quantities in Embodiment 1 of the present invention.
Conceptual diagram of the correlation (multi-valued) between interpretation items and image feature quantities in Embodiment 1 of the present invention.
Conceptual diagram of the correlation (binary) between disease names and image feature quantities in Embodiment 1 of the present invention.
Conceptual diagram of the correlation (binary) between interpretation items and disease names in Embodiment 1 of the present invention.
Diagram showing the storage format of the (image feature quantity, interpretation item) correlations extracted as interpretation knowledge in Embodiment 1 of the present invention.
Diagram showing the storage format of the (image feature quantity, disease name) correlations extracted as interpretation knowledge in Embodiment 1 of the present invention.
Diagram showing the storage format of the (interpretation item, disease name) correlations extracted as interpretation knowledge in Embodiment 1 of the present invention.
Flowchart showing the overall flow of processing executed by the medical synonym dictionary creation apparatus in Embodiment 1 of the present invention.
Diagram showing an output example of the keyword extraction process (step S302 in FIG. 18) in Embodiment 1 of the present invention.
Conceptual diagram of the keyword vectors used in the synonym determination process (step S303 in FIG. 18) in Embodiment 1 of the present invention.
Flowchart showing an example of the detailed flow of the representative image vector generation process (step S305 in FIG. 18) in Embodiment 1 of the present invention.
Flowchart showing an example of the detailed flow of the representative image vector generation process (step S305 in FIG. 18) in Embodiment 1 of the present invention.
Block diagram showing the characteristic functional configuration of a medical synonym dictionary creation apparatus according to a modification of Embodiment 1 of the present invention.
Block diagram showing the characteristic functional configuration of the medical synonym dictionary creation apparatus in Embodiment 2 of the present invention.
Flowchart showing the overall flow of processing executed by the medical synonym dictionary creation apparatus in Embodiment 2 of the present invention.
Diagram showing the configuration of a system that uses the medical synonym dictionary database.

Embodiments of the present invention are described below with reference to the drawings. Each embodiment described below shows a preferred specific example of the present invention. The numerical values, components, connection forms of the components, steps, order of the steps, and the like shown in the following embodiments are examples and are not intended to limit the present invention. The present invention is limited only by the claims. Accordingly, among the components in the following embodiments, those not recited in the independent claims, which represent the broadest concept of the present invention, are not necessarily required to achieve the object of the present invention but are described as constituting more preferred forms.

The medical synonym dictionary creation apparatus according to the embodiments of the present invention creates a medical synonym dictionary for the keywords described in interpretation reports for medical images such as ultrasound images, CT (Computed Tomography) images, or nuclear magnetic resonance images. In this specification, "image data" is referred to simply as an "image".

A medical synonym dictionary creation apparatus according to one embodiment of the present invention includes: an acquisition unit that acquires a medical image and an interpretation report, which is document data describing the result of interpreting the medical image; a keyword extraction unit that refers to keyword dictionary data, in which keywords are registered that are either interpretation items (character strings indicating features of a medical image) or disease names (character strings indicating diagnosis results for a medical image), and extracts from the interpretation report acquired by the acquisition unit the keywords registered in the keyword dictionary data; a keyword pair selection unit that selects a keyword pair from the keywords extracted by the keyword extraction unit; a synonym determination unit that determines whether or not the keyword pair selected by the keyword pair selection unit is a pair of synonyms; and an output unit that outputs a keyword pair determined to be synonyms by the synonym determination unit as synonyms included in a medical synonym dictionary. The synonym determination unit (i) determines, based on the interpretation report, whether or not the keyword pair is a pair of synonyms; (ii) based on binomial relation information that predetermines the relevance between each image feature quantity extracted from a medical image and a keyword for the medical image, weights, for each keyword constituting the keyword pair, each image feature quantity calculated from the medical image from which the keyword was created, with a larger weight the higher the relevance between that image feature quantity and the keyword, thereby creating for each keyword an image feature quantity vector whose elements are the weighted image feature quantities, and compares the two image feature quantity vectors for the keyword pair to determine whether or not the keyword pair is a pair of synonyms; and (iii) when both determination results indicate synonyms, determines that the keyword pair selected by the keyword pair selection unit is a pair of synonyms.

Specifically, the synonym determination unit includes: a text determination unit that determines, based on the interpretation report, whether or not the keyword pair is a pair of synonyms; a representative image vector generation unit that, when the text determination unit has determined that the pair is a pair of synonyms, weights, based on the binomial relation information and for each keyword constituting the keyword pair, each image feature quantity calculated from the medical image from which the keyword was created, with a larger weight the higher the relevance between that image feature quantity and the keyword, thereby generating for each keyword an image feature quantity vector whose elements are the weighted image feature quantities; and an image determination unit that determines whether or not the keyword pair is a pair of synonyms by comparing the two image feature quantity vectors generated for the keyword pair by the representative image vector generation unit. The output unit outputs the keyword pair determined to be synonyms by the image determination unit as synonyms included in the medical synonym dictionary.

Alternatively, the synonym determination unit may include: a representative image vector generation unit that weights, based on the binomial relation information and for each keyword constituting the keyword pair, each image feature quantity calculated from the medical image from which the keyword was created, with a larger weight the higher the relevance between that image feature quantity and the keyword, thereby generating an image feature quantity vector for each keyword; an image determination unit that determines whether or not the keyword pair is a pair of synonyms by comparing the two image feature quantity vectors generated for the keyword pair by the representative image vector generation unit; and a text determination unit that, when the image determination unit has determined that the pair is a pair of synonyms, determines, based on the interpretation report, whether or not the keyword pair is a pair of synonyms. In this case, the output unit outputs the keyword pair determined to be synonyms by the text determination unit as synonyms included in the medical synonym dictionary.
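The two orderings described above (text determination first, or image determination first) accept the same keyword pairs, since a pair must pass both checks either way; they differ only in how often the second check has to run. The following sketch illustrates this with hypothetical lookup sets standing in for the real text and image determinations; the function name and the sample data are assumptions made here.

```python
def cascaded_synonym_filter(pairs, first_check, second_check):
    """Run second_check only on pairs that first_check accepts.

    Returns the accepted pairs and the number of times second_check ran.
    """
    accepted = []
    second_calls = 0
    for pair in pairs:
        if not first_check(pair):
            continue  # rejected early; the second check is skipped
        second_calls += 1
        if second_check(pair):
            accepted.append(pair)
    return accepted, second_calls

# Hypothetical outcomes of the two determinations for three candidate pairs.
pairs = [("a", "b"), ("a", "c"), ("b", "c")]
text_ok = {("a", "b"), ("a", "c")}
image_ok = {("a", "b")}

text_first = cascaded_synonym_filter(pairs, text_ok.__contains__, image_ok.__contains__)
image_first = cascaded_synonym_filter(pairs, image_ok.__contains__, text_ok.__contains__)
print(text_first, image_first)
```

In practice one would presumably place the cheaper or more selective check first, but the text above does not state which ordering is preferred.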

The text determination unit may, for each keyword constituting the keyword pair, create a keyword vector whose elements are the occurrence frequencies of the keywords other than that keyword in the sentences of the interpretation reports that contain that keyword, and determine that the keyword pair is a pair of synonyms if the distance between the created keyword vectors is equal to or less than a first threshold.
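The keyword-vector comparison described above can be sketched as follows. This is an illustration under stated assumptions: sentences are represented as already-extracted keyword lists, the distance is Euclidean on raw counts, both members of the pair are left out of the vector dimensions, and the keyword "well-defined margin" and the threshold value are hypothetical examples not taken from the text.

```python
import math
from collections import Counter

def keyword_vector(target, sentences, vocabulary):
    """Count the other keywords appearing in sentences that contain target."""
    counts = Counter()
    for keywords in sentences:
        if target in keywords:
            counts.update(k for k in keywords if k != target)
    return [counts[k] for k in vocabulary]

def text_synonym(kw_a, kw_b, sentences, dictionary, first_threshold):
    """Judge a pair as synonyms when their context vectors are close."""
    # Assumption: the pair members themselves are not used as dimensions.
    vocabulary = [k for k in dictionary if k not in (kw_a, kw_b)]
    vec_a = keyword_vector(kw_a, sentences, vocabulary)
    vec_b = keyword_vector(kw_b, sentences, vocabulary)
    return math.dist(vec_a, vec_b) <= first_threshold

dictionary = ["clear margin", "well-defined margin", "high absorption", "liver", "cyst"]
sentences = [
    ["clear margin", "liver", "cyst"],
    ["well-defined margin", "liver", "cyst"],
    ["high absorption", "liver"],
]
print(text_synonym("clear margin", "well-defined margin", sentences, dictionary, 0.5))
print(text_synonym("clear margin", "high absorption", sentences, dictionary, 0.5))
```

The first pair shares identical contexts and passes; the second differs in one co-occurring keyword and exceeds the illustrative threshold.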

When a keyword constituting the keyword pair is an interpretation item, the representative image vector generation unit may generate the image feature quantity vector of the keyword by weighting each image feature quantity calculated from the medical image from which the keyword was created, based on binomial relation information that predetermines the relevance between each image feature quantity extracted from a medical image and an interpretation item for the medical image, with a larger weight the higher the relevance between that image feature quantity and the keyword, which is an interpretation item.

With this configuration, the image feature quantities relevant to the interpretation item can be given large weights.

When a keyword constituting the keyword pair is a disease name, the representative image vector generation unit may generate the image feature quantity vector of the keyword by weighting each image feature quantity calculated from the medical image from which the keyword was created, based on binomial relation information that predetermines the relevance between each image feature quantity extracted from a medical image and a disease name for the medical image, with a larger weight the higher the relevance between that image feature quantity and the keyword, which is a disease name.

With this configuration, the image feature quantities relevant to the disease name can be given large weights.

When a keyword constituting the keyword pair is an interpretation item, the representative image vector generation unit may (i) weight each image feature quantity calculated from the medical image from which the keyword was created, based on binomial relation information that predetermines the relevance between each image feature quantity extracted from a medical image and an interpretation item for the medical image, with a larger weight the higher the relevance between that image feature quantity and the keyword, which is an interpretation item; and (ii) detect, in the interpretation report, a disease name that co-occurs with the keyword and, based on binomial relation information that predetermines the relevance between interpretation items and disease names, further weight each image feature quantity with a larger weight the higher the relevance between the keyword, which is an interpretation item, and the co-occurring disease name, thereby generating an image feature quantity vector of the keyword whose elements are the weighted image feature quantities.

With this configuration, cases having low relevance to the interpretation item receive small weights, so an image feature quantity vector can be generated from which the cases having low relevance to the interpretation item have been removed. Image similarity can thus be evaluated more correctly, and the accuracy of the medical synonym dictionary can be improved.
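The two-level weighting described above can be sketched as follows: each feature is scaled by its relevance to the interpretation item, and each case is additionally scaled by the relevance between the interpretation item and the disease name that co-occurs in that case's report. This is an assumption-laden illustration: combining the cases as a weighted average, the function name, the disease names, and the numeric values are all hypothetical.

```python
def representative_vector(cases, feature_relevance, disease_relevance):
    """Combine per-case features into one vector for an interpretation item.

    cases: list of (features, co_occurring_disease_name) tuples.
    feature_relevance: weight per feature (feature vs. interpretation item).
    disease_relevance: weight per disease name (item vs. disease name).
    Assumption: cases are combined as a weighted average.
    """
    total = sum(disease_relevance[disease] for _, disease in cases)
    if total == 0:
        raise ValueError("no case has a relevant co-occurring disease name")
    vector = [0.0] * len(feature_relevance)
    for features, disease in cases:
        case_weight = disease_relevance[disease]
        for i, feature_weight in enumerate(feature_relevance):
            vector[i] += case_weight * feature_weight * features[i]
    return [v / total for v in vector]

cases = [
    ([0.5, 0.25], "hepatic cyst"),  # disease name relevant to the item
    ([1.0, 1.0], "hemangioma"),     # disease name with zero relevance
]
vector = representative_vector(
    cases,
    feature_relevance=[1.0, 0.5],
    disease_relevance={"hepatic cyst": 1.0, "hemangioma": 0.0},
)
print(vector)
```

The second case contributes nothing to the result because its co-occurring disease name has zero relevance, mirroring the removal of low-relevance cases described above.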

Preferably, the keyword pair selection unit selects only keyword pairs consisting of two interpretation items or of two disease names.

Because a disease name is a superordinate concept covering a plurality of diagnostic items, a disease name and a diagnostic item are never direct synonyms. Processing time can therefore be reduced by not selecting pairs that consist of a disease name and a diagnostic item.
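The pair selection described above can be sketched as a filter over all keyword combinations that keeps only pairs of the same type. The type labels "item" and "disease", the function name, and the sample keywords are assumptions made for illustration.

```python
from itertools import combinations

def select_pairs(keyword_types):
    """Return keyword pairs whose members share the same type.

    keyword_types: dict mapping keyword -> "item" or "disease".
    """
    return [
        (a, b)
        for a, b in combinations(sorted(keyword_types), 2)
        if keyword_types[a] == keyword_types[b]
    ]

keyword_types = {
    "clear margin": "item",
    "high absorption": "item",
    "hepatic cyst": "disease",
    "hemangioma": "disease",
}
selected = select_pairs(keyword_types)
print(selected)
```

With four keywords there are six possible pairs, but only two survive the filter, so the later synonym determinations run on a third of the candidates in this small example.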

The medical synonym dictionary creation apparatus described above may further include a storage unit that stores the keyword pairs output by the output unit as synonyms included in the medical synonym dictionary.

Preferably, the acquisition unit acquires the medical image and the interpretation report from a case database that stores case data, each item of which is a set of a medical image and the interpretation report for that medical image. The medical synonym dictionary creation apparatus further includes an update control unit that determines whether or not the case data stored in the case database has been updated and, upon determining that the case data has been updated, operates the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit to update the synonyms included in the medical synonym dictionary.

With this configuration, the medical synonym dictionary can be updated automatically even when the case data stored in the case database is updated, which enables searches that use a more versatile medical synonym dictionary.

Upon determining that the case data has been updated, the update control unit may operate the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit so as to update the synonyms for all keywords included in the medical synonym dictionary.

Alternatively, the update control unit may (i) calculate the occurrence frequency of each keyword in the case data stored in the case database and (ii) upon determining that the case data has been updated, operate the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit so as to update the synonyms only for keywords whose occurrence frequency is equal to or less than a second threshold.

When new occurrences of a high-frequency keyword are added, re-running the synonym determination would not change the result, so there is little need to update the medical synonym dictionary for that keyword. For a keyword with a low appearance frequency, on the other hand, the synonym relationship is still highly uncertain, so the need to update the dictionary is high. Deciding whether to update the synonym dictionary according to the keyword frequency in the case database in this way reduces the amount of computation at update time and therefore shortens the update time.
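The frequency-based selection described above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the input layout (one keyword list per interpretation report), and the choice to count each keyword once per report are all hypothetical and are not specified in this document.

```python
from collections import Counter


def keywords_to_update(reports_keywords, second_threshold):
    """Return the keywords whose synonyms should be re-evaluated.

    reports_keywords: one list of extracted keywords per interpretation report
                      (hypothetical input layout).
    second_threshold: the "second threshold"; keywords whose appearance
                      frequency is at or below it are selected.
    """
    # Assumption: a keyword is counted once per report in which it appears.
    frequency = Counter(kw for kws in reports_keywords for kw in set(kws))
    return sorted(kw for kw, n in frequency.items() if n <= second_threshold)
```

Only the returned low-frequency keywords would then be passed through the pair selection and synonym determination again, which is where the reduction in computation comes from.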

A medical synonym dictionary creation method according to another embodiment of the present invention includes: an acquisition step of acquiring a medical image and an interpretation report, which is document data describing the result of interpreting the medical image; a keyword extraction step of referring to keyword dictionary data, in which keywords are registered, each keyword being either an interpretation item (a character string indicating a feature of a medical image) or a disease name (a character string indicating a diagnosis result for a medical image), and extracting the keywords registered in the keyword dictionary data from the interpretation report acquired in the acquisition step; a keyword pair selection step of selecting a keyword pair from the keywords extracted in the keyword extraction step; a synonym determination step of determining whether the keyword pair selected in the keyword pair selection step is a pair of synonyms; and an output step of outputting a keyword pair determined to be synonyms in the synonym determination step as synonyms to be included in a medical synonym dictionary. In the synonym determination step, (i) whether the keyword pair is a pair of synonyms is determined based on the interpretation report; (ii) based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and a keyword for that medical image, each image feature amount calculated from the medical images from which a keyword of the pair was derived is weighted, for each keyword constituting the pair, with a larger value the higher the relevance between that image feature amount and that keyword, an image feature amount vector having the weighted image feature amounts as its elements is created for each keyword, and the two image feature amount vectors for the keyword pair are compared to determine whether the keyword pair is a pair of synonyms; and (iii) when both determination results indicate synonyms, the keyword pair selected in the keyword pair selection step is determined to be synonyms.

A program according to still another embodiment of the present invention is a program for causing a computer to execute each step included in the medical synonym dictionary creation method described above.

(Embodiment 1)
Terms used in this embodiment are explained below.

An "image feature amount" is a quantity that describes, for example, the shape of an organ or lesion in a medical image, or its luminance distribution. For instance, the non-patent document "Nemoto, Shimizu, Hagihara, Kobatake, Nawano, 'Improvement of mass shadow discrimination accuracy on mammograms by feature selection from a large number of features and proposal of a fast feature selection method', IEICE Transactions D-II, Vol. J88-D-II, No. 2, pp. 416-426, February 2005" describes the use of 490 kinds of feature amounts. This embodiment likewise uses several tens to several hundreds of kinds of image feature amounts, predetermined for each medical imaging apparatus (modality) used or for each target organ.

A "keyword" is either an "interpretation item" or a "disease name", both of which are defined below.

In this embodiment, an "interpretation item" is defined as "a character string in which the interpreting physician verbalizes a feature of the image being interpreted". The terms in use are largely fixed by the medical imaging apparatus, the target organ, and so on. Examples include lobulated, spiculated, irregular, well-defined margin, ill-defined contour, low/high density, low/high absorption, ground-glass, calcification, mosaic, enhancement, hypo/hyperechoic, and fuzzy margin.

A "disease name" is the name of the disease diagnosed by the image interpreter on the basis of medical images and other examinations. Examples include hepatocellular carcinoma, cyst, and hemangioma.

(Embodiment 1: Description of the configuration)
The medical synonym dictionary creation device according to Embodiment 1 of the present invention is described in detail below with reference to the drawings.

FIG. 1 is a block diagram showing the characteristic functional configuration of the medical synonym dictionary creation device according to Embodiment 1 of the present invention.

As shown in FIG. 1, the medical synonym dictionary creation device 100 creates a medical synonym dictionary, that is, a synonym dictionary of the keywords extracted from the interpretation reports stored in the case database 101. In this specification, "synonym" is not limited to words with identical meanings; it also covers near-synonyms, that is, words with similar meanings. The medical synonym dictionary creation device according to the present invention can therefore also be used as a medical near-synonym dictionary creation device.

The medical synonym dictionary creation device 100 includes an acquisition unit 104, a keyword extraction unit 105, a keyword pair selection unit 106, a synonym determination unit 120, an output unit 121, and a storage unit 110.

The synonym determination unit 120 includes a text determination unit 107, a representative image vector generation unit 108, and an image determination unit 109.

The medical synonym dictionary creation device 100 is connected to an external case database 101, a keyword dictionary 102, an interpretation knowledge database 103, and a display device 122.

The case database 101, the keyword dictionary 102, the interpretation knowledge database 103, and the components of the medical synonym dictionary creation device 100 shown in FIG. 1 are described in detail below, in that order.

The case database 101 is a storage device composed of, for example, a hard disk or memory. It stores case data, each item of which consists of a medical image (the image to be presented to the image interpreter for interpretation) and the interpretation report corresponding to that medical image. Here, a medical image is image data used for diagnostic imaging and stored on an electronic medium. An interpretation report is information that covers not only the result of interpreting the medical image but also the definitive diagnosis result, such as a biopsy performed after the diagnostic imaging. The interpretation report is document data (text data). A biopsy is an examination in which part of the affected area is excised and examined, for example under a microscope.

FIG. 2 shows an example of a CT image as a medical image 20 and an example of an interpretation report 21, which together constitute case data stored in the case database 101. The interpretation report 21 includes an interpretation report ID 22, an image ID 23, an image finding 24, and a definitive diagnosis result 25. Each item of case data is created from a single patient.

The interpretation report ID 22 is an identifier for the interpretation report 21 and differs for each interpretation report 21. The image ID 23 is an identifier for the medical image 20 and differs for each medical image 20. The image finding 24 is information indicating the image interpreter's diagnosis for the medical image 20 with the image ID 23. In other words, the image finding 24 indicates the diagnosis result (interpretation result), including the disease name, and the reason for the diagnosis (reason for the interpretation). The definitive diagnosis result 25 indicates the definitive diagnosis for the patient of the medical image 20. Here, a definitive diagnosis result is a diagnosis that establishes the true condition of the patient, obtained by microscopic pathological examination of a specimen taken during surgery or biopsy, or by various other means.
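As a reading aid, the case data layout described above could be modelled roughly as below. This is a minimal sketch, not the device's actual storage format: the class names, field names, and the use of a file path for the medical image are assumptions introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InterpretationReport:
    report_id: str             # interpretation report ID 22, unique per report
    image_id: str              # image ID 23, unique per medical image
    image_finding: str         # image finding 24: diagnosis and its reasoning
    definitive_diagnosis: str  # definitive diagnosis result 25


@dataclass(frozen=True)
class CaseData:
    image_path: str               # where medical image 20 is stored (hypothetical)
    report: InterpretationReport  # interpretation report 21 for that image


def reports_for_image(cases, image_id):
    """Return the interpretation reports linked to one medical image."""
    return [c.report for c in cases if c.report.image_id == image_id]
```

Keeping the image ID inside the report mirrors the description, in which the report is what ties the finding text to a specific image.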

The keyword dictionary 102 is a storage device composed of, for example, a hard disk or memory.

The keyword dictionary 102 is a database that stores the keywords (keyword dictionary data) to be extracted from the interpretation reports 21. FIG. 3 shows an example of the keywords stored in the keyword dictionary 102. As shown in FIG. 3, the keyword dictionary 102 stores keyword names 30 and keyword attributes 31 in list form. Here, the keyword attribute 31 is data indicating whether the keyword with the keyword name 30 is an interpretation item or a disease name. For example, the keyword attribute of the keyword "enhancement" is "interpretation item".

The interpretation knowledge database 103 is a storage device composed of, for example, a hard disk or memory.

The interpretation knowledge database 103 stores binomial relationship information indicating the correlation (relevance) between keywords extracted from the interpretation reports 21, and binomial relationship information indicating the correlation (relevance) between a keyword and the image feature amounts extracted from the medical images 20. The binomial relationship information is created automatically from the data in the case database 101. The structure of this database and the method of creating it are described later.

The acquisition unit 104 acquires, from the case database 101, a medical image 20 that an image interpreter has diagnosed and the corresponding interpretation report 21. The acquisition unit 104 outputs the acquired medical image 20 and interpretation report 21 to the keyword extraction unit 105.

The keyword extraction unit 105 refers to the keyword dictionary 102, extracts from the interpretation report 21 acquired by the acquisition unit 104 the keywords that are registered in the keyword dictionary 102, compiles the extracted keywords into a list, and outputs the list to the keyword pair selection unit 106. The specific keyword extraction method is described later.
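A dictionary-driven extraction of this kind can be sketched as follows. The sketch assumes plain case-insensitive substring matching, which is a simplification chosen here for brevity and not the extraction method of this document; the dictionary contents and the function name are likewise illustrative.

```python
# Hypothetical keyword dictionary: keyword name -> keyword attribute.
KEYWORD_DICTIONARY = {
    "calcification": "interpretation item",
    "mosaic": "interpretation item",
    "hepatocellular carcinoma": "disease name",
    "cyst": "disease name",
}


def extract_keywords(report_text, dictionary):
    """List (keyword, attribute) pairs found in a report, in order of appearance.

    Assumption: case-insensitive substring matching stands in for a real
    linguistic analysis of the report text.
    """
    text = report_text.lower()
    found = [(text.find(name), name) for name in dictionary if name in text]
    return [(name, dictionary[name]) for _, name in sorted(found)]
```

Substring matching would misfire on negated or partial mentions, so a production system would need the linguistic analysis that the description defers to a later section.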

The keyword pair selection unit 106 selects a not-yet-selected keyword pair from the keyword list produced by the keyword extraction unit 105 and outputs the selected pair to the text determination unit 107, the representative image vector generation unit 108, and the output unit 121.
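Selecting each unselected pair exactly once amounts to enumerating the unordered pairs of distinct keywords. A minimal sketch, assuming duplicates in the keyword list are collapsed first and that the order of enumeration does not matter (neither point is stated in this document):

```python
from itertools import combinations


def keyword_pairs(keyword_list):
    """Yield every unordered pair of distinct keywords exactly once."""
    unique = sorted(set(keyword_list))  # collapse duplicates (assumption)
    yield from combinations(unique, 2)
```

Because the function is a generator, the caller can pull one pair at a time, which matches the select, judge, and repeat flow of the device.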

The synonym determination unit 120 determines whether the keyword pair selected by the keyword pair selection unit 106 is a pair of synonyms.

Specifically, the text determination unit 107 performs a synonym determination on the keyword pair selected by the keyword pair selection unit 106 on the basis of the interpretation reports 21 and, when it determines that the pair is a pair of synonyms, outputs the determination result to the representative image vector generation unit 108. The specific synonym determination method is described later.

When the text determination unit 107 has determined that the keyword pair selected by the keyword pair selection unit 106 is a pair of synonyms, the representative image vector generation unit 108 generates a representative image vector for each keyword of the pair and outputs the vectors to the image determination unit 109. To do so, it uses the keyword pair acquired from the keyword pair selection unit 106, the binomial relationship information stored in the interpretation knowledge database 103, and the medical images 20 stored in the case database 101. Here, a representative image vector is a vector of image feature amounts calculated for the group of medical images to which a given keyword is attached; the weights for the image feature amounts, which are calculated for each keyword and stored in the interpretation knowledge database 103, are applied to this vector.

The reason for using representative image vectors to create the medical synonym dictionary is as follows.

An interpretation report 21 is text written about a medical image 20 in accordance with medically standardized diagnostic guidelines, so keywords that are synonyms of each other correspond to the same image features. The synonym relationship between keywords can therefore be evaluated by the similarity of the images. That is, by re-evaluating, on the basis of image similarity, the synonym relationships obtained from text alone, a medical synonym dictionary can be created that is more accurate than one based on text alone.

However, when evaluating the synonym relationships of keywords in the interpretation reports 21, image similarity cannot be evaluated using the same image feature amounts for every keyword, because the relevant image feature amounts differ from keyword to keyword. For example, the keyword "clear margin" is related to image feature amounts that describe shape, such as edges, whereas the keyword "high absorption" is related to image feature amounts that describe density. If the synonym relationship between "clear margin" and "high absorption" were evaluated by image similarity using the raw values of the shape and density feature amounts, the density feature amounts, which are unrelated to "clear margin", and the shape feature amounts, which are unrelated to "high absorption", would both enter the similarity determination, and the image similarity could not be evaluated correctly.

The representative image vector generation unit 108 therefore generates the representative image vector by weighting the image feature amounts calculated from the images to which each keyword is attached with the values, stored in the interpretation knowledge database 103, that indicate the relevance between the keyword and each image feature amount.

As a result, a large weight is given to the shape information for images labeled "clear margin" and to the density information for images labeled "high absorption". Image similarity can thus be evaluated correctly, which makes it possible to determine synonyms between keywords on the basis of image similarity.
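The weighting idea can be illustrated with a small sketch. Several choices below are assumptions made only for illustration and are not taken from this document: the per-keyword vector is the mean over that keyword's images, the comparison uses cosine similarity, and the threshold value is arbitrary.

```python
import math


def representative_vector(feature_vectors, weights):
    """Weighted representative vector for one keyword.

    feature_vectors: one feature list per image carrying the keyword.
    weights: keyword-to-feature relevance values, one per feature.
    Assumption: the images are summarised by their element-wise mean.
    """
    n = len(feature_vectors)
    mean = [sum(column) / n for column in zip(*feature_vectors)]
    return [w * m for w, m in zip(weights, mean)]


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def is_synonym_by_image(vec_a, vec_b, threshold=0.9):
    """Image-based judgement; the threshold 0.9 is a hypothetical value."""
    return cosine_similarity(vec_a, vec_b) >= threshold
```

With relevance weights near zero for unrelated features, a density feature contributes almost nothing to the vector of a shape-related keyword, which is the effect the paragraph above describes.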

The specific method of generating the representative image vector is described later.

The image determination unit 109 uses the representative image vectors generated by the representative image vector generation unit 108 to re-determine whether the keyword pair selected by the keyword pair selection unit 106 is a pair of synonyms, and outputs the determination result to the output unit 121.

Of the keyword pairs acquired from the keyword pair selection unit 106, the output unit 121 writes those determined to be synonyms by the image determination unit 109 to the storage unit 110 as synonyms to be included in the medical synonym dictionary, or displays them on the display device 122.
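Taken together, the units described above form a two-stage filter: a pair reaches the image-based judgement only if the text-based judgement accepted it, and only pairs accepted by both are output. A minimal sketch of that control flow, with the two judgements passed in as hypothetical callables:

```python
def build_synonym_dictionary(pairs, text_judge, image_judge):
    """Collect the keyword pairs accepted by both judgements.

    text_judge, image_judge: callables taking a keyword pair and returning
    True or False; they stand in for the text determination unit 107 and
    the image determination unit 109 (illustrative stand-ins only).
    """
    synonyms = []
    for pair in pairs:
        if not text_judge(pair):
            continue  # rejected on text alone; no image vectors are built
        if image_judge(pair):
            synonyms.append(pair)  # accepted by both judgements
    return synonyms
```

Running the cheaper text judgement first means representative image vectors are generated only for the pairs that survive it.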

Next, the method of creating the interpretation knowledge database 103 and the operation of the medical synonym dictionary creation device 100 configured as above are described in that order.

(Embodiment 1: Advance creation of the interpretation knowledge database 103)
Before the medical synonym dictionary is created, interpretation knowledge is obtained in advance and stored in the interpretation knowledge database 103. Interpretation knowledge is obtained from a collection of multiple "cases" (case data), each consisting of a medical image paired with the interpretation report that records the result of interpreting it. The cases may be those stored in the case database 101 or those stored in another database. The number of cases required is the number sufficient to obtain some kind of regularity and knowledge using various data mining algorithms; typically several hundred to several tens of thousands of data items are used. In this embodiment, the interpretation knowledge consists of the correlations between pairs of the following three terms: (1) image feature amounts, (2) interpretation items, and (3) disease names. The disease name diagnosed at the time of interpretation may differ from the disease name established as the definitive diagnosis after other examinations; when the interpretation knowledge database is created, the definitive diagnosis is used.

The procedure for creating interpretation knowledge is described below with reference to the flowchart in FIG. 4. The medical imaging apparatus targeted, that is, used, in this embodiment is a multi-slice CT scanner, and the target organ and disease are the liver and liver masses, respectively.

In step S101, one case is acquired from the database storing the cases from which interpretation knowledge is to be obtained. Let C be the total number of such cases. Each case consists of a medical image paired with the interpretation report that records the result of interpreting it. When the medical image has been acquired with a multi-slice CT scanner, a single case contains a large number of slice images. In addition, when a physician interprets multi-slice CT images, one to several important slice images are usually attached to the interpretation report as key images. In the following, a set of many slice images, or a set of several key images, may be referred to simply as a "medical image" or an "image".

In step S102, image feature amounts are extracted from the medical image. The processing of step S102 is described in detail with reference to the flowchart in FIG. 5.

In step S201, the region of the target organ is extracted. In this embodiment, the liver region is extracted. As the liver region extraction method, a technique such as the one in the following non-patent document can be used: "Tanaka, Shimizu, Kobatake, 'Improvement of a liver region extraction method considering the density pattern of abnormal regions <Second report>', IEICE Technical Report, Medical Imaging, 104(580), pp. 7-12, January 2005".

In step S202, lesion regions are extracted from the organ region extracted in step S201. In this embodiment, mass regions are extracted from the liver region. As the liver mass region extraction method, a technique such as the one in the following non-patent document can be used: "Nakagawa, Shimizu, Hitosugi, Kobatake, 'Development of an automatic extraction method for liver mass shadows from three-dimensional abdominal CT images <Second report>', Medical Imaging, 102(575), pp. 89-94, January 2003". Letting M_i be the number of masses extracted from the images of the i-th case, each mass can be identified by the pair (i, j) of (case number, mass number), where 1 ≤ i ≤ C and 1 ≤ j ≤ M_i. Because this embodiment targets liver masses as the lesions, the term "mass number" is used here; using the general wording of the present invention, it can also be called a "lesion number".

In step S203, one of the lesion regions extracted in step S202 is selected.

In step S204, image feature amounts are extracted from the lesion region selected in step S203. In this embodiment, several feature amounts that are also applicable to liver masses are selected from the 490 kinds of feature amounts described in the non-patent document "Nemoto, Shimizu, Hagihara, Kobatake, Nawano, 'Improvement of mass shadow discrimination accuracy on mammograms by feature selection from a large number of features and proposal of a fast feature selection method', IEICE Transactions D-II, Vol. J88-D-II, No. 2, pp. 416-426, February 2005", and these are used as the image feature amounts. Let N_F be the number of these feature amounts. A feature amount extracted in this step can be identified by the triple (i, j, k) of (case number, number of the mass extracted from this case (medical image), feature amount number), where 1 ≤ i ≤ C, 1 ≤ j ≤ M_i, and 1 ≤ k ≤ N_F.

In step S205, it is checked whether any of the lesion regions extracted in step S202 has not yet been selected. If an unselected lesion remains, the process returns to step S203, an unselected lesion region is selected, and step S204 is executed again. If no unselected lesion remains, that is, if the feature extraction of step S204 has been performed for every lesion region extracted in step S202, the processing of the flowchart in FIG. 5 ends and the process returns to the flowchart in FIG. 4.
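The loop formed by steps S201 to S205 can be sketched as below. The three callables are hypothetical placeholders for the organ extraction, lesion extraction, and feature calculation methods cited above; their names and signatures are assumptions made for this illustration.

```python
def extract_case_features(image, extract_organ, extract_lesions, extract_features):
    """Return one feature list per lesion found in the image of one case.

    extract_organ, extract_lesions, extract_features: hypothetical callables
    standing in for the processing of steps S201, S202, and S204.
    """
    organ_region = extract_organ(image)             # S201: target organ region
    lesion_regions = extract_lesions(organ_region)  # S202: lesion regions
    features = []
    for lesion in lesion_regions:                   # S203: select one lesion
        features.append(extract_features(lesion))   # S204: N_F feature values
    return features                                 # S205: none left unselected
```

The index of a lesion in the returned list, plus one, corresponds to the lesion number j, and the position of a value inside each inner list, plus one, to the feature amount number k.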

In step S103 of FIG. 4, the interpretation report is analyzed. Specifically, the interpretation items and the disease name are extracted from the interpretation report. In this embodiment, morphological analysis and syntactic analysis are performed using an interpretation item word dictionary, in which interpretation items are stored, and a disease name word dictionary, in which disease names are stored. This processing extracts the words that match words stored in each word dictionary. Available morphological analysis tools include MeCab (http://mecab.sourceforge.net) and ChaSen (http://chasen-legacy.sourceforge.jp); available syntactic analysis tools include KNP (http://nlp.kuee.kyoto-u.ac.jp/nl-resource/knp.html) and CaboCha (http://chasen.org/~taku/software/cabocha/). Because physicians often write interpretation reports in expressions peculiar to such reports, it is desirable to develop morphological analysis techniques, syntactic analysis techniques, and word dictionaries specialized for interpretation reports.

FIG. 6 is an example of an interpretation report for an abdominal CT examination, and FIG. 7 shows the interpretation items and disease name extracted from the interpretation report in FIG. 6. Typically, several interpretation items and one disease name are extracted. Letting N_i be the number of interpretation items extracted from the interpretation report of the i-th case, each interpretation item can be identified by the pair (i, j) of (case number, interpretation item number), where 1 ≤ i ≤ C and 1 ≤ j ≤ N_i.

In FIG. 7, only the words for the interpretation items and the disease name are extracted, but character strings in the interpretation report that indicate the position of the lesion and the time phase may be extracted at the same time. A note on time phases: for differentiating liver lesions, contrast-enhanced examination, in which a contrast agent is injected rapidly by the intravenous route and images are taken over time, is considered useful. In a contrast-enhanced examination of the liver, the liver is generally imaged in phases such as the arterial phase, in which the contrast agent flows into the hepatic artery and hypervascular tumors enhance; the portal venous phase, in which contrast agent distributed to the intestines and spleen flows from the portal vein into the liver and the liver parenchyma is most strongly enhanced; the equilibrium phase, in which the contrast agent inside and outside the hepatic vessels reaches equilibrium; and the delayed phase, in which the contrast agent is retained in the hepatic interstitium. Interpretation reports often describe the position of the lesion within the organ and, for contrast-enhanced examinations, the time phase of interest. Extracting the position and time phase information together with the interpretation items is therefore useful for the extraction of interpretation knowledge described later. FIG. 8 shows an example in which position and time phase information is extracted together with the interpretation items. For example, by analyzing the interpretation report in FIG. 6, "liver segment S3" is extracted as the position attribute of "early enhancement" from the clause "early enhancement is seen in liver segment S3". Similarly, "late phase" is extracted as the time phase attribute of "washout" from the clause "washed out in the late phase".

図６の読影レポートを、単純に解釈すると、図８のように「早期濃染」に関する時相、ｗａｓｈｏｕｔに関する位置の部分が空白になる。これに対し、読影項目「早期濃染」は早期相に対応した単語であるという事前知識を利用したり、「早期濃染」の状態を示す腫瘤と「後期相でｗａｓｈｏｕｔ」される腫瘤が同一の腫瘤を指すという高度な文脈解釈を行ったりすることができれば、抽出される位置と時相の情報は図９のようになる。 If the interpretation report of FIG. 6 is interpreted naively, the time phase for “early dark staining” and the position for “washout” are left blank, as shown in FIG. 8. In contrast, if prior knowledge that the interpretation item “early dark staining” is a word corresponding to the early phase can be used, or if an advanced contextual interpretation can be made that the mass showing “early dark staining” and the mass that is “washed out in the late phase” are the same mass, the extracted position and time phase information becomes as shown in FIG. 9.

ステップＳ１０４では、読影知識を得るための症例が格納されたデータベースにおいて未取得の症例があるかどうかをチェックし、未取得の症例がある場合は、ステップＳ１０１に戻り未取得の症例を取得した後、ステップＳ１０２およびＳ１０３を実行する。未取得の症例がない場合、すなわち、全ての症例に対し、ステップＳ１０２の画像特徴抽出およびステップＳ１０３のレポート解析を実施済の場合は、ステップＳ１０５に進む。 In step S104, it is checked whether or not there is an unacquired case in the database storing cases for obtaining interpretation knowledge. If there is an unacquired case, the process returns to step S101 and an unacquired case is acquired. Steps S102 and S103 are executed. If there are no unacquired cases, that is, if image feature extraction in step S102 and report analysis in step S103 have been performed for all cases, the process proceeds to step S105.

ステップＳ１０２とステップＳ１０３の結果は相互に依存しないため、実行順は逆でも構わない。 Since the results of step S102 and step S103 do not depend on each other, the execution order may be reversed.

ステップＳ１０５に到達した時点で、図１０で表されるデータ一式が取得できたことになる。つまり、症例ごとに画像特徴量と読影項目と疾病名とが取得される。症例番号１の症例については、医用画像中にＭ１個の病変が含まれており、各病変から抽出される画像特徴量の個数はＮＦ個である。また、読影レポート中の読影項目の数はＮ１個である。例えば、病変番号（１，１）で示される１つ目の病変のうち、１つ目の画像特徴量の値は０．８５１である。また、読影項目番号（１，１）で示される１つ目の読影項目の値は「早期濃染」である。 When reaching step S105, the data set shown in FIG. 10 has been acquired. That is, an image feature amount, an interpretation item, and a disease name are acquired for each case. For the case of case number 1, M1 lesions are included in the medical image, and the number of image feature values extracted from each lesion is NF. The number of interpretation items in the interpretation report is N1. For example, the value of the first image feature value of the first lesion indicated by the lesion number (1, 1) is 0.851. In addition, the value of the first interpretation item indicated by the interpretation item number (1, 1) is “early dark dyeing”.

ステップＳ１０５では、ステップＳ１０２で得られた画像特徴量、ステップＳ１０３で得られた読影項目および疾病名から、読影知識を抽出する。本実施の形態では、画像特徴量、読影項目、疾病名という三項のうちの二項の相関関係を、読影知識とする。 In step S105, interpretation knowledge is extracted from the image feature amount obtained in step S102, the interpretation item and disease name obtained in step S103. In this embodiment, the correlation between two of the three items of image feature, interpretation item, and disease name is taken as interpretation knowledge.

以下では、画像特徴量、読影項目、疾病名という三項から得られる三組の二項の相関関係について説明する。 Below, the correlation of three sets of two terms obtained from the three terms of image feature, interpretation item, and disease name will be described.

（１）（画像特徴量−読影項目）間の相関関係 (1) Correlation between (image feature amount−interpretation item)
１対の（画像特徴量，読影項目）間の相関関係の求め方について説明する。相関関係の表現形態は複数あるが、ここでは相関比を用いる。相関比は、質的データと量的データとの間の相関関係を表す指標であり、（式１）で表される。 A method of obtaining the correlation between a pair of (image feature amount, interpretation item) will be described. Although there are several ways to express a correlation, the correlation ratio is used here. The correlation ratio is an index representing the correlation between qualitative data and quantitative data, and is expressed by (Equation 1).

読影レポート中に、ある読影項目を含む場合および含まない場合の２カテゴリを考え、これを質的データとする。医用画像から抽出した、ある画像特徴量の値そのものを量的データとする。例えば、読影知識を抽出するための症例データベースに含まれる全症例に対し、読影レポートを、ある読影項目を含むものまたは含まないものに区分する。ここでは、読影項目「早期濃染」と画像特徴量「早期相における腫瘤内部の輝度平均値」との相関比を求める方法について説明する。（式１）においては、カテゴリｉ＝１を「早期濃染」を含むもの、カテゴリｉ＝２を「早期濃染」を含まないものとする。読影レポートに「早期濃染」を含む症例から抽出した腫瘤画像の「早期相における腫瘤内部の輝度平均値」であるｊ番目の観測値をｘ_１ｊとする。また、読影レポートに「早期濃染」を含まない症例から抽出した腫瘤画像の「早期相における腫瘤内部の輝度平均値」であるｊ番目の観測値をｘ_２ｊとする。「早期濃染」とは造影早期相にてＣＴ値が上昇することを表すため、この場合、相関比が大きく（１に近く）なることが予想される。また、早期濃染は腫瘤の種類に依存し、腫瘤の大きさには依存しないため、読影項目「早期濃染」と画像特徴量「腫瘤面積」との相関比は小さく（０に近く）なることが予想される。このようにして、全ての読影項目と全ての画像特徴量との間の相関比を計算する。 Consider two categories, reports that include a certain interpretation item and reports that do not, and treat this as the qualitative data. The value of a certain image feature amount extracted from the medical image is used as the quantitative data. For example, for all cases included in the case database for extracting interpretation knowledge, the interpretation reports are divided into those that include a certain interpretation item and those that do not. Here, a method for obtaining the correlation ratio between the interpretation item “early dark staining” and the image feature amount “average luminance inside the mass in the early phase” is described. In (Equation 1), category i = 1 contains the reports that include “early dark staining”, and category i = 2 contains those that do not. Let x_1j be the j-th observed value of “average luminance inside the mass in the early phase” for mass images extracted from cases whose interpretation report includes “early dark staining”, and let x_2j be the j-th observed value of the same feature for mass images extracted from cases whose interpretation report does not include “early dark staining”. Since “early dark staining” means that the CT value rises in the early contrast phase, the correlation ratio is expected to be large (close to 1) in this case. In addition, since early dark staining depends on the type of mass and not on its size, the correlation ratio between the interpretation item “early dark staining” and the image feature amount “mass area” is expected to be small (close to 0). In this way, the correlation ratios between all interpretation items and all image feature amounts are calculated.
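As a rough illustration of the calculation described above, the following Python sketch computes a correlation ratio for two categories. (Equation 1) is not reproduced in this text, so the sketch assumes the standard definition of the correlation ratio (between-category variation divided by total variation); the function name `correlation_ratio` and the sample luminance values are hypothetical.

```python
from statistics import fmean


def correlation_ratio(groups):
    """Return between-category variation divided by total variation.

    groups[i][j] plays the role of the observation x_ij: the j-th value of
    one image feature amount in category i (for example, i = 0 for reports
    that include an interpretation item and i = 1 for reports that do not).
    Assumes at least one observation overall.
    """
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    total = sum((x - grand_mean) ** 2 for x in all_values)
    if total == 0:
        return 0.0  # every observation is identical, so no variation to explain
    between = sum(
        len(group) * (fmean(group) - grand_mean) ** 2
        for group in groups
        if group
    )
    return between / total


# Hypothetical luminance averages, invented for illustration only.
with_item = [0.82, 0.85, 0.88]     # reports that include "early dark staining"
without_item = [0.30, 0.35, 0.28]  # reports that do not
eta_squared = correlation_ratio([with_item, without_item])
```

With these invented values the two categories are well separated, so the ratio comes out close to 1; two categories with identical distributions give 0.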

図１１に、読影項目と画像特徴量との間の相関関係（ここでは、相関比）の概念図を示す。左側には複数の読影項目、右側には複数の画像特徴量の名称が列挙されている。そして、相関比が閾値以上の読影項目と画像特徴量の間が実線で結ばれている。計算した相関比を最終的に閾値で二値化すると、図１１のような情報が求められることになる。その一例について補足する。肝腫瘤の造影ＣＴ検査においては、殆どの腫瘤は造影剤使用前のＣＴ画像（単純、単純ＣＴ、単純相などと呼ぶ）で低濃度に描出され、多くの場合、読影レポートに「低濃度」「ＬＤＡ（ＬｏｗＤｅｎｓｉｔｙＡｒｅａ）あり」などと記述される。そのため、「低輝度」や「ＬＤＡ」といった読影項目と、造影剤使用前のＣＴ画像における腫瘤内部の輝度平均（図１１では「単純相輝度平均」と略記載）との相関が大きくなる。 FIG. 11 shows a conceptual diagram of a correlation (here, a correlation ratio) between an interpretation item and an image feature amount. A plurality of interpretation items are listed on the left side, and names of a plurality of image feature amounts are listed on the right side. An interpretation item having a correlation ratio equal to or greater than a threshold and an image feature amount are connected by a solid line. When the calculated correlation ratio is finally binarized with a threshold value, information as shown in FIG. 11 is obtained. One example is supplemented. In contrast-enhanced CT examinations of liver masses, most masses are rendered at low density on CT images (called simple, simple CT, simple phase, etc.) before the use of contrast agents, and in many cases “low density” is used in the interpretation report. “With LDA (Low Density Area)” is described. Therefore, the correlation between the interpretation items such as “low luminance” and “LDA” and the luminance average inside the tumor in the CT image before using the contrast agent (abbreviated as “simple phase luminance average” in FIG. 11) increases.

また、図１２に、読影項目と画像特徴量との間の相関関係（例えば、相関比）の別の概念図を示す。この図では、相関比を多値表現しており、読影項目と画像特徴量の間の実線の太さが相関比の大きさに相当している。例えば、造影早期相にてＣＴ値が上昇する「早期濃染」と、早期動脈相（早期相、動脈相とも略される）における腫瘤内部の輝度平均（図１２では「動脈相輝度平均」と略記載）との相関が大きくなっている。 FIG. 12 shows another conceptual diagram of the correlation (for example, the correlation ratio) between interpretation items and image feature amounts. In this figure, the correlation ratio is expressed in multiple values, and the thickness of the solid line between an interpretation item and an image feature amount corresponds to the magnitude of the correlation ratio. For example, there is a large correlation between “early dark staining”, in which the CT value rises in the early contrast phase, and the average luminance inside the mass in the early arterial phase (also abbreviated as early phase or arterial phase), which is abbreviated as “arterial phase luminance average” in FIG. 12.

相関比の値に着目することで、ある読影項目と相関の高い画像特徴量を特定することができる。実際には１つの症例には、複数の画像や複数の病変（腫瘤）を含む場合が多く、その場合は読影レポートには複数の病変に関する記載が含まれることになる。例えば、造影ＣＴ検査では、造影剤使用前や使用後の複数時刻におけるタイミングでＣＴ撮影を行う。そのため、スライス画像の集合が複数得られ、スライス画像の１つの集合には複数の病変（腫瘤）が含まれ、１つの病変からは複数の画像特徴量が抽出される。そのため、（スライス画像集合数）×（１人の患者から検出された病変数）×（画像特徴量の種類数）の個数だけ画像特徴量が得られ、これら複数の画像特徴量と、１つの読影レポートから抽出された複数の読影項目や疾病名との相関関係を求める必要がある。もちろん大量の症例を用いることにより、対応が正しく得られる可能性があるが、図９のように病変位置と時相を用いる等して、読影レポートの記載と、対応する画像特徴量とをある程度事前に対応づけることができれば、より正確に相関関係を求めることができる。 By paying attention to the value of the correlation ratio, an image feature amount that is highly correlated with a certain interpretation item can be identified. In practice, one case often includes a plurality of images and a plurality of lesions (masses), in which case the interpretation report includes descriptions of a plurality of lesions. For example, in a contrast CT examination, CT imaging is performed at several points in time before and after the contrast medium is used. As a result, a plurality of sets of slice images are obtained, one set of slice images includes a plurality of lesions (masses), and a plurality of image feature amounts are extracted from each lesion. Therefore, (number of slice image sets) × (number of lesions detected in one patient) × (number of types of image feature amounts) image feature amounts are obtained, and the correlation between these image feature amounts and the plurality of interpretation items and disease names extracted from one interpretation report must be determined. Using a large number of cases may of course yield the correct correspondence, but if the descriptions in the interpretation report can be associated with the corresponding image feature amounts to some extent in advance, for example by using the lesion position and time phase as in FIG. 9, the correlation can be obtained more accurately.

先の説明では、質的データが、ある読影項目を含むものおよび含まないものの２カテゴリである場合について説明したが、ある読影項目（例えば、「境界明瞭」）と、その対義語となる読影項目（例えば、「境界不明瞭」）との２カテゴリであってもよい。また、読影項目が「低濃度」、「中濃度」、「高濃度」などの序数尺度の場合は、それらの各々をカテゴリとして（この例では３カテゴリ）、相関比を計算してもよい。 The above description covers the case where the qualitative data consists of two categories, reports that include a certain interpretation item and reports that do not. However, the two categories may instead be a certain interpretation item (for example, “clear boundary”) and the interpretation item that is its antonym (for example, “unclear boundary”). When the interpretation items form an ordinal scale such as “low density”, “medium density”, and “high density”, the correlation ratio may be calculated using each of them as a category (three categories in this example).

また、「低濃度」、「低輝度」、「低吸収」などの同義語については、予め医用同義語辞書を作成しておき、それらを同一の読影項目として扱う。 For synonyms such as “low density”, “low luminance”, and “low absorption”, a medical synonym dictionary is created in advance, and these are treated as the same interpretation item.

（２）（画像特徴量−疾病名）間の相関関係 (2) Correlation between (image feature amount−disease name)
１対の（画像特徴量，疾病名）間の相関関係については、（画像特徴量，読影項目）間の場合と同じく相関比を用いることができる。図１３に、疾病名と画像特徴量との間の相関関係（例えば、相関比）の概念図を示す。この図では図１１と同じく相関関係を二値表現しているが、もちろん図１２のような多値表現を行うことも可能である。 For the correlation between a pair of (image feature amount, disease name), the correlation ratio can be used in the same way as for (image feature amount, interpretation item). FIG. 13 shows a conceptual diagram of the correlation (for example, the correlation ratio) between disease names and image feature amounts. In this figure the correlation is expressed in binary as in FIG. 11, but a multi-valued expression as in FIG. 12 is of course also possible.

（３）（読影項目−疾病名）間の相関関係 (3) Correlation between (interpretation item−disease name)
１対の（読影項目，疾病名）間の相関関係の求め方について説明する。相関関係の表現形態は複数あるが、ここでは支持度を用いる。支持度は、質的データ間の相関ルールを表す指標であり、（式２）で表される。 A method for obtaining the correlation between a pair of (interpretation item, disease name) will be described. Although there are several ways to express a correlation, the support is used here. The support is an index representing an association rule between qualitative data, and is expressed by (Equation 2).

この支持度は、全症例において読影項目Ｘと疾病名Ｙとが同時に出現する確率（共起確率）を意味する。支持度を用いることで、関連性の強い読影項目と疾病名との組合せを特定することができる。 This degree of support means the probability (co-occurrence probability) that the interpretation item X and the disease name Y appear simultaneously in all cases. By using the support level, it is possible to specify a combination of a highly relevant interpretation item and a disease name.

なお、支持度の代わりに、（式３）で示される確信度や、（式４）で示されるリフト値を用いても良い。 Instead of the support, the confidence expressed by (Equation 3) or the lift value expressed by (Equation 4) may be used.

確信度とは、条件部Ｘのアイテムの出現を条件としたときの結論部Ｙのアイテムが出現する確率である。リフト値とは、Ｘの出現を条件としないときのＹの出現確率に対して、Ｘの出現を条件としたときのＹの出現確率がどの程度上昇したかを示す指標である。その他、ｃｏｎｖｉｃｔｉｏｎ，φ係数を用いても良い。ｃｏｎｖｉｃｔｉｏｎ，φ係数については相関ルール分析に関する文献（例えば、非特許文献：「データマイニングとその応用」、加藤／羽室／矢田共著、朝倉書店）に記載されている。 The confidence is the probability that the item of the conclusion part Y appears given that the item of the condition part X appears. The lift value is an index indicating how much the appearance probability of Y conditioned on the appearance of X rises relative to the appearance probability of Y without that condition. Alternatively, the conviction or the φ coefficient may be used. The conviction and the φ coefficient are described in the literature on association rule analysis (for example, non-patent literature: “Data Mining and its Application”, Kato / Hamuro / Yada, Asakura Shoten).
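The three measures described above can be sketched in Python as follows. (Equation 2) to (Equation 4) are not reproduced in this text, so the sketch assumes the usual association-rule definitions that match the verbal descriptions: support as the co-occurrence probability of X and Y over all cases, confidence as the probability of Y given X, and lift as confidence divided by the unconditional probability of Y. The function name, the case representation, and the sample cases are hypothetical.

```python
def association_measures(cases, item_x, disease_y):
    """Return (support, confidence, lift) for the rule item_x -> disease_y.

    cases is a list of (set_of_interpretation_items, disease_name) tuples,
    one tuple per case. This representation is an assumption of the sketch.
    """
    n = len(cases)
    n_x = sum(1 for items, _ in cases if item_x in items)
    n_y = sum(1 for _, disease in cases if disease == disease_y)
    n_xy = sum(
        1 for items, disease in cases
        if item_x in items and disease == disease_y
    )
    support = n_xy / n                           # P(X and Y)
    confidence = n_xy / n_x if n_x else 0.0      # P(Y | X)
    lift = confidence / (n_y / n) if n_y else 0.0  # P(Y | X) / P(Y)
    return support, confidence, lift


# Invented cases for illustration only.
cases = [
    ({"early dark staining", "washout"}, "hepatocellular carcinoma"),
    ({"early dark staining"}, "hepatocellular carcinoma"),
    ({"low density"}, "cyst"),
    ({"early dark staining"}, "hemangioma"),
]
support, confidence, lift = association_measures(
    cases, "early dark staining", "hepatocellular carcinoma"
)
```

A lift above 1 indicates that the disease name appears more often in cases containing the interpretation item than in the case set as a whole.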

図１４に、読影項目と疾病名との間の相関関係（例えば、支持度）の概念図を示す。この図では図１１と同じく相関関係を二値表現しているが、もちろん図１２のような多値表現を行うことも可能である。 In FIG. 14, the conceptual diagram of the correlation (for example, support degree) between an interpretation item and a disease name is shown. In this figure, the correlation is expressed in binary as in FIG. 11, but it is of course possible to perform multi-value expression as shown in FIG.

以上の方法にて、ステップＳ１０５の処理を行うと、図１５、図１６、図１７のような、（画像特徴量−読影項目）間の相関関係、（画像特徴量−疾病名）間の相関関係、（読影項目−疾病名）間の相関関係が、それぞれ得られる。なお表中の数値は、図１５、図１６では相関比、図１７では支持度である。また、得られた相関関係は、図１５、図１６、図１７の形式にて読影知識データベース１０３に格納される。 When the process of step S105 is performed by the above method, the correlation between (image feature amount−interpretation item), the correlation between (image feature amount−disease name), and the correlation between (interpretation item−disease name) are obtained as shown in FIGS. 15, 16, and 17, respectively. The numerical values in the tables are correlation ratios in FIGS. 15 and 16 and supports in FIG. 17. The obtained correlations are stored in the interpretation knowledge database 103 in the formats of FIGS. 15, 16, and 17.

以上、読影知識データベース１０３の作成方法について述べた。次に、医用同義語辞書作成装置１００の動作について説明する。 The method for creating the interpretation knowledge database 103 has been described above. Next, the operation of the medical synonym dictionary creation device 100 will be described.

（実施の形態１：医用同義語辞書作成装置１００の動作の説明） (Embodiment 1: Explanation of operation of medical synonym dictionary creation device 100)
図１８は、医用同義語辞書作成装置１００が実行する処理の全体的な流れを示すフローチャートである。以下、図１８を用いて、医用同義語辞書作成装置１００が実行する処理の全体的な流れについて説明する。 FIG. 18 is a flowchart showing the overall flow of processing executed by the medical synonym dictionary creation device 100. The overall flow of processing executed by the medical synonym dictionary creation device 100 will be described below with reference to FIG. 18.

まず、取得部１０４は、症例データベース１０１から、読影者が診断した医用画像２０と読影レポート２１を取得し、キーワード抽出部１０５に出力する（ステップＳ３０１）。医用画像２０と読影レポート２１は、例えば、１週間単位などの固定期間ごとに取得してもよいし、ユーザが指定する任意のタイミングで取得してもよい。 First, the acquisition unit 104 acquires the medical image 20 and the interpretation report 21 diagnosed by the interpreter from the case database 101, and outputs them to the keyword extraction unit 105 (step S301). The medical image 20 and the image interpretation report 21 may be acquired for each fixed period such as one week, or may be acquired at an arbitrary timing designated by the user.

次に、キーワード抽出部１０５は、キーワード辞書１０２を参照することにより、取得部１０４から取得した読影レポート２１の中からキーワードを抽出し、抽出したキーワードと読影レポートＩＤ２２をリスト化してキーワード対選択部１０６に出力する（ステップＳ３０２）。特に、キーワード抽出部１０５は、画像所見２４と確定診断結果２５とからキーワードを抽出する。 Next, the keyword extraction unit 105 refers to the keyword dictionary 102 to extract keywords from the interpretation report 21 acquired from the acquisition unit 104, compiles the extracted keywords and the interpretation report IDs 22 into a list, and outputs the list to the keyword pair selection unit 106 (step S302). In particular, the keyword extraction unit 105 extracts keywords from the image findings 24 and the definitive diagnosis result 25.

抽出されたキーワードと読影レポートＩＤ２２のリストの一例を図１９に示す。キーワード抽出方法としては、例えば、読影レポート２１の中から、キーワード辞書１０２に記憶されているキーワードと一致するキーワードを抽出すればよい。図１９に示すリストより、例えば、キーワード「高吸収」を含む読影レポート２１の読影レポートＩＤ２２は、ｒ＿１２およびｒ＿１４などであることが分かる。 An example of a list of extracted keywords and interpretation report ID 22 is shown in FIG. As a keyword extraction method, for example, a keyword that matches the keyword stored in the keyword dictionary 102 may be extracted from the interpretation report 21. From the list shown in FIG. 19, for example, it is understood that the interpretation report IDs 22 of the interpretation report 21 including the keyword “high absorption” are r_12 and r_14.
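The dictionary-matching extraction described above can be sketched in Python as follows. This is a minimal illustration that assumes plain substring matching and an in-memory mapping from report ID to report text; the function name `extract_keywords` and the report texts are hypothetical, while the IDs r_12 and r_14 follow the example of FIG. 19.

```python
from collections import defaultdict


def extract_keywords(reports, keyword_dictionary):
    """Build a keyword -> [report IDs] list by exact substring matching.

    reports maps an interpretation report ID to the text of its image
    findings and definitive diagnosis; keyword_dictionary is a list of the
    keywords stored in the keyword dictionary.
    """
    index = defaultdict(list)
    for report_id, text in reports.items():
        for keyword in keyword_dictionary:
            if keyword in text:
                index[keyword].append(report_id)
    return dict(index)


# Invented report texts for illustration only.
reports = {
    "r_12": "high absorption is seen in the liver",
    "r_13": "mass with clear border and low density",
    "r_14": "high absorption with clear border",
}
keyword_list = extract_keywords(
    reports, ["high absorption", "clear border", "low density"]
)
```

Substring matching is the simplest reading of "keywords that match the keyword dictionary"; a real system for Japanese reports would likely need tokenization and handling of overlapping keywords, which this sketch does not attempt.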

次に、キーワード対選択部１０６は、キーワード抽出部１０５から取得したキーワードリストから未選択のキーワード対を選択し、選択したキーワード対をテキスト判定部１０７、代表画像ベクトル生成部１０８、および出力部１２１に出力する（ステップＳ３０３）。 Next, the keyword pair selection unit 106 selects an unselected keyword pair from the keyword list acquired from the keyword extraction unit 105, and outputs the selected keyword pair to the text determination unit 107, the representative image vector generation unit 108, and the output unit 121 (step S303).

なお、キーワード対選択部１０６は、キーワード対を選択する際に、疾病名の対、および診断項目の対のみを選択してもよい。疾病名は複数の診断項目の上位概念であるため、疾病名と診断項目とは直接同義語にはならない。そのため、疾病名と診断項目の対を選択しないことで、処理時間を低減することができる。 The keyword pair selection unit 106 may select only a disease name pair and a diagnosis item pair when selecting a keyword pair. Since the disease name is a superordinate concept of a plurality of diagnosis items, the disease name and the diagnosis item are not directly synonymous. Therefore, the processing time can be reduced by not selecting a disease name and diagnosis item pair.
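The restriction described above, in which only pairs of disease names and pairs of diagnosis items are considered, can be sketched in Python as follows. The function name, the attribute labels, and the sample keywords (including "HCC") are hypothetical and serve only to illustrate the filtering.

```python
from itertools import combinations


def candidate_pairs(keyword_attributes):
    """Return keyword pairs whose two members share the same attribute.

    keyword_attributes maps each keyword to its attribute, for example
    "disease name" or "diagnosis item". Mixed pairs are skipped because a
    disease name and a diagnosis item are not treated as direct synonyms.
    """
    keywords = sorted(keyword_attributes)
    return [
        (a, b)
        for a, b in combinations(keywords, 2)
        if keyword_attributes[a] == keyword_attributes[b]
    ]


# Invented keywords for illustration only.
attributes = {
    "high absorption": "diagnosis item",
    "clear border": "diagnosis item",
    "hepatocellular carcinoma": "disease name",
    "HCC": "disease name",
}
pairs = candidate_pairs(attributes)
```

With four keywords there are six possible pairs, and the filter keeps only the two same-attribute pairs, which is where the reduction in processing time comes from.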

次に、テキスト判定部１０７は、取得部１０４が取得した読影レポート２１に基づいて、キーワード対選択部１０６が選択したキーワード対に対して同義語判定を行い、同義語と判定した場合には、判定結果を代表画像ベクトル生成部１０８に通知する。一方、同義語と判定しなかった場合は、ステップＳ３０３へ戻る（ステップＳ３０４）。具体的な同義語判定方法としては、例えば、判定対象となるキーワードの前後に出現するキーワード頻度の類似性に基づいて、キーワード対が類義語か否かを判定すればよい。以下、キーワード対選択部１０６で「辺縁明瞭」と「高吸収」の２つのキーワードが選択された場合を例に、テキスト判定部１０７の処理の一例を説明する。つまり、テキスト判定部１０７は、図１９に示すキーワードと読影レポートＩＤ２２のリストを参照し、「辺縁明瞭」が付与されている読影レポートＩＤ２２を抽出する。テキスト判定部１０７は、抽出した読影レポートＩＤ２２の読影レポート２１の画像所見２４および確定診断結果２５に含まれるテキストデータを取得する。テキスト判定部１０７は、取得したテキストデータから「辺縁明瞭」を含む一文を選択する。テキスト判定部１０７は、選択した一文から「辺縁明瞭」以外のキーワードを抽出し、図２０に示すようなキーワードベクトルを作成する。ここで、ｔｉ（ｉ＝１〜ｎ）はキーワードｉの出現頻度、ｎはキーワードの種類数であり、各キーワードの出現頻度がベクトルの要素である。次に、テキスト判定部１０７は、「高吸収」に対するキーワードベクトルを、「辺縁明瞭」の場合と同様の手法で作成する。最後に、テキスト判定部１０７は、作成された２つのキーワードベクトル間のコサイン距離を算出し、算出した距離が閾値以下であれば同義語であると判定し、閾値より大きければ同義語でないと判定する。このようなテキストに基づいた同義語判定の詳細なアルゴリズムは、非特許文献：「山本，梅村，“辞書を用いない関連語リストの構築方法”，情報処理学会研究報告，ｖｏｌ．２００２（２０），ｐｐ．８１−８８，２００２−０３−０４」に開示されている。 Next, based on the interpretation report 21 acquired by the acquisition unit 104, the text determination unit 107 performs synonym determination on the keyword pair selected by the keyword pair selection unit 106 and, if it determines that the pair is a pair of synonyms, notifies the representative image vector generation unit 108 of the determination result. If it does not determine that the pair is a pair of synonyms, the process returns to step S303 (step S304). As a specific synonym determination method, for example, whether or not the keyword pair is a pair of synonyms may be determined based on the similarity of the frequencies of the keywords that appear before and after each keyword to be determined. An example of the processing of the text determination unit 107 is described below for the case where the keyword pair selection unit 106 has selected the two keywords “clear border” and “high absorption”. The text determination unit 107 refers to the list of keywords and interpretation report IDs 22 shown in FIG. 19 and extracts the interpretation report IDs 22 to which “clear border” is assigned. The text determination unit 107 acquires the text data included in the image findings 24 and the definitive diagnosis result 25 of the interpretation reports 21 with the extracted interpretation report IDs 22. The text determination unit 107 selects a sentence including “clear border” from the acquired text data. The text determination unit 107 extracts keywords other than “clear border” from the selected sentence and creates a keyword vector as shown in FIG. 20. Here, ti (i = 1 to n) is the appearance frequency of keyword i, n is the number of keyword types, and the appearance frequency of each keyword is an element of the vector. Next, the text determination unit 107 creates a keyword vector for “high absorption” in the same way as for “clear border”. Finally, the text determination unit 107 calculates the cosine distance between the two keyword vectors, determines that the keywords are synonyms if the calculated distance is equal to or less than a threshold, and determines that they are not synonyms if the distance is greater than the threshold. A detailed algorithm for such text-based synonym determination is disclosed in the non-patent literature: “Yamamoto, Umemura, ‘Method of constructing a related term list without using a dictionary’, IPSJ SIG Technical Report, vol. 2002 (20), pp. 81-88, 2002-03-04”.
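The text-based determination described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the algorithm of the cited literature: sentences are given as plain strings, keywords are found by substring matching, and the cosine distance is taken to be 1 minus the cosine similarity. All function names, the sample sentences, and the threshold 0.2 are hypothetical.

```python
import math
from collections import Counter


def context_vector(sentences, target, vocabulary):
    """Count, per vocabulary keyword, the sentences it shares with target."""
    counts = Counter()
    for sentence in sentences:
        if target not in sentence:
            continue
        for keyword in vocabulary:
            if keyword != target and keyword in sentence:
                counts[keyword] += 1
    return [counts[keyword] for keyword in vocabulary]


def cosine_distance(u, v):
    """Return 1 - cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    if norm == 0:
        return 1.0
    return 1.0 - dot / norm


def text_says_synonyms(sentences, kw_a, kw_b, vocabulary, threshold):
    """Judge a keyword pair as synonyms if the distance is <= threshold."""
    vec_a = context_vector(sentences, kw_a, vocabulary)
    vec_b = context_vector(sentences, kw_b, vocabulary)
    return cosine_distance(vec_a, vec_b) <= threshold


# Invented sentences and vocabulary for illustration only.
sentences = [
    "low density mass with clear border in liver",
    "low absorption mass with clear border in liver",
    "high absorption nodule with washout",
]
vocabulary = ["low density", "low absorption", "high absorption",
              "clear border", "mass", "washout", "nodule"]
same = text_says_synonyms(sentences, "low density", "low absorption",
                          vocabulary, threshold=0.2)
different = text_says_synonyms(sentences, "low density", "high absorption",
                               vocabulary, threshold=0.2)
```

In this toy data, "low density" and "low absorption" occur with the same surrounding keywords and are judged synonyms, while "low density" and "high absorption" share no context and are not.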

次に、代表画像ベクトル生成部１０８は、テキスト判定部１０７でキーワード対が同義語であると判定された場合に、キーワード対選択部１０６が選択したキーワード対と、読影知識データベース１０３に記憶されている二項間関係情報と、症例データベース１０１に記憶されている医用画像２０とを用いて、キーワード対選択部１０６から取得したキーワードに対する代表画像ベクトルを生成し、画像判定部１０９に出力する（ステップＳ３０５）。 Next, when the text determination unit 107 has determined that the keyword pair is a pair of synonyms, the representative image vector generation unit 108 generates a representative image vector for each keyword acquired from the keyword pair selection unit 106, using the keyword pair selected by the keyword pair selection unit 106, the binomial relationship information stored in the interpretation knowledge database 103, and the medical images 20 stored in the case database 101, and outputs the representative image vectors to the image determination unit 109 (step S305).

図２１にステップＳ３０５の処理の詳細なフローチャートの一例を示す。以下、図２１を用いて具体的な代表画像ベクトル生成方法について説明する。 FIG. 21 shows an example of a detailed flowchart of the process in step S305. Hereinafter, a specific representative image vector generation method will be described with reference to FIG.

まず初めに、代表画像ベクトル生成部１０８は、キーワード対選択部１０６から取得したキーワード対の中から１つのキーワードを選択する（ステップＳ４０１）。 First, the representative image vector generation unit 108 selects one keyword from the keyword pairs acquired from the keyword pair selection unit 106 (step S401).

次に、代表画像ベクトル生成部１０８は、読影知識データベース１０３から、ステップＳ４０１で選択したキーワードに対する画像重みを取得する（ステップＳ４０２）。画像重みとは、画像特徴量に掛けられる重みのことである。キーワードが読影項目の場合には、図１５に示す（画像特徴量−読影項目）間の相関関係より画像重みを取得する。例えば、キーワードが読影項目１の場合には、画像特徴量１に対する画像重みは０．８０８であり、画像特徴量２に対する画像重みは０．６２７である。また、キーワードが疾病名の場合には、図１６に示す（画像特徴量−疾病名）間の相関関係より画像重みを取得する。例えば、キーワードが疾病名１の場合には、画像特徴量１に対する画像重みは０．６７１であり、画像特徴量２に対する画像重みは０．６９７である。 Next, the representative image vector generation unit 108 acquires the image weight for the keyword selected in step S401 from the interpretation knowledge database 103 (step S402). The image weight is a weight applied to the image feature amount. When the keyword is an interpretation item, the image weight is acquired from the correlation between (image feature amount−interpretation item) shown in FIG. For example, when the keyword is interpretation item 1, the image weight for image feature 1 is 0.808, and the image weight for image feature 2 is 0.627. If the keyword is a disease name, the image weight is acquired from the correlation between (image feature amount−disease name) shown in FIG. For example, when the keyword is disease name 1, the image weight for image feature amount 1 is 0.671, and the image weight for image feature amount 2 is 0.697.

次に、代表画像ベクトル生成部１０８は、ステップＳ４０１で選択したキーワードが付与された医用画像２０に対して画像特徴量ベクトルを算出する（ステップＳ４０３）。キーワードが付与された医用画像２０とは、キーワードを含む読影レポート２１の作成の基となった医用画像２０のことである。画像特徴量ベクトルとは、医用画像２０から算出された画像特徴量をベクトル表現したものである。例えば、画像特徴量としてエッジ強度と輝度分散を算出する場合は、医用画像２０から算出したエッジ強度の平均値ｓと輝度分散の平均値ｔを、画像特徴量ベクトル（ｓ，ｔ）として出力する。実際には、前述のように臓器および病変部分の形状および輝度分布に関する数十〜数百種の画像特徴量を用いるため、画像特徴量ベクトルは数十から数百次元のベクトルとなる。なお、個々の医用画像２０に対して算出される画像特徴量は、予め症例データベース１０１に記憶しておき、症例データベース１０１を参照することで取得しても構わない。これにより、本ステップでの処理時間が低減できる。 Next, the representative image vector generation unit 108 calculates an image feature amount vector for the medical image 20 to which the keyword selected in step S401 is assigned (step S403). The medical image 20 to which the keyword is assigned is the medical image 20 that is the basis for creating the interpretation report 21 including the keyword. The image feature amount vector is a vector representation of the image feature amount calculated from the medical image 20. For example, when calculating the edge strength and the luminance variance as the image feature amount, the average value s of the edge strength and the average value t of the luminance variance calculated from the medical image 20 are output as an image feature amount vector (s, t). . Actually, since tens to hundreds of types of image feature quantities related to the shapes and luminance distributions of organs and lesions are used as described above, the image feature quantity vectors are vectors of tens to hundreds of dimensions. The image feature amount calculated for each medical image 20 may be stored in advance in the case database 101 and acquired by referring to the case database 101. Thereby, the processing time in this step can be reduced.

次に、代表画像ベクトル生成部１０８は、ステップＳ４０２で取得した画像重みを、ステップＳ４０３で算出した画像特徴量ベクトルに掛け合わせ、代表画像ベクトルとして出力する（ステップＳ４０４）。例えば、ステップＳ４０２で画像重み（エッジ強度の重みｗ１，輝度分散の重みｗ２）が取得され、ステップＳ４０３で画像特徴量ベクトル（例えばエッジ強度の平均値ｆ１と輝度分散の平均値ｆ２）が、（ｆ１，ｆ２）のベクトル形式で出力されたとする。この場合、ステップＳ４０４で、代表画像ベクトル生成部１０８は、代表画像ベクトルとして、画像特徴量ベクトルに画像重みを掛け合わせた（ｗ１・ｆ１，ｗ２・ｆ２）を出力する。 Next, the representative image vector generation unit 108 multiplies the image feature amount vector calculated in step S403 by the image weights acquired in step S402, and outputs the result as the representative image vector (step S404). For example, suppose that the image weights (edge strength weight w1, luminance variance weight w2) are acquired in step S402, and that the image feature amount vector (for example, the average edge strength f1 and the average luminance variance f2) is output in step S403 in the vector format (f1, f2). In this case, in step S404, the representative image vector generation unit 108 outputs, as the representative image vector, (w1 · f1, w2 · f2), obtained by multiplying the image feature amount vector by the image weights element by element.

なお、ステップＳ４０３では、画像特徴量ベクトルを、画像ＩＤ２３と画像特徴量とを対応付けたリスト形式で出力してもよい。この場合、ステップＳ４０４では、画像ＩＤ２３ごとに画像重みと画像特徴量ベクトルを掛け合わせ、最後に掛け合わせにより得られるベクトルの平均ベクトルを代表画像ベクトルとして出力すればよい。 In step S403, the image feature amount vector may be output in a list format in which the image ID 23 and the image feature amount are associated with each other. In this case, in step S404, the image weight and the image feature amount vector are multiplied for each image ID 23, and finally the average vector of the vectors obtained by the multiplication may be output as the representative image vector.
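The weighting and averaging of steps S402 to S404 can be sketched in Python as follows. This is a minimal illustration that assumes the image feature amount vectors are already available per image ID; the function name, the image IDs, and the feature values are hypothetical, while the weights 0.808 and 0.627 are the example values quoted from FIG. 15 for interpretation item 1.

```python
def representative_image_vector(image_weights, feature_vectors):
    """Weight each image's feature vector element-wise, then average.

    image_weights is the list (w1, ..., wn) acquired for one keyword, and
    feature_vectors maps an image ID to that image's (f1, ..., fn).
    Assumes at least one image.
    """
    weighted = [
        [w * f for w, f in zip(image_weights, features)]
        for features in feature_vectors.values()
    ]
    count = len(weighted)
    # zip(*weighted) groups the values of each dimension across all images.
    return [sum(column) / count for column in zip(*weighted)]


# Weights from the FIG. 15 example; feature values invented for illustration.
weights = [0.808, 0.627]
features_by_image = {"img_1": [0.5, 0.2], "img_2": [0.7, 0.4]}
representative = representative_image_vector(weights, features_by_image)
```

Because the same weights apply to every image of a keyword, weighting each image and then averaging gives the same result as averaging first and weighting once.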

ステップＳ４０４で代表画像ベクトルを算出することにより、前述した「辺縁明瞭」と「高吸収」の同義語関係を画像の類似性を用いて評価する際、「辺縁明瞭」の画像に対しては形状情報、「高吸収」の画像に対しては濃度情報の画像特徴量に大きな重みを重みづけることができるため、画像の類似性に基づくキーワード間の同義語判定を正しく行うことが可能になる。 By calculating the representative image vector in step S404, when the synonym relationship between “clear border” and “high absorption” described above is evaluated using image similarity, a large weight can be given to the image feature amounts carrying shape information for images of “clear border” and to those carrying density information for images of “high absorption”. This makes it possible to correctly determine synonyms between keywords based on image similarity.

最後に、代表画像ベクトル生成部１０８は、キーワード対選択部１０６から取得した全てのキーワードが、ステップＳ４０１において選択されたか否かを判定し、選択されてないキーワードがある場合はステップＳ４０１へ戻り、全てのキーワードが選択されている場合は処理を終了する（ステップＳ４０５）。 Finally, the representative image vector generation unit 108 determines whether all the keywords acquired from the keyword pair selection unit 106 have been selected in step S401. If there is an unselected keyword, the process returns to step S401. If all keywords have been selected, the process ends (step S405).

以上のステップＳ４０１〜Ｓ４０５の処理を行うことにより、ステップＳ３０５において、ステップＳ３０３で選択したキーワードに対する代表画像ベクトルを生成することが可能になる。 By performing the processes in steps S401 to S405, it is possible to generate a representative image vector for the keyword selected in step S303 in step S305.

なお、ステップＳ４０１で選択したキーワードが読影項目の場合は、共起する疾病名による重みを付加した代表画像ベクトルを生成してもよい。 If the keyword selected in step S401 is an interpretation item, a representative image vector to which a weight based on a co-occurring disease name is added may be generated.

図１７に示すように、疾病名と読影項目の相関関係（支持度）は疾病名によって異なる。支持度の値は、疾病名に対する読影項目の寄与度を示す。医師が疾病名を決める際に典型的に利用される読影項目に対する支持度は高くなり、一方、Ｓ３０１で取得された症例が非典型な症例の場合、または、医用同義語辞書作成装置による読影項目の誤抽出があった場合には、疾病名に対する読影項目の支持度は低くなる。代表画像ベクトルは各読影項目に対する典型的な画像特徴量を示すベクトルであり、支持度の低い読影項目が付与された画像は、代表画像ベクトルを作成する際のノイズ要因の一つになる。例えば、「高吸収」という読影項目は、「肝細胞癌（疾病名）」の診断に典型的に用いられるため、「肝細胞癌」に対する「高吸収」の支持度は高い値となり、濃度に関する画像特徴量の重みが大きくなる。一方、「高吸収」は「嚢胞（疾病名）」の診断には殆ど用いられないため、「嚢胞」に対する「高吸収」の支持度の値は低くなり、濃度に関する画像特徴量の重みが小さくなる。よって、代表画像ベクトルを作成する際に、このような支持度の低い症例を取り除くことができれば、より正しく画像の類似性を評価することができ、医用同義語辞書の精度を向上させることができる。 As shown in FIG. 17, the correlation (support) between a disease name and an interpretation item differs depending on the disease name. The support value indicates the degree to which the interpretation item contributes to the disease name. The support is high for interpretation items that doctors typically use when deciding on a disease name. On the other hand, if a case acquired in S301 is atypical, or if the medical synonym dictionary creation device has extracted an interpretation item erroneously, the support of the interpretation item for the disease name is low. The representative image vector is a vector indicating the typical image feature amounts for each interpretation item, and an image to which an interpretation item with low support is assigned becomes one of the noise factors when the representative image vector is created. For example, the interpretation item “high absorption” is typically used for the diagnosis of “hepatocellular carcinoma (disease name)”, so the support of “high absorption” for “hepatocellular carcinoma” is high, and the weight of the image feature amounts related to density becomes large. On the other hand, “high absorption” is rarely used for the diagnosis of “cyst (disease name)”, so the support of “high absorption” for “cyst” is low, and the weight of the image feature amounts related to density becomes small. Therefore, if cases with such low support can be removed when the representative image vector is created, the similarity of images can be evaluated more correctly and the accuracy of the medical synonym dictionary can be improved.

FIG. 22 shows a detailed flowchart of the processing in step S305. A method of generating a representative image vector weighted by the disease names that co-occur with an interpretation item is described below with reference to FIG. 22. Components that are the same as those in FIG. 21 are denoted by the same reference signs, and their description is not repeated.

After executing steps S401 to S403, the representative image vector generation unit 108 determines whether the keyword attribute 31 of the keyword acquired in step S401 is an interpretation item. If it is, the process proceeds to step S502; if not, the process proceeds to step S404 (step S501).

Next, the representative image vector generation unit 108 acquires the disease names that co-occur with the keyword acquired in step S401 and lists them together with the image IDs 23 (step S502). Specifically, the representative image vector generation unit 108 takes the confirmed diagnosis result 25 contained in the same interpretation report 21 as the image finding 24 that contains the acquired keyword, and determines it to be a disease name that co-occurs with that keyword. The unit then lists each confirmed diagnosis result 25 together with the image ID 23 contained in the same interpretation report 21.

Next, the representative image vector generation unit 108 acquires, from the interpretation knowledge database 103, the correlation (support) between the interpretation item selected in step S401 and each disease name acquired in step S502 as the (interpretation item-disease name) weight, and lists it together with the image ID 23 (step S503). This image ID 23 is the same as the image ID 23 described for step S502.
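Steps S502 and S503 can be pictured as two table lookups. The following is a minimal Python sketch under stated assumptions, not the implementation of the embodiment: the `Report` record, the `SUPPORT` table and every value in it are hypothetical stand-ins for the interpretation report 21 and the interpretation knowledge database 103, and the keyword is matched by simple substring search rather than by the keyword extraction unit 105.

```python
from dataclasses import dataclass


@dataclass
class Report:
    """Hypothetical stand-in for an interpretation report 21."""
    image_id: str   # image ID 23
    finding: str    # image finding 24 (free text)
    diagnosis: str  # confirmed diagnosis result 25


# Hypothetical (interpretation item, disease name) -> support table,
# standing in for the interpretation knowledge database 103.
SUPPORT = {
    ("high absorption", "hepatocellular carcinoma"): 0.9,
    ("high absorption", "cyst"): 0.1,
}


def cooccurring_diseases(keyword, reports):
    """S502: list (image ID, disease name) for reports whose finding contains the keyword."""
    return [(r.image_id, r.diagnosis) for r in reports if keyword in r.finding]


def support_weights(keyword, pairs, support=SUPPORT, default=0.0):
    """S503: list (image ID, support) for each co-occurring disease name."""
    return [(image_id, support.get((keyword, disease), default))
            for image_id, disease in pairs]


reports = [
    Report("img1", "high absorption in the liver", "hepatocellular carcinoma"),
    Report("img2", "high absorption, clear edge", "cyst"),
    Report("img3", "low absorption", "cyst"),
]
pairs = cooccurring_diseases("high absorption", reports)
weights = support_weights("high absorption", pairs)
```

Here `weights` pairs each image ID with the support of its co-occurring disease name, which is the per-image weight that step S504 consumes.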

Next, the representative image vector generation unit 108 calculates the representative image vector by multiplying together the keyword weight vector acquired from step S405, the image feature amount vector calculated in step S406, and the (interpretation item-disease name) weight acquired from step S503 (step S504). For example, suppose that interpretation item A is selected in step S401 and that the image weights (w1, w2) for interpretation item A are acquired in step S402. Suppose also that, in step S403, an image feature amount vector (for example, an edge strength value f1 and a luminance variance value f2) is calculated for each image ID 23 in the vector form (f1, f2), and that, in step S503, the (interpretation item-disease name) weight α is acquired for each image ID 23. In step S504, the representative image vector generation unit 108 then multiplies the image feature amount vector (f1, f2) by the image weights (w1, w2) and the weight α to obtain (α·w1·f1, α·w2·f2) for each image ID 23, and outputs the average of these vectors as the representative image vector.
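The calculation in step S504 can be sketched in a few lines of Python. This is an illustrative sketch only: the function name, the dictionary-based inputs and the numeric values are assumptions, and the average is a plain mean over the images, as the text describes.

```python
def representative_image_vector(image_weights, features_by_image, alpha_by_image):
    """Average of (alpha * w_k * f_k) over all images, following step S504.

    image_weights     -- (w1, w2, ...) for the selected interpretation item
    features_by_image -- image ID -> (f1, f2, ...) image feature amount vector
    alpha_by_image    -- image ID -> (interpretation item-disease name) weight
    """
    if not features_by_image:
        raise ValueError("at least one image is required")
    dim = len(image_weights)
    total = [0.0] * dim
    for image_id, features in features_by_image.items():
        alpha = alpha_by_image[image_id]
        for k in range(dim):
            total[k] += alpha * image_weights[k] * features[k]
    n = len(features_by_image)
    return [t / n for t in total]


# Hypothetical values: two images, two feature amounts (edge strength, luminance variance).
w = (0.5, 2.0)
features = {"img1": (2.0, 1.0), "img2": (4.0, 3.0)}
alphas = {"img1": 1.0, "img2": 0.5}
vec = representative_image_vector(w, features, alphas)
```

With these inputs the per-image vectors are (1.0, 2.0) and (1.0, 3.0), so `vec` is their mean, (1.0, 2.5). An image with a small α contributes proportionally less to each element.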

After step S504 or S404, step S405 is executed.

By performing steps S401 to S405 and steps S501 to S504 as described above, step S305 can generate a representative image vector weighted by the co-occurring disease names when the keyword selected in step S303 is an interpretation item. For example, when a representative image vector is created for the interpretation item “high absorption”, images in which the item co-occurs with hepatocellular carcinoma are given a large weight, whereas images in which it co-occurs with a cyst are given a small weight. As a result, the representative image vector is in effect created from the hepatocellular carcinoma images alone, yielding an image vector that represents more typical image feature amounts for “high absorption”.

The description now returns to the operation of the medical synonym dictionary creation device 100 shown in FIG. 18.

The image determination unit 109 uses the representative image vectors acquired from the representative image vector generation unit 108 to re-determine whether the keyword pair selected by the keyword pair selection unit 106 is a synonym pair, and notifies the output unit 121 of the determination result (step S306).

For example, the image determination unit 109 calculates the Euclidean distance between the representative image vectors. If the calculated distance is less than or equal to a threshold, it determines that the selected keyword pair is a synonym pair; if the distance is greater than the threshold, it determines that the pair is not. In this way, image similarity can be evaluated using only those image feature amounts, among the many available, on which the doctor focused.
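The threshold test of step S306 might look as follows in Python. This is a sketch, not the embodiment's code: the function name and the threshold values are hypothetical, and `math.dist` (Python 3.8 or later) is used for the Euclidean distance.

```python
import math


def is_synonym_by_image(vec_a, vec_b, threshold):
    """Return True when the Euclidean distance between two representative
    image vectors is less than or equal to the threshold (step S306)."""
    if len(vec_a) != len(vec_b):
        raise ValueError("representative image vectors must have the same length")
    return math.dist(vec_a, vec_b) <= threshold


# Hypothetical vectors whose distance is 5 (a 3-4-5 triangle).
close_enough = is_synonym_by_image([0.0, 0.0], [3.0, 4.0], threshold=5.5)
too_far = is_synonym_by_image([0.0, 0.0], [3.0, 4.0], threshold=4.5)
```

Because the representative image vectors are already weighted per keyword, feature amounts with near-zero weight contribute little to the distance.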

Next, when the image determination unit 109 has determined that the pair is a synonym pair, the output unit 121 writes the keyword pair acquired from the keyword pair selection unit 106 to the storage unit 110 as synonyms included in the medical synonym dictionary (step S307). The storage unit 110 thus stores a medical synonym dictionary containing a plurality of synonyms.

Finally, the keyword pair selection unit 106 determines whether all keyword pairs have been selected in step S303. If any keyword pair has not yet been selected, the process returns to step S303; if all keyword pairs have been selected, the process ends (step S308).

By executing steps S301 to S308 shown in FIG. 18 as described above, the medical synonym dictionary creation device 100 can dynamically select the image feature amounts that suit each keyword and can create, for the interpretation reports 21, a medical synonym dictionary based on image similarity.

(Comparison with conventional methods)
For example, when the synonym relationship between “clear edge” and “high absorption” is evaluated using image similarity, the method of Patent Document 1 evaluates image similarity using both shape information and density information. Image feature amounts unrelated to the keywords are therefore included in the similarity evaluation, and the pair may be wrongly determined to be synonyms. With the present method, by contrast, image similarity is evaluated with the shape information weighted for “clear edge” images and the density information weighted for “high absorption” images. The image similarity is therefore low, and the two keywords are correctly determined not to be synonyms.

As described above, the medical synonym dictionary creation device 100 according to the present embodiment dynamically selects the image feature amounts that suit each keyword, and can thereby create, for the interpretation reports 21, a medical synonym dictionary based on image similarity.

Note that the case database 101, the keyword dictionary 102, and the interpretation knowledge database 103 may be provided within the medical synonym dictionary creation device 100.

Alternatively, the case database 101, the keyword dictionary 102, and the interpretation knowledge database 103 may be provided on a server connected to the medical synonym dictionary creation device 100 via a network.

The interpretation report 21 may also be included in the medical image 20 as attached data.

(Modification of Embodiment 1)
In the synonym determination unit 120 of the medical synonym dictionary creation device 100 according to Embodiment 1 shown in FIG. 1, processing is executed in the order of the text determination unit 107, the representative image vector generation unit 108, and the image determination unit 109. That is, the text determination unit 107 first determines whether the keyword pair selected by the keyword pair selection unit 106 is a synonym pair. The representative image vector generation unit 108 and the image determination unit 109 then re-determine whether the keyword pair that the text determination unit 107 determined to be synonyms is in fact a synonym pair.

In the modification of Embodiment 1, the order in which the synonym determination unit performs synonym determination differs from that of Embodiment 1.

FIG. 23 is a block diagram showing the characteristic functional configuration of the medical synonym dictionary creation device according to the modification of Embodiment 1.

The medical synonym dictionary creation device 100A differs from the medical synonym dictionary creation device 100 shown in FIG. 1 in that it uses a synonym determination unit 120A in place of the synonym determination unit 120. The rest of the configuration is the same as in Embodiment 1, so its detailed description is not repeated.

The synonym determination unit 120A includes the representative image vector generation unit 108, the image determination unit 109, and the text determination unit 107. Each processing unit is connected differently from Embodiment 1, but performs the same processing as in Embodiment 1.

That is, the representative image vector generation unit 108 uses the keyword pair selected by the keyword pair selection unit 106, the binomial relationship information stored in the interpretation knowledge database 103, and the medical images 20 stored in the case database 101 to generate a representative image vector for each keyword constituting the selected keyword pair, and outputs the vectors to the image determination unit 109.

The image determination unit 109 uses the representative image vectors generated by the representative image vector generation unit 108 to determine whether the keyword pair selected by the keyword pair selection unit 106 is a synonym pair, and outputs the determination result to the text determination unit 107.

When the image determination unit 109 has determined that the keyword pair selected by the keyword pair selection unit 106 is a synonym pair, the text determination unit 107 re-determines, based on the interpretation reports 21, whether that keyword pair is a synonym pair and, if so, outputs the determination result to the output unit 121. The specific synonym determination method is described later.
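The difference between Embodiment 1 and this modification is only the order of the two checks, which the following Python sketch illustrates. It is a simplification under stated assumptions: the two checks are modeled as hypothetical predicates backed by fixed sets (`TEXT_OK`, `IMAGE_OK`) rather than by the actual text and image determination logic, and `image_calls` exists only to show which pairs reach the image check.

```python
def determine_synonyms(pairs, first_check, second_check):
    """Accept a pair only if both checks pass.

    The second check runs only for pairs that passed the first one.
    """
    return [p for p in pairs if first_check(p) and second_check(p)]


# Hypothetical verdicts standing in for the text determination unit 107
# and the image determination unit 109.
TEXT_OK = {("A", "B"), ("A", "C")}
IMAGE_OK = {("A", "B"), ("B", "C")}

image_calls = []  # records which pairs the image check was asked about


def text_check(pair):
    return pair in TEXT_OK


def image_check(pair):
    image_calls.append(pair)
    return pair in IMAGE_OK


pairs = [("A", "B"), ("A", "C"), ("B", "C")]

embodiment_1 = determine_synonyms(pairs, text_check, image_check)  # text first
calls_text_first = list(image_calls)

image_calls.clear()
modification = determine_synonyms(pairs, image_check, text_check)  # image first
calls_image_first = list(image_calls)
```

Both orders accept the same pairs, since a pair must pass both checks either way. What changes is which check sees every pair: in this toy data the image check runs on two pairs when the text check goes first and on all three when it goes first itself.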

The medical synonym dictionary creation device 100A according to the modification of Embodiment 1 dynamically selects the image feature amounts that suit each keyword, and can thereby create, for the interpretation reports 21, a medical synonym dictionary based on image similarity.

(Embodiment 2)
Next, a medical synonym dictionary creation device 200 according to Embodiment 2 of the present invention is described in detail with reference to the drawings.

The medical synonym dictionary creation device 200 of the present embodiment is characterized in that it automatically updates the medical synonym dictionary when the case data stored in the case database 101 is updated.

The medical synonym dictionary creation device 100 according to Embodiment 1 described above automatically computes a medical synonym dictionary when given the case database 101. The case database 101, however, accumulates the results of daily diagnoses and is updated continually. When an interpretation report 21 containing a keyword that does not exist in the medical synonym dictionary is newly added to the case database 101, it has not yet been determined whether any keyword is synonymous with the newly added keyword. This gives rise to the problem that a highly versatile search using the newly added keyword cannot be performed.

The medical synonym dictionary creation device 200 of the present embodiment therefore creates a new medical synonym dictionary for the keywords in response to updates of the case data stored in the case database 101, and stores it in the storage unit 110.

This enables a highly versatile search even when the case data stored in the case database 101 has been updated.

The components of the medical synonym dictionary creation device 200 are first described in order below with reference to FIG. 24.

(Embodiment 2: Description of configuration)
FIG. 24 is a block diagram showing the characteristic functional configuration of the medical synonym dictionary creation device 200 according to Embodiment 2 of the present invention.

In FIG. 24, components that are the same as those in FIG. 1 are denoted by the same reference signs, and their description is not repeated. The medical synonym dictionary creation device 200 shown in FIG. 24 differs from the medical synonym dictionary creation device 100 shown in FIG. 1 in that it has an update control unit 201, which determines from the cases acquired from the case database 101 whether to update the medical synonym dictionary.

The update control unit 201 uses the medical images 20 and interpretation reports 21 acquired from the case database 101 to determine whether to update the medical synonym dictionary. If it determines that an update is needed, the update control unit 201 operates the acquisition unit 104, the keyword extraction unit 105, the keyword pair selection unit 106, the synonym determination unit 120, and the output unit 121 to update the synonyms included in the medical synonym dictionary. If it determines that no update is needed, the update control unit 201 does not update the synonyms included in the medical synonym dictionary. The specific method of determining whether to update the medical synonym dictionary is described later.

Next, the operation of the medical synonym dictionary creation device 200 configured as described above is described.

(Embodiment 2: Description of operation)
FIG. 25 is a flowchart showing the overall flow of the processing executed by the medical synonym dictionary creation device 200. In FIG. 25, components that are the same as those in FIG. 18 are denoted by the same reference signs, and their description is not repeated.

The update control unit 201 uses the case data acquired from the case database 101 to determine whether to update the medical synonym dictionary. If it determines that the medical synonym dictionary is to be updated, the process proceeds to step S301. If it determines that the medical synonym dictionary is not to be updated, the process ends (step S601).

Specifically, the update control unit 201 determines that the medical synonym dictionary is to be updated when the case data stored in the case database 101 has been updated by addition, deletion, or modification, and determines that it is not to be updated when the case data has not been updated.

When the case data has been updated, the update control unit 201 may update the medical synonym dictionary for all keywords. Alternatively, it may count the appearance frequency of each keyword across all case data stored in the case database 101 and update the medical synonym dictionary only for keywords whose appearance frequency is less than or equal to a threshold. If a keyword appears sufficiently often in the case database 101, its synonym relationships have already been evaluated using a sufficient amount of data. When further instances of such a high-frequency keyword are added, recalculating the cosine distance between keyword vectors and the Euclidean distance between representative image vectors would not change the values much, so there is little need to update the medical synonym dictionary. For keywords that appear infrequently, on the other hand, the synonym relationships are highly uncertain, so there is a strong need to update the medical synonym dictionary. Deciding whether to update the synonym dictionary according to keyword frequency in the case database in this way reduces the amount of computation at update time and therefore shortens the update time.
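The frequency-based selection described above can be sketched in Python as follows. The sketch rests on assumptions that the text does not fix: appearance frequency is taken to be the number of interpretation reports that contain the keyword (a keyword repeated within one report counts once), and the function name, input layout and threshold are hypothetical.

```python
from collections import Counter


def keywords_to_update(keywords_per_report, threshold):
    """Return the keywords whose appearance frequency is <= threshold.

    keywords_per_report -- one list of extracted keywords per interpretation report
    """
    counts = Counter(kw for kws in keywords_per_report for kw in set(kws))
    return sorted(kw for kw, n in counts.items() if n <= threshold)


# Hypothetical keywords extracted from three interpretation reports.
extracted = [
    ["high absorption", "clear edge"],
    ["high absorption"],
    ["high absorption", "ring enhancement"],
]
targets = keywords_to_update(extracted, threshold=1)
```

Only the rare keywords are re-evaluated here; “high absorption”, which appears in all three reports, is skipped, which is where the saving in computation comes from.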

As described above, the medical synonym dictionary creation device 200 according to the present embodiment can automatically update the medical synonym dictionary even when the case data stored in the case database 101 is updated, which enables searches using a more versatile medical synonym dictionary.

The medical synonym dictionary creation device according to the present invention has been described above based on the embodiments, but the present invention is not limited to these embodiments. Forms obtained by applying to the embodiments various modifications conceivable to those skilled in the art, and forms constructed by combining components of different embodiments, are also included within the scope of the present invention as long as they do not depart from its gist.

For example, the synonym determination unit 120A described in the modification of Embodiment 1 may be used in place of the synonym determination unit 120 of the medical synonym dictionary creation device 200 according to Embodiment 2.

The medical synonym dictionary created in Embodiment 1 or Embodiment 2 and stored in the storage unit 110 can be used to support diagnosis or to search medical information. For example, as shown in FIG. 26, a medical synonym dictionary database 301 may be connected to a diagnosis support device 302 or a search device 303 via a network 304 such as the Internet. The medical synonym dictionary database 301 stores the same medical synonym dictionary as that stored in the storage unit 110. By referring to the medical synonym dictionary stored in the medical synonym dictionary database 301, the diagnosis support device 302 performs diagnosis support that also covers synonyms of interpretation items or disease names. Likewise, by referring to the medical synonym dictionary stored in the medical synonym dictionary database 301, the search device 303 searches for similar cases, including synonyms of interpretation items or disease names.

Specifically, the medical synonym dictionary creation device described above may be configured as a computer system comprising a microprocessor, ROM, RAM, a hard disk drive, a display unit, a keyboard, a mouse, and the like. A computer program is stored in the RAM or on the hard disk drive. The medical synonym dictionary creation device achieves its functions through the microprocessor operating in accordance with the computer program. Here, the computer program is composed of a combination of a plurality of instruction codes, each indicating a command to the computer, in order to achieve predetermined functions.

Furthermore, some or all of the components constituting the medical synonym dictionary creation device described above may be configured as a single system LSI (Large Scale Integration). A system LSI is a super-multifunctional LSI manufactured by integrating a plurality of components on a single chip; specifically, it is a computer system comprising a microprocessor, ROM, RAM, and the like. A computer program is stored in the RAM. The system LSI achieves its functions through the microprocessor operating in accordance with the computer program.

Furthermore, some or all of the components constituting the medical synonym dictionary creation device described above may be configured as an IC card or a standalone module that can be attached to and detached from the medical synonym dictionary creation device. The IC card or module is a computer system comprising a microprocessor, ROM, RAM, and the like. The IC card or module may include the super-multifunctional LSI described above. The IC card or module achieves its functions through the microprocessor operating in accordance with the computer program. The IC card or module may be tamper-resistant.

The present invention may also be the methods described above. It may also be a computer program that implements these methods on a computer, or a digital signal constituted by the computer program.

Furthermore, the present invention may be the computer program or the digital signal recorded on a computer-readable non-transitory recording medium, for example, a flexible disk, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a BD (Blu-ray Disc (registered trademark)), or a semiconductor memory. It may also be the digital signal recorded on such a non-transitory recording medium.

The present invention may also transmit the computer program or the digital signal via a telecommunication line, a wireless or wired communication line, a network typified by the Internet, data broadcasting, or the like.

The present invention may also be a computer system comprising a microprocessor and a memory, in which the memory stores the computer program and the microprocessor operates in accordance with the computer program.

The present invention may also be implemented by another independent computer system, by recording the program or the digital signal on the non-transitory recording medium and transferring it, or by transferring the program or the digital signal via the network or the like.

The present invention can be used as, for example, a medical synonym dictionary creation device for interpretation reports in the field of diagnostic imaging.

20 Medical image
21 Interpretation report
22 Interpretation report ID
23 Image ID
24 Image finding
25 Confirmed diagnosis result
30 Keyword name
31 Keyword attribute
100, 100A, 200 Medical synonym dictionary creation device
101 Case database
102 Keyword dictionary
103 Interpretation knowledge database
104 Acquisition unit
105 Keyword extraction unit
106 Keyword pair selection unit
107 Text determination unit
108 Representative image vector generation unit
109 Image determination unit
110 Storage unit
120, 120A Synonym determination unit
121 Output unit
122 Display device
201 Update control unit
301 Medical synonym dictionary database
302 Diagnosis support device
303 Search device
304 Network

Claims

A medical synonym dictionary creation device comprising:
an acquisition unit that acquires a medical image and an interpretation report, which is document data describing the result of interpreting the medical image;
a keyword extraction unit that refers to keyword dictionary data in which keywords are registered, each keyword being either an interpretation item, which is a character string indicating a feature of a medical image, or a disease name, which is a character string indicating a diagnosis result for a medical image, and that extracts, from the interpretation report acquired by the acquisition unit, the keywords registered in the keyword dictionary data;
a keyword pair selection unit that selects a keyword pair from the keywords extracted by the keyword extraction unit;
a synonym determination unit that determines whether the keyword pair selected by the keyword pair selection unit is a synonym pair; and
an output unit that outputs the keyword pair that the synonym determination unit has determined to be synonyms, as synonyms included in a medical synonym dictionary,
wherein the synonym determination unit (i) determines, based on the interpretation report, whether the keyword pair is a synonym pair; (ii) based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and a keyword for that medical image, weights, for each keyword constituting the keyword pair, each image feature amount calculated from the medical image from which the keyword was derived, with a larger weight the higher the relevance between the image feature amount and the keyword, thereby creating for each keyword an image feature amount vector whose elements are the weighted image feature amounts, and determines whether the keyword pair is a synonym pair by comparing the two image feature amount vectors for the keyword pair; and (iii) determines that the keyword pair selected by the keyword pair selection unit is a synonym pair when both determination results indicate synonyms.

前記同義語判定部は、
前記読影レポートに基づいて、前記キーワード対が同義語であるか否かを判定するテキスト判定部と、
前記テキスト判定部で同義語であると判定された場合に、前記二項間関係情報に基づいて、前記キーワード対を構成するキーワードごとに当該キーワードの作成の基となった医用画像から算出した各画像特徴量に対して当該画像特徴量と当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うことにより、重み付けされた各画像特徴量を要素とする各キーワードの画像特徴量ベクトルを生成する代表画像ベクトル生成部と、
前記代表画像ベクトル生成部が生成した前記キーワード対に対する２つの画像特徴量ベクトルを比較することにより、前記キーワード対が同義語であるか否かを判定する画像判定部と
を含み、
前記出力部は、前記画像判定部が同義語であると判定したキーワード対を、前記医用同義語辞書に含まれる同義語として出力する
請求項１記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 1, wherein the synonym determination unit includes:
a text determination unit that determines, based on the interpretation report, whether the keyword pair is a pair of synonyms;
a representative image vector generation unit that, when the text determination unit has determined the keyword pair to be synonyms, weights, based on the binomial relationship information and for each keyword constituting the keyword pair, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, thereby generating for each keyword an image feature amount vector whose elements are the weighted image feature amounts; and
an image determination unit that determines whether the keyword pair is a pair of synonyms by comparing the two image feature amount vectors generated for the keyword pair by the representative image vector generation unit,
and wherein the output unit outputs each keyword pair determined by the image determination unit to be synonyms as synonyms included in the medical synonym dictionary.
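The two-stage arrangement in this claim, where the image-based judgment runs only for pairs that pass the text-based judgment, can be sketched as a short-circuit cascade. This is an illustration only: `cascade_judge`, `text_judge`, and `image_judge` are hypothetical stand-ins, and the judgments themselves are stubbed out.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]

def cascade_judge(pair: Pair,
                  first: Callable[[Pair], bool],
                  second: Callable[[Pair], bool]) -> bool:
    """Run the second judgment only when the first one says 'synonym'."""
    return first(pair) and second(pair)

calls: List[str] = []  # records which judgments actually ran

def text_judge(pair: Pair) -> bool:
    # Hypothetical stand-in for the text determination unit.
    calls.append("text")
    return pair == ("nodule", "nodular shadow")

def image_judge(pair: Pair) -> bool:
    # Hypothetical stand-in for the image determination unit.
    calls.append("image")
    return True

accepted = cascade_judge(("nodule", "nodular shadow"), text_judge, image_judge)
rejected = cascade_judge(("nodule", "cyst"), text_judge, image_judge)
```

Passing the two judgments in the opposite order gives the image-first arrangement of the following claim; in both cases the final result is true only when both judgments agree.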

前記同義語判定部は、
前記二項間関係情報に基づいて、前記キーワード対を構成するキーワードごとに当該キーワードの作成の基となった医用画像から算出した各画像特徴量に対して当該画像特徴量と当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うことにより各キーワードの画像特徴量ベクトルを生成する代表画像ベクトル生成部と、
前記代表画像ベクトル生成部が生成した前記キーワード対に対する２つの画像特徴量ベクトルを比較することにより、前記キーワード対が同義語であるか否かを判定する画像判定部と、
前記画像判定部で同義語であると判定された場合に、前記読影レポートに基づいて、前記キーワード対が同義語であるか否かを判定するテキスト判定部と
を含み、
前記出力部は、前記テキスト判定部が同義語であると判定したキーワード対を、前記医用同義語辞書に含まれる同義語として出力する
請求項１記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 1, wherein the synonym determination unit includes:
a representative image vector generation unit that weights, based on the binomial relationship information and for each keyword constituting the keyword pair, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, thereby generating an image feature amount vector for each keyword;
an image determination unit that determines whether the keyword pair is a pair of synonyms by comparing the two image feature amount vectors generated for the keyword pair by the representative image vector generation unit; and
a text determination unit that, when the image determination unit has determined the keyword pair to be synonyms, determines, based on the interpretation report, whether the keyword pair is a pair of synonyms,
and wherein the output unit outputs each keyword pair determined by the text determination unit to be synonyms as synonyms included in the medical synonym dictionary.

前記テキスト判定部は、前記キーワード対を構成する各キーワードについて、前記読影レポート中の当該キーワードを含む文章中の当該キーワード以外のキーワードの出現頻度をベクトルの要素とするキーワードベクトルを作成し、作成したキーワードベクトル間の距離が第１閾値以下であれば、前記キーワード対が同義語であると判定する
請求項２または３に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 2 or 3, wherein the text determination unit creates, for each keyword constituting the keyword pair, a keyword vector whose elements are the appearance frequencies of the other keywords in the sentences of the interpretation report that contain that keyword, and determines that the keyword pair is a pair of synonyms if the distance between the created keyword vectors is equal to or less than a first threshold.
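The text-based judgment in this claim can be sketched as follows. The sketch assumes that each sentence has already been reduced to its list of extracted keywords, and that the distance is Euclidean; the claim itself names neither. `keyword_vector`, `text_judge`, `vocabulary`, and the sample sentences are hypothetical.

```python
import math
from collections import Counter

def keyword_vector(keyword, sentences, vocabulary):
    """Count how often each vocabulary keyword appears in sentences containing `keyword`."""
    counts = Counter()
    for sentence in sentences:
        if keyword in sentence:
            counts.update(k for k in sentence if k != keyword)
    return [counts[k] for k in vocabulary]

def text_judge(kw_a, kw_b, sentences, vocabulary, first_threshold):
    """Synonyms if the co-occurrence vectors are within the first threshold."""
    va = keyword_vector(kw_a, sentences, vocabulary)
    vb = keyword_vector(kw_b, sentences, vocabulary)
    return math.dist(va, vb) <= first_threshold

# Invented example: each sentence is a list of keywords found in it.
sentences = [
    ["nodule", "upper lobe", "spiculated"],
    ["nodular shadow", "upper lobe", "spiculated"],
    ["cyst", "liver"],
]
vocabulary = ["upper lobe", "spiculated", "liver"]

same = text_judge("nodule", "nodular shadow", sentences, vocabulary, 1.0)
different = text_judge("nodule", "cyst", sentences, vocabulary, 1.0)
```

Keywords that are used in the same contexts produce nearly identical vectors, which is why a small distance is taken as evidence of synonymy.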

前記代表画像ベクトル生成部は、前記キーワード対を構成するキーワードが読影項目である場合、医用画像から抽出される各画像特徴量と前記医用画像に対する読影項目との関連性を予め定めた二項間関係情報に基づいて、当該キーワードの作成の基となった医用画像から算出した各画像特徴量に当該画像特徴量と読影項目である当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うことにより当該キーワードの画像特徴量ベクトルを生成する
請求項２または３に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 2 or 3, wherein, when a keyword constituting the keyword pair is an interpretation item, the representative image vector generation unit generates the image feature amount vector of the keyword by weighting, based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and an interpretation item for that medical image, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, which is an interpretation item.

前記代表画像ベクトル生成部は、前記キーワード対を構成するキーワードが疾病名である場合、医用画像から抽出される各画像特徴量と前記医用画像に対する疾病名との関連性を予め定めた二項間関係情報に基づいて、当該キーワードの作成の基となった医用画像から算出した各画像特徴量に当該画像特徴量と疾病名である当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うことにより当該キーワードの画像特徴量ベクトルを生成する
請求項２または３に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 2 or 3, wherein, when a keyword constituting the keyword pair is a disease name, the representative image vector generation unit generates the image feature amount vector of the keyword by weighting, based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and a disease name for that medical image, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, which is a disease name.

前記代表画像ベクトル生成部は、前記キーワード対を構成するキーワードが読影項目である場合、（ｉ）医用画像から抽出される各画像特徴量と前記医用画像に対する読影項目との関連性を予め定めた二項間関係情報に基づいて、当該キーワードの作成の基となった医用画像から算出した各画像特徴量に当該画像特徴量と読影項目である当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うとともに、（ｉｉ）前記読影レポートの中から当該キーワードと共起する疾病名を検出し、読影項目と疾病名との関連性を予め定めた二項間関係情報に基づいて、前記各画像特徴量を読影項目である当該キーワードと当該キーワードと共起する前記疾病名との間の関連性が高いほど大きな値の重みでさらに重み付けを行うことにより、重み付けされた各画像特徴量を要素とする当該キーワードの画像特徴量ベクトルを生成する
請求項２または３に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 2 or 3, wherein, when a keyword constituting the keyword pair is an interpretation item, the representative image vector generation unit (i) weights, based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and an interpretation item for that medical image, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, which is an interpretation item, and (ii) detects, in the interpretation report, a disease name that co-occurs with the keyword and, based on binomial relationship information that predefines the relevance between interpretation items and disease names, further weights each image feature amount using a larger weight the higher the relevance between the keyword, which is an interpretation item, and the co-occurring disease name, thereby generating an image feature amount vector of the keyword whose elements are the weighted image feature amounts.
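The two-level weighting in this claim can be sketched as below. The claim says only that a further weighting is applied, so multiplying the two weights and averaging over cases are assumptions made for illustration. `FEATURE_ITEM`, `ITEM_DISEASE`, `item_vector`, and all values are hypothetical.

```python
# Hypothetical binomial relationship information (all values invented).
# Relevance of each image feature amount (by index) to an interpretation item:
FEATURE_ITEM = {"spiculated": [0.9, 0.2]}
# Relevance between an interpretation item and a disease name:
ITEM_DISEASE = {
    ("spiculated", "lung cancer"): 0.8,
    ("spiculated", "pneumonia"): 0.1,
}

def item_vector(item, cases):
    """Build the vector for an interpretation item.

    cases: list of (image_features, co_occurring_disease), one entry per
    report that mentions `item`.
    """
    feature_weights = FEATURE_ITEM[item]
    total = [0.0] * len(feature_weights)
    for features, disease in cases:
        disease_weight = ITEM_DISEASE[(item, disease)]
        for i, w in enumerate(feature_weights):
            # Assumed combination: product of the two relevance weights.
            total[i] += w * disease_weight * features[i]
    return [t / len(cases) for t in total]

vec = item_vector(
    "spiculated",
    [([1.0, 1.0], "lung cancer"), ([1.0, 1.0], "pneumonia")],
)
```

Under these assumptions, a case in which the interpretation item co-occurs with a strongly related disease name contributes more to the item's vector than a case with a weakly related one.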

前記キーワード対選択部は、読影項目同士または疾病名同士のキーワード対のみを選択する
請求項１〜７のいずれか１項に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to any one of claims 1 to 7, wherein the keyword pair selection unit selects only keyword pairs consisting of two interpretation items or of two disease names.
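A minimal sketch of this restriction, assuming each extracted keyword carries a category label; the function `select_pairs`, the labels `"item"` and `"disease"`, and the sample keywords are hypothetical.

```python
from itertools import combinations

def select_pairs(keywords):
    """Return candidate pairs whose two keywords share a category.

    keywords: dict mapping keyword -> category ("item" or "disease").
    """
    return [
        (a, b)
        for a, b in combinations(sorted(keywords), 2)
        if keywords[a] == keywords[b]
    ]

pairs = select_pairs({
    "nodule": "item",
    "nodular shadow": "item",
    "lung cancer": "disease",
    "pulmonary carcinoma": "disease",
})
```

Of the six possible pairs in this example, only the two same-category pairs remain as candidates for the synonym judgment.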

さらに、
前記出力部が出力するキーワード対を、前記医用同義語辞書に含まれる同義語として記憶する記憶部を備える
請求項１〜８のいずれか１項に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to any one of claims 1 to 8, further comprising a storage unit that stores the keyword pairs output by the output unit as synonyms included in the medical synonym dictionary.

前記取得部は、医用画像と当該医用画像に対する読影レポートとの組である症例データが記憶されている症例データベースから、前記医用画像と前記読影レポートとを取得し、
前記医用同義語辞書作成装置は、さらに、
前記症例データベースに記憶されている症例データが更新されているか否かを判断し、前記症例データが更新されていると判断した場合に、前記取得部、前記キーワード抽出部、前記キーワード対選択部、前記同義語判定部および前記出力部を動作させ、前記医用同義語辞書に含まれる同義語を更新する更新制御部を備える
請求項１〜９のいずれか１項に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to any one of claims 1 to 9, wherein the acquisition unit acquires the medical image and the interpretation report from a case database storing case data, each item of case data being a pair of a medical image and an interpretation report for that medical image,
and wherein the medical synonym dictionary creation device further comprises
an update control unit that determines whether the case data stored in the case database has been updated and, when it determines that the case data has been updated, operates the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit to update the synonyms included in the medical synonym dictionary.

前記更新制御部は、前記症例データが更新されていると判断した場合に、前記取得部、前記キーワード抽出部、前記キーワード対選択部、前記同義語判定部および前記出力部を動作させることにより、前記医用同義語辞書に含まれる全てのキーワードについて同義語を更新する
請求項１０に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 10, wherein, when the update control unit determines that the case data has been updated, it operates the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit to update the synonyms for all keywords included in the medical synonym dictionary.

前記更新制御部は、（ｉ）前記症例データベースに記憶されている前記症例データにおける各キーワードの出現頻度を算出し、（ｉｉ）前記症例データが更新されていると判断した場合に、前記取得部、前記キーワード抽出部、前記キーワード対選択部、前記同義語判定部および前記出力部を動作させることにより、出現頻度が第２閾値以下のキーワードについてのみ同義語を更新する
請求項１０に記載の医用同義語辞書作成装置。 The medical synonym dictionary creation device according to claim 10, wherein the update control unit (i) calculates the appearance frequency of each keyword in the case data stored in the case database, and (ii) when it determines that the case data has been updated, operates the acquisition unit, the keyword extraction unit, the keyword pair selection unit, the synonym determination unit, and the output unit to update the synonyms only for keywords whose appearance frequency is equal to or less than a second threshold.
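The selection of keywords to re-evaluate in this claim can be sketched as a simple frequency filter. Counting every occurrence across cases is an assumption about how the appearance frequency is measured; `keywords_to_update` and the sample data are hypothetical.

```python
from collections import Counter

def keywords_to_update(case_keywords, second_threshold):
    """Pick the keywords whose synonyms should be recomputed.

    case_keywords: one list of extracted keywords per case in the database.
    Returns keywords whose appearance frequency is <= second_threshold.
    """
    freq = Counter(k for case in case_keywords for k in case)
    return sorted(k for k, n in freq.items() if n <= second_threshold)

targets = keywords_to_update(
    [["nodule", "cyst"], ["nodule"], ["nodule", "ground-glass opacity"]],
    second_threshold=1,
)
```

Here the frequently used keyword is left alone and only the two rare keywords are passed back through the synonym judgment after an update.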

医用画像と、当該医用画像を読影した結果が記載された文書データである読影レポートとを取得する取得ステップと、
医用画像の特徴を示す文字列の読影項目または医用画像の診断結果を示す文字列の疾病名であるキーワードが登録されているキーワード辞書データを参照して、前記取得ステップで取得された読影レポートから前記キーワード辞書データに登録されているキーワードを抽出するキーワード抽出ステップと、
前記キーワード抽出ステップで抽出されたキーワードからキーワード対を選択するキーワード対選択ステップと、
前記キーワード対選択ステップで選択されたキーワード対が同義語であるか否かを判定する同義語判定ステップと、
前記同義語判定ステップで同義語であると判定されたキーワード対を、医用同義語辞書に含まれる同義語として出力する出力ステップと
を含み、
前記同義語判定ステップでは、（ｉ）前記読影レポートに基づいて、前記キーワード対が同義語であるか否かを判定し、（ｉｉ）医用画像から抽出される各画像特徴量と前記医用画像に対するキーワードとの間の関連性を予め定めた二項間関係情報に基づいて、前記キーワード対を構成するキーワードごとに当該キーワードの作成の基となった医用画像から算出した各画像特徴量に対して当該画像特徴量と当該キーワードとの間の関連性が高いほど大きな値の重み付けを行うことにより、重み付けされた各画像特徴量を要素とする各キーワードの画像特徴量ベクトルを作成し、前記キーワード対に対する２つの画像特徴量ベクトルを比較することにより、前記キーワード対が同義語であるか否かを判定し、（ｉｉｉ）２つの判定結果が共に同義語であることを示す場合に、前記キーワード対選択ステップで選択されたキーワード対が同義語であると判定する
医用同義語辞書作成方法。 A medical synonym dictionary creation method comprising:
an acquisition step of acquiring a medical image and an interpretation report, which is document data describing the result of interpreting the medical image;
a keyword extraction step of referring to keyword dictionary data in which keywords are registered, each keyword being either an interpretation item (a character string indicating a feature of a medical image) or a disease name (a character string indicating a diagnostic result for a medical image), and extracting, from the interpretation report acquired in the acquisition step, the keywords registered in the keyword dictionary data;
a keyword pair selection step of selecting a keyword pair from the keywords extracted in the keyword extraction step;
a synonym determination step of determining whether the keyword pair selected in the keyword pair selection step is a pair of synonyms; and
an output step of outputting each keyword pair determined in the synonym determination step to be synonyms as synonyms included in a medical synonym dictionary,
wherein the synonym determination step includes (i) determining, based on the interpretation report, whether the keyword pair is a pair of synonyms; (ii) based on binomial relationship information that predefines the relevance between each image feature amount extracted from a medical image and a keyword for that medical image, weighting, for each keyword constituting the keyword pair, each image feature amount calculated from the medical image from which the keyword was created, using a larger weight the higher the relevance between the image feature amount and the keyword, thereby creating for each keyword an image feature amount vector whose elements are the weighted image feature amounts, and determining whether the keyword pair is a pair of synonyms by comparing the two image feature amount vectors for the keyword pair; and (iii) determining that the keyword pair selected in the keyword pair selection step is a pair of synonyms when both determination results indicate synonyms.

請求項１３に記載の医用同義語辞書作成方法に含まれる各ステップをコンピュータに実行させるためのプログラム。 A program for causing a computer to execute each step included in the medical synonym dictionary creating method according to claim 13.