CN114332381B

CN114332381B - Aortic CT image key point detection method and system based on three-dimensional reconstruction

Info

Publication number: CN114332381B
Application number: CN202210005936.8A
Authority: CN
Inventors: 张百海; 李浩天; 柴森春; 王昭洋; 崔灵果; 姚分喜
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2022-01-05
Filing date: 2022-01-05
Publication date: 2024-06-11
Anticipated expiration: 2042-01-05
Also published as: CN114332381A

Abstract

The invention discloses an aortic CT image key point detection method and system based on three-dimensional reconstruction. The method comprises: marking key points on an upper-body image of a human body to obtain a three-dimensional aortic CT image key point data set; obtaining a two-dimensional aortic CT image key point data set through multi-scale cutting mapping; performing target aortic frame selection and cutting to obtain a 0/1 heatmap and an offset map; improving the Vnet network, inputting the 0/1 heatmap and the offset map into the improved Vnet network for training, and generating key point circles on the multi-scale views; and performing spatial geometric three-dimensional reconstruction to obtain the three-dimensional coordinates of the key points, thereby completing key point detection in the aortic CT image. The invention can effectively address problems in aortic CT image processing, namely the small sample size of medical images, long training time, insufficient training precision, and high regression difficulty.

Description

Aortic CT image key point detection method and system based on three-dimensional reconstruction

Technical Field

The invention belongs to the technical field of aortic CT image processing, and particularly relates to an aortic CT image key point detection method and system based on three-dimensional reconstruction.

Background

Medical image processing is one of the important tools of current medical research. It provides an important basis for obtaining and studying detailed information on the anatomy of tissue structures, the planning and guidance of clinical operations, lesion analysis, and the localization of pathology. With the development of machine learning, medical image processing has become increasingly prominent in medical research and even in clinical diagnosis, and processing techniques based on deep learning are now widely applied.

A number of detection methods for medical images have been developed: region-based detection methods, traditional detection methods based on shape models, detection methods based on cascaded shape regression, and detection methods based on neural networks.

The region-based detection method divides the image into a target and a background; if classification detection of multiple targets is required, the image must also be labeled. Gray values of adjacent pixels within one target or background are similar, whereas they differ between different targets or backgrounds. Thresholding based on local or global information is defined by text et al as context-based and non-context-based processing, and can be further classified by processing area as local thresholding, also called adaptive thresholding. Thresholding uses the gray-level histogram of the image; it is computationally cheap and easy to implement, but it is prone to false detections, ignores spatial information, and easily produces artifacts due to noise, so it is mainly used as a preprocessing method. On this basis, Varga-Szemes et al constructed a semi-automatic cardiac MRI detection procedure based on a myocardial signal intensity threshold and compared it with conventional contour-based image processing. Their results showed that the thresholding method was less time-consuming and that, for EDV, ESV, SV, and EF, it agreed better than the conventional method with the SV obtained from aortic blood flow measurement.

The key point detection method based on a shape model is mainly the ASM (Active Shape Model) algorithm proposed by Cootes. It is based on the point distribution model (Point Distribution Model, PDM). In a PDM, the geometry of objects with similar shapes, such as faces, the aorta, the heart, or the lungs, is represented by concatenating the coordinates of several key points (landmarks) in sequence to form a shape vector. The ASM algorithm requires a manually calibrated training set; a shape model is obtained by training, and a specific object is matched by matching its key points. The ASM algorithm has a simple and direct model and a clear architecture, is easy to understand and apply, and imposes a strong constraint on contour shape. However, its key point localization resembles an exhaustive search, which limits its efficiency to some extent.

Detection based on cascaded shape regression was first proposed by Sun et al. The method applies CNNs to key point detection, forming a three-level cascaded CNN called DCNN (Deep Convolutional Network), and belongs to the family of cascaded regression methods. The carefully designed three-level cascade alleviates the problem of local optima caused by poor initial conditions, and the strong feature extraction capability of CNNs yields more accurate key point detection. Accurate key point positions are obtained progressively from coarse to fine, and a local weight-sharing mechanism is introduced, which improves the localization performance of the network.

Neural network algorithms that build on cascaded regression are even more widely used in medical image processing. Unet is a structure that extends and modifies the fully convolutional network. It mainly comprises two parts: a contracting path that captures context information and a symmetric expanding path that enables accurate localization. Building on Unet, the Vnet network was proposed and has been widely used in the field of medical image processing. Compared with the Unet network, the Vnet adds the layer's original input back in the last step of each layer's sampling process, so that information ignored during convolution is combined with the convolution result and reused. However, processing three-dimensional images still suffers from slow training caused by the excessive number of image pixels and from vanishing gradients caused by long paths between target points.

All of the above methods can be used for key point detection in medical images, but they still have a number of drawbacks. The small sample size of medical images, long training time, insufficient training precision, and high regression difficulty continue to constrain medical image processing. To address these problems, the invention provides an aortic CT image key point detection method and system based on three-dimensional reconstruction.

Disclosure of Invention

To solve the problems in current aortic CT image processing, the invention provides an aortic CT image key point detection method and system based on three-dimensional reconstruction.

In order to achieve the above object, the present invention provides an aortic CT image key point detection method based on three-dimensional reconstruction, comprising the following steps:

marking key points on an upper-body image of a human body to obtain a three-dimensional aortic CT image key point data set, wherein the key points are the sinotubular junction, the boundary point between the ascending aorta and the aortic arch, the boundary point between the aortic arch and the descending aorta, and the origin of the iliac arteries;

performing multi-scale cutting mapping on the three-dimensional aortic CT image key point data set to obtain a two-dimensional aortic CT image key point data set;

performing target aortic frame selection based on the two-dimensional aortic CT image key point data set, and cutting to obtain a 0/1 heatmap and an offset map, wherein the 0/1 heatmap is a key point map in which voxels within a preset range around each key point have the value 1, and the offset map is a key point map based on position offsets;

improving the Vnet network, inputting the 0/1 heatmap and the offset map into the improved Vnet network for training, and generating key point circles on the multi-scale views;

performing spatial geometric three-dimensional reconstruction based on the key point circles on the multi-scale views to obtain the three-dimensional coordinates of the key points, thereby completing key point detection in the aortic CT image.

Optionally, the three-dimensional aortic CT image key point data set includes a single point three-dimensional data set, a 4-point three-dimensional single classification data set, and a 4-point three-dimensional 4-classification data set.

Optionally, the two-dimensional aortic CT image key point dataset comprises a single point two-dimensional dataset, a 4 point single classification two-dimensional dataset, and a 4 point 4 classification two-dimensional dataset.

Optionally, the 0/1 heatmap and offset map obtained by cutting include a single-point two-dimensional 0/1 heatmap and offset map, a 4-point single-classification two-dimensional 0/1 heatmap and offset map, and a 4-point 4-classification two-dimensional 0/1 heatmap and offset map.

Optionally, after the 4-point 4-classification two-dimensional 0/1 heatmap is obtained, one-hot encoding is further applied to it.

Optionally, the Vnet network is improved by adding an intermediate supervision layer to the Vnet network and setting a learning objective function, thereby obtaining the improved Vnet network.

Optionally, a loss function is selected separately for training on the single-point two-dimensional 0/1 heatmap and offset map, the 4-point single-classification two-dimensional 0/1 heatmap and offset map, and the 4-point 4-classification two-dimensional 0/1 heatmap and offset map.

Optionally, network training on the single-point two-dimensional 0/1 heatmap and offset map uses a combined loss function that is a linear combination of the cross entropy and Dice loss functions;

network training on the 4-point single-classification two-dimensional 0/1 heatmap and offset map first uses the combined loss function that linearly combines the cross entropy and Dice loss functions, and then uses a focal loss function;

network training on the 4-point 4-classification two-dimensional 0/1 heatmap and offset map uses a focal loss function.

Optionally, the three-dimensional coordinates of the key points include three-dimensional coordinates of single-point key points, three-dimensional coordinates of 4-point single-classification key points and three-dimensional coordinates of 4-point 4-classification key points.

In order to achieve the above object, the invention further provides an aortic CT image key point detection system based on three-dimensional reconstruction, comprising a labeling module, a cutting mapping module, a frame selection module, a training module, and a reconstruction module;

the labeling module is used for marking key points on an upper-body image of a human body to obtain a three-dimensional aortic CT image key point data set, wherein the key points are the sinotubular junction, the boundary point between the ascending aorta and the aortic arch, the boundary point between the aortic arch and the descending aorta, and the origin of the iliac arteries;

the cutting mapping module is used for performing multi-scale cutting mapping on the three-dimensional aortic CT image key point data set to obtain a two-dimensional aortic CT image key point data set;

the frame selection module is used for performing target aortic frame selection on the two-dimensional aortic CT image key point data set, and cutting to obtain a 0/1 heatmap and an offset map;

the training module is used for improving the Vnet network, inputting the 0/1 heatmap and the offset map into the improved Vnet network for training, and generating key point circles on the multi-scale views;

the reconstruction module is used for performing spatial geometric three-dimensional reconstruction based on the key point circles on the multi-scale views to obtain the three-dimensional coordinates of the key points, thereby completing key point detection in the aortic CT image.

Compared with the prior art, the invention has the following advantages and technical effects:

1. To address the insufficient quantity and low reliability of current aortic CT image samples, the invention establishes a key point data set for aortic CT images by manual labeling, in cooperation with authoritative hospital specialists, to ensure the validity and reliability of the data set.

2. The three-dimensional aortic CT image is cut at multiple scales, and its two-dimensional images in the front view, side view, top view, and other scale views are retained, which greatly augments the sample data.

3. To address overlong training time and insufficient training precision, the invention maps or cuts the original three-dimensional model of the aortic key points into two-dimensional space, detects the key points in the two-dimensional images, and reconstructs the detection results in three dimensions by means of Euclidean spatial geometry. This reduces training difficulty and training time on top of the data augmentation. The two-dimensional images can be trained directly with a two-dimensional Vnet, which preserves the network's strong ability to handle edge information and unevenly distributed gray levels.

4. Different loss functions are selected for key point detection targets of different numbers and classes: cross entropy loss and Dice loss are used for the single-point two-class data set, and the focal loss function is used for the multi-point and multi-class data sets, to improve accuracy as far as possible.

5. An intermediate supervision layer is added to the original Vnet network to address vanishing gradients and high regression difficulty during training, so as to reduce training cost and training time as far as possible.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application. In the drawings:

Fig. 1 is a schematic flow chart of an aortic CT image key point detection method based on three-dimensional reconstruction according to the first embodiment of the invention;

Fig. 2 is a schematic diagram of the three-dimensional generation of an aortic CT image according to the first embodiment of the invention;

Fig. 3 (a) is a single-point schematic diagram of the aortic start key point according to the first embodiment of the invention, Fig. 3 (b) is a single-point schematic diagram of the aortic arch01 key point according to the first embodiment of the invention, Fig. 3 (c) is a single-point schematic diagram of the aortic arch02 key point according to the first embodiment of the invention, and Fig. 3 (d) is a single-point schematic diagram of the aortic end key point according to the first embodiment of the invention;

Fig. 4 is a schematic diagram of the 4-point single classification of aortic key points according to the first embodiment of the invention;

Fig. 5 is a schematic diagram of the 4-point 4-classification of the aorta displayed in different views according to the first embodiment of the invention;

Fig. 6 is a schematic structural diagram of an aortic CT image key point detection system based on three-dimensional reconstruction according to the second embodiment of the invention.

Detailed Description

It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other. The application will be described in detail below with reference to the drawings in connection with embodiments.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer executable instructions, and that although a logical order is illustrated in the flowcharts, in some cases the steps illustrated or described may be performed in an order other than that illustrated herein.

Example 1

As shown in Fig. 1, this embodiment provides an aortic CT image key point detection method based on three-dimensional reconstruction, which comprises the following steps.

Marking key points on an upper-body image of a human body to obtain a three-dimensional aortic CT image key point data set, wherein the key points are the sinotubular junction, the boundary point between the ascending aorta and the aortic arch, the boundary point between the aortic arch and the descending aorta, and the origin of the iliac arteries; the three-dimensional aortic CT image key point data set comprises a single-point three-dimensional data set, a 4-point three-dimensional single-classification data set, and a 4-point three-dimensional 4-classification data set.

In this embodiment, the aortic key points are labeled mainly with the 3D Slicer software. 3D Slicer is used to label the upper-body CT images provided by the hospital. In each aortic image group, which consists of 800 to 1000 two-dimensional CT images, the frame selection labeling tool is used to frame the aortic region, and a three-dimensional model is generated, as shown in Fig. 2. This work is done in cooperation with the hospital to mitigate insufficient precision that may arise during aortic labeling. The accurately generated three-dimensional aorta is then labeled in segments according to its physiological structure, and the key points are labeled accordingly.

Four key points of the aorta are labeled and named: the starting point of the aorta from the left ventricle, the sinotubular junction, is named start; the boundary point between the ascending aorta and the aortic arch is named arch01; the boundary point between the aortic arch and the descending aorta is named arch02; and the bifurcation of the descending aorta, where the iliac arteries originate, is named end. Labeling these 4 key points divides the aorta into 4 segments (ascending aorta, aortic arch, descending aorta, iliac artery). More than 200 groups are labeled, which initially completes the three-dimensional aortic CT image key point data set.

The position information of the 4 labeled aortic key points is stored in JSON format, and a program is written to read it. The JSON file is first read in VS Code to generate a dictionary; the two hundred groups of labeled position files are then traversed, and the position information of the 4 points in each file is read.
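The reading step might be sketched as follows. The JSON layout, the function names `parse_keypoints` and `load_dataset`, and the sample coordinates are all hypothetical, since the source does not give the file schema; only the four key point names come from this embodiment.

```python
import json
from pathlib import Path

# Key point names used in this embodiment.
KEYPOINT_IDS = ("start", "arch01", "arch02", "end")


def parse_keypoints(text):
    """Parse one annotation file's JSON text into {id: (x, y, z)}.

    Assumed layout (not specified in the source):
    {"points": [{"id": "start", "position": [x, y, z]}, ...]}
    """
    data = json.loads(text)
    points = {p["id"]: tuple(float(c) for c in p["position"]) for p in data["points"]}
    missing = [k for k in KEYPOINT_IDS if k not in points]
    if missing:
        raise ValueError(f"missing key points: {missing}")
    return {k: points[k] for k in KEYPOINT_IDS}


def load_dataset(folder):
    """Traverse every *.json file in a folder and collect its key points."""
    return {path.stem: parse_keypoints(path.read_text(encoding="utf-8"))
            for path in sorted(Path(folder).glob("*.json"))}


# Illustrative coordinates only, not data from the patent.
example = json.dumps({"points": [
    {"id": "start", "position": [10, 20, 30]},
    {"id": "arch01", "position": [12, 25, 60]},
    {"id": "arch02", "position": [18, 40, 62]},
    {"id": "end", "position": [15, 42, 5]},
]})
keypoints = parse_keypoints(example)
```

Keeping the parser separate from the folder traversal makes it easy to check a single file before running over all labeled groups.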

For each point, the distinct id in the JSON position data is used to generate a data map containing the position information of a single aortic key point. Voxels within 0 to 6 mm of the stored position are set to 1, which yields a single-point map corresponding to the single-point position data and thus a single-point three-dimensional data set, as shown in Fig. 3.
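A minimal sketch of the single-point map, assuming NumPy and an isotropic 1 mm voxel spacing (the source states the 6 mm range but not the spacing); `single_point_volume` is a hypothetical helper name.

```python
import numpy as np


def single_point_volume(shape, center, radius_mm=6.0, spacing_mm=(1.0, 1.0, 1.0)):
    """Return a uint8 volume that is 1 within radius_mm of center, else 0.

    shape and center are in voxel indices (z, y, x); spacing_mm is an
    assumed voxel size, since the source does not state one.
    """
    grids = np.meshgrid(*[np.arange(n) for n in shape], indexing="ij")
    dist_sq = sum(((g - c) * s) ** 2 for g, c, s in zip(grids, center, spacing_mm))
    return (dist_sq <= radius_mm ** 2).astype(np.uint8)


volume = single_point_volume((32, 32, 32), center=(16, 16, 16))
```

With anisotropic CT data, passing the real slice spacing keeps the marked region a sphere in millimetres rather than in voxels.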

On the basis of the single-point three-dimensional data set, the single points in each group of data are traversed in a loop so that all 4 points are reproduced in the same volume, generating a 4-point three-dimensional single-classification data set, as shown in Fig. 4.

Once the 4-key-point map is complete, each key point is classified so that the output of the neural network can address the multi-point classification regression problem. The 4 kinds of single points are classified and labeled in space by means of the different JSON position ids of the points, generating a 4-point three-dimensional 4-classification data set, as shown in Fig. 5.
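The step from single-point maps to the 4-point single-classification and 4-point 4-classification volumes could look roughly like this. The helper `combine_masks` and the tiny one-voxel masks are illustrative assumptions; the class labels 1 to 4 follow the labeling stated later in this embodiment.

```python
import numpy as np

# Class labels as given later in this embodiment (background is 0).
LABELS = {"start": 1, "arch01": 2, "arch02": 3, "end": 4}


def combine_masks(masks):
    """Merge per-point 0/1 masks into single-class and 4-class volumes.

    masks maps a key point name to a 0/1 array; all arrays share one shape.
    Where two masks overlap, the later label wins (an arbitrary choice here).
    """
    first = next(iter(masks.values()))
    single_class = np.zeros_like(first, dtype=np.uint8)
    four_class = np.zeros_like(first, dtype=np.uint8)
    for name, mask in masks.items():
        single_class[mask > 0] = 1
        four_class[mask > 0] = LABELS[name]
    return single_class, four_class


# Tiny illustrative masks: one voxel per key point on a 1 x 1 x 6 grid.
masks = {}
for index, name in enumerate(LABELS):
    mask = np.zeros((1, 1, 6), dtype=np.uint8)
    mask[0, 0, index] = 1
    masks[name] = mask
single_class, four_class = combine_masks(masks)
```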

Applying this procedure to the 200 groups of experimental data to generate the single-point, 4-point single-classification, and 4-point 4-classification volumes yields the three-dimensional aortic CT image key point data set.

Performing multi-scale cutting mapping on the three-dimensional aortic CT image key point data set to obtain a two-dimensional aortic CT image key point data set; the two-dimensional aortic CT image key point data set comprises a single-point two-dimensional data set, a 4-point single-classification two-dimensional data set and a 4-point 4-classification two-dimensional data set.

In this embodiment, the single-point three-dimensional data set, the 4-point three-dimensional single-classification data set, and the 4-point three-dimensional 4-classification data set are cut and mapped from multiple directions with the 3D Slicer image processing tool. The front view, side view, top view, and some other scale images are taken to establish the two-dimensional aortic CT image key point data set, which consists of the single-point two-dimensional data set, the 4-point single-classification two-dimensional data set, and the 4-point 4-classification two-dimensional data set.
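The embodiment performs this mapping in 3D Slicer. As a rough stand-in for illustration only, three orthogonal views of a label volume can be obtained by maximum projection with NumPy; the (z, y, x) axis order and the view names are assumptions of this sketch.

```python
import numpy as np


def project_views(volume):
    """Project a (z, y, x) volume onto three orthogonal planes.

    Maximum projection keeps a 0/1 or class-labelled volume intact
    (the largest label wins where structures overlap along the ray).
    """
    return {
        "front": volume.max(axis=1),  # looks along y, image axes (z, x)
        "side": volume.max(axis=2),   # looks along x, image axes (z, y)
        "top": volume.max(axis=0),    # looks along z, image axes (y, x)
    }


volume = np.zeros((4, 5, 6), dtype=np.uint8)
volume[1, 2, 3] = 1
views = project_views(volume)
```

Each 3D key point therefore appears on every view, which is what later allows its 3D position to be recovered from the 2D detections.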

Performing target aortic frame selection based on the two-dimensional aortic CT image key point data set, and cutting to obtain a 0/1 heatmap and an offset map, wherein the 0/1 heatmap is a key point map in which voxels within a preset range around each key point have the value 1, and the offset map is a key point map based on position offsets; the 0/1 heatmap and the offset map comprise a single-point two-dimensional 0/1 heatmap and offset map, a 4-point single-classification two-dimensional 0/1 heatmap and offset map, and a 4-point 4-classification two-dimensional 0/1 heatmap and offset map. After the 4-point 4-classification two-dimensional 0/1 heatmap is obtained, one-hot encoding is further applied to it.

In this embodiment, the two-dimensional aortic CT image key point data set is preprocessed into a form suitable for neural network training, as follows. The two-dimensional aortic images are first processed with Faster R-CNN to detect the aortic position. Using this supervised neural network, the complete target aorta is framed in each two-dimensional medical image obtained by cutting mapping and taken as the center for segmenting the whole image. The image is then cut along the edge of the frame, producing a two-dimensional image whose main content is a view of the aorta. This reduces the interference of background elements with training and reduces the number of training voxels and the training time. The resulting two-dimensional images are also normalized so that images of the same view share one size and the sizes of different views are related, which facilitates network training; according to actual design requirements, the front view can be 112×160, the side view 112×160, and the top view 112×112. To improve training precision, the cut images are processed further: a 0/1 heatmap key point map is generated in which voxels within 6 unit lengths of each key point have the value 1, together with a key point offset map based on position offsets, which improves the precision of network training and reduces its difficulty.
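The two training targets might be generated as in the sketch below. The source does not define the offset map precisely; here it is assumed to store, for each pixel inside the disc, the row and column displacement to the key point. The 112 × 160 size is read as rows × columns, which is also an assumption, as is the helper name `heatmap_and_offset`.

```python
import numpy as np


def heatmap_and_offset(shape, keypoint, radius=6.0):
    """Build a 0/1 heatmap and an offset map for one 2D key point.

    heatmap[r, c] is 1 within `radius` pixels of the key point.
    offset[:, r, c] holds (d_row, d_col) from the pixel to the key point,
    zeroed outside the disc; this offset definition is an assumption.
    """
    rows, cols = np.meshgrid(np.arange(shape[0]), np.arange(shape[1]), indexing="ij")
    d_row = keypoint[0] - rows
    d_col = keypoint[1] - cols
    heatmap = (d_row ** 2 + d_col ** 2 <= radius ** 2).astype(np.uint8)
    offset = np.stack([d_row, d_col]).astype(np.float32) * heatmap
    return heatmap, offset


# Front-view size quoted in this embodiment: 112 x 160.
heatmap, offset = heatmap_and_offset((112, 160), keypoint=(50, 80))
```

Under this definition, any pixel predicted inside the disc can vote for the key point location by adding its offset to its own position.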

Because key point detection involves 4 key points, the original two-dimensional images are one-hot encoded before being input into the network for training, which turns the task into a multi-class image processing problem.

One-hot encoding numbers the categorical variables in the input data so that they can be expressed as binary vectors of 0s and 1s. The categorical values are first mapped to integer values. Each integer value is then represented as a binary vector that is all zeros except at the index of the integer, which is set to 1. One-hot encoding of discrete features makes distance computations between features more reasonable.

For the 4-point 4-classification data set used in the invention, the sinotubular junction (start) is labeled 1, the boundary between the ascending aorta and the aortic arch (arch01) is labeled 2, the boundary between the aortic arch and the descending aorta (arch02) is labeled 3, the iliac artery origin (end) is labeled 4, and background voxels are labeled 0.
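With the labels 0 to 4 described above, one-hot encoding of a label map can be sketched in NumPy as follows; the function name `one_hot` and the small example map are illustrative.

```python
import numpy as np


def one_hot(label_map, num_classes=5):
    """Turn an integer label map (H, W) into a (num_classes, H, W) 0/1 array."""
    classes = np.arange(num_classes).reshape(-1, 1, 1)
    return (label_map[None, ...] == classes).astype(np.uint8)


label_map = np.array([[0, 1, 2],
                      [3, 4, 0]])
encoded = one_hot(label_map)
```

Each pixel ends up with exactly one channel set to 1, so channel 0 is the background and channels 1 to 4 correspond to start, arch01, arch02, and end.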

Improving the Vnet network, inputting the 0/1 heatmap and the offset map into the improved Vnet network for training, and generating key point circles on the multi-scale views. The Vnet network is improved by adding an intermediate supervision layer to it and setting a learning objective function, thereby obtaining the improved Vnet network. A loss function is selected separately for training on the single-point two-dimensional 0/1 heatmap and offset map, the 4-point single-classification two-dimensional 0/1 heatmap and offset map, and the 4-point 4-classification two-dimensional 0/1 heatmap and offset map. Network training on the single-point two-dimensional 0/1 heatmap and offset map uses a linear combination of the cross entropy and Dice loss functions; network training on the 4-point single-classification two-dimensional 0/1 heatmap and offset map uses a focal loss function; and network training on the 4-point 4-classification two-dimensional 0/1 heatmap and offset map uses a focal loss function.
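For reference, a multi-class focal loss in its commonly used form can be sketched as below. The source names the focal loss but gives neither its formula nor its hyperparameters, so the focusing parameter gamma = 2 and the omission of class weights are assumptions of this sketch.

```python
import numpy as np


def focal_loss(probs, one_hot_targets, gamma=2.0, eps=1e-7):
    """Mean multi-class focal loss.

    probs and one_hot_targets have shape (num_classes, num_pixels);
    probs holds softmax outputs. gamma=2 is a common default, not a value
    given in the source. With gamma=0 this reduces to cross entropy.
    """
    probs = np.clip(probs, eps, 1.0)
    p_true = (probs * one_hot_targets).sum(axis=0)  # probability of the true class
    return float(np.mean(-((1.0 - p_true) ** gamma) * np.log(p_true)))


targets = np.array([[1, 0], [0, 1]], dtype=float)   # two pixels, two classes
confident = np.array([[0.9, 0.1], [0.1, 0.9]])
uncertain = np.array([[0.6, 0.4], [0.4, 0.6]])
```

The factor (1 - p)^gamma shrinks the contribution of well-classified pixels, which is useful here because background pixels vastly outnumber key point pixels.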

Performing spatial geometric three-dimensional reconstruction based on the key point circles on the multi-scale views to obtain the three-dimensional coordinates of the key points, thereby completing key point detection in the aortic CT image. The three-dimensional coordinates of the key points comprise the three-dimensional coordinates of single-point key points, of 4-point single-classification key points, and of 4-point 4-classification key points.

In this embodiment, the intermediate supervision layer added to the original Vnet network is described as follows:

Since it was proposed as a modified form of the Unet network, the Vnet network has been widely used in medical image processing. Medical images, which depict a particular organ or tissue, generally have distinctive semantics and structure and are difficult to acquire. In view of these characteristics, the Vnet network performs 4 downsampling and 4 upsampling steps, which effectively improves the resolution and edge precision of the image. The model also has a relatively small number of parameters, which is an advantage for training tasks with few samples.

The left side of the network is divided into stages that operate at different resolutions. Each stage comprises one to three convolutional layers and is formulated so that it learns a residual function: the input of each stage is processed by the convolutional layers and their nonlinearities, and is also added to the output of the last convolutional layer of that stage. Compared with similar networks that do not learn residual functions, this architecture converges in a fraction of the time.

Features are extracted by convolution, as is usual in convolutional neural networks. Because the number of feature channels doubles at each stage of the Vnet compression path and the model is formulated as a residual network, these convolution operations are also used to double the number of feature maps whenever their resolution is reduced. PReLU nonlinearities are applied throughout the network.

The invention likewise uses the Vnet network for training the key point detection regression on medical images. Owing to its special structure, the existing Vnet network makes efficient use of the information lost during convolution and of edge information. However, compared with the CPM network commonly used in image segmentation, it still has considerable disadvantages with respect to vanishing gradients and overlong training time. To address these problems, the invention modifies the existing Vnet network by adding an intermediate supervision module between layers and using the learning objective function f = ||x - x_i||² (where x is the image voxel obtained by training and x_i is the ideal image voxel). This addresses the vanishing gradients that easily occur when the offset map is propagated between layers and the high difficulty of network training. Gradient descent over the whole network becomes easier, and the training speed improves.
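One way to picture intermediate supervision is as extra loss terms, each applying f = ||x - x_i||² to an intermediate output compared against a resized copy of the ideal map. The sketch below assumes average pooling for the resizing and an arbitrary weight of 0.5 for the intermediate terms; neither choice is stated in the source, and the function names are hypothetical.

```python
import numpy as np


def squared_error(pred, target):
    """Learning objective f = ||x - x_i||^2 between an output and its ideal map."""
    return float(np.sum((pred - target) ** 2))


def downsample(target, factor):
    """Average-pool a 2D map by an integer factor (sizes must divide evenly)."""
    h, w = target.shape
    return target.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))


def supervised_loss(final_pred, side_preds, target, side_weight=0.5):
    """Final-layer loss plus weighted losses on intermediate outputs.

    side_preds maps a downsampling factor to the prediction at that scale.
    side_weight and the pooling of the target are assumptions of this sketch.
    """
    loss = squared_error(final_pred, target)
    for factor, pred in side_preds.items():
        loss += side_weight * squared_error(pred, downsample(target, factor))
    return loss


target = np.zeros((4, 4))
target[:2, :2] = 1.0
perfect = supervised_loss(target, {2: downsample(target, 2)}, target)
```

Because every supervised layer receives its own error signal, gradients reach the early layers through shorter paths than they would from the final output alone.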

In this embodiment, training on the single-point two-dimensional data set with a combined loss function that linearly combines the cross entropy and Dice loss functions proceeds as follows:

First, the single-point two-dimensional 0/1 heatmap and offset map are input into the Vnet network for training to observe the performance of the constructed Vnet on the two-class problem. The cross entropy loss and the Dice loss, both commonly used in image segmentation and two-class problems, serve as loss functions. The loss function is an important component in training a neural network model.

The cross entropy is calculated as follows:

Where n is the total number of pixels in the image, p(x_i) is the predicted value at the point to be predicted x_i among the n points, and q(x_i) = 1 - p(x_i). Cross entropy essentially measures the degree of difference between different probability distributions of a random variable. In machine learning, for example, it measures the difference between the actual probability distribution of the samples and the predicted probability distribution at the output.

Dice coefficient loss function (Dice loss):

Where X is the actual label, Y is the predicted value obtained by training, and |X ∩ Y| represents the intersection between X and Y. Because the denominator counts the elements of X and of Y separately, the elements common to both sets are counted twice, and the numerator of the formula therefore carries a coefficient of 2.
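An expression consistent with this description is the standard Dice coefficient shown below. Writing the loss as one minus the coefficient is a common convention and is an assumption here, not a form confirmed by the source.

```latex
\mathrm{Dice}(X, Y) = \frac{2\,|X \cap Y|}{|X| + |Y|},
\qquad
L_{\mathrm{Dice}} = 1 - \mathrm{Dice}(X, Y)
```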

Compared with the Dice loss function, the cross entropy loss has better-behaved gradients. Owing to this property, the cross entropy loss is widely used in image segmentation and classification. With the Dice loss, however, if the output of the activation function layer and the target value are both small in extreme cases such as loss of input, the magnitude of the squared term in the gradient grows rapidly, which makes the training process extremely unstable.

In order to combine the advantages of the cross entropy loss and the Dice loss on the binary classification training problem, and to avoid as far as possible the interference caused by their respective deficiencies, training on the two-dimensional dataset of the single-point key point uses a linear combination of the two loss functions.
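A linear combination of the two losses can be sketched as follows. The sketch assumes per-pixel probabilities and 0/1 labels given as flat lists, uses the standard binary cross entropy and a smoothed soft Dice loss, and introduces a hypothetical mixing weight `alpha`; the source does not state the weights or the exact formulas used.

```python
import math


def binary_cross_entropy(probs, labels, eps=1e-7):
    """Mean binary cross entropy; probabilities are clamped to avoid log(0)."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


def dice_loss(probs, labels, smooth=1.0):
    """Soft Dice loss: 1 - (2|X∩Y| + smooth) / (|X| + |Y| + smooth)."""
    intersection = sum(p * y for p, y in zip(probs, labels))
    denominator = sum(probs) + sum(labels)
    return 1.0 - (2.0 * intersection + smooth) / (denominator + smooth)


def combined_loss(probs, labels, alpha=0.5):
    """alpha * cross entropy + (1 - alpha) * Dice; alpha is an assumed weight."""
    return (alpha * binary_cross_entropy(probs, labels)
            + (1.0 - alpha) * dice_loss(probs, labels))
```

The smoothing constant keeps the Dice term finite when both the prediction and the label are nearly empty, which is the unstable regime described above.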

In this embodiment, the three-dimensional reconstruction of the single-point two-dimensional dataset comprises:

After single-point training is completed, geometric reconstruction analysis is performed on the key point coordinates regressed for the two-dimensional aortic CT images at each scale. The single two-dimensional key point circles generated on the front view, side view, top view and views at other scales are reconstructed in three dimensions by spatial geometry: using the multiple views, the point is confined to the spatial geometric body formed by the centers of the key point circles, and the center of the circumscribed circle of this body is taken as the position of the required key point, which gives the three-dimensional coordinates of the single-point key point. The distance between the key point position obtained by training and the original key point position (x, y, z are the three-dimensional coordinates of the labeled key point, and x_i, y_i, z_i are the three-dimensional coordinates of the single-point key point obtained by training) is used as the final evaluation criterion for the key point training result.
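The reconstruction and evaluation step can be illustrated with a simplified sketch. It assumes that the circle centers from three views have already been mapped back into volume coordinates as three non-collinear 3D points, and it takes the circumcenter of the triangle they form as the key point. The helper names are hypothetical, and the source may construct the geometric body differently.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def circumcenter(a, b, c):
    """Circumcenter of the 3D triangle a, b, c (points must not be collinear)."""
    u, v = _sub(a, c), _sub(b, c)
    normal = _cross(u, v)
    n2 = _dot(normal, normal)
    if n2 == 0:
        raise ValueError("points are collinear; circumcenter is undefined")
    # w = |u|^2 * v - |v|^2 * u, then offset = (w x normal) / (2 |normal|^2)
    w = tuple(_dot(u, u) * vi - _dot(v, v) * ui for ui, vi in zip(u, v))
    offset = _cross(w, normal)
    return tuple(ci + oi / (2.0 * n2) for ci, oi in zip(c, offset))


def euclidean_distance(p, q):
    """Distance between the reconstructed and the labeled key point."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(p, q)))
```

For example, `euclidean_distance(circumcenter(a, b, c), labeled_point)` gives the evaluation distance for one key point, where `labeled_point` is the annotated (x, y, z) position.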

In this embodiment, for the 4-point single-classification two-dimensional dataset, a combined loss function of cross entropy and Dice is adopted first and is then replaced by a focal loss function for network training, and three-dimensional reconstruction is performed after the network training, as follows:

After training of the 4 single-point key points is completed, the two-dimensional key point dataset in which all 4 single points are labeled 1 is added to the Vnet training network, num_classes is set to 2, and the two classes of the 2 layers are fed directly into the network for training. The loss function still takes the form of a combination of cross entropy loss and Dice loss. After training, three-dimensional reconstruction is performed for the 4 key points simultaneously, in a manner similar to the single-point reconstruction method. Spatial geometric bodies are constructed from the centers of the key point circles on the 4 scale views, and the center of the circumscribed circle of each spatial geometric body is taken as a key point coordinate, giving the three-dimensional coordinates of the 4-point single-classification key points. These are compared with the original key point positions, the offset d_ev of each point is computed as the Euclidean distance, and the mean and variance of the offsets of the 4 points are taken as the evaluation index of the training effect on the 4-point single-classification key point dataset.

The loss function used for training on the 4-point single-classification two-dimensional dataset is then changed to a focal loss function, and the two parameters of the function are adjusted. The focal loss combines the advantages of single-stage and two-stage training, mitigates to a certain extent the problems caused by class imbalance, and performs well in improving training speed and precision.
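The evaluation index can be computed as in the following sketch, which assumes that the reconstructed and labeled key points are given as lists of (x, y, z) tuples in the same order and that the population variance is intended; the source does not say whether the population or sample variance is used.

```python
import math


def offsets(predicted, labeled):
    """Euclidean offset d_ev for each pair of predicted and labeled key points."""
    return [
        math.sqrt(sum((p - q) ** 2 for p, q in zip(pred, ref)))
        for pred, ref in zip(predicted, labeled)
    ]


def mean_and_variance(values):
    """Mean and population variance of the offsets (variance type is assumed)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, variance
```

A low mean indicates accurate localization on average, while a low variance indicates that no single key point is localized much worse than the others.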

Expression of focal loss:

FL(p) = -(1 - p)^γ log(p)

Where γ is the focusing parameter, γ ≥ 0, and p is the probability value of the point;

(1 - p)^γ is called the modulation factor. It reduces the weight of easily classified samples in the focal loss, and the parameter γ smoothly adjusts the proportion of weight given to easily classified samples among all samples. During training, the focal loss was found to perform better than the Dice loss for network training in multi-point classification detection regression.
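The effect of the modulation factor can be seen in a minimal sketch of the expression above for a single point, where p is the probability assigned to the correct class. The default γ = 2 and the clamping constant `eps` are assumptions made for illustration; the source does not state the parameter values it uses.

```python
import math


def focal_loss(p, gamma=2.0, eps=1e-7):
    """FL(p) = -(1 - p)^gamma * log(p) for one point with true-class probability p."""
    p = min(max(p, eps), 1.0)  # clamp to avoid log(0)
    return -((1.0 - p) ** gamma) * math.log(p)
```

With γ = 0 the expression reduces to the ordinary cross entropy term -log(p); as γ increases, a well-classified point such as p = 0.9 contributes far less to the loss than a poorly classified one.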

In this embodiment, the 4-classification network training and three-dimensional reconstruction of the 4-point two-dimensional dataset includes:

After training with num_classes set to 2 is completed, the onehot-encoded 4-point 4-classification key point dataset is processed. The data are first separated into layers according to the point codes 1, 2, 3 and 4, and the codes 2, 3 and 4 are then all relabeled as 1. Each label batch is trained with a Vnet neural network that uses the focal loss as its loss function and includes the intermediate supervision layer, and the training results are processed in the same way as for the 4-point single classification. Spatial geometric bodies are constructed from the centers of the key point circles on the 4 scale views, and the center of the circumscribed circle of each spatial geometric body is taken as a key point coordinate, giving the three-dimensional coordinates of the 4-point 4-classification key points. These are compared with the original key point positions, the offset d_ev of each point is computed as the Euclidean distance, and the mean and variance of the offsets of the 4 points are taken as the evaluation index of the training effect on the 4-point 4-classification key point dataset.
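The layering step can be sketched as follows, assuming the label map is a two-dimensional list in which 0 is background and the values 1 to 4 mark the four key points. The function name is hypothetical.

```python
def split_into_binary_layers(label_map, num_points=4):
    """Return one 0/1 mask per key point code 1..num_points.

    In each layer the pixels of the selected code become 1 and all others 0,
    so codes 2, 3 and 4 are relabeled as 1 within their own layer.
    """
    return [
        [[1 if value == code else 0 for value in row] for row in label_map]
        for code in range(1, num_points + 1)
    ]
```

Each resulting layer is a binary map of the same kind as the single-point data, so it can be trained with the same two-class network configuration.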

Example two

As shown in FIG. 6, this embodiment provides a schematic structural diagram of an aortic CT image key point detection system based on three-dimensional reconstruction. The system comprises a labeling module, a cutting mapping module, a frame selection module, a training module and a reconstruction module;

Specifically, the labeling module, the cutting mapping module, the frame selection module, the training module and the reconstruction module are connected in sequence; the labeling module is used for labeling key points on an upper body image of a human body to obtain a three-dimensional aortic CT image key point data set, wherein the key points are the sinotubular junction, the boundary point between the ascending aorta and the aortic arch, the boundary point between the aortic arch and the descending aorta, and the position where the iliac artery appears; the cutting mapping module is used for performing multi-scale cutting mapping on the three-dimensional aortic CT image key point data set to obtain a two-dimensional aortic CT image key point data set; the frame selection module is used for performing target aortic frame selection on the two-dimensional aortic CT image key point data set and cutting to obtain a 0/1heatmap map and an offset map; the training module is used for improving the Vnet network, inputting the 0/1heatmap map and the offset map into the improved Vnet network for training, and generating key point circles on the multi-scale views; the reconstruction module is used for performing spatial geometric three-dimensional reconstruction from the key point circles on the multi-scale views, obtaining the three-dimensional coordinates of the key points and completing the detection of the key points of the aortic CT image.

The present application is not limited to the above-mentioned embodiments, and any changes or substitutions that can be easily understood by those skilled in the art within the technical scope of the present application are intended to be included in the scope of the present application. Therefore, the protection scope of the present application should be subject to the protection scope of the claims.

Claims

1. The aortic CT image key point detection method based on three-dimensional reconstruction is characterized by comprising the following steps,

Performing target aortic frame selection based on the two-dimensional aortic CT image key point data set, and cutting to obtain a 0/1heatmap chart and an offset chart, wherein the 0/1heatmap chart is a 0/1heatmap key point chart with voxel values of 1 in a preset range around the key point; the offset map is a key point offset map taking the position offset as a standard; the 0/1heatmap map and the offset map obtained by cutting respectively comprise a single-point two-dimensional 0/1heatmap map and the offset map, a 4-point single-classification two-dimensional 0/1heatmap map and the offset map, and a 4-point 4-classification two-dimensional 0/1heatmap map and the offset map; after the 4-point 4-classification two-dimensional 0/1heatmap diagram is obtained, onehot coding is carried out on the 4-point 4-classification two-dimensional 0/1heatmap diagram;

Improving the Vnet network, inputting the 0/1heatmap graph and the offset graph into the improved Vnet network for training, and generating a key point circle on the multi-scale view; adding an intermediate supervision layer into the Vnet network, and setting a learning objective function to obtain an improved Vnet network;

the method further comprises the steps of selecting a loss function to train the single-point two-dimensional 0/1heatmap diagram and the offset diagram, the 4-point single-classification two-dimensional 0/1heatmap diagram and the offset diagram and the 4-point 4-classification two-dimensional 0/1heatmap diagram and the offset diagram respectively;

the single-point two-dimensional 0/1heatmap graph and the offset graph adopt a combination loss function of linear combination of two loss functions of cross entropy and Dice to perform network training;

the 4-point 4-classification two-dimensional 0/1heatmap diagram and the offset diagram adopt a focal loss function to carry out network training;

2. The aortic CT image keypoint detection method based on three-dimensional reconstruction according to claim 1, wherein the three-dimensional aortic CT image keypoint dataset comprises a single point three-dimensional dataset, a 4-point three-dimensional single classification dataset, a 4-point three-dimensional 4-classification dataset.

3. The aortic CT image keypoint detection method based on three-dimensional reconstruction according to claim 1, wherein the two-dimensional aortic CT image keypoint dataset comprises a single point two-dimensional dataset, a 4-point single-classification two-dimensional dataset, and a 4-point 4-classification two-dimensional dataset.

4. The method for detecting key points of aortic CT image based on three-dimensional reconstruction according to claim 1, wherein,

The three-dimensional coordinates of the key points comprise three-dimensional coordinates of single-point key points, three-dimensional coordinates of 4-point single-classification key points and three-dimensional coordinates of 4-point 4-classification key points.

5. An aortic CT image key point detection system based on three-dimensional reconstruction, which is characterized by being used for executing the aortic CT image key point detection method based on three-dimensional reconstruction according to any one of claims 1-4, wherein the system comprises a labeling module, a cutting mapping module, a frame selection module, a training module and a reconstruction module;