CN112950582A - 3D lung lesion segmentation method and device based on deep learning - Google Patents

3D lung lesion segmentation method and device based on deep learning

Info

Publication number
CN112950582A
CN112950582A (application CN202110223645.1A)
Authority
CN
China
Prior art keywords
image
regression
segmentation
dicom
spherical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110223645.1A
Other languages
Chinese (zh)
Other versions
CN112950582B (en)
Inventor
杜强
陈相儒
郭雨晨
聂方兴
唐超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xbentury Network Technology Co ltd
Original Assignee
Beijing Xbentury Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xbentury Network Technology Co ltd filed Critical Beijing Xbentury Network Technology Co ltd
Priority to CN202110223645.1A priority Critical patent/CN112950582B/en
Publication of CN112950582A publication Critical patent/CN112950582A/en
Application granted granted Critical
Publication of CN112950582B publication Critical patent/CN112950582B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • G06T3/40Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G06T3/4007Scaling of whole images or parts thereof, e.g. expanding or contracting based on interpolation, e.g. bilinear interpolation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20021Dividing image into blocks, subimages or windows
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30061Lung

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention discloses a deep learning-based 3D lung lesion segmentation method, together with a corresponding device, electronic device and storage medium. The method comprises: acquiring a dicom image of a lung nodule and preprocessing it; stacking the preprocessed dicom images in three dimensions to obtain a 3D image block and cutting it; extracting features from the cut 3D image with a pre-trained spherical segmentation model to obtain regression subgraphs; and computing the product of centrality and probability over the regression subgraphs to obtain candidate center point coordinates, from which the coordinates of the regression points are derived to yield the segmentation result. Compared with current mainstream 3D segmentation methods, the method is simpler and faster, matches the roughly spherical shape of nodules, achieves accuracy close to that of mainstream segmentation methods, and offers a new approach.

Description

3D lung lesion segmentation method and device based on deep learning
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a 3D lung lesion segmentation method and device based on deep learning, electronic equipment and a storage medium.
Background
In the medical field, medical imaging has become an indispensable step in accurate diagnosis and lesion localization. However, some lesions are small and hard to find, so physicians may overlook them and misdiagnose; moreover, lesions vary in shape and size, which makes them relatively difficult for physicians to measure.
Deep learning methods, first proposed at the end of the 20th century, have since developed considerably. They imitate the composition of human neurons and use large amounts of data to model neural connections, thereby approximating the behavior of the human brain and completing work that even some professionals cannot.
LIDC-IDRI is a dataset composed of chest medical image files (e.g., CT, X-ray) and corresponding diagnostic lesion annotations. The data were collected at the initiative of the National Cancer Institute to study early cancer detection in high-risk groups. The dataset contains CT images of 1018 patients with pixel-level lung nodule annotations, and the lung nodules are also partially annotated by class.
Current deep learning approaches to detection and segmentation on LIDC-IDRI mainly detect lung nodules first and then perform pixel-wise (2D) or voxel-wise (3D) segmentation on the detection results, yielding the nodule volume and other corresponding descriptors. These approaches rely on a two-stage model, or on single-stage detection plus segmentation, and therefore incur correspondingly high time consumption or memory usage.
Disclosure of Invention
The invention aims to provide a deep learning-based 3D lung lesion segmentation method, device, electronic device and storage medium that do not need to detect first and then segment, but instead segment the 3D image directly to obtain the corresponding result.
In a first aspect, an embodiment of the present invention provides a deep learning-based 3D lung lesion segmentation method, including the following steps:
acquiring a dicom image of a lung nodule, and preprocessing the dicom image;
three-dimensionally stacking the preprocessed dicom images to obtain 3D image blocks, and cutting the 3D image blocks;
performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
and calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining coordinates of the regression point through the central point coordinates to obtain a segmentation result.
Optionally, the training process of the spherical segmentation model includes:
preprocessing the labeling data of the dicom image to obtain spherical coordinate data;
extracting a feature map based on the constructed network;
constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;
and dividing the characteristic diagram into three parts through the FPN and the RFP, calculating a loss function of the spherical coordinate data of the characteristic diagram, and solving an optimized model.
Optionally, preprocessing the dicom image comprises:
image normalization and image enhancement;
the image normalization comprises adjusting the image values of the dicom image by window width and window level, with window width and window level values of 500 and 1500 in the invention, to obtain an image normalized to pixel values of 0-255.
Optionally, preprocessing the labeling data of the dicom image to obtain the spherical coordinate data includes:
obtaining coordinate information and segmentation contours of data according to the labeling data of the dicom image;
carrying out consistent interpolation of the labeling data of the dicom image according to spatial distance, setting the unit value of each voxel dimension to the same value, with cubic (third-order) interpolation as the interpolation mode;
after interpolation of the labeled data of the dicom image is completed, calculating the spherical-coordinate spatial position of the regression pixel; the spherical coordinates are expressed as formula (1):

x = r·sinθ·cosφ
y = r·sinθ·sinφ
z = r·cosθ     (1)

where θ takes n values at equal intervals over 0–2π (n is taken as 36 in the invention), φ takes m values at equal intervals over 0–2π (m is taken as 36 in the invention), and r is the distance from the sphere-center coordinate, along each corresponding (θ, φ) direction, to the edge of the object.
Further, dividing the feature map into three parts through the FPN and RFP to calculate a loss function for the spherical coordinate data of the feature map comprises:
the first part regresses the classification corresponding to the feature map; its size is D × H × W × k, where D, H and W correspond to the three dimensions (depth, height and width) of the feature map and k is the number of classes, set to 1; the activation function is a sigmoid, i.e., whether a voxel is a nodule is judged by the probability value, and the loss function used is Focal loss;
the second part is the centrality of the spherical coordinates corresponding to the feature map; its size is D × H × W × 1, where D, H and W are the three dimensions (depth, height and width) of the feature map, and the centrality is calculated from the n regressed ray lengths as the square root of the ratio of the shortest ray to the longest ray:

centrality = sqrt( min(d_1, …, d_n) / max(d_1, …, d_n) )

where d_i is the length of the i-th of the n rays regressed in spherical coordinates; n is taken as 72 in the invention (representing 36 + 36), obtained by taking θ and φ of the spherical coordinates uniformly, i.e., θ takes 36 values at equal intervals over 0–2π and φ takes 36 values at equal intervals over 0–2π, giving 72 regression rays that represent regression distances from the center point in 72 directions;
the third part regresses the distance from the center point at each of the n angles of the spherical coordinates corresponding to the feature map; its size is D × H × W × n, where D, H and W are the three dimensions (depth, height and width) and n is the number of angles; as stated in the second part, n is 72 in the invention.
In a second aspect, an embodiment of the present invention provides a deep learning-based 3D lung lesion segmentation apparatus, where the segmentation apparatus includes:
the image processing module is used for acquiring a lung nodule dicom image and preprocessing the dicom image;
the 3D image acquisition module is used for three-dimensionally stacking the preprocessed dicom images to obtain 3D image blocks and cutting the 3D image blocks;
the regression subgraph acquisition module is used for extracting the characteristics of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
and the calculation module is used for calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining the coordinates of the regression points through the central point coordinates to obtain a segmentation result.
In a third aspect, the present invention provides an electronic device, comprising:
a processor; a memory for storing processor-executable instructions;
wherein the processor implements the above method by executing the executable instructions.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the above-described method.
Advantageous effects
The invention provides a deep learning-based 3D lung lesion segmentation method, device, electronic device and storage medium. The method comprises: acquiring a dicom image of a lung nodule and preprocessing it; stacking the preprocessed dicom images in three dimensions to obtain 3D image blocks and cutting them; extracting features from the cut 3D image with a pre-trained spherical segmentation model to obtain regression subgraphs; and computing the product of centrality and probability over the regression subgraphs to obtain a number of center point coordinates, from which the coordinates of the regression points are derived to yield the segmentation result.
Drawings
Fig. 1 is a flowchart of a deep learning-based 3D lung lesion segmentation method according to an embodiment of the present invention;
FIG. 2 is a training process of a spherical segmentation model according to an embodiment of the present invention;
fig. 3 is a flowchart illustrating preprocessing of annotation data of a dicom image to obtain spherical coordinate data according to an embodiment of the present invention;
fig. 4 is a diagram illustrating coordinate information and a segmentation contour of data obtained by performing uniform interpolation according to spatial distance through labeled data of a dicom image according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of the calculation of the loss function of the spherical coordinate data of the feature map by dividing the feature map into three parts through the FPN and RFP according to the embodiment of the present invention;
fig. 6 is a block diagram illustrating a 3D lung lesion segmentation apparatus based on deep learning according to an embodiment of the present invention;
fig. 7 is a block diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments, and it should be understood that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The present invention aims to provide a deep learning-based 3D lung lesion segmentation method, device, electronic device and storage medium that do not need to detect first and then segment, but instead segment the 3D image directly to obtain the corresponding result. The invention is further described below with reference to the accompanying drawings and specific embodiments:
fig. 1 shows a flowchart of a deep learning-based 3D lung lesion segmentation method according to an embodiment of the present invention, as shown in fig. 1, the segmentation method includes the following steps:
s20, acquiring a lung nodule dicom image, and preprocessing the dicom image;
s40, carrying out three-dimensional stacking on the preprocessed dicom images to obtain 3D image blocks, and cutting the 3D image blocks;
s60, performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
s80, calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining coordinates of the regression points through the central point coordinates to obtain a segmentation result.
In the method of this embodiment, a lung nodule dicom image is obtained and preprocessed; the preprocessed dicom images are stacked in three dimensions to obtain 3D image blocks, which are then cut; features are extracted from the cut 3D image with a pre-trained spherical segmentation model to obtain regression subgraphs; and the product of centrality and probability is computed over the regression subgraphs to obtain a number of center point coordinates, from which the coordinates of the regression points are derived to give the segmentation results.
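As a concrete illustration of step S80, the minimal sketch below (Python with NumPy; the function name decode_nodule_surface and the ordering conventions are illustrative assumptions, not taken from the patent) shows how a center point selected by the centrality-probability product and its regressed ray lengths can be turned back into 3D surface coordinates using the spherical-coordinate expression of formula (1); the 72 (θ, φ) ray directions are assumed to be those fixed at training time.

```python
import numpy as np

def decode_nodule_surface(center_zyx, ray_lengths, thetas, phis):
    """Convert one regressed center point and its ray lengths into 3D surface points.

    center_zyx  : (3,) voxel coordinate of the center point, ordered (z, y, x)
    ray_lengths : (n,) regressed distances, one per ray direction
    thetas, phis: (n,) spherical angles of each ray
    All arrays are assumed to live in the isotropically resampled voxel space.
    """
    cz, cy, cx = center_zyx
    # Spherical-to-Cartesian offsets, matching formula (1):
    # x = r sin(theta) cos(phi), y = r sin(theta) sin(phi), z = r cos(theta)
    dx = ray_lengths * np.sin(thetas) * np.cos(phis)
    dy = ray_lengths * np.sin(thetas) * np.sin(phis)
    dz = ray_lengths * np.cos(thetas)
    return np.stack([cz + dz, cy + dy, cx + dx], axis=-1)
```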
Specifically, as shown in fig. 2, the training process of the spherical segmentation model includes:
s601, preprocessing the labeling data of the dicom image to obtain spherical coordinate data;
s602, extracting a feature graph based on the constructed network;
s603, constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;
s604, dividing the characteristic diagram into three parts through the FPN and the RFP, calculating a loss function of the spherical coordinate data of the characteristic diagram, and solving an optimized model.
Specifically, as shown in fig. 3 to fig. 4, preprocessing the labeling data of the dicom image to obtain the spherical coordinate data includes:
s6011, obtaining coordinate information and a segmentation contour of data according to the labeling data of the dicom image;
s6012, carrying out consistent interpolation of the labeling data of the dicom image according to spatial distance, setting the unit value of each voxel dimension to the same value, with cubic (third-order) interpolation as the interpolation mode;
s6013, after interpolation of the labeled data of the dicom image is completed, calculating the spherical-coordinate spatial position of the regression pixel; the spherical coordinates are expressed as in formula (1):

x = r·sinθ·cosφ
y = r·sinθ·sinφ
z = r·cosθ

where θ takes n values at equal intervals over 0–2π (n is taken as 36 in the invention), φ takes m values at equal intervals over 0–2π (m is taken as 36 in the invention), and r is the distance from the sphere-center coordinate, along each corresponding (θ, φ) direction, to the edge of the object.
In the data processing process, the coordinate information and segmentation contour of the data are obtained from the labeled data of the dicom image, consistent interpolation is performed according to spatial distance, and the voxel size (the unit value of each voxel dimension) is set to the same value by interpolation, i.e., the unit size of each dimension corresponds to the same number of millimeters, for example one voxel corresponding to a spatial size of 0.6 × 0.6 mm. Processing the regression radius and angle in this way makes it easier for the model to regress radius and angle over pixels of the same value, so better training is achieved more easily. After interpolation of the original data is finished, the spherical-coordinate spatial position of the regression pixel is calculated: the distance from the center point of each pixel to the nodule segmentation edge is computed as shown in formula (1), with θ taking n values at equal intervals over 0–2π (n is 36 in the invention) and φ likewise taking n values at equal intervals over 0–2π. After sampling, θ and φ are used as two 36-channel sub-modules on the output of the ResNeSt-34-based feature map pyramid network FPN and cyclic feature pyramid network RFP for regression training, the regression target being the distance from the center point to the outer edge of the nodule at each angle corresponding to θ and φ. The distances from the center point in the 36 θ and 36 φ directions are regressed at each pixel, thereby obtaining the spherical-coordinate segmentation result.
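The data-processing step just described can be sketched as follows (Python with NumPy and SciPy). The function names, the 0.6 mm target spacing taken from the example above, and the ray-marching step size are illustrative assumptions; the code only shows the general technique of resampling to isotropic voxels with third-order interpolation and then measuring the distance from a center voxel to the mask boundary along each sampled direction.

```python
import numpy as np
from scipy.ndimage import zoom

def resample_isotropic(volume, spacing, new_spacing=0.6, order=3):
    """Resample a 3D volume so every voxel dimension has the same physical size
    (e.g. 0.6 mm), using third-order (cubic) interpolation as in the patent."""
    factors = np.asarray(spacing, dtype=float) / new_spacing
    return zoom(volume, factors, order=order)

def ray_lengths(mask, center, thetas, phis, step=0.5, max_steps=400):
    """Distance from `center` to the nodule edge along each (theta, phi) direction.

    mask   : binary 3D array (z, y, x) of the annotated nodule
    center : (z, y, x) voxel coordinate of the nodule center
    Returns one length per direction; these serve as the regression targets.
    """
    cz, cy, cx = center
    lengths = np.zeros(len(thetas), dtype=np.float32)
    for i, (t, p) in enumerate(zip(thetas, phis)):
        # Direction components consistent with z = r cos(theta).
        dz, dy, dx = np.cos(t), np.sin(t) * np.sin(p), np.sin(t) * np.cos(p)
        r = 0.0
        for _ in range(max_steps):
            z = int(round(cz + r * dz))
            y = int(round(cy + r * dy))
            x = int(round(cx + r * dx))
            inside = (0 <= z < mask.shape[0] and 0 <= y < mask.shape[1]
                      and 0 <= x < mask.shape[2] and mask[z, y, x] > 0)
            if not inside:
                break
            r += step
        lengths[i] = r
    return lengths
```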
Specifically, as shown in fig. 5, dividing the feature map into three parts through the FPN and RFP to calculate the loss function of the spherical coordinate data of the feature map includes:
the first part regresses the corresponding classification of the characteristic map, its size is D H W k, DHW corresponds to the length of the characteristic map separately and is tall and wide three dimensions, k is the classification number, set up as 1, activate the function as sigmoid, namely judge whether it is a nodule through the probability value, the loss function used is Focal loss; the probability values are calculated by the neural networks FPN and RFP, and the neural networks FPN and RFP learn the probability calculation result on each feature map corresponding to the classification part in fig. 4 by learning the label pattern.
The second part is the centrality of the spherical coordinates corresponding to the feature map; its size is D × H × W × 1, where D, H and W are the three dimensions (depth, height and width) of the feature map, and the centrality is calculated from the n regressed ray lengths as the square root of the ratio of the shortest ray to the longest ray:

centrality = sqrt( min(d_1, …, d_n) / max(d_1, …, d_n) )

where d_i is the length of the i-th of the n rays regressed in spherical coordinates; n is taken as 72 in the invention (representing 36 + 36), obtained by taking θ and φ of the spherical coordinates uniformly, i.e., θ takes 36 values at equal intervals over 0–2π and φ takes 36 values at equal intervals over 0–2π, giving 72 regression rays that represent regression distances from the center point in 72 directions.
the third part is the distance from the central point on n angles of the spherical coordinates corresponding to the regression feature map, the size of the distance is D H W n, DHW is three dimensions of length, width and height, n is n angles, and as shown in the second part, n is 72 in the invention.
In some embodiments, preprocessing the dicom image comprises:
image normalization and image enhancement;
the image normalization comprises adjusting the image values of the dicom image by window width and window level, with window width and window level values of 500 and 1500 in the invention, to obtain an image normalized to pixel values of 0-255. Specifically, let x_{c,i,j} denote the pixel value at row i and column j of channel c of a picture; the image obtained from the dicom image is normalized to pixel values of 0-255 by adjusting the image values through the window width and window level. In the invention c = {1}, and the normalized value is

x̃_{c,i,j} = 255 · clip( (x_{c,i,j} − (WL − WW/2)) / WW, 0, 1 )

where WW and WL denote the window width and window level.
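A minimal sketch of this window-width/window-level normalization follows (Python with NumPy). Which of the quoted 500/1500 values is the width and which the level is ambiguous in the source, so both are left as explicit arguments; the clipping convention is the standard one and is an assumption.

```python
import numpy as np

def window_normalize(image, window_width, window_level):
    """Map dicom image values to 0-255 with a window width / window level transform."""
    low = window_level - window_width / 2.0
    high = window_level + window_width / 2.0
    clipped = np.clip(image.astype(np.float32), low, high)
    return ((clipped - low) / (high - low) * 255.0).astype(np.uint8)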
Training parameter configuration:
in the training, the initial learning rate is set to 0.01, the training passage number is 50 epochs, the learning rate updating mode is WarmUpCosinaleleringrate, namely, in the first 5 epochs, the learning rate is increased to 0.01 from 0.002 by the fluctuation of 0.2 per epoch, and in the subsequent 45 epochs, according to the formula
Figure BDA0002955907200000085
And calculating, wherein n is the total number of epochs for training, 150 is taken in the experiment, and e is the current epoch number. The optimizer used in the training was an Adam optimizer and the loss function was MSE.
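The warm-up plus cosine-annealing schedule described above can be sketched as follows (plain Python). The linear form of the warm-up and the annealing-to-zero endpoint are assumptions; only the constants quoted in the text (0.002, 0.01, 5 warm-up epochs) are taken from the description.

```python
import math

def learning_rate(epoch, total_epochs=50, warmup_epochs=5,
                  base_lr=0.01, warmup_start_lr=0.002):
    """Warm-up for the first epochs, then cosine annealing of the learning rate."""
    if epoch < warmup_epochs:
        # Linear ramp from 0.002 to 0.01 over the warm-up epochs (form assumed).
        frac = epoch / float(warmup_epochs)
        return warmup_start_lr + frac * (base_lr - warmup_start_lr)
    # Cosine annealing over the remaining epochs.
    progress = (epoch - warmup_epochs) / float(total_epochs - warmup_epochs)
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```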
In actual data processing, constrained by GPU memory during training, the invention takes 128 × 128 pixel blocks from the n × 512 images, where n is the number of images, since each dicom sequence consists of a plurality of dicom slices stacked into a 3D structure. Under the GPU memory limitation, this block size still allows the image details to be seen while ensuring that the program runs smoothly without errors, and the result of each block is stitched back into the final result. The final result is obtained by multiplying the centrality and the probability value of each feature-map pixel, and the candidate results are sorted according to this product.
In an actual application scenario, the dicom images are first read with pydicom and transformed with a fixed window width and window level; a plurality of dicom images are stacked in three dimensions to obtain a 3D image block, which is cut into blocks of size 128 × 128. Features are extracted from the cut images with the spherical segmentation model to obtain regression subgraphs; the product of centrality and probability value is then computed to obtain a number of center point coordinates, and the coordinates of the regression points are derived from the center point coordinates to give the final segmentation result.
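Putting these inference steps together, a minimal sketch under stated assumptions might look like the following (Python; pydicom and PyTorch assumed available). The helper window_normalize comes from the sketch above, the assignment of the 500/1500 window values, the slice-sorting key, the handling of edge blocks and the omission of the rescale-to-HU step are all assumptions, not details from the patent.

```python
import numpy as np
import pydicom
import torch

def segment_series(dicom_paths, model, crop=128):
    """Read a dicom series, build the 3D volume, run the spherical model block-wise
    and rank candidate centers by the centrality * probability product."""
    slices = [pydicom.dcmread(p) for p in dicom_paths]
    slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))
    volume = np.stack([s.pixel_array for s in slices]).astype(np.float32)
    # RescaleSlope/Intercept conversion to HU is omitted in this sketch.
    volume = window_normalize(volume, window_width=1500, window_level=500)

    candidates = []
    for z in range(0, volume.shape[0], crop):
        for y in range(0, volume.shape[1], crop):
            for x in range(0, volume.shape[2], crop):
                block = volume[z:z + crop, y:y + crop, x:x + crop]
                # Edge blocks may need padding to the crop size, omitted here.
                inp = torch.from_numpy(block)[None, None].float()
                with torch.no_grad():
                    cls_logits, centerness, rays = model(inp)
                score = torch.sigmoid(cls_logits) * torch.sigmoid(centerness)
                idx = torch.argmax(score)
                candidates.append((score.flatten()[idx].item(), (z, y, x), rays))
    # Sort candidate centers by the centrality-probability product.
    return sorted(candidates, key=lambda c: c[0], reverse=True)
```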
Based on the same inventive concept, an embodiment of the present invention further provides a deep learning-based 3D lung lesion segmentation device, which can be used to implement the method described in the above embodiments, as described in the following embodiments. Because the principle by which the deep learning-based 3D lung lesion segmentation device solves the problem is similar to that of the deep learning-based 3D lung lesion segmentation method, the implementation of the device can refer to the implementation of the method, and repeated descriptions are omitted. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the system described in the embodiments below is preferably implemented in software, implementation in hardware, or a combination of software and hardware, is also possible and contemplated.
As shown in fig. 6, the segmenting apparatus provided in the embodiment of the present invention includes:
the image processing module 20 is configured to acquire a dicom image of a lung nodule and perform preprocessing on the dicom image;
the 3D image acquisition module 40 is used for three-dimensionally stacking the preprocessed dicom images to obtain 3D image blocks and cutting the 3D image blocks;
the regression subgraph obtaining module 60 is used for performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
and the calculating module 80 is configured to calculate a product of centrality and probability of the regression subgraph to obtain a plurality of central point coordinates, and obtain coordinates of a regression point through the central point coordinates to obtain a segmentation result.
The segmentation device of the embodiment acquires a dicom image of a lung nodule through an image processing module 20, and pre-processes the dicom image; the 3D image acquisition module 40 three-dimensionally stacks the preprocessed dicom images to obtain 3D image blocks, and cuts the 3D image blocks; the regression subgraph obtaining module 60 performs feature extraction on the cut 3D image through a spherical segmentation model trained in advance to obtain a regression subgraph; the calculation module 80 calculates the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtains the coordinates of the regression points through the central point coordinates to obtain the segmentation results.
Specifically, as shown in fig. 2, the training process of the spherical segmentation model includes:
s601, preprocessing the labeling data of the dicom image to obtain spherical coordinate data;
s602, extracting a characteristic diagram based on the network ResNeSt-34;
s603, constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;
s604, dividing the characteristic diagram into three parts through the FPN and the RFP, calculating a loss function of the spherical coordinate data of the characteristic diagram, and solving an optimized model.
Specifically, as shown in fig. 3 to fig. 4, preprocessing the labeling data of the dicom image to obtain the spherical coordinate data includes:
s6011, obtaining coordinate information and a segmentation contour of data according to the labeling data of the dicom image;
s6012, carrying out consistent interpolation of the labeling data of the dicom image according to spatial distance, setting the unit value of each voxel dimension to the same value, with cubic (third-order) interpolation as the interpolation mode;
s6013, after interpolation of the labeled data of the dicom image is completed, calculating the spherical-coordinate spatial position of the regression pixel; the spherical coordinates are expressed as in formula (1):

x = r·sinθ·cosφ
y = r·sinθ·sinφ
z = r·cosθ

where θ takes n values at equal intervals over 0–2π (n is taken as 36 in the invention), φ takes m values at equal intervals over 0–2π (m is taken as 36 in the invention), and r is the distance from the sphere-center coordinate, along each corresponding (θ, φ) direction, to the edge of the object.
In the data processing process, the coordinate information and segmentation contour of the data are obtained from the labeled data of the dicom image, consistent interpolation is performed according to spatial distance, and the voxel size (the unit value of each voxel dimension) is set to the same value by interpolation, i.e., the unit size of each dimension corresponds to the same number of millimeters, for example one voxel corresponding to a spatial size of 0.6 × 0.6 mm. Processing the regression radius and angle in this way makes it easier for the model to regress radius and angle over pixels of the same value, so better training is achieved more easily. After interpolation of the original data is finished, the spherical-coordinate spatial position of the regression pixel is calculated: the distance from the center point of each pixel to the nodule segmentation edge is computed as shown in formula (1), with θ taking n values at equal intervals over 0–2π (n is 36 in the invention) and φ likewise taking n values at equal intervals over 0–2π. After sampling, θ and φ are used as two 36-channel sub-modules on the output of the ResNeSt-34-based feature map pyramid network FPN and cyclic feature pyramid network RFP for regression training, the regression target being the distance from the center point to the outer edge of the nodule at each angle corresponding to θ and φ. The distances from the center point in the 36 θ and 36 φ directions are regressed at each pixel, thereby obtaining the spherical-coordinate segmentation result.
An embodiment of the present invention further provides an electronic device. Fig. 7 shows a schematic structural diagram of an electronic device to which an embodiment of the present invention can be applied. As shown in fig. 7, the computer electronic device includes a central processing unit (CPU) 701 that can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 702 or a program loaded from a storage section 708 into a random access memory (RAM) 703. The RAM 703 also stores various programs and data necessary for system operation. The CPU 701, the ROM 702 and the RAM 703 are connected to each other via a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
The following components are connected to the I/O interface 705: an input section 706 including a keyboard, a mouse, and the like; an output section 707 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 708 including a hard disk and the like; and a communication section 709 including a network interface card such as a LAN card or a modem. The communication section 709 performs communication processing via a network such as the Internet. A drive 710 is also connected to the I/O interface 705 as needed. A removable medium 711, such as a magnetic disk, an optical disk, a magneto-optical disk or a semiconductor memory, is mounted on the drive 710 as necessary, so that a computer program read out therefrom is installed into the storage section 708 as needed.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The present invention also provides a computer-readable storage medium, which may be the computer-readable storage medium included in the apparatus for deep learning based 3D pulmonary lesion segmentation in the above embodiments; or it may be a computer-readable storage medium that exists separately and is not built into the electronic device. The computer readable storage medium stores one or more programs for use by one or more processors in performing the deep learning based 3D pulmonary lesion segmentation method described in the present invention.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. A 3D lung lesion segmentation method based on deep learning, characterized by comprising the following steps:
acquiring a dicom image of a lung nodule, and preprocessing the dicom image;
three-dimensionally stacking the preprocessed dicom images to obtain 3D image blocks, and cutting the 3D image blocks;
performing feature extraction on the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
and calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining coordinates of the regression point through the central point coordinates to obtain a segmentation result.
2. The segmentation method according to claim 1, wherein the training process of the spherical segmentation model comprises:
preprocessing the labeling data of the dicom image to obtain spherical coordinate data;
constructing a feature map pyramid network FPN and a cyclic feature pyramid network RFP based on ResNeSt-34;
extracting a feature map based on the constructed network;
and dividing the characteristic diagram into three parts through the FPN and the RFP, calculating a loss function of the spherical coordinate data of the characteristic diagram, and solving an optimized model.
3. The segmentation method according to claim 1, wherein preprocessing the dicom image comprises:
image normalization and image enhancement;
the image normalization comprises adjusting the image values of the dicom image by window width and window level, with window width and window level values of 500 and 1500 in the invention, to obtain an image normalized to pixel values of 0-255.
4. The segmentation method as claimed in claim 2, wherein the pre-processing of the annotation data of the dicom image to obtain the spherical coordinate data comprises:
obtaining coordinate information and segmentation contours of data according to the labeling data of the dicom image;
carrying out consistent interpolation of the labeling data of the dicom image according to spatial distance, setting the unit value of each voxel dimension to the same value, with cubic (third-order) interpolation as the interpolation mode;
after interpolation of the labeled data of the dicom image is completed, calculating the spherical-coordinate spatial position of the regression pixel; the spherical coordinates are expressed as:

x = r·sinθ·cosφ
y = r·sinθ·sinφ
z = r·cosθ

where θ takes n values at equal intervals over 0–2π (n is taken as 36 in the invention), φ takes m values at equal intervals over 0–2π (m is taken as 36 in the invention), and r is the distance from the sphere-center coordinate, along each corresponding (θ, φ) direction, to the edge of the object.
5. The segmentation method as claimed in claim 4, wherein the step of dividing the feature map into three parts through the FPN and RFP comprises the steps of:
the first part regresses the classification corresponding to the feature map; its size is D × H × W × k, where D, H and W correspond to the three dimensions (depth, height and width) of the feature map and k is the number of classes, set to 1; the activation function is a sigmoid, i.e., whether a voxel is a nodule is judged by the probability value, and the loss function used is Focal loss;
the second part is the centrality of the spherical coordinates corresponding to the feature map; its size is D × H × W × 1, where D, H and W are the three dimensions (depth, height and width) of the feature map, and the centrality is calculated from the n regressed ray lengths as the square root of the ratio of the shortest ray to the longest ray:

centrality = sqrt( min(d_1, …, d_n) / max(d_1, …, d_n) )

where d_i is the length of the i-th of the n rays regressed in spherical coordinates; n is taken as 72 in the invention (representing 36 + 36), obtained by taking θ and φ of the spherical coordinates uniformly, i.e., θ takes 36 values at equal intervals over 0–2π and φ takes 36 values at equal intervals over 0–2π, giving 72 regression rays that represent regression distances from the center point in 72 directions;
the third part regresses the distance from the center point at each of the n angles of the spherical coordinates corresponding to the feature map; its size is D × H × W × n, where D, H and W are the three dimensions (depth, height and width) and n is the number of angles; as stated in the second part, n is 72 in the invention.
6. A deep learning-based 3D lung lesion segmentation apparatus, the segmentation apparatus comprising:
the image processing module is used for acquiring a lung nodule dicom image and preprocessing the dicom image;
the 3D image acquisition module is used for three-dimensionally stacking the preprocessed dicom images to obtain 3D image blocks and cutting the 3D image blocks;
the regression subgraph acquisition module is used for extracting the characteristics of the cut 3D image through a pre-trained spherical segmentation model to obtain a regression subgraph;
and the calculation module is used for calculating the product of the centrality and the probability of the regression subgraph to obtain a plurality of central point coordinates, and obtaining the coordinates of the regression points through the central point coordinates to obtain the segmentation results.
7. An electronic device, comprising:
a processor, a memory for storing processor-executable instructions;
wherein the processor implements the method of any one of claims 1-5 by executing the executable instructions.
8. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, carry out the steps of the method according to any one of claims 1 to 5.
CN202110223645.1A 2021-03-01 2021-03-01 3D lung focus segmentation method and device based on deep learning Active CN112950582B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110223645.1A CN112950582B (en) 2021-03-01 2021-03-01 3D lung focus segmentation method and device based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110223645.1A CN112950582B (en) 2021-03-01 2021-03-01 3D lung focus segmentation method and device based on deep learning

Publications (2)

Publication Number Publication Date
CN112950582A true CN112950582A (en) 2021-06-11
CN112950582B CN112950582B (en) 2023-11-24

Family

ID=76246858

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110223645.1A Active CN112950582B (en) 2021-03-01 2021-03-01 3D lung focus segmentation method and device based on deep learning

Country Status (1)

Country Link
CN (1) CN112950582B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180122082A1 (en) * 2016-11-02 2018-05-03 General Electric Company Automated segmentation using deep learned priors
CN110689547A (en) * 2019-09-25 2020-01-14 重庆大学 Pulmonary nodule segmentation method based on three-dimensional CT image
CN110807764A (en) * 2019-09-20 2020-02-18 成都智能迭迦科技合伙企业(有限合伙) Lung cancer screening method based on neural network
US20200160997A1 (en) * 2018-11-02 2020-05-21 University Of Central Florida Research Foundation, Inc. Method for detection and diagnosis of lung and pancreatic cancers from imaging scans
CN111553892A (en) * 2020-04-23 2020-08-18 北京小白世纪网络科技有限公司 Lung nodule segmentation calculation method, device and system based on deep learning
CN111932540A (en) * 2020-10-14 2020-11-13 北京信诺卫康科技有限公司 CT image contrast characteristic learning method for clinical typing of new coronary pneumonia

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180122082A1 (en) * 2016-11-02 2018-05-03 General Electric Company Automated segmentation using deep learned priors
US20200160997A1 (en) * 2018-11-02 2020-05-21 University Of Central Florida Research Foundation, Inc. Method for detection and diagnosis of lung and pancreatic cancers from imaging scans
CN110807764A (en) * 2019-09-20 2020-02-18 成都智能迭迦科技合伙企业(有限合伙) Lung cancer screening method based on neural network
CN110689547A (en) * 2019-09-25 2020-01-14 重庆大学 Pulmonary nodule segmentation method based on three-dimensional CT image
CN111553892A (en) * 2020-04-23 2020-08-18 北京小白世纪网络科技有限公司 Lung nodule segmentation calculation method, device and system based on deep learning
CN111932540A (en) * 2020-10-14 2020-11-13 北京信诺卫康科技有限公司 CT image contrast characteristic learning method for clinical typing of new coronary pneumonia

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YEGANEH JALALI, MANSOOR FATEH, MOHSEN REZVANI, VAHID ABOLGHASEMI, MOHAMMAD HOSSEIN ANISI: "ResBCDU-Net: A Deep Learning Framework for Lung CT Image Segmentation", Retrieved from the Internet <URL:https://pubmed.ncbi.nlm.nih.gov/33401581/> *
陈万顺 (Chen Wanshun): "基于深度学习的肺结节检测方法" [Lung nodule detection method based on deep learning], 中国优秀硕士学位论文全文数据库 医药卫生科技辑 [China Masters' Theses Full-text Database, Medicine & Health Sciences]

Also Published As

Publication number Publication date
CN112950582B (en) 2023-11-24

Similar Documents

Publication Publication Date Title
EP3979198A1 (en) Image segmentation model training method and apparatus, computer device, and storage medium
US11288808B2 (en) System and method for n-dimensional image segmentation using convolutional neural networks
US11887311B2 (en) Method and apparatus for segmenting a medical image, and storage medium
US20210390706A1 (en) Detection model training method and apparatus, computer device and storage medium
CN110599528A (en) Unsupervised three-dimensional medical image registration method and system based on neural network
US10853409B2 (en) Systems and methods for image search
CN111932529B (en) Image classification and segmentation method, device and system
Li et al. Automated measurement network for accurate segmentation and parameter modification in fetal head ultrasound images
CN111488872B (en) Image detection method, image detection device, computer equipment and storage medium
CN111932552B (en) Aorta modeling method and device
WO2020234349A1 (en) Sampling latent variables to generate multiple segmentations of an image
CN112750137B (en) Liver tumor segmentation method and system based on deep learning
CN113240661B (en) Deep learning-based lumbar vertebra bone analysis method, device, equipment and storage medium
CN112597847A (en) Face pose estimation method and device, electronic equipment and storage medium
CN114792326A (en) Surgical navigation point cloud segmentation and registration method based on structured light
CN117237322A (en) Organ segmentation modeling method and terminal based on medical image
CN115546270A (en) Image registration method, model training method and equipment for multi-scale feature fusion
CN108597589B (en) Model generation method, target detection method and medical imaging system
CN113850796A (en) Lung disease identification method and device based on CT data, medium and electronic equipment
CN116309551B (en) Method, device, electronic equipment and readable medium for determining focus sampling area
CN112950582B (en) 3D lung focus segmentation method and device based on deep learning
CN113408595B (en) Pathological image processing method and device, electronic equipment and readable storage medium
CN115439478A (en) Lung lobe perfusion strength evaluation method, system, equipment and medium based on lung perfusion
CN111598904B (en) Image segmentation method, device, equipment and storage medium
Ye et al. Active contour image segmentation method for training talents of computer graphics and image processing technology

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant