CN110807788B - Medical image processing method, medical image processing device, electronic equipment and computer storage medium


Info

Publication number
CN110807788B
CN110807788B
Authority
CN
China
Prior art keywords
image
medical image
image block
slice
determining
Prior art date
Legal status
Active
Application number
CN201911001225.8A
Other languages
Chinese (zh)
Other versions
CN110807788A
Inventor
曹世磊
刘小彤
马锴
伍健荣
朱艳春
李仁�
陈景亮
杨昊臻
常佳
郑冶枫
Current Assignee
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201911001225.8A priority Critical patent/CN110807788B/en
Publication of CN110807788A publication Critical patent/CN110807788A/en
Application granted granted Critical
Publication of CN110807788B publication Critical patent/CN110807788B/en


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00: Image analysis
    • G06T7/10: Segmentation; Edge detection
    • G06T7/155: Segmentation; Edge detection involving morphological operators
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H30/00: ICT specially adapted for the handling or processing of medical images
    • G16H30/20: ICT specially adapted for the handling or processing of medical images for handling medical images, e.g. DICOM, HL7 or PACS
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10004: Still image; Photographic image
    • G06T2207/10012: Stereo images
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Radiology & Medical Imaging (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

An embodiment of the invention provides a medical image processing method and apparatus, an electronic device, and a computer storage medium. The medical image processing method may include the following steps: acquiring a 3D medical image to be processed; segmenting the 3D medical image into at least two adjacent image blocks; performing feature extraction on each image block to obtain a feature map of each image block; determining a prediction probability map corresponding to each image block based on the feature map of each image block; and determining, based on the prediction probability map of each image block, the detection results of the lesion areas corresponding to each disease type in the 3D medical image. With the scheme provided by the invention, the detection results corresponding to the same disease type in the 3D medical image can be accurately determined based on the image blocks and the associations between them.

Description

Medical image processing method, medical image processing device, electronic equipment and computer storage medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a medical image processing method, apparatus, electronic device, and computer storage medium.
Background
To detect lesion positions in a 3D medical image, existing medical image processing methods detect the 3D medical image with a 2D or 3D neural network model to determine whether a lesion is present in the medical image and, if so, where it is located. However, a 3D medical image carries a large amount of data. When it is processed with a 2D neural network model, the detection effect is not ideal; when it is processed with a 3D neural network model, the large data size of the image and the large parameter count of the model make the computation excessive and degrade the model's performance. The algorithms in the prior art therefore perform poorly and cannot meet the processing requirements of 3D medical images.
Disclosure of Invention
The object of the present invention is to solve at least one of the above technical drawbacks, in particular to reduce the amount of data to be processed. The technical solution adopted by the invention is as follows:
in a first aspect, the present invention provides a medical image processing method, the method comprising:
acquiring a three-dimensional (3D) medical image to be processed;
segmenting the 3D medical image into at least two adjacent image blocks;
performing feature extraction on each image block to obtain a feature map of each image block;
determining a prediction probability map corresponding to each image block based on the feature map of each image block, wherein the prediction probability map characterizes the probability that each pixel in the image block belongs to each disease type;
determining, based on the prediction probability map of each image block, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the first aspect of the present invention, performing feature extraction on each image block to obtain a feature map of each image block includes:
performing feature extraction on each image block to obtain feature maps of at least one level of the 3D medical image;
determining a weight feature map corresponding to each image block based on the feature maps of the at least one level;
obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level.
In an embodiment of the first aspect of the present invention, the feature maps of the at least one level include feature maps of at least two levels, and determining the weight feature map corresponding to each image block based on the feature maps of the at least one level includes:
determining, based on the feature map of each level, the weight feature map of each image block corresponding to that level;
and obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level includes:
determining the feature map of each image block corresponding to each level based on the feature map of that level and the weight feature map of each image block corresponding to that level;
for each image block, fusing the feature maps of the levels corresponding to that image block to obtain the feature map of the image block.
In an embodiment of the first aspect of the present invention, determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the prediction probability map of each image block includes:
stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain a probability map of the 3D medical image;
determining, based on the probability map of the 3D medical image, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the first aspect of the present invention, determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the probability map of the 3D medical image includes:
performing binarization processing on the probability map of the 3D medical image for each disease type to obtain a segmentation result corresponding to the 3D medical image, wherein the segmentation result characterizes, for each pixel in the 3D medical image, the detection result for the lesion area of each disease type;
determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image.
In an embodiment of the first aspect of the present invention, determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image includes:
determining connected domains in the 3D medical image based on the segmentation result corresponding to the 3D medical image, wherein a connected domain is a region of adjacent pixels having the same binarization value;
determining, based on each connected domain, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the first aspect of the present invention, there is an overlap region between two adjacent image blocks of the at least two image blocks, and stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image includes:
for each overlap region, determining a new probability map portion of the overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps;
stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image, wherein the probability map portion corresponding to each overlap region in the stitched probability map of the 3D medical image is the corresponding new probability map portion.
In an embodiment of the first aspect of the present invention, determining the new probability map portion of an overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps includes:
determining the weight corresponding to each of the two image blocks corresponding to the overlap region;
determining the new probability map portion of the overlap region based on the weight corresponding to each image block and the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps.
In an embodiment of the first aspect of the present invention, performing feature extraction on each image block to obtain the feature map corresponding to each image block, and determining the prediction probability map of each image block for each disease type based on the feature map corresponding to each image block, are implemented by a neural network model, where the neural network model is obtained through training in the following manner:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is annotated with a labeling result corresponding to each disease type, and the labeling result characterizes the probability that each pixel in the slice belongs to the lesion area of each disease type;
training an initial network model based on the sample 3D medical images until the loss function of the initial network model converges, and taking the trained model as the neural network model; the value of the loss function characterizes the degree of difference between the prediction result and the labeling result corresponding to each slice.
In an embodiment of the first aspect of the present invention, the prediction result is the probability that each pixel in each slice belongs to the lesion area of each disease type; the loss function of the initial network model comprises a first loss function and a second loss function, wherein the value of the first loss function characterizes the degree of difference between the prediction result for each pixel in each slice and the labeling result corresponding to that pixel, and the value of the second loss function characterizes the degree of difference between the prediction results corresponding to the two slices in each slice pair and the corresponding labeling results.
In an embodiment of the first aspect of the invention, the value of the second loss function is determined by:
determining the association weight corresponding to each slice pair;
determining the value of the second loss function based on the prediction results and labeling results corresponding to each slice pair and the association weight corresponding to each slice pair.
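As an illustration of such a two-part loss, the following sketch is not the patent's actual loss: binary cross-entropy for the per-pixel term and a weighted squared difference of inter-slice changes for the pair term are assumed forms, and all names are illustrative. It only shows how per-pair association weights can enter the second loss:

    import torch
    import torch.nn.functional as F

    def combined_loss(pred, target, pair_weights):
        # pred/target: (num_slices, num_diseases, H, W) predicted probabilities
        # and per-pixel annotations; pair_weights: (num_slices - 1,) association
        # weight for each pair of adjacent slices. All names are assumptions.
        # First loss: per-pixel deviation of prediction from annotation.
        l_pixel = F.binary_cross_entropy(pred, target)
        # Second loss: the predicted change between the slices of each pair
        # should match the annotated change, weighted per pair.
        pred_diff = pred[1:] - pred[:-1]
        target_diff = target[1:] - target[:-1]
        per_pair = (pred_diff - target_diff).pow(2).mean(dim=(1, 2, 3))
        l_pair = (pair_weights * per_pair).sum() / pair_weights.sum()
        return l_pixel + l_pair

    pred = torch.rand(3, 4, 64, 64)                      # 3 slices, 4 disease types
    target = (torch.rand(3, 4, 64, 64) > 0.9).float()    # toy annotation masks
    print(combined_loss(pred, target, torch.tensor([1.0, 0.5])))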
In an embodiment of the first aspect of the invention, the sample 3D medical image is a 3D lung image, and the disease types include at least one of nodules, arteriosclerosis, lymph node calcification, or fibrous cords; the labeling result is a target labeling frame corresponding to the lesion area, and for a slice in a sample 3D medical image:
if the disease type is at least one of nodules, arteriosclerosis, or lymph node calcification, the labeling result of the slice is determined by:
determining a target labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, wherein the original labeling result is the labeling result corresponding to each slice in the training data set corresponding to the sample 3D medical image;
if the disease type is fibrous cords, the labeling result of the slice is determined by:
determining an initial labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, performing an image morphological operation on the initial labeling frame, and determining the target labeling frame of the corresponding lesion area in the slice based on the operation result.
In a second aspect, the present invention provides a medical image processing apparatus comprising:
an image acquisition module, configured to acquire a three-dimensional (3D) medical image to be processed;
an image segmentation module, configured to segment the 3D medical image into at least two adjacent image blocks;
a feature map determining module, configured to perform feature extraction on each image block to obtain a feature map of each image block;
a prediction probability map determining module, configured to determine a prediction probability map corresponding to each image block based on the feature map of each image block, wherein the prediction probability map characterizes the probability that each pixel in the image block belongs to each disease type;
a detection result determining module, configured to determine, based on the prediction probability map of each image block, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the second aspect of the present invention, when performing feature extraction on each image block to obtain the feature map of each image block, the feature map determining module is specifically configured to:
perform feature extraction on each image block to obtain feature maps of at least one level of the 3D medical image;
determine a weight feature map corresponding to each image block based on the feature maps of the at least one level;
obtain the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level.
In an embodiment of the second aspect of the present invention, when the feature maps of the at least one level include feature maps of at least two levels, the feature map determining module is specifically configured to, when determining the weight feature map corresponding to each image block based on the feature maps of the at least one level:
determine, based on the feature map of each level, the weight feature map of each image block corresponding to that level;
the feature map determining module is specifically configured to, when obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level:
determine the feature map of each image block corresponding to each level based on the feature map of that level and the weight feature map of each image block corresponding to that level;
for each image block, fuse the feature maps of the levels corresponding to that image block to obtain the feature map of the image block.
In an embodiment of the second aspect of the present invention, the detection result determining module is specifically configured to, when determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the prediction probability map of each image block:
stitch the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain a probability map of the 3D medical image;
determine, based on the probability map of the 3D medical image, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the second aspect of the present invention, the detection result determining module is specifically configured to, when determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the probability map of the 3D medical image:
perform binarization processing on the probability map of the 3D medical image for each disease type to obtain a segmentation result corresponding to the 3D medical image, wherein the segmentation result characterizes, for each pixel in the 3D medical image, the detection result for the lesion area of each disease type;
determine the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image.
In an embodiment of the second aspect of the present invention, the detection result determining module is specifically configured to, when determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image:
determine connected domains in the 3D medical image based on the segmentation result corresponding to the 3D medical image, wherein a connected domain is a region of adjacent pixels having the same binarization value;
determine, based on each connected domain, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
In an embodiment of the second aspect of the present invention, there is an overlap region between two adjacent image blocks of the at least two image blocks, and the detection result determining module is specifically configured to, when stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image:
for each overlap region, determine a new probability map portion of the overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps;
stitch the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image, wherein the probability map portion corresponding to each overlap region in the stitched probability map of the 3D medical image is the corresponding new probability map portion.
In an embodiment of the second aspect of the present invention, the detection result determining module is specifically configured to, when determining the new probability map portion of an overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps:
determine the weight corresponding to each of the two image blocks corresponding to the overlap region;
determine the new probability map portion of the overlap region based on the weight corresponding to each image block and the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps.
In an embodiment of the second aspect of the present invention, performing feature extraction on each image block to obtain the feature map corresponding to each image block, and determining the prediction probability map of each image block for each disease type based on the feature map corresponding to each image block, are implemented by a neural network model, where the neural network model is obtained through training in the following manner:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is annotated with a labeling result corresponding to each disease type, and the labeling result characterizes the probability that each pixel in the slice belongs to the lesion area of each disease type;
training an initial network model based on the sample 3D medical images until the loss function of the initial network model converges, and taking the trained model as the neural network model; the value of the loss function characterizes the degree of difference between the prediction result and the labeling result corresponding to each slice.
In an embodiment of the second aspect of the present invention, the prediction result is the probability that each pixel in each slice belongs to the lesion area of each disease type; the loss function of the initial network model comprises a first loss function and a second loss function, wherein the value of the first loss function characterizes the degree of difference between the prediction result for each pixel in each slice and the labeling result corresponding to that pixel, and the value of the second loss function characterizes the degree of difference between the prediction results corresponding to the two slices in each slice pair and the corresponding labeling results.
In an embodiment of the second aspect of the invention, the value of the second loss function is determined by:
determining the association weight corresponding to each slice pair;
determining the value of the second loss function based on the prediction results and labeling results corresponding to each slice pair and the association weight corresponding to each slice pair.
In an embodiment of the second aspect of the invention, the sample 3D medical image is a 3D lung image, and the disease types include at least one of nodules, arteriosclerosis, lymph node calcification, or fibrous cords; the labeling result is a target labeling frame corresponding to the lesion area, and for a slice in a sample 3D medical image:
if the disease type is at least one of nodules, arteriosclerosis, or lymph node calcification, the labeling result of the slice is determined by:
determining a target labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, wherein the original labeling result is the labeling result corresponding to each slice in the training data set corresponding to the sample 3D medical image;
if the disease type is fibrous cords, the labeling result of the slice is determined by:
determining an initial labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, performing an image morphological operation on the initial labeling frame, and determining the target labeling frame of the corresponding lesion area in the slice based on the operation result.
In a third aspect, the present invention provides an electronic device comprising:
a processor and a memory;
a memory for storing computer operating instructions;
a processor for performing, by invoking the computer operating instructions, the method shown in the first aspect and any embodiment thereof.
In a fourth aspect, the present invention provides a computer readable storage medium storing at least one computer program, which is loaded and executed by a processor to implement the method shown in any embodiment of the first aspect of the invention.
The technical solutions provided by the embodiments of the invention have the following beneficial effects: with the medical image processing method and apparatus, the electronic device, and the computer storage medium, a 3D medical image can be segmented into at least two adjacent image blocks, and feature extraction is performed on each image block to obtain the feature map corresponding to each image block. The feature map corresponding to each image block reflects the features of each pixel in the image, so the detection results corresponding to the same disease type in the 3D medical image can be accurately determined based on the feature map corresponding to each image block and the association between adjacent image blocks.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments of the present invention will be briefly described below.
Fig. 1 is a schematic flow chart of a medical image processing method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a network structure of an initial network model according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a multi-branch decoding module according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of a temporal attention module according to an embodiment of the present invention;
FIG. 5 is a schematic illustration of the overlap region between two adjacent image blocks of a 3D lung image in an example provided by an embodiment of the present invention;
FIG. 6 is a schematic diagram of a processing flow for a lung CT image according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a medical image processing apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention more comprehensible, the technical solutions in the embodiments of the present invention will be clearly described in conjunction with the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element, or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes all or any combination of one or more of the associated listed items.
Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use the knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how a computer simulates or implements human learning behavior to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied in all fields of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and adversarial learning.
With the research and advancement of artificial intelligence technology, it has been researched and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, unmanned aerial vehicles, robots, smart medical care, and smart customer service. It is believed that with the development of technology, artificial intelligence will be applied in more fields and show increasing value.
The scheme provided by the embodiments of the present application relates to artificial intelligence technologies such as machine learning, and is specifically described by the following embodiments.
First, in order to better understand and describe the schemes of the embodiments of the present invention, some technical terms involved in the embodiments are briefly introduced below.
Neural Network (NN): an algorithmic mathematical model that imitates the behavioral characteristics of animal neural networks and performs distributed parallel information processing. Such a network depends on the complexity of the system and processes information by adjusting the interconnection relationships among a large number of internal nodes.
FCN (Fully Convolutional Network): a neural network composed mainly of convolutional layers, activation layers, pooling layers, batch normalization layers and the like, and containing no fully connected layer.
DCNN (Deep Convolutional Neural Network): a convolutional neural network characterized by a large number of network layers.
Temporal-aware: a plurality of consecutive slices is input into the neural network together; the consecutive slices are regarded as a temporal sequence, which corresponds to the input of the inventive scheme.
2.5D data: data between 3D data and 2D data that has the characteristics of both; it is referred to as 2.5D data to distinguish it from 2D data.
Multi-branch decoder: a decoding structure in the detection network in which a separate decoder is designed for the prediction probability map of each slice.
Image block: a portion of an image.
Connected domain: if two pixels are adjacent and have the same value, they are located in the same mutually connected region, where the value may refer to the label corresponding to the pixels; that is, adjacent pixels with the same label are located in the same mutually connected region. Visually, pixels that are connected to each other form a region, and the set of all connected pixels in that region is referred to as a connected domain.
ASPP (Atrous Spatial Pyramid Pooling): can process images of different scales simultaneously, extracting features at different scales by using hole (atrous) convolutions with different sampling rates.
Xception structure: the Xception structure decomposes an ordinary convolutional layer into two parts, a depthwise convolution applied independently per channel (depthwise separable convolution) and a pointwise convolution (1x1 convolution), and is used to extract high-dimensional features of an image.
Hole convolution (dilated convolution): holes are introduced into the standard convolution operation by leaving intervals between the elements of the convolution kernel, which enlarges the receptive field of the convolution without increasing the number of convolution parameters (see the sketch after this glossary).
CT (Computed Tomography) image: an image obtained by scanning around a certain part of the human body using, for example, X-rays, γ-rays or ultrasonic waves.
Slice: a slice of a CT image; a CT image is made up of a plurality of consecutive slices.
CT value: a unit for measuring the density of a local tissue or organ of the human body, commonly called the Hounsfield Unit (HU), where air is -1000 and dense bone is +1000.
Feature map: a feature map is obtained by convolving an image with a filter; a feature map can in turn be convolved with a filter to generate a new feature map.
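To make the receptive-field claim in the hole convolution entry above concrete, here is a small demonstration; sizes are arbitrary and the snippet is illustrative only, not taken from the patent:

    import torch
    import torch.nn as nn

    x = torch.randn(1, 1, 32, 32)
    standard = nn.Conv2d(1, 1, kernel_size=3, padding=1)             # 3x3 receptive field
    dilated = nn.Conv2d(1, 1, kernel_size=3, padding=2, dilation=2)  # 5x5 receptive field
    print(standard(x).shape, dilated(x).shape)   # both torch.Size([1, 1, 32, 32])
    # Same parameter count despite the larger receptive field: 9 weights + 1 bias.
    print(sum(p.numel() for p in dilated.parameters()))  # 10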
In the field of medical images, 2D and 3D convolutional neural networks have achieved considerable success in the segmentation and detection of single disease types. In the prior art, detecting whether a lesion exists in a 3D medical image, and where it is located, may be achieved in the following ways:
The first way is detection based on a 3D convolutional neural network: a 3D convolutional neural network model is trained on pixel-wise annotated sample 3D medical images; the input of the model is a sample 3D medical image, and the output is the detection result for the 3D medical image, i.e., whether the 3D medical image contains a lesion and, if so, the position of the lesion.
The second way is detection of a specific disease type based on a 3D convolutional neural network: a convolutional neural network model is trained on sample 3D medical images of a specific disease type (such as lung nodules); the trained model can then be used to detect 3D medical images for that disease type, i.e., to detect whether a lesion corresponding to the specific disease type exists in the 3D medical image and, if so, to determine the position of the lesion.
However, processing medical images in the first way has the following problems:
(1) Training the model requires a huge number of annotated sample medical images, and because the sample medical data are annotated pixel by pixel, great manpower and material resources are generally required in the data acquisition stage.
(2) For a medical image to be processed, the trained model determines the detection result based on the features of each pixel in the image. Because the data size of a 3D medical image is large and the parameter count of the corresponding model is also large, the computation of the algorithm is heavy and the computing capacity of the graphics card is limited, so the 3D convolutional neural network cannot obtain satisfactory results in the detection of multiple disease types.
Processing medical images in the second way has the following problem: a model trained in the second way detects only the specific disease type and ignores the features of other disease types, so the scheme has poor universality and the performance of the model is limited.
To address the problems in the prior art and better meet practical application requirements, an embodiment of the present invention provides a medical image processing method that segments a 3D medical image to be processed into at least two adjacent image blocks and determines the detection result of the 3D medical image based on these image blocks; the feature map corresponding to each image block reflects the features of each pixel in the image and the association between adjacent image blocks.
The invention provides a novel 2.5D fully convolutional neural network, which realizes multi-disease localization by segmenting lesions and applying connected-domain post-processing. During training, the size of the network input image is length by width by channel number, where the channel dimension is composed of a plurality of adjacent slices, and the network finally outputs a multi-disease prediction probability map for those adjacent slices. With this method, the number of network parameters can be greatly reduced while fully exploiting the correlation between slices. Through this 2.5D multi-branch encoding and decoding network, various common lung conditions such as nodules, fibrous cords, arteriosclerosis and calcification can be accurately localized, providing a finer auxiliary means for clinical diagnosis.
The technical solution of the present invention, and how it solves the above technical problems, is described in detail below with specific embodiments. The following embodiments may be combined with each other, and the same or similar concepts or processes may not be repeated in some embodiments. Embodiments of the present invention are described below with reference to the accompanying drawings.
Fig. 1 shows a schematic flow chart of a medical image processing method provided by the present invention, as shown in the figure, the method may include steps S110 to S150, where:
Step S110: acquire a 3D medical image to be processed.
The 3D medical image to be processed may be a 3D medical image of a certain part of a patient, for example a lung CT image. The invention is not limited to CT images; other 3D medical images are also possible. The following embodiments of the invention are described with reference to CT images.
Step S120: segment the 3D medical image into at least two adjacent image blocks.
A 3D medical image can be segmented, as required, into a plurality of consecutive image blocks (which may be called 2.5D data) of the same size. In the image processing described below, the 3D medical image may be segmented into a plurality of adjacent image blocks, for example three mutually adjacent image blocks (hereinafter referred to as the first, second and third image blocks), but the invention is not limited thereto; the number of image blocks may be chosen according to the required accuracy and the available computation. A minimal sketch of such a segmentation is shown below.
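The following sketch is illustrative only; the block size, stride, and names such as split_into_blocks are assumptions, not taken from the patent. It splits a volume along the slice axis into fixed-size blocks that overlap whenever the stride is smaller than the block size:

    import numpy as np

    def split_into_blocks(volume: np.ndarray, slices_per_block: int = 3,
                          stride: int = 2) -> list[np.ndarray]:
        # Return consecutive blocks of `slices_per_block` slices; with
        # stride < slices_per_block, adjacent blocks share an overlap region.
        depth = volume.shape[0]
        blocks = []
        start = 0
        while start + slices_per_block <= depth:
            blocks.append(volume[start:start + slices_per_block])
            start += stride
        if start < depth:  # keep the tail slices by anchoring a last block at the end
            blocks.append(volume[depth - slices_per_block:depth])
        return blocks

    ct = np.random.rand(12, 256, 256).astype(np.float32)  # stand-in CT volume
    blocks = split_into_blocks(ct)
    print(len(blocks), blocks[0].shape)  # 6 blocks of shape (3, 256, 256)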
Step S130: perform feature extraction on each image block to obtain a feature map of each image block.
The method of feature extraction is not limited by the present invention; for example, it may be performed by a neural network or by other feature extraction methods, all of which fall within the scope of the present invention.
Step S140: determine a prediction probability map corresponding to each image block based on the feature map of each image block, wherein the prediction probability map characterizes the probability that each pixel in the image block belongs to each disease type.
As described above, the prediction probability map corresponding to the first image block may be determined based on the feature map of the first image block, the prediction probability map corresponding to the second image block based on the feature map of the second image block, and the prediction probability map corresponding to the third image block based on the feature map of the third image block. For one image block, the corresponding prediction probability map characterizes the probability that each pixel in the block belongs to each disease type; for example, if there are 4 disease types, each pixel in the block's prediction probability map has 4 probabilities corresponding to the 4 disease types.
Step S150: determine, based on the prediction probability map of each image block, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
From the prediction probability map of each image block, the probability that each pixel belongs to each disease type is known. Based on these probabilities, the region formed by the pixels corresponding to the same disease type can be taken as a lesion area, and one 3D medical image may contain several lesion areas at the same time. The detection results of the lesion areas corresponding to each disease type in the 3D medical image can thus be determined, and from these detection results it can be made clear whether a lesion exists in the 3D medical image and where the lesion area is located.
Compared with the prior-art 3D convolutional neural network, which, when performing multi-disease detection on a medical image, processes the image based on every pixel and therefore has a huge parameter count, the medical image processing method of the invention processes at least two image blocks (which may be called 2.5D data) of the medical image, so the amount of data is smaller than the amount processed by a prior-art 3D convolutional neural network.
With the medical image processing method provided by the embodiment of the invention, a 3D medical image can be segmented into at least two adjacent image blocks, and feature extraction is performed on each image block to obtain its feature map. The feature map of each image block reflects the features of each pixel in the image, so the detection results corresponding to the same disease type in the 3D medical image can be accurately determined based on the feature map of each image block and the association between adjacent image blocks.
In an alternative of the present invention, performing feature extraction on each image block in step S130 to obtain a feature map of each image block may include:
performing feature extraction on each image block to obtain feature maps of at least one level of the 3D medical image;
determining a weight feature map corresponding to each image block based on the feature maps of the at least one level;
obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level.
Specifically, when performing feature extraction on each image block, feature maps of different levels can be obtained according to actual requirements. Feature maps of different levels reflect the features of the image blocks from different angles, and the feature map of one level includes the features of each image block at that level. In practical applications, feature extraction can be performed on each image block at different sampling rates to obtain feature maps of different levels.
Optionally, when performing feature extraction on each image block to obtain feature maps of at least one level corresponding to the 3D medical image, the image blocks together serve as the input image of a feature extraction network, where the number of image blocks may be used as the number of channels of the input image, and the length and width of each image block as the length and width of the input image; the feature maps of at least one level corresponding to the 3D medical image are obtained through the feature extraction network. For example, if the number of image blocks is 3 and the width and length of one image block are w and h respectively, the parameters of the input image are w × h × 3.
It should be noted that the number of channels of the feature map of each level is not limited by the embodiment of the present invention and may be configured according to actual requirements. That is, the number of input channels of the feature extraction network is the number of image blocks, while the number of output channels can be configured as required. A sketch of this input convention follows.
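This is a minimal sketch with an assumed toy encoder, not the patent's feature extraction network; it only shows a block's adjacent slices entering a 2D network as input channels, so a 3-slice block is consumed as a w × h × 3 image:

    import torch
    import torch.nn as nn

    num_slices = 3                                 # channels of the network input
    block = torch.randn(1, num_slices, 256, 256)   # (batch, channels, h, w)

    backbone = nn.Sequential(                      # toy stand-in for the encoder
        nn.Conv2d(num_slices, 32, kernel_size=3, stride=2, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),
        nn.ReLU(inplace=True),
    )
    features = backbone(block)
    print(features.shape)  # torch.Size([1, 64, 64, 64]): a downsampled feature map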
The feature maps of the images are extracted by downsampling the images (e.g., by convolution processing); feature maps of different levels can be understood as feature maps with different downsampling rates.
As an example, assume that the feature maps corresponding to the acquired 3D medical image are feature maps of 2 levels, namely a first-level feature map with a downsampling rate of 16 and a second-level feature map with a downsampling rate of 4.
For one level, the image blocks contribute to the 3D medical image to different degrees, i.e., the channels have different degrees of importance. Therefore, when determining the feature map of each image block, the weight corresponding to each image block, i.e., the weight feature map corresponding to each image block, may be determined based on these different degrees of importance, so that the feature map of each image block at that level is determined based on the corresponding weight feature map.
In an alternative of the present invention, the feature maps of the at least one level include feature maps of at least two levels, and determining the weight feature map corresponding to each image block based on the feature maps of the at least one level may include:
determining, based on the feature map of each level, the weight feature map of each image block corresponding to that level.
Obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature maps of the at least one level may include:
determining the feature map of each image block corresponding to each level based on the feature map of that level and the weight feature map of each image block corresponding to that level;
for each image block, fusing the feature maps of the levels corresponding to that image block to obtain the feature map of the image block.
Specifically, when the at least one level comprises at least two levels, each image block has a corresponding weight feature map at each level. Based on the feature map of each level of the 3D medical image and the weight feature map of each image block at that level, the feature map of each image block at each level can be determined. Features of different levels have different expressive capacities, so after obtaining the feature map of each image block at each level, the feature maps of the different levels can be fused, and the fused feature map is taken as the feature map of the image block; the fusion further improves the expressive capacity of the feature map. As an example, following the example above, for each image block the feature map at the first level and the feature map at the second level may be fused, and the fused map taken as the feature map of the image block. A hedged sketch of this weighting and fusion step follows.
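The patent does not fix the operators, so in the sketch below the weight feature map is modeled as a sigmoid-activated 1x1 convolution and the fusion as element-wise addition after bilinear resizing; WeightedFusion and all layer sizes are assumptions, not the patent's architecture:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class WeightedFusion(nn.Module):
        def __init__(self, channels_per_level: list[int], out_channels: int = 64):
            super().__init__()
            # One weight head per level produces the weight feature map (assumed form).
            self.weight_heads = nn.ModuleList(
                nn.Conv2d(c, c, kernel_size=1) for c in channels_per_level)
            self.proj = nn.ModuleList(
                nn.Conv2d(c, out_channels, kernel_size=1) for c in channels_per_level)

        def forward(self, level_feats):  # list of (B, C_i, H_i, W_i) feature maps
            target_hw = level_feats[0].shape[-2:]
            fused = 0
            for feat, w_head, proj in zip(level_feats, self.weight_heads, self.proj):
                weights = torch.sigmoid(w_head(feat))   # weight feature map
                weighted = proj(feat * weights)         # weighted per-level features
                fused = fused + F.interpolate(weighted, size=target_hw,
                                              mode="bilinear", align_corners=False)
            return fused

    f_low = torch.randn(1, 64, 64, 64)    # e.g. downsampling rate 4
    f_high = torch.randn(1, 128, 16, 16)  # e.g. downsampling rate 16
    out = WeightedFusion([64, 128])([f_low, f_high])
    print(out.shape)  # torch.Size([1, 64, 64, 64])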
In an alternative of the present invention, determining in step S150 the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the prediction probability map of each image block may include:
stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain a probability map of the 3D medical image;
determining, based on the probability map of the 3D medical image, the detection results of the lesion areas corresponding to each disease type in the 3D medical image.
Specifically, all image blocks come from the same 3D medical image. To determine the detection results corresponding to each disease type in the 3D medical image, the prediction probability maps of the image blocks may be stitched, according to the segmentation order of the image blocks, into a complete probability map corresponding to the 3D medical image. This probability map reflects the probability that each pixel belongs to each disease type, so the detection results corresponding to each disease type in the 3D medical image can be obtained from it.
In an alternative of the present invention, determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the probability map of the 3D medical image may include:
performing binarization processing on the probability map of the 3D medical image for each disease type to obtain a segmentation result corresponding to the 3D medical image, wherein the segmentation result characterizes, for each pixel in the 3D medical image, the detection result for the lesion area of each disease type;
determining the detection results of the lesion areas corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image.
Specifically, the probability corresponding to each pixel may differ. To determine the detection results of the lesion areas corresponding to each disease type in the 3D medical image, binarization processing may be performed on the probability map for each disease type: the probabilities of the pixels belonging to the same disease type are normalized to an identifier, i.e., binarization yields two results, one identifying the pixel as not corresponding to the disease type and the other identifying it as corresponding to the disease type, and the binarized probability map is taken as the segmentation result of the 3D medical image. The region formed by the pixels belonging to the same disease type can then be taken as the lesion area corresponding to that disease type.
In an embodiment of the present invention, one practical way to determine whether a pixel in the image corresponds to a disease type is to check whether the probability of the pixel is greater than the probability threshold corresponding to that disease type: if it is, the pixel corresponds to the disease type; otherwise it does not. It should be understood that the probability thresholds corresponding to different disease types may differ.
Furthermore, one implementation of the per-disease-type binarization of the probability map may be: for each pixel in the 3D medical image, and for one disease type, determine whether the pixel corresponds to that disease type based on the pixel's probability for the disease type and the corresponding probability threshold; if it does, binarize the pixel's probability to one result (for example, 1), and if not, to the other result (for example, 0). After binarization, pixels with value 1 correspond to the disease type and pixels with value 0 do not. A sketch of this thresholding follows.
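In the following sketch the thresholds and array shapes are assumed values, not from the patent; each disease-type channel of the stitched probability map is thresholded with its own probability threshold:

    import numpy as np

    prob_map = np.random.rand(4, 12, 256, 256)      # (disease types, depth, h, w)
    thresholds = np.array([0.5, 0.6, 0.4, 0.5])     # one assumed threshold per type
    seg = (prob_map >= thresholds[:, None, None, None]).astype(np.uint8)
    # seg[k] == 1 marks pixels detected as disease type k; 0 marks background.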
In an alternative of the present invention, determining a detection result of a focus area corresponding to each disease in the 3D medical image based on a segmentation result corresponding to the 3D medical image may include:
based on a segmentation result corresponding to the 3D medical image, determining a connected domain in the 3D medical image, wherein the connected domain is a region corresponding to adjacent pixels with the same binarization value;
based on each connected domain, a detection result of a focus region corresponding to each disease in the 3D medical image is determined.
Specifically, after binarizing the probability map of the 3D medical image, pixels belonging to each disease can be distinguished, in order to provide a relatively obvious focus area for a user, an area corresponding to adjacent pixels with the same binarization value can be used as a connected domain, the connected domain visually corresponds to an area of the 3D medical image, and then based on each connected domain, a detection result of the focus area corresponding to each disease in the 3D medical image can be determined.
Based on the detection result determined from the connected domains, and to make it easy for the user to view the lesion areas in the 3D medical image, each lesion area may be marked with an identifier, for example a labeling frame. To distinguish the lesion areas of different diseases within the same 3D medical image, different identifiers may be used for different diseases; for example, the lesion areas of different diseases may be marked in different colors.
In an alternative scheme of the present invention, an overlapping area is provided between two adjacent image blocks in at least two image blocks, and the prediction probability map of each image block is spliced according to the segmentation order of each image block to obtain a probability map of a 3D medical image, which may include:
For each overlap region, determining a new probability map portion of the overlap region based on the probability map portions of the overlap region corresponding in the corresponding two predictive probability maps;
and splicing the prediction probability graphs of the image blocks according to the segmentation sequence of the image blocks to obtain a probability graph of the 3D medical image, wherein the probability graph part corresponding to each overlapping region in the spliced probability graph of the 3D medical image is a corresponding new probability graph part.
Since the at least two image blocks are obtained by segmenting the 3D medical image, after each image block has been processed, the prediction probability maps corresponding to the image blocks need to be spliced into the probability map of the 3D medical image according to the segmentation order. To ensure that the boundary information of each image block is preserved when adjacent image blocks are processed, an overlapping area is formed between two adjacent image blocks during segmentation. Because of this overlapping area, and in order to make the spliced probability map more accurate, the probability map portions corresponding to the overlap in the two corresponding prediction probability maps can be processed to obtain a new probability map portion, so that the spliced probability map does not contain duplicated information from the overlapping area.
It will be appreciated that determining new probability map portions may be performed before or after stitching, and that the order in which new probability map portions are determined is not limited in the present invention.
In an alternative aspect of the present invention, determining a new probability map portion of an overlap region based on the probability map portions of the overlap region corresponding to the two prediction probability maps may include:
determining the weight corresponding to each image block in the two image blocks corresponding to the overlapping area;
a new probability map portion of the overlap region is determined based on the respective weights for each image block and the corresponding probability map portions of the overlap region in the corresponding two predictive probability maps.
Specifically, the importance of each image block may differ; for two image blocks having an overlapping region, their importance, and hence the importance of their prediction probability maps, may also differ. Therefore, when determining the new probability map portion of the overlapping region, the contribution of each prediction probability map can be represented by the weight of the corresponding image block: the greater the weight, the greater the contribution of that image block. At the same time, the new probability map portion so determined preserves the effective information in the image blocks to the greatest extent.
As an example, let two adjacent image blocks be image block A and image block B, the prediction probability map of image block A be A1, the prediction probability map of image block B be B1, the overlapping area of image block A and image block B be portion C, the probability map portions corresponding to portion C in prediction probability maps A1 and B1 be D, the weight of image block A be w1, and the weight of image block B be w2. A new probability map portion can then be determined based on the weight w1 of image block A, the weight w2 of image block B, and the portions D.
In the above example, w1 and w2 may be configured based on actual requirements; if the weights w1 and w2 are the same, image block A and image block B play the same role.
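The text does not fix how the new probability map portion is computed from w1, w2, and the portions D; a weighted average is one plausible realization, sketched below under that assumption.

import numpy as np

def blend_overlap(part_a1, part_b1, w1, w2):
    """Determine a new probability map portion for an overlap region.

    part_a1, part_b1: the probability map portions of the overlap region
                      taken from prediction probability maps A1 and B1.
    w1, w2:           the weights of image block A and image block B;
                      equal weights mean the two blocks play the same role.
    """
    # Weighted average; with w1 == w2 this reduces to a plain average.
    return (w1 * part_a1 + w2 * part_b1) / (w1 + w2)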
In an alternative scheme of the present invention, the feature extraction performed on each image block to obtain its feature map, and the determination of the prediction probability map of each image block for each disease based on that feature map, are carried out through a neural network model, which is trained in the following manner:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is marked with a marking result corresponding to each disease, and the marking result represents the probability that each pixel in the slice belongs to a focus area of each disease;
Training an initial network model based on the sample 3D medical image until the loss function of the initial network model converges, and taking the model after training as a neural network model;
the loss function characterizes the difference degree of the prediction result and the labeling result corresponding to each slice.
Specifically, the scheme of the present invention can be realized through a neural network model obtained by training on sample 3D medical images; a slice adopted during training plays the role of the image block described above. As described above, the sample 3D medical image may also be a CT image; for convenience of description, a CT image is taken as an example. Each CT image may include a plurality of consecutive slices, and during image processing a plurality of adjacent slices in the CT image may be selected, for example any three mutually adjacent slices (hereinafter referred to as the first slice, the second slice, and the third slice). The present invention is not limited thereto, however, and an appropriate number of slices may be selected according to the required accuracy and the available computation.
In an alternative scheme of the present invention, the prediction result is the detection result of the lesion area of each disease for each pixel in each slice. The loss function of the initial network model includes a first loss function and a second loss function; the value of the first loss function characterizes the degree of difference between the prediction result of each pixel in each slice and the labeling result corresponding to that pixel, and the value of the second loss function characterizes the degree of difference between the prediction result corresponding to each slice pair and the corresponding labeling result.
The prediction result corresponding to each slice pair is determined based on the prediction results of the slices in that pair, and the labeling result corresponding to each slice pair is determined based on the labeling results of the slices in that pair.
Specifically, during training, not only the degree of difference between the prediction result and the labeling result of each pixel in each slice is considered, but also the degree of difference between the prediction result and the labeling result corresponding to a slice pair (any two slices can form one slice pair). Accordingly, the loss function of the initial network model includes a first loss function and a second loss function: the value of the first loss function represents the pixel-level degree of difference for each slice, and the value of the second loss function represents the degree of difference for each slice pair. The latter degree of difference can reflect the correlation between the two slices of the pair: among a plurality of slices, the closer two slices are, the stronger their correlation, and the farther apart they are, the weaker it is.
In an alternative aspect of the invention, the value of the second loss function is determined by:
determining the association weight corresponding to each slice pair;
and determining the value of the second loss function based on the prediction result and the labeling result corresponding to each slice pair and the association weight corresponding to each slice pair.
Specifically, since the correlation between two slices becomes stronger as they get closer and weaker as they get farther apart, the correlation between the two slices of a slice pair can be represented by the association weight corresponding to that pair.
In an alternative of the present invention, the closer two slices are, the stronger their correlation; the strength of the correlation is thus related to the distance between the two slices, and the association weight can be determined from the distance between the two slices in the corresponding slice pair. Here, the distance refers to the distance between the positions of the two slices in the same 3D medical image.
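The text fixes only that the association weight is determined by the inter-slice distance and grows as the distance shrinks; the reciprocal form below is a hypothetical choice consistent with that description, not the actual formula used by the scheme.

def association_weight(m, n):
    """Hypothetical association weight for the slice pair (m, n): the closer
    the slices, the larger the weight. The reciprocal distance is an
    assumption; the scheme's own formula is given as formula (3) below."""
    assert m != n, "a slice pair consists of two distinct slices"
    return 1.0 / abs(m - n)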
As an example, denote a disease type by C; in this example C takes the value 4, that is, 4 disease types are taken as an example. Denote the loss function of the initial network model by L (hereinafter the total loss function), the first loss function by L_Dice, and the second loss function by L_DCD. The number of selected output slices is denoted OC, which takes the value 3. Let p_{i,C}^m be the probability that the i-th pixel on the m-th slice belongs to disease C, i.e., the prediction result; let y_{i,C}^m be the corresponding labeling result; and let V be the number of pixels contained in a single slice. L_Dice is the first loss function between the prediction result of a single slice and the corresponding labeling result, and can be expressed by the following formula (1):

L_Dice = 1 - (1/OC) * Σ_{m=1}^{OC} [ 2 * Σ_{i=1}^{V} p_{i,C}^m * y_{i,C}^m / ( Σ_{i=1}^{V} p_{i,C}^m + Σ_{i=1}^{V} y_{i,C}^m ) ]   (1)
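A minimal PyTorch sketch of the per-slice Dice term, assuming the standard soft-Dice form shown in formula (1); tensor shapes and the epsilon smoothing term are illustrative.

import torch

def dice_loss(pred, target, eps=1e-6):
    """First loss function L_Dice for one slice (standard soft-Dice form).

    pred, target: tensors of shape (C, V) holding, per disease C, the
                  predicted probability and the label of each of the V
                  pixels of a single slice.
    """
    inter = (pred * target).sum(dim=1)
    denom = pred.sum(dim=1) + target.sum(dim=1)
    # 1 minus the mean Dice coefficient over the C diseases
    return (1.0 - (2.0 * inter + eps) / (denom + eps)).mean()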
L_DCD is the second loss function between the prediction result corresponding to each slice pair and the corresponding labeling result, and is given by formula (2), where m is the index of the m-th slice among the at least two slices and n is the index of a slice associated with it. Slice m and slice n form a slice pair; the prediction result corresponding to the pair is determined based on the prediction results p^m and p^n of slices m and n, and the labeling result corresponding to the pair is determined based on the labeling results y^m and y^n of slices m and n.
Since, in a three-dimensional image, the closer two slices are the stronger their correlation, and the farther apart they are the weaker it is, the value of the second loss function can be determined as follows:
For a slice pair, the association weight corresponding to the pair may represent the strength of the correlation between its two slices. In this example, one way to determine the association weight is to determine it based on the distance between the two slices in the pair, as given by formula (3).
Based on the prediction result and the labeling result corresponding to each slice pair and the association weight corresponding to each slice pair, the second loss function is determined according to formula (4).
In formula (4), m is the index of the m-th slice among the output at least two slices, and n is the index of a slice associated with it; slices m and n form a slice pair, and the pair-level prediction P_{m,n} in formula (4) is given by formula (5).
Based on the first loss function and the second loss function described above, the total loss function can be expressed by the following formula (6):

L = L_Dice + L_DCD   (6)
Optionally, the value of the second loss function characterizes the degree of difference between the prediction result and the labeling result corresponding to each slice pair, and this degree of difference reflects the correlation between the two slices of the pair. The first and second loss functions may therefore be assigned weights according to their contribution to the total loss function, and the total loss function can be expressed by the following formula (7):

L = L_Dice + λ * L_DCD   (7)

where λ is the weight corresponding to the second loss function and may be configured based on actual requirements.
Optionally, since the second loss function is applied to slice pairs and each slice pair has a corresponding association weight, the weight corresponding to the second loss function may, in order to balance the importance of the first and second loss functions, be determined based on the association weights, specifically by formula (8). At this time, the total loss function can be expressed by formula (9).
In an alternative aspect of the invention, the sample 3D medical image is a 3D lung image, and the disease includes at least one of nodule, arteriosclerosis, lymph node calcification, or cord; the labeling result is a target labeling frame corresponding to the lesion area. For a slice in a sample 3D medical image:
if the disease includes at least one of nodule, arteriosclerosis, or lymph node calcification, the labeling result of the slice is determined by:
determining a target labeling frame of a corresponding focus area in the slice based on a center point and a radius corresponding to an original labeling result corresponding to the slice, wherein the original labeling result is a labeling result corresponding to each slice in a training data set corresponding to the 3D medical image of the sample;
if the disease is a cord, the labeling result of the slice is determined by the following method:
and determining an initial labeling frame of a corresponding focus area in the slice based on a center point and a radius corresponding to an original labeling result corresponding to the slice, performing image morphological operation on the initial labeling frame, and determining a target labeling frame of the corresponding focus area in the slice based on an operation result.
Specifically, before the neural network model is trained, a training dataset, that is, a dataset of sample 3D medical images, needs to be established; based on medical images of different human body parts, image detection models for the corresponding body parts can be obtained through training.
In this example, based on the foregoing description, the lung image may be a lung CT image, since CT values of different magnitudes correspond to different body parts. According to the corresponding medical knowledge, the values of the lung CT image can be clipped to [-1000, 600] (the CT value range corresponding to the lung) and then normalized to [0, 1]; the normalized values are taken as the input of the model.
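The clipping and normalization just described can be written directly; the function name is illustrative.

import numpy as np

def preprocess_lung_ct(volume_hu):
    """Clip a lung CT volume to the lung value range [-1000, 600] and
    normalize the result to [0, 1] as the input of the model."""
    clipped = np.clip(volume_hu, -1000.0, 600.0)
    return (clipped + 1000.0) / 1600.0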
If the training dataset is created based on 3D lung images and the disease includes at least one of nodule, arteriosclerosis, lymph node calcification, or cord, then before training the model, a training dataset comprising 3D lung images of these 4 diseases is acquired, each lung image comprising a plurality of consecutive slices. In the training dataset, each slice has a corresponding original labeling result determined by a center point and a radius. Since such a labeling result is at the image level, it needs to be converted into a pixel-level labeling result in order to meet the pixel-level image processing requirements; the converted labeling result is the target labeling frame corresponding to the lesion area.
In particular, for a slice, if the disease includes at least one of nodule, arteriosclerosis, or lymph node calcification (lesions that are generally spherical or ellipsoidal), the labeling result of the slice is determined by determining the target labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice. If the disease is a cord, which is usually strip-shaped, the labeling result of the slice is determined by determining an initial labeling frame of the corresponding lesion area based on the center point and radius of the original labeling result, performing image morphology operations on the initial labeling frame, and determining the target labeling frame of the corresponding lesion area based on the operation result. Here, image morphology operations include erosion, dilation, and the like.
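For the cord case, an opening (erosion followed by dilation) is one example of the image morphology operations mentioned above; the concrete operator and its parameters are assumptions for illustration.

from scipy import ndimage

def refine_cord_frame(initial_mask):
    """Refine the initial labeling frame of a strip-shaped cord lesion.

    initial_mask: binary array built from the center point and radius of
                  the original labeling result.
    Returns the bounding boxes of the regions remaining after the
    morphology operation, from which the target labeling frame is taken.
    """
    opened = ndimage.binary_erosion(initial_mask)   # erosion
    opened = ndimage.binary_dilation(opened)        # dilation
    labeled, _ = ndimage.label(opened)
    return ndimage.find_objects(labeled)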
In an alternative aspect of the invention, the initial network model may be a fully convolutional neural network.
To further illustrate the present solution, it is described below with reference to fig. 2 to 5. In this example, a lung CT image is taken as an example, and the disease includes at least one of nodule, arteriosclerosis, lymph node calcification, or cord:
In the model training stage: first, a training dataset is acquired, and labeling results of each slice are determined based on the manner described above.
5 slices (corresponding to the slices in fig. 2) are selected and input to the initial network model, and the number of slices is taken as the channel number of the model, the input of the model thus being the 5 slices stacked along the channel dimension.
A schematic diagram of the initial network model is shown in fig. 2. The network model mainly comprises two parts, an encoding module and a decoding module (corresponding to the Encoder and the Decoder in fig. 2, respectively). Conv, BN, and ReLU therein refer to Convolution, Batch Normalization, and the ReLU (Rectified Linear Unit) activation function, respectively.
The encoding module is used to extract features from the 5 slices to obtain the feature map of each slice. The encoding module adopts a convolutional neural network (corresponding to the DCNN in fig. 2) and is formed by connecting an Xception structure in series with an ASPP module, with non-shared parameters. The Xception module is used to extract high-dimensional features of the image, and the ASPP module extracts multi-level information between the slices by combining atrous (hole) convolutions with 4 different sampling rates; atrous convolutions with different sampling rates can effectively capture multi-scale information, that is, the ASPP module can capture multi-level association information between the slices.
The Xception structure is used to extract high-dimensional features of an input slice. Its feature extraction principle is to divide the traditional convolution operation into two steps. Assuming an original 3*3 convolution, first M 3*3 convolution kernels convolve the M input feature maps one to one, generating M results without summation; then N 1*1 convolution kernels (corresponding to the 1x1 Conv connected with Xception in the decoding module Decoder of fig. 2) normally convolve the M previously generated results and sum them, finally generating N results, i.e., an N-dimensional feature map (the Low Level Features shown in fig. 2). The 1*1 convolution kernels can transform the features into a feature map of a set dimension.
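The two-step convolution described above is what is commonly called a depthwise separable convolution; a PyTorch sketch follows, with the channel counts M and N as parameters.

import torch.nn as nn

class SeparableConv2d(nn.Module):
    """Two-step convolution of the Xception structure: M depthwise 3*3
    kernels convolve the M input feature maps one to one without summation,
    then N pointwise 1*1 kernels combine them into N output feature maps."""

    def __init__(self, m_channels, n_channels):
        super().__init__()
        # groups=m_channels assigns one 3*3 kernel to each input channel
        self.depthwise = nn.Conv2d(m_channels, m_channels, kernel_size=3,
                                   padding=1, groups=m_channels)
        self.pointwise = nn.Conv2d(m_channels, n_channels, kernel_size=1)

    def forward(self, x):
        return self.pointwise(self.depthwise(x))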
As shown in fig. 2, the ASPP module includes a 1x1 convolution (the 1x1 Conv shown in the ASPP module in fig. 2), three 3x3 atrous convolutions with sampling rates of 6, 12, and 18 (3x3 Conv Rate 6, 3x3 Conv Rate 12, and 3x3 Conv Rate 18 in fig. 2), BN layers, and a global pooling layer (Image Pooling in fig. 2). Based on the feature map extracted by the Xception structure, the ASPP module obtains multi-level (multi-scale) association information between the slices through the atrous convolutions with different sampling rates, then passes the atrous convolution features through the global pooling layer and a convolution in turn, and finally concatenates all features through a 1x1 Conv (the 1x1 Conv before High Level Features in the encoding module in fig. 2) to obtain 16x downsampled features (the High Level Features shown in fig. 2). These features include not only the features of each slice but also the features between slices.
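A sketch of an ASPP module matching this description (a 1x1 branch, three atrous 3x3 branches at rates 6, 12, and 18, image pooling, and a fusing 1x1 convolution); the channel sizes and class name are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ASPP(nn.Module):
    """Atrous spatial pyramid pooling as described above."""

    def __init__(self, in_ch, out_ch):
        super().__init__()
        def branch(rate):
            return nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 3, padding=rate, dilation=rate),
                nn.BatchNorm2d(out_ch), nn.ReLU())
        self.conv1x1 = nn.Sequential(nn.Conv2d(in_ch, out_ch, 1),
                                     nn.BatchNorm2d(out_ch), nn.ReLU())
        self.rate6, self.rate12, self.rate18 = branch(6), branch(12), branch(18)
        self.image_pool = nn.Sequential(nn.AdaptiveAvgPool2d(1),
                                        nn.Conv2d(in_ch, out_ch, 1), nn.ReLU())
        self.fuse = nn.Conv2d(5 * out_ch, out_ch, 1)  # final 1x1 Conv

    def forward(self, x):
        h, w = x.shape[2:]
        pooled = F.interpolate(self.image_pool(x), size=(h, w),
                               mode='bilinear', align_corners=False)
        feats = torch.cat([self.conv1x1(x), self.rate6(x), self.rate12(x),
                           self.rate18(x), pooled], dim=1)
        return self.fuse(feats)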
In this embodiment, the decoding module is configured as a multi-branch decoding module; based on the multi-branch structure, features of at least two levels may be selected for the decoding process below. In this example, the feature map of the first level corresponds to a downsampling rate of 16, and the feature map of the second level corresponds to a downsampling rate of 4.
The decoding module is configured as a multi-branch decoding module, the number of branches corresponding to the number of prediction probability maps to be output, that is, one prediction probability map is generated for each input slice except the first and the last. In this example, considering the correlation information among the slices, the 3 slices in the middle of the 5 slices can be selected, and the prediction probability maps of these 3 slices (the 3 prediction probability maps shown in fig. 2) are determined by the neural network.
Specifically, in the decoding process, the 16x downsampled high-dimensional features are first input into a temporal attention module (the Temporal Attention Block shown in fig. 2) to obtain OC groups of features (feature maps corresponding to 3 slices at the first level, with OC=3), and these groups of features are then upsampled respectively. Likewise, the 4x downsampled high-dimensional features are also input into a temporal attention module, yielding OC groups of features (feature maps corresponding to 3 slices at the second level). The feature maps of the two levels are spliced (the Concat shown in the figure) to obtain the common OC groups of features (feature maps of the 3 slices), which are then respectively upsampled (upsampling by 4 in fig. 2) and convolved (for example the 3 Conv of 3*3 shown in fig. 2) to obtain a prediction probability map (the Output shown in fig. 3) for each of the 3 slices.
A new feature map is usually obtained through a series of convolutions and pooling, and each of its channels (5 channels in the scheme of the present invention) is treated as equally important, without distinguishing important channels from unimportant ones. In practical applications, however, not every channel of the obtained feature map is equally useful. For example, in a picture containing an animal, the background may matter little, and after conversion to a gray-level map only one channel may be important while the others matter less. Thus the channels of the actually obtained feature map differ in importance, that is, each channel may carry a weight representing its importance; multiplying the weight of each channel by the original values of that channel yields the feature map actually required.
Fig. 3 is a schematic diagram of the multi-branch decoding module. The temporal attention module works as follows: it determines a weight feature map (corresponding to the weight values described above) for each slice at the corresponding level, and determines the feature map of each slice at that level (corresponding to the actually required feature map described above) based on the weight feature map of each slice and the feature map of at least one level. In the scheme of the present invention, the number of slices is taken as the channel number, so the slices can be regarded as the channels described above.
The principle of the temporal attention module in fig. 3 may be as shown in fig. 4. Taking the first-level feature map as an example, the first-level feature map (F in fig. 4) passes through two convolution layers (one 3*3 convolution Conv and one 1*1 convolution Conv in fig. 4; these may also be the three 1*1 convolutions Conv shown in fig. 3), is then mapped to probabilities between 0 and 1 through an activation function (the Sigmoid in fig. 4), and the obtained probabilities are multiplied by the corresponding feature maps respectively, giving the weighted feature map of each slice at this level (the three blurred maps in fig. 4). The feature map of each slice at this level is then determined based on the feature map of the level and the weighted feature map of each slice at the level.
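A sketch of the temporal attention step as just described: two convolution layers produce one weight map per output slice, a Sigmoid maps it to (0, 1), and the weight maps are multiplied with the level's features. Layer sizes and the class name are illustrative.

import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Temporal attention over a level's feature map, yielding one weighted
    feature map per output slice (OC = 3 in the example above)."""

    def __init__(self, in_ch, num_slices=3):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, 3, padding=1),  # 3*3 convolution
            nn.Conv2d(in_ch, num_slices, 1))        # 1*1 convolution

    def forward(self, feat):
        # one weight map per slice, mapped to (0, 1) by Sigmoid
        weights = torch.sigmoid(self.conv(feat))    # shape (N, OC, H, W)
        # multiply each weight map with the level's feature map
        return [weights[:, i:i + 1] * feat for i in range(weights.shape[1])]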
In the same manner, the feature map of each slice at the second level can be obtained. After the feature map of each slice at each level is obtained, the feature maps of the levels are fused per slice (the Concat shown in fig. 2) to obtain the feature map of that slice. The feature maps of all slices are then upsampled at a sampling rate of 4 and convolved; in this example, 3 convolutions of 3*3 are selected for the convolution processing (the 3 Conv of 3*3 in the decoding module of fig. 2), finally yielding the prediction probability map of each slice.
After the prediction probability map of each slice (which can serve as the prediction result) is obtained through the initial network model, whether the model has converged is judged through the loss function of the initial network model, which may be the loss function described above. When the degree of difference between the prediction result and the corresponding labeling result is smaller than a set value, the loss function is judged to have converged, and the trained model is the neural network model.
After the prediction probability map of each slice is obtained, the probability corresponding to each pixel within one prediction probability map may differ. To determine the detection result of the lesion area corresponding to each disease in the 3D lung image, binarization processing may be performed on the probability map of the 3D lung image for each disease: the probabilities of the pixels belonging to the same disease are normalized to one identifier, so that the binarized map contains two results, one identifying that a pixel does not correspond to the disease and the other identifying that it does. The binarized probability map is used as the segmentation result of the 3D lung image.
After the probability map of the 3D lung image is binarized, the pixels belonging to each disease can be distinguished. To provide the user with a clearly delimited lesion area, the region formed by adjacent pixels with the same binarization value can be taken as a connected domain (corresponding to the connected domain extraction in fig. 2); each connected domain visually corresponds to one region of the 3D lung image, and the detection result of the lesion area corresponding to each disease in the 3D lung image can then be determined based on each connected domain.
Testing is carried out based on a trained neural network model, and the specific process of the testing is as follows:
Acquiring a 3D medical image to be processed, where the 3D medical image to be processed is a 3D lung image; dividing the 3D lung image to be processed into at least two image blocks, with an overlapping area between two adjacent image blocks of the at least two image blocks; and determining, based on the at least two image blocks, the detection result corresponding to each disease in the 3D lung image to be processed. This determination process is consistent with that described above and is therefore not repeated.
After the prediction probability map of each image block is obtained, since all image blocks come from the same 3D lung image, the prediction probability maps of the image blocks can be spliced according to the segmentation order to obtain the complete probability map corresponding to the 3D lung image, from which the detection result corresponding to each disease in the 3D lung image is determined.
Since the 3D lung image is segmented into at least two image blocks, after each image block has been processed, the corresponding prediction probability maps are spliced into the probability map of the 3D lung image according to the segmentation order. To ensure that the boundary information of each image block is preserved when adjacent image blocks are processed, an overlapping area is formed between two adjacent image blocks during segmentation. Because of this overlapping area, and to make the spliced probability map more accurate, a new probability map portion can be determined for each overlapping area based on the corresponding probability map portions in the two prediction probability maps; when the prediction probability maps are spliced according to the segmentation order, the probability map portion corresponding to each overlapping area in the spliced probability map of the 3D lung image is the corresponding new probability map portion.
It will be appreciated that determining new probability map portions may be performed before or after stitching, and that the order in which new probability map portions are determined is not limited in the present invention.
Specifically, the importance of each image block may differ; for two image blocks having an overlapping region, their importance, and hence the importance of their prediction probability maps, may also differ. Therefore, when determining the new probability map portion of the overlapping region, the contribution of each prediction probability map can be represented by the weight of the corresponding image block: the greater the weight, the greater the contribution of that image block. At the same time, the new probability map portion so determined preserves the effective information in the image blocks to the greatest extent.
As an example, the three image blocks shown in fig. 5 are a first image block, a second image block, and a third image block, and the overlapping area between the first image block and the second image block is A. Suppose the prediction probability map of the first image block is A1, that of the second image block is B1, the probability map portions corresponding to the overlapping area A in A1 and B1 are D, the weight corresponding to the first image block is w1, and the weight corresponding to the second image block is w2. A new probability map portion can then be determined based on the weight w1, the weight w2, and the portions D.
In the above example, w1 and w2 may be configured based on actual requirements; if the weights w1 and w2 are the same, the first image block and the second image block play the same role.
In the probability map, the probability corresponding to each pixel may differ. To determine the detection result corresponding to each disease in the 3D lung image, binarization processing may be performed on the probability map of the 3D lung image for each disease: the probabilities of the pixels belonging to the same disease are normalized to one identifier, so that the binarized map contains two results, one identifying that a pixel does not correspond to the disease and the other identifying that it does. The binarized probability map is used as the segmentation result of the 3D lung image.
After the probability map of the 3D lung image is binarized, the pixels belonging to each disease can be distinguished. To provide the user with a clearly delimited lesion area, the region formed by adjacent pixels with the same binarization value can be taken as a connected domain; each connected domain visually corresponds to one region of the 3D lung image, and the detection result corresponding to each disease in the 3D lung image can then be determined based on each connected domain.
In practical application, based on a 3D medical image provided by a user, the scheme of the present invention can accurately and rapidly provide the user with the localization result of the diseased regions (the regions corresponding to lesions) in the 3D medical image. Referring to fig. 6, the method of the present invention runs at the back end, and the 3D medical image provided by front end A is a lung CT image. Through the scheme of the present invention, the back end determines the disease localization result of the lung CT image, that is, the detection result of the lesion area corresponding to each disease in the lung CT image, and provides it to the user at front end A, so that the user can clearly determine the lesion positions in the lung CT image.
Based on the same principle as the method shown in fig. 1, the embodiment of the present invention further provides a medical image processing apparatus 20, as shown in fig. 7, the medical image processing apparatus 20 may include an image acquisition module 210, an image segmentation module 220, a feature map determination module 230, a prediction probability map determination module 240, and a detection result determination module 250, wherein:
an image acquisition module 210, configured to acquire a three-dimensional 3D medical image to be processed;
An image segmentation module 220, configured to segment the 3D medical image into at least two adjacent image blocks;
the feature map determining module 230 is configured to perform feature extraction on each image block to obtain a feature map of each image block;
a prediction probability map determining module 240, configured to determine a prediction probability map corresponding to each image block based on the respective feature map of each image block, where the prediction probability map characterizes a probability that each pixel in the image block belongs to each disease;
the detection result determining module 250 is configured to determine a detection result of a focus area corresponding to each disease in the 3D medical image based on the prediction probability map of each image block.
According to the medical image processing device provided by the embodiment of the invention, the 3D medical image can be segmented into at least two adjacent image blocks and feature extraction can be performed on each image block to obtain its feature map. Since the feature map of each image block reflects the features of each pixel in the image, the detection result corresponding to each disease in the 3D medical image can be accurately determined based on the feature maps of the image blocks and the association between adjacent image blocks.
Optionally, the feature map determining module is specifically configured to, when performing feature extraction on each image block to obtain a feature map of each image block:
Extracting features of each image block to obtain a feature map of at least one level of the 3D medical image;
determining a weight feature map corresponding to each image block based on the feature map of at least one hierarchy;
and obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature map of at least one level.
Optionally, when the feature map of at least one level includes feature maps of at least two levels, the feature map determining module is specifically configured to, when determining the weight feature map corresponding to each image block based on the feature map of at least one level:
determining a weight feature map of each image block corresponding to each level based on the feature map of each level;
the feature map determining module is specifically configured to, when obtaining a feature map of each image block based on the weighted feature map corresponding to each image block and at least one hierarchical feature map:
determining a feature map of each image block corresponding to each level based on the feature map of each level and the weighted feature map of each image block corresponding to each level;
and fusing the feature graphs of each level corresponding to each image block for each image block to obtain the feature graph of the image block.
Optionally, the detection result determining module is specifically configured to, when determining, based on the prediction probability map of each image block, a detection result of a focus area corresponding to each disease in the 3D medical image:
splicing the prediction probability graphs of the image blocks according to the segmentation sequence of the image blocks to obtain a probability graph of the 3D medical image;
based on the probability map of the 3D medical image, detection results of focus areas corresponding to various diseases in the 3D medical image are determined.
Optionally, the detection result determining module is specifically configured to, when determining the detection result of the focal region corresponding to each disease in the 3D medical image based on the probability map of the 3D medical image:
performing binarization processing on the probability map of the 3D medical image aiming at each disease type to obtain a segmentation result corresponding to the 3D medical image, wherein the segmentation result represents a detection result of a focus area of each disease type of each pixel in the 3D medical image;
and determining detection results of focus areas corresponding to various diseases in the 3D medical image based on the segmentation results corresponding to the 3D medical image.
Optionally, the detection result determining module is specifically configured to, when determining a detection result of a focus area corresponding to each disease in the 3D medical image based on the segmentation result corresponding to the 3D medical image:
Based on a segmentation result corresponding to the 3D medical image, determining a connected domain in the 3D medical image, wherein the connected domain is a region corresponding to adjacent pixels with the same binarization value;
based on each connected domain, a detection result of a focus region corresponding to each disease in the 3D medical image is determined.
Optionally, an overlapping area is formed between two adjacent image blocks in the at least two image blocks, and the detection result determining module is specifically configured to, when the prediction probability map of each image block is spliced according to the segmentation order of each image block to obtain the probability map of the 3D medical image:
for each overlap region, determining a new probability map portion of the overlap region based on the probability map portions of the overlap region corresponding in the corresponding two predictive probability maps;
and splicing the prediction probability graphs of the image blocks according to the segmentation sequence of the image blocks to obtain a probability graph of the 3D medical image, wherein the probability graph part corresponding to each overlapping region in the spliced probability graph of the 3D medical image is a corresponding new probability graph part.
Optionally, the detection result determining module is specifically configured to, when determining a new probability map portion of the overlapping region based on the probability map portions corresponding to the overlapping region in the two corresponding prediction probability maps:
Determining the weight corresponding to each image block in the two image blocks corresponding to the overlapping area;
based on the respective corresponding weights of each image block and the corresponding probability map portions of the overlap region in the corresponding two predictive probability maps, a new probability map portion of the overlap region is determined.
Optionally, feature extraction is performed on each image block to obtain a feature map corresponding to each image block, and based on the feature map corresponding to each image block, a prediction probability map corresponding to each disease of each image block is determined through a neural network model, and the neural network model is obtained through training in the following manner:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is marked with a marking result corresponding to each disease, and the marking result represents the probability that each pixel in the slice belongs to a focus area of each disease;
training an initial network model based on the sample 3D medical image until the loss function of the initial network model converges, and taking the model after training as a neural network model; the value of the loss function characterizes the difference degree of the prediction result and the labeling result corresponding to each slice.
Optionally, the prediction result is a probability that each pixel in each slice belongs to a focal region of each disease; the loss function of the initial network model comprises a first loss function and a second loss function, wherein the value of the first loss function represents the difference degree between the prediction result of each pixel in each slice and the labeling result corresponding to each pixel, and the value of the second loss function represents the difference degree between the prediction result corresponding to each pair of slices in each slice pair and the corresponding labeling result.
Optionally, the value of the second loss function is determined by:
determining the association weight corresponding to each slice pair;
and determining the value of the second loss function based on the prediction result and the labeling result corresponding to each slice pair and the association weight corresponding to each slice pair.
Optionally, the sample 3D medical image is a 3D lung image, the disease including at least one of nodule, arteriosclerosis, lymph node calcification, or cord; the labeling result is a target labeling frame corresponding to the lesion area. For a slice in a sample 3D medical image:
if the disease includes at least one of nodule, arteriosclerosis, or lymph node calcification, the labeling result of the slice is determined by:
Determining a target labeling frame of a corresponding focus area in the slice based on a center point and a radius corresponding to an original labeling result corresponding to the slice, wherein the original labeling result is a labeling result corresponding to each slice in a training data set corresponding to the 3D medical image of the sample;
if the disease is a cord, the labeling result of the slice is determined by the following method:
and determining an initial labeling frame of a corresponding focus area in the slice based on a center point and a radius corresponding to an original labeling result corresponding to the slice, performing image morphological operation on the initial labeling frame, and determining a target labeling frame of the corresponding focus area in the slice based on an operation result.
Since the medical image processing apparatus provided in the embodiment of the present invention is an apparatus capable of executing the medical image processing method in the embodiment of the present invention, a person skilled in the art can understand the specific implementation of the medical image processing apparatus in the embodiment of the present invention and various modifications thereof based on the medical image processing method provided in the embodiment of the present invention, so how the medical image processing method in the embodiment of the present invention is implemented by the apparatus will not be described in detail herein. The medical image processing apparatus used by those skilled in the art to implement the medical image processing method according to the embodiment of the present invention falls within the scope of the present invention.
Based on the same principle as the medical image processing method and the medical image processing apparatus provided by the embodiment of the present invention, the embodiment of the present invention further provides an electronic device, which may include a processor and a memory. The memory stores readable instructions that, when loaded and executed by the processor, implement the methods described in any of the embodiments of the present invention.
As an example, a schematic structural diagram of an electronic device 4000 to which the scheme of the embodiment of the present invention is applied is shown in fig. 8, and as shown in fig. 8, the electronic device 4000 may include a processor 4001 and a memory 4003. Wherein the processor 4001 is coupled to the memory 4003, such as via a bus 4002. Optionally, the electronic device 4000 may also include a transceiver 4004. It should be noted that, in practical applications, the transceiver 4004 is not limited to one, and the structure of the electronic device 4000 is not limited to the embodiment of the present invention.
The processor 4001 may be a CPU (Central Processing Unit), a general-purpose processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof, and may implement or perform the various exemplary logic blocks, modules, and circuits described in connection with this disclosure. The processor 4001 may also be a combination that implements computing functionality, for example a combination of one or more microprocessors, or a combination of a DSP and a microprocessor.
Bus 4002 may include a path to transfer information between the aforementioned components. Bus 4002 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like, and can be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one thick line is shown in fig. 8, but this does not mean that there is only one bus or one type of bus.
Memory 4003 may be, but is not limited to, ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, RAM (Random Access Memory ) or other type of dynamic storage device that can store information and instructions, EEPROM (Electrically Erasable Programmable Read Only Memory ), CD-ROM (Compact Disc Read Only Memory, compact disc Read Only Memory) or other optical disk storage, optical disk storage (including compact discs, laser discs, optical discs, digital versatile discs, blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The memory 4003 is used for storing application program codes for executing the inventive arrangements, and is controlled to be executed by the processor 4001. The processor 4001 is configured to execute application code stored in the memory 4003 to implement the scheme shown in any of the method embodiments described above.
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
The foregoing is only a partial embodiment of the present invention, and it should be noted that it will be apparent to those skilled in the art that modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations should and are intended to be comprehended within the scope of the present invention.

Claims (13)

1. A medical image processing method, comprising:
acquiring a three-dimensional 3D medical image to be processed;
segmenting the 3D medical image into at least two adjacent image blocks;
extracting features of each image block through a neural network model to obtain a feature map of each image block, and determining a prediction probability map corresponding to each image block based on the respective feature map of each image block, wherein the prediction probability map characterizes the probability that each pixel in the image block belongs to each disease;
determining detection results of focus areas corresponding to the disease types in the 3D medical image based on the prediction probability map of each image block;
wherein the neural network model is trained by:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is marked with a marking result corresponding to each disease, and the marking result represents the probability that each pixel in the slice belongs to a focus area of each disease;
training an initial network model based on the sample 3D medical image until a loss function of the initial network model converges, taking the model after training as the neural network model, wherein the value of the loss function characterizes the difference degree of a prediction result and a labeling result corresponding to each slice;
The prediction result is a detection result of a focus area of each disease of each pixel in a slice obtained through the initial network model; the loss function comprises a first loss function and a second loss function, wherein the value of the first loss function represents the degree of difference between the prediction result of each pixel in each slice and the labeling result corresponding to each pixel, and the value of the second loss function represents the degree of difference between the prediction result corresponding to each pair of slices in each slice and the corresponding labeling result.
2. The method according to claim 1, wherein the feature extraction of each image block to obtain a feature map of each image block includes:
extracting features of each image block to obtain a feature map of at least one hierarchy of the 3D medical image;
determining a weight feature map corresponding to each image block based on the feature map of the at least one hierarchy;
and obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature map of the at least one hierarchy.
3. The method according to claim 2, wherein the at least one level of feature map includes at least two levels of feature maps, and the determining the weighted feature map corresponding to each image block based on the at least one level of feature maps includes:
Determining a weight feature map of each image block corresponding to each level based on the feature map of each level;
the obtaining the feature map of each image block based on the weight feature map corresponding to each image block and the feature map of the at least one hierarchy includes:
determining a feature map of each image block corresponding to each level based on the feature map of each level and the weighted feature map of each image block corresponding to each level;
and fusing the feature graphs of each level corresponding to each image block for each image block to obtain the feature graph of the image block.
4. A method according to any one of claims 1 to 3, wherein determining a detection result of a lesion area corresponding to each disease type in the 3D medical image based on the predictive probability map of each image block comprises:
splicing the prediction probability graphs of the image blocks according to the segmentation sequence of the image blocks to obtain a probability graph of the 3D medical image;
and determining detection results of focus areas corresponding to each disease type in the 3D medical image based on the probability map of the 3D medical image.
5. The method of claim 4, wherein the determining a detection result of a lesion area corresponding to each disease in the 3D medical image based on the probability map of the 3D medical image comprises:
Performing binarization processing on the probability map of the 3D medical image for each disease, and obtaining a segmentation result corresponding to the 3D medical image, wherein the segmentation result represents a detection result of a focus area of each disease belonging to each pixel in the 3D medical image;
and determining detection results of focus areas corresponding to the disease types in the 3D medical image based on the segmentation results corresponding to the 3D medical image.
6. The method according to claim 5, wherein the determining a detection result of a lesion area corresponding to each disease type in the 3D medical image based on the segmentation result corresponding to the 3D medical image comprises:
determining connected domains in the 3D medical image based on the segmentation result corresponding to the 3D medical image, wherein a connected domain is a region of adjacent pixels having the same binarization value;
and determining, based on each connected domain, the detection result of the lesion area corresponding to each disease type in the 3D medical image.
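Claim 6's connected domains are ordinary 3D connected components; for illustration, `scipy.ndimage` provides both the labeling and per-region bounding boxes:

```python
import numpy as np
from scipy import ndimage

def lesion_regions(binary_mask):
    """binary_mask: (D, H, W) segmentation result for one disease type.
    Labels each connected domain (adjacent voxels with binarization
    value 1) and returns one bounding box per candidate lesion area."""
    labeled, num_regions = ndimage.label(binary_mask)
    # find_objects returns one slice tuple (the bounding box) per label.
    return ndimage.find_objects(labeled), num_regions
```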
7. The method according to claim 4, wherein an overlap region exists between two adjacent image blocks of the at least two image blocks, and the stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image comprises:
determining, for each overlap region, a new probability map portion of the overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps;
and stitching the prediction probability maps of the image blocks according to the segmentation order of the image blocks to obtain the probability map of the 3D medical image, wherein the probability map portion corresponding to each overlap region in the stitched probability map of the 3D medical image is the corresponding new probability map portion.
8. The method according to claim 7, wherein the determining a new probability map portion of the overlap region based on the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps comprises:
determining the weight corresponding to each of the two image blocks corresponding to the overlap region;
and determining the new probability map portion of the overlap region based on the weight corresponding to each image block and the probability map portions corresponding to the overlap region in the two corresponding prediction probability maps.
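Claims 7 and 8 leave the per-block weights open. The sketch below blends the two probability map portions of an overlap region with explicit weights; equal weights (plain averaging) are the assumed default, though weights that fall off toward a block's border would also fit the claim:

```python
import numpy as np

def blend_overlap(part_a, part_b, weight_a=0.5, weight_b=0.5):
    """part_a, part_b: the probability map portions that the two adjacent
    image blocks predict for the same overlap region.
    weight_a, weight_b: the per-block weights of claim 8."""
    total = weight_a + weight_b
    # New probability map portion of the overlap region.
    return (weight_a * part_a + weight_b * part_b) / total
```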
9. The method according to claim 1, wherein the value of the second loss function is determined by:
determining an association weight corresponding to each slice pair;
and determining the value of the second loss function based on the prediction result and the labeling result corresponding to each slice pair and the association weight corresponding to each slice pair.
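A sketch of claim 9's weighted slice-pair term. How the association weight is computed is not specified here; weighting each pair by the overlap of its two slices' labeled lesion areas (an IoU-style score) is purely an illustrative assumption:

```python
import torch
import torch.nn.functional as F

def second_loss(pred, target):
    """pred, target: (num_slices, num_diseases, H, W); adjacent slices
    form the slice pairs."""
    a, b = target[1:], target[:-1]
    # Association weight per pair: IoU of the two slices' labeled lesions
    # (assumed definition; the claim leaves the weight open).
    inter = (a * b).sum(dim=(1, 2, 3))
    union = (a + b).clamp(max=1).sum(dim=(1, 2, 3))
    assoc_weight = inter / union.clamp(min=1)

    # Per-pair error: prediction change across the pair vs. label change.
    pair_err = F.mse_loss(pred[1:] - pred[:-1], a - b, reduction='none')
    pair_err = pair_err.mean(dim=(1, 2, 3))
    return (assoc_weight * pair_err).mean()
```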
10. The method according to claim 1, wherein the sample 3D medical image is a 3D lung image, and the disease types comprise at least one of nodules, arteriosclerosis, lymph node calcification, or chordae; the labeling result is a target labeling frame corresponding to the lesion area; for a slice in one sample 3D medical image:
if the disease type is at least one of nodules, arteriosclerosis, or lymph node calcification, the labeling result of the slice is determined by:
determining a target labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, wherein the original labeling result is the labeling result corresponding to each slice in the training dataset corresponding to the sample 3D medical image;
if the disease type is chordae, the labeling result of the slice is determined by:
determining an initial labeling frame of the corresponding lesion area in the slice based on the center point and radius of the original labeling result corresponding to the slice, performing an image morphological operation on the initial labeling frame, and determining a target labeling frame of the corresponding lesion area in the slice based on the operation result.
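A sketch of the two labeling paths of claim 10. The box is built from the original annotation's center point and radius; for chordae, binary dilation stands in for the unspecified image morphological operation, and the iteration count is an invented parameter:

```python
import numpy as np
from scipy import ndimage

def box_from_annotation(center, radius):
    """Target labeling frame from the original annotation's
    center point (cy, cx) and radius."""
    cy, cx = center
    return (cy - radius, cx - radius, cy + radius, cx + radius)

def chordae_box(center, radius, slice_shape, dilation_iters=2):
    """Initial labeling frame -> image morphological operation
    (assumed: binary dilation) -> target labeling frame."""
    h, w = slice_shape
    y0, x0, y1, x1 = box_from_annotation(center, radius)
    y0, y1 = max(0, int(y0)), min(h - 1, int(y1))
    x0, x1 = max(0, int(x0)), min(w - 1, int(x1))
    mask = np.zeros((h, w), dtype=bool)
    mask[y0:y1 + 1, x0:x1 + 1] = True
    mask = ndimage.binary_dilation(mask, iterations=dilation_iters)
    ys, xs = np.nonzero(mask)
    return (ys.min(), xs.min(), ys.max(), xs.max())
```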
11. A medical image processing apparatus, comprising:
the image acquisition module is used for acquiring a three-dimensional (3D) medical image to be processed;
an image segmentation module for segmenting the 3D medical image into at least two adjacent image blocks;
the feature map determining module is used for extracting features of each image block to obtain a feature map of each image block;
the prediction probability map determining module is used for determining a prediction probability map corresponding to each image block based on the feature map of each image block, wherein the prediction probability map characterizes the probability that each pixel in the image block belongs to each disease type;
the detection result determining module is used for determining detection results of lesion areas corresponding to each disease type in the 3D medical image based on the prediction probability map of each image block;
wherein the feature map of each image block and the prediction probability map corresponding to each disease type of each image block are obtained through a neural network model, and the neural network model is trained by:
acquiring sample 3D medical images, wherein each sample 3D medical image comprises at least two adjacent slices, each slice is annotated with a labeling result corresponding to each disease type, and the labeling result characterizes the probability that each pixel in the slice belongs to the lesion area of each disease type;
training an initial network model based on the sample 3D medical images until a loss function of the initial network model converges, and taking the trained model as the neural network model, wherein the value of the loss function characterizes the degree of difference between the prediction result and the labeling result corresponding to each slice;
the prediction result is a detection result, obtained through the initial network model, of the lesion area of each disease type to which each pixel in a slice belongs; the loss function comprises a first loss function and a second loss function, wherein the value of the first loss function characterizes the degree of difference between the prediction result of each pixel in each slice and the labeling result corresponding to that pixel, and the value of the second loss function characterizes the degree of difference between the prediction results corresponding to each slice pair among the slices and the corresponding labeling results.
12. An electronic device, comprising:
a processor and a memory;
the memory is used for storing computer operation instructions;
the processor is configured to perform the method of any one of claims 1 to 10 by invoking the computer operation instructions.
13. A computer readable storage medium, characterized in that the storage medium stores at least one computer program, which is loaded and executed by a processor to implement the method of any one of claims 1 to 10.
CN201911001225.8A 2019-10-21 2019-10-21 Medical image processing method, medical image processing device, electronic equipment and computer storage medium Active CN110807788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911001225.8A CN110807788B (en) 2019-10-21 2019-10-21 Medical image processing method, medical image processing device, electronic equipment and computer storage medium


Publications (2)

Publication Number Publication Date
CN110807788A CN110807788A (en) 2020-02-18
CN110807788B true CN110807788B (en) 2023-07-21

Family ID: 69488795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911001225.8A Active CN110807788B (en) 2019-10-21 2019-10-21 Medical image processing method, medical image processing device, electronic equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN110807788B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111340048B (en) * 2020-02-28 2022-02-22 深圳市商汤科技有限公司 Image processing method and device, electronic equipment and storage medium
CN111428709B (en) * 2020-03-13 2023-10-24 平安科技(深圳)有限公司 Image processing method, device, computer equipment and storage medium
CN111583188B (en) * 2020-04-15 2023-12-26 武汉联影智融医疗科技有限公司 Surgical navigation mark point positioning method, storage medium and computer equipment
CN111653356A (en) * 2020-04-20 2020-09-11 浙江大学 Novel coronavirus pneumonia (COVID-19) screening method and screening system based on deep learning
CN111402275A (en) * 2020-04-27 2020-07-10 Oppo广东移动通信有限公司 Hollow detection method, system, equipment and storage medium
CN111696094B (en) * 2020-06-12 2021-06-08 杭州迪英加科技有限公司 Immunohistochemical PD-L1 membrane staining pathological section image processing method, device and equipment
CN111968087B (en) * 2020-08-13 2023-11-07 中国农业科学院农业信息研究所 Plant disease area detection method
CN112016634B (en) * 2020-09-30 2023-07-28 北京百度网讯科技有限公司 Medical image recognition method, device, equipment and storage medium
CN112381771B (en) * 2020-11-04 2022-08-05 吉林大学 Method for segmenting focus region by medical image
CN112258530A (en) * 2020-12-21 2021-01-22 四川大学 Neural network-based computer-aided lung nodule automatic segmentation method
CN112819811A (en) * 2021-02-24 2021-05-18 上海商汤智能科技有限公司 Image analysis method and related device, electronic equipment and storage medium
CN112949654A (en) * 2021-02-25 2021-06-11 上海商汤智能科技有限公司 Image detection method and related device and equipment
CN113012166A (en) * 2021-03-19 2021-06-22 北京安德医智科技有限公司 Intracranial aneurysm segmentation method and device, electronic device, and storage medium
CN112991365B (en) * 2021-05-11 2021-07-20 广东工业大学 Coronary artery segmentation method, system and storage medium
CN113506250B (en) * 2021-06-25 2024-05-28 沈阳东软智能医疗科技研究院有限公司 Pulmonary aortic vessel extraction method, device, readable storage medium and electronic equipment
CN113689353A (en) * 2021-08-26 2021-11-23 上海联影智能医疗科技有限公司 Three-dimensional image enhancement method and device and training method and device of image enhancement model
CN114332128B (en) * 2021-12-30 2022-07-26 推想医疗科技股份有限公司 Medical image processing method and apparatus, electronic device, and computer storage medium
KR20230140160A (en) * 2022-03-29 2023-10-06 가톨릭대학교 산학협력단 Tissue slide image analyze system and method thereof
CN114820483A (en) * 2022-04-14 2022-07-29 北京联影智能影像技术研究院 Image detection method and device and computer equipment
CN117635519A (en) * 2022-08-29 2024-03-01 杭州堃博生物科技有限公司 Focus detection method and device based on CT image and computer readable storage medium
CN116152241B (en) * 2023-04-18 2023-07-25 湖南炅旭生物科技有限公司 Brain image processing method, system, electronic equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107945168A (en) * 2017-11-30 2018-04-20 上海联影医疗科技有限公司 Medical image processing method and medical image processing system
CN108961296A (en) * 2018-07-25 2018-12-07 腾讯科技(深圳)有限公司 Eye fundus image dividing method, device, storage medium and computer equipment
CN109003260A (en) * 2018-06-28 2018-12-14 深圳视见医疗科技有限公司 CT image pulmonary nodule detection method, device, equipment and readable storage medium
CN109829880A (en) * 2018-12-07 2019-05-31 清影医疗科技(深圳)有限公司 A kind of CT image detecting method based on deep learning, device and control equipment
CN109872364A (en) * 2019-01-28 2019-06-11 腾讯科技(深圳)有限公司 Image-region localization method, device, storage medium and medical image processing equipment
CN110111313A (en) * 2019-04-22 2019-08-09 腾讯科技(深圳)有限公司 Medical image detection method and relevant device based on deep learning
CN110148192A (en) * 2019-04-18 2019-08-20 上海联影智能医疗科技有限公司 Medical image imaging method, device, computer equipment and storage medium
CN110309849A (en) * 2019-05-10 2019-10-08 腾讯医疗健康(深圳)有限公司 Blood-vessel image processing method, device, equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6819790B2 (en) * 2002-04-12 2004-11-16 The University Of Chicago Massive training artificial neural network (MTANN) for detecting abnormalities in medical images
US10552962B2 (en) * 2017-04-27 2020-02-04 Intel Corporation Fast motion based and color assisted segmentation of video into region layers


Also Published As

Publication number Publication date
CN110807788A (en) 2020-02-18

Similar Documents

Publication Publication Date Title
CN110807788B (en) Medical image processing method, medical image processing device, electronic equipment and computer storage medium
Gecer et al. Detection and classification of cancer in whole slide breast histopathology images using deep convolutional networks
Chouhan et al. Soft computing approaches for image segmentation: a survey
Elangovan et al. Glaucoma assessment from color fundus images using convolutional neural network
CN110599476B (en) Disease grading method, device, equipment and medium based on machine learning
US20200074634A1 (en) Recist assessment of tumour progression
US20220092789A1 (en) Automatic pancreas ct segmentation method based on a saliency-aware densely connected dilated convolutional neural network
CN112862830B (en) Multi-mode image segmentation method, system, terminal and readable storage medium
CN114445670B (en) Training method, device and equipment of image processing model and storage medium
CN110969632B (en) Deep learning model training method, image processing method and device
CN115496771A (en) Brain tumor segmentation method based on brain three-dimensional MRI image design
Sridhar et al. A Torn ACL Mapping in Knee MRI Images Using Deep Convolution Neural Network with Inception‐v3
CN111681204B (en) CT rib fracture focus relation modeling method and device based on graph neural network
CN115222713A (en) Method and device for calculating coronary artery calcium score and storage medium
CN114758137A (en) Ultrasonic image segmentation method and device and computer readable storage medium
CN115018863A (en) Image segmentation method and device based on deep learning
Farhangi et al. Automatic lung nodule detection in thoracic CT scans using dilated slice‐wise convolutions
CN113177554B (en) Thyroid nodule identification and segmentation method, system, storage medium and equipment
Mishra et al. CR‐SSL: A closely related self‐supervised learning based approach for improving breast ultrasound tumor segmentation
CN116721772B (en) Tumor treatment prognosis prediction method, device, electronic equipment and storage medium
CN113764101A (en) CNN-based breast cancer neoadjuvant chemotherapy multi-modal ultrasonic diagnosis system
Adegun et al. Deep convolutional network-based framework for melanoma lesion detection and segmentation
Jeya Sundari et al. Factorization‐based active contour segmentation and pelican optimization‐based modified bidirectional long short‐term memory for ovarian tumor detection
CN112862785B (en) CTA image data identification method, device and storage medium
CN113379770B (en) Construction method of nasopharyngeal carcinoma MR image segmentation network, image segmentation method and device

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40021899

Country of ref document: HK

SE01 Entry into force of request for substantive examination
GR01 Patent grant