CN112651960A - Image processing method, device, equipment and storage medium - Google Patents

Image processing method, device, equipment and storage medium

Info

Publication number: CN112651960A
Authority: CN (China)
Prior art keywords: feature, images, classification, image, result
Legal status: Pending
Application number: CN202011641682.6A
Other languages: Chinese (zh)
Inventors: 黄静, 吴迪嘉
Assignee (original and current): Shanghai United Imaging Intelligent Healthcare Co Ltd
Application filed by Shanghai United Imaging Intelligent Healthcare Co Ltd
Priority to CN202011641682.6A
Publication of CN112651960A

Classifications

    • G06T 7/0012 Biomedical image inspection (G Physics; G06 Computing; Calculating or Counting; G06T Image Data Processing or Generation, in General; G06T 7/00 Image analysis; G06T 7/0002 Inspection of images, e.g. flaw detection)
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting (G06F Electric Digital Data Processing; G06F 18/00 Pattern recognition; G06F 18/20 Analysing; G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation)
    • G06N 3/045 Combinations of networks (G06N Computing Arrangements Based on Specific Computational Models; G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08 Learning methods (G06N 3/02 Neural networks)


Abstract

The application provides an image processing method, apparatus, device and storage medium. The method comprises: acquiring a plurality of images to be processed of different phases; performing feature classification on the images based on a first neural network model, to obtain target feature classification results corresponding to the images; performing size regression on the images based on a second neural network model, to obtain target size regression results corresponding to the images; and determining a first-level classification result corresponding to the images based on the target feature classification results and the target size regression results. The method can improve the efficiency and accuracy of determining the first-level classification result and reduce the cost of determining it.

Description

Image processing method, device, equipment and storage medium
Technical Field
The present application belongs to the field of computer technologies, and in particular, to an image processing method, apparatus, device, and storage medium.
Background
The prior art employs imaging reporting and data systems as grading criteria for objects of interest. For example, the Liver Imaging Reporting and Data System (LI-RADS) is used as the grading standard for liver cancer. The specific process is as follows: according to the LI-RADS criteria, each feature (the evaluation covers three major features and more than twenty ancillary features) is assessed to obtain a per-feature evaluation result, and these per-feature results are then comprehensively analyzed to give a final LI-RADS grade.
However, grading liver cancer with LI-RADS in this way is complicated and difficult, and the evaluation result is easily affected by the subjective errors of different physicians, which limits the clinical feasibility of applying the LI-RADS standard.
Disclosure of Invention
In order to solve the above technical problem, the present application provides an image processing method, an apparatus, a device and a storage medium.
In one aspect, the present application provides an image processing method, including:
acquiring a plurality of images to be processed of different phases;
performing feature classification on the images to be processed of the plurality of different phases based on a first neural network model, to obtain target feature classification results corresponding to the images;
performing size regression on the images to be processed of the plurality of different phases based on a second neural network model, to obtain target size regression results corresponding to the images;
and determining a first-level classification result corresponding to the images to be processed of the plurality of different phases based on the target feature classification results and the target size regression results.
Further, when the target feature comprises a first feature whose judgment depends on a plurality of first feature images, the first neural network model comprises a first feature classification model, and the feature classification step includes:
performing feature extraction on the plurality of first feature images based on a first feature extraction layer in the first feature classification model, to obtain feature extraction results corresponding to the first feature images;
fusing the feature extraction results corresponding to the first feature images based on a first fully connected layer in the first feature classification model, to obtain a feature fusion result corresponding to the first feature images;
and processing the feature fusion result based on the first feature classification model, to obtain a first feature classification result corresponding to the plurality of first feature images.
Further, when the target feature comprises a second feature whose judgment depends on a plurality of second feature images, the first neural network model comprises a second feature classification model, and the feature classification step includes:
performing feature extraction on the plurality of second feature images based on a second feature extraction layer in the second feature classification model, to obtain feature extraction results corresponding to the second feature images;
processing the feature extraction result of each second feature image based on a second fully connected layer in the second feature classification model, to obtain a feature classification result for each second feature image;
and processing the per-image feature classification results based on the second feature classification model, to obtain a second feature classification result corresponding to the plurality of second feature images.
Further, if the images to be processed of the plurality of different phases include a target image, then before the size regression is performed based on the second neural network model, the method further includes:
performing segmentation on the target image based on a third neural network model, to obtain a segmentation result corresponding to the target image.
Further, the size regression based on the second neural network model includes:
performing feature extraction on the segmentation result based on a third feature extraction layer in the second neural network model, to obtain a feature extraction result corresponding to the target image;
fusing the feature extraction result corresponding to the target image based on a third fully connected layer in the second neural network model, to obtain a size regression result corresponding to the target image;
and taking the size regression result corresponding to the target image as the target size regression result.
Further, after determining the first-level classification result based on the target feature classification results and the target size regression result, the method further includes:
acquiring a first feature vector corresponding to the first feature classification result, a second feature vector corresponding to the second feature classification result, and a third feature vector corresponding to the target size regression result, where the first feature vector is the feature vector input to the first fully connected layer, the second feature vector is the feature vector input to the second fully connected layer, and the third feature vector is the feature vector input to the third fully connected layer;
and fusing the first feature vector, the second feature vector and the third feature vector based on a fourth neural network model, to obtain a second-level classification result.
Further, fusing the first feature vector, the second feature vector and the third feature vector based on the fourth neural network model to obtain the second-level classification result includes:
processing the first feature vector based on a fourth fully connected layer and a first nonlinear activation layer in the fourth neural network model, to obtain a first nonlinear mapping result;
processing the second feature vector based on a fifth fully connected layer and a second nonlinear activation layer in the fourth neural network model, to obtain a second nonlinear mapping result;
processing the third feature vector based on a sixth fully connected layer and a third nonlinear activation layer in the fourth neural network model, to obtain a third nonlinear mapping result;
and fusing the first, second and third nonlinear mapping results based on a seventh fully connected layer in the fourth neural network model, to obtain the second-level classification result.
In another aspect, the present application provides an image processing apparatus, comprising:
the acquisition module is used for acquiring a plurality of images to be processed of different phases;
the classification module is used for performing feature classification on the images to be processed of the multiple different phases based on a first neural network model, to obtain target feature classification results corresponding to the images;
the regression module is used for performing size regression on the images to be processed of the multiple different phases based on a second neural network model, to obtain target size regression results corresponding to the images;
and the fusion module is used for determining the first-level classification result corresponding to the images to be processed of the different phases based on the target feature classification results and the target size regression results.
In another aspect, the present application provides an image processing device comprising a processor and a memory, where at least one instruction or program is stored in the memory and is loaded and executed by the processor to implement the image processing method described above.
In another aspect, the present application provides a computer-readable storage medium in which at least one instruction or program is stored, the instruction or program being loaded and executed by a processor to implement the image processing method described above.
The image processing method, apparatus, device and storage medium provided by the embodiments of the application first acquire a plurality of images to be processed of different phases, perform feature classification on them based on a first neural network model to obtain target feature classification results, perform size regression on them based on a second neural network model to obtain target size regression results, and then determine a first-level classification result (i.e., a preliminary classification result) from the target feature classification results and the target size regression results. On the one hand, performing feature classification on the images of different phases with the first neural network model improves the efficiency and accuracy of feature classification. On the other hand, computing the size of the target in the images of different phases with the second neural network model avoids the large measurement error, long time consumption and high cost of size measurement that relies on manual intervention, improving the efficiency and accuracy of size computation while reducing its cost. Through these two aspects, the embodiments of the application effectively improve the efficiency and accuracy of determining the first-level classification result and reduce the cost of determining it.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application and of the prior art, the drawings used in their description are briefly introduced below. The drawings described below show only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an image processing method according to an embodiment of the present application.
Fig. 2 is a schematic flowchart of performing feature classification on the images to be processed of multiple different phases based on the first neural network model, according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of a first feature classification model according to an embodiment of the present application.
Fig. 4 is another schematic flowchart of performing feature classification on the images to be processed of multiple different phases based on the first neural network model, according to an embodiment of the present application.
Fig. 5 is a schematic structural diagram of a second feature classification model according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a second neural network model provided in an embodiment of the present application.
Fig. 7 is a schematic flowchart of another image processing method according to an embodiment of the present application.
Fig. 8 is a schematic structural diagram of a third neural network provided in an embodiment of the present application.
Fig. 9 is a schematic flowchart for determining the second-level classification result according to an embodiment of the present application.
Fig. 10 is a schematic flowchart of fusing the first, second and third feature vectors based on the fourth neural network model to obtain the second-level classification result, according to an embodiment of the present application.
Fig. 11 is a schematic structural diagram of a multi-feature fused LI-RADS rating evaluation model provided in an embodiment of the present application.
Fig. 12 is a block diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application are described below clearly and completely with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present application.
It should be noted that the terms "first," "second," and the like in the description, claims and drawings of this application are used to distinguish between similar elements, not necessarily to describe a particular sequential or chronological order. Data so labeled are interchangeable where appropriate, so that the embodiments described herein can be practiced in orders other than those illustrated or described. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or server that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such a process, method, article, or apparatus.
Fig. 1 is a schematic flowchart of an image processing method provided in an embodiment of the present application. This specification presents the operation steps as in the embodiment or flowchart, but more or fewer steps may be included without inventive effort. The order of steps recited in the embodiments is merely one of many possible execution orders and does not represent the only order; in practice, a system or server product may execute the steps sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment). Specifically, as shown in fig. 1, the method may include:
s101, acquiring a plurality of images to be processed of different phases.
The "image to be processed" in the embodiment of the present application may be an image corresponding to an object of interest in a target object, which may be various parts of a human body including, but not limited to, a liver, a heart, a blood vessel, a lung, a stomach, a kidney, and the like. The object of interest may be, but is not limited to, a tumor, a lesion, a bleeding spot, etc.
The images to be processed of the multiple different phases in the embodiment of the present application may be multi-phase enhanced Computed Tomography (CT) images and/or enhanced Magnetic Resonance Imaging (MRI) images.
The "different phases" are distinguished according to whether or not a contrast medium is injected during the imaging process and the position of the injected contrast medium.
Taking the liver as the target object and a liver tumor as the object of interest, the "images to be processed of different phases" may include:
Plain-scan phase liver tumor image: a plain (non-contrast) scan, i.e., an image of the liver acquired without contrast agent injection.
Arterial phase liver tumor image: contrast agent is injected, and the liver is imaged while the contrast agent is in the hepatic artery.
Venous phase liver tumor image: contrast agent is injected, and the liver is imaged while the contrast agent is in the portal vein.
Delayed phase liver tumor image: an image of the liver acquired after the contrast agent has flowed out of the hepatic vessels.
S103, performing feature classification on the images to be processed of the multiple different phases based on a first neural network model, to obtain target feature classification results corresponding to the images.
In the embodiments of the application, feature classification can be performed on the images to be processed of multiple different phases based on the first neural network model, thereby obtaining the target feature classification results.
The first neural network model in this embodiment includes, but is not limited to, convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
In some embodiments, specific methods may be used to classify different features based on an existing scoring standard (e.g., the LI-RADS grading standard). The existing scoring standard is used as prior knowledge: the features required for classifying each imaging feature are extracted, and classification is performed on those extracted features. This avoids feeding images of different phases directly into a neural network model as separate input channels, which, under limited data, makes it hard to guarantee that the network learns effective image features and easily causes overfitting.
For example, suppose the target object has three imaging features (feature 1, feature 2 and feature 3). Since judging different features depends on different imaging phases, the relevant phases of each feature can be determined from the existing scoring standard (say feature 1 depends on phases A and B, feature 2 on phases A and C, and feature 3 on phases B and C); then features are extracted from phases A and B to classify feature 1, from phases A and C to classify feature 2, and from phases B and C to classify feature 3, as sketched below.
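As an illustration only, this prior-knowledge table can be read as a simple lookup consulted before feature extraction. The sketch below is hypothetical; the feature and phase names are placeholders, not identifiers from the patent:

```python
# Hypothetical sketch of the prior-knowledge lookup described above; names are placeholders.
FEATURE_PHASES = {
    "feature_1": ("phase_A", "phase_B"),
    "feature_2": ("phase_A", "phase_C"),
    "feature_3": ("phase_B", "phase_C"),
}

def select_inputs(images_by_phase: dict, feature: str) -> list:
    """Return only the phase images that the given feature's judgment depends on."""
    return [images_by_phase[phase] for phase in FEATURE_PHASES[feature]]
```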
In some embodiments, if it is determined according to the existing scoring standard that the target feature corresponding to the target object includes a first feature, and the judgment of the first feature depends on a plurality of first feature images contained in the images to be processed of the different phases, then the first neural network model may include a first feature classification model, the target feature classification result may include a first feature classification result, and the first feature classification result may be determined by the first feature classification model. Specifically, as shown in fig. 2, S103 may include:
and S103101, based on a first feature extraction layer in the first feature classification model, performing feature extraction on the plurality of first feature images to obtain feature extraction results corresponding to the plurality of first feature images.
And S103103, fusing the feature extraction results corresponding to the first feature images respectively based on the first full connection layer in the first feature classification model to obtain feature fusion results corresponding to the first feature images.
And S103105, processing the feature fusion result based on the first feature classification model to obtain the first feature classification results corresponding to the plurality of first feature images.
In this embodiment, the structure of the first feature classification model may be as shown in fig. 3. The model is built on a residual network (ResNet) and may include several independent sub-networks and one shared network. Each independent sub-network may include an input layer and two down-sampling modules (DownBlocks); the shared network includes two DownBlocks, a global average pooling layer (GAP), a fully connected layer (FC) and a classification layer (Softmax). A DownBlock may include a convolutional layer (Conv), a batch normalization layer (BN), a nonlinear activation layer (ReLU), a max pooling layer (MP) and a residual block (RES). The residual blocks use a bottleneck layer (BL) structure, which deepens the network so that the characteristics of the target of interest (such as a liver tumor) are extracted sufficiently while keeping model training efficient.
The independent sub-networks correspond to the first feature extraction layer in S103101, each being responsible for feature extraction from the image of one phase. The features extracted by the first feature extraction layer include, but are not limited to, features relevant to the feature judgment such as gray-level contrast and texture, and may also couple in information such as the shape and size of the target of interest; these features are mainly used for the positive/negative classification of the imaging feature. Continuing with fig. 3, because the DownBlocks in the independent sub-networks contain multiple pooling layers, the repeated pooling can, to some extent, alleviate misalignment of the target location (e.g., the liver tumor) between different phases.
It should be noted that the number of independent sub-networks is greater than or equal to the number of first feature images.
The first fully connected layer in S103103 is located in the shared network. It fuses the cross-phase features (i.e., the feature extraction results corresponding to the first feature images) and uses the fused features to judge the feature's category, improving the performance of the classification model and thus the accuracy of feature classification.
In this embodiment, the first feature classification model may output a positive or negative result for the first feature; that is, the first feature classification result may be that the first feature is positive or negative.
In this embodiment, to account for both the differences and the relations between images of different phases, the first feature classification model is designed according to the existing scoring standard (e.g., the LI-RADS standard): each phase keeps its own feature extraction sub-network while sharing part of the classification sub-network, so the features of different phases can reference each other even though they are extracted independently, further improving the accuracy of feature classification. A minimal sketch of such a model is given below.
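The following PyTorch sketch illustrates this multi-branch design under assumptions: the channel counts, the use of 3D convolutions, and the two-phase configuration are illustrative choices of ours, not values specified by the patent.

```python
import torch
import torch.nn as nn

class Bottleneck(nn.Module):
    """Residual bottleneck (BL): 1x1 reduce -> 3x3 -> 1x1 expand, with a skip connection."""
    def __init__(self, ch):
        super().__init__()
        mid = max(ch // 4, 1)
        self.body = nn.Sequential(
            nn.Conv3d(ch, mid, 1), nn.BatchNorm3d(mid), nn.ReLU(inplace=True),
            nn.Conv3d(mid, mid, 3, padding=1), nn.BatchNorm3d(mid), nn.ReLU(inplace=True),
            nn.Conv3d(mid, ch, 1), nn.BatchNorm3d(ch),
        )

    def forward(self, x):
        return torch.relu(x + self.body(x))

def down_block(in_ch, out_ch):
    """DownBlock: Conv -> BN -> ReLU -> max pooling -> residual bottleneck."""
    return nn.Sequential(
        nn.Conv3d(in_ch, out_ch, 3, padding=1),
        nn.BatchNorm3d(out_ch),
        nn.ReLU(inplace=True),
        nn.MaxPool3d(2),
        Bottleneck(out_ch),
    )

class FirstFeatureClassifier(nn.Module):
    """Independent per-phase branches feeding a shared trunk, then FC fusion and softmax."""
    def __init__(self, n_phases=2, n_classes=2):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(down_block(1, 16), down_block(16, 32)) for _ in range(n_phases)
        ])
        self.shared = nn.Sequential(down_block(32 * n_phases, 64), down_block(64, 128))
        self.gap = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(128, n_classes)  # first fully connected layer: fuses cross-phase features

    def forward(self, *phase_images):  # one (N, 1, D, H, W) volume per phase
        feats = [branch(img) for branch, img in zip(self.branches, phase_images)]
        fused = self.shared(torch.cat(feats, dim=1))
        return torch.softmax(self.fc(self.gap(fused).flatten(1)), dim=1)
```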
In some embodiments, if it is determined according to the existing scoring standard that the target feature corresponding to the target object includes a second feature, and the judgment of the second feature depends on a plurality of second feature images contained in the images to be processed of the different phases, then the first neural network model may include a second feature classification model, the target feature classification result may include a second feature classification result, and the second feature classification result may be determined by the second feature classification model. Specifically, as shown in fig. 4, S103 may include:
S103301, performing feature extraction on the plurality of second feature images based on a second feature extraction layer in the second feature classification model, to obtain feature extraction results corresponding to the second feature images.
S103303, processing the feature extraction result of each second feature image based on a second fully connected layer in the second feature classification model, to obtain a feature classification result for each second feature image.
S103305, processing the per-image feature classification results based on the second feature classification model, to obtain the second feature classification result corresponding to the plurality of second feature images.
In this embodiment, the structure of the second feature classification model may be as shown in fig. 5. The model is built on a residual network (ResNet) and may be a single-branch network comprising one feature extraction layer and one feature fusion layer. The feature extraction layer may include one input layer and two DownBlocks; the feature fusion layer may include two DownBlocks, one GAP, one FC and one Softmax layer. A DownBlock may contain Conv, BN and ReLU layers, a max pooling layer and a residual block (RES). The residual blocks use a bottleneck structure, which deepens the network so that the characteristics of the target location of interest (such as a liver tumor) are extracted sufficiently while keeping model training efficient.
The feature extraction layer corresponds to the second feature extraction layer in S103301. The features extracted by the second feature extraction layer include, but are not limited to, features related to gray-level contrast and texture, leaning more toward gray-level gradient features, and are mainly used for the positive/negative classification of the imaging feature. Continuing with fig. 5, since the DownBlocks in the second feature extraction layer contain multiple pooling layers, the repeated pooling can alleviate misalignment of the target location of interest (e.g., the liver tumor) between different phases.
The second fully connected layer in S103303 is the FC in the feature fusion layer in fig. 5. For each second feature image, it fuses the features extracted by the second feature extraction layer (i.e., the feature extraction results corresponding to the second feature images) and uses the fused features to judge the category of that second feature image.
In this embodiment, the second feature classification model can output a positive or negative result for the second feature; that is, the second feature classification result may be that the second feature is positive or negative.
It should be noted that a feature classification result is output for each second feature image, and in S103305 an OR relation is applied over these per-image results to obtain the final second feature classification result.
In summary, because the relations between phases differ across imaging features, the embodiments of the present application design a different feature extraction sub-network for each feature. This ensures that, when the network extracts features guided by the existing scoring standard (e.g., the LI-RADS standard), the extracted features are more effective and reliable, improving the accuracy of feature classification. A minimal sketch of the single-branch model and the OR combination is given below.
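A corresponding hedged sketch of the single-branch model and the OR combination, reusing the down_block helper from the previous sketch (channel counts again assumed):

```python
import torch
import torch.nn as nn

class SecondFeatureClassifier(nn.Module):
    """Single-branch network: one phase image in, per-phase positive/negative probabilities out.
    Reuses the down_block/Bottleneck helpers from the previous sketch."""
    def __init__(self, n_classes=2):
        super().__init__()
        self.extract = nn.Sequential(down_block(1, 16), down_block(16, 32))   # feature extraction layer
        self.fuse = nn.Sequential(down_block(32, 64), down_block(64, 128))    # feature fusion layer
        self.gap = nn.AdaptiveAvgPool3d(1)
        self.fc = nn.Linear(128, n_classes)  # second fully connected layer

    def forward(self, x):  # x: (N, 1, D, H, W)
        z = self.gap(self.fuse(self.extract(x))).flatten(1)
        return torch.softmax(self.fc(z), dim=1)

def combine_or(venous_positive_prob, delayed_positive_prob, threshold=0.5):
    """OR relation over the per-phase results: the feature is positive if either phase is positive."""
    return venous_positive_prob > threshold or delayed_positive_prob > threshold
```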
Hereinafter, S103101 to S103105 and S103301 to S103305 will be described by taking an example in which the target object is a liver and the target of interest is a liver tumor.
According to the LI-RADS criteria, liver tumors are assessed on three major features: non-rim arterial phase hyperenhancement (APHE), non-peripheral washout (washout), and enhancing capsule (capsule). Evaluating these three major features relies on the plain-scan, arterial, venous and delayed phase images. Specifically, the assessment of APHE depends on the plain-scan and arterial phase tumor images, the assessment of washout depends on the arterial and venous phase images, and the assessment of capsule depends on the venous and delayed phase images.
For the APHE and washout features, S103101-S103105 can be used for evaluation:
The evaluation of APHE depends on the tumor images of the plain-scan and arterial phases; if the tumor shows low signal in the plain-scan phase and high signal in the arterial phase, APHE is judged positive. The procedure for assessing APHE using S103101-S103105 may be as follows:
As described in S103101, the plain-scan phase liver tumor image is input into one of the sub-networks in fig. 3 and the arterial phase liver tumor image into the other, and each sub-network performs feature extraction on its image, yielding feature extraction results for the plain-scan and arterial phase images (mainly features relevant to the feature judgment, such as gray-level contrast and texture, possibly coupled with information such as the shape and size of the liver tumor). Next, as described in S103103, the feature extraction results of the two phases are fused by the first fully connected layer in fig. 3 to obtain a feature fusion result. Finally, as described in S103105, the positive/negative evaluation result of APHE is determined from the feature fusion result.
Taking the first feature classification model used to evaluate APHE as an example, the training process of the first feature classification model is as follows:
1. Data preprocessing
Select a sample plain-scan phase liver tumor image and a sample arterial phase liver tumor image. First, based on preset liver tumor regression boxes (bbox), take the largest physical size along each of the X, Y and Z directions across the bboxes of the four phases (plain-scan, arterial, venous and delayed) as the shared bbox size for the two sample images. Then crop the corresponding region of interest (ROI) with this shared size, centered on each image's own original bbox. Meanwhile, normalize the two images: because the APHE classification network must compare gray levels between its input images, the same window width and window level (computed from the mean and variance of the signal values of all pixels of the four phase images) are used for both, and each image's pixels are finally normalized to a preset range (for example, between -1 and 1). The cropped ROIs are then each resampled to a preset size (for example, 48 × 48). Finally, the images are augmented with rotation, translation, scaling and similar operations. A sketch of this preprocessing is given below.
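The sketch below assumes NumPy volumes, voxel-space bboxes and a 3-sigma windowing rule; the exact windowing formula is not spelled out in the patent:

```python
import numpy as np
from scipy.ndimage import zoom

def preprocess(images, centers, bbox_sizes, out_size=(48, 48, 48)):
    """Sketch of the preprocessing above; array layouts and the windowing rule are assumptions.

    images:     phase volumes to preprocess (e.g. plain-scan and arterial), as numpy arrays
    centers:    per-image bbox centers, in voxel coordinates
    bbox_sizes: bboxes of all four phases, as (x, y, z) extents
    """
    # 1. Shared bbox size: the largest extent along each of X, Y, Z over the four phases
    shared = np.max(np.asarray(bbox_sizes), axis=0)

    # 2. Crop each image's ROI around its own bbox center, using the shared size
    rois = []
    for img, center in zip(images, centers):
        lo = np.maximum(np.asarray(center) - shared // 2, 0).astype(int)
        hi = (lo + shared).astype(int)
        rois.append(img[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]].astype(np.float32))

    # 3. Same window width/level for every input (mean/std pooled over all pixels),
    #    normalizing into a preset range such as [-1, 1]
    pixels = np.concatenate([img.ravel() for img in images])
    mean, std = pixels.mean(), pixels.std()
    rois = [np.clip((r - mean) / (3.0 * std + 1e-8), -1.0, 1.0) for r in rois]

    # 4. Resample each ROI to the preset size
    rois = [zoom(r, np.asarray(out_size) / np.asarray(r.shape)) for r in rois]

    # 5. Rotation/translation/scaling augmentation would follow here (omitted)
    return rois
```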
2. Network training process
The preprocessed plain-scan phase and arterial phase liver tumor images are fed into the network for forward propagation (each branch takes the image of one phase as input), and the network outputs the predicted probabilities that the liver tumor is positive and negative. A classification loss is then computed against the positive/negative gold standard of the feature, and the loss is back-propagated to update the neural network weights, thereby obtaining the first feature classification model.
The loss function is the focal loss, given by

$FL(p_t) = -\alpha_t (1 - p_t)^{\gamma} \log(p_t)$,

where $p_t$ is the probability predicted by the neural network that a class-$t$ sample belongs to its true label (in this embodiment $t$ takes two values, negative and positive), $\alpha_t \in [0, 1]$ is a parameter balancing positive and negative samples, and $\gamma \in [1, 5]$ is a parameter balancing easy and hard samples.
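Under these definitions, a compact PyTorch implementation of the focal loss could look like this (the alpha and gamma defaults are common choices, not values stated in the patent):

```python
import torch

def focal_loss(probs, targets, alpha=0.25, gamma=2.0):
    """Focal loss FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).

    probs:   (N, 2) softmax outputs over (negative, positive)
    targets: (N,) integer class labels in {0, 1}
    """
    p_t = probs.gather(1, targets.unsqueeze(1)).squeeze(1).clamp_min(1e-7)
    alpha_t = alpha * targets.float() + (1.0 - alpha) * (1.0 - targets.float())
    return (-alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t)).mean()
```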
The evaluation of washout depends on the arterial and venous phases, and judging washout positive or negative likewise requires comparing the signal of the two phases. The positive/negative assessment of washout is similar to that of APHE; that is, washout can be assessed using the first feature classification model described above, and the details are not repeated here.
For the capsule feature, S103301-S103305 can be used for evaluation:
The assessment of capsule depends on the venous and delayed phases. Each phase can independently judge whether the liver tumor capsule is positive or negative: if either phase shows a positive sign, the final result is positive; only if both phases are negative is the final result negative. The two phases are therefore equivalent with respect to the final result.
The process of evaluating capsule using S103301-S103305 may be as follows:
As described in S103301, feature extraction is performed on the venous phase and delayed phase liver tumor images respectively, using the second feature extraction layer in fig. 5 (mainly extracting the ring-like enhancement features that determine the capsule sign, which are features related to gray-level gradient and texture, leaning toward gray-level gradient), to obtain feature extraction results for the two images. Next, as described in S103303, the second fully connected layer in fig. 5 processes these feature extraction results to obtain a capsule classification result for each of the venous and delayed phase images. Finally, as described in S103305, the second feature classification result is determined from the two per-phase capsule classification results.
If either of the per-phase capsule classification results is positive, the second feature classification result is positive; if both are negative, it is negative.
Taking the second feature classification model used to evaluate capsule as an example, the training process of the second feature classification model is as follows:
1. Data preprocessing
Select a sample venous phase liver tumor image and a sample delayed phase liver tumor image. First, based on preset bboxes, take the largest physical size along each of the X, Y and Z directions across the bboxes of the four phases (plain-scan, arterial, venous and delayed) as the shared bbox size for the two sample images. Then crop the corresponding ROI with this shared size, centered on each image's own original bbox. Meanwhile, normalize the two images: because the venous phase and delayed phase images are relatively independent and equivalent with respect to the final positive/negative judgment, each image is normalized using its own window width and window level, and each image's pixels are finally normalized to a preset range (for example, between -1 and 1). The cropped ROIs are then each resampled to a preset size (for example, 48 × 48). Finally, the images are augmented with rotation, translation, scaling and similar operations.
2. Network training process
The preprocessed venous phase and delayed phase liver tumor images are fed into the network for forward propagation, and the network outputs the predicted positive/negative probability for the image of each phase. A classification loss is then computed against the capsule positive/negative gold standard, and the loss is back-propagated to update the neural network weights, thereby obtaining the second feature classification model.
The loss function may be computed as

$FL(p_1) = -\alpha_1 (p_{V0} p_{D0})^{\gamma} \log(1 - p_{V0} p_{D0})$,
$FL(p_0) = -\alpha_0 (1 - p_{V0} p_{D0})^{\gamma} \log(p_{V0} p_{D0})$,

where $p_1$ and $p_0$ denote the probabilities that the capsule feature is ultimately positive and negative respectively, $p_{V0}$ and $p_{D0}$ denote the probabilities that the venous phase and the delayed phase respectively predict the capsule as negative, $\alpha_1$ and $\alpha_0$ are parameters balancing positive and negative samples, and $\gamma \in [1, 5]$ is a parameter balancing easy and hard samples.
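A hedged sketch of this OR-combined focal loss, directly following the two formulas above (the alpha and gamma defaults are illustrative, and probabilities are assumed to be tensors; torch is imported as in the previous snippet):

```python
def capsule_focal_loss(p_v0, p_d0, positive, alpha1=0.25, alpha0=0.75, gamma=2.0):
    """Focal loss for the OR-combined capsule feature.

    p_v0, p_d0: probabilities that the venous / delayed phase predicts the capsule negative.
    The lesion is negative overall only when both phases predict negative, so
    p(negative) = p_v0 * p_d0 and p(positive) = 1 - p_v0 * p_d0.
    """
    p_neg = (p_v0 * p_d0).clamp(1e-7, 1.0 - 1e-7)
    if positive:   # gold standard says the capsule is positive: FL(p1)
        return -alpha1 * p_neg ** gamma * torch.log(1.0 - p_neg)
    return -alpha0 * (1.0 - p_neg) ** gamma * torch.log(p_neg)  # FL(p0)
```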
S105, performing size regression on the images to be processed of the multiple different phases based on a second neural network model, to obtain target size regression results corresponding to the images.
The size of the object of interest is also one of the indispensable indicators when grading an object of interest such as a liver tumor. To avoid the wasted time and labor and the large human error of measuring the size manually, the size of the object of interest can be computed automatically based on the second neural network model, saving measurement time and cost while improving accuracy.
The size of the object of interest includes, but is not limited to, its diameter, length, width and height.
The second neural network model in the embodiments of the present application includes, but is not limited to, convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
Fig. 6 is a structural diagram of the second neural network model provided in an embodiment of the present application. As shown in fig. 6, the second neural network may be a two-dimensional regression model similar to a classification network such as LeNet, comprising three two-dimensional Conv/BN/ReLU modules, a GAP layer and an FC layer; a sketch follows.
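A minimal sketch of such a regressor, with assumed channel counts:

```python
import torch.nn as nn

class DiameterRegressor(nn.Module):
    """LeNet-like 2D regression model: three Conv/BN/ReLU modules, GAP, one FC."""
    def __init__(self):
        super().__init__()
        def conv_bn_relu(cin, cout):
            return nn.Sequential(
                nn.Conv2d(cin, cout, 3, padding=1),
                nn.BatchNorm2d(cout),
                nn.ReLU(inplace=True),
            )
        self.features = nn.Sequential(
            conv_bn_relu(1, 16), conv_bn_relu(16, 32), conv_bn_relu(32, 64)
        )
        self.gap = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, 1)  # third fully connected layer: regresses one size value

    def forward(self, x):  # x: (N, 1, H, W) probability-map slices
        return self.fc(self.gap(self.features(x)).flatten(1)).squeeze(1)
```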
In some embodiments, the plurality of images to be processed of different phases include a target image, and before S105, as shown in fig. 7, the method may further include:
and S104, carrying out segmentation processing on the target image based on a third neural network model to obtain a segmentation result corresponding to the target image.
Accordingly, as shown in fig. 7, the S105 may include:
and S10501, performing feature extraction on the segmentation result based on a third feature extraction layer in the second neural network model to obtain a feature extraction result corresponding to the target image.
And S10503, based on a third full-connection layer in the second neural network model, carrying out fusion processing on the feature extraction results corresponding to the target image to obtain a size regression result corresponding to the target image.
And S10505, taking the size regression result corresponding to the target image as the target size regression result.
In this embodiment, in order to improve the accuracy of determining the target size regression result and further improve the accuracy of determining the subsequent first-level classification result and the second-level classification result, a well-defined target image may be selected from a plurality of different-phase images to be processed, and as described in S104, before the target image is input into the second neural network model, the target image is segmented in advance based on the third neural network model, so as to obtain a segmentation result.
The third neural network in this embodiment includes, but is not limited to, convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
Fig. 8 is a schematic structural diagram of the third neural network provided in this embodiment. As shown in fig. 8, the third neural network may adopt a U-shaped three-dimensional segmentation network (U-Net) with bottleneck layers (BL) as its basic building block, and may include four down-sampling modules and four up-sampling modules with several BL sub-modules embedded in them. The down-sampling modules implement the encoding of the tumor image, and the up-sampling modules implement its decoding. The embedded BL sub-modules deepen the network so that the features of the target of interest are extracted more fully (mainly shape and size features, with gray-level texture features coupled in), while keeping model training efficient. Note that fig. 8 is only a schematic diagram and does not show all of the down-sampling and up-sampling modules.
The segmentation result in this embodiment may be a three-dimensional probability map (prob_map). After the prob_map is obtained, as described in S10501, feature extraction may be performed on it by the third feature extraction layer of the second neural network model (i.e., the three two-dimensional Conv/BN/ReLU modules in fig. 6), yielding the feature extraction result corresponding to the target image. Then, as described in S10503 to S10505, the feature extraction results are fused by the third fully connected layer in fig. 6 to obtain the target size regression result.
In one possible embodiment, the prob_map may be fed into the second neural network model layer by layer (moving along the Z axis of the image, each XY plane is taken out and sent to the two-dimensional network for regression); each slice is propagated forward through the network independently to obtain a corresponding size of the object of interest, and the maximum of the sizes regressed over all slices is taken as the target size regression result.
The output of the second neural network model in this embodiment can provide information such as diameter and shape to the subsequent models, improving the accuracy of the subsequent second-level classification result. A sketch of the slice-wise regression inference is given below.
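The slice-wise inference can be sketched as follows; the (Z, H, W) layout of prob_map is an assumption:

```python
import torch

def regress_size(prob_map, regressor):
    """Feed the 3D probability map slice by slice along Z and keep the maximum regression.

    prob_map: (Z, H, W) tumor probability map output by the segmentation network.
    """
    slices = torch.as_tensor(prob_map, dtype=torch.float32).unsqueeze(1)  # (Z, 1, H, W)
    with torch.no_grad():
        per_slice = regressor(slices)   # one regressed size per XY plane
    return per_slice.max().item()       # target size regression result
```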
Hereinafter, S104 and S10501 to S10505 will be described by taking as an example that the target object is a liver, the target of interest is a liver tumor, and the size is a diameter.
Because the tumor boundary is well-defined in the delayed phase image, the delayed phase liver tumor image can be selected for computing the liver tumor diameter, improving the accuracy of the diameter calculation.
The delayed phase liver tumor image is segmented by the third neural network in fig. 8 to obtain its prob_map. The prob_map is fed into the second neural network model in fig. 6 layer by layer (moving along the Z axis of the image, each XY plane is taken out and sent to the model for regression); each slice is propagated forward through the network independently to obtain a corresponding liver tumor diameter, and the maximum of the diameters regressed over all slices is taken as the liver tumor diameter.
Taking the segmentation of the delayed phase liver tumor image as an example, the training process of the third neural network is as follows:
1. Data preprocessing
Select a sample delayed phase liver tumor image. Based on preset bboxes, take the largest physical size along each of the X, Y and Z directions across the bboxes of the four phases (plain-scan, arterial, venous and delayed) as the bbox size for the sample image. Then crop the corresponding ROI with this bbox size, centered on the image's original bbox, and adaptively normalize the image so that its pixels fall in a preset range (for example, between -1 and 1). The cropped ROI is then resampled to a preset size (for example, 48 × 48). Finally, the image is augmented with rotation, translation, scaling and similar operations.
2. Network training process
The preprocessed sample delayed phase liver tumor image is fed into the network for forward propagation, and two probability maps of the same size as the original image are output: one is the probability map predicted as foreground (i.e., the tumor region), the other the probability map predicted as background (the non-tumor region). The segmentation loss is then computed against the delayed phase gold standard, and the loss is back-propagated to update the neural network weights, thereby obtaining the third neural network model.
Taking the liver tumor diameter calculation on the delayed phase liver tumor image as an example, the training process of the second neural network is as follows:
1. Data preprocessing
Select a sample delayed phase liver tumor image. Based on preset bboxes, take the largest physical size along each of the X, Y and Z directions across the bboxes of the four phases (plain-scan, arterial, venous and delayed) as the bbox size for the sample image. Then crop the delayed phase ROI with this bbox size, centered on the sample image's original bbox, and adaptively normalize the image so that its pixels fall in a preset range (for example, between -1 and 1). The cropped ROI is then resampled to a preset size (for example, 48 × 48). Finally, the image is augmented with rotation, translation, scaling and similar operations.
2. Network training process
The preprocessed image is sent to the segmentation network (the third neural network model) for segmentation, with the segmentation network set to evaluation mode (a mode in which its parameters are not updated). Based on the segmentation result for the sample delayed phase liver tumor image, and taking the Z axis as reference, the probability map of each XY plane is fed into the regression network for forward propagation to obtain a regression result; the maximum over all XY-plane regression results is taken as the tumor diameter, and the loss is computed from this diameter and the annotated tumor diameter gold standard. The loss is then back-propagated to update the neural network weights, thereby obtaining the second neural network model. A sketch of one such training step is given below.
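One such training step might be sketched as follows; the output shape of the segmentation network and the use of an MSE loss are assumptions (the patent only states that a loss coefficient is computed between the regressed diameter and the gold standard):

```python
import torch

def regression_train_step(regressor, seg_net, volume, gt_diameter, optimizer):
    """One training step for the regression network with the segmentation network frozen."""
    seg_net.eval()                              # evaluation mode: segmentation weights are not updated
    with torch.no_grad():
        prob_map = seg_net(volume)              # assumed to return a (Z, H, W) foreground map
    slices = prob_map.unsqueeze(1)              # (Z, 1, H, W): one XY plane per row
    pred = regressor(slices).max()              # max over all XY-plane regressions = diameter
    loss = torch.nn.functional.mse_loss(pred, gt_diameter)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```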
S107, determining the first-level classification result corresponding to the images to be processed of the multiple different phases based on the target feature classification results and the target size regression result.
In the embodiments of the present application, if the target of interest has further associated features (for example, a liver tumor has more than twenty ancillary features in addition to the three major features APHE, washout and capsule), the first-level classification result may be a preliminary grading result (e.g., a preliminary LI-RADS grade).
In some embodiments, the first-level classification result may be directly mapped from the target feature classification results and the target size regression result.
In other embodiments, to improve the accuracy of the grading, the features extracted by the several models (the first feature classification model, the second feature classification model and the second neural network model) may be fused to determine a second-level classification result (i.e., the final grading result, such as the final LI-RADS grade). Accordingly, as shown in fig. 9, after S107 the method may further include: S109, determining the second-level classification result. The determination may include:
S10901, acquiring a first feature vector corresponding to the first sign classification result, a second feature vector corresponding to the second sign classification result and a third feature vector corresponding to the target size regression result; the first feature vector is the feature vector input into the first fully-connected layer, the second feature vector is the feature vector input into the second fully-connected layer, and the third feature vector is the feature vector input into the third fully-connected layer.
S10903, fusing the first feature vector, the second feature vector and the third feature vector based on a fourth neural network model to obtain a second grade classification result.
In this embodiment, as shown in fig. 10, S10903 may include:
S109031, processing the first feature vector based on a fourth fully-connected layer and a first nonlinear activation layer in the fourth neural network model to obtain a first nonlinear mapping result.
S109033, processing the second feature vector based on a fifth fully-connected layer and a second nonlinear activation layer in the fourth neural network model to obtain a second nonlinear mapping result.
S109035, processing the third feature vector based on a sixth fully-connected layer and a third nonlinear activation layer in the fourth neural network model to obtain a third nonlinear mapping result.
S109037, performing fusion processing on the first nonlinear mapping result, the second nonlinear mapping result and the third nonlinear mapping result based on a seventh fully-connected layer in the fourth neural network model to obtain the second grade classification result.
In this embodiment, as described in S10901, the feature vectors immediately before the corresponding fully-connected layers in the first sign classification model, the second sign classification model and the second neural network model may be extracted, and, as described in S10903, fused based on the fourth neural network model to obtain the second grade classification result. The features entering a fully-connected layer combine the deep features produced by the multi-layer convolutions over the image with the shallow features, and at the same time a final classification or regression result can be obtained from them through a simple linear mapping; they are therefore the features that best characterize whether a given sign is negative or positive. Fusing the feature vectors that each model feeds into its fully-connected layer can thus improve the accuracy of the second grade classification result.
The fourth neural network model may include a number of FC/ReLU layers (i.e., the fourth fully-connected layer with the first nonlinear activation layer, the fifth fully-connected layer with the second nonlinear activation layer, and the sixth fully-connected layer with the third nonlinear activation layer) and one FC layer (i.e., the seventh fully-connected layer); the number of FC/ReLU layers may be greater than or equal to the total number of the first sign classification model, the second sign classification model and the second neural network model. Multiple nonlinear ReLU layers are introduced because the mapping from the LI-RADS signs to the final grade is considered nonlinear. Accordingly, as described in S109031-S109037, the feature vectors taken before the different fully-connected layers may each be processed by an FC/ReLU layer of the fourth neural network model, and the processed results are then fused by the seventh fully-connected layer to obtain the second grade classification result.
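A minimal sketch of such a fusion head, assuming three upstream feature vectors with illustrative dimensions (the embodiment does not specify them), could look like this:

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Sketch of the fourth neural network model: one FC/ReLU branch per
    upstream feature vector, then one FC layer over the concatenation.
    All dimensions are illustrative assumptions."""

    def __init__(self, dims=(256, 256, 64), hidden=128, num_grades=7):
        super().__init__()
        # One FC/ReLU pair per input vector (the 4th/5th/6th fully-connected
        # layers with their nonlinear activation layers).
        self.branches = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, hidden), nn.ReLU()) for d in dims]
        )
        # The seventh fully-connected layer fuses the branch outputs.
        self.fuse = nn.Linear(hidden * len(dims), num_grades)

    def forward(self, vec1, vec2, vec3):
        feats = [b(v) for b, v in zip(self.branches, (vec1, vec2, vec3))]
        return self.fuse(torch.cat(feats, dim=1))  # concatenate along channels
```

A `FusionHead` instance maps the three vectors to the grade scores in a single forward pass.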
Compared with directly mapping the several sign classification results or the diameter regression result to the first grade classification result, fusing features through the fourth neural network model has the following advantages: when multiple features are fused, the main features are retained while the network also learns other useful features and automatically learns the weight of each feature; moreover, the feature-fusion approach removes the need to manually choose classification thresholds for the different signs, reducing human error.
Hereinafter, S10901-S10903 and S109031-S109037 will be described with reference to an example in which the target object is a liver and the target of interest is a liver tumor.
Fig. 11 is a schematic structural diagram of the multi-feature-fusion LI-RADS grade evaluation model provided in the embodiment of the present application. As described in S10901, a first feature vector input into the first fully-connected layer (including the feature vector input into the first fully-connected layer of the first sign classification model that evaluates APHE and the feature vector input into the first fully-connected layer of the first sign classification model that evaluates washout), a second feature vector input into the second fully-connected layer, and a third feature vector input into the third fully-connected layer may be obtained. Then, as described in S109031-S109037, the first, second and third feature vectors may be processed by the FC/ReLU layers of the fourth neural network model, and the processed results fused by the seventh fully-connected layer to obtain the second grade classification result. The second grade classification result may cover the seven LI-RADS categories: LR-1 (definitely benign), LR-2 (probably benign), LR-3 (intermediate probability of malignancy), LR-4 (probably HCC), LR-5 (definitely HCC), LR-TIV (tumor in vein), and LR-M (probably or definitely malignant but not HCC-specific).
Taking the grade classification of a liver tumor as an example, the training process of the fourth neural network model is described below:
1. Data pre-processing
A sample plain-scan-phase liver tumor image, a sample arterial-phase liver tumor image, a sample venous-phase liver tumor image and a sample delayed-phase liver tumor image are selected. Based on the preset bboxes, the largest size along each of the X, Y and Z directions among the bboxes of the four phases is taken as the common bbox size for the four phases. The ROI of each phase is then cropped at this common size, centered on the original bbox of that phase. Two normalizations are then applied to each phase image: one normalizes the pixels to a preset range (for example, -1 to 1) using the same window level for all phases (computed from the mean and variance of the signal values of all pixels of the four phase images), and the other normalizes the pixels to a preset range (for example, -1 to 1) using the window level of each individual phase. The cropped ROI is then resampled to a preset size (48 × 48 pixels). Finally, augmentation such as rotation, translation and scaling is applied to the images.
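The two normalization schemes could be sketched as follows; the mean/3-sigma windowing and the clipping to [-1, 1] are assumptions, since the embodiment only states that the shared window level is computed from the mean and variance of all pixels:

```python
import numpy as np

def dual_normalize(phases):
    """Apply the two normalization schemes to the four phase images.

    `phases` is a list of four 3-D arrays (plain-scan, arterial, venous,
    delayed). The exact windowing formula is an illustrative assumption.
    """
    # Shared window level over the pixels of all four phase images.
    all_pixels = np.concatenate([p.ravel() for p in phases]).astype(np.float32)
    mu, sigma = all_pixels.mean(), all_pixels.std() + 1e-6

    shared, per_phase = [], []
    for p in phases:
        p = p.astype(np.float32)
        shared.append(np.clip((p - mu) / (3 * sigma), -1, 1))  # same window level
        m, s = p.mean(), p.std() + 1e-6
        per_phase.append(np.clip((p - m) / (3 * s), -1, 1))    # own window level
    return shared, per_phase
```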
2. Network training process
The segmentation network (i.e., the third neural network model), the two-dimensional regression network (i.e., the second neural network model) and the three sign classification networks (i.e., the first sign classification models and the second sign classification model) are each set to evaluation mode, that is, the weights of these neural networks are fixed. Then, following the input scheme used when each model was trained (the preprocessing and input schemes have been described above and are not repeated here), the appropriately normalized image of the corresponding phase is automatically selected and fed into each sub-network for forward propagation. The input of each sub-network's FC layer is then extracted and fed into the corresponding FC/ReLU layer; the features output by the ReLU layers are concatenated (concat) along the channel dimension, and the final LI-RADS grade is obtained through one FC layer. Finally, the loss coefficient is calculated against the LI-RADS-grade gold standard of the tumor, the loss is back-propagated, and the weights of the neural network are updated, thereby obtaining the fourth neural network model.
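Putting the pieces together, one fusion-training step might look like the following sketch, which reuses the `FusionHead` from above; the random stand-in vectors and their sizes replace the extraction of the FC-layer inputs from the frozen sub-networks, which is omitted here:

```python
import torch
import torch.nn as nn

fusion = FusionHead()                 # sketch defined after S109037 above
optimizer = torch.optim.Adam(fusion.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# Stand-ins for the vectors entering the FC layers of the frozen sub-networks
# (in practice they come from the sign, segmentation and regression networks
# running in evaluation mode); the sizes are illustrative.
vec1 = torch.randn(1, 256)            # sign-classification branch
vec2 = torch.randn(1, 256)            # second sign-classification branch
vec3 = torch.randn(1, 64)             # diameter-regression branch
grade_gt = torch.tensor([4])          # gold-standard LI-RADS grade index

logits = fusion(vec1, vec2, vec3)     # FC/ReLU branches -> concat -> final FC
loss = criterion(logits, grade_gt)    # loss against the grade gold standard
loss.backward()                       # back-propagate and ...
optimizer.step()                      # ... update only the fusion weights
```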
As described above, in the embodiment of the present application, classification of the three main signs and diameter prediction are performed on the three-dimensional enhanced CT/MR liver tumor images of the different phases, and the features of the diameter regression model and of the classification models are then extracted as the input of the fourth neural network model, so that the final LI-RADS grade can be obtained by training. Since the learning of the three main signs and of the tumor size described above is strongly supervised, the features extracted by each network have the ability to classify the three main signs and to regress the tumor size, and a preliminary LI-RADS grade can already be obtained from the three main signs and the tumor size.

However, in order to improve the accuracy of the LI-RADS grade determination, an auxiliary judgment may be made based on certain secondary signs in the image (for example, a tumor may show structures such as halo enhancement or nodule formation): if secondary signs supporting a benign tumor appear, one grade is subtracted from the preliminary grade, and if secondary signs supporting a malignant tumor appear, one grade is added to it (secondary signs may not raise the preliminary grade to LR-5).

Owing to the preceding image learning, the fully-connected layer of each sub-network contains both the features that classify the main signs and some auxiliary features helpful for distinguishing the secondary signs. In other words, the three main-sign classification models act as guides for feature learning: under their guidance, the features relevant to the main signs are extracted in a targeted manner, while the extracted features still carry information about the whole image, with the secondary features remaining hidden within them. Through the feature-fusion process, with the final LI-RADS grade as the target, the network can automatically recover these secondary features on the basis of the existing features, which makes this way of training the LI-RADS grade classification model more efficient and reliable.
As shown in fig. 12, an embodiment of the present application further provides an image processing apparatus, which may include:
the acquiring module 201 may be configured to acquire a plurality of images to be processed in different phases.
The classification module 203 may be configured to perform, based on the first neural network model, sign classification processing on the to-be-processed images of the multiple different phases, so as to obtain target sign classification results corresponding to the to-be-processed images of the multiple different phases.
In some embodiments, the to-be-processed images of the multiple different phases include a plurality of first sign images, the first neural network model includes a first sign classification model, the target sign classification result includes a first sign classification result, and the classification module 203 may include:
The first feature extraction unit may be configured to perform feature extraction on the plurality of first sign images based on a first feature extraction layer in the first sign classification model, so as to obtain feature extraction results corresponding to the plurality of first sign images.
The first fully-connected layer processing unit may be configured to fuse, based on a first fully-connected layer in the first sign classification model, the feature extraction results corresponding to the plurality of first sign images, to obtain feature fusion results corresponding to the plurality of first sign images.
The first sign classification result determining unit may be configured to process the feature fusion results based on the first sign classification model to obtain the first sign classification results corresponding to the plurality of first sign images.
In some embodiments, the to-be-processed images of the multiple different phases include a plurality of second sign images, the first neural network model includes a second sign classification model, the target sign classification result includes a second sign classification result, and the classification module 203 may include:
The second feature extraction unit may be configured to perform feature extraction on the plurality of second sign images based on a second feature extraction layer in the second sign classification model, so as to obtain feature extraction results corresponding to the plurality of second sign images.
The second fully-connected layer processing unit may be configured to process, based on a second fully-connected layer in the second sign classification model, the feature extraction results corresponding to the plurality of second sign images, to obtain sign classification results corresponding to each of the plurality of second sign images.
The second sign classification result determining unit may be configured to process, based on the second sign classification model, the sign classification results corresponding to each of the plurality of second sign images, to obtain the second sign classification result corresponding to the plurality of second sign images.
The regression module 205 may be configured to perform size regression processing on the to-be-processed images of the multiple different phases based on the second neural network model to obtain target size regression results corresponding to the to-be-processed images of the multiple different phases.
In this embodiment of the application, the multiple images to be processed in different phases include a target image, and the apparatus may further include: a segmentation module, which may be used to perform segmentation processing on the target image based on a third neural network model to obtain a segmentation result corresponding to the target image.
Accordingly, the regression module 205 may include:
and the third feature extraction unit may be configured to perform feature extraction on the segmentation result based on a third feature extraction layer in the second neural network model to obtain a feature extraction result corresponding to the target image.
And the third full-connection layer processing unit may be configured to perform fusion processing on the feature extraction result corresponding to the target image based on the third full-connection layer in the second neural network model, so as to obtain a size regression result corresponding to the target image.
The target size regression result determination unit may be configured to use a size regression result corresponding to the target image as the target size regression result.
The fusion module 207 may be configured to determine the first grade classification result corresponding to the to-be-processed images of the plurality of different phases based on the target sign classification result and the target size regression result.
In some embodiments, the apparatus may include a second grade classification result determination module, which may include:
a feature vector obtaining unit, configured to obtain a first feature vector corresponding to the first sign classification result, a second feature vector corresponding to the second sign classification result, and a third feature vector corresponding to the target size regression result; the first feature vector is the feature vector input into the first fully-connected layer, the second feature vector is the feature vector input into the second fully-connected layer, and the third feature vector is the feature vector input into the third fully-connected layer.
The classification result determining unit may be configured to perform fusion processing on the first feature vector, the second feature vector, and the third feature vector based on a fourth neural network model to obtain a second grade classification result.
In some embodiments, the classification result determining unit may include:
the first nonlinear processing subunit may be configured to process the first feature vector based on a fourth fully-connected layer and a first nonlinear activation layer in the fourth neural network model, so as to obtain a first nonlinear mapping result.
The second nonlinear processing subunit may be configured to process the second feature vector based on a fifth fully-connected layer and a second nonlinear activation layer in the fourth neural network model, so as to obtain a second nonlinear mapping result.
The third nonlinear processing subunit may be configured to process the third feature vector based on a sixth fully-connected layer and a third nonlinear activation layer in the fourth neural network model, so as to obtain a third nonlinear mapping result.
A fourth nonlinear processing subunit, configured to perform fusion processing on the first nonlinear mapping result, the second nonlinear mapping result, and the third nonlinear mapping result based on a seventh fully-connected layer in the fourth neural network model, so as to obtain the second grade classification result.
It should be noted that the device embodiments in the embodiments of the present application are based on the same inventive concept as the method embodiments described above.
The embodiment of the present application further provides an image processing apparatus, which includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the image processing method provided by the above method embodiment.
Embodiments of the present application further provide a computer-readable storage medium, which may be disposed in a terminal to store at least one instruction or at least one program for implementing an image processing method according to the method embodiments, where the at least one instruction or the at least one program is loaded and executed by a processor to implement the image processing method according to the method embodiments.
Alternatively, in an embodiment of the present application, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: various media capable of storing program codes, such as a usb disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk.
The memory according to the embodiments of the present application may be used to store software programs and modules, and the processor may execute various functional applications and data processing by running the software programs and modules stored in the memory. The memory may mainly comprise a program storage area and a data storage area, wherein the program storage area may store an operating system, application programs needed by functions, and the like; the data storage area may store data created according to use of the apparatus, and the like. Further, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory may also include a memory controller to provide the processor access to the memory.
The image processing method, the image processing device, the image processing apparatus and the storage medium provided by the embodiment of the application have the following beneficial effects:
1) Sign classification is performed on the images to be processed of the different phases based on the first neural network model, which can improve the efficiency and accuracy of sign classification. In addition, the size of the images to be processed of the different phases is calculated based on the second neural network model, which avoids the large errors, long time consumption and high cost of size calculation based on manual intervention, improving the efficiency and accuracy of size calculation and reducing its cost. Through these two aspects, the embodiment of the application effectively improves the efficiency and accuracy of determining the first grade classification result and reduces the cost of determining it.
2) In the embodiment of the application, an existing evaluation grade (e.g., the LI-RADS grade) is used as prior knowledge for training the deep-learning networks: the important features of the existing evaluation grade are learned first, and the extracted features are then fused in training to obtain the second grade classification result (i.e., the final grade classification result, e.g., the final LI-RADS grade). On the one hand, this artificially guided training increases the interpretability of the neural network, otherwise a black-box model, and thus the reliability of the model used to evaluate the second grade classification result; on the other hand, this efficient training scheme gives the model better performance.
3) In the embodiment of the application, the three main signs of the three-dimensional enhanced CT/MR images of the different phases are classified and the size is predicted, and the features of the second neural network model and of the first neural network model are then extracted as the input of the fourth neural network model, so that the final second grade classification result can be obtained by training. On the one hand, fusing the extracted effective features through the fourth neural network model ensures that, on top of the main-sign features, the model extracts from the tumor image further secondary features helpful for evaluating the second grade classification result, improving the accuracy of the second grade classification result (i.e., the final grade classification result); on the other hand, the size of the target of interest (such as a liver tumor) no longer needs to be measured manually, which improves the efficiency and accuracy of its size calculation and reduces the associated cost.
It should be noted that: the sequence of the embodiments of the present application is only for description, and does not represent the advantages and disadvantages of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the device and server embodiments, since they are substantially similar to the method embodiments, the description is simple, and the relevant points can be referred to the partial description of the method embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. An image processing method, characterized in that the method comprises:
acquiring a plurality of images to be processed of different phases;
performing sign classification processing on the to-be-processed images of the multiple different phases based on a first neural network model to obtain target sign classification results corresponding to the to-be-processed images of the multiple different phases;
performing size regression processing on the to-be-processed images of the multiple different phases based on a second neural network model to obtain target size regression results corresponding to the to-be-processed images of the multiple different phases;
and determining a first grade classification result corresponding to the images to be processed of the plurality of different phases based on the target sign classification result and the target size regression result.
2. The method according to claim 1, wherein the to-be-processed images of the multiple different phases include a plurality of first sign images, the first neural network model includes a first sign classification model, and the target sign classification results include a first sign classification result; the performing sign classification processing on the to-be-processed images of the multiple different phases based on the first neural network model to obtain the target sign classification results corresponding to the to-be-processed images of the multiple different phases then includes:
performing feature extraction on the plurality of first sign images based on a first feature extraction layer in the first sign classification model to obtain feature extraction results corresponding to the plurality of first sign images;
fusing the feature extraction results corresponding to the plurality of first sign images based on a first fully-connected layer in the first sign classification model to obtain feature fusion results corresponding to the plurality of first sign images;
and processing the feature fusion results based on the first sign classification model to obtain the first sign classification results corresponding to the plurality of first sign images.
3. The method according to claim 2, wherein the to-be-processed images of the multiple different phases include a plurality of second sign images, the first neural network model includes a second sign classification model, and the target sign classification results include a second sign classification result; the performing sign classification processing on the to-be-processed images of the multiple different phases based on the first neural network model to obtain the target sign classification results corresponding to the to-be-processed images of the multiple different phases then includes:
performing feature extraction on the plurality of second sign images based on a second feature extraction layer in the second sign classification model to obtain feature extraction results corresponding to the plurality of second sign images;
processing the feature extraction results corresponding to the plurality of second sign images based on a second fully-connected layer in the second sign classification model to obtain sign classification results corresponding to each of the plurality of second sign images;
and processing the sign classification results corresponding to each of the plurality of second sign images based on the second sign classification model to obtain the second sign classification result corresponding to the plurality of second sign images.
4. The method according to claim 3, wherein the images to be processed of the plurality of different phases include a target image, and before performing size regression processing on the images to be processed of the plurality of different phases based on the second neural network model to obtain target size regression results corresponding to the images to be processed of the plurality of different phases, the method further comprises:
and carrying out segmentation processing on the target image based on a third neural network model to obtain a segmentation result corresponding to the target image.
5. The method according to claim 4, wherein the performing size regression processing on the to-be-processed images of the plurality of different phases based on the second neural network model to obtain target size regression results corresponding to the to-be-processed images of the plurality of different phases comprises:
performing feature extraction on the segmentation result based on a third feature extraction layer in the second neural network model to obtain a feature extraction result corresponding to the target image;
based on a third fully-connected layer in the second neural network model, carrying out fusion processing on the feature extraction result corresponding to the target image to obtain a size regression result corresponding to the target image;
and taking the size regression result corresponding to the target image as the target size regression result.
6. The method of claim 5, wherein after determining the first grade classification result corresponding to the images to be processed of the plurality of different phases based on the target sign classification result and the target size regression result, the method further comprises:
acquiring a first feature vector corresponding to the first feature classification result, a second feature vector corresponding to the second feature classification result and a third feature vector corresponding to the target size regression result; the first feature vector is a feature vector input into the first fully-connected layer, the second feature vector is a feature vector input into the second fully-connected layer, and the third feature vector is a feature vector input into the third fully-connected layer;
and performing fusion processing on the first feature vector, the second feature vector and the third feature vector based on a fourth neural network model to obtain a second grade classification result.
7. The method of claim 6, wherein the fusing the first feature vector, the second feature vector, and the third feature vector based on the fourth neural network model to obtain the second grade classification result comprises:
processing the first feature vector based on a fourth fully-connected layer and a first nonlinear activation layer in the fourth neural network model to obtain a first nonlinear mapping result;
processing the second feature vector based on a fifth fully-connected layer and a second nonlinear activation layer in the fourth neural network model to obtain a second nonlinear mapping result;
processing the third feature vector based on a sixth fully-connected layer and a third nonlinear activation layer in the fourth neural network model to obtain a third nonlinear mapping result;
and performing fusion processing on the first nonlinear mapping result, the second nonlinear mapping result and the third nonlinear mapping result based on a seventh fully-connected layer in the fourth neural network model to obtain the second grade classification result.
8. An image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring a plurality of images to be processed of different phases;
the classification module is used for carrying out sign classification processing on the images to be processed of the multiple different phases based on a first neural network model to obtain target sign classification results corresponding to the images to be processed of the multiple different phases;
the regression module is used for carrying out size regression processing on the to-be-processed images of the multiple different phases based on a second neural network model to obtain target size regression results corresponding to the to-be-processed images of the multiple different phases;
and the fusion module is used for determining the first grade classification result corresponding to the to-be-processed images of the different phases based on the target sign classification result and the target size regression result.
9. An image processing apparatus, characterized in that the apparatus comprises: a processor and a memory, the memory having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by the processor to implement the image processing method according to any one of claims 1 to 7.
10. A computer-readable storage medium, in which at least one instruction or at least one program is stored, which is loaded and executed by a processor to implement the image processing method according to any one of claims 1 to 7.