CN117974832A - Multi-modal liver medical image expansion algorithm based on a generative adversarial network - Google Patents

Info

Publication number: CN117974832A (application CN202410384429.9A; granted and published as CN117974832B)
Authority: CN (China); original language: Chinese (zh)
Inventors: 陈英, 林洪平, 张伟, 陈旺, 周宗来, 彭坤, 雷飞洋
Applicant and current assignee: Nanchang Hangkong University (assignee and legal-status information as listed by Google Patents, which performs no legal analysis)
Legal status: Active (granted)
Classification landscape: Image Analysis (AREA)
Abstract

The invention discloses a multi-modal liver medical image expansion algorithm based on a generative adversarial network, comprising the following steps: acquiring a data set and preprocessing it; creating a training data set; constructing a multi-scale residual connection module; constructing a multi-level loss function module; constructing a channel-enhanced self-attention module; constructing a gated convolution module; combining the multi-scale residual connection module, the multi-level loss function module, the channel-enhanced self-attention module, and the gated convolution module; constructing a multi-modal liver medical image data set expansion model; and carrying out experiments on and evaluation of the model. The method can efficiently generate a large number of high-quality multi-modal liver medical images for expanding existing medical image data, alleviating the current scarcity of publicly available liver medical image data.

Description

Multi-modal liver medical image expansion algorithm based on a generative adversarial network
Technical Field
The invention relates to the field of medical image data set expansion, and in particular to a multi-modal liver medical image expansion algorithm based on a generative adversarial network.
Background
With the continued development of deep learning techniques, medical image data sets such as computed tomography, magnetic resonance imaging, and positron emission tomography have been widely used to train computer-aided diagnosis systems. Such systems play an important role in the medical field: they provide powerful support for doctors and help improve the accuracy, efficiency, and comprehensiveness of medical diagnosis, making them indispensable tools in modern diagnostic practice. However, owing to the strict protection of patient privacy, ethical and legal regulations, high data acquisition costs, and the compliance requirements of medical institutions, publicly available medical image data sets are quite scarce. This scarcity limits the training and performance of the deep learning models integrated in computer-aided diagnosis systems, because these models typically require large amounts of labeled data to learn effective feature representations and patterns. Solving the scarcity of publicly available medical image data sets is therefore of great importance for driving medical image diagnosis toward deeper and broader applications.
Medical image generation methods play a key role in addressing this scarcity: they can synthesize additional virtual images from an existing limited data set, thereby expanding the training data. Generative models, and generative adversarial networks in particular, can learn and simulate the distribution of the original data and produce images that are similar to, yet measurably different from, real data. Such images increase a model's adaptability to different scenes and variations and improve its generalization ability. With these methods, researchers can build stronger and more widely applicable deep learning models from smaller data sets, design computer-aided diagnosis systems with better performance, improve doctors' diagnostic accuracy, and reduce misdiagnosis and missed diagnosis.
However, existing medical image generation methods based on generative adversarial networks generally synthesize an entire brand-new medical image directly, which places high demands on the network's feature learning capability; at the same time, the available data volume is rarely sufficient to drive a generative model with a large number of parameters, so the generated images usually lack adequate structural information and are of low quality. Boundaries of some regions in the generated image then become blurred or distorted, which can completely change the meaning of the image, making the generated medical images differ too much from the original data set to be usable. The invention therefore draws on the design idea of image inpainting and proposes a multi-modal liver medical image expansion algorithm based on a generative adversarial network. The proposed method can generate liver medical images with detailed textures and reasonable structures, effectively expand existing liver medical image data sets, and alleviate their current scarcity.
Disclosure of Invention
The invention aims to provide a multi-modal liver medical image expansion algorithm based on a generative adversarial network to address the current scarcity of liver medical image data sets; drawing on the idea of image inpainting, the liver region of a medical image is generated in a targeted manner.
To achieve the above object, the present invention provides the following solution. The multi-modal liver medical image expansion algorithm based on a generative adversarial network comprises the following steps:
Step S1: acquiring a data set and preprocessing the computed tomography image data in it;
Step S2: creating a training data set based on the data set preprocessed in step S1;
Step S3: constructing a multi-scale residual connection module based on the training data set created in step S2, to prevent network degradation during training;
Step S4: constructing a multi-level loss function module based on the structure and multi-layer features of the training data set created in step S2;
Step S5: constructing a channel-enhanced self-attention module that integrates positional information in the training data set created in step S2 and captures long-distance dependencies;
Step S6: constructing a gated convolution module to offset the parameter increase introduced by steps S3 to S5;
Step S7: combining the multi-scale residual connection module of step S3, the multi-level loss function module of step S4, the channel-enhanced self-attention module of step S5, and the gated convolution module of step S6; constructing a multi-modal liver medical image data set expansion model, and carrying out experiments on and evaluation of the model.
In step S3, the multi-scale residual connection module is specifically:
forming the module with 3×3, 5×5, and 7×7 convolution kernels; concatenating the feature map F_3 extracted by the 3×3 convolution kernel, the feature map F_5 extracted by the 5×5 convolution kernel, and the feature map F_7 extracted by the 7×7 convolution kernel; fusing the concatenated features with a 3×3 convolution kernel to obtain a new feature map F, the output of the multi-scale convolution module; and obtaining the multi-scale residual connection module based on the new feature map F.
In step S4, the multi-level loss function module is constructed as follows.
The module comprises three parts: perceptual loss, content loss, and adversarial loss. The content loss focuses on the similarity between the generated image and the real image at the representation of a specific layer, and is computed as the feature difference between the two images at a chosen intermediate layer of the generator; the adversarial loss distinguishes the generated image from the real image; the final multi-level loss is obtained by combining the perceptual, content, and adversarial losses.
Further, in step S1, the computed tomography image data in the data set are preprocessed as follows:
Step S11: splitting the three-dimensional public data set into two-dimensional slice data sets;
Step S12: adjusting the Hounsfield unit values of the computed tomography image data in the three-dimensional public data set.
Further, in step S2, the training data set is created as follows:
Step S21: acquiring the data set preprocessed in step S1 and adjusting the resolution to 256×256;
Step S22: using one part of the resolution-adjusted data set as a training set and the other part as a validation set, then training and validating the multi-modal liver medical image data set expansion model based on the generative adversarial network.
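Steps S21 and S22 can be sketched as follows. This is a minimal NumPy illustration only: the helper names, the nearest-neighbour resampling, and the split fraction are assumptions, not the patent's implementation.

```python
import numpy as np

def resize_nearest(img: np.ndarray, size: int = 256) -> np.ndarray:
    """Resize a 2-D slice to size x size by nearest-neighbour sampling."""
    h, w = img.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows[:, None], cols[None, :]]

def split_dataset(slices: list, train_fraction: float = 0.65):
    """Deterministically split a list of slices into train / validation parts."""
    n_train = int(len(slices) * train_fraction)
    return slices[:n_train], slices[n_train:]

# toy stand-ins for preprocessed CT slices
slices = [np.random.rand(512, 512) for _ in range(10)]
resized = [resize_nearest(s) for s in slices]
train, val = split_dataset(resized)
```

In practice the resampling would be done with an interpolating image library; the sketch only shows where the 256×256 adjustment and the train/validation split sit in the pipeline.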
Further, in step S3, the multi-scale residual connection module is specifically:
Step S31: forming the multi-scale residual connection module with 3×3, 5×5, and 7×7 convolution kernels; the feature extraction equations are shown in formula (1):

F_3 = \mathrm{Conv}_{3\times 3}(x), \quad F_5 = \mathrm{Conv}_{5\times 5}(x), \quad F_7 = \mathrm{Conv}_{7\times 7}(x)    (1)

where F_3 represents the feature map extracted by the 3×3 convolution kernel, \mathrm{Conv}_{3\times 3} a 3×3 convolution operation, x the input feature map, F_5 the feature map extracted by the 5×5 convolution kernel, \mathrm{Conv}_{5\times 5} a 5×5 convolution operation, F_7 the feature map extracted by the 7×7 convolution kernel, and \mathrm{Conv}_{7\times 7} a 7×7 convolution operation;
Step S32: concatenating the feature maps F_3, F_5, and F_7 and fusing the concatenated features with a 3×3 convolution kernel to obtain the output of the multi-scale convolution module, as shown in formula (2):

F = \mathrm{Conv}_{3\times 3}(\mathrm{Concat}(F_3, F_5, F_7))    (2)

where F represents the new feature map obtained and \mathrm{Concat} represents feature concatenation;
based on the new feature map F, the multi-scale residual connection module is obtained, with the output equation shown in formula (3):

F_{out} = x + F    (3)

where F_{out} represents the output of the multi-scale residual connection module.
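Formulas (1) to (3) can be sketched for a single channel in NumPy. This is an illustrative assumption, not the patent's network: the 3×3 fusion of the concatenated maps reduces here to summing per-branch 3×3 convolutions, with the fusion kernel shared across branches for simplicity, and normalization/activation are omitted.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive single-channel 2-D convolution with zero padding ('same' size)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def multiscale_residual_block(x, k3, k5, k7, k_fuse):
    # formula (1): parallel 3x3 / 5x5 / 7x7 feature extraction
    f3, f5, f7 = conv2d_same(x, k3), conv2d_same(x, k5), conv2d_same(x, k7)
    # formula (2): fuse the branches with a 3x3 convolution
    # (single-channel stand-in for convolving the concatenated maps)
    fused = sum(conv2d_same(f, k_fuse) for f in (f3, f5, f7))
    # formula (3): residual connection back to the input
    return x + fused

x = np.random.rand(8, 8)
rand_k = lambda n: np.random.rand(n, n) * 0.01
y = multiscale_residual_block(x, rand_k(3), rand_k(5), rand_k(7), rand_k(3))
```

The residual addition in formula (3) is what lets gradients bypass the convolution stack, which is the degradation-prevention property the module is built for.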
Further, in step S4, the multi-level loss function module, comprising perceptual loss, content loss, and adversarial loss, is constructed as follows:
Step S41: computing the perceptual loss, as shown in formula (4):

L_{perc} = \sum_{i=1}^{n} \frac{1}{N_i} \left\| \phi_i(G(x)) - \phi_i(y) \right\|_2^2    (4)

where L_{perc} denotes the perceptual loss, G the generator, n the number of layers used in the calculation, \sum_{i=1}^{n} the summation over the generator layers, \|\cdot\|_2^2 the squared Euclidean norm, N_i the number of elements of the layer-i features, \phi_i the layer-i feature extraction, and y the output feature map;
Step S42: computing the content loss, as shown in formula (5):

L_{content} = \left\| \phi_m(G(x)) - \phi_m(y) \right\|_2^2    (5)

where L_{content} denotes the content loss value and \phi_m the feature extraction of the selected intermediate layer;
Step S43: computing the adversarial loss, as shown in formula (6):

L_{adv} = \mathbb{E}_{y \sim p_{data}(y)}[\log D(y)] + \mathbb{E}_{x \sim p_x(x)}[\log(1 - D(G(x)))]    (6)

where L_{adv} denotes the adversarial loss value, D the discriminator, \mathbb{E} the expectation over a distribution, p_{data}(y) the real data distribution corresponding to the output feature map y, p_x(x) the distribution of the input feature map x fed to the generator, and \log the logarithmic function;
Step S44: computing the final multi-level loss, as shown in formula (7):

L = \lambda_1 L_{perc} + \lambda_2 L_{content} + \lambda_3 L_{adv}    (7)

where L denotes the multi-level loss value, and \lambda_1, \lambda_2, and \lambda_3 are weight parameters balancing the contributions of the perceptual, content, and adversarial losses.
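A minimal NumPy sketch of formulas (4) to (7), under the assumption that the layer features and discriminator scores are given as arrays (in the patent they would come from the generator's layers and a discriminator network); the weight values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def perceptual_loss(feats_gen, feats_real):
    """Formula (4): squared feature differences, normalised per layer and summed."""
    return sum(np.sum((g - r) ** 2) / g.size for g, r in zip(feats_gen, feats_real))

def content_loss(feat_gen, feat_real):
    """Formula (5): squared feature difference at one chosen intermediate layer."""
    return np.sum((feat_gen - feat_real) ** 2)

def adversarial_loss(d_real, d_fake, eps=1e-12):
    """Formula (6): E[log D(y)] + E[log(1 - D(G(x)))] over batch scores in (0, 1)."""
    return float(np.mean(np.log(d_real + eps)) + np.mean(np.log(1.0 - d_fake + eps)))

def multilevel_loss(lp, lc, la, lam1=1.0, lam2=1.0, lam3=0.1):
    """Formula (7): weighted combination of the three loss terms."""
    return lam1 * lp + lam2 * lc + lam3 * la

# toy "layer features" standing in for intermediate-layer activations
feats_real = [rng.standard_normal((4, 4)) for _ in range(3)]
feats_gen = [f + 0.1 for f in feats_real]
lp = perceptual_loss(feats_gen, feats_real)
lc = content_loss(feats_gen[1], feats_real[1])
la = adversarial_loss(np.array([0.9, 0.8]), np.array([0.2, 0.1]))
total = multilevel_loss(lp, lc, la)
```

Note that the adversarial term is maximised by the discriminator and minimised (through its second summand) by the generator; the sketch only evaluates the combined objective.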
Further, in step S5, the channel-enhanced self-attention module is constructed as follows:
Step S51: combining a self-attention mechanism and the convolutional block attention module (CBAM) into a channel-enhanced self-attention module and integrating it into the generator;
Step S52: feeding the generator's input feature map x into both the self-attention mechanism and the convolutional block attention module for computation;
Step S53: after obtaining the two output features of the self-attention mechanism and the convolutional block attention module, computing the final output of the channel-enhanced self-attention module by connecting them in series, thereby integrating the positional information in the training data set created in step S2 and capturing the long-distance dependencies, as shown in formula (8):

F_{csa} = \mathrm{Concat}(\mathrm{SA}(x), \mathrm{CBAM}(x))    (8)

where F_{csa} represents the output of the channel-enhanced self-attention module, \mathrm{SA} the self-attention computation, and \mathrm{CBAM} the convolutional block attention module computation; the self-attention mechanism assigns a different attention weight to each feature value of the input feature map x through learned weights.
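A compact NumPy sketch of formula (8), with strong simplifying assumptions: identity projections stand in for the learned query/key/value transforms, and the CBAM branch uses pooling-plus-sigmoid without the learned fully connected and convolution layers the patent describes.

```python
import numpy as np

def softmax(z, axis=-1):
    e = np.exp(z - z.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x):
    """Spatial self-attention on a (C, H, W) map; identity Q/K/V projections."""
    c, h, w = x.shape
    tokens = x.reshape(c, h * w).T                        # (HW, C)
    attn = softmax(tokens @ tokens.T / np.sqrt(c), axis=-1)
    return (attn @ tokens).T.reshape(c, h, w)

def cbam(x):
    """Simplified CBAM: channel weights from global max/avg pooling, then
    spatial weights from per-position max/avg over channels."""
    sig = lambda z: 1.0 / (1.0 + np.exp(-z))
    y = x * sig(x.max(axis=(1, 2)) + x.mean(axis=(1, 2)))[:, None, None]
    return y * sig(y.max(axis=0) + y.mean(axis=0))[None, :, :]

def channel_enhanced_self_attention(x):
    """Formula (8): concatenate the two branch outputs in series (channel axis)."""
    return np.concatenate([self_attention(x), cbam(x)], axis=0)

x = np.random.rand(2, 4, 4)
out = channel_enhanced_self_attention(x)
```

The series connection doubles the channel count, so in the full model a following convolution would map the concatenated features back to the expected width.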
Further, in step S6, the gated convolution module is constructed as follows.
The gated convolution value is computed as shown in formula (9):

\mathrm{Gating} = \sum_{(u,v) \in \Omega} W_g \cdot I, \quad \mathrm{Feature} = \sum_{(u,v) \in \Omega} W_f \cdot I, \quad O = \sigma(\mathrm{Gating}) \odot \phi(\mathrm{Feature})    (9)

where \mathrm{Gating} represents the gated convolution operation on the input feature map x and output feature map y, \sigma(\mathrm{Gating}) takes values between 0 and 1, I represents the input feature values of the gated convolution, \sum the summation operation, \Omega the pixel domain, W_g and W_f two different convolution kernels, O the final gated convolution value, \sigma and \phi two different activation functions, and \odot element-wise multiplication.
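Formula (9) can be sketched with 1×1 kernels so the two convolutions reduce to channel mixing; the choice of sigmoid for σ and ELU for φ is an assumption for illustration, not stated by the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gated_conv_1x1(x, w_f, w_g):
    """Formula (9) with 1x1 kernels: O = sigmoid(W_g * I) ⊙ elu(W_f * I).
    x: (C_in, H, W); w_f, w_g: (C_out, C_in)."""
    feature = np.tensordot(w_f, x, axes=(1, 0))              # feature branch
    gating = np.tensordot(w_g, x, axes=(1, 0))               # gating branch
    elu = np.where(feature > 0, feature, np.expm1(feature))  # ELU activation
    return sigmoid(gating) * elu                             # soft gate in (0, 1)

x = np.random.rand(3, 5, 5)
w_f = np.random.rand(4, 3)
w_g = np.random.rand(4, 3)
o = gated_conv_1x1(x, w_f, w_g)
```

Because the gate is learned per output position and channel, the network can suppress features outside the region of interest, which is the selectivity the step S6 description relies on.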
Further, in step S7, the multi-modal liver medical image data set expansion model is designed, and experiments and evaluation are carried out on it as follows:
Step S71: integrating the modules constructed in steps S3 to S6 into the multi-modal liver medical image data set expansion model and performing test experiments to verify the feasibility and reliability of the model;
Step S72: performing experiments on the data set of step S1 to obtain generated computed tomography and magnetic resonance imaging images;
Step S73: comprehensively evaluating the quality of the generated computed tomography and magnetic resonance imaging images using the peak signal-to-noise ratio, structural similarity index, and mean absolute error;
Step S74: quantifying the quality of the generated images with the peak signal-to-noise ratio, structural similarity index, and mean absolute error of step S73, providing an objective basis for analyzing and evaluating the algorithm's performance.
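The three metrics of steps S73 and S74 can be sketched in NumPy. The SSIM here is a single-window (global) simplification; standard implementations average a local windowed statistic over the image.

```python
import numpy as np

def psnr(a, b, max_val=255.0):
    """Peak signal-to-noise ratio in dB (higher is better)."""
    mse = np.mean((a.astype(float) - b.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)

def ssim_global(a, b, max_val=255.0):
    """Global (single-window) structural similarity index."""
    a, b = a.astype(float), b.astype(float)
    c1, c2 = (0.01 * max_val) ** 2, (0.03 * max_val) ** 2
    mu_a, mu_b = a.mean(), b.mean()
    va, vb = a.var(), b.var()
    cov = ((a - mu_a) * (b - mu_b)).mean()
    return ((2 * mu_a * mu_b + c1) * (2 * cov + c2)) / \
           ((mu_a ** 2 + mu_b ** 2 + c1) * (va + vb + c2))

def mae(a, b):
    """Mean absolute error (lower is better)."""
    return float(np.mean(np.abs(a.astype(float) - b.astype(float))))

img = (np.random.rand(16, 16) * 255).astype(np.uint8)
noisy = np.clip(img.astype(float) + 5, 0, 255)
```

Identical images give infinite PSNR, SSIM of 1, and MAE of 0, which is why the three metrics complement each other as generation-quality measures.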
The beneficial effects of the invention are as follows:
(1) The invention constructs a multi-scale residual connection module, establishing a residual learning relation between the input and output images and effectively addressing the tendency of deep networks to degrade, which causes gradients to vanish or explode.
(2) The invention builds a multi-level loss function module that drives the network to generate more plausible image texture features.
(3) The invention constructs a channel-enhanced self-attention module, improving the model's perception of local and global information in medical images, strengthening the learning of inter-channel relations, and focusing the generated images on texture details.
(4) The invention constructs a gated convolution module, increasing the model's convergence speed and improving the accuracy of the generated images and the attention paid to key structures.
(5) The invention can efficiently generate a large number of high-quality multi-modal medical images for expanding existing medical image data, alleviating the current scarcity of publicly available medical image data.
Drawings
To describe the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings in the following description are obviously only some embodiments of the present invention; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of the multi-modal liver medical image expansion algorithm based on a generative adversarial network according to the present invention.
FIG. 2 is a diagram of the data set creation process according to the present invention.
FIG. 3 is a block diagram of the multi-scale residual connection module according to the present invention.
FIG. 4 is a block diagram of the channel-enhanced self-attention module according to the present invention.
FIG. 5 shows experimental results of the multi-modal liver medical image expansion algorithm based on a generative adversarial network according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are evidently only some, not all, embodiments of the invention; all other embodiments obtained by a person skilled in the art from these embodiments without inventive effort fall within the scope of the invention.
The invention relates to the technical fields of deep learning, computer vision, and generative adversarial network frameworks, and in particular to the design and training of deep neural networks for image generation and image inpainting. The invention provides a multi-modal liver medical image expansion algorithm based on a generative adversarial network that treats image inpainting as a local medical image generation method, introducing a multi-scale residual connection module, a multi-level loss function module, a channel-enhanced self-attention module, and a gated convolution technique, with the aim of efficiently generating a large number of high-quality multi-modal medical images to expand existing medical image data and alleviate the scarcity of publicly available liver medical image data.
To make the above objects, features, and advantages of the present invention more apparent, the invention is described in further detail below with reference to the accompanying drawings and the detailed description.
The invention follows the flowchart shown in FIG. 1; the key steps of the method are:
Step S1: acquiring a data set and preprocessing the computed tomography image data in it;
Step S2: creating a training data set based on the data set preprocessed in step S1;
Step S3: constructing a multi-scale residual connection module based on the training data set created in step S2, to prevent network degradation during training;
Step S4: constructing a multi-level loss function module so that the complex structure and multi-layer features of the training data set created in step S2 are fully extracted;
Step S5: constructing a channel-enhanced self-attention module that integrates positional information in the training data set created in step S2 and captures long-distance dependencies;
Step S6: constructing a gated convolution module to offset the parameter increase introduced by steps S3 to S5;
Step S7: combining the multi-scale residual connection module of step S3, the multi-level loss function module of step S4, the channel-enhanced self-attention module of step S5, and the gated convolution module of step S6; constructing a multi-modal liver medical image data set expansion model, and carrying out experiments on and evaluation of the model.
Of the above seven main steps, steps S1 and S2 form the basis of the implementation, steps S3 to S6 are the core, and step S7 demonstrates the implementation. The specific design of each step is as follows.
Further, step S1 is specifically designed as follows.
The data set used in the invention comprises magnetic resonance imaging images and computed tomography images. The original medical images are in a three-dimensional data format, whereas the invention operates on two-dimensional data, so the first preprocessing operation is to split the three-dimensional data set into two-dimensional slices. The computed tomography data additionally require an adjustment of the Hounsfield unit values, because the pixel values of a computed tomography image are typically expressed in Hounsfield units, which measure tissue density relative to the density of water. Adjusting the Hounsfield unit values makes the grey values of the computed tomography image reflect the biological characteristics of the tissue more intuitively and makes the images easier for doctors to interpret and diagnose. In computed tomography images, the Hounsfield value of water is defined as zero, with negative values representing relatively low density and positive values relatively high density. In general, through Hounsfield unit adjustment, doctors can identify and evaluate tissues and abnormalities in the images more accurately, improving the clinical usability of the images; the specific adjustment may vary with the clinical task and the physician's needs.
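The two preprocessing operations of step S1 can be sketched as follows, using the [-800, 100] Hounsfield window stated later in step S2; the helper names and the 8-bit rescaling are illustrative assumptions.

```python
import numpy as np

def hounsfield_window(volume_hu, lo=-800.0, hi=100.0):
    """Clip a CT volume to the HU window used in the experiments ([-800, 100])
    and rescale it to 8-bit grey levels."""
    clipped = np.clip(volume_hu.astype(float), lo, hi)
    return ((clipped - lo) / (hi - lo) * 255.0).astype(np.uint8)

def to_axial_slices(volume):
    """Split a 3-D volume of shape (D, H, W) into a list of 2-D axial slices."""
    return [volume[k] for k in range(volume.shape[0])]

volume = np.array([[[-1000.0, -800.0], [100.0, 400.0]]])  # toy 1x2x2 CT volume
windowed = hounsfield_window(volume)
slices = to_axial_slices(windowed)
```

Values below -800 HU (e.g. air at -1000) saturate to black and values above 100 HU (e.g. bone) saturate to white, concentrating the grey range on soft tissue such as the liver.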
Further, step S2 is specifically designed as follows.
After the data set is preprocessed in step S1, it must be further processed into the experimental data set. In the experiments, the computed tomography images come from the public data set of a computed tomography image segmentation challenge, one of the largest open-source liver segmentation data sets currently available. This public data set contains computed tomography images of 130 patients together with doctors' segmentation labels of the liver and tumors in these images; 130 of the three-dimensional computed tomography volumes were used as the training set and another 70 as the validation set. These images cover the common sites of primary liver tumors (such as hepatocellular carcinoma) and secondary tumors (such as metastatic lesions from colorectal cancer). For unified processing, the pixel values of the computed tomography scans are clipped to the range of [-800, 100] Hounsfield units. Owing to hardware limitations, the resolution of all slices in these data sets was adjusted to 256×256 before the model was trained and validated. The magnetic resonance imaging data come from a combined computed tomography-magnetic resonance imaging public data set for healthy abdominal organ segmentation, a medical image data set focused on segmenting healthy abdominal organs in computed tomography and magnetic resonance images. It contains high-resolution computed tomography and magnetic resonance scans covering the liver, spleen, pancreas, and left and right kidneys, each image manually annotated by professional medical imaging specialists to provide accurate segmentation masks of the abdominal organs.
The main purpose of this combined public data set is to promote the research and evaluation of medical image segmentation algorithms, enabling researchers to train and test automatic or semi-automatic segmentation methods and improve the accurate segmentation of healthy abdominal organs. The numbers of training and test images in each data set are shown in Table 1.
Table 1 data set information table
The experiments of the invention are based on the liver regions of the two data sets; the labeled images contained in the data sets are, after certain adjustments, used as the mask images in the experiments. Because the invention experiments with the idea of image inpainting, medical images containing defective parts must be obtained first; specifically, medical images with a defective liver region are produced before anything else. As shown in FIG. 2, the experimental data set applies certain operations to the original three-dimensional slices and the corresponding liver labels to obtain the required liver data format.
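The creation of a "defective liver region" image from a slice and its liver label can be sketched as follows; the function name and zero fill value are illustrative assumptions about the operations shown in FIG. 2.

```python
import numpy as np

def make_defective_slice(slice_img, liver_label, fill_value=0):
    """Blank out the labelled liver region of a slice so the generator can
    repaint it (image-inpainting formulation); returns the defective image
    and the binary liver mask."""
    mask = (liver_label > 0).astype(np.uint8)
    defective = slice_img.copy()
    defective[mask == 1] = fill_value
    return defective, mask

slice_img = np.full((4, 4), 100, dtype=np.uint8)
label = np.zeros((4, 4), dtype=np.uint8)
label[1:3, 1:3] = 1                      # toy liver label
defective, mask = make_defective_slice(slice_img, label)
```

The generator then receives the defective image (optionally with the mask) and is trained to fill in only the liver region, which is what makes the generation targeted rather than whole-image synthesis.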
Further, step S3 is specifically designed as follows.
To effectively address network degradation and improve training, a multi-scale residual connection module is designed and added to the network's generator; its purpose is to establish a residual learning relation between input and output, integrated into the generator's encoding stage. In this module, convolutions of size 3×3, 5×5, and 7×7 are employed, together with normalization, activation functions, and a concatenation operation, to form a multi-scale convolution module. The multi-scale residual connection module exploits the complementary strengths of the three scales to extract more detailed feature information, aiding high-quality generation of structural details in medical images. At the same time, the structure effectively counteracts the tendency of deep networks to degrade, which causes gradients to vanish or explode. Its structure is shown in FIG. 3.
Further, step S4 is specifically designed as follows.
In medical image generation tasks, using a multi-level loss function is an effective strategy because medical images have complex structures and multi-layer features. Introducing perceptual, content, and adversarial losses into the generative model better balances the model's attention to details and overall structure and addresses the complexity of the hierarchical structure, noise, and class imbalance in the data, thereby improving the quality and applicability of the generated images and making them better meet the practical needs of the medical field; a multi-level loss function is therefore designed and integrated into the generation algorithm. The multi-level loss function consists of three parts: perceptual loss, content loss, and adversarial loss. The perceptual loss is typically used to compare the similarity of the generated and real images on higher-level features and can be computed with the intermediate layers of a pre-trained convolutional neural network (such as a VGG16 network). The content loss focuses on the similarity of the generated and real images at the representation of a specific layer and is computed as the feature difference between the two at a chosen intermediate layer of the network; the adversarial loss, based on the idea of generative adversarial networks, distinguishes the generated image from the real one.
Further, step S5 is specifically designed as follows:
Self-attention does not always deal well with relationships between channels, whereas the convolution block attention module focuses on emphasizing the importance of different channels by channel attention. Therefore, in order to enhance the perceptibility of the model to the local and global information of the medical image and enhance the learning of the relation between channels, so that the generated image is more focused on the texture details, a channel enhanced self-focusing module is provided by combining a self-focusing mechanism and the self-focusing of a convolution block and is integrated into a generator of a network, and the structure of the channel enhanced self-focusing module is shown in fig. 4.
The channel enhanced self-attention module comprises a self-attention mechanism and the attention mechanism of three parts, namely a space and a channel attention mechanism in the convolution block attention module. After obtaining the two output characteristics of the self-attention and convolution block attention module, final output characteristic calculation is performed through a serial connection mode.
The self-attention mechanism assigns a different attention to each element in the sequence by learning a weight. In the calculation of the self-attention mechanism, representations of the query value, the key value and the calculated value are obtained through three linear transformations. Then, calculating the similarity between the query value and the key value, and performing scaling calculation on the similarity in a dot product mode to obtain the attention score. Next, the attention score is normalized by the activation function and the attention weight is obtained. Finally, the attention weight is multiplied by the calculated value to obtain the output of the self-attention mechanism.
After the self-attention output is obtained, the input feature map is also fed into the convolutional block attention module to compute the other part of the features. The convolutional block attention module first extracts global information of the input feature map along the channel dimension through global max pooling and global average pooling; after passing through a fully connected layer, this information is mapped to two weight vectors representing the max-pooled and average-pooled weights respectively. The two weight vectors pass through an activation function and batch normalization to form the channel attention weights. Meanwhile, feature maps in the horizontal and vertical directions are generated through 1×1 convolutions to capture the importance of the input feature map in the spatial dimension; after activation, multiplication, and normalization, these form the spatial attention weights. Finally, the channel and spatial attention weights are multiplied to obtain an integrated attention weight matrix. This matrix is used to weight the input feature map, producing the output of the convolutional block attention module, thus completing the computation of the channel-enhanced self-attention mechanism.
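A simplified NumPy sketch of the channel and spatial attention ideas described above; the pooled-MLP shapes and the element-wise stand-in for the convolutional layers are assumptions for illustration, not the patent's exact design:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(fmap, W1, W2):
    # fmap: (C, H, W); pool global information over the spatial dimensions
    max_pool = fmap.max(axis=(1, 2))
    avg_pool = fmap.mean(axis=(1, 2))
    # A shared two-layer MLP maps each pooled vector to per-channel weights
    w = sigmoid(W2 @ np.maximum(W1 @ max_pool, 0)
                + W2 @ np.maximum(W1 @ avg_pool, 0))
    return fmap * w[:, None, None]  # reweight each channel

def spatial_attention(fmap):
    # Channel-wise max and mean, combined into one spatial weight map
    m = fmap.max(axis=0)
    a = fmap.mean(axis=0)
    w = sigmoid(m + a)  # stand-in for the convolution in full CBAM
    return fmap * w[None, :, :]
```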
Further, step S6 is specifically designed as follows:
Gated convolution has important potential benefits in the field of medical image generation. By introducing a gating mechanism, gated convolution can effectively address core difficulties in medical image generation tasks and accelerate model training. For the medical image generation task of the present invention: first, gated convolution helps model long-range dependencies in medical images, improving the understanding of the overall image structure; second, by suppressing irrelevant information, it focuses the network on key regions, improving the accuracy of the generated image and the attention to key structures. Gated convolution can also handle the highly complex structure of medical images and enhance the representational capacity of the features, enabling the network to learn and generate complex medical images and improving the quality and realism of the generated images. Meanwhile, gated convolution improves training stability and helps mitigate vanishing or exploding gradients in deep network training. Finally, gated convolution improves the capture of fine details in medical images, providing a potential solution for generating high-quality, high-definition and detail-rich medical images.
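The gating idea, one convolution branch producing features and a second branch producing a sigmoid gate that modulates them elementwise, can be sketched as follows; this is a minimal single-channel NumPy illustration, and the kernel shapes and activation choices are assumptions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_conv2d(x, w_feat, w_gate):
    # Valid 2D cross-correlation with two kernels: one for features, one for the gate
    kh, kw = w_feat.shape
    H, W = x.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + kh, j:j + kw]
            feat = np.sum(w_feat * patch)           # feature branch
            gate = sigmoid(np.sum(w_gate * patch))  # gate value in (0, 1)
            out[i, j] = np.tanh(feat) * gate        # elementwise modulation
    return out
```

The gate learns to suppress positions carrying irrelevant information, which is what focuses the network on key regions in the description above.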
Further, step S7 is specifically designed as follows:
Using 3 evaluation indexes, namely peak signal-to-noise ratio, structural similarity index, and mean absolute error, image generation experiments were performed on two medical image datasets: the public dataset of a computed tomography image segmentation challenge competition and the public dataset for combined computed tomography-magnetic resonance imaging healthy abdominal organ segmentation. The generated experimental results are shown in fig. 5:
In fig. 5, the left 3 columns are computed tomography images from the public dataset of the computed tomography image segmentation challenge competition, where the mask is the label corresponding to the liver position in the dataset; the right 3 columns are magnetic resonance imaging images from the combined computed tomography-magnetic resonance imaging healthy abdominal organ segmentation public dataset, where the mask is likewise the label corresponding to the liver position. The first row shows real liver medical images, the second row the liver region labels corresponding to the real images, the third row the images with the liver region missing, and the fourth row the finally generated liver medical images. It can be seen from the figure that, whether the generated image is a computed tomography image or a magnetic resonance imaging image, the texture and detail at the liver position closely resemble the actual medical image; at the same time, from the perspective of subjective vision, the generated region is not completely identical to the liver region in the original image, so that rich feature diversity is maintained.
To objectively evaluate the quality of the generated images, 3 widely used and accepted objective indexes were selected in the experiment: the structural similarity index, the peak signal-to-noise ratio, and the mean absolute error, in order to evaluate the performance of the algorithm in the field of liver medical image generation. The structural similarity index SSIM can measure both the degree of distortion of an image, where a value closer to 1 indicates less distortion, and the degree of similarity between two images; it is essentially a perceptual model, and its calculation formula is shown in formula (10):
SSIM(X, Y) = ((2μ_X μ_Y + C₁)(2σ_XY + C₂)) / ((μ_X² + μ_Y² + C₁)(σ_X² + σ_Y² + C₂)) (10);
wherein SSIM represents the structural similarity, X and Y represent the generated image and the real image respectively, μ_X and μ_Y represent the gray-level means of X and Y, σ_X and σ_Y represent the standard deviations of the gray levels of X and Y, σ_XY represents the covariance of the gray levels of X and Y, and C₁ and C₂ are small constants that avoid calculation errors caused by a denominator of 0.
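The SSIM formula above can be transcribed directly in NumPy; note this global version computes one set of statistics over the whole image, whereas full SSIM implementations average the measure over local windows, and the constants here are illustrative defaults:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    # Global means, variances, and covariance of the two images
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    # Structural similarity per formula (10)
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```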
The peak signal-to-noise ratio is based on the error between corresponding pixels, i.e., it is an error-sensitive image evaluation criterion. Since this index does not take the visual characteristics of the human eye into account, its result may disagree with subjective perception, so it is often combined with other indexes for comprehensive evaluation; nevertheless, a larger peak signal-to-noise ratio value indicates better image quality. Its expression is shown in formula (11):
PSNR = 10 · log₁₀(MAX² / ((1/N) Σᵢ (Xᵢ − Yᵢ)²)) (11);
wherein PSNR represents the peak signal-to-noise ratio, log₁₀ represents the logarithmic function, MAX² represents the square of the maximum pixel value of X and Y, N represents the total number of pixels of X or Y, and X and Y represent the two different images respectively.
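Formula (11) can be computed as follows; in this minimal NumPy sketch MAX is taken as 255 for 8-bit images:

```python
import numpy as np

def psnr(x, y, max_val=255.0):
    # Mean squared error between corresponding pixels
    mse = np.mean((x - y) ** 2)
    if mse == 0:
        return float('inf')  # identical images: unbounded PSNR
    return 10.0 * np.log10(max_val ** 2 / mse)
```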
The mean absolute error is an index for measuring prediction error. For a set of predicted values and the corresponding actual observed values, the calculation formula of the mean absolute error is shown in formula (12):
MAE = (1/n) Σᵢ₌₁ⁿ |yᵢ − ŷᵢ| (12);
wherein MAE represents the mean absolute error, n is the number of samples, yᵢ is the actual observed value of the i-th sample, and ŷᵢ is the predicted value of the i-th sample. The mean absolute error is computed by averaging the absolute error of each sample, where the absolute error is the absolute value of the difference between the actual observed value and the predicted value. A smaller MAE indicates a more accurate prediction by the model.
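Formula (12) is a one-liner in NumPy:

```python
import numpy as np

def mae(y_true, y_pred):
    # Mean of the absolute differences between observed and predicted values
    return np.mean(np.abs(np.asarray(y_true, dtype=float)
                          - np.asarray(y_pred, dtype=float)))
```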
The objective evaluation index values of the new liver medical images generated by the method provided by the invention are good; the specific experimental evaluation results are shown in table 2:
Table 2 Objective evaluation results of the liver images generated by the present invention
The system draws on state-of-the-art computer vision research and expands the liver medical image dataset by designing a high-quality multi-modal liver medical image expansion model, so as to support the training of subsequent large-scale segmentation models; through the whole set of processes, high-quality expansion of the liver medical image dataset is achieved.
In the present specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and for identical or similar parts the embodiments may refer to one another. Since the system disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
The principles and embodiments of the present invention have been described herein with reference to specific examples; the description is intended only to assist in understanding the method of the present invention and its core ideas. Modifications made by those of ordinary skill in the art in light of these teachings likewise fall within the scope of the present invention. In view of the foregoing, this description should not be construed as limiting the invention.

Claims (8)

1. The multi-modal liver medical image expansion algorithm based on the generation countermeasure network is characterized in that: the method comprises the following steps:
Step S1: acquiring a data set and preprocessing computed tomography image data in the data set;
Step S2: creating a training data set based on the data set preprocessed in the step S1;
Step S3: constructing a multi-scale residual error connection module based on the training data set produced in step S2, so as to prevent network degradation during training;
Step S4: constructing a multistage loss function module based on the structure and the multilayer characteristics of the training data set manufactured in the step S2;
step S5: constructing a channel-enhanced self-attention module that integrates the position information in the training data set produced in step S2 and captures long-range dependencies;
Step S6: constructing a gating convolution module to offset the parameter increase introduced by steps S3 to S5;
Step S7: combining the multi-scale residual error connection module of the step S3, the multi-stage loss function module of the step S4, the channel enhanced self-attention module of the step S5 and the gating convolution module of the step S6; constructing a multi-modal liver medical image data set expansion model, and carrying out experiments and evaluation on the multi-modal liver medical image data set expansion model;
In step S3, the multi-scale residual error connection module specifically includes:
forming a multi-scale residual connection module using 3×3, 5×5, and 7×7 convolution kernels; connecting the feature map F₃ extracted by the 3×3 convolution kernel, the feature map F₅ extracted by the 5×5 convolution kernel, and the feature map F₇ extracted by the 7×7 convolution kernel, and fusing the connected features using a 3×3 convolution kernel to obtain a new feature map F as the feature output by the multi-scale convolution module; obtaining the multi-scale residual connection module based on the obtained new feature map F;
in step S4, a multistage loss function module is constructed, specifically:
The multi-stage loss function module comprises three parts: perceptual loss, content loss, and adversarial loss; the content loss focuses on the similarity between the generated image and the real image in the representation of a specific layer, and is realized by calculating the feature difference between the generated image and the real image at a certain intermediate layer of the generator; the adversarial loss distinguishes the generated image from the real image; the final multi-level loss is obtained by combining the perceptual loss, the content loss, and the adversarial loss.
2. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 1, wherein: preprocessing the computed tomography image data in the data set in step S1; the method comprises the following steps:
s11, splitting a three-dimensional format public data set into two-dimensional slice data sets;
step S12, adjusting the Hounsfield unit values of the computed tomography image data in the three-dimensional-format public data set.
3. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 2, wherein: step S2, a training data set is manufactured; the method comprises the following steps:
step S21, acquiring the data set preprocessed in the step S1, and adjusting the resolution to 256×256;
Step S22, using one part of the resolution-adjusted data set as a training set and the other part as a validation set; then performing training and validation of the multi-modal liver medical image data set expansion model based on the generative adversarial network.
4. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 3, wherein: in the step S3, a multi-scale residual error connection module comprises the following specific steps:
Step S31, forming a multi-scale residual connection module using 3×3, 5×5, and 7×7 convolution kernels; the feature extraction equations are shown in formula (1):
F₃ = Conv₃ₓ₃(X), F₅ = Conv₅ₓ₅(X), F₇ = Conv₇ₓ₇(X) (1);
wherein F₃ represents the feature map extracted by the 3×3 convolution kernel, Conv₃ₓ₃ represents the 3×3 convolution operation, X represents the input feature map, F₅ represents the feature map extracted by the 5×5 convolution kernel, Conv₅ₓ₅ represents the 5×5 convolution operation, F₇ represents the feature map extracted by the 7×7 convolution kernel, and Conv₇ₓ₇ represents the 7×7 convolution operation;
step S32, connecting the feature maps F₃, F₅, and F₇ extracted by the 3×3, 5×5, and 7×7 convolution kernels, and fusing the connected features with a 3×3 convolution kernel to obtain the feature output by the multi-scale convolution module, as shown in formula (2):
F = Conv₃ₓ₃(Concat(F₃, F₅, F₇)) (2);
wherein F represents the new feature map obtained and Concat represents feature concatenation;
based on the obtained new feature map F, the multi-scale residual connection module is obtained, and its output equation is shown in formula (3):
Y = F + X (3);
wherein Y represents the output of the multi-scale residual connection module.
5. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 4, wherein: step S4, constructing a multi-stage loss function module which comprises three parts: perceptual loss, content loss, and adversarial loss; the method comprises the following steps:
step S41, calculating the perceptual loss, the formula of which is shown in formula (4):
L_perc = Σᵢ (1/Nᵢ) ‖φᵢ(G(x)) − φᵢ(y)‖₂² (4);
wherein L_perc represents the perceptual loss, G represents the generator, i represents the index of the layer used in the calculation, Σᵢ represents summation over the generator layers, ‖·‖₂² represents the squared Euclidean norm, Nᵢ represents the number of elements of the layer-i features, φᵢ represents the layer-i feature extraction, and y represents the output feature map;
step S42, calculating the content loss, as specifically shown in formula (5):
L_content = ‖φ(G(x)) − φ(y)‖₂² (5);
wherein L_content represents the content loss and φ represents the selected feature extraction;
step S43, calculating the adversarial loss, the formula of which is shown in formula (6):
L_adv = E_{y∼p_data(y)}[log D(y)] + E_{x∼p_x(x)}[log(1 − D(G(x)))] (6);
wherein L_adv represents the adversarial loss value, D represents the discriminator, E represents the expected value, p_data(y) represents the real data distribution corresponding to the output feature map y, G(x) represents the distribution produced by the generator after operating on the input feature map x, and log represents the logarithmic function;
Step S44, calculating the final multi-level loss, as specifically shown in formula (7):
L_multi = λ₁ L_perc + λ₂ L_content + λ₃ L_adv (7);
wherein L_multi represents the multi-level loss function value, and λ₁, λ₂, and λ₃ represent the weight parameters that balance the contributions of the perceptual loss, content loss, and adversarial loss.
6. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 5, wherein: in step S5, a channel-enhanced self-attention module is constructed, specifically:
Step S51, combining the self-attention mechanism and the convolution block attention module to generate a channel enhanced self-attention module, and integrating the channel enhanced self-attention module into a generator;
step S52, inputting the feature map X of the generator into the self-attention mechanism and the convolutional block attention module to compute their respective output features;
step S53, after the two output features of the self-attention mechanism and the convolutional block attention module are obtained, performing the final output feature calculation of the channel-enhanced self-attention module by concatenation, integrating the position information in the training data set produced in step S2 and capturing medium- and long-range dependencies, as specifically shown in formula (8):
F_out = Concat(SA(X), CBAM(X)) (8);
wherein F_out represents the output of the channel-enhanced self-attention module, SA represents the self-attention mechanism, CBAM represents the convolutional block attention module computation, and the self-attention mechanism assigns a different attention weight to each feature value of the input feature map X through learned weights.
7. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 6, wherein: in step S6, a gating convolution module is constructed, specifically as follows:
calculating the gated convolution value; the specific calculation process of the gated convolution is shown in formula (9):
Gating_{y,x} = Σ Σ W_g · I, Feature_{y,x} = Σ Σ W_f · I, O_{y,x} = φ(Feature_{y,x}) ⊙ σ(Gating_{y,x}) (9);
wherein Gating_{y,x} represents the gating branch of the gated convolution at position (y, x) of the feature map, σ represents the gating activation function whose output value lies between 0 and 1, I represents the input feature values of the gated convolution, Σ Σ represents summation over the pixel domain, W_g and W_f represent two different convolution kernels, O_{y,x} represents the final calculated value of the gated convolution, φ and σ represent two different activation functions, and ⊙ represents element-wise multiplication.
8. The multi-modal liver medical image augmentation algorithm based on generation of countermeasure network of claim 6, wherein: step S7, designing a multi-mode liver medical image data set expansion model, and carrying out experiments and evaluation on the multi-mode liver medical image data set expansion model; the method comprises the following steps:
Step S71, integrating the modules constructed in the steps S3 to S6 into a multi-mode liver medical image data set expansion model, and performing test experiments to verify the feasibility and the reliability of the model;
Step S72, performing experiments on the data set in the step S1 to obtain a generated computed tomography image and a magnetic resonance imaging image;
Step S73, comprehensively evaluating the quality of the generated computed tomography images and magnetic resonance imaging images using the peak signal-to-noise ratio, structural similarity index, and mean absolute error index;
and step S74, quantifying the quality of the generated images using the peak signal-to-noise ratio, structural similarity index, and mean absolute error index of step S73, providing an objective basis for the analysis and evaluation of algorithm performance.
CN202410384429.9A 2024-04-01 2024-04-01 Multi-modal liver medical image expansion algorithm based on generation countermeasure network Active CN117974832B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410384429.9A CN117974832B (en) 2024-04-01 2024-04-01 Multi-modal liver medical image expansion algorithm based on generation countermeasure network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410384429.9A CN117974832B (en) 2024-04-01 2024-04-01 Multi-modal liver medical image expansion algorithm based on generation countermeasure network

Publications (2)

Publication Number Publication Date
CN117974832A true CN117974832A (en) 2024-05-03
CN117974832B CN117974832B (en) 2024-06-07

Family

ID=90861330

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410384429.9A Active CN117974832B (en) 2024-04-01 2024-04-01 Multi-modal liver medical image expansion algorithm based on generation countermeasure network

Country Status (1)

Country Link
CN (1) CN117974832B (en)

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
CN111882509A (en) * 2020-06-04 2020-11-03 江苏大学 Medical image data generation and detection method based on generation countermeasure network
CN113808031A (en) * 2021-07-05 2021-12-17 重庆师范大学 Image restoration method based on LSK-FNet model
CN113902630A (en) * 2021-09-01 2022-01-07 西安电子科技大学 Method for generating confrontation network image restoration based on multi-scale texture feature branch
CN113962893A (en) * 2021-10-27 2022-01-21 山西大学 Face image restoration method based on multi-scale local self-attention generation countermeasure network
CN113989129A (en) * 2021-09-01 2022-01-28 西安电子科技大学 Image restoration method based on gating and context attention mechanism
CN114596285A (en) * 2022-03-09 2022-06-07 南京邮电大学 Multitask medical image enhancement method based on generation countermeasure network
CN115018727A (en) * 2022-06-14 2022-09-06 中国地质大学(武汉) Multi-scale image restoration method, storage medium and terminal
EP4075376A1 (en) * 2021-04-13 2022-10-19 Fundación Universidad Francisco de Vitoria Method for obtaining an image with a restored object
CN115578404A (en) * 2022-11-14 2023-01-06 南昌航空大学 Liver tumor image enhancement and segmentation method based on deep learning
CN116664446A (en) * 2023-06-28 2023-08-29 南京理工大学 Lightweight dim light image enhancement method based on residual error dense block
CN117152433A (en) * 2023-09-01 2023-12-01 太原理工大学 Medical image segmentation method based on multi-scale cross-layer attention fusion network

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YING CHEN ET AL: "MS-FANet: Multi-scale feature attention network for liver tumor segmentation", 《COMPUTERS IN BIOLOGY AND MEDICINE》, 30 September 2023 (2023-09-30) *
陈英等: "医学图像数据集扩充方法研究进展", 《生物医学工程学杂志》, 21 February 2023 (2023-02-21) *

Also Published As

Publication number Publication date
CN117974832B (en) 2024-06-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant