CN114240797B - OCT image denoising method, device, equipment and medium - Google Patents

OCT image denoising method, device, equipment and medium

Info

Publication number
CN114240797B
CN114240797B (application CN202111579290.6A)
Authority
CN
China
Prior art keywords
feature
layer
image
attention
network
Prior art date
Legal status
Active
Application number
CN202111579290.6A
Other languages
Chinese (zh)
Other versions
CN114240797A (en)
Inventor
黄梦醒
曾莉荣
冯思玲
毋媛媛
冯文龙
张雨
吴迪
Current Assignee
Hainan University
Original Assignee
Hainan University
Priority date
Filing date
Publication date
Application filed by Hainan University
Priority to CN202111579290.6A
Publication of CN114240797A
Application granted
Publication of CN114240797B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00 Image enhancement or restoration
    • G06T 5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10072 Tomographic images
    • G06T 2207/10101 Optical tomography; Optical coherence tomography [OCT]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses an OCT image denoising method, device, equipment and medium. The method comprises the following steps: obtaining OCT images to form a sample data set, which is divided into a training set and a verification set; constructing a feature fusion attention-intensive network for OCT image denoising, whose main structure comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block; inputting the sample data set into the feature fusion attention-intensive network for training and testing until convergence; and inputting the image to be processed into the trained network to obtain a clean, denoised OCT image. Building on a conventional deep convolutional network, the invention provides a network that progressively fuses features and focuses attention, on the basis of which OCT image noise can be removed effectively.

Description

OCT image denoising method, device, equipment and medium
Technical Field
The invention relates to the technical field of image processing, in particular to an OCT image denoising method, device, equipment and medium.
Background
Optical coherence tomography (OCT) has been widely used as an effective diagnostic tool in ophthalmology. However, the OCT imaging process is inevitably disturbed by speckle arising from the sample and by detector noise, which degrades image quality and thereby severely impacts subsequent processing and analysis of the OCT image. Unlike Gaussian noise, speckle noise is multiplicative and related to the microstructure of the tissue. Four types of noise are common at present: speckle noise, Gaussian noise, Poisson noise and impulse noise, of which speckle noise is the most common and representative. The invention mainly studies denoising algorithms for speckle noise. Image denoising is an essential step in many image processing algorithms; it helps further distinguish and interpret image content and is of great significance in the field of image processing.
Image denoising aims to remove noise from a corrupted image and restore it to a clean image. Designing a denoising method suited to the types of noise present in an image is a challenging task. Conventional image denoising methods generally fall into two classes: time-domain denoising and frequency-domain denoising. Although these methods can remove noise well, they tend to over-smooth edges and therefore lose detail information. Among time-domain methods, the bilateral filtering algorithm maintains a good filtering effect. The bilateral filter is a nonlinear filter that replaces the gray value of each pixel with a weighted average of the gray values of its neighboring pixels. Among frequency-domain methods, researchers proposed in 2014 the Variational Mode Decomposition (VMD) algorithm, an improvement on Empirical Mode Decomposition (EMD), which seeks an optimal value for each intrinsic mode component by a variational method to overcome the shortcomings of EMD; VMD has also been extended to two dimensions, i.e., the two-dimensional variational mode decomposition (2D-VMD) method. Traditional methods can achieve a certain denoising effect, but their results are not ideal on speckle-noise images, and image details are easily lost. In response to this deficiency, many researchers have proposed denoising with image prior models. Buades et al. proposed the Non-Local Means (NLM) denoising method, which makes full use of redundant information in the image and largely preserves detailed image features while denoising. The BM3D (Block-Matching and 3D filtering) algorithm proposed by Dabov et al. combines the ideas of non-local denoising and wavelet transform-domain denoising and achieves a good denoising effect. The PID (Progressive Image Denoising) algorithm proposed by Knaus et al. treats image denoising as a simple physical process and obtains good, artifact-free results by gradually reducing the noise. Gu et al. combined the non-local idea with low-rank approximation theory and proposed the Weighted Nuclear Norm Minimization (WNNM) denoising method. Although these methods can achieve a good denoising effect, their performance degrades greatly as the noise density increases.
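For illustration only (not part of the claimed method), the classical spatial-domain filters discussed above can be sketched with OpenCV as follows; the file name and all parameter values are assumptions for demonstration:

```python
# Illustrative sketch of classical denoising filters (bilateral filter and NLM).
# The input path and the filter parameters below are placeholder assumptions.
import cv2

noisy = cv2.imread("oct_bscan.png", cv2.IMREAD_GRAYSCALE)  # hypothetical file name

# Bilateral filter: a nonlinear filter that replaces each pixel's gray value with
# a weighted average of its neighbours, weighted by spatial and intensity distance.
bilateral = cv2.bilateralFilter(noisy, 9, 75, 75)

# Non-local means: averages patches that look similar anywhere in the image,
# which preserves detail better than purely local smoothing.
nlm = cv2.fastNlMeansDenoising(noisy, None, 10, 7, 21)
```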
Some conventional methods involve filtering, such as bilateral filters and wavelet filters. They can suppress speckle patterns, but are particularly prone to losing image detail, especially when image features are not of the same size. Non-local means (NLM) and block-matching 3D (BM3D) were originally designed to remove additive white Gaussian noise and have been successfully applied to various imaging techniques. More advanced digital denoising algorithms have also been applied to OCT images. However, these methods may blur contours and lose detail. Sparse representation [11] has become a modern denoising tool, but it ignores the inherent features of spectral signals.
At present, several studies have shown that denoising OCT images with deep learning networks can greatly improve the image recognition rate, including RED-Net and the persistent memory network (MemNet). In these works, a mathematical model of the noise is built and the noise is applied multiplicatively at random to OCT images to simulate labeled noisy images, thereby avoiding the labor-intensive work of generating a training data set. This approach has limitations: the noise in the training data is simulated, and because the noise in real OCT images is complex, a perfect simulation is hard to obtain. Using truly noisy data may therefore be more effective for building the training set. In contrast to noise modeling, a noise map can be obtained by subtracting a clean image from a noisy image. However, in an OCT scanning system, sample motion distorts the OCT image, so additional registration between B-scan images is necessary, and the scanning quality largely affects the denoising performance. Although FF-OCT can easily obtain multiple well-registered B-scans, it has the disadvantage that the imaging speed is limited by the sensor frame rate.
Disclosure of Invention
To solve the above technical problems, the invention provides an OCT image denoising method, device, equipment and medium, and proposes a progressive feature fusion attention-intensive network (DnFFA) built on a conventional deep convolutional network for removing speckle noise. First, local features of the image are extracted and a series of preprocessing steps are performed; the preprocessing strategy removes the background by enhancing contrast and eliminating noise. Then, feature fusion is applied to the network in a progressive manner to extract the global noise features, an attention mechanism is introduced to further improve the network, and finally the output feature maps of all dense blocks are fused and fed into the reconstruction output layer to obtain the denoised OCT image.
In order to achieve the purpose, the technical scheme of the invention is as follows:
an OCT image denoising method comprises the following steps:
obtaining an OCT image to form a sample data set, wherein the sample data set is divided into a training set and a verification set;
constructing a feature fusion attention intensive network for OCT image denoising, wherein the main structure of the feature fusion attention intensive network comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block, and the feature fusion attention intensive network is used for denoising the OCT image;
inputting a sample data set into the feature fusion attention intensive network for training and testing until convergence;
and inputting the image to be detected into the trained feature fusion attention-intensive network for processing to obtain a clear OCT image after denoising.
Preferably, the method further comprises: preprocessing the sample data set and the image to be detected.
Preferably, the preprocessing includes graying, cropping, and filtering.
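For illustration, such preprocessing can be sketched as follows (the low-pass split into background and detail layers anticipates step 112 of the embodiment described later); the Gaussian kernel size and sigma are assumptions:

```python
# Illustrative preprocessing sketch: graying, cropping/resizing to 224x224, and a
# low-pass split into low-frequency background and high-frequency detail layers.
# The Gaussian kernel size and sigma are assumptions, not values from the patent.
import cv2
import numpy as np

def preprocess(path: str, size: int = 224):
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)             # gray image f(x, y)
    img = cv2.resize(img, (size, size)).astype(np.float32)   # uniform 224 x 224 window
    y_low = cv2.GaussianBlur(img, (15, 15), 5)                # low-frequency background layer
    y_high = img - y_low                                      # high-frequency detail layer, y = y_high + y_low
    return y_high, y_low
```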
Preferably, the sparse block is used to learn the noise distribution in the image after each residual network receives a new fused feature;
the feature enhancement block is used for shallow feature extraction from the noise-containing image;
the attention mechanism is used to evaluate the importance of the information in each channel between layers, to extract a spatial attention heat map when evaluating the importance of the same position across multiple channels, and finally to multiply the heat map back onto the original channels;
and the reconstruction block is used to fuse the global features output by the previous layers and keep the output the same size as the input.
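For illustration, one possible reading of this channel-plus-spatial attention (a CBAM-style interpretation, not necessarily the exact module of the invention) can be sketched in PyTorch as follows; the reduction ratio and the 7 × 7 spatial kernel are assumptions:

```python
# Illustrative channel + spatial attention sketch (an interpretation of the
# mechanism described above; reduction ratio and kernel size are assumptions).
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: score how important each feature channel is.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: score how important each spatial position is.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = x * self.channel_gate(x)               # weight each channel
        avg_map = x.mean(dim=1, keepdim=True)      # importance of each position (mean over channels)
        max_map, _ = x.max(dim=1, keepdim=True)    # importance of each position (max over channels)
        heat = self.spatial_gate(torch.cat([avg_map, max_map], dim=1))
        return x * heat                            # multiply the attention heat map back onto the channels
```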
Preferably, the feature fusion attention-intensive network is divided into fifteen layers. The first layer is Conv + ReLU, with 128 convolution kernels of size 3 × 3, and performs shallow feature extraction on the input noise-containing image. The second layer is Conv, with 48 convolution kernels of size 3 × 3, and adjusts the channel size of the feature map. The third to twelfth layers comprise 4 Dense Block modules (four residual blocks) with a Transition layer between every two modules; in each Dense Block module all residual structures use 64 convolution kernels of size 3 × 3, each residual network learns the noise distribution in the image after new fused features are input, and the Transition layers use 48 convolution kernels of size 1 × 1. The thirteenth layer is a Concat layer that fuses the feature modules. The fourteenth layer is the attention mechanism layer: the feature map generated by the dense connection blocks is fed into the attention module to generate a new feature map, which is added to the feature map produced by the skip connection at the encoder end; the result is passed through a 3 × 3 deconvolution and sent back into the dense connection blocks, and this operation is repeated; finally, the high-resolution feature map generated by the network is converted by a 1 × 1 convolution mapping into a feature map with 2 channels. The fifteenth layer is a reconstruction output layer with a single convolution kernel, which fuses the global features of the previous layer and keeps the output the same size as the input.
Preferably, the loss function expression of the feature fusion attention-intensive network is as follows:
L(θ) = (1/(2N)) Σ_{i=1}^{N} || R(y_i) - (y_i - x_i) ||^2
where R(y_i) is the residual image predicted from the noisy input, y_i is the input noisy image, x_i is the corresponding noise-free image, (y_i - x_i) is the standard residual image, and N is the number of input samples.
Preferably, the weight update of the feature fusion attention-intensive network is as follows:
W_l^{b+1} = W_l^{b} + ΔW_l^{b}
ΔW_l^{b} = -α · ∂L / ∂W_l^{b}
where W is the convolution kernel, l indexes the convolution layer, b is the number of iterations, and α is the learning rate.
An OCT image denoising apparatus, comprising: an acquisition module, a construction module, a model acquisition module and a detection module, wherein,
the acquisition module is used for acquiring an OCT image to form a sample data set, and the sample data set is divided into a training set and a verification set;
the construction module is used for constructing a feature fusion attention intensive network for OCT image denoising, and the main structure of the feature fusion attention intensive network comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block and is used for denoising the OCT image;
the model acquisition module is used for inputting the sample data set into the feature fusion attention intensive network for training and testing until convergence;
and the detection module is used for inputting the image to be detected into the trained feature fusion attention-intensive network for processing to obtain the clear OCT image after denoising processing.
A computer device, comprising: a memory for storing a computer program; a processor for implementing an OCT image denoising method as described in any one of the above when executing the computer program.
A readable storage medium having stored thereon a computer program which, when executed by a processor, implements an OCT image denoising method as recited in any one of the above.
Based on the technical scheme, the invention has the beneficial effects that:
1) The invention arranges densely connected dense blocks in the network; each layer takes the preceding feature maps as input, and, following a progressive idea, the shallow convolutional feature maps are connected by shortcut connections to the deep convolutional feature maps extracted from each dense block in turn to form residual blocks, so that the network can better predict the noise distribution;
2) The invention adopts a DenseNet with an attention mechanism to focus attention on key features and suppress irrelevant features during feature extraction;
3) The dense shortcut structure designed by the invention effectively reduces the computational complexity of the network, reduces a large number of network parameters and shortens the computation time of the algorithm.
Drawings
FIG. 1 is a flow chart of an OCT image denoising method in one embodiment;
FIG. 2 is a block diagram of a feature fusion attention-intensive network architecture;
FIG. 3 shows the original test images of one embodiment at noise levels σ = 15, σ = 25 and σ = 40;
FIG. 4 shows the denoising results of one embodiment at noise levels σ = 15, σ = 25 and σ = 40;
FIG. 5 is a schematic structural diagram of an OCT image denoising apparatus in an embodiment;
FIG. 6 is a diagram illustrating an internal structure of a computer device in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention.
As shown in fig. 1, the present embodiment provides an OCT image denoising method, which includes the following specific steps:
Step 110, acquiring an OCT image to form a sample data set, wherein the sample data set is divided into a training set and a verification set.
In this embodiment, 512 OCT images of 224 × 224 pixels are selected for the sample data set. To make network training converge more quickly, the images used in the experiment are uniformly cropped to 512 × 451 pixels (width × height), i.e., the central effective retinal region is taken; 412 images are cut into 200,000 sub-image blocks of 50 × 50 to serve as the training set, and the remaining 100 images serve as the verification set. Multiple experiments are carried out to verify the robustness and practicality of the network. The verification sets are all selected from noisy image data sets constructed from SD-OCT diabetic retina images of different patients.
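For illustration, the 50 × 50 training patches described above could be generated as in the following sketch; the sliding stride is an assumption, since the patent does not specify how the sub-image blocks are sampled:

```python
# Illustrative sketch of cutting a cropped B-scan into 50x50 training patches.
# The stride is an assumption; the patent only states the patch size and counts.
import numpy as np

def extract_patches(img: np.ndarray, patch: int = 50, stride: int = 20) -> np.ndarray:
    """Slide a patch x patch window over a grayscale image and stack the crops."""
    h, w = img.shape
    crops = [
        img[i:i + patch, j:j + patch]
        for i in range(0, h - patch + 1, stride)
        for j in range(0, w - patch + 1, stride)
    ]
    return np.stack(crops)

# Example on a placeholder 512 x 451 (width x height) cropped retina image.
bscan = np.zeros((451, 512), dtype=np.float32)   # (height, width) placeholder
patches = extract_patches(bscan)
print(patches.shape)                              # (number_of_patches, 50, 50)
```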
Step 120, constructing a feature fusion attention intensive network for OCT image denoising; the main structure of the feature fusion attention-intensive network comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block, and is used for denoising the OCT image.
In this embodiment, the feature fusion attention-intensive network (DnFFA) mainly uses four modules for image denoising, as shown in FIG. 2: a sparse block (DB), a feature enhancement block (FEB), an attention mechanism (AM) and a reconstruction block (RB). The rectified linear unit (ReLU) is adopted as the activation function of each convolutional layer; this rectification makes the network sparse, reduces the dependency between parameters and avoids the vanishing-gradient problem.
The sparse block (DB) learns the noise distribution in the image after each residual network receives new fused features. In the dense block structure the layers are linked by shortcut connections, and the output of each layer is used as the input of the next layer. The connection is expressed by the following formula:
M_l = X_l([M_0, M_1, ..., M_{l-1}])
where M_l denotes the output feature map of the l-th layer, [M_0, M_1, ..., M_{l-1}] denotes the channel-wise concatenation of the output feature maps of layers 0 to l-1 (the channels are stacked directly, with no other operation), and X_l denotes feeding the concatenated feature maps into the bottleneck layer. Because the output of each layer is used as the input of the next layer, all inputs are superimposed and the feature maps extracted by all previous layers are fused, so the number of input channels of later layers grows considerably. A 3 × 3 convolution layer is designed in the bottleneck layer to reduce the number of feature maps, which reduces the computational load of the network and improves the efficiency of gradient propagation.
The feature enhancement block (FEB) mainly performs shallow feature extraction on the noise-containing image. The attention mechanism (AM) evaluates the importance of the information in each channel between layers so as to raise the weight of important information; when evaluating the importance of the same position across multiple channels, it extracts a spatial attention heat map and finally multiplies the map back onto the original channels. The network performance can therefore be improved without increasing the number of network parameters. The reconstruction block (RB) fuses the global features output by the previous layers and keeps the output the same size as the input.
The DnFFA network is mainly divided into fifteen layers. The first layer is Conv + ReLU with 128 convolution kernels of size 3 × 3; its main function is shallow feature extraction from the input noise-containing image. The second layer is Conv with 48 convolution kernels of size 3 × 3; its main function is to adjust the channel size of the feature map, which makes it easier to remove redundant feature maps in the Dense Block modules and thus reduces the computational burden of the network. The core of the network lies in the third to twelfth layers, which mainly comprise 4 Dense Block modules (four residual blocks) with a Transition layer and a ReLU + Conv layer between modules. In each Dense Block module, all residual structures use 64 convolution kernels of size 3 × 3, and each residual network learns the noise distribution in the image after new fused features are input. The convolution kernels in the Transition layer are all 1 × 1, 48 in number, which further compresses the network parameters and fuses the channel features into new features. The thirteenth layer is a Concat layer that fuses the feature modules, i.e., it fuses the first and second Dense Block modules with the third, to strengthen the global features learned by the network. The fourteenth layer is the attention mechanism layer: the feature map generated by the dense connection blocks is fed into the attention module to generate a new feature map, which is added to the feature map produced by the skip connection at the encoder end; the result is passed through a 3 × 3 deconvolution and sent back into the dense connection blocks, and this is repeated; finally, the high-resolution feature map generated by the network is converted by a 1 × 1 convolution mapping into a feature map with 2 channels. The main function of this layer is to extract noise information hidden in a complex background, and the module is very effective for noisy images, whether the noise is real or blind. The fifteenth layer is a reconstruction output layer with a single convolution kernel, which fuses the global features of the previous layer and keeps the output the same size as the input, so that the predicted noise can more easily be separated from the noise-containing image through residual learning, and the output of the network is a cleaner image.
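For illustration, the dense connectivity M_l = X_l([M_0, M_1, ..., M_{l-1}]) with 3 × 3 bottleneck convolutions and 1 × 1 Transition compression described above can be sketched in PyTorch as follows; the kernel counts (64 and 48) follow the text, while the exact wiring inside the full network is an interpretation rather than the authoritative design:

```python
# Illustrative sketch of a Dense Block with 3x3 bottleneck convolutions and a
# 1x1 Transition layer, following the channel counts stated in the description.
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer receives the concatenation of all previous outputs: M_l = X_l([M_0, ..., M_{l-1}])."""
    def __init__(self, in_channels: int = 48, growth: int = 64, num_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            # 3x3 bottleneck: maps the growing concatenation to a fixed number of new maps.
            self.layers.append(nn.Sequential(
                nn.Conv2d(channels, growth, kernel_size=3, padding=1),
                nn.ReLU(inplace=True),
            ))
            channels += growth  # the next layer sees all previous feature maps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

class Transition(nn.Module):
    """1x1 convolution that compresses the concatenated channels back to 48."""
    def __init__(self, in_channels: int, out_channels: int = 48):
        super().__init__()
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.conv(x)

# Example: a 48-channel feature map passes through one dense block and its transition.
x = torch.randn(1, 48, 50, 50)
out = Transition(48 + 4 * 64)(DenseBlock()(x))
print(out.shape)  # torch.Size([1, 48, 50, 50])
```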
Step 130, inputting the sample data set into the feature fusion attention intensive network for training and testing until convergence.
In this embodiment, the noisy images are input into the designed feature fusion attention-intensive network during training, and the parameters are adjusted by back-propagation through a loss function until the network reaches convergence. In the testing stage, a noise-containing image is input into the converged network, and the corresponding predicted denoised image is output directly. The loss function used is as follows:
L(θ) = (1/(2N)) Σ_{i=1}^{N} || R(y_i) - (y_i - x_i) ||^2
where R(y_i) is the residual image predicted from the noisy input, y_i is the input noisy image, x_i is the corresponding noise-free image, (y_i - x_i) is the standard residual image, and N is the number of input samples. The network model is trained in batch mode: N noisy images are input each time and the corresponding outputs are produced by the feature fusion attention-intensive network; the loss function L is then computed, the parameters are optimized with the Adam (adaptive moment estimation) method, and the weights are updated as follows:
W_l^{b+1} = W_l^{b} + ΔW_l^{b}
ΔW_l^{b} = -α · ∂L / ∂W_l^{b}
where W is the convolution kernel, l indexes the convolution layer, b is the iteration number, and α is the learning rate, initially set to 10^-3 and decayed to 10^-4 as training proceeds. Through continuous iteration of the training process, the error between the estimated residual and the standard residual decreases, so the predicted denoised image comes closer to the original image and a better denoising effect is obtained.
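For illustration, one training step implementing this residual loss with the Adam optimizer and the stated 10^-3 to 10^-4 learning-rate decay might look like the following sketch; the tiny stand-in model, the batch handling and the decay milestone are assumptions:

```python
# Illustrative training-step sketch: residual (noise) prediction with an L2 loss
# and Adam. The stand-in model and the decay milestone are assumptions.
import torch
import torch.nn as nn

model = nn.Sequential(                      # stand-in for the DnFFA network
    nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(inplace=True),
    nn.Conv2d(64, 1, 3, padding=1))         # predicts the noise residual R(y)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)   # alpha starts at 1e-3
scheduler = torch.optim.lr_scheduler.MultiStepLR(           # decay toward 1e-4
    optimizer, milestones=[30], gamma=0.1)                   # (milestone assumed)
criterion = nn.MSELoss()

def train_step(noisy: torch.Tensor, clean: torch.Tensor) -> float:
    """One batch update: the network learns R(y_i) ~ y_i - x_i."""
    optimizer.zero_grad()
    predicted_residual = model(noisy)
    target_residual = noisy - clean          # standard residual (y_i - x_i)
    loss = 0.5 * criterion(predicted_residual, target_residual)
    loss.backward()
    optimizer.step()
    return loss.item()

# scheduler.step() would be called once per epoch to apply the learning-rate decay.
```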
Step 140, inputting the image to be detected into the trained feature fusion attention-intensive network for processing to obtain the denoised OCT image.
In one embodiment, in order to remove noise while preserving the original image features as much as possible, a preprocessing method based on the progressive feature fusion attention-intensive network is provided, which specifically comprises the following steps:
Step 111: first, read the speckle-noise image, convert it to gray scale to obtain f(x, y), and uniformly set the image window size to 224 × 224 pixels;
Step 112: decompose the image data set obtained in step 111 into a high-frequency detail layer y_high and a low-frequency background layer y_low using a low-pass filter, according to the formula
y = y_high + y_low
The high-frequency layer retains the pixels with strong gray-level changes, i.e., the pixels that have not yet been processed; the low-frequency layer of the processed image contains the remaining, already processed information, and its pixel values are all 0;
Step 113: the feature fusion attention-intensive network is fully pre-trained, so that the lower convolutional layers of the network can efficiently extract contour features such as edges and corners;
Step 114: after step 113, the weights of the higher layers of the network can be changed by training with a smaller amount of data, so that the network learns more abstract features;
Step 115: as can be seen from the above steps, the noise is essentially contained in the high-frequency layer, so the high-frequency layer is input into the residual network.
Experiment of
The experiment is carried out on a Linux system, with the Deep Learning Toolbox framework used for network training; the hardware configuration is an Intel Xeon CPU, 16 GB of memory and an 11 GB NVIDIA Tesla V100 PCIe GPU. The OCT image data come from a public dataset: the AI-filler 2018 dataset for automatic segmentation of the relevant images, which contains considerable speckle noise in 85 OCT retina cubes (1024 × 512 × 128); the original OCT images also contain many regions of pure noise. In the experiment, the larger the PSNR value, the higher the similarity between the denoised image and the original image, and the better the denoising effect. The PSNR is calculated as follows:
PSNR = 10 · log_10( (2^n - 1)^2 / MSE )
MSE = (1/(H × W)) Σ_{j=1}^{H} Σ_{k=1}^{W} ( M(j, k) - N(j, k) )^2
where M and N represent the predicted and true images respectively, j and k index the pixels of the image, H and W represent the height and width of the image, and n is 8.
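For illustration, the PSNR defined above can be computed as in the following sketch for 8-bit images (n = 8, peak value 255):

```python
# Illustrative PSNR computation for 8-bit images, matching the formula above.
import numpy as np

def psnr(pred: np.ndarray, ref: np.ndarray, n_bits: int = 8) -> float:
    mse = np.mean((pred.astype(np.float64) - ref.astype(np.float64)) ** 2)
    peak = 2 ** n_bits - 1                      # 255 for n = 8
    return 10.0 * np.log10(peak ** 2 / mse)     # larger PSNR means a closer match
```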
To evaluate the denoising effect more objectively and fairly, five algorithms are compared in the experiment: WNNM, BM3D, NLM, DnCNN and the proposed DnFFA. FIG. 3 shows the original test images; the verification set for the experiment comes from OCT images, and the denoising performance of each algorithm is compared subjectively and objectively when white Gaussian noise of level 15, 25 and 40 is added to the verification set. In the image denoising field, BM3D is considered the best denoising algorithm, and DnCNN is an advanced denoising algorithm in the deep learning field. The denoising results are shown in FIG. 4. It can be seen that the WNNM algorithm is very limited in capturing the characteristics of the image structure; the BM3D result preserves edges well and reduces noise, and although many distinct noise points can still be seen, their number is clearly smaller than with wavelet denoising; the DnCNN algorithm performs well on white Gaussian noise; and NLM retains more detail information, but its edges are still not sharp enough. The experimental results are shown in the following table, with bold font marking the best values.
As can be seen from Table 1, at noise levels 15, 25 and 40 the denoising effect of DnFFA is the best; compared with BM3D, regarded as the best denoising algorithm in the field, the PSNR of DnFFA is higher by about 0.31 dB on average, and compared with the other algorithms DnFFA achieves the best PSNR values.
TABLE 1 average PSNR (dB) of the denoising results of OCT test plots under 5 algorithms
According to the OCT image quality reference index, a larger PSNR indicates better image quality. Table 1 shows that the denoising results of the DnFFA algorithm obtain higher PSNR values under the different noise levels, and the PSNR gain grows as the noise level increases, so the improvement becomes more and more pronounced. The comparison of the five algorithms shows that, to a certain extent, the DnFFA network structure achieves a better denoising effect.
In one embodiment, as shown in fig. 5, an OCT image denoising apparatus 200 includes: an acquisition module 210, a construction module 220, a model acquisition module 230, and a detection module 240, wherein,
the obtaining module 210 is configured to obtain an OCT image to form a sample data set, where the sample data set is divided into a training set and a verification set;
the constructing module 220 is configured to construct a feature fusion attention-intensive network for OCT image denoising, where a main structure of the feature fusion attention-intensive network includes a sparse block, a feature enhancing block, an attention mechanism, and a reconstruction block, and is configured to denoise an OCT image;
the model obtaining module 230 is configured to input a sample data set into the feature fusion attention-intensive network for training and testing until convergence;
the detection module 240 is configured to input the image to be detected into the trained feature fusion attention-intensive network for processing, so as to obtain a clear OCT image after denoising processing.
The apparatuses or modules illustrated in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 6. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an OCT image denoising method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the configuration shown in fig. 6 is a block diagram of only a portion of the configuration associated with the present application, and is not intended to limit the computing device to which the present application may be applied, and that a particular computing device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, a readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the OCT image denoising method as described in any one of the above.
It will be understood by those skilled in the art that all or part of the processes of the OCT image denoising method according to the above embodiments may be implemented by a computer program, which may be stored in a non-volatile computer-readable storage medium, and the computer program may include the processes of the above embodiments of the methods when executed. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory, among others. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), rambus (Rambus) direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
The above description is only the preferred embodiment of the OCT image denoising method, apparatus, device and medium disclosed in the present invention, and is not intended to limit the scope of protection of the embodiments of the present specification. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the embodiments of the present disclosure should be included in the protection scope of the embodiments of the present disclosure.

Claims (9)

1. An OCT image denoising method is characterized by comprising the following steps:
obtaining an OCT image to form a sample data set, wherein the sample data set is divided into a training set and a verification set;
constructing a feature fusion attention-intensive network for OCT image denoising, wherein the main structure of the feature fusion attention-intensive network comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block and is used for denoising an OCT image, and the feature fusion attention-intensive network is divided into fifteen layers, wherein the first layer is Conv + ReLU, the convolution kernel of the convolution layer is 3 × 3, the number of the convolution kernels is 128, and the feature fusion attention-intensive network is used for performing shallow feature extraction on an input image containing noise; the second layer is Conv, the size of the convolution kernel is 3 x 3, the number of the convolution kernels is 48, and the second layer is used for adjusting the size of the channel of the feature map; third to twelfth layers: the image fusion system comprises 4 Dense Block modules, 4 residual blocks and Transition layers among the Dense Block modules, wherein the sizes of convolution kernels of all residual structures in the Dense Block modules are 3 × 3, the numbers of the convolution kernels are 64 respectively, each residual network can learn the noise distribution in an image after new fusion characteristics are input, and the sizes of convolution kernels in the Transition layers are 1 × 1 and the numbers of the convolution kernels are 48; the thirteenth layer is a Concat layer and is used for fusing the feature modules; the fourteenth layer is an attention mechanism layer, the feature graph generated by the dense connecting blocks is sent to an attention module to generate a new feature graph, the new feature graph is added to the feature graph generated by jumping connection at the encoder end, then the feature graph is subjected to deconvolution by 3 x 3 and then sent to the dense connecting blocks, and the operation is repeated in such a way, and finally the high-resolution feature graph generated by the network is converted into the feature graph with the channel number of 2 through convolution mapping of 1 x 1; the fifteenth layer is a reconstruction output layer of a single convolution kernel and is used for fusing the global features of the previous layer and keeping the size of the output consistent with that of the input;
inputting the sample data set into the feature fusion attention intensive network for training and testing until convergence;
and inputting the image to be detected into the trained feature fusion attention-intensive network for processing to obtain a clear OCT image after denoising.
2. The OCT image denoising method of claim 1, further comprising: and preprocessing the sample data set and the image to be detected.
3. The method of claim 2, wherein the preprocessing comprises graying, cropping, and filtering.
4. The method for denoising the OCT image according to claim 1, wherein the sparse block is used for learning noise distribution in the image after inputting the new fusion feature for each residual network;
the characteristic enhancement block is used for extracting an image containing noise in a shallow layer;
the attention mechanism is used for summarizing and evaluating the importance of information of each channel between layers, extracting an attention heat map of a space when summarizing and evaluating the importance of the same position of multiple channels, and finally multiplying the attention heat map back to the original channel;
and the reconstruction block is used for fusing the global features output by the previous layer and keeping the sizes of the input and the output consistent.
5. The OCT image denoising method of claim 1, wherein the loss function expression of the feature fusion attention-intensive network is as follows:
L(θ) = (1/(2N)) Σ_{i=1}^{N} || R(y_i) - (y_i - x_i) ||^2
where R(y_i) is the residual image predicted from the noisy input, y_i is the input noisy image, x_i is the corresponding noise-free image, (y_i - x_i) is the standard residual image, and N is the number of input samples.
6. The OCT image denoising method of claim 1, wherein the weight value of the feature fusion attention-intensive network is updated as follows:
W_l^{b+1} = W_l^{b} + ΔW_l^{b}
ΔW_l^{b} = -α · ∂L / ∂W_l^{b}
where W is the convolution kernel, l indexes the convolution layer, b is the number of iterations, and α is the learning rate.
7. An OCT image denoising apparatus, comprising: an acquisition module, a construction module, a model acquisition module and a detection module, wherein,
the acquisition module is used for acquiring an OCT image to form a sample data set, and the sample data set is divided into a training set and a verification set;
the construction module is used for constructing a feature fusion attention-intensive network for OCT image denoising, the main structure of the feature fusion attention-intensive network comprises a sparse block, a feature enhancement block, an attention mechanism and a reconstruction block and is used for denoising the OCT image, the feature fusion attention-intensive network module is divided into fifteen layers, wherein the first layer is Conv + ReLU, the convolution kernel of the convolution layer is 3 × 3, the number of the convolution kernels is 128, and the shallow feature extraction is performed on the input image containing noise; the second layer is Conv, the size of the convolution kernel is 3 x 3, the number of the convolution kernels is 48, and the second layer is used for adjusting the size of the channel of the feature map; third to twelfth layers: the image fusion system comprises 4 Dense Block modules, 4 residual blocks and Transition layers among the Dense Block modules, wherein the sizes of convolution kernels of all residual structures in the Dense Block modules are 3 × 3, the numbers of the convolution kernels are 64 respectively, each residual network can learn the noise distribution in an image after new fusion characteristics are input, and the sizes of convolution kernels in the Transition layers are 1 × 1 and the numbers of the convolution kernels are 48; the thirteenth layer is a Concat layer and is used for fusing the feature modules; the fourteenth layer is an attention mechanism layer, the feature graph generated by the dense connecting blocks is sent to an attention module to generate a new feature graph, the new feature graph is added to the feature graph generated by the jumping connection of the encoder end, then the feature graph is subjected to deconvolution by 3 × 3 and then sent to the dense connecting blocks, the operation is repeated in this way, and finally the high-resolution feature graph generated by the network is converted into the feature graph with the channel number of 2 through convolution mapping of 1 × 1; the fifteenth layer is a reconstruction output layer of a single convolution kernel and is used for fusing the global features of the previous layer and enabling the size of the output to be consistent with the size of the input;
the model acquisition module is used for inputting the sample data set into the feature fusion attention intensive network for training and testing until convergence;
and the detection module is used for inputting the image to be detected into the trained feature fusion attention-intensive network for processing to obtain the clear OCT image after denoising processing.
8. A computer device, comprising: a memory for storing a computer program; a processor for implementing an OCT image denoising method as claimed in any one of claims 1 to 6 when executing the computer program.
9. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when being executed by a processor, implements an OCT image denoising method according to any one of claims 1 to 6.
CN202111579290.6A 2021-12-22 2021-12-22 OCT image denoising method, device, equipment and medium Active CN114240797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111579290.6A CN114240797B (en) 2021-12-22 2021-12-22 OCT image denoising method, device, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111579290.6A CN114240797B (en) 2021-12-22 2021-12-22 OCT image denoising method, device, equipment and medium

Publications (2)

Publication Number Publication Date
CN114240797A (en) 2022-03-25
CN114240797B (en) 2023-04-18

Family

ID=80761218

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111579290.6A Active CN114240797B (en) 2021-12-22 2021-12-22 OCT image denoising method, device, equipment and medium

Country Status (1)

Country Link
CN (1) CN114240797B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114936974A (en) * 2022-05-12 2022-08-23 中山大学中山眼科中心 Semi-supervised OCT image denoising method and device based on attention mechanism
CN114675118B (en) * 2022-05-30 2022-08-23 广东电网有限责任公司佛山供电局 Transformer winding abnormality detection method, device, equipment and storage medium
CN114972130B (en) * 2022-08-02 2022-11-18 深圳精智达技术股份有限公司 Training method, device and training equipment for denoising neural network
CN116797493B (en) * 2023-08-02 2024-01-26 北京中科闻歌科技股份有限公司 Image denoising processing system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10043243B2 (en) * 2016-01-22 2018-08-07 Siemens Healthcare Gmbh Deep unfolding algorithm for efficient image denoising under varying noise conditions
US11055574B2 (en) * 2018-11-20 2021-07-06 Xidian University Feature fusion and dense connection-based method for infrared plane object detection
CN110675330A (en) * 2019-08-12 2020-01-10 广东石油化工学院 Image rain removing method of encoding-decoding network based on channel level attention mechanism
CN111242862B (en) * 2020-01-09 2022-02-22 西安理工大学 Multi-scale fusion parallel dense residual convolution neural network image denoising method
CN111292259A (en) * 2020-01-14 2020-06-16 西安交通大学 Deep learning image denoising method integrating multi-scale and attention mechanism
CN113191983A (en) * 2021-05-18 2021-07-30 陕西师范大学 Image denoising method and device based on deep learning attention mechanism
CN113610719A (en) * 2021-07-19 2021-11-05 河南大学 Attention and dense connection residual block convolution kernel neural network image denoising method

Also Published As

Publication number Publication date
CN114240797A (en) 2022-03-25

Similar Documents

Publication Publication Date Title
CN114240797B (en) OCT image denoising method, device, equipment and medium
CN108053417B (en) lung segmentation device of 3D U-Net network based on mixed rough segmentation characteristics
Singh et al. Feature enhancement in medical ultrasound videos using contrast-limited adaptive histogram equalization
CN105761216B (en) A kind of image denoising processing method and processing device
CN107248155A (en) A kind of Cerebral venous dividing method based on SWI images
CN110706174A (en) Image enhancement method, terminal equipment and storage medium
CN108830791B (en) Image super-resolution method based on self sample and sparse representation
Mustafa et al. Image enhancement technique on contrast variation: a comprehensive review
CN113362338B (en) Rail segmentation method, device, computer equipment and rail segmentation processing system
CN115131452A (en) Image processing method and device for artifact removal
Zou et al. 3D filtering by block matching and convolutional neural network for image denoising
CN104036472A (en) Method and device for enhancing quality of 3D image
CN115082336A (en) SAR image speckle suppression method based on machine learning
CN115205136A (en) Image rain removing method based on Fourier prior
CN111340699B (en) Magnetic resonance image denoising method and device based on non-local prior and sparse representation
CN117830134A (en) Infrared image enhancement method and system based on mixed filtering decomposition and image fusion
Ma et al. Edge-guided cnn for denoising images from portable ultrasound devices
Bania et al. Adaptive trimmed median filter for impulse noise detection and removal with an application to mammogram images
Zhang et al. Deep residual network based medical image reconstruction
Chen et al. Ultrasound image denoising with multi-shape patches aggregation based non-local means
Gupta et al. A new computational approach for edge-preserving image decomposition
CN115526785A (en) OCT image denoising method based on FCFDS improved NLM
Oroumchian ECEI Spectrogram Denoising and Feature Extraction Using a Multi-Wavelet Convolutional Neural Network Model
Saoji et al. Speckle and rician noise removal from medical images and Ultrasound images
Shang et al. A non-convex low-rank image decomposition model via unsupervised network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant