CN111402174A - Single OCT B-scan image denoising method and device - Google Patents


Info

Publication number: CN111402174A
Application number: CN202010259216.5A
Authority: CN (China)
Prior art keywords: layer, image, scan, convolution, images
Legal status: Pending (the legal status is an assumption, not a legal conclusion; Google has not performed a legal analysis)
Other languages: Chinese (zh)
Inventors: 杨卓榛, 汪霄
Current and original assignee: Beijing Topi Imaging Technology Co ltd (the listed assignees may be inaccurate)
Application filed by Beijing Topi Imaging Technology Co ltd
Priority claimed from CN202010259216.5A
Publication of CN111402174A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00: Image enhancement or restoration
    • G06T5/70: Denoising; Smoothing
    • G06T2207/00: Indexing scheme for image analysis or image enhancement
    • G06T2207/10: Image acquisition modality
    • G06T2207/10072: Tomographic images
    • G06T2207/10101: Optical tomography; Optical coherence tomography [OCT]
    • G06T2207/20: Special algorithmic details
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a method and a device for denoising a single OCT B-scan image. The method comprises the following steps: collecting multiple groups of B-scan images, where each group contains several images obtained by scanning the same position at the same time, so that they share the same content but carry independent, identically distributed noise; registering the images within each group and forming sample pairs to construct a data set; training a shallow U-shaped network model on the data set as an image denoising model; and inputting a single image to be denoised into the image denoising model to obtain a denoised image at the output. Because single-B-scan denoising maps an input image to an output image of the same resolution, and the task is relatively simple and does not depend strongly on high-level semantic information, a shallow U-shaped network gives both better results and higher efficiency. Moreover, the invention uses two images with identical content but random noise as a training sample pair, so no high-definition images need to be acquired, which reduces the difficulty of data collection and eases implementation.

Description

Single OCT B-scan image denoising method and device
Technical Field
The invention relates to the field of OCT image processing, in particular to a method and a device for denoising a single OCT B-scan image.
Background
With the rapid development of computer and medical technology, OCT (Optical Coherence Tomography) has been widely used in the diagnosis of fundus diseases and is of great significance in the detection and treatment of ophthalmic diseases. OCT is a high-sensitivity, high-resolution, high-speed, non-invasive tomographic imaging modality that uses optical coherence to image the fundus. Each depth scan is called an A-scan; multiple adjacent, consecutive A-scans combined together are called a B-scan, which is the commonly seen OCT cross-sectional view and one of the most important imaging modalities in medical diagnosis.
In practice, a single B-scan image tends to contain considerable noise and to be of low quality. One prior-art denoising scheme scans the same position to obtain multiple images and fuses them after registration; however, the inventors found in the course of implementing the invention that such multi-scan fusion cannot denoise a single B-scan image. Another prior-art scheme applies a filter from traditional computer vision, such as a Gaussian, mean, or median filter; however, the inventors found that the filtering result is not ideal: the image still contains noise and becomes blurred. The prior art therefore lacks an effective scheme for denoising a single OCT B-scan.
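As a minimal illustration of why plain filtering is unsatisfactory, the NumPy toy example below (an illustration, not from the patent) applies a 3 × 3 mean filter to a noisy step edge: the noise variance drops in flat regions, but the edge is smeared.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "B-scan": a vertical step edge plus i.i.d. zero-mean Gaussian noise.
clean = np.zeros((64, 64))
clean[:, 32:] = 1.0
noisy = clean + rng.normal(0.0, 0.2, clean.shape)

def mean_filter3(img):
    """3x3 mean filter built from circular shifts (edges wrap around)."""
    out = np.zeros_like(img)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            out += np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out / 9.0

smoothed = mean_filter3(noisy)

# Noise shrinks in flat regions (std drops by ~3x for a 9-pixel average)...
assert smoothed[:, 2:16].std() < noisy[:, 2:16].std() / 2
# ...but the sharp edge is smeared: pixels next to it take intermediate values.
assert 0.1 < smoothed[32, 31] < 0.6
```

This is the trade-off described above: stronger smoothing removes more noise but blurs anatomical boundaries further.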
Disclosure of Invention
The embodiment of the invention provides a method and a device for denoising a single OCT B-scan image, which aim to solve the denoising problem of the single OCT B-scan image.
According to a first aspect of the embodiments of the present invention, there is provided a single OCT B-scan image denoising method, including:
collecting a plurality of groups of B-scan images, wherein each group comprises a plurality of B-scan images obtained by scanning the same position at the same time, having the same content and independent, identically distributed noise;
registering each B-scan image in each group to form a sample pair so as to construct a data set;
training an image denoising model by using the data set, wherein the image denoising model is a shallow U-shaped network model;
and inputting a single B-scan image to be denoised into the trained image denoising model, and obtaining a denoised image at an output end.
Optionally, registering the respective B-scan images in each group includes:
for each group of B-scan images, selecting one B-scan image in the current group as a reference, and respectively aligning other B-scan images in the current group with the selected B-scan image.
Optionally, aligning the other B-scan images in the current group with the selected B-scan image respectively includes:
calculating, based on the Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image;
and aligning each of the other B-scan images in the current group with the selected B-scan image according to its position of maximum cross-correlation.
Optionally, aligning the other B-scan images in the current group with the selected B-scan image according to the position with the maximum cross-correlation, respectively, includes:
and respectively aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation, keeping only a specified range around the center of each image.
Optionally, the U-shaped network model has 3 layers, comprising two downsamplings and two upsamplings; the encoding and decoding paths of the U-shaped network model share the same network structure, the decoding layers are connected to the encoding layers by concat connections, and the input and output are connected by a residual connection.
Optionally, the first encoding layer of the U-shaped network model performs convolution twice with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs convolution twice with 64 filters and a 3 × 3 kernel, where the first convolution has stride 2 (realizing downsampling) and the second has stride 1;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer, then performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, and one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through a convolution with 1 filter, a 3 × 3 kernel, and stride 1.
Optionally, before the single B-scan image to be denoised is input into the trained image denoising model, the method further includes:
and taking the image after the registration averaging of the plurality of B-scan images as a standard, and evaluating the denoising performance of the image denoising model.
According to a second aspect of the embodiments of the present invention, there is provided a single OCT B-scan image denoising apparatus, including:
the image acquisition module is used for acquiring a plurality of groups of B-scan images, wherein each group comprises a plurality of B-scan images obtained by scanning the same position at the same time, having the same content and independent, identically distributed noise;
the data set construction module is used for forming a sample pair after registering each B-scan image in each group so as to construct a data set;
the model training module is used for training an image denoising model by using the data set, wherein the image denoising model is a shallow U-shaped network model;
and the model application module is used for inputting the single B-scan image to be denoised into the image denoising model after training is completed, and obtaining a denoised image at the output end.
Optionally, when the data set building module aligns the B-scan images in each group, the data set building module is specifically configured to:
for each group of B-scan images, selecting one B-scan image in the current group as a reference, and respectively aligning other B-scan images in the current group with the selected B-scan image.
Optionally, when aligning the other B-scan images in the current group with the selected B-scan image, the data set constructing module is specifically configured to:
calculating, based on the Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image; and aligning each of the other B-scan images in the current group with the selected B-scan image according to its position of maximum cross-correlation.
Optionally, when the data set constructing module aligns other B-scan images in the current group with the selected B-scan image according to the position with the maximum cross-correlation, the data set constructing module is specifically configured to:
and respectively aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation, keeping only a specified range around the center of each image.
Optionally, the U-shaped network model has 3 layers, comprising two downsamplings and two upsamplings; the encoding and decoding paths of the U-shaped network model share the same network structure, the decoding layers are connected to the encoding layers by concat connections, and the input and output are connected by a residual connection.
Optionally, the first encoding layer of the U-shaped network model performs convolution twice with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs convolution twice with 64 filters and a 3 × 3 kernel, where the first convolution has stride 2 (realizing downsampling) and the second has stride 1;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer, then performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, and one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through a convolution with 1 filter, a 3 × 3 kernel, and stride 1.
Optionally, the apparatus further comprises:
and the model evaluation module is used for evaluating the denoising performance of the image denoising model by taking an image obtained by registering and averaging a plurality of B-scan images as a standard before inputting a single B-scan image to be denoised into the trained image denoising model.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
the inventor finds in the process of implementing the invention that for the B-scan image denoising task, the input and output are the same resolution, in other words, the output of the task is high resolution and spatially correlated, so a U-type network with better performance on such task is used; meanwhile, the denoising task is relatively simple and does not depend on high-level semantic information strongly, so that the U-type network is set to be a shallow layer (for example, 3 layers), and the efficiency of the model is improved; the scheme of the invention also adopts a Noise2Noise idea, namely two images with the same content and random Noise can be used as a sample pair to train the model, and a high-definition image does not need to be obtained as a label. Therefore, the method can solve the denoising problem of a single OCT B-scan image, and has the advantages of high efficiency, convenience in implementation and the like.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention, as claimed.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below; those skilled in the art can derive other drawings from these without inventive effort. These descriptions should not be construed as limiting the embodiments. Elements having the same reference number are identified as similar elements throughout the figures, and the drawings are not to scale unless otherwise specified.
FIG. 1 is a flow diagram illustrating a single OCT B-scan image denoising method according to an exemplary embodiment of the invention;
FIG. 2 is a schematic diagram of a 3-layer U-type network shown in accordance with an exemplary embodiment of the present invention;
FIG. 3 is a flow diagram illustrating a single OCT B-scan image denoising method according to an exemplary embodiment of the invention;
FIG. 4 is a schematic diagram illustrating a sample pair in a data set according to an exemplary embodiment of the present invention;
FIG. 5 is a schematic diagram of a 3-layer U-type network shown in accordance with an exemplary embodiment of the present invention;
FIG. 6 is a clear picture after registration averaging shown in accordance with an exemplary embodiment of the present invention;
FIG. 7 is a single B-scan picture containing noise shown in accordance with an exemplary embodiment of the present invention;
FIG. 8 illustrates the effect of denoising a picture according to an exemplary embodiment of the present invention;
FIG. 9 is a schematic diagram illustrating a single OCT B-scan image denoising apparatus according to an exemplary embodiment of the present invention;
FIG. 10 is a schematic diagram illustrating a single OCT B-scan image denoising apparatus according to an exemplary embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
FIG. 1 is a flow chart illustrating a single OCT B-scan image denoising method according to an exemplary embodiment of the invention. The present embodiment can be applied to a system composed of an OCT apparatus, a computer, and the like.
Referring to fig. 1, the method may include:
and S101, acquiring a plurality of groups of B-scan images, wherein each group of B-scan images comprises a plurality of B-scan images which are obtained by scanning the same position at the same time and have the same content and independent and same noise distribution.
For example, the same location may be the same retinal location in the same eye of the same person. As an example, M sets may be acquired, each set containing N B-scan images of the same location.
S102: register the B-scan images within each group and form sample pairs so as to construct a data set. For example, N × (N-1) training pairs may be formed within each group.
This embodiment adopts the Noise2Noise idea: for zero-mean, independent, identically distributed noise, when MSE (Mean Squared Error) is used as the loss function, two images with identical content but random noise can serve as a sample pair for training a denoising model, so no high-definition image needs to be acquired as a label. Since medical data are relatively difficult to acquire, adopting the Noise2Noise idea reduces the difficulty of data collection.
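A toy NumPy check (an illustration under the stated zero-mean assumption, not from the patent) of why training against noisy targets works under MSE: the constant minimizing the squared error against zero-mean noisy copies of a value is their sample mean, which converges to the clean value itself.

```python
import numpy as np

rng = np.random.default_rng(42)

clean = 0.7                                        # the unknown clean pixel value
targets = clean + rng.normal(0.0, 0.3, size=10000)  # many noisy "labels" of it

# The constant c minimizing mean((c - targets)**2) is the sample mean. Under MSE,
# learning from noisy targets therefore recovers their expectation, which equals
# the clean value because the noise is zero-mean: a clean label is unnecessary.
c_star = targets.mean()
assert abs(c_star - clean) < 0.02
```

The same argument holds pixel-wise for a denoising network, which is why two noisy B-scans of the same content suffice as a training pair.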
To ensure model performance, the two images in each sample pair usually need to be registered. How to register the B-scan images within each group is not limited in this embodiment; those skilled in the art can select and design a registration method according to different requirements or scenarios, and such choices and designs can be used herein without departing from the spirit and scope of the present invention.
And S103, training an image denoising model by using the data set, wherein the image denoising model is a shallow layer U-shaped network model.
A U-shaped network can simultaneously capture low-level features such as edges and shapes and high-level semantic features; it is well suited to image-to-image computer vision problems and achieves high accuracy. The U-shaped network is an encoder-decoder structure: during encoding, downsampling enlarges the receptive field and captures deep semantic information; during decoding, concat connections directly combine low-level features with the features recovered by upsampling, yielding richer features.
As an example, the shallow network may have 3 layers. The input and output of the B-scan denoising task have the same resolution, on which a U-shaped network performs well; meanwhile, because the denoising task is relatively simple and does not depend strongly on high-level semantic information, a 3-layer U-shaped network can be used, improving model efficiency. The specific structure of the U-shaped network model is not limited in this embodiment; those skilled in the art can select and design it according to different requirements or scenarios without departing from the spirit and scope of the present invention.
By way of example, reference may be made to fig. 2, which is a simplified schematic diagram of a 3-layer U-type network architecture that may be used with embodiments of the present invention.
And S104, inputting a single B-scan image to be denoised into the image denoising model after training is finished, and obtaining a denoised image at an output end.
In this embodiment, because single-B-scan denoising maps an input image to an output image of the same resolution, and the task is relatively simple and does not depend strongly on high-level semantic information, a shallow U-shaped network is used, giving better results and higher efficiency. The embodiment also adopts the Noise2Noise idea: two images with identical content but random noise serve as a sample pair for training, so no high-definition image needs to be acquired as a label. The embodiment therefore solves the denoising problem of a single OCT B-scan image and has the further advantages of high efficiency, convenience of implementation, and the like.
As an example, in this embodiment or some other embodiments of the present invention, registering the respective B-scan images in each group may specifically include:
for each group of B-scan images, selecting one B-scan image in the current group as a reference, and respectively aligning other B-scan images in the current group with the selected B-scan image.
As an example, in this embodiment or some other embodiments of the present invention, aligning other B-scan images in the current group with the selected B-scan image respectively may specifically include:
calculating, based on the Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image;
and aligning each of the other B-scan images in the current group with the selected B-scan image according to its position of maximum cross-correlation.
Since the images may shift during registration/alignment, only the central portion of the images may be kept. That is, in this embodiment or some other embodiments of the present invention, aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation may specifically include:
and respectively aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation, keeping only a specified range around the center of each image.
By way of example, in this embodiment or some other embodiments of the present invention, the U-shaped network model may have a 3-layer structure comprising two downsamplings and two upsamplings; the encoding and decoding paths share the same network structure, the decoding layers are connected to the encoding layers by concat connections, and the input and output are connected by a residual connection.
Concat connections are part of the U-shaped network's own structural design (in the network structure diagram, from the first layer on the left to the first layer on the right, and from the second layer on the left to the second layer on the right) and serve to combine semantic information at different scales. The residual connection (Res) between input and output follows ResNet; it facilitates the back-propagation of gradients, speeding up convergence while improving the accuracy of the model.
Further, in this embodiment or some other embodiments of the present invention, the U-type network model may specifically be designed as follows:
the first encoding layer of the U-shaped network model performs convolution twice with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs convolution twice with 64 filters and a 3 × 3 kernel, the first with stride 2 (realizing downsampling) and the second with stride 1; and the third encoding layer performs convolution with 128 filters, a 3 × 3 kernel, and stride 2;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer, then performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, and one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through a convolution with 1 filter, a 3 × 3 kernel, and stride 1.
In addition, referring to fig. 3, in this embodiment or some other embodiments of the present invention, before inputting a single B-scan image to be denoised into the image denoising model after training, the method may further include:
and S301, taking the image after the registration averaging of the plurality of B-scan images as a standard, and evaluating the denoising performance of the image denoising model.
For example, the mean of the N registered B-scan images can be used as the standard clean image, and the denoising performance of the image denoising model can be evaluated via the peak signal-to-noise ratio (PSNR).
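A minimal NumPy sketch of the PSNR metric referred to here (an illustration, not the patent's code), for images scaled to [0, 1]:

```python
import numpy as np

def psnr(reference, test, max_val=1.0):
    """Peak signal-to-noise ratio (dB) of `test` against a clean `reference`."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(test, float)) ** 2)
    if mse == 0.0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# A uniform error of 0.1 on a [0, 1] image gives MSE 0.01, i.e. PSNR = 20 dB.
ref = np.zeros((8, 8))
off = ref + 0.1
assert abs(psnr(ref, off) - 20.0) < 1e-9
```

Higher PSNR means the test image is closer to the reference, which is how the filters and the model are compared in Table 1 below.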
The scheme of the present invention is further described below, taking a specific U-shaped network model as an example in a specific application scenario. Of course, the following application scenario is only exemplary; in practice, the scheme can also be applied to other scenarios.
The scheme of the embodiment can comprise the following five steps: data acquisition, data set construction, model training, model evaluation and model use.
1) Collecting data:
and acquiring 20 groups of B-scans, wherein each group of 32 pictures are obtained by scanning the same position with the same person, the same eye and the same time, the picture contents are the same, the noise is independently distributed, and the resolution of each picture is 820 × 1024.
2) Constructing a data set:
For each group of pictures, each picture may form a sample pair with each of the remaining 31 pictures, so the data set contains 20 × 32 × 31 = 19,840 sample pairs. As an example, the pictures may further be divided into three sets: a training set, a validation set, and a test set (for example, 12 groups as the training set, 4 groups as the validation set, and 4 groups as the test set). To guarantee the validity of this split, pictures from the same group should not appear in different sets.
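The pair counting and the group-disjoint split can be sketched in a few lines of plain Python (group identifiers here are hypothetical placeholders for the actual pictures):

```python
from itertools import permutations

# 20 groups of 32 pictures each, as in the example above; a picture is
# identified by a hypothetical (group, index) tuple.
n_groups, n_per_group = 20, 32
groups = [[(g, i) for i in range(n_per_group)] for g in range(n_groups)]

# Every ordered pair of distinct pictures within a group is an (input, target)
# sample, giving n * (n - 1) pairs per group.
pairs = [(a, b) for group in groups for a, b in permutations(group, 2)]
assert len(pairs) == 20 * 32 * 31  # 19,840 sample pairs

# Group-disjoint split: 12 / 4 / 4 groups for train / val / test, so pictures
# from the same group never leak across sets.
train = [p for p in pairs if p[0][0] < 12]
val = [p for p in pairs if 12 <= p[0][0] < 16]
test = [p for p in pairs if p[0][0] >= 16]
assert len(train) + len(val) + len(test) == len(pairs)
```

Splitting by whole groups, rather than by individual pairs, is what prevents near-duplicate images from appearing in both the training and test sets.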
Fig. 4 is a schematic diagram of a sample pair in a data set.
The registration may be Fourier-based: compute the position of maximum cross-correlation between two images and align them accordingly. To simplify the process, instead of registering each sample pair separately, one picture in each group is selected as a reference and the other pictures are aligned to it. Because the pictures shift during registration, only the central 768 × 896 portion of each picture is kept.
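A sketch of Fourier-based shift estimation (a NumPy toy assuming a pure integer translation with circular wrap-around; the patent gives no code, so the function name and details are illustrative):

```python
import numpy as np

def fourier_shift(reference, moving):
    """Estimate the integer (row, col) shift aligning `moving` to `reference`
    from the peak of their FFT-based circular cross-correlation."""
    xcorr = np.fft.ifft2(np.fft.fft2(reference) * np.conj(np.fft.fft2(moving))).real
    peak = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # Map peak indices to signed shifts in [-N/2, N/2).
    return tuple(int(p) if p <= s // 2 else int(p) - s
                 for p, s in zip(peak, xcorr.shape))

rng = np.random.default_rng(1)
ref = rng.normal(size=(64, 64))
mov = np.roll(np.roll(ref, -3, axis=0), 5, axis=1)  # a copy shifted by (-3, +5)

shift = fourier_shift(ref, mov)
aligned = np.roll(mov, shift, axis=(0, 1))
assert shift == (3, -5)
assert np.allclose(aligned, ref)
```

In practice the wrap-around at the borders is meaningless after shifting, which is one reason the embodiment keeps only the central 768 × 896 crop.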
3) Training a model:
because the denoising task is relatively simple and the dependency on high-level semantic information is not strong, the embodiment adopts a 3-layer U-type network, whereas a general U-type network is a 5-layer network.
The general design ideas of CNNs (Convolutional Neural Networks) can be adopted. To ensure that the model does not lose too much information during down- and upsampling, downsampling is realized by convolution (conv) with stride 2 while doubling the number of filters, and upsampling by deconvolution (deconv) with stride 2 while halving the number of filters.
To avoid the checkerboard artifacts of deconvolution and to ensure that the effective receptive field of the deconvolution layer is 3 × 3, the deconvolution layers use a 6 × 6 kernel.
For each sample pair, the input and output are identical in picture content but contain different noise. Adding a residual connection (res) between input and output lets the network learn the noise more efficiently.
Each convolutional layer is followed by a batch normalization layer and a LeakyReLU activation layer, except for the final output layer.
As an example, referring to fig. 5, the 3-layer U-type network in the present embodiment may specifically include:
and an encoding part:
the first layer performs convolution with the number of filters being 32 and the convolution kernel size being 3 × 3 twice, the step size being 1, the second layer performs convolution with the number of filters being 64 and the convolution kernel size being 3 × 3 twice, wherein the step size for the first time is 2, downsampling is realized, the step size for the second time is 1, and the third layer performs convolution with the number of filters being 128, the convolution kernel size being 3 × 3, and the step size being 2.
A decoding part:
the first layer performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second layer is concatenated (concat) with the output of the second convolution of the second encoding layer, then performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, and one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through a convolution with 1 filter, a 3 × 3 kernel, and stride 1.
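The layer specification above can be sanity-checked by tracing spatial sizes through the network. The sketch below does this for a 768 × 896 input; the padding values (1 for the 3 × 3 convolutions, 2 for the 6 × 6 deconvolutions) are assumptions not stated in the text, chosen so that sizes halve and double exactly.

```python
# Trace the spatial size through the 3-layer U-shaped network on a 768 x 896
# input. Padding (1 for 3x3 convs, 2 for 6x6 deconvs) is an assumed choice.

def conv(size, kernel=3, stride=1, pad=1):
    # Standard convolution output-size formula.
    return (size + 2 * pad - kernel) // stride + 1

def deconv(size, kernel=6, stride=2, pad=2):
    # Transposed-convolution output-size formula; doubles `size` here.
    return (size - 1) * stride - 2 * pad + kernel

h, w = 768, 896

# Encoder: 32 -> 64 -> 128 filters, downsampling twice via stride-2 convs.
h1, w1 = conv(conv(h)), conv(conv(w))                        # layer 1
h2, w2 = conv(conv(h1, stride=2)), conv(conv(w1, stride=2))  # layer 2
h3, w3 = conv(h2, stride=2), conv(w2, stride=2)              # layer 3

# Decoder: 64 -> 32 filters, upsampling twice via 6x6 stride-2 deconvs,
# each followed by a concat with the matching encoder layer and a 3x3 conv.
h4, w4 = conv(deconv(h3)), conv(deconv(w3))
h5, w5 = conv(deconv(h4)), conv(deconv(w4))
# (the final 1-filter 3x3 stride-1 conv keeps the size unchanged)

assert (h1, w1) == (768, 896)
assert (h2, w2) == (384, 448)
assert (h3, w3) == (192, 224)
assert (h4, w4) == (384, 448)
assert (h5, w5) == (768, 896)  # output resolution matches the input
```

The trace confirms that the denoised output has exactly the resolution of the input B-scan crop, as the task requires.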
During training, the model is trained on the constructed training set while the validation set supervises the process. The batch size is 32, the optimizer is Adam (adaptive moment estimation), and the learning rate is 10^-2 for the first 40 epochs, after which it is reduced to 10^-3 and training continues for 10 more epochs.
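The training schedule above can be sketched in PyTorch as follows (an illustrative sketch; the loss function is not stated in the text and is assumed here to be a pixelwise L2 loss between output and noisy target, the data loader is elided, and the stand-in model is a single convolution):

```python
import torch

# stand-in model; in practice this would be the shallow U-shaped network of the text
model = torch.nn.Conv2d(1, 1, 3, padding=1)
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

def lr_for_epoch(epoch: int) -> float:
    # 10^-2 for the first 40 epochs, then 10^-3 for 10 more epochs
    return 1e-2 if epoch < 40 else 1e-3

for epoch in range(50):
    for g in opt.param_groups:
        g["lr"] = lr_for_epoch(epoch)
    # for noisy_in, noisy_target in loader:  # batches of 32 sample pairs
    #     loss = torch.nn.functional.mse_loss(model(noisy_in), noisy_target)
    #     opt.zero_grad(); loss.backward(); opt.step()
```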
4) Evaluating the model:
The effect of the model is evaluated on the test set. The evaluation criterion is PSNR (peak signal-to-noise ratio), which measures the difference between two images; computing it requires a label, i.e. a clean reference image.
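A minimal PSNR computation looks as follows (a sketch; a peak value of 255 is assumed for 8-bit images, which the text does not state explicitly):

```python
import numpy as np

def psnr(img: np.ndarray, ref: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between an image and a clean reference."""
    mse = np.mean((img.astype(np.float64) - ref.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((4, 4))
noisy = ref + 25.5          # constant error equal to 10% of the peak value
print(round(psnr(noisy, ref), 2))  # 20.0
```

A higher PSNR means the denoised image is closer to the clean reference.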
The clean reference image is obtained by averaging after registration: since every image in a group is captured from the same person, the same eye, and the same position at the same time, the mean of the registered images can serve as a clean image.
Fig. 6 shows a clean image obtained by registration and averaging.
For each group of data, the result of registering and averaging 32 images is taken as the standard clean image. The PSNR of the original image, several conventional filters, and the denoising model of the invention is shown in Table 1:
TABLE 1
Method     Original image   Gaussian filter   Mean filter   Median filter   Proposed model
PSNR (dB)  19.91            21.78             23.49         22.29           29.87
It can be seen that the model far outperforms the conventional filters: the filters improve the PSNR by roughly 2 to 4 dB, while the model improves it by about 10 dB.
5) Using the model:
and collecting a plurality of B-scan scanning images, inputting the B-scan scanning images into the model, and observing the performance of the model in actual use through the output images.
A single noisy B-scan image is shown in fig. 7, and fig. 8 shows the result after denoising. As can be seen, the model removes most of the noise, and the denoised image is clearer than the original.
In this embodiment, because the input and output of single B-scan denoising are images of the same resolution, the denoising task is relatively simple and does not depend strongly on high-level semantic information; a shallow U-shaped network is therefore used, giving both better results and higher efficiency. Meanwhile, the embodiment of the invention adopts the Noise2Noise idea: two images with the same content but independent random noise serve as a sample pair for training the model, so no high-definition image needs to be obtained as a label. The embodiment of the invention therefore not only solves the denoising problem for a single OCT B-scan image, but also has the beneficial effects of higher efficiency and convenient implementation.
The following are embodiments of the apparatus of the present invention that may be used to perform embodiments of the method of the present invention. For details which are not disclosed in the embodiments of the apparatus of the present invention, reference is made to the embodiments of the method of the present invention.
FIG. 9 is a schematic diagram of a single OCT B-scan image denoising apparatus according to an exemplary embodiment of the present invention. This embodiment can be applied to a system composed of an OCT apparatus, a computer, and the like.
Referring to fig. 9, the apparatus may include:
The image acquisition module 901 is configured to acquire multiple groups of B-scan images, where each group comprises multiple B-scan images obtained by scanning the same position at the same time, the images having the same content and independent, identically distributed noise.
A data set construction module 902 is configured to register the B-scan images within each group into sample pairs so as to construct a data set.
As an example, in this embodiment or some other embodiments of the present invention, when registering the respective B-scan images in each group, the data set constructing module may specifically be configured to:
for each group of B-scan images, selecting one B-scan image in the current group as a reference and aligning the other B-scan images in the group with the selected image.
As an example, in this embodiment or some other embodiments of the present invention, when aligning the other B-scan images in the current group with the selected B-scan image respectively, the data set constructing module may specifically be configured to:
computing, based on a Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image, and aligning the other B-scan images with the selected image according to that position.
Further, when the data set constructing module aligns other B-scan images in the current group with the selected B-scan image according to the position with the maximum cross-correlation, the data set constructing module may specifically be configured to:
aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation, retaining only a specified range around the center position of each image.
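The Fourier-based alignment performed by the data set construction module can be sketched with NumPy as follows (a minimal sketch for integer translations only; the function names are our own, and subpixel refinement and the center-crop step are omitted). It uses the correlation theorem: the circular cross-correlation of two images is the inverse FFT of the product of one spectrum with the conjugate of the other.

```python
import numpy as np

def estimate_shift(ref: np.ndarray, img: np.ndarray) -> tuple:
    """Translation (dy, dx) maximizing the circular cross-correlation
    between ref and img, computed via the FFT."""
    xcorr = np.fft.ifft2(np.fft.fft2(ref) * np.conj(np.fft.fft2(img))).real
    dy, dx = np.unravel_index(np.argmax(xcorr), xcorr.shape)
    # map peak indices to signed shifts
    h, w = ref.shape
    return (dy - h if dy > h // 2 else dy, dx - w if dx > w // 2 else dx)

def align(ref: np.ndarray, img: np.ndarray) -> np.ndarray:
    """Shift img so that it lines up with ref."""
    dy, dx = estimate_shift(ref, img)
    return np.roll(img, (dy, dx), axis=(0, 1))
```

In practice the circular wrap-around introduced by `np.roll` is why the text then keeps only a specified range around the image center.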
A model training module 903, configured to train an image denoising model using the data set, where the image denoising model is a shallow U-shaped network model.
By way of example, in this embodiment or some other embodiments of the present invention, the U-shaped network model may have 3 layers, comprising two downsampling and two upsampling operations; the encoding layers and decoding layers have the same network structure, each decoding layer is concatenated (concat) with the corresponding encoding layer, and the input and output are connected by a residual connection.
Further, in this embodiment or some other embodiments of the invention:
the first encoding layer of the U-shaped network model performs two convolutions with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs two convolutions with 64 filters and a 3 × 3 kernel, the first with stride 2 to realize downsampling and the second with stride 1; the third encoding layer performs one convolution with 128 filters, a 3 × 3 kernel, and stride 2;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer and performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, followed by one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through one convolution with 1 filter, a 3 × 3 kernel, and stride 1.
A model application module 904 is configured to input a single B-scan image to be denoised into the trained image denoising model and obtain the denoised image at the output end.
In addition, referring to fig. 10, in this embodiment or some other embodiments of the present invention, the apparatus may further include:
The model evaluation module 1001 is configured to evaluate the denoising performance of the image denoising model, using an image obtained by registering and averaging multiple B-scan images as the reference, before a single B-scan image to be denoised is input into the trained model.
In this embodiment, because the input and output of single B-scan denoising are images of the same resolution, the denoising task is relatively simple and does not depend strongly on high-level semantic information; a shallow U-shaped network is therefore used, giving both better results and higher efficiency. Meanwhile, the embodiment of the invention adopts the Noise2Noise idea: two images with the same content but independent random noise serve as a sample pair for training the model, so no high-definition image needs to be obtained as a label. The embodiment of the invention therefore not only solves the denoising problem for a single OCT B-scan image, but also has the beneficial effects of higher efficiency and convenient implementation.
Regarding the apparatus in the foregoing embodiments, the specific manner in which each unit/module performs its operations has been described in detail in the embodiments of the related method and is not repeated here.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (14)

1. A single OCT B-scan image denoising method is characterized by comprising the following steps:
collecting a plurality of groups of B-scan images, wherein each group of B-scan images comprises a plurality of B-scan images obtained by scanning the same position at the same time, the images having the same content and independent, identically distributed noise;
registering each B-scan image in each group to form a sample pair so as to construct a data set;
training an image denoising model by using the data set, wherein the image denoising model is a shallow U-shaped network model;
inputting a single B-scan image to be denoised into the trained image denoising model, and obtaining a denoised image at an output end.
2. The method of claim 1, wherein registering the individual B-scan images within each group comprises:
for each group of B-scan images, selecting one B-scan image in the current group as a reference, and respectively aligning other B-scan images in the current group with the selected B-scan image.
3. The method of claim 2, wherein aligning other B-scan images within the current group with the selected B-scan image, respectively, comprises:
calculating, based on a Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image;
aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation.
4. The method of claim 3, wherein aligning other B-scan images in the current group with the selected B-scan image according to the position where the cross-correlation is greatest comprises:
aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation, wherein only a specified range around the center position of each image is retained.
5. The method of claim 1, wherein:
the U-shaped network model has 3 layers, comprising two downsampling and two upsampling operations; the encoding layers and decoding layers of the U-shaped network model have the same network structure, each decoding layer is concatenated (concat) with the corresponding encoding layer, and the input and output are connected by a residual connection.
6. The method of claim 5, wherein:
the first encoding layer of the U-shaped network model performs two convolutions with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs two convolutions with 64 filters and a 3 × 3 kernel, the first with stride 2 to realize downsampling and the second with stride 1; the third encoding layer performs one convolution with 128 filters, a 3 × 3 kernel, and stride 2;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer and performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, followed by one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through one convolution with 1 filter, a 3 × 3 kernel, and stride 1.
7. The method of claim 1, wherein before inputting the single B-scan image to be denoised into the trained image denoising model, the method further comprises:
and taking the image after the registration averaging of the plurality of B-scan images as a standard, and evaluating the denoising performance of the image denoising model.
8. A single OCT B-scan image denoising apparatus, comprising:
an image acquisition module, configured to acquire a plurality of groups of B-scan images, wherein each group comprises a plurality of B-scan images obtained by scanning the same position at the same time, the images having the same content and independent, identically distributed noise;
the data set construction module is used for forming a sample pair after registering each B-scan image in each group so as to construct a data set;
the model training module is used for training an image denoising model by using the data set, wherein the image denoising model is a shallow U-shaped network model;
and a model application module, configured to input a single B-scan image to be denoised into the trained image denoising model and obtain a denoised image at an output end.
9. The apparatus of claim 8, wherein the dataset construction module, when registering the individual B-scan images within each group, is specifically configured to:
for each group of B-scan images, selecting one B-scan image in the current group as a reference, and respectively aligning other B-scan images in the current group with the selected B-scan image.
10. The apparatus of claim 9, wherein the data set construction module, when aligning the other B-scan images in the current group with the selected B-scan image, is specifically configured to:
calculating, based on a Fourier transform, the position of maximum cross-correlation between each of the other B-scan images in the current group and the selected B-scan image; and aligning the other B-scan images in the current group with the selected B-scan image according to the position of maximum cross-correlation.
11. The apparatus according to claim 10, wherein the data set construction module, when aligning the other B-scan images in the current group with the selected B-scan image respectively according to the position where the cross-correlation is largest, is specifically configured to:
and respectively aligning other B-scan images in the current group with the selected B-scan image in a mode of only intercepting the specified range of the center position of the image according to the position with the maximum cross correlation.
12. The apparatus of claim 8, wherein:
the U-shaped network model has 3 layers, comprising two downsampling and two upsampling operations; the encoding layers and decoding layers of the U-shaped network model have the same network structure, each decoding layer is concatenated (concat) with the corresponding encoding layer, and the input and output are connected by a residual connection.
13. The apparatus of claim 12, wherein:
the first encoding layer of the U-shaped network model performs two convolutions with 32 filters, a 3 × 3 kernel, and stride 1; the second encoding layer performs two convolutions with 64 filters and a 3 × 3 kernel, the first with stride 2 to realize downsampling and the second with stride 1; the third encoding layer performs one convolution with 128 filters, a 3 × 3 kernel, and stride 2;
the first decoding layer of the U-shaped network model performs one deconvolution with 64 filters, a 6 × 6 kernel, and stride 2; the second decoding layer is concatenated (concat) with the output of the second convolution of the second encoding layer and performs one convolution with 64 filters, a 3 × 3 kernel, and stride 1, followed by one deconvolution with 32 filters, a 6 × 6 kernel, and stride 2; the third decoding layer is concatenated with the output of the second convolution of the first encoding layer, performs one convolution with 32 filters, a 3 × 3 kernel, and stride 1, and outputs the result through one convolution with 1 filter, a 3 × 3 kernel, and stride 1.
14. The apparatus of claim 8, further comprising:
and the model evaluation module is used for evaluating the denoising performance of the image denoising model by taking an image obtained by registering and averaging a plurality of B-scan images as a standard before inputting a single B-scan image to be denoised into the trained image denoising model.
CN202010259216.5A 2020-04-03 2020-04-03 Single OCT B-scan image denoising method and device Pending CN111402174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010259216.5A CN111402174A (en) 2020-04-03 2020-04-03 Single OCT B-scan image denoising method and device


Publications (1)

Publication Number Publication Date
CN111402174A true CN111402174A (en) 2020-07-10

Family

ID=71413712

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010259216.5A Pending CN111402174A (en) 2020-04-03 2020-04-03 Single OCT B-scan image denoising method and device

Country Status (1)

Country Link
CN (1) CN111402174A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112819732A (en) * 2021-04-19 2021-05-18 中南大学 B-scan image denoising method for ground penetrating radar
CN113191960A (en) * 2021-03-18 2021-07-30 北京理工大学 Optical coherence tomography image denoising model training method, denoising method and denoising device
CN113191972A (en) * 2021-04-27 2021-07-30 西安交通大学 Neural network design and training method for denoising light-weight real image
CN116681618A (en) * 2023-06-13 2023-09-01 强联智创(北京)科技有限公司 Image denoising method, electronic device and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109345469A (en) * 2018-09-07 2019-02-15 苏州大学 It is a kind of that speckle denoising method in the OCT image of confrontation network is generated based on condition
CN109410289A (en) * 2018-11-09 2019-03-01 中国科学院武汉物理与数学研究所 A kind of high lack sampling hyperpolarized gas lung MRI method for reconstructing of deep learning
CN109685813A (en) * 2018-12-27 2019-04-26 江西理工大学 A kind of U-shaped Segmentation Method of Retinal Blood Vessels of adaptive scale information
CN110097611A (en) * 2019-04-28 2019-08-06 上海联影智能医疗科技有限公司 Image rebuilding method, device, equipment and storage medium
US20200034948A1 (en) * 2018-07-27 2020-01-30 Washington University Ml-based methods for pseudo-ct and hr mr image estimation
CN110930334A (en) * 2019-11-26 2020-03-27 浙江大学 Grid denoising method based on neural network


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ARINDAM BHATTACHARYA等: "OCT image noise reduction using deep learning without additional priors", ARVO IMAGING IN THE EYE CONFERENCE ABSTRACT, pages 1 - 3 *
JAAKKO LEHTINEN等: "Noise2Noise: Learning Image Restoration without Clean Data", ARXIV, pages 3 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination