CN115482160A - Tongue color correction method based on deep convolution neural network - Google Patents

Tongue color correction method based on deep convolution neural network

Info

Publication number
CN115482160A
CN115482160A (application CN202210948496.XA)
Authority
CN
China
Prior art keywords
neural network
data
color
color correction
tongue
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210948496.XA
Other languages
Chinese (zh)
Inventor
贺宁波
李志平
陈占春
李治
程旭
李志娟
王倩倩
何平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Bayes Health Technology Co ltd
Original Assignee
Shanghai Bayes Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Bayes Health Technology Co ltd filed Critical Shanghai Bayes Health Technology Co ltd
Priority to CN202210948496.XA priority Critical patent/CN115482160A/en
Publication of CN115482160A publication Critical patent/CN115482160A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/80 Geometric correction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/40 Scaling of whole images or parts thereof, e.g. expanding or contracting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30004 Biomedical image processing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Processing (AREA)

Abstract

A tongue color correction method based on a deep convolutional neural network relates to the field of medical image color correction and solves the problems of color deviation and color distortion that arise when tongue image data are collected with mobile equipment under non-standard light sources. Data collected under color-cast light sources are uniformly scaled. A deep convolutional neural network is trained on a data set collected under a standard light source, the network output is passed downstream to a per-pixel classification task, each pixel point is converted from RGB space to HSV space, the point in HSV space closest to the current pixel is computed, the closest pixel value is taken as the standard value, and finally the RGB value corresponding to that HSV value is computed to restore the color space, realizing color correction of the tongue image. The invention can restore the true color of the tongue image to the greatest extent.

Description

Tongue color correction method based on deep convolution neural network
Technical Field
The invention relates to the field of medical image color correction, in particular to a tongue color correction method based on a deep neural network.
Background
Tongue diagnosis is a distinctive and effective diagnostic method in traditional Chinese medicine and still plays an important role in clinical practice today. At present there are two main approaches to color correction: one is correction based on image analysis, such as mean white balance, perfect reflection, and the gray-world assumption; the other is a color correction method based on a deep fully convolutional network. The problems with these two correction approaches are as follows:
1. Neither method refers to a concrete standard, such as a standard color patch, during correction, so the final correction result cannot be evaluated against the pixel values of standard color patches;
2. Even when correction is performed with a fully convolutional network, the network structure often prevents the true color of each pixel from being estimated, and local exposure problems can even cause pixel-value overflow in some images.
3. Both methods correct the image as a whole and cannot apply a one-to-one mapping correction to each pixel point, so the correction accuracy is not ideal.
In conclusion, the novel tongue image correction method provided by the invention offers a useful reference for research that combines deep learning with tongue diagnosis.
Disclosure of Invention
The invention aims to solve the problem of color deviation that arises when tongue image data are collected with mobile equipment under non-standard light sources, and provides a tongue color correction method based on a deep neural network.
A tongue color correction method based on a deep neural network comprises the following steps:
step one, acquiring image data and dividing it into a pre-trained data set and a test set to be corrected;
step two, constructing a deep neural network model based on Unet for a pre-trained data set;
step three, transmitting the pre-trained data set into the neural network model constructed in the step two for training to obtain a trained neural network model;
step four, transferring the test set to be corrected into the neural network model trained in the step three for prediction to obtain a prediction result; the method specifically comprises the following steps:
step 4.1, mapping the output prediction result from RGB space into HSV space, computing the similarity distance between the predicted HSV pixel value and the standard color points in HSV space, and finding the standard pixel point closest to the predicted HSV pixel value;
step 4.2, taking the closest standard pixel point in HSV space as the standard pixel point for color correction, and finally converting from HSV space back to RGB space to realize color correction.
The invention has the beneficial effects that:
the tongue color correction method provided by the invention proposes a network structure based on Unet: first, in the encoding-decoding stage, a deep fully convolutional network extracts color-patch information close to the tongue color under a standard light source; then, at the output end, each pixel carrying the extracted tongue-image features is classified using the softmax multi-class idea, the pixels are mapped to a standard color space to compute a similarity distance, and the pixel value of the closest standard color patch is used as the corrected standard value. Combining the fully convolutional network with the standard-color-patch mapping idea allows the tongue color to be corrected to the greatest extent.
The method combines the advantages of several classical networks in its deep neural network structure, treats color correction as a multi-class prediction task, and selects the standard pixel point closest to the predicted classification result in HSV space as the color-correction pixel, thereby restoring the true color of the tongue image to the greatest extent.
Drawings
FIG. 1 is a flow chart of a tongue color correction method based on a deep neural network according to the present invention;
FIG. 2 is a schematic structural diagram of the deep neural network.
Detailed Description
The first embodiment is described with reference to fig. 1 and 2, and the tongue color correction method based on the deep neural network is implemented by the following steps:
step 1: image data is acquired.
Step 1.1: Data collection. The data set collected with mobile equipment under a standard light source is scaled to a uniform size and augmented, and serves as the pre-trained data set; the test data set to be corrected is collected with the mobile equipment under various color-cast light sources;
step 1.2: and data processing, namely performing unified size scaling on the image data, wherein the image distortion can be caused by a common scaling method, so that the image data is scaled by adopting a letterbox method. And then carrying out data augmentation on the zoomed data by methods such as turning, translation, rotation and the like. And carrying out scaling processing on the test data set to be corrected collected under the color cast light source in the same way, but not carrying out data enhancement. The specific pre-trained data set flipping method is as follows: horizontally and vertically turning; the rotating mode is as follows: the tongue picture is respectively rotated by 5 degrees, 10 degrees, 15 degrees, 30 degrees and 45 degrees to increase data.
Step 2: A deep convolutional neural network for training is constructed based on the characteristics of the Unet network.
Step 2.1: The whole deep neural network is built with the TensorFlow deep learning framework and consists of three parts: a backbone (downsampling) layer, a neck (depth-separable convolution) layer, and a head (upsampling) layer. The structure of the whole convolutional neural network is shown in FIG. 2.
The implementation of the backbone downsampling layer is as follows: feature fusion and extraction are performed with the NIN module from GoogLeNet Inception V1. The advantages of the NIN module are that large and small convolutions are connected in parallel within the same layer, which widens the network while increasing its adaptability to scale; convolution kernels with different receptive fields run in parallel, so the network can learn whichever information is useful to it; and the 1 x 1 convolution kernel fuses information across channels, reducing the number of channels and greatly reducing the parameter computation while achieving dimension reduction. In other words, the NIN structure fuses large and small objects in the feature map well and has good local abstraction ability. After each convolution, a ReLU activation is followed by batch normalization, which keeps the data distribution before and after the convolution as consistent as possible. The NIN idea is applied three times in succession, and a max pooling layer with stride 2 follows each NIN module to reduce the dimension and realize downsampling. The number of NIN modules is flexible and can be increased or decreased as required to control the network depth.
In this embodiment, to prevent gradient vanishing or explosion when parameter updates are back-propagated as the network deepens, the shortcut idea from residual networks is adopted. The convolution kernel size in the shortcut is set to 1 x 1, and the stride can be customized according to the input image size, so that an add operation can be performed at the output of every NIN module.
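A sketch of one backbone stage under the above description, assuming the Keras functional API; the branch filter counts are illustrative assumptions, since the text does not give exact widths.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn(x, filters, kernel):
    # Convolution -> ReLU -> batch normalization, as described above.
    x = layers.Conv2D(filters, kernel, padding="same", activation="relu")(x)
    return layers.BatchNormalization()(x)

def nin_stage(x, filters=64):
    """Inception/NIN-style stage: parallel 1x1, 3x3, 5x5 and pooling branches,
    a residual shortcut, then stride-2 max pooling for downsampling."""
    b1 = conv_bn(x, filters, 1)                               # 1x1 channel fusion
    b2 = conv_bn(conv_bn(x, filters, 1), filters, 3)          # 1x1 then 3x3
    b3 = conv_bn(conv_bn(x, filters, 1), filters, 5)          # 1x1 then 5x5
    b4 = conv_bn(layers.MaxPooling2D(3, strides=1, padding="same")(x), filters, 1)
    merged = layers.Concatenate()([b1, b2, b3, b4])
    # Shortcut: a 1x1 convolution matches the channel count so the add is valid.
    shortcut = layers.Conv2D(4 * filters, 1, padding="same")(x)
    merged = layers.Add()([merged, shortcut])
    return layers.MaxPooling2D(pool_size=2, strides=2)(merged)  # downsampling
```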
The implementation of the neck depth-separable convolutional layer is as follows: compared with ordinary convolution it requires fewer parameters, the key point being that depthwise-separable convolution separates the channel and spatial dimensions, which makes the deployed model lighter. The parameters of the three depthwise-separable convolutions are set as follows: the first layer has 256 convolution kernels of size 1 x 1, stride 1, two output channels per input channel in the depthwise convolution, and a dilation coefficient of 1; the second layer has 256 kernels of size 1 x 1, stride 1, two output channels per input channel, and a dilation coefficient of 2; the third layer has 256 kernels of size 1 x 1, stride 1, two output channels per input channel, and a dilation coefficient of 4. Finally, the input end and the output end are joined by a skip connection.
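A sketch of the neck with the parameter settings quoted above (256 filters, 1 x 1 kernels, stride 1, depth multiplier 2, dilation coefficients 1, 2 and 4, plus a skip connection); the 1 x 1 projection used to make the skip addition shape-compatible is an assumption.

```python
from tensorflow.keras import layers

def neck(x):
    """Three consecutive depthwise-separable convolutions with dilation 1, 2, 4,
    followed by a skip connection from the block input to its output."""
    y = x
    for dilation in (1, 2, 4):
        y = layers.SeparableConv2D(filters=256, kernel_size=1, strides=1,
                                   padding="same", depth_multiplier=2,
                                   dilation_rate=dilation, activation="relu")(y)
    skip = layers.Conv2D(256, 1, padding="same")(x)   # match channels for the add
    return layers.Add()([y, skip])
```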
The head upsampling layer specifically comprises three consecutive upsamplings with block_size set to 2. Compared with ordinary upsampling, this largely avoids an overly low resolution, because the main function of PixelShuffle is to recombine a low-resolution feature map and the multiple channels of the convolution output into a high-resolution feature map; it has therefore become an effective upsampling method for super-resolution problems.
In this embodiment, concatenation (concat) along the channel dimension is performed at each upsampling to fuse feature maps of different scales; this upsampling scheme is used to make the downstream classification task more accurate. The resulting feature map is then globally average-pooled, and finally a fully connected layer realizes the multi-class classification.
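A sketch of the head under the above description, where PixelShuffle is realized with tf.nn.depth_to_space (block_size 2), each upsampling result is concatenated with an encoder feature map of matching scale, and the output is globally average-pooled before a softmax layer. The 143-class output size anticipates the color reference chart used in step 4, and the spatial compatibility of the skip features is assumed.

```python
import tensorflow as tf
from tensorflow.keras import layers

def head(x, encoder_features, num_classes=143):
    """Three depth_to_space upsamplings; after each one, concatenate with the
    encoder feature map of the same scale, then pool and classify."""
    for skip in encoder_features:                     # ordered deepest to shallowest
        x = layers.Lambda(lambda t: tf.nn.depth_to_space(t, block_size=2))(x)
        x = layers.Concatenate()([x, skip])
    x = layers.GlobalAveragePooling2D()(x)
    return layers.Dense(num_classes, activation="softmax")(x)
```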
Step 3: The pre-trained data sets are fed into the constructed neural network model in batches for training. The concrete implementation is as follows:
the data set is divided into data with the proportion of 5; 1/5 of the training set is determined as a verification set, model parameters are selected through the verification set, and if overfitting (overfitting) occurs, training (training) can be terminated in advance; 1/6 is defined as a test set, and data pollution can be prevented when a final model test is selected through the test set; and verifying whether the model overlaps or not by using a cross-verification mode during training.
In this embodiment, the number of training epochs is set to 50, the batch size fed into the network model each time is set to 32, and the learning rate is dynamic, i.e. it first increases and then decreases as training proceeds, which makes the convergence process more flexible; the loss function is the cross-entropy loss.
In this embodiment, Adam is chosen as the gradient optimizer because it has great advantages in non-convex optimization: parameter updates are unaffected by rescaling of the gradient; the hyperparameters are easy to interpret and require little or no fine tuning; the update step size is approximately bounded; an annealing process (automatic adjustment of the learning rate) arises naturally; and it is suitable for unstable objective functions. Unlike the traditional gradient descent algorithm, which keeps a single learning rate for all weight updates and does not change it during training, Adam computes first- and second-moment estimates of the gradients to design independent adaptive learning rates for different parameters.
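A sketch of the training configuration described above (Adam, 50 epochs, batch size 32, cross-entropy loss, and a learning rate that first rises and then decays), assuming a Keras model built as in step 2 and the arrays from the split above; the warm-up length, peak learning rate, and early-stopping patience are illustrative assumptions.

```python
import numpy as np
import tensorflow as tf

def train(model, x_train, y_train, x_val, y_val,
          epochs=50, batch_size=32, peak_lr=1e-3, warmup=5):
    def schedule(epoch, lr):
        # Learning rate rises linearly for a few epochs, then decays (cosine).
        if epoch < warmup:
            return peak_lr * (epoch + 1) / warmup
        progress = (epoch - warmup) / max(1, epochs - warmup)
        return peak_lr * 0.5 * (1.0 + np.cos(np.pi * progress))

    model.compile(optimizer=tf.keras.optimizers.Adam(peak_lr),
                  loss="sparse_categorical_crossentropy",   # cross-entropy loss
                  metrics=["accuracy"])
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=epochs, batch_size=batch_size,
                     callbacks=[tf.keras.callbacks.LearningRateScheduler(schedule),
                                tf.keras.callbacks.EarlyStopping(patience=5)])
```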
The formula of the cross entropy loss function is specifically as follows:
L = -[y·log(ŷ) + (1 - y)·log(1 - ŷ)]
in the formula, ŷ is the predicted label and y is the true label.
Step 4: The data collected under the color-cast light source are fed into the trained model for prediction. The distance between each predicted pixel point and the standard pixel points is computed in HSV space, and the closest standard pixel point is taken as the color-correction value, finally realizing color correction of the tongue image. The concrete implementation is as follows:
the color cast data is used as data for correcting the color of the model, and only uniform scaling is carried out without data enhancement, so that other interference in the test process is avoided, and the color correction performance of the model can be better tested. The correction process of the tongue picture image comprises the following specific steps:
firstly, multi-classification is carried out according to the prediction result of the model, and each pixel point is classified according to 143 types of the color comparison table.
The 143 reference-chart color pixel points are then converted from RGB space to HSV space, determining the positions of the 143 color-class points in the three-dimensional HSV space.
For each pixel point, the model-predicted HSV value is compared with the reference colors in the three-dimensional HSV space to determine which color-chart class it is closest to; the closest color-class point is taken as the corrected standard pixel point, and the RGB value corresponding to that HSV value is finally computed to restore the color space, realizing color correction of the tongue image.
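A sketch of the per-pixel correction step: every predicted pixel is mapped to the nearest of the 143 reference-chart colors in HSV space, and that reference color's RGB value is written back. The rgb_to_hsv helper follows the conversion formulas below (a sketch of it is given after them), and chart_rgb, holding the 143 reference colors, is an assumed input.

```python
import numpy as np

def correct_tongue_image(pred_rgb, chart_rgb):
    """pred_rgb: H x W x 3 uint8 prediction; chart_rgb: 143 x 3 uint8 reference colors."""
    chart_hsv = np.array([rgb_to_hsv(c) for c in chart_rgb])                 # 143 x 3
    pixel_hsv = np.array([rgb_to_hsv(p) for p in pred_rgb.reshape(-1, 3)])   # N x 3
    # Distance from each predicted pixel to each reference color in HSV space.
    dist = np.linalg.norm(pixel_hsv[:, None, :] - chart_hsv[None, :, :], axis=-1)
    nearest = dist.argmin(axis=1)
    # Writing back the nearest reference color's RGB value is equivalent to
    # converting its HSV value back to RGB as described in the text.
    return chart_rgb[nearest].reshape(pred_rgb.shape)
```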
The formula for converting the RGB color space into the HSV color space is as follows:
R′=R/255
G′=G/255
B′=B/255
C_max = max(R', G', B')
C_min = min(R', G', B')
Δ = C_max - C_min
in the formula: R, G, and B are the red, green, and blue channels of the image; R', G', and B' are the corresponding normalized channels used for conversion to HSV color space; C_max and C_min are the maximum and minimum of the three channels; and Δ is the difference C_max - C_min.
Hue (H) calculation:
H = 0°, if Δ = 0
H = 60° × (((G' - B') / Δ) mod 6), if C_max = R'
H = 60° × ((B' - R') / Δ + 2), if C_max = G'
H = 60° × ((R' - G') / Δ + 4), if C_max = B'
saturation (S) calculation:
S = 0, if C_max = 0
S = Δ / C_max, if C_max ≠ 0
luminance Value (V) calculation:
V = C_max
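A direct Python transcription of the RGB-to-HSV formulas above (a sketch; H is returned in degrees, S and V in [0, 1]):

```python
def rgb_to_hsv(rgb):
    r, g, b = (float(c) / 255.0 for c in rgb)         # R', G', B'
    c_max, c_min = max(r, g, b), min(r, g, b)
    delta = c_max - c_min
    if delta == 0:
        h = 0.0
    elif c_max == r:
        h = 60.0 * (((g - b) / delta) % 6)
    elif c_max == g:
        h = 60.0 * ((b - r) / delta + 2)
    else:                                              # c_max == b
        h = 60.0 * ((r - g) / delta + 4)
    s = 0.0 if c_max == 0 else delta / c_max
    v = c_max
    return h, s, v
```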
the HSV-to-RGB formula is specifically realized as follows:
C=V×S
X=C×(1-|(H/60°)mod2-1|)
m=V-C
(R', G', B') = (C, X, 0), if 0° ≤ H < 60°
(R', G', B') = (X, C, 0), if 60° ≤ H < 120°
(R', G', B') = (0, C, X), if 120° ≤ H < 180°
(R', G', B') = (0, X, C), if 180° ≤ H < 240°
(R', G', B') = (X, 0, C), if 240° ≤ H < 300°
(R', G', B') = (C, 0, X), if 300° ≤ H < 360°
(R,G,B)=((R′+m)×255,(G′+m)×255,(B′+m)×255)
in the formula: v denotes luminance, S denotes saturation, H denotes hue, C denotes a product of luminance V and saturation S, X denotes a product of C and hue H, and m denotes a difference between luminance V and saturation S.
The spatial distance formula used is as follows, where the subscript p denotes the predicted value in HSV space and the subscript r denotes the standard value of the color class in HSV space:
d = √((H_p - H_r)² + (S_p - S_r)² + (V_p - V_r)²)
the above description is only a preferred embodiment of the present invention, and not intended to limit the present invention, the scope of the present invention is defined by the appended claims, and all structural changes that can be made by using the contents of the description and the drawings of the present invention are intended to be embraced therein.

Claims (5)

1. A tongue color correction method based on a deep neural network, characterized in that the method is realized by the following steps:
step one, acquiring image data and dividing it into a pre-trained data set and a test set to be corrected;
step two, constructing a deep neural network model based on Unet for a pre-trained data set;
step three, transmitting the pre-trained data set into the neural network model constructed in the step two for training to obtain a trained neural network model;
step four, the color cast data serving as a test set to be corrected is transmitted into the neural network model trained in the step three for prediction, and a prediction result is obtained; the method comprises the following specific steps:
step 4.1, mapping the output prediction result from RGB space into HSV space, computing the similarity distance between the predicted HSV pixel value and the standard color points in HSV space, and finding the standard pixel point closest to the predicted HSV pixel value;
step 4.2, taking the closest standard pixel point in HSV space as the standard pixel point for color correction, and finally converting from HSV space back to RGB space to realize color correction.
2. The tongue color correction method based on the deep neural network of claim 1, wherein: the specific process in the first step is as follows:
firstly, carrying out uniform-size scaling and data enhancement processing on a data set collected under a standard light source by adopting mobile equipment, and taking the data set as a pre-training data set;
collecting a test data set to be corrected under various color cast light sources by adopting mobile equipment;
step two, data processing: scaling the pre-trained data set to a uniform size by the letterbox method, and then augmenting the scaled data by flipping, translation and rotation;
scaling the test data to be corrected, collected under the various color-cast light sources, by the letterbox method;
the specific pre-trained data set flipping method is as follows: horizontally and vertically turning;
the specific rotation mode is as follows: data augmentation is performed by rotating the pre-trained data set by 5 °, 10 °, 15 °, 30 °, and 45 °, respectively.
3. The tongue color correction method based on the deep neural network of claim 1, wherein: in the second step, the specific process of constructing the deep neural network model based on the Unet is as follows:
step two, setting a neural network model from a down-sampling layer, a depth separable convolution layer and an up-sampling layer;
the down-sampling layer adopts NIN modules for multi-scale feature fusion for 4 times continuously, adopts short-circuit operation in a residual error network at the input end and the output end of each NIN module, and carries out addition operation when each NIN module outputs;
the depth separable convolution layer adopts three times of continuous depth separable convolution and simultaneously performs jump connection on the input end and the output end of the depth separable convolution layer;
the up-sampling layer adopts 3 times of continuous up-sampling, the up-sampling mode is completed through PixelShuffle, and splicing and stacking are carried out on the channel dimension after each up-sampling so as to realize the fusion of feature maps with different scales.
4. The tongue color correction method based on the deep neural network of claim 1, wherein: the concrete process of the third step is as follows:
dividing the data set into training data and test data at a ratio of 5:1;
wherein, 5/6 of the data is used as a training set, and the model parameters are updated through the training set; 1/5 of the training set is determined as a verification set, model parameters are selected through the verification set, and if overfitting occurs, training can be terminated in advance;
and 1/6 of the data is used as a test set, and a cross validation mode is used for verifying whether the model is over-fitted during training.
5. The tongue color correction method based on the deep neural network of claim 4, wherein: the number of training rounds epoch is set to be 50, the batch size of the network model transmitted into each time is set to be 32, the learning rate is set to be dynamic, and the loss function adopts a cross entropy loss function;
the formula of the cross entropy loss function is specifically as follows:
L = -[y·log(ŷ) + (1 - y)·log(1 - ŷ)]
in the formula, ŷ is the predicted label and y is the true label;
the gradient optimizer is chosen to be Adam.
CN202210948496.XA 2022-08-09 2022-08-09 Tongue color correction method based on deep convolution neural network Pending CN115482160A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210948496.XA CN115482160A (en) 2022-08-09 2022-08-09 Tongue color correction method based on deep convolution neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210948496.XA CN115482160A (en) 2022-08-09 2022-08-09 Tongue color correction method based on deep convolution neural network

Publications (1)

Publication Number Publication Date
CN115482160A true CN115482160A (en) 2022-12-16

Family

ID=84421970

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210948496.XA Pending CN115482160A (en) 2022-08-09 2022-08-09 Tongue color correction method based on deep convolution neural network

Country Status (1)

Country Link
CN (1) CN115482160A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116433508A (en) * 2023-03-16 2023-07-14 湖北大学 Gray image coloring correction method based on Swin-Unet
CN116433508B (en) * 2023-03-16 2023-10-27 湖北大学 Gray image coloring correction method based on Swin-Unet
CN116593408A (en) * 2023-07-19 2023-08-15 四川亿欣新材料有限公司 Method for detecting chromaticity of heavy calcium carbonate powder
CN116593408B (en) * 2023-07-19 2023-10-17 四川亿欣新材料有限公司 Method for detecting chromaticity of heavy calcium carbonate powder

Similar Documents

Publication Publication Date Title
CN115482160A (en) Tongue color correction method based on deep convolution neural network
CN108182456B (en) Target detection model based on deep learning and training method thereof
CN111524135B (en) Method and system for detecting defects of tiny hardware fittings of power transmission line based on image enhancement
CN109190684B (en) SAR image sample generation method based on sketch and structure generation countermeasure network
WO2018161775A1 (en) Neural network model training method, device and storage medium for image processing
CN112734646B (en) Image super-resolution reconstruction method based on feature channel division
CN111612017B (en) Target detection method based on information enhancement
CN110929602A (en) Foundation cloud picture cloud shape identification method based on convolutional neural network
CN110675462A (en) Gray level image colorizing method based on convolutional neural network
CN111815665B (en) Single image crowd counting method based on depth information and scale perception information
CN109685716A (en) A kind of image super-resolution rebuilding method of the generation confrontation network based on Gauss encoder feedback
CN112801904B (en) Hybrid degraded image enhancement method based on convolutional neural network
CN105550649A (en) Extremely low resolution human face recognition method and system based on unity coupling local constraint expression
CN111931857B (en) MSCFF-based low-illumination target detection method
US12008779B2 (en) Disparity estimation optimization method based on upsampling and exact rematching
CN111461006B (en) Optical remote sensing image tower position detection method based on deep migration learning
CN113052775B (en) Image shadow removing method and device
CN113140019A (en) Method for generating text-generated image of confrontation network based on fusion compensation
CN117372898A (en) Unmanned aerial vehicle aerial image target detection method based on improved yolov8
CN114782298A (en) Infrared and visible light image fusion method with regional attention
CN115457258A (en) Foggy-day ship detection method based on image enhancement algorithm and improved YOLOv5
CN112818777B (en) Remote sensing image target detection method based on dense connection and feature enhancement
CN112766340B (en) Depth capsule network image classification method and system based on self-adaptive spatial mode
CN110348339B (en) Method for extracting handwritten document text lines based on case segmentation
CN112785517A (en) Image defogging method and device based on high-resolution representation

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination