CN115984550A - Automatic segmentation method for eye iris pigmented spot texture - Google Patents

Automatic segmentation method for eye iris pigmented spot texture

Info

Publication number
CN115984550A
Authority
CN
China
Prior art keywords
layer
iris
data set
image
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211259326.7A
Other languages
Chinese (zh)
Inventor
张波 (Zhang Bo)
梅笑云 (Mei Xiaoyun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang University of Chemical Technology
Original Assignee
Shenyang University of Chemical Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang University of Chemical Technology filed Critical Shenyang University of Chemical Technology
Priority to CN202211259326.7A priority Critical patent/CN115984550A/en
Publication of CN115984550A publication Critical patent/CN115984550A/en
Pending legal-status Critical Current

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses an automatic segmentation method for eye iris pigmented spot texture, relating to eye pigment segmentation methods, with the following specific steps: establish an iris detection data set and produce an iris pigmented spot semantic segmentation data set, including normalization preprocessing of the data set, histogram equalization, data set augmentation, division of the data set into a training set and a test set, and testing on the processed test set; build the DIPSnet semantic segmentation network and train it to obtain the overall semantic segmentation model, with the weight parameters of the network model obtained by gradient back-propagation; input the images of the test set into the DIPSnet semantic segmentation network model, and pass the output through a softmax layer to generate a classification probability map; obtain the segmentation result map of the image from the class probabilities in the classification probability map. The invention is applicable to iris recognition systems and iris-based human health monitoring systems.

Description

Automatic segmentation method for eye iris pigmented spots
Technical Field
The invention relates to eye iris pigment segmentation methods, and in particular to an automatic segmentation method for eye iris pigmented spot texture.
Background
The iris texture of the human eye is highly unique and offers advantages such as stability over a given period and security; even twins with identical genes have different iris textures. Feature textures of different types, such as pigmented spots, pits, cracks, and annular stripes, are usually present on the iris surface, and parameters such as the size, position, and shape of these feature texture elements can serve as auxiliary information for iris recognition systems and iris-based health diagnosis. Among all biometric systems, the iris is one of the most promising solutions and can be applied in many settings requiring authentication, such as credit card verification, secure bank account access, and airport security screening. Iridology, also known as iris diagnosis, is a discipline that determines the health status of human organs by examining iris texture. Pigmented spot texture, the blocky texture among iris textures, generally indicates one of several conditions: first, accumulation of toxins or metabolites in the organ tissues corresponding to the pigment attachment area, tissue dullness or energy blockage, local blood stasis, or tissue ischemia and hypoxia; second, hereditary weakness of the organs at the attachment site; third, weakening of the organ tissues at the attachment site due to disease, for example a tendency of the tissue to crystallize: the appearance of a dark color indicates an organic tissue change. Detecting pigmented spot texture is therefore highly desirable.
Since the theory of iris recognition was proposed, researchers such as Daugman have developed a series of classical local texture feature extraction methods using multi-scale or multi-resolution ideas, achieving good recognition accuracy when the iris images are sharp. However, because the depth of field of the lens in iris image acquisition devices is relatively small, image sharpness cannot be guaranteed during acquisition; some texture features are lost, the feature dimensionality is reduced, and correct matching cannot be achieved.
In recent years, researchers have turned to blocky textures such as pigmented spots on the iris surface. A series of traditional target detection algorithms have been applied to detecting pigmented spot texture in visible-light iris images, with good results. The literature can be divided into the following two categories according to the method used to detect pigmented spot texture.
The first category can be split into two approaches, depending on whether a template is used for initial positioning. Document [1] extracts all the different types of feature textures on the iris using two linear templates of different lengths, inputs them into a BP neural network as feature vectors for training, and finally detects pigmented spot textures. Document [2] searches the iris image with a group of variable-size windows and binarizes the resulting regions with a clustering method to detect pigmented spot texture in the iris. Although template-based initial positioning achieves a high detection accuracy, it imposes requirements on the size of the detected texture: only pigmented spot textures matching the template size can be detected.
For detection without a template, document [3] uses regional texture energy parameters as iris texture descriptors, obtains all texture feature regions with drastic gray-level changes using a support vector machine, and finally detects pigmented spot texture through a shape factor. Document [4] performs initial positioning of regions where pigmented spots may exist using gray-level clustering, then detects pigmented spots with a support vector machine based on their gray-level spatial distribution characteristics. Overall, the first category of methods constrains the position of the pigmented spots: only pigmented spot textures in the iris region outside the collarette can be detected, and the detection results are easily affected by light spots and eyelids.
The second category includes the following studies. Document [5] divides the image evenly into L representative blocks, computes their gray-level probability distribution functions, and detects pigmented spot contours based on the difference between ordinary representative blocks and pigmented spot samples measured by the K-S distance. Document [6] locates pigmented spot texture by finding local gray-level minima at the centers of large regions. Document [7] preprocesses the iris with a modified Laplacian algorithm and combines a mean-shift algorithm with a pyramid method to locate pigmented spot texture. These methods have long detection times, are easily affected by eyelashes, and are suitable only for iris images containing nothing but pigmented spot texture.
The Unet network was first used for medical image segmentation; compared with the FCN and DeepLab families of semantic segmentation networks, the Unet model requires little training data and achieves high accuracy, which matches the characteristics of iris pigmented spot texture detection. The improved model has 3 down-sampling encoding layer operations and 3 up-sampling decoding layer operations, with the gray parts being the skip connection layers; finally, a classifier classifies each pixel.
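As a concrete illustration, the following PyTorch sketch shows this 3-down/3-up encoder-decoder pattern with skip connections and a per-pixel classifier. The patent publishes no source code, so module names, channel widths, and the transposed-convolution up-sampling are illustrative assumptions.

```python
import torch
import torch.nn as nn

def conv_block(in_ch, out_ch):
    # two 3x3 convolutions, each followed by ReLU (as described in the embodiment)
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class MiniUNet(nn.Module):
    """3 down-sampling and 3 up-sampling stages with skip connections."""
    def __init__(self, in_ch=3, num_classes=2, base=64):
        super().__init__()
        self.enc1 = conv_block(in_ch, base)
        self.enc2 = conv_block(base, base * 2)
        self.enc3 = conv_block(base * 2, base * 4)
        self.bottom = conv_block(base * 4, base * 8)
        self.pool = nn.MaxPool2d(2)
        self.up3 = nn.ConvTranspose2d(base * 8, base * 4, 2, stride=2)
        self.dec3 = conv_block(base * 8, base * 4)
        self.up2 = nn.ConvTranspose2d(base * 4, base * 2, 2, stride=2)
        self.dec2 = conv_block(base * 4, base * 2)
        self.up1 = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        self.dec1 = conv_block(base * 2, base)
        self.head = nn.Conv2d(base, num_classes, 1)  # per-pixel classifier

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(self.pool(e1))
        e3 = self.enc3(self.pool(e2))
        b = self.bottom(self.pool(e3))
        d3 = self.dec3(torch.cat([self.up3(b), e3], dim=1))  # skip connection
        d2 = self.dec2(torch.cat([self.up2(d3), e2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), e1], dim=1))
        return self.head(d1)  # logits; softmax yields the classification probability map
```

A 512 × 512 × 3 input, as used in the embodiment below, passes cleanly through the three 2× poolings.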
To make the neural network model more suitable for iris pigmented spot detection, the iris pigmented spot texture is analyzed. Because iris pigmented spot characteristics are complex and irregular, the following factors are generally considered when studying segmentation: (1) Iris pigmented spots are of uncertain size with unclear edges, and the feature map becomes more abstract after convolution, meaning that blurred edge portions are easily missed during detection. (2) Different iris pigmented spots have different colors, some dark and some light, and are not uniform in detail; the color representation of pigmented spots therefore needs to be more flexible, and the network should focus on the features common to pigmented spot images. (3) Spots located inside the collarette have a complex background, and other textures in the background are easily misdetected as spots; the pixels in the pigmented spot texture region therefore require attention and analysis.
Space and channel attention modules are embedded in the down-sampling stage of the encoding layer so that, during feature extraction, the network concentrates on certain feature-layer channels and spatial regions and suppresses the redundant features of invalid non-feature regions; the number of down-sampling layers is also reduced.
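The patent does not publish the exact attention design; a CBAM-style channel-plus-spatial module consistent with the description above might be sketched as follows (the reduction ratio and the 7 × 7 spatial kernel are assumptions):

```python
import torch
import torch.nn as nn

class ChannelSpatialAttention(nn.Module):
    """Channel attention followed by spatial attention, CBAM-style;
    a sketch, not the patent's published design."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
        )
        self.spatial = nn.Conv2d(2, 1, kernel_size=7, padding=3)

    def forward(self, x):
        # channel attention: average- and max-pooled descriptors through a shared MLP
        avg = self.mlp(x.mean(dim=(2, 3), keepdim=True))
        mx = self.mlp(x.amax(dim=(2, 3), keepdim=True))
        x = x * torch.sigmoid(avg + mx)
        # spatial attention: channel-wise mean and max maps
        s = torch.cat([x.mean(dim=1, keepdim=True),
                       x.amax(dim=1, keepdim=True)], dim=1)
        return x * torch.sigmoid(self.spatial(s))
```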
In the network training stage, a mixed loss function combining the cross-entropy loss function and the Dice loss function is adopted. The cross-entropy loss is used when the semantic segmentation platform classifies pixels with Softmax. The Dice loss uses a semantic segmentation evaluation index as the loss: the Dice coefficient is a set similarity measure, usually used to compute the similarity of two samples, with values in the range [0, 1].
The Dice loss function was first proposed to address data imbalance in natural language processing and later proved effective for small-target segmentation in medical images. Because iris pigmented spot texture is small, the number of pixels the spots occupy in the image is far smaller than the number of background pixels; a cross-entropy loss alone would bias the model heavily toward the background and degrade detection, so the Dice loss is added to compensate for this deficiency.
References:
[1] Zhu Lijun, Qi of Swedienwa. Iris pigmented macula detection based on bilinear template and partitioning strategy [J]. Chinese Journal of Scientific Instrument, 2015, 36(12): 2714-2721.
[2] Liu Xiaonan, Qi of Swedienwa, Zhang Bo. Iris mass texture detection based on combined window search [J]. Chinese Journal of Scientific Instrument, 2014, 35(8): 1900-1906.
[3] Sword of Swedish, Liu Xiaonan, Sun Xiao. An iris image block texture detection algorithm [J]. Chinese Journal of Scientific Instrument, 2014, 35(5): 1093-1100.
[4] Liu Xiaonan, Qi of Swedia, Zhang Bo. Iris pigment block detection and classification method [J]. Journal of Shenyang University of Technology, 2014, 36(6): 688-693.
[5] Zhao Chenxu. Iris image feature extraction and its medical application [D]. Inner Mongolia University, 2018.
[6] Zhao Libin. Study of an automated localization algorithm for iris pigmented deposition plaques [D]. Shenyang University of Technology, 2012.
[7] Liu Yujie. Study of iris feature extraction methods [D]. Jiaotong University, 2014.
[8] Qi of garden worth, Lin Zhonghua, Xu Lou. A novel iris location algorithm based on human eye structural features [J]. Opto-Electronic Engineering, 2007, 34(1): 112-116.
Disclosure of Invention
The invention aims to provide an automatic segmentation method for eye iris pigmented spot texture, built around a DIPSnet iris pigmented spot detection model comprising an encoder and a decoder. A residual attention module is introduced into the model at the encoder stage, suppressing the redundant features of invalid non-feature regions and improving the accuracy of pigmented spot detection. Second, to suit pigmented spot images, whose semantics are simpler and whose structure is more fixed, the number of convolutions is reduced to prevent overfitting. Finally, a mixed loss is introduced to train the network and address the class imbalance problem: the model combines cross-entropy and Dice loss, further improving segmentation precision. Without removing the texture inside the collarette, the method uses deep learning to automatically identify iris pigmented spots of all sizes.
The purpose of the invention is realized by the following technical scheme:
An automatic segmentation method for eye iris pigmented spot texture, a biomedical image automatic segmentation method based on the U-Net network structure; specifically, an automatic segmentation method for iris pigmented spot texture based on the DIPSnet semantic segmentation network structure, comprising the following steps:
S1: establish an iris detection data set and produce an iris pigmented spot semantic segmentation data set, including normalization preprocessing of the data set, histogram equalization, data set augmentation, division of the data set into a training set and a test set, and testing on the processed test set;
S2: annotate the data set images with labelme and input them into the network model; the output generates a classification probability map with 2 channels and the same resolution as the input image;
S3: build the DIPSnet semantic segmentation network and train it to obtain the overall semantic segmentation model, with the weight parameters of the network model obtained by gradient back-propagation;
S4: input the images of the test set into the DIPSnet semantic segmentation network model; the output passes through a softmax layer to generate a classification probability map;
S5: obtain the segmentation result map of the image from the class probabilities in the classification probability map (steps S4-S5 are sketched in code after this list);
S6: detect the edges of the pigmented spot texture with a Canny operator on the segmentation result map to obtain an edge detection map;
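A minimal decoding sketch for steps S4 and S5, assuming a trained model with a two-channel output (the function and variable names are placeholders):

```python
import torch

@torch.no_grad()
def segment(model, image):
    """image: (1, 3, H, W) float tensor; returns a (H, W) per-pixel class map."""
    model.eval()
    logits = model(image)                  # (1, 2, H, W): background vs. pigmented spot
    probs = torch.softmax(logits, dim=1)   # S4: classification probability map
    return probs.argmax(dim=1).squeeze(0)  # S5: segmentation result map
```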
The DIPSnet network model comprises an encoder with a residual attention mechanism and a decoder network with a deconvolution double up-sampling structure. The encoder comprises, in order: an input layer; a first convolutional layer with 64 output channels; a second convolutional layer; a first fused attention mechanism layer; a first residual attention convolution fusion layer; a first skip connection layer; a first max pooling layer; a third convolutional layer; a fourth convolutional layer; a second fused attention mechanism layer; a second residual attention convolution fusion layer; a second skip connection layer; a second max pooling layer; a fifth convolutional layer; a sixth convolutional layer; a third fused attention mechanism layer; a third residual attention convolution fusion layer; a third skip connection layer; a third max pooling layer; a seventh convolutional layer; an eighth convolutional layer; and a fourth skip connection layer. The decoder network of the up-sampling structure comprises a first up-sampling layer, a first connection layer, a first conventional convolutional layer, a second up-sampling layer, a second connection layer, a second conventional convolutional layer, a third up-sampling layer, a third connection layer, a fourth conventional convolutional layer, and a fifth conventional convolutional layer, which is the output layer. Skip connection layers: the first skip connection layer connects to the third connection layer, the second skip connection layer to the second connection layer, the third skip connection layer to the first connection layer, and the fourth skip connection layer to the first up-sampling layer. There is an activation function in each convolutional layer and conventional convolutional layer. The operation of the reconstructed down-sampling layer includes:
S31: except in the first layer, for a feature map with resolution h × w and channel number c, two 1 × 1 convolutions leave the length and width of the feature map unchanged;
S32: the feature map output by S31 passes through the fused attention mechanism layer and the residual attention convolution fusion layer, yielding a feature map with channel number c;
S33: the feature map output by S32 passes through a max pooling layer to obtain a 1/2h × 1/2w × 2c feature map, completing one down-sampling step in which the resolution is halved and the number of channels is doubled (one such encoder stage is sketched in code after this list).
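Putting S31-S33 together, one encoder stage might look like the following sketch, reusing conv_block and ChannelSpatialAttention from the sketches above. Note the residual superposition of the attended feature map onto the convolution output; also note that the embodiment below describes 3 × 3 convolutions where the claims say 1 × 1, so the kernel size used here is an assumption.

```python
import torch.nn as nn

class DownBlock(nn.Module):
    """One encoder stage: two convolutions, fused attention with residual
    superposition, then 2x max pooling (resolution halves; channels double
    at the next stage's convolutions)."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.convs = conv_block(in_ch, out_ch)       # from the U-Net sketch above
        self.attn = ChannelSpatialAttention(out_ch)  # from the attention sketch above
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        f = self.convs(x)
        f = f + self.attn(f)    # residual fusion of attended and original feature maps
        return f, self.pool(f)  # f feeds the skip connection; the pooled map goes deeper
```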
In the automatic segmentation method for eye iris pigmented spot texture, step S1 specifically includes:
S11: locate and normalize the acquired 800 × 600 pixel iris image data to 200 × 720 pixels;
S12: perform adaptive histogram equalization on the image data with a probability of 80%, then proceed to step S13;
S13: rotate the image data by 180 degrees and flip it vertically and horizontally, then proceed to step S14;
S14: elastically distort the image data with a probability of 10%, completing the data augmentation process (this pipeline is sketched in code after the list);
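A sketch of the S12-S14 augmentation pipeline with OpenCV and NumPy; the probabilities come from the text, while the CLAHE parameters and the elastic-distortion strength are illustrative assumptions:

```python
import random
import cv2
import numpy as np

def augment(img, rng=random):
    """Data augmentation following S12-S14; returns the original image
    plus its augmented variants."""
    out = [img]
    if rng.random() < 0.8:  # S12: adaptive histogram equalization (CLAHE on L channel)
        lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        lab[..., 0] = clahe.apply(lab[..., 0])
        out.append(cv2.cvtColor(lab, cv2.COLOR_LAB2BGR))
    # S13: 180-degree rotation, vertical flip, horizontal flip
    out += [np.rot90(img, 2).copy(), cv2.flip(img, 0), cv2.flip(img, 1)]
    if rng.random() < 0.1:  # S14: elastic distortion via a smoothed random flow field
        h, w = img.shape[:2]
        dx = cv2.GaussianBlur((np.random.rand(h, w) * 2 - 1).astype(np.float32), (17, 17), 5) * 8
        dy = cv2.GaussianBlur((np.random.rand(h, w) * 2 - 1).astype(np.float32), (17, 17), 5) * 8
        xs, ys = np.meshgrid(np.arange(w, dtype=np.float32), np.arange(h, dtype=np.float32))
        out.append(cv2.remap(img, xs + dx, ys + dy, cv2.INTER_LINEAR,
                             borderMode=cv2.BORDER_REFLECT))
    return out
```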
The automatic segmentation method for eye iris pigmented spot texture adopts a mixed loss function combining the cross-entropy loss function and the Dice loss function:
$L_{mix} = L_{CE} + L_{Dice}$

where the cross-entropy loss

$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\,\log p_{i,c}$

and the Dice loss

$L_{Dice} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$

are computed over the pixels of the predicted image and the label image respectively; the mixed loss function is the sum of the two loss functions.
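A PyTorch sketch of this mixed loss for the two-class setting, assuming logits of shape (N, 2, H, W) and integer masks; the smoothing constant eps is an assumption to keep the Dice term stable on empty masks:

```python
import torch
import torch.nn.functional as F

def mixed_loss(logits, target, eps=1.0):
    """Cross-entropy + Dice loss, summed as described above.
    logits: (N, 2, H, W); target: (N, H, W) long tensor with values in {0, 1}."""
    ce = F.cross_entropy(logits, target)
    probs = torch.softmax(logits, dim=1)[:, 1]  # foreground (pigmented spot) probability
    tgt = target.float()
    inter = (probs * tgt).sum(dim=(1, 2))
    dice = (2 * inter + eps) / (probs.sum(dim=(1, 2)) + tgt.sum(dim=(1, 2)) + eps)
    return ce + (1 - dice).mean()
```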
The invention has the advantages and effects that:
A DIPSnet iris pigmented spot detection model based on a deep learning algorithm, comprising an encoder and a decoder. The model introduces a residual attention module at the encoder stage, attending to detail information while acquiring deep semantic information, thereby suppressing the redundant features of invalid non-feature regions and improving the accuracy of pigmented spot detection. Second, to suit pigmented spot images, whose semantics are simpler and whose structure is more fixed, the number of convolutions is reduced to prevent overfitting. Finally, a mixed loss is introduced to train the network and address the class imbalance problem: the model combines cross-entropy and Dice loss, further improving segmentation precision.
Without removing the texture inside the collarette, the deep learning method automatically identifies iris pigmented spots of various sizes, and is applicable to iris recognition systems and iris-based human health monitoring systems.
Drawings
FIG. 1 is an overall flow diagram of the present invention;
FIG. 2 is a schematic diagram of a network architecture used in the present invention;
FIG. 3 is a schematic diagram of the attention module of the present invention;
FIG. 4 is a schematic diagram of a reconstructed downsampling structure of the present invention;
FIG. 5 is a detailed diagram of a network downsampling structure in the present invention;
FIG. 6 is a flowchart of the algorithm details of the present invention during the testing phase;
FIG. 7 is a graphical illustration of the output results of the test set of the algorithm of the present invention;
FIG. 8 is a graph of the segmentation results obtained by the method of the present invention;
FIG. 9 is a diagram of the acquisition apparatus.
Detailed Description
The present invention will be described in detail with reference to the embodiments shown in the drawings.
The method specifically comprises the following steps:
S1: establish an iris detection data set and produce an iris pigmented spot semantic segmentation data set, including normalization preprocessing of the data set, histogram equalization, data set augmentation, division of the data set into a training set and a test set, and testing on the processed test set;
S2: annotate the data set images with labelme and input them into the network model; the output generates a classification probability map with 2 channels and the same resolution as the input image;
S3: build the DIPSnet semantic segmentation network and train it to obtain the overall semantic segmentation model, with the weight parameters of the network model obtained by gradient back-propagation;
S4: input the images of the test set into the DIPSnet semantic segmentation network model; the output passes through a softmax layer to generate a classification probability map;
S5: obtain the segmentation result map of the image from the class probabilities in the classification probability map;
S6: detect the edges of the pigmented spot texture with a Canny operator on the segmentation result map to obtain an edge detection map.
Further, step S1 specifically includes:
S11: locate and normalize the acquired 800 × 600 pixel iris image data to 200 × 720 pixels;
S12: perform adaptive histogram equalization on the image data, then proceed to step S13;
S13: rotate the image data by 180 degrees and flip it vertically and horizontally, then proceed to step S14;
S14: elastically distort the image data with a probability of 10%, completing the data augmentation process;
Preferably, the DIPSnet network model comprises an encoder with an added residual attention mechanism and a decoder network with an up-sampling structure. The encoder comprises, in order: an input layer; a first convolutional layer with 64 output channels; a second convolutional layer; a first fused attention mechanism layer; a first residual attention convolution fusion layer; a first skip connection layer; a first max pooling layer; a third convolutional layer; a fourth convolutional layer; a second fused attention mechanism layer; a second residual attention convolution fusion layer; a second skip connection layer; a second max pooling layer; a fifth convolutional layer; a sixth convolutional layer; a third fused attention mechanism layer; a third residual attention convolution fusion layer; a third skip connection layer; a third max pooling layer; a seventh convolutional layer; an eighth convolutional layer; and a fourth skip connection layer.
The decoder network with the up-sampling structure comprises a first up-sampling layer, a first connection layer, a first conventional convolutional layer, a second up-sampling layer, a second connection layer, a second conventional convolutional layer, a third up-sampling layer, a third connection layer, a fourth conventional convolutional layer, and a fifth conventional convolutional layer, which is the output layer. Skip connection layers: the first skip connection layer connects to the third connection layer, the second skip connection layer to the second connection layer, the third skip connection layer to the first connection layer, and the fourth skip connection layer to the first up-sampling layer. There is an activation function in each convolutional layer and conventional convolutional layer;
Preferably, the operation of the convolutional layers comprises:
S31: input the h × w × c feature map into a deformable convolution layer and convolve it with a convolutional layer whose activation function is ReLU;
S32: except in the first layer, for a feature map with resolution h × w and channel number c, two 1 × 1 convolutions leave the length and width of the feature map unchanged while the channel number of the feature map is increased;
S33: the feature map output by S32 passes through the fused attention mechanism layer and the residual attention convolution fusion layer, yielding a feature map with unchanged channel number;
S34: the feature map output by S33 passes through a max pooling layer to obtain a 1/2h × 1/2w × 2c feature map, completing one down-sampling step in which the resolution is halved and the number of channels is doubled.
Preferably, the operation of the reconstructed up-sampling layer comprises:
S31: except in the first up-sampling, where the resolution is simply doubled, a feature map with resolution h × w and channel number c is concatenated through the skip connection layer into an h × w × 2c feature map;
S32: the feature map output by S31 undergoes two 1 × 1 convolution operations, followed by group normalization and a ReLU activation function, to obtain an h × w × c feature map;
S33: up-sampling the feature map output by S32 produces a 2h × 2w × 1/2c feature map, completing one up-sampling step in which the resolution is doubled and the number of channels is halved (one such decoder stage is sketched in code after this list).
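One decoder stage following S31-S33 might be sketched as follows; the group count and the transposed-convolution up-sampling are assumptions:

```python
import torch
import torch.nn as nn

class UpBlock(nn.Module):
    """One decoder stage per S31-S33: concatenate the skip feature map (S31),
    reduce channels with two 1x1 convolutions + GroupNorm + ReLU (S32), then
    double the resolution and halve the channels (S33)."""
    def __init__(self, ch, groups=8):
        super().__init__()
        self.reduce = nn.Sequential(
            nn.Conv2d(2 * ch, ch, 1), nn.GroupNorm(groups, ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 1), nn.GroupNorm(groups, ch), nn.ReLU(inplace=True),
        )
        self.up = nn.ConvTranspose2d(ch, ch // 2, kernel_size=2, stride=2)

    def forward(self, x, skip):
        x = torch.cat([x, skip], dim=1)  # S31: h x w x 2c
        x = self.reduce(x)               # S32: h x w x c
        return self.up(x)                # S33: 2h x 2w x c/2
```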
Preferably, a mixed loss function combining the cross-entropy loss function and the Dice loss function is adopted:
$L_{mix} = L_{CE} + L_{Dice}$

where the cross-entropy loss

$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\,\log p_{i,c}$

and the Dice loss

$L_{Dice} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$

are computed over the pixel points of the predicted image and the label image respectively; the mixed loss function used is the sum of the two loss functions.
Examples
The detection flow of the pigmented spot texture is as follows:
The first step: data set collection. The research group used an HM9918 iris instrument as the iris acquisition device; the acquisition device and acquisition method are shown in FIG. 9. The acquisition subjects were students of Shenyang University of Technology and patients at a hospital in Shenyang, covering iris samples from different age groups, including the elderly, the middle-aged, and the young. The image resolution is 800 × 600 pixels in a 24-bit bitmap, and there are 980 iris images in total. The iris color images were acquired under a visible light source with the human eye naturally open, yielding visible-light iris images containing various iris textures. This provides the necessary hardware guarantee for the smooth progress of the project.
The data set with iris pigmented spot texture was preprocessed using the method of document [8], as shown in the image normalization portion of FIG. 2. The iris image is normalized along the radial direction into a rectangle: the horizontal axis of the normalized iris image is the circumferential direction of the iris, the vertical axis is the radial direction, and the upper edge is the pupil boundary. The normalized image size is 720 × 200.
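Document [8] is not reproduced here; a standard rubber-sheet unwrapping that yields the rectangle described above might be sketched as follows, assuming concentric pupil and iris circles (the localization step itself is omitted):

```python
import cv2
import numpy as np

def normalize_iris(img, pupil_xy, pupil_r, iris_r, out_w=720, out_h=200):
    """Unwrap the annular iris region into a 720x200 rectangle: columns follow
    the circumferential direction, rows the radial direction, row 0 at the
    pupil edge."""
    cx, cy = pupil_xy
    theta = np.linspace(0, 2 * np.pi, out_w, endpoint=False)
    r = np.linspace(pupil_r, iris_r, out_h)
    rr, tt = np.meshgrid(r, theta, indexing="ij")  # both (out_h, out_w)
    map_x = (cx + rr * np.cos(tt)).astype(np.float32)
    map_y = (cy + rr * np.sin(tt)).astype(np.float32)
    return cv2.remap(img, map_x, map_y, cv2.INTER_LINEAR)
```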
This study applies histogram equalization to the iris images; enhancing the brightness and contrast of the training iris images simulates actual application scenarios. The data set is also augmented, improving the generalization ability and robustness of the model. To ensure the diversity of the data set, images with modified brightness and contrast were added to the data set, and then horizontal mirroring, vertical mirroring, and 180-degree rotation were applied. The expanded data set contains 1138 images, with 90% of the images used for model training and 10% for testing.
Contour annotation: the pigmented spot texture is manually annotated with labelme in the VOC data set format. A group of seven research group members with professional knowledge of the characteristics of iris pigmented spots evaluated the pigmented spot contour annotations; when an annotation was disputed, it was settled by voting, with the minority deferring to the majority.
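For training, the labelme polygon annotations must be rasterized into binary masks; a minimal sketch (the JSON keys are labelme's standard fields, and treating every polygon as a pigmented spot is an assumption):

```python
import json
import cv2
import numpy as np

def labelme_to_mask(json_path):
    """Rasterize labelme polygon annotations into a binary mask
    (1 = pigmented spot, 0 = background)."""
    with open(json_path, encoding="utf-8") as f:
        ann = json.load(f)
    mask = np.zeros((ann["imageHeight"], ann["imageWidth"]), dtype=np.uint8)
    for shape in ann["shapes"]:
        pts = np.asarray(shape["points"], dtype=np.int32)
        cv2.fillPoly(mask, [pts], 1)
    return mask
```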
The second step is that: the input image is first resampled to 512 x 3, and then the training data set is introduced into the model for training,
The experimental environment is Windows 10 with a GTX 1650 discrete graphics card with 4 GB of video memory, programmed in Python 3.6 on the PyTorch 1.8 platform. During training, the learning rate of the initial freezing stage is 0.0001, the learning rate of the thawing stage is 0.00001, the batch size is 1, the number of training epochs is 100, and the loss factor is 0.96. Iterative optimization with a stochastic gradient descent algorithm searches for the global optimum to obtain the best result.
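A training-loop sketch with the stated hyper-parameters. The momentum value is an assumption, the 0.96 "loss factor" is interpreted here as a per-epoch exponential learning-rate decay, and MiniUNet, mixed_loss, and train_loader stand in for the DIPSnet model, the mixed loss sketched earlier, and an assumed data loader of (image, mask) pairs:

```python
import torch

def train(model, train_loader, epochs=100):
    """SGD with freeze-stage lr 1e-4 (the thaw stage would switch to 1e-5)
    and batch size 1 supplied by the loader."""
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-4, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.ExponentialLR(optimizer, gamma=0.96)
    for epoch in range(epochs):
        for image, target in train_loader:  # (1, 3, 512, 512) and (1, 512, 512)
            optimizer.zero_grad()
            loss = mixed_loss(model(image), target)  # mixed CE + Dice loss from above
            loss.backward()
            optimizer.step()
        scheduler.step()  # per-epoch learning-rate decay by 0.96
```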
During training, the image undergoes 3 down-sampling encoding layer operations and 3 up-sampling decoding layer operations with skip connection layers, and finally a classifier classifies each pixel. A residual spatial and channel attention module is embedded in the down-sampling stage of the encoding layer so that the network concentrates on certain feature-layer channels and spatial regions when extracting features. In each down-sampling stage, the attention module is embedded between the two convolution modules and the max pooling module, and the feature map processed by the attention module is superposed once with the feature map from the original convolution, which aggregates feature information better and maximizes the effect. Each convolution module consists of a 3 × 3 convolution and the ReLU activation function.
A mixed loss function combining the cross-entropy loss function and the Dice loss function is adopted. The cross-entropy loss is used when the semantic segmentation platform classifies pixels with Softmax.
The Dice loss function was first proposed to address data imbalance in natural language processing and later proved effective for small-target segmentation in medical images. Because iris pigmented spot texture is small, the number of pixels the spots occupy in the image is far smaller than the number of background pixels; a cross-entropy loss alone would bias the model heavily toward the background and degrade detection, so the Dice loss is added to compensate for this deficiency.
The resulting parameters are applied to the test data set for prediction, the network performance is judged with the evaluation indexes, and finally the network output is converted into a binary visual result map.
The third step: detect the edges of the result image with a Canny operator to obtain a contour map of the detected pigmented spot targets, and fuse it with the original image.
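A sketch of this third step with OpenCV; the Canny thresholds and the red contour color are assumptions:

```python
import cv2
import numpy as np

def overlay_spot_edges(original, mask, low=50, high=150):
    """Canny edges of the binary segmentation result, drawn onto the original image."""
    edges = cv2.Canny((mask * 255).astype(np.uint8), low, high)
    fused = original.copy()
    fused[edges > 0] = (0, 0, 255)  # mark pigmented spot contours in red (BGR)
    return fused
```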
[Table: comparison with the results of different deep learning target detection algorithms — provided only as an image in the original document.]
[Table: comparison experiment with traditional detection algorithms — provided only as an image in the original document.]

Claims (3)

1. An automatic segmentation method for eye iris pigmented spot texture, a biomedical image automatic segmentation method based on the U-Net network structure, characterized in that the method is an automatic segmentation method for iris pigmented spot texture based on the DIPSnet semantic segmentation network structure, comprising the following steps:
S1: establish an iris detection data set and produce an iris pigmented spot semantic segmentation data set, including normalization preprocessing of the data set, histogram equalization, data set augmentation, division of the data set into a training set and a test set, and testing on the processed test set;
S2: annotate the data set images with labelme and input them into the network model; the output generates a classification probability map with 2 channels and the same resolution as the input image;
S3: build the DIPSnet semantic segmentation network and train it to obtain the overall semantic segmentation model, with the weight parameters of the network model obtained by gradient back-propagation;
S4: input the images of the test set into the DIPSnet semantic segmentation network model; the output passes through a softmax layer to generate a classification probability map;
S5: obtain the segmentation result map of the image from the class probabilities in the classification probability map;
S6: detect the edges of the pigmented spot texture with a Canny operator on the segmentation result map to obtain an edge detection map;
The DIPSnet network model comprises an encoder with a residual attention mechanism and a decoder network with a deconvolution double up-sampling structure. The encoder comprises, in order: an input layer; a first convolutional layer with 64 output channels; a second convolutional layer; a first fused attention mechanism layer; a first residual attention convolution fusion layer; a first skip connection layer; a first max pooling layer; a third convolutional layer; a fourth convolutional layer; a second fused attention mechanism layer; a second residual attention convolution fusion layer; a second skip connection layer; a second max pooling layer; a fifth convolutional layer; a sixth convolutional layer; a third fused attention mechanism layer; a third residual attention convolution fusion layer; a third skip connection layer; a third max pooling layer; a seventh convolutional layer; an eighth convolutional layer; and a fourth skip connection layer. The decoder network of the up-sampling structure comprises a first up-sampling layer, a first connection layer, a first conventional convolutional layer, a second up-sampling layer, a second connection layer, a second conventional convolutional layer, a third up-sampling layer, a third connection layer, a fourth conventional convolutional layer, and a fifth conventional convolutional layer, which is the output layer. Skip connection layers: the first skip connection layer connects to the third connection layer, the second skip connection layer to the second connection layer, the third skip connection layer to the first connection layer, and the fourth skip connection layer to the first up-sampling layer. There is an activation function in each convolutional layer and conventional convolutional layer. The operation of the reconstructed down-sampling layer includes:
S31: except in the first layer, for a feature map with resolution h × w and channel number c, two 1 × 1 convolutions leave the length and width of the feature map unchanged;
S32: the feature map output by S31 passes through the fused attention mechanism layer and the residual attention convolution fusion layer, yielding a feature map with channel number c;
S33: the feature map output by S32 passes through a max pooling layer to obtain a 1/2h × 1/2w × 2c feature map, completing one down-sampling step in which the resolution is halved and the number of channels is doubled.
2. The method of claim 1, characterized in that step S1 comprises:
S11: locate and normalize the acquired 800 × 600 pixel iris image data to 200 × 720 pixels;
S12: perform adaptive histogram equalization on the image data with a probability of 80%, then proceed to step S13;
S13: rotate the image data by 180 degrees and flip it vertically and horizontally, then proceed to step S14;
S14: elastically distort the image data with a probability of 10%, completing the data augmentation process.
3. The method of claim 1, characterized in that a mixed loss function combining the cross-entropy loss function and the Dice loss function is adopted:
$L_{mix} = L_{CE} + L_{Dice}$

where the cross-entropy loss

$L_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c} y_{i,c}\,\log p_{i,c}$

and the Dice loss

$L_{Dice} = 1 - \frac{2\,|X \cap Y|}{|X| + |Y|}$

are computed over the pixel points in the predicted image and the label image respectively; the mixed loss function is the sum of the two loss functions.
CN202211259326.7A 2022-12-28 2022-12-28 Automatic segmentation method for eye iris pigmented spot texture Pending CN115984550A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211259326.7A CN115984550A (en) 2022-12-28 2022-12-28 Automatic segmentation method for eye iris pigmented spot texture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211259326.7A CN115984550A (en) 2022-12-28 2022-12-28 Automatic segmentation method for eye iris pigmented spot texture

Publications (1)

Publication Number Publication Date
CN115984550A true CN115984550A (en) 2023-04-18

Family

ID=85968786

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211259326.7A Pending CN115984550A (en) 2022-12-28 2022-12-28 Automatic segmentation method for eye iris pigmented spot texture

Country Status (1)

Country Link
CN (1) CN115984550A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117575946A (en) * 2023-10-19 2024-02-20 南京诺源医疗器械有限公司 Image processing method, apparatus, electronic device, and computer-readable storage medium
CN117541791A (en) * 2023-11-23 2024-02-09 北京师范大学 Eye structure segmentation method, system and equipment based on multi-domain deformable convolution
CN117541791B (en) * 2023-11-23 2024-05-28 北京师范大学 Eye structure segmentation method, system and equipment based on multi-domain deformable convolution
CN117789101A (en) * 2023-11-29 2024-03-29 武汉工程大学 Cream indigo pigment detection method, device and system and storage medium

Similar Documents

Publication Publication Date Title
CN110084156B (en) Gait feature extraction method and pedestrian identity recognition method based on gait features
Shen et al. Domain-invariant interpretable fundus image quality assessment
CN115984550A (en) Automatic segmentation method for eye iris pigmented spot texture
CN108090906B (en) Cervical image processing method and device based on region nomination
CN103400151B (en) The optical remote sensing image of integration and GIS autoregistration and Clean water withdraw method
CN113256641B (en) Skin lesion image segmentation method based on deep learning
US8600143B1 (en) Method and system for hierarchical tissue analysis and classification
CN102902967B (en) Method for positioning iris and pupil based on eye structure classification
CN109840483B (en) Landslide crack detection and identification method and device
CN102844766A (en) Human eyes images based multi-feature fusion identification method
CN113393446B (en) Convolutional neural network medical image key point detection method based on attention mechanism
JP7427080B2 (en) Weakly supervised multitask learning for cell detection and segmentation
CN112308156B (en) Two-stage image change detection method based on counterstudy
CN112465905A (en) Characteristic brain region positioning method of magnetic resonance imaging data based on deep learning
CN109978848A (en) Method based on hard exudate in multiple light courcess color constancy model inspection eye fundus image
CN112884788B (en) Cup optic disk segmentation method and imaging method based on rich context network
CN111126240A (en) Three-channel feature fusion face recognition method
CN105095840A (en) Multidirectional nystagmus signal extraction method based on nystagmus image
CN110264454A (en) Cervical cancer tissues pathological image diagnostic method based on more hidden layer condition random fields
CN110991374B (en) Fingerprint singular point detection method based on RCNN
CN115496720A (en) Gastrointestinal cancer pathological image segmentation method based on ViT mechanism model and related equipment
Zhao et al. Attention residual convolution neural network based on U-net (AttentionResU-Net) for retina vessel segmentation
CN107273793A (en) A kind of feature extracting method for recognition of face
CN117115641A (en) Building information extraction method and device, electronic equipment and storage medium
CN116894820A (en) Pigment skin disease classification detection method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination