CN111667453A - Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning - Google Patents

Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning

Info

Publication number
CN111667453A
CN111667453A (application CN202010316171.0A)
Authority
CN
China
Prior art keywords
dictionary
matrix
image
class
coding coefficient
Prior art date
Legal status
Pending
Application number
CN202010316171.0A
Other languages
Chinese (zh)
Inventor
李胜
申子欣
何熊熊
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202010316171.0A priority Critical patent/CN111667453A/en
Publication of CN111667453A publication Critical patent/CN111667453A/en
Pending legal-status Critical Current

Classifications

    • G06T7/0012 — Biomedical image inspection
    • G06F18/24 — Classification techniques
    • G06F18/253 — Fusion techniques of extracted features
    • G06T7/11 — Region-based segmentation
    • G06V10/56 — Extraction of image or video features relating to colour
    • G06T2207/10068 — Endoscopic image
    • G06T2207/20081 — Training; Learning
    • G06T2207/30028 — Colon; Small intestine
    • G06T2207/30092 — Stomach; Gastric


Abstract

A gastrointestinal endoscope image anomaly detection method based on local-feature and class-label embedding constrained dictionary learning. The original images are preprocessed and regions of interest are extracted; color and texture features are then extracted from the image data and fused, and a training set and a test set are constructed. A dictionary learning model with an atom class-label embedding term and a structural feature constraint term on the profiles is established; the training-set matrix is input into the model, which is solved by iterative updating to train a dictionary and a coding coefficient matrix, and classifier parameters are obtained from the coding coefficient matrix and the label matrix of the training samples. Finally, for each test image, a sparse coefficient vector and its predicted label are obtained by the orthogonal matching pursuit algorithm, and the multi-class disease images are classified by comparing the reconstructed test-set labels with the original test-set labels. The invention realizes the classification of different lesion images of the gastrointestinal endoscope and can effectively classify endoscopic diseases.

Description

Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning
Technical Field
The invention relates to an abnormality classification technology of gastrointestinal endoscopy images, in particular to a dictionary learning method based on atomic class label embedding and profiles local feature constraint, which is suitable for gastrointestinal endoscope image abnormality detection.
Background
More and more people suffer from gastrointestinal diseases. Endoscopy, regarded as the gold standard of gastrointestinal examination, is widely applied to early detection and adjuvant therapy of the gastrointestinal tract and effectively reduces morbidity and mortality. Conventional hand-held endoscopes are invasive diagnostic devices that have difficulty imaging the entire gastrointestinal tract. Compared with traditional endoscopy, wireless capsule endoscopy is a new painless and noninvasive technique that can pass completely through the small intestine and reduces patient discomfort. The patient swallows the capsule endoscope; propelled by gastrointestinal peristalsis, it captures color images for an average duration of about 8 hours and wirelessly transmits them to a data recording device worn at the patient's waist, from which the clinician reviews the images and makes a diagnosis.
Although wireless capsule endoscopy is a technological breakthrough, it takes approximately 2 hours for an experienced clinician to analyze the roughly 50,000 images generated per examination. Abnormal images (polyps, bleeding, ulcers, etc.) typically account for less than 5% of all images. In addition, the spatial features of abnormalities (such as shape, texture, size, and contrast with the surrounding environment) vary widely, so it can be difficult for a doctor to reliably detect an abnormality in all cases. An automatic computer-aided system that classifies abnormal images can therefore reduce the clinician's burden and improve efficiency and accuracy.
Disclosure of Invention
In order to overcome the shortcoming that existing multi-anomaly classification results remain far from satisfactory, the invention provides a dictionary learning method with atom class-label embedding and profiles local-feature constraints for classifying gastrointestinal endoscope image anomalies. By combining dictionary learning, sparse representation of image features, and comparison of reconstruction errors across the image classes, a linear classifier based on a multi-class dictionary is constructed, which effectively classifies the different abnormal images of a gastrointestinal endoscope.
The technical scheme proposed for solving the technical problems is as follows:
a gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedding constraint dictionary learning comprises the following steps:
step 1: acquiring an endoscope image set which consists of polyps, ulcers and normal images;
step 2: image preprocessing: extracting a tissue area of the whole endoscope image, generating a mask in an invalid area, and removing an image with low quality;
Step 3: extracting image color features: extracting color features of the image by a color histogram method in the HSI color space;
Step 4: extracting image texture features: in the HSI color space, extracting the texture features of the image with a scale-invariant feature transform operator;
Step 5: image feature fusion: fusing and normalizing the color and texture features of the image;
Step 6: initializing a dictionary and a coding coefficient matrix by using the K-SVD algorithm, whose model is:

min_{D,X} ||Y − DX||_F^2  s.t. ||x_i||_0 ≤ T_0, i = 1, …, n    (1)

where Y = [Y_1, …, Y_C] = [y_1, y_2, …, y_n] ∈ R^{m×n} is the training data set with n training samples, m is the dimension of the training samples, C is the number of classes, and Y_i is the matrix of class-i training samples; D = [d_1, d_2, …, d_K] ∈ R^{m×K} is the dictionary, K the number of atoms and d_i the i-th atom of D; X = [x_1, x_2, …, x_n] ∈ R^{K×n} is the coding coefficient matrix, where x_i = [x_{1i}, x_{2i}, …, x_{Ki}]^T is the coding coefficient of the i-th training sample y_i over the dictionary D; T_0 is the sparsity level; ||x||_0 is the 0-norm, the number of non-zero elements of the coefficient vector x, and ||x||_2 is the 2-norm, the square root of the sum of the squares of its elements. Solving yields the initial dictionary D_0 = [D_1, D_2, …, D_C] and the initial coding coefficient matrix X_0 = [X_1, X_2, …, X_C];
Step 7: on the basis of step 6, an atom class-label embedding term and a structural feature constraint term on the profiles are introduced; the dictionary model for endoscope image classification is:

min_{D,X} ||Y − DX||_F^2 + λ||X||_F^2 + α f(X) + β g(D)    (2)

where ||Y − DX||_F^2 measures the reconstruction performance of the dictionary, λ||X||_F^2 is the regularization term on the coding coefficients, f(X) is the class-label embedding term of formula (3), g(D) is the profiles structural feature constraint term of formula (6), and λ, α and β are scalar parameters;
7.1 Construction of the atom class-label embedding model:
For the training samples of the i-th class, a class-specific dictionary D_i is learned with the K-SVD algorithm; the C classes of training samples thus yield a class-specific dictionary D_0 = [D_1, D_2, …, D_C] containing C classes of atoms. If an atom belongs to D_i, its class label is defined as b = [0, 0, …, 0, 1, 0, …, 0] ∈ R^C, whose i-th element is 1 and the rest are zero; stacking the labels of all K atoms gives the atom class-label matrix B = [b_1, b_2, …, b_K]^T ∈ R^{K×C}. The weighted class-label matrix H of the dictionary is derived from B (its defining expression appears as an equation image in the original). Using the coding coefficient matrix X^T and the atom weighted class-label matrix H, the class-label embedding term is constructed as

f(X) = tr(X^T U X)    (3)

where U ∈ R^{K×K} is the extended class-label matrix computed from H; minimizing this term encourages each sample to be coded mainly by atoms of its own class;
7.2 Construction of the profiles structural feature constraint model:
Each row vector of the coding coefficient matrix in dictionary learning is defined as a profile. Let P = X^T; then P = [p_1, p_2, …, p_K] ∈ R^{n×K}, where p_i = [x_{i1}, …, x_{in}]^T is the profile corresponding to atom d_i. To improve the discriminative power of the coding coefficients, a discriminant combining the structural features of the profiles with the atoms is designed. A neighbor graph M is constructed from the matrix P, with each p_i a vertex; the weight matrix W of the neighbor graph M is computed as

w_{i,j} = exp(−||p_i − p_j||_2^2 / σ), if p_j ∈ kNN(p_i); w_{i,j} = 0 otherwise    (4)

where σ is a scale parameter, kNN(p_i) denotes the k nearest neighbors of p_i, and w_{i,j} reflects the similarity between p_i and p_j. To better capture the relation between p_i and p_j, the graph Laplacian L based on the profiles features is constructed as

L = V − W    (5)

where V is the diagonal degree matrix with v_{ii} = Σ_j w_{i,j}. The discriminant based on the profiles features is then designed as

g(D) = (1/2) Σ_{i,j} w_{i,j} ||d_i − d_j||_2^2 = tr(D L D^T)    (6)

which forces atoms whose profiles are similar to remain close to each other;
Step 8: optimizing the dictionary model:
8.1 Solving and updating dictionary D:
Assuming that the coding coefficient matrix X, the Laplacian L and the matrix U in the objective function are constant, formula (2) reduces to

min_D ||Y − DX||_F^2 + β tr(D L D^T)    (7)

Solving with a Lagrange function (the multiplier η enforcing the atom-norm constraint) yields the optimal dictionary

D = Y X^T (X X^T + βL + ηI)^{−1}    (8)

where η is a parameter and I is the identity matrix;
8.2 Solving and updating the coding coefficient matrix X:
Assuming that the dictionary D, the Laplacian L and the matrix U in the objective function are constant, formula (2) reduces to

min_X ||Y − DX||_F^2 + γ||X||_F^2 + α tr(X^T U X)    (9)

where γ plays the role of the coefficient-regularization weight λ in (2). Setting the derivative of (9) to zero gives the optimal coding coefficient matrix

X = (D^T D + αU + γI)^{−1} D^T Y    (10)

Once the coding coefficients are obtained, the matrix U is updated according to formula (3) and the Laplacian L according to formula (5);
Step 9: constructing a linear classifier. First, the classifier parameters G_x are obtained from the coding coefficient matrix X and the label matrix H of the training samples (here H ∈ R^{C×n}, one one-hot label column per training sample):

G_x = H X^T (X X^T + I)^{−1}    (11)

Then, for a test image ŷ, the coding coefficient vector x̂ over the dictionary D is computed with the orthogonal matching pursuit algorithm, and the predicted label vector l_x = G_x x̂ is calculated. Finally, the class of the test sample ŷ is the index of the largest element of the predicted label vector l_x.
Compared with the prior art, the invention has the following beneficial effects:
a. the invention provides an abnormal-image classification algorithm that assists doctors in diagnosing gastrointestinal abnormalities, effectively shortening analysis time and reducing workload;
b. the invention adopts a dictionary learning algorithm for classifying abnormal gastroscope images: a discriminant based on the structural features of the profiles improves the discriminative power of the coding coefficients, and a class label assigned to each atom of the dictionary is used to build a discriminant that improves the discriminability of the dictionary;
c. the invention designs an optimization method for the dictionary-learning classification of abnormal gastroscope images: the objective function is solved by alternating constraints and direct differentiation and is updated until the algorithm converges; linear-classifier parameters are obtained from the coding coefficient matrix and the label matrix of the training samples, the sparse representation coefficients of a test image over the learned dictionary are obtained by orthogonal matching pursuit, the predicted label vector is then computed with the classifier parameters, and the class of the test sample is the index of the largest element of the predicted label vector;
d. the invention fuses color and texture features: in the experience of clinicians, color alone distinguishes many abnormal gastrointestinal endoscope images, but its discrimination on some of them remains limited; fusing texture features effectively improves classification accuracy;
e. the invention uses an image preprocessing method suited to capsule endoscopy, including region-of-interest extraction and rejection of low-quality images, reducing the interference caused by useless information.
Drawings
FIG. 1 is a block diagram of a dictionary learning system of the present invention;
FIG. 2 is a graph showing the effect of the preprocessing of the present invention;
FIG. 3 is a detailed algorithm flow chart of the multi-class dictionary learning and sparse reconstruction of the present invention.
FIG. 4 is a graph comparing the classification accuracy of the fused features of the present invention with that of other features at different dictionary sizes.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
Referring to fig. 1 to 4, a gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedding constraint dictionary learning comprises the following steps:
step 1: acquiring an endoscope image set, wherein the endoscope image set consists of three different gastroscope images, namely polyp images, ulcer images and normal images, and the number of the images in each type is the same;
Step 2: image preprocessing: images taken directly from the endoscope usually contain much useless information, such as instrument markers, time stamps and patient information. The gastrointestinal tract also contains many interference factors, such as bubbles and food residue, and shooting conditions produce a number of low-quality images. The acquired gastroscope images are therefore preprocessed: the tissue region of each endoscopic image is extracted, a mask is generated over the invalid region, and low-quality images are removed;
Step 3: extracting image color features: in the HSI color space, a color histogram algorithm is adopted; the values of the hue, saturation and intensity channels are quantized separately and counted to form a color histogram, which serves as the color feature of the image;
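This color-feature step can be sketched as follows. The RGB-to-HSI conversion uses the standard geometric formula; the per-channel bin counts (8/4/4) are illustrative assumptions, not values from the patent.

```python
import numpy as np

def hsi_histogram(rgb, bins=(8, 4, 4)):
    """Color feature: concatenated per-channel histograms in HSI space."""
    rgb = rgb.astype(float) / 255.0
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-8
    i = (r + g + b) / 3.0                                    # intensity
    s = 1.0 - np.minimum(np.minimum(r, g), b) / (i + eps)    # saturation
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    h = np.arccos(np.clip(num / den, -1.0, 1.0))             # hue angle
    h = np.where(b > g, 2 * np.pi - h, h) / (2 * np.pi)      # normalize hue to [0, 1]
    feats = []
    for chan, nb in zip((h, s, i), bins):
        hist, _ = np.histogram(chan, bins=nb, range=(0.0, 1.0))
        feats.append(hist)
    v = np.concatenate(feats).astype(float)
    return v / (v.sum() + eps)                               # normalized histogram
```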
Step 4: extracting image texture features: in the HSI color space, a scale-invariant feature transform operator searches for extreme points across spatial scales, extracts their position, scale and rotation-invariant information, and describes each extreme point with a group of vectors, yielding the texture features of the image;
Step 5: image feature fusion: a single feature cannot represent the image well; the color and texture features complement each other and are more discriminative when fused. Because different feature computations produce data of different dimensions and ranges, these differences are removed by normalization, improving the data representation. In the fused feature matrix, the number of rows is the feature dimension and the number of columns is the number of images, i.e., each column vector is the feature of one image;
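The fusion and normalization step can be sketched as below; per-dimension min-max normalization is an assumed choice (the text only requires that scale differences between the features be removed).

```python
import numpy as np

def fuse_features(color_feats, texture_feats, eps=1e-8):
    """Stack color and texture features (one column per image, as in the text)
    and min-max normalize each feature dimension across the image set."""
    F = np.vstack([color_feats, texture_feats])   # rows: feature dims, cols: images
    lo = F.min(axis=1, keepdims=True)
    hi = F.max(axis=1, keepdims=True)
    return (F - lo) / (hi - lo + eps)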
Step 6: initializing a dictionary and a coding coefficient matrix by using the K-SVD algorithm, whose model is:

min_{D,X} ||Y − DX||_F^2  s.t. ||x_i||_0 ≤ T_0, i = 1, …, n    (1)

where Y = [Y_1, …, Y_C] = [y_1, y_2, …, y_n] ∈ R^{m×n} is the training data set with n training samples, m is the dimension of the training samples, C is the number of classes, and Y_i is the matrix of class-i training samples; D = [d_1, d_2, …, d_K] ∈ R^{m×K} is the dictionary, K the number of atoms and d_i the i-th atom of D; X = [x_1, x_2, …, x_n] ∈ R^{K×n} is the coding coefficient matrix, where x_i = [x_{1i}, x_{2i}, …, x_{Ki}]^T is the coding coefficient of the i-th training sample y_i over the dictionary D; T_0 is the sparsity level; ||x||_0 is the 0-norm, the number of non-zero elements of the coefficient vector x, and ||x||_2 is the 2-norm, the square root of the sum of the squares of its elements. Solving yields the initial dictionary D_0 = [D_1, D_2, …, D_C] and the initial coding coefficient matrix X_0 = [X_1, X_2, …, X_C];
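The K-SVD initialization of step 6 can be sketched as a toy implementation (OMP sparse coding alternated with rank-1 SVD atom updates). This is a didactic sketch, not the patent's code; function names and iteration counts are illustrative.

```python
import numpy as np

def omp(D, y, t0):
    """Orthogonal matching pursuit: greedy sparse code of y over D with <= t0 atoms."""
    resid, idx = y.astype(float).copy(), []
    x = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(t0):
        j = int(np.argmax(np.abs(D.T @ resid)))   # atom most correlated with residual
        if j not in idx:
            idx.append(j)
        coef, *_ = np.linalg.lstsq(D[:, idx], y, rcond=None)
        resid = y - D[:, idx] @ coef
    x[idx] = coef
    return x

def ksvd(Y, K, t0, iters=10, seed=0):
    """Toy K-SVD: initialize atoms from data columns, then alternate
    OMP coding and per-atom rank-1 SVD updates on the residual."""
    rng = np.random.default_rng(seed)
    D = Y[:, rng.choice(Y.shape[1], K, replace=False)].astype(float)
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    X = np.zeros((K, Y.shape[1]))
    for _ in range(iters):
        X = np.column_stack([omp(D, Y[:, i], t0) for i in range(Y.shape[1])])
        for k in range(K):
            users = np.nonzero(X[k])[0]           # samples that use atom k
            if users.size == 0:
                continue
            Xk = X.copy()
            Xk[k] = 0
            E = Y[:, users] - D @ Xk[:, users]    # reconstruction error without atom k
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, k] = U[:, 0]                     # best rank-1 refit of atom k
            X[k, users] = s[0] * Vt[0]
    return D, X
```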
Step 7: on the basis of step 6, an atom class-label embedding term and a structural feature constraint term on the profiles are introduced; the dictionary model for endoscope image classification is:

min_{D,X} ||Y − DX||_F^2 + λ||X||_F^2 + α f(X) + β g(D)    (2)

where ||Y − DX||_F^2 measures the reconstruction performance of the dictionary, λ||X||_F^2 is the regularization term on the coding coefficients, f(X) is the class-label embedding term of formula (3), g(D) is the profiles structural feature constraint term of formula (6), and λ, α and β are scalar parameters;
7.1 Construction of the atom class-label embedding model:
For the training samples of the i-th class, a class-specific dictionary D_i is learned with the K-SVD algorithm; the C classes of training samples thus yield a class-specific dictionary D_0 = [D_1, D_2, …, D_C] containing C classes of atoms. If an atom belongs to D_i, its class label is defined as b = [0, 0, …, 0, 1, 0, …, 0] ∈ R^C, whose i-th element is 1 and the rest are zero; stacking the labels of all K atoms gives the atom class-label matrix B = [b_1, b_2, …, b_K]^T ∈ R^{K×C}. The weighted class-label matrix H of the dictionary is derived from B (its defining expression appears as an equation image in the original). Using the coding coefficient matrix X^T and the atom weighted class-label matrix H, the class-label embedding term is constructed as

f(X) = tr(X^T U X)    (3)

where U ∈ R^{K×K} is the extended class-label matrix computed from H; minimizing this term encourages each sample to be coded mainly by atoms of its own class;
7.2 Construction of the profiles structural feature constraint model:
Each row vector of the coding coefficient matrix in dictionary learning is defined as a profile. Let P = X^T; then P = [p_1, p_2, …, p_K] ∈ R^{n×K}, where p_i = [x_{i1}, …, x_{in}]^T is the profile corresponding to atom d_i. To improve the discriminative power of the coding coefficients, a discriminant combining the structural features of the profiles with the atoms is designed. A neighbor graph M is constructed from the matrix P, with each p_i a vertex; the weight matrix W of the neighbor graph M is computed as

w_{i,j} = exp(−||p_i − p_j||_2^2 / σ), if p_j ∈ kNN(p_i); w_{i,j} = 0 otherwise    (4)

where σ is a scale parameter, kNN(p_i) denotes the k nearest neighbors of p_i, and w_{i,j} reflects the similarity between p_i and p_j. To better capture the relation between p_i and p_j, the graph Laplacian L based on the profiles features is constructed as

L = V − W    (5)

where V is the diagonal degree matrix with v_{ii} = Σ_j w_{i,j}. The discriminant based on the profiles features is then designed as

g(D) = (1/2) Σ_{i,j} w_{i,j} ||d_i − d_j||_2^2 = tr(D L D^T)    (6)

which forces atoms whose profiles are similar to remain close to each other;
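The profiles kNN weight matrix W and the graph Laplacian L = V − W described above can be sketched as follows. The Gaussian-kernel weights are the standard construction; the values of σ and k, and the symmetrization step, are illustrative implementation choices.

```python
import numpy as np

def profiles_laplacian(X, k=3, sigma=1.0):
    """Build the kNN weight matrix W over atom profiles and the Laplacian L = V - W.

    Profiles are the rows of the coding-coefficient matrix X (P = X^T,
    column p_i of P being the profile of atom i).
    """
    P = X.T                                                  # (n, K)
    K = P.shape[1]
    d2 = ((P[:, :, None] - P[:, None, :]) ** 2).sum(axis=0)  # pairwise squared distances
    W = np.zeros((K, K))
    for i in range(K):
        nn = np.argsort(d2[i])[1:k + 1]                      # k nearest profiles, excluding self
        W[i, nn] = np.exp(-d2[i, nn] / sigma)
    W = np.maximum(W, W.T)                                   # symmetrize the graph
    L = np.diag(W.sum(axis=1)) - W                           # degree matrix minus weights
    return W, L
```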
Step 8: optimizing the dictionary model:
8.1 Solving and updating dictionary D:
Assuming that the coding coefficient matrix X, the Laplacian L and the matrix U in the objective function are constant, formula (2) reduces to

min_D ||Y − DX||_F^2 + β tr(D L D^T)    (7)

Solving with a Lagrange function (the multiplier η enforcing the atom-norm constraint) yields the optimal dictionary

D = Y X^T (X X^T + βL + ηI)^{−1}    (8)

where η is a parameter and I is the identity matrix;
8.2 Solving and updating the coding coefficient matrix X:
Assuming that the dictionary D, the Laplacian L and the matrix U in the objective function are constant, formula (2) reduces to

min_X ||Y − DX||_F^2 + γ||X||_F^2 + α tr(X^T U X)    (9)

where γ plays the role of the coefficient-regularization weight λ in (2). Setting the derivative of (9) to zero gives the optimal coding coefficient matrix

X = (D^T D + αU + γI)^{−1} D^T Y    (10)

Once the coding coefficients are obtained, the matrix U is updated according to formula (3) and the Laplacian L according to formula (5);
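The two closed-form updates above can be written directly in numpy; U and L are held fixed during each update, as in the alternating scheme. The default values of β and α follow the experiment section, while η and γ are assumed small ridge parameters (illustrative, not from the patent).

```python
import numpy as np

def update_dictionary(Y, X, L, beta=1e-2, eta=1e-3):
    """Dictionary update: D = Y X^T (X X^T + beta*L + eta*I)^(-1)."""
    K = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + beta * L + eta * np.eye(K))

def update_codes(Y, D, U, alpha=1e-4, gamma=1e-3):
    """Coefficient update: X = (D^T D + alpha*U + gamma*I)^(-1) D^T Y."""
    K = D.shape[1]
    return np.linalg.inv(D.T @ D + alpha * U + gamma * np.eye(K)) @ D.T @ Y
```

In the full algorithm these two calls alternate, with U and L recomputed from the current codes between iterations.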
Step 9: constructing a linear classifier. First, the classifier parameters G_x are obtained from the coding coefficient matrix X and the label matrix H of the training samples (here H ∈ R^{C×n}, one one-hot label column per training sample):

G_x = H X^T (X X^T + I)^{−1}    (11)

Then, for a test image ŷ, the coding coefficient vector x̂ over the dictionary D is computed with the orthogonal matching pursuit algorithm, and the predicted label vector l_x = G_x x̂ is calculated. Finally, the class of the test sample ŷ is the index of the largest element of the predicted label vector l_x.
The effect of the preprocessing is illustrated in FIG. 2. A gastroscopic image taken directly during a gastrointestinal examination roughly contains two large areas: a background region and a tissue region. The background often carries patient information, instrument information, time, etc., and the tissue region also contains many interference factors, such as bubbles, food residue, or dark areas and light spots caused by the camera. The original image is first converted to grayscale, the largest connected area is found to determine the tissue region, and the color tissue region is then cut out of the original image along its boundary points. Panel (a) is a raw gastroscopic image of the stomach; panel (b) is the extracted tissue region. This preprocessing effectively extracts the tissue region and reduces the interference of these factors with feature extraction.
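A minimal sketch of this tissue-region extraction, assuming a simple brightness threshold and 4-connected BFS labeling; the threshold value is illustrative, and the low-quality-image rejection step is omitted.

```python
import numpy as np
from collections import deque

def extract_tissue_region(img, thresh=30):
    """Crop the largest bright connected region (assumed tissue) from an RGB image.

    Thresholds a grayscale version of the image, finds the largest
    4-connected foreground component by BFS, and crops the original
    image to that component's bounding box.
    """
    gray = img.mean(axis=2)                      # simple grayscale conversion
    fg = gray > thresh                           # foreground mask
    h, w = fg.shape
    labels = np.zeros(fg.shape, dtype=int)
    best_size, best_label, cur = 0, 0, 0
    for sy in range(h):
        for sx in range(w):
            if fg[sy, sx] and labels[sy, sx] == 0:
                cur += 1
                labels[sy, sx] = cur
                q, size = deque([(sy, sx)]), 0
                while q:                          # BFS over the component
                    y, x = q.popleft()
                    size += 1
                    for ny, nx in ((y-1, x), (y+1, x), (y, x-1), (y, x+1)):
                        if 0 <= ny < h and 0 <= nx < w and fg[ny, nx] and labels[ny, nx] == 0:
                            labels[ny, nx] = cur
                            q.append((ny, nx))
                if size > best_size:
                    best_size, best_label = size, cur
    ys, xs = np.where(labels == best_label)
    return img[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
```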
FIG. 3 shows the detailed algorithm flow of the multi-class dictionary learning and sparse reconstruction. 80% of the images are randomly selected as the training set and the remaining 20% as the test set. The dictionary D_0 and coding coefficient matrix X_0 are obtained by initialization; the atom class-label matrix B of the dictionary is built from the class labels of the training samples, the atom weight class-label matrix H and its extended class-label matrix U are computed, and the initial Laplacian matrix L_0 is computed from the profile matrix P_0. After several iterations the final dictionary D and coding coefficient matrix X are output, and the classification accuracy on the test images is computed with the linear classifier.
To verify the effectiveness of the invention, the following is illustrated by simulation:
the gastroscopic dataset consists of three different categories of gastroscopic sites including gastric polyps, gastric ulcers and normal stomach images, with the same number of images per site, 250 each, for a total of 750. Wherein, 200 images of each type are randomly extracted as test images, and the other 50 images are used as training images. The experimental results take the accuracy as the judgment standard.
Experiment 1: the proposed classification method is used to classify the gastroscope images. To select suitable parameters when training the dictionary, training and test samples were drawn at random for dictionary learning and classification, and the parameter values giving the better classification results were retained: α = 10^-4, β = 10^-2, λ = 10^-3, with a dictionary size of 200. Results are averaged over 30 runs. Table 1 reports the classification results of the proposed endoscope image classification method: the classification accuracy for polyp, ulcer and normal images, and the overall average accuracy over the three classes.

Category    Polyp    Ulcer    Normal    Overall
Accuracy    0.926    0.901    0.943     0.923

TABLE 1
With the proposed classification method, the average accuracy for every image class exceeds 90% and the computation is fast, demonstrating the effectiveness and reliability of the method.
Experiment 2: single color features (CH, color histogram) or texture features (SIFT) are compared with the fusion feature proposed by the invention. The dictionary contains 200 atoms. Table 2 compares the average accuracy of the proposed endoscope image classification method with that obtained from the individual features.

Feature            CH      SIFT    CH+SIFT
Average accuracy   0.907   0.861   0.923

TABLE 2
Compared with single characteristics, the fusion characteristics provided by the invention obtain better results in the aspect of classification accuracy.
In the experimental verification, the following factors need to be considered:
1) when partitioning the images, a certain number of samples are randomly selected from each class as the training set and the remainder serve as test samples, avoiding the training bias caused by too few training samples; the experiments are repeated several times and averaged to obtain a more stable result;
2) the number of training samples is much larger than the total number of classes, ensuring high correlation of the training samples;
3) during dictionary training, the influence of the dictionary size on the results must be considered.
Therefore, the dictionary size is also examined. Experiment 3 uses the same data as experiment 2: the three selected features are tested to obtain accuracy comparisons under different dictionary sizes. As shown in FIG. 4, the influence of the dictionary size varies between features, but the fusion feature selected by the invention already achieves high accuracy with a small dictionary, which also reduces complexity and training time compared with larger dictionaries. In addition, FIG. 4 shows that the proposed fusion feature achieves higher accuracy at every dictionary size.

Claims (1)

1. A gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedding constraint dictionary learning is characterized by comprising the following steps:
step 1: acquiring an endoscope image set which consists of polyps, ulcers and normal images;
step 2: image preprocessing: extracting a tissue area of the whole endoscope image, generating a mask in an invalid area, and removing an image with low quality;
Step 3: extracting image color features: extracting color features of the image by a color histogram method in the HSI color space;
Step 4: extracting image texture features: in the HSI color space, extracting the texture features of the image with a scale-invariant feature transform operator;
Step 5: image feature fusion: fusing and normalizing the color and texture features of the image;
Step 6: initializing a dictionary and a coding coefficient matrix by using the K-SVD algorithm, whose model is:

min_{D,X} ||Y − DX||_F^2  s.t. ||x_i||_0 ≤ T_0, i = 1, …, n    (1)

where Y = [Y_1, …, Y_C] = [y_1, y_2, …, y_n] ∈ R^{m×n} is the training data set with n training samples, m is the dimension of the training samples, C is the number of classes, and Y_i is the matrix of class-i training samples; D = [d_1, d_2, …, d_K] ∈ R^{m×K} is the dictionary, K the number of atoms and d_i the i-th atom of D; X = [x_1, x_2, …, x_n] ∈ R^{K×n} is the coding coefficient matrix, where x_i = [x_{1i}, x_{2i}, …, x_{Ki}]^T is the coding coefficient of the i-th training sample y_i over the dictionary D; T_0 is the sparsity level; ||x||_0 is the 0-norm, the number of non-zero elements of the coefficient vector x, and ||x||_2 is the 2-norm, the square root of the sum of the squares of its elements. Solving yields the initial dictionary D_0 = [D_1, D_2, …, D_C] and the initial coding coefficient matrix X_0 = [X_1, X_2, …, X_C];
Step 7: according to step 6, an atom class-label embedding term and a profiles structural-feature constraint term are introduced; the dictionary model expression for endoscope image classification is as follows:

min_{D,X} ||Y - DX||_F^2 + λ||X||_F^2 + α f(X, H) + β tr(D L D^T)   (2)

where ||Y - DX||_F^2 represents the reconstruction performance of the dictionary, ||X||_F^2 is the regularization term on the coding coefficients, f(X, H) is the class-label embedding term constructed in step 7.1, tr(D L D^T) is the profiles structural-feature constraint term constructed in step 7.2, and λ, α and β are scalar parameters;
7.1 Construction of the atom class-label embedding model:
For the i-th class of training samples, a class-specific dictionary D_i is learned with the K-SVD algorithm; learning over the C classes of training samples thus yields the class-specific dictionary D_0 = [D_1, D_2, …, D_C], so dictionary D_0 contains C classes of atoms. If atom d_i ∈ D_i, the class label of atom d_i is defined as b_i = [0, 0, …, 0, 1, 0, …, 0] ∈ R^C, in which the i-th element is 1 and the remaining elements are 0; the atom class matrix of dictionary D is therefore defined as B = [b_1, b_2, …, b_K]^T ∈ R^{K×C}. The weighted class-label matrix H of the dictionary is defined from B (its defining expression appears only as an image in the original document). Using the coding coefficient matrix X^T and the atom weighted class-label matrix H, the class-label embedding term is constructed as formula (3) (the expression appears only as an image in the original document);
7.2 Construction of the profiles structural-feature constraint model:
Each row vector of the coding coefficient matrix in dictionary learning is defined as a profile. Let P = X^T; then P = [p_1, p_2, …, p_K] ∈ R^{n×K}, where p_i = [x_{i1}, …, x_{in}]^T is the profile corresponding to atom d_i. To improve the discriminative power of the coding coefficients, a discriminant term is designed by combining the structural features of the profiles with the atoms: a neighbor graph M is constructed from the matrix P, with each p_i as a vertex; the weight matrix W of the neighbor graph M is computed as follows:

w_{ij} = exp(-||p_i - p_j||_2^2 / σ)  if p_j ∈ kNN(p_i);  w_{ij} = 0 otherwise   (4)

where σ is the scale parameter and kNN(p_i) denotes the k nearest neighbors of p_i; w_{ij} reflects the similarity between p_i and p_j. To better reflect the relation between p_i and p_j, the Laplacian graph L based on the profiles features is constructed as follows:

L = S - W,  where S is the diagonal matrix with S_{ii} = Σ_j w_{ij}   (5)

The discriminant term based on the profiles features is designed as follows:

tr(D L D^T) = (1/2) Σ_{i,j} w_{ij} ||d_i - d_j||_2^2   (6)
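The weight-matrix and Laplacian construction of step 7.2 can be sketched in numpy as follows, assuming a heat-kernel weight with scale `sigma` over a kNN graph; the symmetrisation step and the parameter defaults are illustrative implementation choices not stated in the claim.

```python
import numpy as np

def profiles_laplacian(X, k=5, sigma=1.0):
    """Build the kNN weight matrix W and the graph Laplacian L = S - W
    from the profiles P = X^T (each profile is one row of X)."""
    P = X.T                      # P = X^T; column p_i of P is atom i's profile
    prof = P.T                   # shape (K, n): row i is profile p_i
    K = prof.shape[0]
    # pairwise squared distances between profiles
    d2 = ((prof[:, None, :] - prof[None, :, :]) ** 2).sum(-1)
    W = np.zeros((K, K))
    for i in range(K):
        nn = np.argsort(d2[i])[1:k + 1]      # k nearest neighbours, self excluded
        W[i, nn] = np.exp(-d2[i, nn] / sigma)
    W = np.maximum(W, W.T)                   # symmetrise so L is symmetric
    L = np.diag(W.sum(axis=1)) - W           # L = S - W, S_ii = sum_j w_ij
    return W, L
```

Because L is a graph Laplacian, it is symmetric positive semidefinite and its rows sum to zero, which makes the closed-form dictionary update below well behaved.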
Step 8: optimizing the dictionary model:
8.1 Solving and updating the dictionary D:
Assuming that the coding coefficient matrix X, the Laplacian graph L and the matrix U in the objective function are constant, formula (2) reduces to formula (7):

min_D ||Y - DX||_F^2 + β tr(D L D^T)   (7)

Solving with the Lagrange function yields the optimal dictionary D:

D = Y X^T (X X^T + βL + ηI)^{-1}   (8)

where η is a parameter and I is the identity matrix;
8.2 Solving and updating the coding coefficient matrix X:
Assuming that the dictionary D, the Laplacian graph L and the matrix U in the objective function are constant, formula (2) reduces to:

min_X ||Y - DX||_F^2 + α tr(X^T U X) + γ||X||_F^2   (9)

Setting the derivative of formula (9) to zero yields the optimal coding coefficient matrix X:

X = (D^T D + αU + γI)^{-1} D^T Y   (10)

where γ is a parameter; once the coding coefficients are obtained, the matrix U is updated according to formula (3) and the Laplacian graph L according to formula (5);
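The two closed-form updates of step 8 transcribe directly from equations (8) and (10); in this sketch the matrices L and U are taken as given, and the parameter defaults are illustrative.

```python
import numpy as np

def update_dictionary(Y, X, L, beta=0.1, eta=0.01):
    """Dictionary update D = Y X^T (X X^T + beta*L + eta*I)^(-1), equation (8)."""
    K = X.shape[0]
    return Y @ X.T @ np.linalg.inv(X @ X.T + beta * L + eta * np.eye(K))

def update_coefficients(Y, D, U, alpha=0.1, gamma=0.01):
    """Coefficient update X = (D^T D + alpha*U + gamma*I)^(-1) D^T Y, equation (10)."""
    K = D.shape[1]
    return np.linalg.inv(D.T @ D + alpha * U + gamma * np.eye(K)) @ D.T @ Y
```

Alternating these two calls, with L and U refreshed from the current coefficients between iterations, implements the optimization loop of step 8.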
Step 9: constructing the linear classifier:
First, the classifier parameters G_x are obtained from the coding coefficient matrix X and the label matrix H of the training samples:

G_x = H X^T (X X^T + I)^{-1}   (11)

Then, for a test image ŷ, the coding coefficient vector x̂ of ŷ over the dictionary D is computed with the orthogonal matching pursuit algorithm, and the predicted label vector l_x = G_x x̂ is calculated; finally, the class label of the test sample ŷ is the index corresponding to the largest element of the predicted label vector l_x.
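The step-9 classifier can be sketched as follows, assuming H is the C×n one-hot label matrix of the training samples and using a simple OMP routine to code the test image; all names and defaults are illustrative.

```python
import numpy as np

def omp(D, y, t0):
    """Simple orthogonal matching pursuit for the test coding vector."""
    x = np.zeros(D.shape[1])
    residual, idx = y.copy(), []
    coef = np.zeros(0)
    for _ in range(t0):
        idx.append(int(np.argmax(np.abs(D.T @ residual))))
        sub = D[:, idx]
        coef, *_ = np.linalg.lstsq(sub, y, rcond=None)
        residual = y - sub @ coef
    x[idx] = coef
    return x

def train_classifier(X, H):
    """G_x = H X^T (X X^T + I)^(-1), equation (11); X is K x n, H is C x n."""
    K = X.shape[0]
    return H @ X.T @ np.linalg.inv(X @ X.T + np.eye(K))

def predict(Gx, D, y_test, t0):
    """Code the test image over D, then take the argmax of l_x = G_x x_hat."""
    x_hat = omp(D, y_test, t0)
    return int(np.argmax(Gx @ x_hat))
```

The returned index is the predicted class (e.g. polyp, ulcer or normal) for the test image.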
CN202010316171.0A 2020-04-21 2020-04-21 Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning Pending CN111667453A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010316171.0A CN111667453A (en) 2020-04-21 2020-04-21 Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning


Publications (1)

Publication Number Publication Date
CN111667453A true CN111667453A (en) 2020-09-15

Family

ID=72382669


Country Status (1)

Country Link
CN (1) CN111667453A (en)


Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104166860A (en) * 2014-07-25 2014-11-26 哈尔滨工业大学深圳研究生院 Constraint-based face identification method for single test sample
CN109376802A (en) * 2018-12-12 2019-02-22 浙江工业大学 A kind of gastroscope organ classes method dictionary-based learning
CN110648276A (en) * 2019-09-25 2020-01-03 重庆大学 High-dimensional image data dimension reduction method based on manifold mapping and dictionary learning


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LI ZHENGMING: "Research on Dictionary Learning Algorithms Based on Robust Discriminative Constraints", China Doctoral Dissertations Full-text Database, Information Science and Technology *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112487231A (en) * 2020-12-17 2021-03-12 中国矿业大学(北京) Automatic image labeling method based on double-image regularization constraint and dictionary learning
CN112541470A (en) * 2020-12-22 2021-03-23 杭州趣链科技有限公司 Hypergraph-based face living body detection method and device and related equipment
CN113793319A (en) * 2021-09-13 2021-12-14 浙江理工大学 Fabric image flaw detection method and system based on class constraint dictionary learning model
CN113793319B (en) * 2021-09-13 2023-08-25 浙江理工大学 Fabric image flaw detection method and system based on category constraint dictionary learning model
CN113870240A (en) * 2021-10-12 2021-12-31 大连理工大学 Safety valve cavitation phenomenon discrimination method based on image significance detection
CN113870240B (en) * 2021-10-12 2024-04-16 大连理工大学 Safety valve cavitation phenomenon judging method based on image significance detection
CN113989284A (en) * 2021-12-29 2022-01-28 广州思德医疗科技有限公司 Helicobacter pylori assists detecting system and detection device
CN114428873A (en) * 2022-04-07 2022-05-03 源利腾达(西安)科技有限公司 Thoracic surgery examination data sorting method
CN114428873B (en) * 2022-04-07 2022-06-28 源利腾达(西安)科技有限公司 Thoracic surgery examination data sorting method
CN116226778A (en) * 2023-05-09 2023-06-06 水利部珠江水利委员会珠江水利综合技术中心 Retaining wall structure anomaly analysis method and system based on three-dimensional analysis platform
CN116226778B (en) * 2023-05-09 2023-07-07 水利部珠江水利委员会珠江水利综合技术中心 Retaining wall structure anomaly analysis method and system based on three-dimensional analysis platform

Similar Documents

Publication Publication Date Title
CN111667453A (en) Gastrointestinal endoscope image anomaly detection method based on local feature and class mark embedded constraint dictionary learning
Yu et al. A deep convolutional neural network-based framework for automatic fetal facial standard plane recognition
EP3046478B1 (en) Image analysis techniques for diagnosing diseases
CN109544526B (en) Image recognition system, device and method for chronic atrophic gastritis
CN113379693B (en) Capsule endoscope key focus image detection method based on video abstraction technology
JP2022502150A (en) Devices and methods for diagnosing gastric lesions using deep learning of gastroscopy images
Ye et al. Online tracking and retargeting with applications to optical biopsy in gastrointestinal endoscopic examinations
CN109635871B (en) Capsule endoscope image classification method based on multi-feature fusion
CN107851194A (en) Visual representation study for brain tumor classification
JP7333132B1 (en) Multimodal medical data fusion system based on multiview subspace clustering
WO2023207820A1 (en) Pancreatic postoperative diabetes prediction system based on supervised deep subspace learning
CN115019405A (en) Multi-modal fusion-based tumor classification method and system
CN105069131A (en) Capsule endoscopy image retrieval method based on visual vocabularies and local descriptors
Shariaty et al. Radiomics: extracting more features using endoscopic imaging
CN109376802B (en) Gastroscope organ classification method based on dictionary learning
Zimmer et al. Learning and combining image neighborhoods using random forests for neonatal brain disease classification
Van Der Sommen et al. Computer-aided detection of early cancer in the esophagus using HD endoscopy images
CN111383213B (en) Mammary gland image retrieval method for multi-view discrimination metric learning
Sadeghi et al. A Novel Sep-Unet architecture of convolutional neural networks to improve dermoscopic image segmentation by training parameters reduction
Atasoy et al. Endoscopic video manifolds
CN110334582B (en) Method for intelligently identifying and recording polyp removing video of endoscopic submucosal dissection
Sharmila Joseph et al. Multiclass gastrointestinal diseases classification based on hybrid features and duo feature selection
Shen et al. Locality-constrained dictionary learning classification method of wce images
Sharmila et al. Detection and classification of GI-tract anomalies from endoscopic images using deep learning
Fathima et al. Deep Learning and Machine Learning Approaches for Brain Tumor Detection and Classification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200915)