CN115496950A - Image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning

Info

Publication number: CN115496950A
Application number: CN202211217899.3A
Authority: CN
Inventors: 周国华; 卢剑伟; 申燕萍; 陆兵
Original assignee: Changzhou Vocational Institute of Light Industry
Current assignee: Changzhou Vocational Institute of Light Industry
Priority date: 2022-09-30
Filing date: 2022-09-30
Publication date: 2022-12-20

Abstract

The invention relates to the technical field of image processing, and in particular to an image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning (SDDPL-NIE). The method comprises: generating an initial synthesis dictionary and analysis dictionary pair, and solving the sparse codes from the dictionary pair; constructing an inter-class graph, an intra-class graph and weight matrices, and calculating the weight matrices; obtaining the objective function of the SDDPL-NIE model; and iteratively updating the synthesis dictionary, the analysis dictionary and the sparse codes until the objective function converges or the maximum number of iterations is reached. Building on the dictionary pair learning model, the synthesis dictionary and the analysis dictionary are trained jointly on labeled and unlabeled samples; a reconstruction error term based on the sparse $l_{2,1}$ norm is minimized, and the $l_{2,1}$ norm is used to constrain the structured analysis dictionary, yielding a robust dictionary pair; meanwhile, a local boundary term is built from the neighborhood structure of the sparse codes to ensure intra-class compactness and inter-class separation between neighboring sparse codes.

Description

Image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning

Technical Field

The invention relates to the technical field of image processing, and in particular to an image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning.

Background

With the development of remote sensing imaging technology, the quantity and quality of high-resolution remote sensing images continue to improve, and the automatic, accurate recognition of these images has attracted considerable interest in recent years. Remote sensing scene classification, a hot topic, assigns a land-cover type to a scene image according to its content. It has important applications, including natural disaster detection, land resource utilization and land-cover management, geospatial object detection, geographic image retrieval, vegetation mapping, environmental monitoring and urban planning.

In many practical applications, collecting a sufficient number of labeled samples is time-consuming and expensive, whereas unlabeled samples are much easier to obtain in large numbers.

Semi-supervised classification methods are widely used in machine learning; they aim to obtain reliable classification results from a limited number of labeled samples together with unlabeled samples. Among the many semi-supervised learning methods, graph-based embedding methods are used in many fields, and their main advantage is that they can identify classes with arbitrary distributions. Graph embedding methods treat labeled and unlabeled samples as the vertices of a graph and propagate label information along its edges, so that both the smoothness requirement between samples near the classification boundary and the separation requirement between differently labeled samples can be met. Nevertheless, semi-supervised remote sensing image classification with insufficient training data and high annotation cost remains a problem in urgent need of a solution.

Disclosure of Invention

To address the shortcomings of existing algorithms, the invention provides the SDDPL-NIE method, which, building on the dictionary pair learning model, uses labeled and unlabeled samples to jointly train a synthesis dictionary and analysis dictionary pair. A robust dictionary pair is obtained by minimizing a reconstruction error term based on the sparse $l_{2,1}$ norm and by constraining the structured analysis dictionary with the $l_{2,1}$ norm. Meanwhile, a local boundary term is built from the neighborhood structure of the sparse codes to ensure intra-class compactness and inter-class separation between neighboring sparse codes.

The technical scheme adopted by the invention is as follows: the image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning comprises the following steps:

Step one, constructing labeled and unlabeled samples, generating an initial synthesis dictionary and analysis dictionary pair, and solving the sparse codes from the synthesis dictionary and the analysis dictionary;

further, step one specifically comprises: the structured analysis dictionary $P = [P_1; P_2; \ldots; P_C] \in R^{K \times d}$ projects $Z = [Z_l, Z_u]$ onto its corresponding sparse code $A$ via $A = PZ$, and the structured synthesis dictionary $D = [D_1, D_2, \ldots, D_C] \in R^{d \times K}$ reconstructs the input samples, where $Z$ is the training sample set, $Z \approx DA$, $C$ is the number of classes, and $D_k$ and $P_k$ form the sub-dictionary pair associated with class $k$.

Step two, constructing an inter-class graph, an intra-class graph and weight matrices, and calculating the weight matrices; obtaining the objective function of the SDDPL-NIE model;

further, the weight matrices are calculated as follows:

according to the inter-class graph $G_b$ and the intra-class graph $G_w$, two weight matrices $S^w$ and $S^b$ are defined, belonging to $G_w$ and $G_b$ respectively:

where $n_k$ denotes the number of class-$k$ samples among the $n_l$ labeled samples,

further, the objective function of the SDDPL-NIE model is as follows:

where $\alpha$ and $\beta$ are weight coefficients, $[Z_l, Z_u]$ is the set of labeled and unlabeled samples, $D$ is the synthesis dictionary, $P$ is the analysis dictionary, $P_k$ is the class-$k$ sub-dictionary of the analysis dictionary $P$, $D_k$ is the class-$k$ sub-dictionary of the synthesis dictionary $D$, $Z_k$ is the set of class-$k$ training samples, and $\Lambda_k$ is a diagonal matrix

whose diagonal elements are

defined from the $i$-th row of the matrix $P_k$.

Step three, updating the synthesis dictionary, the analysis dictionary and the sparse codes by iterative optimization, which ends when the objective function of the SDDPL-NIE model converges or the maximum number of iterations is reached;

further, updating the analysis dictionary comprises: fixing the synthesis dictionary $D$ and the sparse code $A$ and solving for the analysis dictionary $P$; taking the first-order derivative with respect to $P_k$ gives:

where $A_k$ is the sub-matrix of $A$ corresponding to $P_k$, and $U_k$ is a diagonal matrix whose diagonal elements are $U_{k,ii} = 1/\|(A_k - P_k Z_k)_i\|_2$, with $(A_k - P_k Z_k)_i$ the $i$-th row vector of the matrix $A_k - P_k Z_k$.

Further, updating the synthesis dictionary comprises: fixing the analysis dictionary $P$ and the sparse code $A$ and solving for the synthesis dictionary $D$; taking the first-order derivative with respect to $D_k$ gives the expression for $D_k$:

where $\tau$ is a positive number and $I$ is an identity matrix.

Further, updating the sparse codes comprises: fixing the analysis dictionary $P$ and the synthesis dictionary $D$ and solving for the sparse code $A$; taking the first-order derivative with respect to $A_u$, the sparse code of $Z_u$, gives:

solving for $A_k$ gives:

and taking the first-order derivative with respect to $A_k$ gives:

The invention has the following beneficial effects:

1. Experimental results on the RSSCN7 and UC Merced datasets show that the dictionary pair obtained by the method is highly discriminative and can be applied effectively to semi-supervised remote sensing image classification;

2. The analysis and synthesis dictionaries are trained on labeled and unlabeled data, and a discriminative dictionary pair is obtained by minimizing the reconstruction error term based on the sparse $l_{2,1}$ norm together with the local boundary term based on sparse coding.

Drawings

FIG. 1 is a flow chart of the image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning of the present invention;

FIG. 2 shows example images from the remote sensing datasets used by the present invention: (a) RSSCN7, (b) UC Merced Land Use;

FIG. 3 shows the average classification accuracy for different numbers of dictionary atoms according to the present invention;

FIG. 4 shows the average classification accuracy for different α values according to the present invention;

FIG. 5 shows the average classification accuracy for different β values according to the present invention.

Detailed Description

The invention is further described below with reference to the accompanying drawings and embodiments. The drawings are simplified schematics that illustrate only the basic structure of the invention, and therefore show only the elements relevant to the invention.

As shown in FIG. 1, the image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning comprises the following steps:

Dictionary pair learning

For the training sample set $Z$, the Dictionary Pair Learning (DPL) model jointly learns a synthesis dictionary and an analysis dictionary. The structured analysis dictionary $P = [P_1; P_2; \ldots; P_C] \in R^{K \times d}$ projects $Z$ onto its corresponding sparse code $A$ via $A = PZ$, and the structured synthesis dictionary $D = [D_1, D_2, \ldots, D_C] \in R^{d \times K}$ reconstructs the input samples, i.e. $Z \approx DA$; $C$ is the number of classes, and $D_k$ and $P_k$ form the sub-dictionary pair associated with class $k$;

The objective function of the DPL model is:

$$\min_{P,D} \sum_{k=1}^{C} \|Z_k - D_k P_k Z_k\|_F^2 + \lambda \|P_k \bar{Z}_k\|_F^2, \quad \text{s.t. } \|d_i\|_2^2 \le 1 \qquad (1)$$

where $Z_k$ denotes the class-$k$ training samples, $\bar{Z}_k$ is the complement of $Z_k$ in the training set $Z$, $d_i$ denotes the $i$-th column of $D$, and $\lambda \ge 0$ is a preset weight;

minimizing $\|P_k \bar{Z}_k\|_F^2$ gives the resulting sparse code $PZ$ a block-diagonal form, and the constraint $\|d_i\|_2^2 \le 1$ avoids the trivial solution $P_k = 0$, ensuring a stable dictionary pair.

For an arbitrary test sample $z_{new}$, the DPL model determines its class label $y(z_{new})$ by solving the following problem:

$$y(z_{new}) = \arg\min_k \|z_{new} - D_k P_k z_{new}\|_2 \qquad (2)$$

As equation (2) shows, the DPL model determines the class of a test sample by minimizing the error between $z_{new}$ and its reconstruction $D_k P_k z_{new}$, which is formed from the class-$k$ synthesis dictionary $D_k$, the class-$k$ analysis dictionary $P_k$ and $z_{new}$.

Classification with the DPL model does not require solving for sparse codes, which improves the efficiency of the classification task; however, the DPL model uses only labeled training data, so its classification performance cannot be guaranteed to be optimal when label information is insufficient.
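The decision rule in equation (2) can be sketched in a few lines of NumPy. This is a minimal illustration rather than the patent's implementation: the function name `dpl_classify`, the list-of-blocks layout and the toy dictionaries are all assumptions introduced here.

```python
import numpy as np

def dpl_classify(z_new, D_blocks, P_blocks):
    """Return the class k minimising ||z_new - D_k P_k z_new||_2, plus all residuals."""
    residuals = [np.linalg.norm(z_new - D_k @ (P_k @ z_new))
                 for D_k, P_k in zip(D_blocks, P_blocks)]
    return int(np.argmin(residuals)), residuals

# Hypothetical toy data: two classes spanning orthogonal 2-D subspaces of R^4.
D_blocks = [np.eye(4)[:, :2], np.eye(4)[:, 2:]]   # each D_k is d x m (4 x 2)
P_blocks = [D_k.T for D_k in D_blocks]            # each P_k is m x d (2 x 4)

z_new = np.array([1.0, 2.0, 0.1, 0.0])
label, residuals = dpl_classify(z_new, D_blocks, P_blocks)
print(label, residuals)
```

Only matrix-vector products are needed at test time, which is why no sparse coding problem has to be solved during classification.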

Neighborhood information embedded semi-supervised discriminative dictionary pair learning method

Objective function

Let $Z = [z_1, z_2, \ldots, z_n] \in R^{d \times n}$ denote the training sample set, where each column is a data sample, $n$ is the total number of samples and $d$ is the dimensionality of a sample. Without loss of generality, assume that $Z_l$ and $Z_u$ are the labeled and unlabeled sample sets respectively ($Z = [Z_l, Z_u]$). The objective function in equation (1) uses only labeled training data and ignores the discriminative information contained in the unlabeled samples; as a supervised dictionary learning method, DPL is often limited by the amount of labeled training data available. To adapt the DPL model to semi-supervised learning with little labeled training data, the following improvement is proposed,

where $\alpha$ is a weight coefficient, and the class-specific sub-dictionary pairs are learned on the labeled samples $Z_l$.

In the semi-supervised discriminative dictionary pair learning with neighborhood information embedding (SDDPL-NIE) method, the structured analysis sub-dictionary $P_k$ projects sparse codes that do not belong to class $k$ onto a nearly null space, and a robust dictionary pair is obtained by minimizing the reconstruction error term based on the sparse $l_{2,1}$ norm. Constraining the structured analysis dictionary with the robust and easily optimized sparse $l_{2,1}$ norm forces the constrained matrix to be row sparse; compared with the conventional $l_0$ and $l_1$ norms, this term improves the training efficiency of the model. By the definition of the $l_{2,1}$ norm,

where $\Lambda_k$ is a diagonal matrix whose diagonal elements are:

defined from the $i$-th row vector of $P_k$.

Thus, equation (3) can be rewritten as:

The analysis dictionary and the synthesis dictionary are learned alternately on the labeled data, finally yielding a discriminative dictionary pair.

To obtain discriminative sparse codes, the invention follows the idea of the inter-class margin and builds a local boundary term from the neighborhood structure of the sparse codes. For each sparse code $Pz_i$, the distances between $Pz_i$ and the sparse codes of samples from different classes are considered to build the inter-class graph $G_b$, in which each sample $Pz_i$ has directed edges to the samples whose labels differ from its own. Likewise, the distances between $Pz_i$ and the sparse codes of samples from the same class are considered to build the intra-class graph $G_w$, in which each sample $Pz_i$ has undirected edges to the samples sharing its label. According to the inter-class graph $G_b$ and the intra-class graph $G_w$, two weight matrices $S^w$ and $S^b$ are defined, belonging to $G_w$ and $G_b$ respectively,

where $n_k$ denotes the number of class-$k$ samples among the $n_l$ labeled samples.

Notably, each column of $S^w$ and $S^b$ sums to 1, and the matrix $S^w$ is symmetric.
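The formulas for $S^w$ and $S^b$ do not survive in this text, so the following sketch is only an illustration of weights that satisfy the stated properties (columns summing to 1, $S^w$ symmetric, $S^b$ built from directed edges). The specific weights $1/n_k$ and $1/(n_l - n_k)$ are an assumption made here, not the patent's definition, and the function name `build_weight_matrices` is hypothetical.

```python
import numpy as np

def build_weight_matrices(labels):
    """Illustrative intra-class (S_w) and inter-class (S_b) weights.

    ASSUMPTION: S_w[i, j] = 1/n_k when samples i and j share class k,
    and S_b[i, j] = 1/(n_l - n_k) when sample i lies outside the class k
    of sample j. Requires at least two classes.
    """
    labels = np.asarray(labels)
    n_l = labels.size
    same = labels[:, None] == labels[None, :]   # same[i, j]: i and j share a label
    n_k = same.sum(axis=0)                      # n_k[j]: size of the class of sample j
    S_w = same / n_k[None, :]
    S_b = (~same) / (n_l - n_k)[None, :]
    return S_w, S_b

labels = [0, 0, 0, 1, 1, 2]                     # hypothetical labels of six labeled samples
S_w, S_b = build_weight_matrices(labels)
print(S_w.sum(axis=0), S_b.sum(axis=0))
```

With these assumed weights $S^b$ is generally not symmetric, which is consistent with the directed edges of $G_b$, while $S^w$ is symmetric as the text requires.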

After projection by the analysis dictionary, the boundary of the sparse code $P_k z_{k,i}$ can be expressed as:

Define $G^b$ as a diagonal matrix whose diagonal elements are the column sums of $S^b$, i.e.

The mean boundary of the sparse codes of all labeled samples $Z_k$ is expressed as:

By taking the local neighborhood information among the sparse codes into account, the distance between neighboring sparse codes of the same class is kept small and the distance between neighboring sparse codes of different classes is kept large. This ensures intra-class compactness and inter-class separation between neighboring sparse codes and thereby improves the discriminability of the sparse codes.

Combining equation (5) and equation (8), the objective function of the SDDPL-NIE model is obtained:

where $\beta$ is a weight coefficient.

Optimization solution

The invention uses an iterative optimization method in which one variable is solved while the others are held fixed. To simplify the solution, an auxiliary sparse code variable $A$ is introduced, so that

Formula (10) is rewritten as:

1) Fixing $D$ and $A$ and solving for $P$ gives:

By the definition of the $l_{2,1}$ norm,

$U_k$ is a diagonal matrix whose diagonal elements are $U_{k,ii} = 1/\|(A_k - P_k Z_k)_i\|_2$, where $(A_k - P_k Z_k)_i$ is the $i$-th row vector of the matrix $A_k - P_k Z_k$; therefore, equation (11) can be rewritten as:

Taking the first-order derivative with respect to $P_k$ gives:

where $A_k$ is the sub-matrix of $A$ corresponding to $P_k$;
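The role of the diagonal matrix $U_k$ can be checked numerically: with $U_{k,ii} = 1/\|R_i\|_2$ for $R = A_k - P_k Z_k$, the quadratic form $\mathrm{tr}(R^T U_k R)$ equals the $l_{2,1}$ norm of $R$, which is what allows the non-smooth norm to be handled by first-order derivatives. The sketch below uses random matrices; the helper names and the small `eps` guard against zero rows are assumptions added for illustration.

```python
import numpy as np

def l21_norm(M):
    """l_{2,1} norm: sum of the Euclidean norms of the rows of M."""
    return float(np.linalg.norm(M, axis=1).sum())

def reweighting_matrix(R, eps=1e-12):
    """Diagonal U with U_ii = 1 / ||R_i||_2; eps (an assumption) guards zero rows."""
    row_norms = np.linalg.norm(R, axis=1)
    return np.diag(1.0 / np.maximum(row_norms, eps))

rng = np.random.default_rng(0)
A_k = rng.standard_normal((5, 8))   # hypothetical sparse codes, m x n_k
P_k = rng.standard_normal((5, 6))   # hypothetical analysis sub-dictionary, m x d
Z_k = rng.standard_normal((6, 8))   # hypothetical class-k samples, d x n_k

R = A_k - P_k @ Z_k
U_k = reweighting_matrix(R)
quadratic_form = float(np.trace(R.T @ U_k @ R))
print(quadratic_form, l21_norm(R))  # the two values agree
```

In an alternating scheme of this kind, $U_k$ is typically recomputed from the current residual at each iteration and then treated as a constant while the derivative is taken.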

2) Fixing $P$ and $A$ and solving for $D$ gives:

The objective with respect to the sub-dictionary $D_k$ is expressed as:

where $A^j$ is the sparse code of $Z$ over the sub-dictionary $D_j$. By the definition of the $l_{2,1}$ norm, the weighting matrix is a diagonal matrix whose diagonal elements are defined from the $i$-th row of the associated matrix.

Taking the first-order derivative with respect to $D_k$ gives the expression for $D_k$:

where $\tau$ is a positive number and $I$ is an identity matrix;

3) Fixing $P$ and $D$ and solving for $A$ gives:

Let the sparse code corresponding to $Z_u$ be $A_u$; solving for $A_u$ gives

Taking the first-order derivative with respect to $A_u$ gives:

Solving for $A_k$ gives

Taking the first-order derivative with respect to $A_k$ gives

After the optimal solutions for the parameters $D$, $P$ and $A$ are obtained, a test sample $z_{new}$ can be classified with the DPL model using equation (2). The training process of the SDDPL-NIE method is given below:

Experiments

Experimental setup

The RSSCN7 dataset contains 2800 remote sensing scene images from 7 scene categories: grassland, forest, farmland, parking lot, residential area, industrial area, and river/lake. Each category has 400 images of 400 × 400 pixels. The UC Merced dataset comprises 21 scene categories, each containing 100 images of 256 × 256 pixels. FIG. 2 shows example images from the RSSCN7 and UC Merced datasets. In the experiments, remote sensing image features are extracted with the VGG-VD-16 and CaffeNet deep convolutional neural network models; the VGG-VD-16 network consists of 16 layers and CaffeNet consists of 13 layers, and the features extracted by both deep models are 2048-dimensional.

For performance comparison, the following semi-supervised classification methods are compared experimentally: LapSVM, SS-TCA, FSEL, KFME, SFFE and JEFW. LapSVM is a Laplacian support vector machine classifier that learns the geometry of the samples from labeled and unlabeled examples using manifold regularization. SS-TCA is a pseudo-label-based semi-supervised learning method that dynamically improves classifier training through pseudo labels. FSEL is a graph-based semi-supervised embedding algorithm. KFME is a graph-based kernel semi-supervised embedding algorithm. JEFW is a graph-based joint embedding and feature weighting method that obtains a non-linear data representation of the samples on a manifold. SFFE is a semi-supervised embedding method based on adaptive loss regression.

All kernel methods use Gaussian kernels in the experiments, with the kernel parameter selected from $\{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$. The regularization parameter of LapSVM is selected from $\{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$, and for SS-TCA $\mu \in \{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$. FSEL classifies with a nearest neighbor classifier, and the dimension of its embedding space is determined by the number of training samples. For SFFE, $\gamma \in \{10^{-2}, 0.1, 0.5, 1, 1000\}$, $\lambda \in \{1, 10, 50, 100, 1000\}$ and $\sigma \in \{10^{-2}, 0.1, 1, 5, 10\}$. The SDDPL-NIE model of the invention has two parameters to tune, $\alpha$ and $\beta$, each selected from $\{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$.

In the semi-supervised remote sensing scene classification experiments, following the settings in the literature (a benchmark data set for performance evaluation of aerial scene classification), 10%, 15%, 20% and 25% of the images of each class are randomly selected as labeled training samples, and the remaining images in the dataset are used as unlabeled training samples. Parameter values are determined by ten-fold cross validation, and, to reduce the effect of randomness, each experiment is run 20 times and the average classification accuracy is used as the evaluation criterion.
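The sampling protocol described above (a fixed fraction of each class labeled at random, repeated over 20 runs, with parameters drawn from a logarithmic grid) can be sketched as follows. The function name `split_labeled_unlabeled`, the seed and the use of synthetic labels in place of the real image features are assumptions for illustration only.

```python
import numpy as np

def split_labeled_unlabeled(labels, labeled_fraction, rng):
    """Randomly mark a fixed fraction of every class as labeled; the rest is unlabeled."""
    labels = np.asarray(labels)
    labeled_idx = []
    for k in np.unique(labels):
        class_idx = np.flatnonzero(labels == k)
        n_labeled = max(1, int(round(labeled_fraction * class_idx.size)))
        labeled_idx.extend(rng.choice(class_idx, size=n_labeled, replace=False))
    labeled_idx = np.sort(np.array(labeled_idx))
    unlabeled_idx = np.setdiff1d(np.arange(labels.size), labeled_idx)
    return labeled_idx, unlabeled_idx

# Candidate values for alpha and beta: {10^-3, 10^-2, ..., 10^3}.
param_grid = [10.0 ** e for e in range(-3, 4)]

# Synthetic labels with the RSSCN7 layout: 7 classes, 400 images each.
labels = np.repeat(np.arange(7), 400)
rng = np.random.default_rng(0)

# 20 independent random splits at the 20% labeling rate.
splits = [split_labeled_unlabeled(labels, 0.20, rng) for _ in range(20)]
print(len(splits[0][0]), len(splits[0][1]))
```

Each split would feed one training run, and the accuracies of the 20 runs would then be averaged.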

Classification Performance comparison

Under the same conditions, the SDDPL-NIE model of the invention achieves higher classification accuracy than the other methods, and its advantage is most pronounced when the training set contains few class labels. On the one hand, in remote sensing scene classification the number of labeled samples in the training set directly affects the final classification result, so a semi-supervised approach is necessary when labeled data are limited; graph embedding methods, label propagation methods and manifold structure embedding methods can all improve the classification results. On the other hand, the SDDPL-NIE model, which is based on dictionary pair learning, obtains the best results, showing that the constructed model is suitable for semi-supervised classification scenarios. The SDDPL-NIE model learns the structured analysis dictionary and synthesis dictionary pair on the labeled data, and obtains a highly discriminative dictionary pair by minimizing the reconstruction error term based on the sparse $l_{2,1}$ norm and the analysis dictionary constraint term based on the sparse $l_{2,1}$ norm. Meanwhile, following the idea of the inter-class margin, a local boundary term is built from the neighborhood structure of the sparse codes, so that the class information of the labeled images is well exploited in model training. In addition, the iterative parameter learning strategy ensures that an optimal solution is obtained for all parameters.

The experiments use the VGG-VD-16 and CaffeNet features of the remote sensing scene images. As the results in Tables 1-2 show, comparable classification results are obtained with both kinds of features, which indicates that deep features are suitable for remote sensing scene classification. Traditional hand-crafted features extract only shallow image features from color, texture, spatial and spectral information; given the complex content and small inter-class differences of remote sensing scene images, such features are not well suited to remote sensing scene classification.

Table 1. Comparison of the classification results of the methods on the RSSCN7 dataset

Parameter sensitivity analysis

The number of atoms in each sub-dictionary directly affects the performance of a dictionary learning model. In the experiments, the number of atoms per sub-dictionary of the SDDPL-NIE model is selected from {20, 30, 40}. FIG. 3 shows the average classification accuracy for different numbers of atoms when 20% of the training samples are labeled. The experimental results in Table 2 show that the SDDPL-NIE model reaches satisfactory classification accuracy with only small sub-dictionaries, and that the accuracy changes only slightly as the number of atoms varies. This is because the SDDPL-NIE model fully considers the correlation among dictionary atoms while learning the dictionary pair, and the local boundary term on the sparse codes ensures that a discriminative dictionary is obtained.

The SDDPL-NIE model includes two parameters to be optimized, $\alpha$ and $\beta$, each selected from $\{10^{-3}, 10^{-2}, \ldots, 10^{3}\}$. FIG. 4 shows the average classification accuracy for different $\alpha$ values when 20% of the training samples are labeled. The results in FIG. 4 show that the model is relatively stable when $\alpha < 1$, and that the average classification accuracy drops sharply when $\alpha > 10$. The parameter $\alpha$ weights the dictionary pair learning term over the entire dataset; when $\alpha$ is too large, this term dominates the objective function and the discriminative information of the labeled data cannot be used effectively. FIG. 5 shows the average classification accuracy for different $\beta$ values when 20% of the training samples are labeled. The parameter $\beta$ weights the local boundary term of the sparse codes and plays an important role in the model. The results in FIG. 5 show that the SDDPL-NIE model has low sensitivity to $\beta$: performance is stable across different $\beta$ values, and the difference between the maximum and minimum average classification accuracy is small. Therefore, the value of $\beta$ can be fixed in practical applications.

In light of the foregoing description of the preferred embodiment of the present invention, many modifications and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The technical scope of the present invention is not limited to the content of the specification, and must be determined according to the scope of the claims.

Claims

1. An image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning, characterized by comprising the following steps:

step one, constructing labeled samples $Z_l$ and unlabeled samples $Z_u$, generating an initial synthesis dictionary and analysis dictionary pair, and solving the sparse code $A$ from the synthesis dictionary $D$ and the analysis dictionary $P$;

step three, performing iterative optimization to update the synthesis dictionary, the analysis dictionary and the sparse codes, and ending the iterative optimization when the objective function of the SDDPL-NIE model converges or the maximum number of iterations is reached.

2. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, wherein step one specifically comprises:

the structured analysis dictionary $P = [P_1; P_2; \ldots; P_C] \in R^{K \times d}$ projects $Z = [Z_l, Z_u]$ onto its corresponding sparse code $A$ via $A = PZ$, and the structured synthesis dictionary $D = [D_1, D_2, \ldots, D_C] \in R^{d \times K}$ reconstructs the input samples, where $Z$ is the training sample set, $Z \approx DA$, $C$ is the number of classes, and $D_k$ and $P_k$ form the sub-dictionary pair associated with class $k$.

3. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, characterized in that the weight matrices are calculated as follows:

the weight matrices $S^w$ and $S^b$ belong to $G_w$ and $G_b$ respectively:

where $n_k$ denotes the number of class-$k$ samples among the $n_l$ labeled samples,

4. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, characterized in that the objective function of the SDDPL-NIE model is as follows:

in which the weighting matrix is diagonal, with diagonal elements

defined from the $i$-th row of the associated matrix.

5. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, wherein updating the analysis dictionary comprises: fixing the synthesis dictionary $D$ and the sparse code $A$ and solving for the analysis dictionary $P$; taking the first-order derivative with respect to $P_k$ gives:

where $A_k$ is the sub-matrix of $A$ corresponding to $P_k$, and $U_k$ is a diagonal matrix whose diagonal elements are $U_{k,ii} = 1/\|(A_k - P_k Z_k)_i\|_2$, with $(A_k - P_k Z_k)_i$ the $i$-th row vector of the matrix $A_k - P_k Z_k$.

6. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, wherein updating the synthesis dictionary comprises: fixing the analysis dictionary $P$ and the sparse code $A$ and solving for the synthesis dictionary $D$; taking the first-order derivative with respect to $D_k$ gives the expression for $D_k$:

where $\tau$ is a positive number and $I$ is an identity matrix.

7. The image classification method based on neighborhood information embedded semi-supervised discriminative dictionary pair learning according to claim 1, wherein updating the sparse codes comprises: fixing the analysis dictionary $P$ and the synthesis dictionary $D$ and solving for the sparse code $A$; letting the sparse code corresponding to $Z_u$ be $A_u$, solving for $A_u$ gives

taking the first-order derivative with respect to $A_u$ gives

solving for $A_k$ gives:

and taking the first-order derivative with respect to $A_k$ gives: