CN115728247B - Spectrum measurement quality judging method based on machine learning - Google Patents

Spectrum measurement quality judging method based on machine learning

Info

Publication number
CN115728247B
CN115728247B
Authority
CN
China
Prior art keywords
spectrum
training set
data
cluster
distance
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211292083.7A
Other languages
Chinese (zh)
Other versions
CN115728247A (en)
Inventor
郭春付
吴华希
石雅婷
李伟奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Eoptics Technology Co ltd
Original Assignee
Wuhan Eoptics Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Eoptics Technology Co ltd filed Critical Wuhan Eoptics Technology Co ltd
Priority to CN202211292083.7A priority Critical patent/CN115728247B/en
Publication of CN115728247A publication Critical patent/CN115728247A/en
Application granted granted Critical
Publication of CN115728247B publication Critical patent/CN115728247B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Image Analysis (AREA)
  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

The invention relates to a machine learning-based spectrum measurement quality judging method, which comprises the following steps: performing dimension reduction on the characteristic data of the spectrum to obtain dimension-reduced spectral feature data; generating a training set containing a plurality of dimension-reduced spectral feature data, and training a clustering model on the data points in the training set to obtain the clustering model and the center points of the clusters it contains; performing dimension reduction on the characteristic data of the spectrum to be measured, and classifying the spectrum to be measured based on the clustering model to obtain a classification result; and judging the quality of the spectrum to be measured based on the classification result. The method applies a traditional machine learning model to the field of optical precision measurement: by processing the data, the similarity between an unknown spectrum and the known spectra can be judged without an optical model, the influence of abnormal spectra caused by structural parameters and noise is effectively eliminated, and the quality of the optical measurement can be analyzed.

Description

Spectrum measurement quality judging method based on machine learning
Technical Field
The invention relates to the field of optical and computer combination, in particular to a machine learning-based spectrum measurement quality judging method.
Background
The basic principle of the optical scatterometry method, also known as the OCD (optical critical dimension) measurement method, can be summarized as follows: a beam of polarized light with a specific polarization state is projected onto the surface of a sample to be measured, the change of its polarization state before and after reflection is obtained by measuring the diffracted light from the sample, and the structural parameters of the sample are then extracted from this change, such as the thickness of a film obtained in a coating process such as chemical vapor deposition, or the line width, line height and sidewall angle of a nano-grating obtained in processes such as photolithography and etching.
Compared with microscopic morphology measurement means such as the scanning electron microscope and the atomic force microscope, the optical scatterometry technique has the advantages of high speed, low cost, non-contact and non-destructive operation, and is therefore widely applied to in-line monitoring of existing processes. However, measurement means such as the scanning electron microscope and the atomic force microscope directly obtain the microscopic morphology and structural parameters of the sample to be measured and are therefore "what you see is what you get" measurements; in contrast, the optical scatterometry technique only obtains a group of light intensity signals related to the incident wavelength or incident angle distribution, or derived signals such as reflectivity, ellipsometry parameters and Mueller matrices, and a data analysis step is needed to extract the structural parameters of the sample from the measured signals. The main methods are as follows: ① establishing a corresponding physical model for the structure to be measured using prior knowledge (such as the shape of the structure and the refractive index of the materials used), and adjusting the parameters in the physical model by nonlinear fitting so as to minimize the deviation between the corresponding theoretical spectrum and the measured spectrum of the sample; ② directly predicting the structural parameters to be measured from the actually measured spectral data through a neural network or a machine learning method. The first class of methods needs to repeatedly solve the physical model in the parameter fitting process, and when the structure to be measured is complex, its computational efficiency can hardly meet actual measurement requirements. The second class of methods can directly realize the mapping from the measured spectrum to the parameters to be measured without solving a physical model, and therefore has broader application prospects as semiconductor structures become increasingly complex.
In the second class of methods, because the modeling is based entirely on mathematical models and the number of samples is typically small, the quality of the training spectra and the test spectra must be guaranteed to some extent. However, in the practical application of semiconductor measurement, spectrum quality is affected by many interference factors, such as ① measured spectra affected by noise, ② large differences in the structural parameters of the measured structure, and ③ defect points in the measured sample, which makes it difficult to fit a correct measurement result and thereby affects the training quality and the test results of the mapping model. Because the physical modeling process is omitted to improve computational efficiency, it is difficult to analyze spectral quality through a physical model, which increases the instability of the model. A low-quality spectrum may even be impossible to fit, so that its evaluation index can never meet the requirement.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention provides a spectrum measurement quality judging method based on machine learning, which applies a traditional machine learning model to the field of optical precision measurement, can judge the similarity between an unknown spectrum and known spectra without an optical model by processing the data, effectively eliminates the influence of abnormal spectra caused by spectral structure parameters and noise, and can analyze the spectral measurement quality.
According to a first aspect of the present invention, there is provided a machine learning-based spectroscopic measurement quality determination method comprising: step 1, performing dimension reduction treatment on the characteristic data of the spectrum to obtain dimension reduced spectrum characteristic data;
step 2, generating a training set containing a plurality of the dimension-reduced spectral feature data, and carrying out clustering model training on each data point in the training set to obtain a clustering model and central points of each cluster contained in the clustering model;
Step 3, performing dimension reduction processing on the feature data of the spectrum to be measured, and classifying the spectrum to be measured based on the clustering model to obtain a classification result, wherein the classification result comprises: the cluster k to which the spectrum to be measured belongs, the distance d_k from the spectrum to be measured to the center point of cluster k, and the number t_k of spectra in cluster k;
And 4, judging the quality of the spectrum to be detected based on the classification result of the spectrum to be detected.
On the basis of the technical scheme, the invention can also make the following improvements.
Optionally, in the step 1, PCA dimension reduction is performed on the spectrum.
Optionally, the step 1 includes:
Step 101, carrying out standardization processing on the spectrum characteristics to generate standardized spectrum characteristic vectors;
Step 102, calculating a covariance matrix C among the various dimensional features of the spectral feature vector;
Step 103, solving the eigenvalues λ and the corresponding eigenvectors u of the covariance matrix C, and forming the eigenvectors u corresponding to the K largest eigenvalues λ into a feature plane;
and 104, projecting the spectral feature vector to the feature plane to obtain the reduced-dimension spectral feature.
Optionally, in the step 2, mean-Shift cluster model training is performed on each data point in the training set.
Optionally, the step 2 includes:
step 201, estimating and obtaining a bandwidth distance d of the training set; the bandwidth distance d is the average value of the distances from any point to adjacent points in the training set;
Step 202, randomly selecting n points in the training set as the starting center points of n clusters, wherein n is a random parameter far smaller than the number N of samples in the training set, and the center point set is denoted C = [c_1, c_2, ..., c_n];
Step 203, classifying all data points occurring in the area of radius d centered on the center point c_k into a cluster set M_k; 1 ≤ k ≤ n;
Step 204, taking the center point c_k as the center, calculating the sum of the vectors from the center point c_k to each data point in the cluster set M_k to obtain an offset vector s_k;
Step 205, moving the center point c_k along the direction of the offset vector s_k to obtain a new center point c_new_k = c_k + s_k;
Step 206, repeating steps 203-205, wherein iteration convergence is determined when the offset vector s_k is smaller than a set threshold; if the distance between two center points in the iteration process is smaller than the radius d, merging the two center points and the two cluster sets to which they belong to form a new cluster and center point, and continuing the iteration;
Step 207, repeating step 206 until all of the cluster sets converge.
Optionally, the step 201 includes:
Step 20101, randomly disturbing the sequence of each data point in the training set, and proportionally sampling the data points in the training set, wherein the sampling number is N;
Step 20102, finding the nearest K adjacent points of each data point through a KNN algorithm;
step 20103, recording the farthest point among the K adjacent points of each data point, and recording the distance set D = [d_1, d_2, ..., d_N];
step 20104, calculating the mean value of the set D as the bandwidth distance d = (d_1 + d_2 + ... + d_N)/N.
Optionally, the step4 further includes: and judging the distribution similarity between the spectrum to be tested and the spectrum in the training set according to the cross entropy of the spectrum to be tested and judging the credibility of the training result of the spectrum to be tested according to the distribution similarity.
Optionally, the calculation formula of the cross entropy is:
H(train, test) = -∑_k [ t_train_k * log(t_test_k) + (1 - t_train_k) * log(1 - t_test_k) ];
wherein t_train_k represents the number of spectra of any class k in the training set, and T_train = [t_train_1, t_train_2, ..., t_train_K] represents the distribution of all classes in the spectra of the training set;
T_test = [t_test_1, t_test_2, ..., t_test_K] represents the distribution of all classifications in the spectrum to be measured.
Optionally, the determining the quality of the spectrum to be measured in the step 4 includes:
calculating the discrete distance score s of the spectrum to be measured, wherein d_k is the spectral distance;
and judging the quality of the spectrum to be measured according to the discrete distance score.
Optionally, the step 4 further includes:
And converting the discrete distance score s into a score interval of 0-1.
According to the machine learning-based spectrum measurement quality judging method of the invention, the characteristics of a clustering algorithm are used: by performing unsupervised learning on the spectra and establishing a plurality of center points and classification distances in a high-dimensional space, adaptive scoring of the spectra is effectively completed. Compared with the traditional physical-model method, the method is efficient and fast, does not depend on physical modeling, and can effectively use the spectral features to complete the measurement quality judgment.
Drawings
Fig. 1 is a flowchart of a method for determining quality of spectral measurement based on machine learning according to the present invention.
Detailed Description
The principles and features of the present invention are described below with reference to the drawings, the examples are illustrated for the purpose of illustrating the invention and are not to be construed as limiting the scope of the invention.
Fig. 1 is a flowchart of a machine learning-based spectrum measurement quality determination method according to an embodiment of the present invention, as shown in fig. 1, where the determination method includes:
and step 1, performing dimension reduction treatment on the characteristic data of the spectrum to obtain dimension-reduced spectrum characteristic data.
And 2, generating a training set containing a plurality of dimension-reduced spectral feature data, and carrying out clustering model training on each data point in the training set to obtain a clustering model and central points of each cluster contained in the clustering model.
And 3, performing dimension reduction processing on the characteristic data of the spectrum to be measured, and classifying the spectrum to be measured based on the clustering model to obtain a classification result, wherein the classification result comprises: the cluster k to which the spectrum to be measured belongs, the distance d_k from the spectrum to be measured to the center point of cluster k, and the number t_k of spectra in cluster k.
And 4, judging the quality of the spectrum to be detected based on the classification result of the spectrum to be detected.
According to the spectrum measurement quality judging method based on machine learning, dimension reduction processing is conducted on a spectrum, a clustering method is utilized to model a training spectrum, and score evaluation is conducted on the quality of a test spectrum through the model; the traditional machine learning model is applied to the field of optical precision measurement, and the similarity of an unknown spectrum and a known spectrum can be judged under the condition that an optical model is not needed by processing data, so that the influence of abnormal spectrum caused by spectrum structural parameters and noise is effectively eliminated, and the quality of the optical measurement can be analyzed.
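As a point of reference only, the following is a minimal end-to-end sketch of this four-step flow in Python, using off-the-shelf PCA and Mean-Shift from scikit-learn as stand-ins for the dimension-reduction and clustering models described here; the function names, parameter values and the use of scikit-learn are assumptions introduced for illustration, not the patent's implementation.

```python
# Hedged sketch of steps 1-3, assuming scikit-learn as a stand-in for the
# dimension-reduction and clustering models described in the text.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import MeanShift

def train_quality_model(train_spectra, n_components=10, bandwidth=None):
    """Steps 1-2: reduce dimension, then cluster the training spectra."""
    pca = PCA(n_components=n_components)
    reduced = pca.fit_transform(train_spectra)        # step 1: dimension-reduced features
    ms = MeanShift(bandwidth=bandwidth).fit(reduced)  # step 2: clustering model
    counts = np.bincount(ms.labels_)                  # number of spectra per cluster, t_k
    return pca, ms, counts

def classify_test_spectrum(pca, ms, counts, spectrum):
    """Step 3: project one test spectrum and read off (k, d_k, t_k)."""
    x = pca.transform(np.asarray(spectrum).reshape(1, -1))
    k = int(ms.predict(x)[0])                                   # cluster it belongs to
    d_k = float(np.linalg.norm(x[0] - ms.cluster_centers_[k]))  # distance to that centre
    return k, d_k, int(counts[k])
```

Step 4 then scores the spectrum from (k, d_k, t_k) and the training-set class counts, as detailed in Example 1 below.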
Example 1
Embodiment 1 provided by the present invention is an embodiment of a method for determining quality of spectrum measurement based on machine learning, and as can be seen in fig. 1, the embodiment of the method includes:
and step 1, performing dimension reduction treatment on the characteristic data of the spectrum to obtain dimension-reduced spectrum characteristic data.
In one possible embodiment, the spectra in step 1 are subjected to PCA (Principal Component Analysis) dimension reduction.
PCA dimension reduction is a common method of converting high-dimensional spatial data into low-dimensional spatial data. The core idea is to transform a set of variables that may have correlation into a set of linearly uncorrelated variables by orthogonal transformation, thereby eliminating redundant information in high-dimensional data and minimizing information loss in the spectrum in low-dimensional space.
In one possible embodiment, step 1 comprises:
Step 101, performing standardization processing on the spectral features to generate standardized spectral feature vectors.
In a specific implementation, the measured spectra are denoted as an N×M matrix S = [s_1, s_2, ..., s_M], where N is the number of spectra, M is the number of features of each spectrum, and the m-th dimension value of the n-th sample is written s_n^m, with 1 ≤ n ≤ N and 1 ≤ m ≤ M. An average vector S_mean = [s_mean^1, s_mean^2, ..., s_mean^M] is generated from the average value of each spectral feature, s_mean^m = (s_1^m + s_2^m + ... + s_N^m)/N. The standardized spectral feature vector is obtained by subtracting this average vector from every sample, S_norm = S - [1, 1, ..., 1]^T ⊗ S_mean, where ⊗ represents the cross product and [1, 1, ..., 1]^T is an N-dimensional standard vector.
Step 102, a covariance matrix C between the features of the various dimensions of the spectral feature vector is calculated.
In a specific implementation, the covariance matrix C is obtained from the definition of the covariance matrix: C is the M×M matrix whose element C_ij is the covariance between the i-th dimensional feature and the j-th dimensional feature of the standardized spectral feature vectors, computed over the N samples, wherein 1 ≤ i ≤ M and 1 ≤ j ≤ M.
Step 103, solving the eigenvalues λ and the corresponding eigenvectors u of the covariance matrix C, and forming the eigenvectors u corresponding to the K largest eigenvalues λ into a feature plane.
In a specific implementation, solving for the eigenvalues λ and corresponding eigenvectors u of the covariance matrix C can be expressed as:
Cu = λu
wherein the number of eigenvalues λ is the same as the spectrum dimension M, and each eigenvalue corresponds to an eigenvector u. The eigenvalues λ are arranged in descending order, and the first K eigenvalues and their corresponding eigenvectors [(λ_1, u_1), (λ_2, u_2), ..., (λ_K, u_K)] are taken, thereby forming the feature-vector plane U_K = [u_1, u_2, ..., u_K].
And 104, projecting the spectral feature vector to a feature plane to obtain the reduced-dimension spectral feature.
The original vectors are projected onto the feature-vector plane U_K to obtain the vector S_new = [s_new_1, s_new_2, ..., s_new_K], with the calculation formula s_new_n^k = s_norm_n · u_k, wherein 1 ≤ k ≤ K and 1 ≤ n ≤ N.
Through PCA, the high dimension M of the original spectral data is reduced to the low dimension K while retaining as much information of the original spectra as possible, i.e., the first K feature vectors are used for screening and projection, completing a fast and efficient spectral dimension-reduction process (see the sketch below).
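A compact numpy sketch of steps 101-104 under the notation above; the function name pca_reduce is an assumption for illustration, and only mean-centering is shown for the standardization of step 101 (scaling by the standard deviation could be added).

```python
import numpy as np

def pca_reduce(S, K):
    """Steps 101-104: standardize, covariance, top-K eigenvectors, projection.
    S is an N x M matrix (N spectra, M features); returns an N x K matrix."""
    S_norm = S - S.mean(axis=0)              # step 101: subtract the per-feature mean
    C = np.cov(S_norm, rowvar=False)         # step 102: M x M covariance matrix
    eigvals, eigvecs = np.linalg.eigh(C)     # step 103: eigenvalues and eigenvectors
    order = np.argsort(eigvals)[::-1][:K]    # keep the K largest eigenvalues
    U_K = eigvecs[:, order]                  # feature plane U_K = [u_1, ..., u_K]
    return S_norm @ U_K                      # step 104: project onto the feature plane
```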
And 2, generating a training set containing a plurality of dimension-reduced spectral feature data, and carrying out clustering model training on each data point in the training set to obtain a clustering model and central points of each cluster contained in the clustering model.
In one possible embodiment, in step 2, mean-Shift (Mean Shift) cluster model training is performed on each data point in the training set.
Mean-Shift is a density-based non-parametric clustering algorithm. Its idea is to repeatedly filter and re-classify the data set by updating the positions of the center points within each cluster until stable cluster centers and labels are finally formed.
In one possible embodiment, step 2 includes:
step 201, estimating the bandwidth distance d of the training set; the bandwidth distance d is the average of the distances from any point in the training set to adjacent points.
The bandwidth distance d represents the degree of dispersion between the data points. The larger its value, the more discrete the distribution of data points; the smaller its value, the tighter the distribution of data points. The average point-to-point distance d between samples is effectively estimated, and Mean-Shift clustering model training is carried out based on it.
The spectrum set is X = [x_1, x_2, ..., x_N], where N is the number of samples, x_n is the low-dimensional feature after PCA processing, and the bandwidth distance is d. The Mean-Shift procedure can be divided into the following steps:
in one possible embodiment, step 201 includes:
In step 20101, the sequence of each data point in the training set is randomly disturbed, the data points in the training set are sampled proportionally, and the number of the recorded samples is N.
In step 20102, the nearest K nearest points of each data point are found by KNN algorithm.
In step 20103, the farthest point among the K adjacent points of each data point is recorded, and the distance set D = [d_1, d_2, ..., d_N] is recorded.
Step 20104, calculating the mean value of the set D as the bandwidth distance d = (d_1 + d_2 + ... + d_N)/N.
Step 202, randomly selecting n points in the training set as the initial center points of n clusters, where n is a random parameter far smaller than the number N of samples in the training set, and the center point set is denoted C = [c_1, c_2, ..., c_n].
Step 203, classifying all data points occurring in the area of radius d centered on any center point c_k into a cluster set M_k; the access frequency of the data points in the cluster set M_k is increased by 1; 1 ≤ k ≤ n.
In step 204, taking the center point c_k as the center, the sum of the vectors from the center point c_k to each data point in the cluster set M_k is calculated to obtain the offset vector s_k.
In step 205, the center point c_k is moved along the direction of the offset vector s_k to obtain a new center point c_new_k = c_k + s_k.
Step 206, repeating steps 203-205, and determining iteration convergence when the offset vector s_k is smaller than the set threshold; if the distance between two center points in the iteration process is smaller than the radius d, the two center points are merged, the two cluster sets to which they belong are merged to form a new cluster and center point, and the iteration continues.
Step 207, repeat step 206 until all cluster sets converge.
According to the Mean-Shift algorithm, the training set can be divided into K classes, the set of center points is C = [c_1, c_2, ..., c_K], and the number of points in each class is recorded as the set T = [t_1, t_2, ..., t_K]. This data model is saved as the training result (a code sketch of the bandwidth estimation and clustering procedure is given below).
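The bandwidth estimate of steps 20101-20104 and the clustering loop of steps 202-207 can be sketched as follows (a simplified flat-kernel variant; the sampling ratio, seed count, tolerance and function names are illustrative assumptions, and scikit-learn's nearest-neighbour search stands in for the KNN step):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def estimate_bandwidth_knn(X, K=5, sample_ratio=0.5, seed=0):
    """Steps 20101-20104: shuffle and sample the points, take each sampled point's
    farthest distance among its K nearest neighbours, and average them."""
    rng = np.random.default_rng(seed)
    sample = X[rng.permutation(len(X))[:max(1, int(len(X) * sample_ratio))]]
    nn = NearestNeighbors(n_neighbors=K + 1).fit(X)   # +1 because each point is its own neighbour
    dists, _ = nn.kneighbors(sample)                  # step 20102: K nearest neighbours
    D = dists[:, -1]                                  # step 20103: farthest of the K neighbours
    return float(D.mean())                            # step 20104: bandwidth d = mean(D)

def mean_shift_train(X, d, n_seeds=50, max_iter=300, tol=1e-3, seed=0):
    """Steps 202-207: flat-kernel mean shift with centre merging.
    X is the (N, K) dimension-reduced training set, d the bandwidth distance.
    Returns the centres C, the label of every point, and the class counts T."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=min(n_seeds, len(X)), replace=False)]  # step 202
    for _ in range(max_iter):
        moved, max_shift = [], 0.0
        for c in centers:
            members = X[np.linalg.norm(X - c, axis=1) <= d]        # step 203: points within radius d
            new_c = members.mean(axis=0) if len(members) else c    # steps 204-205: shift to the mean
            max_shift = max(max_shift, float(np.linalg.norm(new_c - c)))
            moved.append(new_c)
        centers = []                                               # step 206: merge centres closer than d
        for c in moved:
            if all(np.linalg.norm(c - m) >= d for m in centers):
                centers.append(c)
        centers = np.array(centers)
        if max_shift < tol:                                        # step 207: all clusters converged
            break
    labels = np.argmin(np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2), axis=1)
    counts = np.bincount(labels, minlength=len(centers))           # class counts T = [t_1, ..., t_K]
    return centers, labels, counts
```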
And 3, performing dimension reduction treatment on the characteristic data of the spectrum to be detected, and classifying the spectrum to be detected based on a clustering model to obtain a classification result, wherein the classification result comprises the following steps: the cluster k to which the spectrum to be measured belongs, the distance d k from the spectrum to be measured to the center point of the cluster k to which the spectrum to be measured belongs, and the number t k of the spectrum of the cluster k to which the spectrum to be measured belongs.
For any sample test_i in the test spectrum set Test = [test_1, test_2, ..., test_n], the sample is first converted into a low-dimensional vector test_new_i = [s_test_1, s_test_2, ..., s_test_K] by step 1; then the distances D = [d_1, d_2, ..., d_K] between the sample and all the center points C from step 2 are calculated, and the minimum value d_k indicates that the sample belongs to the k-th class. The number of spectra t_k of the k-th class in the training set is extracted.
By this method, the classification k, the distance d_k and the class count t_k can be obtained for any sample test_i (see the sketch below).
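Continuing the illustrative sketch above, step 3 for a single test spectrum then reduces to a nearest-centre lookup (centers and counts refer to the outputs of the hypothetical mean_shift_train above, and the test spectrum is assumed to be already dimension-reduced):

```python
import numpy as np

def classify_spectrum(x_new, centers, counts):
    """Step 3: distances to all cluster centres, nearest class k, distance d_k, count t_k."""
    D = np.linalg.norm(centers - x_new, axis=1)   # distances to every centre in C
    k = int(np.argmin(D))                         # class the test spectrum belongs to
    return k, float(D[k]), int(counts[k])         # (k, d_k, t_k)
```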
And 4, judging the quality of the spectrum to be detected based on the classification result of the spectrum to be detected.
According to the result obtained in the step 3, the spectrum quality analysis is carried out. The spectral quality can be analyzed from both macroscopic and microscopic angles.
Macroscopic analysis examines the similarity between the distributions of the training spectra and the test spectra, i.e., the proportions of the different categories in the whole spectrum set after clustering, which are compared through cross entropy (Cross-Entropy). Cross entropy is mainly used to measure how close an actual output is to an expected output: the smaller the value H, the closer the test-spectrum distribution is to the training-spectrum distribution; the larger the value, the larger the difference between the two distributions, and the training result of the machine learning model may not be ideal.
Microscopic analysis evaluates whether a single spectrum is close to the training-set results.
Specifically, in one possible embodiment, step 4 further includes: and judging the distribution similarity between the spectrum to be tested and the spectrum in the training set according to the cross entropy of the spectrum to be tested and judging the credibility of the training result of the spectrum to be tested according to the distribution similarity.
In one possible embodiment, the cross entropy is calculated by the formula:
H(train, test) = -∑_k [ t_train_k * log(t_test_k) + (1 - t_train_k) * log(1 - t_test_k) ].
Wherein t_train_k represents the number of spectra of any class k in the training set, and T_train = [t_train_1, t_train_2, ..., t_train_K] represents the distribution of all classes in the spectra of the training set.
T_test = [t_test_1, t_test_2, ..., t_test_K] represents the distribution of all classifications in the spectrum to be measured.
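A minimal sketch of this cross-entropy check, assuming T_train and T_test have already been normalized to class proportions in [0, 1]; the clipping constant eps is an assumption added to avoid log(0):

```python
import numpy as np

def distribution_cross_entropy(t_train, t_test, eps=1e-12):
    """H(train, test) = -sum_k [ t_train_k*log(t_test_k) + (1 - t_train_k)*log(1 - t_test_k) ].
    Smaller H means the test-spectrum distribution is closer to the training distribution."""
    t_train = np.asarray(t_train, dtype=float)
    t_test = np.clip(np.asarray(t_test, dtype=float), eps, 1.0 - eps)  # avoid log(0)
    return float(-np.sum(t_train * np.log(t_test) + (1.0 - t_train) * np.log(1.0 - t_test)))
```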
In a possible embodiment, the determining the quality of the spectrum to be measured in step 4 includes:
calculating the discrete distance score s of the spectrum to be measured, where d_k is the spectral distance;
and judging the quality of the spectrum to be measured according to the discrete distance score.
The theoretical basis of the formula for calculating the discrete distance score s, which weighs the classification distance against the original bandwidth distance, is as follows:
Relative to the bandwidth value d of the training set, the larger the test distance d_k, the farther the spectrum is considered to be from the center point of the training spectra, and the lower its discrete distance score; the larger the bandwidth value d, the sparser the training spectral data itself is considered to be, and the larger the test distance that can be tolerated. Conversely, the smaller the bandwidth value d, the stricter the requirement on the test distance. A factor label_factor is also included: the fewer spectra a category contains in the training-set classification, the more unusual its spectra are considered, and the category is penalized by this factor. With a lower label_factor, the spectral distance d_k of the corresponding class in the test set needs to be smaller to obtain a high discrete distance score.
In one possible embodiment, step 4 further comprises:
The discrete distance score s is converted to a score interval of 0-1.
In an implementation, the discrete distance score is converted into a specific score; the conversion formula may be the sigmoid formula, score = 1/(1 + e^(-s)), so as to unify the evaluation standard and give a clearer, more intuitive score comparison.
Wherein a score above 0.8 is considered a high quality spectrum, a score between 0.5 and 0.8 is considered a medium quality spectrum, and a score below 0.5 is considered a low quality spectrum.
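Since the exact expressions for the discrete distance score s and label_factor are given as images in the original publication and are not reproduced in this text, the sketch below only illustrates the qualitative behaviour described above, with the assumed forms s = (d / d_k) * label_factor and label_factor = t_k / max(T); these formulas and the function names are assumptions, not the patent's own equations.

```python
import numpy as np

def discrete_distance_score(d, d_k, t_k, T):
    """Illustrative stand-in for the score s: a larger test distance d_k lowers the
    score, a larger bandwidth d tolerates a larger distance, and classes with few
    training spectra are penalized through label_factor (assumed form)."""
    label_factor = t_k / float(np.max(T))      # assumed penalty for sparsely populated classes
    return (d / max(d_k, 1e-12)) * label_factor

def to_unit_score(s):
    """Convert the raw score to the 0-1 interval with the sigmoid formula."""
    return 1.0 / (1.0 + np.exp(-s))

# Per the description: > 0.8 high quality, 0.5-0.8 medium quality, < 0.5 low quality.
```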
According to the machine learning-based spectrum measurement quality judging method of the invention, the characteristics of a clustering algorithm are used: by performing unsupervised learning on the spectra and establishing a plurality of center points and classification distances in a high-dimensional space, adaptive scoring of the spectra is effectively completed. Compared with the traditional physical-model method, the method is efficient and fast, does not depend on physical modeling, and can effectively use the spectral features to complete the measurement quality judgment.
In the foregoing embodiments, the descriptions of the embodiments are focused on, and for those portions of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (9)

1. A machine learning-based spectroscopic measurement quality determination method, the spectrum being used for OCD measurement, the determination method comprising:
step 1, performing dimension reduction treatment on the characteristic data of the spectrum to obtain dimension reduced spectrum characteristic data;
step 2, generating a training set containing a plurality of the dimension-reduced spectral feature data, and carrying out clustering model training on each data point in the training set to obtain a clustering model and central points of each cluster contained in the clustering model;
Step 3, performing dimension reduction processing on the feature data of the spectrum to be measured, and classifying the spectrum to be measured based on the clustering model to obtain a classification result, wherein the classification result comprises: the cluster k to which the spectrum to be measured belongs, the distance d_k from the spectrum to be measured to the center point of cluster k, and the number t_k of spectra in cluster k;
step4, judging the quality of the spectrum to be detected based on the classification result of the spectrum to be detected;
The process of determining the quality of the spectrum to be measured in the step 4 includes:
calculating the discrete distance score s of the spectrum to be measured, wherein d is a bandwidth distance representing the average value of the distances from any point in the training set to its adjacent points, and d_k is the spectral distance;
T_train represents the distribution of all classifications in the spectra of the training set, and K represents the number of all classifications in the spectra of the training set;
and judging the quality of the spectrum to be measured according to the discrete distance score.
2. The method according to claim 1, wherein the PCA dimension reduction processing is performed on the spectrum in step 1.
3. The method according to claim 1 or 2, wherein the step 1 includes:
Step 101, carrying out standardization processing on the spectrum characteristics to generate standardized spectrum characteristic vectors;
Step 102, calculating a covariance matrix C among the various dimensional features of the spectral feature vector;
Step 103, solving the eigenvalues λ and the corresponding eigenvectors u of the covariance matrix C, and forming the eigenvectors u corresponding to the K largest eigenvalues λ into a feature plane;
and 104, projecting the spectral feature vector to the feature plane to obtain the reduced-dimension spectral feature.
4. The method according to claim 1, wherein in the step 2, mean-Shift cluster model training is performed on each data point in the training set.
5. The method according to claim 1 or 4, wherein the step 2 includes:
Step 201, estimating and obtaining a bandwidth distance d of the training set;
Step 202, randomly selecting n points in the training set as the starting center points of n clusters, wherein n is a random parameter far smaller than the number N of samples in the training set, and the center point set is denoted C = [c_1, c_2, ..., c_n];
Step 203, classifying all data points occurring in the area of radius d centered on the center point c_k into a cluster set M_k; 1 ≤ k ≤ n;
Step 204, taking the center point c_k as the center, calculating the sum of the vectors from the center point c_k to each data point in the cluster set M_k to obtain an offset vector s_k;
Step 205, moving the center point c_k along the direction of the offset vector s_k to obtain a new center point c_new_k = c_k + s_k;
Step 206, repeating steps 203-205, wherein iteration convergence is determined when the offset vector s_k is smaller than a set threshold; if the distance between two center points in the iteration process is smaller than the radius d, merging the two center points and the two cluster sets to which they belong to form a new cluster and center point, and continuing the iteration;
Step 207, repeating step 206 until all of the cluster sets converge.
6. The method according to claim 5, wherein the step 201 includes:
Step 20101, randomly disturbing the sequence of each data point in the training set, and proportionally sampling the data points in the training set, wherein the sampling number is N;
Step 20102, finding the nearest K adjacent points of each data point through a KNN algorithm;
Step 20103, recording the farthest point among the K adjacent points of each data point, and recording the distance set D = [d_1, d_2, ..., d_N];
step 20104, calculating the mean value of the set D as the bandwidth distance d = (d_1 + d_2 + ... + d_N)/N.
7. The method according to claim 6, wherein the step 4 further comprises: and judging the distribution similarity between the spectrum to be tested and the spectrum in the training set according to the cross entropy of the spectrum to be tested and judging the credibility of the training result of the spectrum to be tested according to the distribution similarity.
8. The method according to claim 7, wherein the cross entropy is calculated by the formula:
H(train, test) = -∑_k [ t_train_k * log(t_test_k) + (1 - t_train_k) * log(1 - t_test_k) ];
wherein t_train_k represents the number of spectra of any class k in the training set, and T_train = [t_train_1, t_train_2, ..., t_train_K] represents the distribution of all classes in the spectra of the training set;
T_test = [t_test_1, t_test_2, ..., t_test_K] represents the distribution of all classifications in the spectrum to be measured.
9. The method according to claim 1, wherein the step4 further comprises:
And converting the discrete distance score s into a score interval of 0-1.
CN202211292083.7A 2022-10-20 2022-10-20 Spectrum measurement quality judging method based on machine learning Active CN115728247B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211292083.7A CN115728247B (en) 2022-10-20 2022-10-20 Spectrum measurement quality judging method based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211292083.7A CN115728247B (en) 2022-10-20 2022-10-20 Spectrum measurement quality judging method based on machine learning

Publications (2)

Publication Number Publication Date
CN115728247A CN115728247A (en) 2023-03-03
CN115728247B true CN115728247B (en) 2024-05-28

Family

ID=85293960

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211292083.7A Active CN115728247B (en) 2022-10-20 2022-10-20 Spectrum measurement quality judging method based on machine learning

Country Status (1)

Country Link
CN (1) CN115728247B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930533A (en) * 2012-10-09 2013-02-13 河海大学 Semi-supervised hyperspectral image dimension reduction method based on improved K-means clustering
CN106778893A (en) * 2016-12-28 2017-05-31 东北大学 A kind of EO-1 hyperion Method of Sample Selection based on dimensionality reduction with cluster
CN107563448A (en) * 2017-09-11 2018-01-09 广州讯动网络科技有限公司 Sample space clustering method based on near-infrared spectrum analysis
CN111566674A (en) * 2017-11-15 2020-08-21 科磊股份有限公司 Automatic optimization of measurement accuracy by advanced machine learning techniques
CN113705580A (en) * 2021-08-31 2021-11-26 西安电子科技大学 Hyperspectral image classification method based on deep migration learning
CN114299398A (en) * 2022-03-10 2022-04-08 湖北大学 Small sample remote sensing image classification method based on self-supervision contrast learning

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11481578B2 (en) * 2019-02-22 2022-10-25 Neuropace, Inc. Systems and methods for labeling large datasets of physiological records based on unsupervised machine learning

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102930533A (en) * 2012-10-09 2013-02-13 河海大学 Semi-supervised hyperspectral image dimension reduction method based on improved K-means clustering
CN106778893A (en) * 2016-12-28 2017-05-31 东北大学 A kind of EO-1 hyperion Method of Sample Selection based on dimensionality reduction with cluster
CN107563448A (en) * 2017-09-11 2018-01-09 广州讯动网络科技有限公司 Sample space clustering method based on near-infrared spectrum analysis
CN111566674A (en) * 2017-11-15 2020-08-21 科磊股份有限公司 Automatic optimization of measurement accuracy by advanced machine learning techniques
CN113705580A (en) * 2021-08-31 2021-11-26 西安电子科技大学 Hyperspectral image classification method based on deep migration learning
CN114299398A (en) * 2022-03-10 2022-04-08 湖北大学 Small sample remote sensing image classification method based on self-supervision contrast learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on dimensionality reduction and classification of stellar spectra based on t-SNE; Jiang Bin et al.; Spectroscopy and Spectral Analysis; 2020-09-07; 40(09); pp. 2913-2917 *

Also Published As

Publication number Publication date
CN115728247A (en) 2023-03-03

Similar Documents

Publication Publication Date Title
CN105224872B (en) A kind of user's anomaly detection method based on neural network clustering
Zheng et al. A generic semi-supervised deep learning-based approach for automated surface inspection
WO2010106712A1 (en) Etching apparatus, analysis apparatus, etching treatment method, and etching treatment program
CN107957946B (en) Software defect prediction method based on neighborhood embedding protection algorithm support vector machine
CN112613536A (en) Near infrared spectrum diesel grade identification method based on SMOTE and deep learning
CN112070008A (en) Hyperspectral image feature identification method, device and equipment and storage medium
US20210165398A1 (en) Measurement Recipe Optimization Based On Probabilistic Domain Knowledge And Physical Realization
Shen et al. Wafer bin map recognition with autoencoder-based data augmentation in semiconductor assembly process
CN109784142B (en) Hyperspectral target detection method based on conditional random projection
CN117033912B (en) Equipment fault prediction method and device, readable storage medium and electronic equipment
CN115728247B (en) Spectrum measurement quality judging method based on machine learning
US20220114438A1 (en) Dynamic Control Of Machine Learning Based Measurement Recipe Optimization
Liu et al. Unbalanced classification method using least squares support vector machine with sparse strategy for steel surface defects with label noise
CN113523904A (en) Cutter wear detection method
CN115015120B (en) Fourier infrared spectrometer and temperature drift online correction method thereof
CN106485286B (en) Matrix classification model based on local sensitivity discrimination
CN110673577A (en) Distributed monitoring and fault diagnosis method for complex chemical production process
CN114818845A (en) Noise-stable high-resolution range profile feature selection method
Guo et al. Partially observable online change detection via smooth-sparse decomposition
CN116150687A (en) Fluid pipeline leakage identification method based on multi-classification G-WLSTSVM model
Schuetzke et al. Siamese networks for 1d signal identification
Wang et al. Shapelet classification algorithm based on efficient subsequence matching
Devanta Optimization of the K-Means Clustering Algorithm Using Davies Bouldin Index in Iris Data Classification
Michalak et al. Multiobjective optimization of frequent pattern models in ultra-high frequency time series: Stability versus universality
US20230063102A1 (en) Methods And Systems For Selecting Wafer Locations To Characterize Cross-Wafer Variations Based On High-Throughput Measurement Signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant