CN112529077A - Embedded feature selection method and equipment based on prior probability distribution - Google Patents

Embedded feature selection method and equipment based on prior probability distribution

Info

Publication number
CN112529077A
CN112529077A (application CN202011438665.2A)
Authority
CN
China
Prior art keywords: value, determining, preset, module, sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011438665.2A
Other languages
Chinese (zh)
Inventor
陈会
姜青山
刘薇
肖焯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN202011438665.2A
Publication of CN112529077A
Legal status: Pending


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/24 - Classification techniques
    • G06F18/241 - Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 - Classification techniques relating to the classification model based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06F18/24155 - Bayesian classification
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides an embedded feature selection method and equipment based on a prior probability distribution, wherein the method comprises the following steps: step 1, acquiring the samples of the k-th class in a training data set; step 2, giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function, and the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem; step 3, determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class, the one-dimensional Gaussian distribution function being used to estimate the conditional probability in Bayes' theorem; step 4, determining an intermediate value based on the values of the samples and the preset mean value; step 5, determining the sum of the intermediate values subjected to the logarithmic operation; and step 6, determining the weights of the k-th class based on the preset constant, the intermediate values and the sum value. The scheme obtains the weights simply, quickly and effectively; the attributes with large weights can then be selected to represent each category, reducing data redundancy.

Description

Embedded feature selection method and equipment based on prior probability distribution
Technical Field
The invention relates to the technical field of computers, in particular to an embedded feature selection method and equipment based on prior probability distribution.
Background
Classification, i.e., using labeled samples to assign unlabeled samples to known classes, is a supervised learning technique. At present there are many classifiers with good performance, such as Decision Tree (DT), Logistic Regression (LR), Naive Bayes (NB) and neural networks. With the development of information technology, however, we increasingly face high-dimensional data: data volumes at the zettabyte scale and thousands of features. The resulting curse of dimensionality degrades the performance of classification.
Feature selection is an important data-mining preprocessing technique that attempts to remove redundant attributes from high-dimensional data. Conventional feature extraction methods include Linear Discriminant Analysis (LDA), Principal Component Analysis (PCA), Canonical Correlation Analysis (CCA), Local Linear Embedding (LLE), the ReliefF algorithm, and the like. Currently there are three basic techniques for feature selection: filter, wrapper and embedded methods. The filter method selects features that have a strong correlation with the target variable but ignores the correlations among the features. The wrapper method relies on the correlation coefficients of a linear model; when the absolute value of a correlation coefficient is small, the AUC obtained by the model does not change or decreases only slightly. The NB classifier has the advantages of interpretability, simplicity, practicality, scalability and good incremental-learning ability, and for these reasons it is widely used for classification problems encountered in data mining. However, the usual naive Bayes is based on the conditional independence assumption and cannot be used directly in practical applications: all attributes play the same role in a given class (w_1 = w_2 = ... = w_D, where w denotes the weight, i.e., the importance, of an attribute). To alleviate the conditional independence assumption, some researchers have investigated feature weighting methods. However, these methods are almost independent of the NB classification process itself and act only as a separate processing step.
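For orientation, the following is a minimal sketch (with an invented toy dataset, not taken from the patent) of a standard Gaussian naive Bayes classifier as provided by scikit-learn; under the conditional independence assumption every attribute enters the product of class-conditional densities with the same implicit weight, which is exactly the limitation the weighting methods discussed below try to relax:

# Minimal sketch: plain Gaussian naive Bayes, where all D attributes are
# treated as equally important within each class. The dataset below is an
# invented toy example for illustration only.
import numpy as np
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))                               # 200 samples, D = 10 attributes
y = (X[:, 0] + 0.1 * rng.normal(size=200) > 0).astype(int)   # label driven mainly by attribute 0

clf = GaussianNB().fit(X, y)
print(clf.predict_proba(X[:3]))
# Every attribute contributes identically to the product of conditional
# densities, even though only attribute 0 is informative here.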
In the prior art, in order to deal with the conditional independence assumption and embed feature weighting into the feature selection algorithm itself, scholars at home and abroad have carried out related research. One study proposed a feature-weighted naive Bayes (FWNB) that learns, from a Gaussian distribution, one global weight per feature over all classes; its feature weights can be denoted w_j, meaning that the j-th attribute has the same weight for every class. Furthermore, Chen et al. proposed subspace weighted naive Bayes (SWNB), in which each class has its own weights, denoted w_kj; for class k each attribute plays the same role, but the weights differ for different classes. SWNB iteratively optimizes the weight values using Newton's method. In addition, class-specific attribute weighted naive Bayes (CAWNB) has been proposed, which learns weights by maximizing a conditional log-likelihood (CLL) objective function and minimizing a mean squared error (MSE) objective function, and optimizes the weight matrix with L-BFGS-M.
Specifically, most current research methods perform feature selection as a data preprocessing step that is separate from the overall algorithm, and most of them rely mainly on optimization computations, which require considerable algorithm time.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an embedded feature selection method and equipment based on prior probability distribution.
The embodiment of the invention provides an embedded feature selection method based on prior probability distribution, which comprises the following steps:
step 1, acquiring the samples of the k-th class in a training data set;
step 2, giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem;
step 3, determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class; the one-dimensional Gaussian distribution function is used to estimate the conditional probability in Bayes' theorem;
step 4, determining an intermediate value based on the values of the samples and the preset mean value;
step 5, determining the sum of the intermediate values subjected to the logarithmic operation;
and step 6, determining the weights of the k-th class based on the preset constant, the intermediate values and the sum value.
In a specific embodiment, the method further comprises the following steps:
the value of K is changed and steps 1-6 are repeated to determine the weights of all classes in the training dataset.
In a specific embodiment, the method further comprises the following steps:
determining a final class of the sample based on the weights.
In a specific embodiment, the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
In a specific embodiment, the weight is determined based on the following formula:
[formula image not reproduced]
wherein τ is the preset constant, α is the hyperparameter, w_kj is the weight, X_kj is the intermediate value, D is the number of attributes (the dimension of a sample), λ_1 is the introduced Lagrange multiplier, |c_k| is the number of samples belonging to the k-th class in the training samples, and σ_k is the standard deviation of the class-k samples estimated by the one-dimensional Gaussian function.
The embodiment of the invention also provides embedded feature selection equipment based on prior probability distribution, which comprises the following steps:
the acquisition module is used for acquiring the samples of the k-th class in the training data set;
the first determining module is used for giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem;
a second determination module that determines a preset mean value based on an average value of one-dimensional Gaussian distribution functions used to determine that the sample belongs to a preset class; the one-dimensional Gaussian distribution function is used for estimating the conditional probability of Bayes theorem;
an intermediate value module for determining an intermediate value based on the values of the samples and the preset mean value;
the sum module is used for determining the sum of the intermediate values subjected to logarithmic operation;
and the weight module is used for determining the weights of the k-th class based on the preset constant, the intermediate value and the sum value.
In a specific embodiment, the device further comprises:
and the processing module is used for replacing the K value and sequentially and repeatedly executing the acquisition module to the weight module so as to determine the weights of all classes in the training data set.
In a specific embodiment, the device further comprises:
a classification module to determine a final classification of the sample based on the weights.
In a specific embodiment, the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
An embodiment of the present invention further provides a computer storage medium, in which a program for executing the above method is stored.
Therefore, the embodiments of the invention provide an embedded feature selection method and equipment based on a prior probability distribution, wherein the method comprises the following steps: step 1, acquiring the samples of the k-th class in a training data set; step 2, giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function, and the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem; step 3, determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class, the one-dimensional Gaussian distribution function being used to estimate the conditional probability in Bayes' theorem; step 4, determining an intermediate value based on the values of the samples and the preset mean value; step 5, determining the sum of the intermediate values subjected to the logarithmic operation; and step 6, determining the weights of the k-th class based on the preset constant, the intermediate values and the sum value. By introducing the Dirichlet distribution to estimate the prior probability of Bayesian theory, the scheme obtains an analytic expression for the algorithm output, so the weights can be computed simply, quickly and effectively. After the weights are obtained, the attributes with large weights can be selected to represent each category, reducing data redundancy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
Fig. 1 is a schematic flowchart of an embedded feature selection method based on prior probability distribution according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of an embedded feature selection device based on prior probability distribution according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an embedded feature selection device based on prior probability distribution according to an embodiment of the present invention;
fig. 4 is a schematic diagram of a frame structure of a computer storage medium according to an embodiment of the present invention.
Detailed Description
Various embodiments of the present disclosure will be described more fully hereinafter. The present disclosure is capable of various embodiments and of modifications and variations therein. However, it should be understood that: there is no intention to limit the various embodiments of the disclosure to the specific embodiments disclosed herein, but rather, the disclosure is to cover all modifications, equivalents, and/or alternatives falling within the spirit and scope of the various embodiments of the disclosure.
The terminology used in the various embodiments of the present disclosure is for the purpose of describing particular embodiments only and is not intended to be limiting of the various embodiments of the present disclosure. As used herein, the singular forms are intended to include the plural forms as well, unless the context clearly indicates otherwise. Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the various embodiments of the present disclosure belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined in various embodiments of the present disclosure.
Example 1
Embodiment 1 of the invention discloses an embedded feature selection method based on a prior probability distribution which, as shown in Fig. 1, comprises the following steps:
Step 1: acquire the samples of the k-th class in the training data set. Specifically, for example, a class-k sample x_i = <x_i1, x_i2, ..., x_iD> is obtained from the training data set.
Step 2: give a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem.
First, in probability and statistics, the probability density of the Dirichlet distribution is:
f(x_1, ..., x_D; α_1, ..., α_D) = (1 / B(α)) · ∏_{j=1}^{D} x_j^(α_j − 1)
B(α) = ∏_{j=1}^{D} Γ(α_j) / Γ(∑_{j=1}^{D} α_j)
where α = (α_1, ..., α_D) is the hyperparameter vector, x_1, ..., x_D > 0 and x_1 + x_2 + ... + x_D = 1. A more common form is the symmetric Dirichlet distribution, in which all of α_1, ..., α_D take the same value α. Since there is usually no prior knowledge indicating that one component is better than the others, the symmetric form is often used when a Dirichlet prior is employed:
f(x_1, ..., x_D; α) = (1 / B(α)) · ∏_{j=1}^{D} x_j^(α − 1)
B(α) = Γ(α)^D / Γ(D·α)
When α = 1, the above formula reduces to a uniform distribution over the simplex regardless of the value of x; when α > 1 the distribution becomes more concentrated, and the components of a single sample tend to take similar values; when α < 1 the distribution becomes sharper, and in a single sample most components tend towards 0 while a few take larger values.
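A small illustrative sketch of this behaviour, drawing symmetric Dirichlet samples with numpy for several values of α:

# Illustrative sketch: samples from a symmetric Dirichlet prior for several
# values of alpha (alpha = 1 uniform over the simplex, alpha > 1 near-equal
# components, alpha < 1 sparse samples with most mass on few components).
import numpy as np

D = 5
rng = np.random.default_rng(0)
for alpha in (0.1, 1.0, 10.0):
    sample = rng.dirichlet([alpha] * D)
    print(f"alpha={alpha:>4}: {np.round(sample, 3)}")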
Specifically, Bayes' theorem can be expressed as P(k|x) P(x) = P(k) P(x|k); then
P(k|x) = P(k) P(x|k) / P(x)
Since P(x) does not vary with k, it can be treated as a constant, so P(k|x) is proportional to P(k) P(x|k). Based on the conditional independence assumption and the subspace weighting method, then:
P(k|x) ∝ P(k) · ∏_{j=1}^{D} P(x_j | k)^(w_kj)
In the probability-density formulation, Bayes' theorem is combined with density functions: the prior probability P(k) can be obtained from the density function f(k), and the likelihood P(x|k) from the density function f(x|k).
For classification, the probability that a sample xt to be detected belongs to which class is large is the same as the class, so that the parameters can be optimized to the maximum extent:
J_1(θ_k) = p(θ_k | c_k) ∝ p(θ_k) · p(c_k | θ_k);
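As an illustration of how the quantities above combine at prediction time, the sketch below evaluates the subspace-weighted log-posterior log P(k) + Σ_j w_kj · log f(x_j; μ_kj, σ_k) for one sample; all numerical values are invented placeholders, and the weights are assumed to have been obtained already:

# Illustrative computation of the subspace-weighted posterior described above.
# All parameter values are placeholders; in the method they would be estimated
# from the training data (priors, means, per-class standard deviations, weights).
import numpy as np
from scipy.stats import norm

def weighted_log_posterior(x, log_prior, mu, sigma, w):
    # x: (D,) sample; log_prior, sigma: (K,); mu, w: (K, D)
    log_cond = norm.logpdf(x[None, :], loc=mu, scale=sigma[:, None])  # (K, D) per-attribute log-densities
    return log_prior + (w * log_cond).sum(axis=1)                     # (K,) weighted log-posteriors (up to a constant)

scores = weighted_log_posterior(
    x=np.array([0.2, -1.0, 0.5]),
    log_prior=np.log([0.6, 0.4]),
    mu=np.array([[0.0, -1.0, 0.3], [1.0, 0.5, -0.2]]),
    sigma=np.array([1.0, 0.8]),
    w=np.array([[0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]),
)
print("predicted class:", int(np.argmax(scores)))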
step 3, determining a preset average value based on the average value of the one-dimensional Gaussian distribution function for determining that the sample belongs to a preset class; the one-dimensional Gaussian distribution function is used for estimating the conditional probability of Bayes theorem;
assuming that the probability that a sample belongs to class k is represented by a one-dimensional Gaussian distribution
f(x; μ_k, σ_k) = (1 / (√(2π) · σ_k)) · exp(−(x − μ_k)² / (2σ_k²))
where μ_k and σ_k denote the mean and standard deviation of the one-dimensional Gaussian distribution.
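For concreteness, a short sketch of how such per-class Gaussian parameters could be estimated from a labelled training matrix; using a single σ_k per class follows the notation of this description, while the array layout and pooling are assumptions of the sketch:

# Sketch: estimate mu_kj (mean of attribute j over the samples of class k) and
# a single sigma_k per class, pooled over all attributes of that class.
import numpy as np

def estimate_gaussian_params(X, y):
    classes = np.unique(y)
    mu = np.vstack([X[y == k].mean(axis=0) for k in classes])                      # shape (K, D)
    sigma = np.array([(X[y == k] - mu[i]).std() for i, k in enumerate(classes)])   # shape (K,)
    return classes, mu, sigma

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = rng.integers(0, 3, size=50)
classes, mu, sigma = estimate_gaussian_params(X, y)
print(classes, mu.shape, sigma)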
Step 4, determining a middle value based on the value of the sample and the preset average value;
specifically, the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
Step 5, determining the sum of the intermediate values subjected to logarithmic operation;
Specifically, ln w_kj denotes the intermediate value subjected to the logarithmic operation, and the sum value is determined based on the formula:
[formula image not reproduced]
And 6, determining the weight of the K classes based on a preset constant, the intermediate value and the sum value.
The weights are determined based on the following formula:
[formula image not reproduced]
wherein τ is the preset constant, α is the hyperparameter, w_kj is the weight, X_kj is the intermediate value, D is the number of attributes (the dimension of a sample), λ_1 is the introduced Lagrange multiplier, |c_k| is the number of samples belonging to the k-th class in the training samples, and σ_k is the standard deviation of the class-k samples estimated by the one-dimensional Gaussian function.
Specifically, based on the dirichlet distribution and the gaussian distribution in the above steps 2 and 3, through logarithmic transformation, the objective function is defined as:
Figure BDA0002829399080000092
then:
Figure BDA0002829399080000093
wherein,
Figure BDA0002829399080000101
The constraint conditions are as follows:
[formula image not reproduced]
Taking the logarithm of both sides:
[formula image not reproduced]
Introducing a Lagrange multiplier, the objective function becomes:
[formula image not reproduced]
Taking the derivative of the objective function with respect to σ_k:
[formula image not reproduced]
then:
[formula (2), image not reproduced]
Taking the derivative of the objective function with respect to w_kj:
[formula image not reproduced]
When X_kj ≠ 0,
[formula (3), image not reproduced]
Taking the derivative of the objective function with respect to λ_1:
[formula image not reproduced]
Substituting (2) into (3) yields:
[formula (4), image not reproduced]
Substituting (4) into formula (2) yields:
[formula image not reproduced]
In addition, the scheme further includes: changing the value of k and repeating steps 1-6 to determine the weights of all classes in the training data set.
In this scheme, feature selection is embedded into the learning algorithm itself, and the Dirichlet distribution is introduced to estimate the prior probability of Bayesian theory, so that an analytic expression for the algorithm output is obtained and the weights can be computed simply, quickly and effectively. After the weights are obtained, the attributes with large weights can be selected to represent each category, reducing data redundancy. The class of a sample to be tested can also be judged using Bayesian theory.
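To make the overall flow concrete, the following end-to-end sketch estimates per-class Gaussian parameters, derives one weight per (class, attribute) pair, and keeps the attributes with the largest weights for each class. The closed-form weight expression of this scheme is given only as an image in the source text, so the sketch substitutes an explicitly illustrative stand-in weight: (α − 1) plus the attribute's summed Gaussian log-likelihood over the class samples, shifted to be non-negative and renormalised per class. The function names and the stand-in weight are assumptions of the sketch, not the patent's formula.

# End-to-end sketch of the embedded selection idea. The per-(class, attribute)
# weight used here is an illustrative stand-in, NOT the patent's closed form.
import numpy as np
from scipy.stats import norm

def fit_weights(X, y, alpha=2.0):
    classes = np.unique(y)
    weights = {}
    for k in classes:
        Xk = X[y == k]
        mu_k = Xk.mean(axis=0)                                        # mu_kj for each attribute j
        sigma_k = Xk.std() + 1e-9                                     # one sigma per class
        X_kj = norm.logpdf(Xk, loc=mu_k, scale=sigma_k).sum(axis=0)   # summed log-likelihood per attribute
        raw = (alpha - 1.0) + (X_kj - X_kj.min())                     # shift so every entry is at least alpha - 1
        weights[k] = raw / raw.sum()                                  # renormalise to sum to 1 per class
    return weights

def select_top_attributes(weights, m=3):
    # keep the m attributes with the largest weight for each class
    return {k: np.argsort(w)[::-1][:m] for k, w in weights.items()}

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 8))
y = rng.integers(0, 2, size=100)
print(select_top_attributes(fit_weights(X, y)))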
Example 2
Embodiment 2 of the present invention also discloses an embedded feature selection device based on prior probability distribution, as shown in fig. 2, including:
an obtaining module 201, configured to obtain a kth class sample in a training data set;
a first determining module 202, configured to give a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem;
a second determining module 203, configured to determine a preset mean value based on a mean value of a one-dimensional gaussian distribution function used to determine that the sample belongs to a preset class; the one-dimensional Gaussian distribution function is used for estimating the conditional probability of Bayes theorem;
an intermediate value module 204, configured to determine an intermediate value based on the values of the samples and the preset mean value;
a sum module 205, configured to determine a sum of the intermediate values subjected to the logarithm operation;
a weight module 206, configured to determine the weights of the k-th class based on the preset constant, the intermediate value, and the sum value.
In a specific embodiment, as shown in Fig. 3, the device further includes:
a processing module 207, configured to replace the value of k and repeatedly execute the obtaining module through the weight module in sequence, so as to determine the weights of all classes in the training data set.
In a specific embodiment, the device further comprises:
a classification module to determine a final classification of the sample based on the weights.
In a specific embodiment, the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
In a specific embodiment, the weight is determined based on the following formula:
[formula image not reproduced]
wherein τ is the preset constant, α is the hyperparameter, w_kj is the weight, X_kj is the intermediate value, D is the number of attributes (the dimension of a sample), λ_1 is the introduced Lagrange multiplier, |c_k| is the number of samples belonging to the k-th class in the training samples, and σ_k is the standard deviation of the class-k samples estimated by the one-dimensional Gaussian function.
Example 3
Embodiment 3 of the present invention also discloses a computer storage medium, as shown in fig. 4, in which a program for executing the method described in embodiment 1 is stored.
Therefore, the embodiments of the invention provide an embedded feature selection method and equipment based on a prior probability distribution, wherein the method comprises the following steps: step 1, acquiring the samples of the k-th class in a training data set; step 2, giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function, and the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem; step 3, determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class, the one-dimensional Gaussian distribution function being used to estimate the conditional probability in Bayes' theorem; step 4, determining an intermediate value based on the values of the samples and the preset mean value; step 5, determining the sum of the intermediate values subjected to the logarithmic operation; and step 6, determining the weights of the k-th class based on the preset constant, the intermediate values and the sum value. By introducing the Dirichlet distribution to estimate the prior probability of Bayesian theory, the scheme obtains an analytic expression for the algorithm output, so the weights can be computed simply, quickly and effectively. After the weights are obtained, the attributes with large weights can be selected to represent each category, reducing data redundancy.
Those skilled in the art will appreciate that the figures are merely schematic representations of one preferred implementation scenario and that the blocks or flow diagrams in the figures are not necessarily required to practice the present invention.
Those skilled in the art will appreciate that the modules in the devices in the implementation scenario may be distributed in the devices in the implementation scenario according to the description of the implementation scenario, or may be located in one or more devices different from the present implementation scenario with corresponding changes. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The above serial numbers of the embodiments of the invention are merely for description and do not represent the merits of the implementation scenarios.
The above disclosure is only a few specific implementation scenarios of the present invention, however, the present invention is not limited thereto, and any variations that can be made by those skilled in the art are intended to fall within the scope of the present invention.

Claims (10)

1. An embedded feature selection method based on prior probability distribution is characterized by comprising the following steps:
step 1, acquiring the samples of the k-th class in a training data set;
step 2, giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem;
step 3, determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class; the one-dimensional Gaussian distribution function is used to estimate the conditional probability in Bayes' theorem;
step 4, determining an intermediate value based on the values of the samples and the preset mean value;
step 5, determining the sum of the intermediate values subjected to the logarithmic operation;
and step 6, determining the weights of the k-th class based on the preset constant, the intermediate values and the sum value.
2. The method of claim 1, further comprising:
steps 1-6 are repeated to determine the weights of all classes in the training dataset.
3. The method of claim 1 or 2, further comprising:
determining a final class of the sample based on the weights.
4. The method of claim 1, wherein the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
5. The method of claim 1, wherein the weight is determined based on the following formula:
[formula image not reproduced]
wherein τ is the preset constant, α is the hyperparameter, w_kj is the weight, X_kj is the intermediate value, D is the number of attributes (the dimension of a sample), λ_1 is the introduced Lagrange multiplier, |c_k| is the number of samples belonging to the k-th class in the training samples, and σ_k is the standard deviation of the class-k samples estimated by the one-dimensional Gaussian function.
6. An embedded feature selection device based on prior probability distribution, comprising:
the acquisition module is used for acquiring the samples of the k-th class in the training data set;
the first determining module is used for giving a preset constant, wherein the preset constant is obtained from the weights based on a Dirichlet distribution function; the Dirichlet distribution is used to estimate the prior probability in Bayes' theorem;
the second determining module is used for determining a preset mean value based on the mean of the one-dimensional Gaussian distribution function used to determine that a sample belongs to a preset class; the one-dimensional Gaussian distribution function is used to estimate the conditional probability in Bayes' theorem;
an intermediate value module for determining an intermediate value based on the values of the samples and the preset mean value;
the sum module is used for determining the sum of the intermediate values subjected to logarithmic operation;
and the weight module is used for determining the weights of the k-th class based on the preset constant, the intermediate value and the sum value.
7. The apparatus of claim 6, further comprising:
and the processing module is used for replacing the K value and sequentially and repeatedly executing the acquisition module to the weight module so as to determine the weights of all classes in the training data set.
8. The device of claim 6 or 7, further comprising:
a classification module to determine a final classification of the sample based on the weights.
9. The device of claim 6, wherein the intermediate value is determined based on the following formula:
[formula image not reproduced]
wherein X_kj is the intermediate value, μ_kj is the preset mean value, x_ij is the value of the sample, and |c_k| is the number of samples belonging to the k-th class in the training samples.
10. A computer storage medium having a program stored therein for executing the method of claims 1-5.
CN202011438665.2A 2020-12-10 2020-12-10 Embedded feature selection method and equipment based on prior probability distribution Pending CN112529077A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011438665.2A CN112529077A (en) 2020-12-10 2020-12-10 Embedded feature selection method and equipment based on prior probability distribution

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011438665.2A CN112529077A (en) 2020-12-10 2020-12-10 Embedded feature selection method and equipment based on prior probability distribution

Publications (1)

Publication Number Publication Date
CN112529077A (en) 2021-03-19

Family

ID=74999281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011438665.2A Pending CN112529077A (en) 2020-12-10 2020-12-10 Embedded feature selection method and equipment based on prior probability distribution

Country Status (1)

Country Link
CN (1) CN112529077A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination