CN109978080B - Image identification method based on discrimination matrix variable limited Boltzmann machine - Google Patents


Info

Publication number
CN109978080B
CN109978080B (application CN201910297655.2A)
Authority
CN
China
Prior art keywords: model, matrix, label, variable, probability
Legal status: Active
Application number
CN201910297655.2A
Other languages
Chinese (zh)
Other versions
CN109978080A (en)
Inventor
尹宝才
田鹏宇
李敬华
孔德慧
王立春
Current Assignee
Beijing University of Technology
Original Assignee
Beijing University of Technology
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Publication of CN109978080A publication Critical patent/CN109978080A/en
Application granted granted Critical
Publication of CN109978080B publication Critical patent/CN109978080B/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate


Abstract

The invention discloses an image recognition method based on a discriminative matrix-variate restricted Boltzmann machine model, in which a discriminative matrix-variate restricted Boltzmann machine, denoted DisMVRBM, is used for two-dimensional image classification. The model operates on images directly, without vectorization, and thus retains the structural information of the original samples. Compared with the MVRBM, a label layer is added, meaning that label information is incorporated while features are extracted, so the extracted features are discriminative and classification performance improves; moreover, because of the added label layer, the model can be used directly as a standalone classifier, without attaching another classifier, eliminating the fine-tuning stage that another classifier would require.

Description

Image recognition method based on a discriminative matrix-variate restricted Boltzmann machine
Technical Field
The invention belongs to the technical field of pattern recognition and particularly relates to an image recognition method based on a discriminative matrix-variate restricted Boltzmann machine model.
Background
An artificial neural network (ANN) is a computational model built by mimicking the structure and function of biological neural networks. A typical ANN is composed of a large number of simple processing nodes (artificial neurons) organized hierarchically and connected to one another in a specified way; some nodes are visible to the outside while others are hidden, and each connection between two nodes carries a weight. Training an ANN model amounts to computing the weights from training data.
A restricted Boltzmann machine (RBM) is a stochastic neural network grounded in statistical mechanics; it can fit any discrete distribution and is often used to build the multilayer structure of a deep belief network (DBN) and to solve various machine-learning problems such as dimensionality reduction, face recognition, collaborative filtering, reconstruction, and denoising. The input and hidden layers of an RBM take vector form, so when the data is a higher-order tensor it usually must be vectorized, and vectorizing higher-order tensor data destroys its spatial structure and loses useful spatial information. To preserve the spatial structure of the data and its intrinsic correlations, Tu et al. proposed a tensor-variate restricted Boltzmann machine, but the hidden layer of that model is still a vector. Qinlei et al. developed the RBM into the matrix-variate restricted Boltzmann machine (MVRBM), a model in which both the input layer and the hidden layer are matrices. Although the matrix form keeps the spatial structure information of the data, the MVRBM is trained unsupervised, just like the RBM; label information is not used when extracting features, so the extracted features lack strong discriminability.
McCallum pointed out that using label information during feature learning is beneficial, and to extract discriminative features many researchers began to use label information in training. Yang et al. studied methods for jointly modeling multimodal data and category information and applied them to video classification. Schmah proposed a discriminative training method for RBMs that trains one RBM per data class, similar in spirit to a Bayesian classifier. Hugo et al. proposed a learning algorithm for the class-constrained restricted Boltzmann machine. Furthermore, inspired by discriminative supervised subspace models, Guo et al. added a supervised-subspace constraint to the RBM hidden layer. All of these models are vector-variate, i.e. the input is vector data; for high-order signals such as images and videos, the high-dimensional data must be stretched into vectors, and this way of processing data inevitably loses its spatial structure information.
The invention improves the MVRBM to address its inability to extract discriminative features: the label information of the data is fully used during training so that the extracted features are discriminative, and the proposed model can be used directly for classification, without an additional classifier to perform the classification task.
Disclosure of Invention
The invention provides an image recognition method based on a discriminative matrix-variate restricted Boltzmann machine model, proposing a discriminative matrix-variate restricted Boltzmann machine for two-dimensional image classification, denoted DisMVRBM. The model operates on images directly, without vectorization, and retains the structural information of the original samples. Compared with the MVRBM, a label layer is added, meaning that label information is incorporated while features are extracted, so the extracted features are discriminative and classification performance improves; moreover, because of the added label layer, the model can be used directly as a standalone classifier, without attaching another classifier, eliminating the fine-tuning stage that another classifier would require.
Drawings
Fig. 1 is a schematic diagram of a DisMVRBM model according to the present invention.
Detailed Description
The invention provides an image recognition method based on a discriminative matrix-variate restricted Boltzmann machine model, comprising the following steps:
Step 1, the discriminative matrix-variate restricted Boltzmann machine model
The energy function of the matrix-variate restricted Boltzmann machine (MVRBM) is defined as:
E(X, H; Θ) = -Σ_{i,j,k,l} x_ij w_ijkl h_kl - Σ_{i,j} b_ij x_ij - Σ_{k,l} c_kl h_kl   (1)

Here the following are defined: X = [x_ij] ∈ ℝ^{I×J} is the visible-layer matrix variable representing the input data, i.e. the input image, each frame of size I × J; H = [h_kl] ∈ ℝ^{K×L} represents the discriminative features of the input data extracted by the model, i.e. the features of the input image, of size K × L; W = [w_ijkl] ∈ ℝ^{I×J×K×L} is the connection weight of X and H, a fourth-order tensor variable representing the nonlinear mapping between the input image and the features extracted by the model; B = [b_ij] ∈ ℝ^{I×J} is the bias matrix variable of the visible layer, representing the offset of the input data; C = [c_kl] ∈ ℝ^{K×L} is the bias matrix variable of the hidden layer, representing the offset of the output features.
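As a concrete illustration of the energy function (1), it can be evaluated directly with `numpy.einsum`; the sizes and random values below are illustrative assumptions, not parameters taken from the patent.

```python
import numpy as np

# Illustrative sketch of the MVRBM energy (1); sizes and values are assumed.
rng = np.random.default_rng(0)
I, J, K, L = 4, 4, 3, 3                              # visible I x J, hidden K x L

X = rng.integers(0, 2, size=(I, J)).astype(float)    # visible matrix (input image)
H = rng.integers(0, 2, size=(K, L)).astype(float)    # hidden feature matrix
W = rng.normal(scale=0.01, size=(I, J, K, L))        # 4th-order weight tensor
B = rng.normal(scale=0.01, size=(I, J))              # visible bias matrix
C = rng.normal(scale=0.01, size=(K, L))              # hidden bias matrix

def energy(X, H, W, B, C):
    """E(X,H;Theta) = -sum_ijkl x_ij w_ijkl h_kl - sum_ij b_ij x_ij - sum_kl c_kl h_kl."""
    return (-np.einsum('ij,ijkl,kl->', X, W, H)
            - np.sum(B * X)
            - np.sum(C * H))

E = energy(X, H, W, B, C)
```

Note that the bias terms tr(X^T B) and tr(H^T C) reduce to element-wise products summed over all entries, which is what `np.sum(B * X)` computes.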
Further, a joint probability distribution of the visible and hidden layers, i.e. the joint probability of the input image and the features fitted by the model, may be defined from the energy function, as in equation (2):

p(X, H; Θ) = exp(-E(X, H; Θ)) / Z(Θ),  Z(Θ) = Σ_{X,H} exp(-E(X, H; Θ))   (2)
and a log-likelihood function is defined from the joint probability distribution:

L(Θ) = Σ_{n=1}^{N} log p(X^(n); Θ) = Σ_{n=1}^{N} log Σ_H p(X^(n), H; Θ)   (3)
Then, taking maximization of the log-likelihood as the goal, the probability of all samples under an optimal parameter set is maximized by learning the model parameters between the visible and hidden layers, yielding an effective representation of the input data.
However, the MVRBM is still an unsupervised generative model: it is expressive and extracts features of the input data well, but for classification tasks it is typically combined with a conventional neural network (NN), the NN being initialized from the MVRBM parameters and fine-tuned by back-propagation before classification.
To avoid the fine-tuning step and the risk of the NN becoming trapped in a local optimum, a discriminative matrix-variate restricted Boltzmann machine, denoted DisMVRBM, is adopted for two-dimensional image classification: a class constraint is added to the original MVRBM model so that the improved MVRBM itself has classification capability, as shown in Fig. 1.
DisMVRBM aims to model, by means of the hidden-layer features H, the joint distribution of the input images D_train = {X^(1), ..., X^(n), ..., X^(N)} and the corresponding class labels Y = [y_zt] ∈ ℝ^{Z×T} with Z = 1; the class-constrained energy function is therefore defined as:

E(X, Y, H; Θ) = -Σ_{i,j,k,l} x_ij w_ijkl h_kl - Σ_{i,j} b_ij x_ij - Σ_{z,t,k,l} y_zt p_ztkl h_kl - Σ_{z,t} d_zt y_zt - Σ_{k,l} c_kl h_kl   (4)
where X, H, W, B and C are as defined above, and the added label-related quantities are defined as follows: Y = [y_zt] ∈ ℝ^{Z×T} is the label matrix variable of the visible layer, identifying the class of the input data, i.e. the label corresponding to the input image; since Z = 1 is a constant, it can be regarded as a vector variable. P = [p_ztkl] ∈ ℝ^{Z×T×K×L} is the connection weight of Y and H, a fourth-order tensor variable representing the nonlinear mapping between the label of the input image and the output features. D = [d_zt] = [d_t] ∈ ℝ^{Z×T} is the bias matrix variable of the label layer, representing the offset of the label, and can likewise be regarded as a vector variable.
the tag layer is a one-bit effective encoding vector, that is, if the tag of the input data is the t-th class, the t-th component of the tag layer vector corresponding to the data is 1, and other components are set to zero.
Because the model weights are fourth-order tensors, the number of parameters is large and the time complexity of the training stage is high. To reduce the model parameters and the computational complexity, the invention assumes that the connection weights between the hidden and visible layers and between the hidden and label layers have a specific structure, which greatly reduces the number of free parameters; specifically, the weight tensors are decomposed as

w_ijkl = u_ki v_lj  and  p_ztkl = q_kz r_lt,

with the matrix forms U = [u_ki] ∈ ℝ^{K×I}, V = [v_lj] ∈ ℝ^{L×J}, Q = [q_kz] ∈ ℝ^{K×Z}, R = [r_lt] ∈ ℝ^{L×T}.
Thus, the energy function of the factorized DisMVRBM is obtained as:

E(X, Y, H; Θ) = -tr(U^T H V X^T) - tr(X^T B) - tr(Q^T H R Y^T) - tr(Y^T D) - tr(H^T C)   (5)

where Θ = {U, V, Q, R, B, C, D} represents all parameters of the model.
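The decomposition can be checked numerically: with w_ijkl = u_ki v_lj and p_ztkl = q_kz r_lt, the trace form (5) must agree with the element-wise sums of (4). The sketch below does exactly that, with illustrative (assumed) sizes.

```python
import numpy as np

rng = np.random.default_rng(1)
I, J, K, L, Z, T = 4, 4, 3, 3, 1, 5

X = rng.integers(0, 2, size=(I, J)).astype(float)    # input image
H = rng.integers(0, 2, size=(K, L)).astype(float)    # hidden features
Y = np.zeros((Z, T)); Y[0, 2] = 1.0                  # one-hot label, class t = 2
U = rng.normal(size=(K, I)); V = rng.normal(size=(L, J))
Q = rng.normal(size=(K, Z)); R = rng.normal(size=(L, T))
B = rng.normal(size=(I, J)); C = rng.normal(size=(K, L)); D = rng.normal(size=(Z, T))

def energy(X, Y, H, U, V, Q, R, B, C, D):
    """E = -tr(U^T H V X^T) - tr(X^T B) - tr(Q^T H R Y^T) - tr(Y^T D) - tr(H^T C)."""
    return (-np.trace(U.T @ H @ V @ X.T) - np.trace(X.T @ B)
            - np.trace(Q.T @ H @ R @ Y.T) - np.trace(Y.T @ D)
            - np.trace(H.T @ C))

E5 = energy(X, Y, H, U, V, Q, R, B, C, D)
```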
Based on the above formula, the joint probability of X, Y and H, i.e. the joint probability of the input image, the features and the corresponding label, is:

p(X, Y, H; Θ) = exp(-E(X, Y, H; Θ)) / Z(Θ)   (6)
the normalization constant Z (Θ) in the above formula is defined as:
Figure BDA0002027160380000061
the probability of a unit of the hidden layer being activated, i.e. the probability of a feature being activated:
Figure BDA0002027160380000062
where σ(a) = 1/(1 + exp(-a)); expressed in matrix form:

p(H = 1 | X, Y; Θ) = σ(C + U X V^T + Q Y R^T)   (9)
equation (8) represents that the probability that each element of the hidden layer H is 1 is calculated one by one, and σ calculation is applied to each corresponding matrix element.
The activation probability of a unit of the visible layer, i.e. the activation probability of an input image pixel, is:

p(x_ij = 1 | H; Θ) = σ(b_ij + Σ_{k,l} u_ki h_kl v_lj)   (10)

In matrix form:

p(X = 1 | H; Θ) = σ(B + U^T H V)   (11)
like equation (8), equation (10) represents calculating the probability that any one element of the visible layer X is 1 one by one, and the σ calculation is applied to each corresponding matrix element.
Similarly, the conditional probability of each component of the label layer is:

p(y_zt = 1 | H; Θ) = exp(d_t + Σ_{k,l} h_kl q_kz r_lt) / Σ_{t*=1}^{T} exp(d_t* + Σ_{k,l} h_kl q_kz r_lt*)   (12)

where y_zt = 1 indicates that the training image datum belongs to the t-th class. In matrix form:

p(y_t = 1 | H; Θ) = exp(d_t + tr(Q^T H R e_t^T)) / Σ_{t*=1}^{T} exp(d_t* + tr(Q^T H R e_t*^T))   (13)

where e_t denotes the one-hot row vector of class t; the subscript t in the numerator indicates the specific class of the label, while t* in the denominator ranges over all possible label classes.
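Since Z = 1, the score of class t in equation (13) reduces to entry t of the row vector Q^T H R + D, so the conditional label distribution is a softmax over those scores. A minimal sketch under that assumption, with illustrative sizes:

```python
import numpy as np

# p(y_t = 1 | H) from equation (13): softmax over per-class scores (Q^T H R + D)_t.
def label_prob(H, Q, R, D):
    scores = (Q.T @ H @ R + D).ravel()        # one score per class t
    e = np.exp(scores - scores.max())         # numerically stabilized softmax
    return e / e.sum()

rng = np.random.default_rng(3)
K, L, T = 3, 3, 5
H = rng.integers(0, 2, size=(K, L)).astype(float)
Q, R, D = rng.normal(size=(K, 1)), rng.normal(size=(L, T)), rng.normal(size=(1, T))
p = label_prob(H, Q, R, D)                    # length-T probability vector
```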
For given parameters Θ and input X, the conditional probability distribution of Y is obtained by summing out H analytically:

p(Y | X; Θ) = exp(tr(Y^T D)) Π_{k,l} (1 + exp(c_kl + (U X V^T)_kl + (Q Y R^T)_kl)) / Σ_{Y*} exp(tr(Y*^T D)) Π_{k,l} (1 + exp(c_kl + (U X V^T)_kl + (Q Y* R^T)_kl))   (14)

where the sum in the denominator runs over all possible one-hot labels Y*.
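The product form of equation (14) follows because each binary hidden unit can be summed out independently, contributing a factor 1 + exp(activation). The sketch below computes p(y | X) this way and is how the model acts as a standalone classifier (predict argmax_t p_t); sizes and values are illustrative assumptions.

```python
import numpy as np

# p(y_t | X) as in equation (14): each hidden unit contributes a factor
# 1 + exp(c_kl + (U X V^T)_kl + (Q y R^T)_kl); work in log space for stability.
def predict_proba(X, U, V, Q, R, C, D):
    T = R.shape[1]
    base = C + U @ X @ V.T                    # label-independent activations (K, L)
    log_scores = np.empty(T)
    for t in range(T):
        y = np.zeros((1, T)); y[0, t] = 1.0
        act = base + Q @ y @ R.T
        log_scores[t] = D[0, t] + np.logaddexp(0.0, act).sum()   # sum of log(1+e^a)
    e = np.exp(log_scores - log_scores.max())
    return e / e.sum()

rng = np.random.default_rng(4)
I, J, K, L, T = 3, 3, 2, 2, 4
X = rng.integers(0, 2, size=(I, J)).astype(float)
U, V = rng.normal(size=(K, I)), rng.normal(size=(L, J))
Q, R = rng.normal(size=(K, 1)), rng.normal(size=(L, T))
C, D = rng.normal(size=(K, L)), rng.normal(size=(1, T))
p = predict_proba(X, U, V, Q, R, C, D)        # length-T distribution over classes
```

With K = L = 2 the hidden layer has only 16 configurations, so the analytic sum can be verified against brute-force enumeration of H.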
Step 2, solving the discriminative matrix-variate restricted Boltzmann machine model
Suppose a training image dataset containing N samples, D_train = {X^(1), ..., X^(n), ..., X^(N)}, is given; the invention estimates the parameters Θ by the maximum-likelihood method with the following conditional probability as the objective function, the likelihood function being

L(Θ) = Σ_{n=1}^{N} log p(y^(n) | X^(n); Θ)   (15)
where N is the number of samples and n indexes the n-th sample; y^(n) is a vector, so here and hereinafter y^(n) is used in place of Y^(n); its t-th component is 1 and the remaining components are all 0, i.e. y_t^(n) denotes the t-th component of y^(n), and y_t^(n) = 1 represents that the class of datum X^(n) is t; Θ denotes all model parameters. The objective function seeks, under the current model parameters, to maximize the probability that input sample X^(n) carries label y^(n).
According to the conditional-probability formula:

p(y^(n) | X^(n); Θ) = p(X^(n), y^(n); Θ) / p(X^(n); Θ) = Σ_H exp(-E(X^(n), y^(n), H; Θ)) / Σ_{y*} Σ_H exp(-E(X^(n), y*, H; Θ))   (16)

The derivative of the objective function with respect to a model parameter θ is:

∂ log p(y^(n) | X^(n); Θ) / ∂θ = -Σ_H p(H | X^(n), y^(n)) ∂E(X^(n), y^(n), H)/∂θ + Σ_{y,H} p(y, H | X^(n)) ∂E(X^(n), y, H)/∂θ   (17)

To evaluate (17), the three quantities on the right-hand side of the second equality must be computed: ∂E/∂θ, p(H | X^(n), y^(n)), and p(y, H | X^(n)).
These three parts are calculated separately below.
Computing ∂E/∂θ: from equation (5),

∂E/∂u_ki = -(H V X^T)_ki   (18.1)
∂E/∂v_lj = -(H^T U X)_lj   (18.2)
∂E/∂q_kz = -(H R Y^T)_kz   (18.3)
∂E/∂r_lt = -(H^T Q Y)_lt   (18.4)
∂E/∂b_ij = -x_ij,  ∂E/∂c_kl = -h_kl   (18.5)
∂E/∂d_t = -y_t   (18.6)

Equations (18.1)-(18.6) give the calculation for each element of the corresponding parameter matrix.
Computing p(H | X^(n), y^(n)): because the derivatives (18.1)-(18.5) all contain h_kl, h_kl being an element of the matrix variable H, what is required here is

p(h_kl = 1 | X^(n), y^(n)) = σ(c_kl + Σ_{i,j} u_ki x_ij^(n) v_lj + Σ_{z,t} q_kz y_zt^(n) r_lt)   (19)

which gives the activation probability of each element of the matrix H.
Computing p(y, H | X^(n)): first simplify:

p(y, H | X^(n)) = p(H | y, X^(n)) p(y | X^(n))   (20)

where y denotes a specific class in the numerator, while the normalization in the denominator traverses all classes; the factor p(H | y, X^(n)) is given by (19), and the factor p(y | X^(n)) by equation (14).
at this point, the partial derivatives of the objective function in equation (17) for each parameter can be obtained and then substituted into the calculation results (17) of (18), (19) and (20).
Owing to the particularity of (18.6), its expectation is given separately:

Σ_{y,H} p(y, H | X^(n)) ∂E/∂d_t = -p(y_t | X^(n))   (21)

where p(y_t | X^(n)) denotes the probability value of the t-th class computed from training datum X^(n).
Finally, the objective function is maximized by optimizing with a gradient-ascent method; each result of formula (17) is substituted into the following update:

θ ← θ + λ ∂L(Θ)/∂θ,  θ ∈ Θ   (22)

where λ is the learning rate. After multiple iterations, the optimized model is obtained.
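The update rule (22), applied uniformly to every parameter in Θ, can be sketched as a generic gradient-ascent step; here `grads` stands in for the bracketed difference of expectations in (17), which is assumed to have been computed from (18)-(21), and the parameter shapes are toy examples.

```python
import numpy as np

# One gradient-ascent step: theta <- theta + lambda * dL/dtheta for every
# parameter in Theta = {U, V, Q, R, B, C, D}, as in equation (22).
def ascent_step(theta, grads, lam=0.01):
    return {name: value + lam * grads[name] for name, value in theta.items()}

theta = {'U': np.zeros((3, 4)), 'D': np.zeros((1, 5))}       # toy parameter set
grads = {'U': np.ones((3, 4)),  'D': -np.ones((1, 5))}       # placeholder gradients
theta = ascent_step(theta, grads, lam=0.1)
```

In a full training loop this step would be repeated over all samples and iterations until the likelihood (15) stops improving.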
Experimental verification:
The effectiveness of the method for image recognition is verified through comparison experiments with related methods. Two types of experiments are designed: the first verifies the superiority of the discriminative matrix-variate restricted Boltzmann machine (DisMVRBM) over the RBM, the MVRBM, their variants, and other unsupervised methods; the second verifies the superiority of DisMVRBM over the discriminative vector-variate restricted Boltzmann machine (DisRBM).
The experimental datasets used in the invention are as follows:
MNIST Database: the MNIST dataset is a handwritten-digit dataset covering the ten digits 0-9, with 60,000 training images and 10,000 test images; each image is a 28 × 28 grayscale image.
ETH-80 Database: the ETH-80 dataset contains 8 object classes (apple, car, cow, cup, dog, horse, pear, tomato); each class contains 10 different objects, and each object is imaged from 41 viewpoints, i.e. 41 frames of image data per object, for a total of 8 × 10 × 41 = 3,280 images. The invention first down-samples each image to 32 × 32 and converts each image to grayscale.
Ballet Database: the entire dataset contains 8 complex ballet actions in 44 video clips cut from a ballet DVD, each clip containing 107 to 506 frames. The invention randomly selects 200 frames from each of the 8 actions as training data; each frame is down-sampled to 32 × 32 and converted to grayscale.
Coil_20: contains 20 object classes, each with 72 images from different viewpoints; each frame is down-sampled to 32 × 32 as training data.
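The grayscale conversion and 32 × 32 down-sampling applied to these datasets can be sketched as follows. The luminance weights (ITU-R BT.601) and the block-averaging strategy are common choices assumed here, not specified in the text, and the input side lengths are assumed to be multiples of 32.

```python
import numpy as np

# Convert an RGB frame to grayscale (BT.601 luminance weights, an assumption)
# and block-average it down to 32 x 32.
def to_gray(rgb):                                   # rgb: (h, w, 3) array
    return rgb @ np.array([0.299, 0.587, 0.114])

def downsample(img, out=32):
    h, w = img.shape
    fh, fw = h // out, w // out                     # integer block sizes
    return img[:out * fh, :out * fw].reshape(out, fh, out, fw).mean(axis=(1, 3))

frame = np.ones((64, 64, 3))                        # dummy white RGB frame
small = downsample(to_gray(frame))                  # 32 x 32 grayscale image
```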
Experiment 1: comparison of DisMVRBM against other unsupervised RBMs and their variants.
Experiment 1 verifies the superiority of the class-constrained discriminative matrix-variate restricted Boltzmann machine over unsupervised methods; the comparison methods are the traditional RBM, IGRBM (a Gaussian restricted Boltzmann machine), MVRBM, and MVIGRBM (a Gaussian matrix-variate restricted Boltzmann machine).
In Experiment 1, the hidden-layer size of both the comparison models and the proposed model is 28 × 28; the learning rate of the weights between the hidden and visible layers is 0.01 with weight decay 10^-3, the learning rate of the weights between the hidden and label layers is 0.01 with weight decay 10^-6, and the visible layer matches the size of the input data. The comparative results of Experiment 1 are shown in Table 1:
TABLE 1. Recognition accuracy of the discriminative MVRBM versus other, non-discriminative methods

            RBM      IGRBM    MVRBM    MVIGRBM   DisMVRBM
MNIST       0.9494   0.9365   0.9658   0.9665    0.9725
Ballet_32   0.3566   0.7063   0.3505   0.9323    0.9509
ETH-80      0.5281   0.8750   0.3319   0.8800    0.9053
Table 1 gives the recognition accuracy of the different models on several datasets; the proposed model outperforms the comparison models RBM, IGRBM, MVRBM and MVIGRBM on MNIST, Ballet_32 and ETH-80.
This is because the four comparison models are all generative models trained unsupervised, without using the label information of the data. To perform the classification task of this experiment, each is combined with a conventional neural network (NN): the NN parameters are initialized from the comparison model's training result, fine-tuned by back-propagation, and finally used for classification. The proposed DisMVRBM incorporates label information, so on the one hand its extracted features are discriminative, which benefits the classification task; on the other hand, because a label layer is added and supervised training is adopted, the model can serve as a standalone classifier and perform the classification task directly.
Experiment 2: comparison of DisMVRBM against DisRBM.
Experiment 2 verifies the recognition accuracy of the discriminative matrix-variate restricted Boltzmann machine relative to the discriminative vector-variate restricted Boltzmann machine. DisMVRBM and DisRBM were therefore tested on three datasets (Ballet_32, ETH-80 and Coil_20), using the same parameter settings as Experiment 1. The comparative results of Experiment 2 are shown in Table 2:
TABLE 2. Recognition accuracy of DisMVRBM versus DisRBM

            DisRBM   DisMVRBM
Ballet_32   0.9114   0.9509
ETH-80      0.5078   0.9053
Coil_20     0.9779   0.9896
Table 2 gives the recognition accuracy of the proposed DisMVRBM relative to the DisRBM on different datasets; the results show that the classification performance of the proposed matrix-variate discriminative model is superior to that of the conventional vector-variate discriminative model, verifying the superiority of the proposed model. The matrix-variate classification model proposed by the invention does not need to stretch a two-dimensional image into a vector when performing an image classification task, so the original spatial structure of the image is not destroyed; accordingly, its classification results are better than those of the one-dimensional comparison model.

Claims (1)

1. An image recognition method based on a discriminative matrix-variate restricted Boltzmann machine model, characterized by comprising the following steps:
step 1, the discriminative matrix-variate restricted Boltzmann machine model
the energy function of the matrix-variate restricted Boltzmann machine model is defined as:
E(X, H; Θ) = -Σ_{i,j,k,l} x_ij w_ijkl h_kl - Σ_{i,j} b_ij x_ij - Σ_{k,l} c_kl h_kl   (1)

wherein X = [x_ij] ∈ ℝ^{I×J} is the visible-layer matrix variable representing the input data, i.e. the input image, each frame of size I × J; H = [h_kl] ∈ ℝ^{K×L} represents the discriminative features of the input data extracted by the model, i.e. the features of the input image, of size K × L; W = [w_ijkl] ∈ ℝ^{I×J×K×L} is the connection weight of X and H, a fourth-order tensor variable representing the nonlinear mapping between the input image and the features extracted by the model; B = [b_ij] ∈ ℝ^{I×J} is the bias matrix variable of the visible layer, representing the offset of the input data; C = [c_kl] ∈ ℝ^{K×L} is the bias matrix variable of the hidden layer, representing the offset of the output features;
a joint probability distribution of the visible and hidden layers, i.e. the joint probability of the input image and the features fitted by the model, may be defined from the energy function, as in equation (2):

p(X, H; Θ) = exp(-E(X, H; Θ)) / Z(Θ),  Z(Θ) = Σ_{X,H} exp(-E(X, H; Θ))   (2)

and a log-likelihood function is defined from the joint probability distribution:

L(Θ) = Σ_{n=1}^{N} log p(X^(n); Θ) = Σ_{n=1}^{N} log Σ_H p(X^(n), H; Θ)   (3)
further, a discriminative matrix-variate restricted Boltzmann machine, denoted DisMVRBM, is adopted for two-dimensional image classification, namely a class constraint is added on the basis of the original MVRBM model so that the improved MVRBM has classification capability, the MVRBM being the matrix-variate restricted Boltzmann machine;
DisMVRBM models, by means of the hidden-layer features H, the joint distribution of the input images D_train = {X^(1), ..., X^(n), ..., X^(N)} and the corresponding class labels Y = [y_zt] ∈ ℝ^{Z×T} with Z = 1; a class-constrained energy function is therefore defined as follows:

E(X, Y, H; Θ) = -Σ_{i,j,k,l} x_ij w_ijkl h_kl - Σ_{i,j} b_ij x_ij - Σ_{z,t,k,l} y_zt p_ztkl h_kl - Σ_{z,t} d_zt y_zt - Σ_{k,l} c_kl h_kl   (4)

wherein Y = [y_zt] ∈ ℝ^{Z×T} is the visible-layer label matrix variable identifying the class of the input data, i.e. the label corresponding to the input image; since Z = 1 is a constant, it can be regarded as a vector variable; P = [p_ztkl] ∈ ℝ^{Z×T×K×L} is the connection weight of Y and H, a fourth-order tensor variable representing the nonlinear mapping between the label of the input image and the output features; D = [d_zt] = [d_t] ∈ ℝ^{Z×T} is the bias matrix variable of the label layer, representing the offset of the label, and can likewise be regarded as a vector variable;
wherein the label layer is a one-hot encoding vector, that is, if the label of the input datum is the t-th class, the t-th component of the corresponding label vector is 1 and all other components are set to zero,
assuming that the connection weights between the hidden and visible layers and between the hidden and label layers have a specific structure, the weight tensors are decomposed as

w_ijkl = u_ki v_lj  and  p_ztkl = q_kz r_lt,

with the matrix forms U = [u_ki] ∈ ℝ^{K×I}, V = [v_lj] ∈ ℝ^{L×J}, Q = [q_kz] ∈ ℝ^{K×Z}, R = [r_lt] ∈ ℝ^{L×T};
thus, the energy function of the factorized DisMVRBM is obtained as follows:

E(X, Y, H; Θ) = -tr(U^T H V X^T) - tr(X^T B) - tr(Q^T H R Y^T) - tr(Y^T D) - tr(H^T C)   (5)

wherein Θ = {U, V, Q, R, B, C, D} represents all parameters of the model,
based on the above formula, the joint probability of X, Y and H, i.e. the joint probability of the input image, the features and the corresponding label, is:

p(X, Y, H; Θ) = exp(-E(X, Y, H; Θ)) / Z(Θ)   (6)

the normalization constant Z(Θ) in the above formula being defined as:

Z(Θ) = Σ_X Σ_Y Σ_H exp(-E(X, Y, H; Θ))   (7)
the probability that a unit of the hidden layer is activated, i.e. that a feature is activated, is:

p(h_kl = 1 | X, Y; Θ) = σ(c_kl + Σ_{i,j} u_ki x_ij v_lj + Σ_{z,t} q_kz y_zt r_lt)   (8)

where σ(a) = 1/(1 + exp(-a)); expressed in matrix form:

p(H = 1 | X, Y; Θ) = σ(C + U X V^T + Q Y R^T)   (9)

equation (8) computes element by element the probability that each element of the hidden layer H is 1, σ being applied to each corresponding matrix element,
the activation probability of a unit of the visible layer, i.e. the activation probability of an input image pixel, is:

p(x_ij = 1 | H; Θ) = σ(b_ij + Σ_{k,l} u_ki h_kl v_lj)   (10)

expressed in matrix form:

p(X = 1 | H; Θ) = σ(B + U^T H V)   (11)

like formula (8), formula (10) computes element by element the probability that any element of the visible layer X is 1, σ being applied to each corresponding matrix element; similarly, the conditional probability of each component of the label layer is as follows:

p(y_zt = 1 | H; Θ) = exp(d_t + Σ_{k,l} h_kl q_kz r_lt) / Σ_{t*=1}^{T} exp(d_t* + Σ_{k,l} h_kl q_kz r_lt*)   (12)
wherein y_zt = 1 indicates that the training image datum belongs to the t-th class;
in matrix form:

p(y_t = 1 | H; Θ) = exp(d_t + tr(Q^T H R e_t^T)) / Σ_{t*=1}^{T} exp(d_t* + tr(Q^T H R e_t*^T))   (13)

here e_t denotes the one-hot row vector of class t; the subscript t in the numerator indicates the specific class of the label, while t* in the denominator ranges over all possible label classes;
for given parameters Θ and input X, the conditional probability distribution of Y is:

p(Y | X; Θ) = exp(tr(Y^T D)) Π_{k,l} (1 + exp(c_kl + (U X V^T)_kl + (Q Y R^T)_kl)) / Σ_{Y*} exp(tr(Y*^T D)) Π_{k,l} (1 + exp(c_kl + (U X V^T)_kl + (Q Y* R^T)_kl))   (14)
step 2, solving the discriminative matrix-variate restricted Boltzmann machine model
suppose a training image dataset containing N samples, D_train = {X^(1), ..., X^(n), ..., X^(N)}, is given; the parameters Θ are estimated by the maximum-likelihood method with the following conditional probability as the objective function, the likelihood function being

L(Θ) = Σ_{n=1}^{N} log p(y^(n) | X^(n); Θ)   (15)

wherein N is the number of samples and n represents the n-th sample; y^(n) is a vector, so here and hereinafter y^(n) is used in place of Y^(n); its t-th component is 1 and the remaining components are all 0, i.e. y_t^(n) denotes the t-th component of y^(n), and y_t^(n) = 1 represents that the class of datum X^(n) is t; Θ denotes all model parameters, and the objective function is intended to maximize, under the current model parameters, the probability that input sample X^(n) carries label y^(n),
obtaining the following according to the conditional-probability formula:

p(y^(n) | X^(n); Θ) = p(X^(n), y^(n); Θ) / p(X^(n); Θ) = Σ_H exp(-E(X^(n), y^(n), H; Θ)) / Σ_{y*} Σ_H exp(-E(X^(n), y*, H; Θ))   (16)

the derivative of the objective function with respect to a model parameter θ is:

∂ log p(y^(n) | X^(n); Θ) / ∂θ = -Σ_H p(H | X^(n), y^(n)) ∂E(X^(n), y^(n), H)/∂θ + Σ_{y,H} p(y, H | X^(n)) ∂E(X^(n), y, H)/∂θ   (17)

to calculate (17), the three parts to the right of the second equality in equation (17) need to be calculated: ∂E/∂θ, p(H | X^(n), y^(n)), and p(y, H | X^(n)),
these three parts are calculated separately below:
calculating ∂E/∂θ: from equation (5),

∂E/∂u_ki = -(H V X^T)_ki   (18.1)
∂E/∂v_lj = -(H^T U X)_lj   (18.2)
∂E/∂q_kz = -(H R Y^T)_kz   (18.3)
∂E/∂r_lt = -(H^T Q Y)_lt   (18.4)
∂E/∂b_ij = -x_ij,  ∂E/∂c_kl = -h_kl   (18.5)
∂E/∂d_t = -y_t   (18.6)

the above (18.1)-(18.6) give the calculation for each element of the corresponding parameter matrix,
calculating p(H | X^(n), y^(n)): because the derivatives (18.1)-(18.5) all contain h_kl, h_kl being an element of the matrix variable H, what is required here is

p(h_kl = 1 | X^(n), y^(n)) = σ(c_kl + Σ_{i,j} u_ki x_ij^(n) v_lj + Σ_{z,t} q_kz y_zt^(n) r_lt)   (19)

the above formula giving the activation probability of each element of the matrix H,
calculating p(y, H | X^(n)): first simplify:

p(y, H | X^(n)) = p(H | y, X^(n)) p(y | X^(n))   (20)

where y denotes a specific class in the numerator, while the normalization in the denominator traverses all classes; the factor p(H | y, X^(n)) is given by (19), and the factor p(y | X^(n)) by equation (14),
at last, the partial derivatives of the objective function in the formula (17) for each parameter can be obtained and the calculation results of (18), (19) and (20) can be substituted into the formula (17),
given the particularity of (18.6), this is given separately:
Figure FDA0003054741640000065
wherein, p (y)t|X(n)) Representation by training data X(n)The probability value of the t-th class is calculated,
Finally, the objective function is maximized by gradient ascent; each result of equation (17) is substituted into the following update rule:

$$\theta \leftarrow \theta + \lambda\,\frac{\partial \log p\left(y^{(n)}\mid X^{(n)};\Theta\right)}{\partial \theta}$$

where θ ∈ Θ and λ is the learning rate. After multiple iterations, the optimized model is obtained.
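The gradient-ascent update above can be sketched for the bias-type parameters under the same assumed bilinear energy (again, grad_step, Wlab, etc. are hypothetical names, not the patent's): the exact gradient of log p(y | X) is the sufficient statistic under the observed label minus its expectation under p(y' | X), matching the positive/negative structure of equation (17).

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softplus(z):
    return np.logaddexp(0.0, z)          # log(1 + e^z), numerically stable

def grad_step(X, y, U, V, C, Wlab, d, lr=0.01):
    """One exact gradient-ascent step on log p(y | X) for a hypothetical
    bilinear matrix-variate energy; only C, Wlab, and d are updated here
    for brevity.  Gradient = statistic at the observed label y minus its
    expectation over the model posterior p(y' | X)."""
    n_cls = len(d)
    A = U.T @ X @ V                                   # shared bilinear term
    scores = np.array([d[c] + softplus(A + C + Wlab[c]).sum()
                       for c in range(n_cls)])
    p = np.exp(scores - scores.max()); p /= p.sum()   # p(y' | X)
    sig = [sigmoid(A + C + Wlab[c]) for c in range(n_cls)]  # E[H | X, y']
    # positive phase (observed label) minus negative phase (model expectation)
    gC = sig[y] - sum(p[c] * sig[c] for c in range(n_cls))
    C += lr * gC
    for c in range(n_cls):
        Wlab[c] += lr * ((c == y) * sig[y] - p[c] * sig[c])
    d += lr * ((np.arange(n_cls) == y) - p)           # one-hot(y) - p(y'|X)
    return C, Wlab, d
```

Because the hidden units can be summed out in closed form, this gradient is exact rather than sampled, so small steps reliably increase the conditional log-likelihood.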
CN201910297655.2A 2018-04-16 2019-04-15 Image identification method based on discrimination matrix variable limited Boltzmann machine Active CN109978080B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN2018103366215 2018-04-16
CN201810336621.5A CN108510009A (en) 2018-04-16 2018-04-16 Image recognition method based on a discriminative matrix-variate restricted Boltzmann machine

Publications (2)

Publication Number Publication Date
CN109978080A CN109978080A (en) 2019-07-05
CN109978080B true CN109978080B (en) 2021-06-25

Family

ID=63382358

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201810336621.5A Withdrawn CN108510009A (en) 2018-04-16 2018-04-16 Image recognition method based on a discriminative matrix-variate restricted Boltzmann machine
CN201910297655.2A Active CN109978080B (en) 2018-04-16 2019-04-15 Image identification method based on discrimination matrix variable limited Boltzmann machine


Country Status (1)

Country Link
CN (2) CN108510009A (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112632466B (en) * 2020-11-26 2024-01-23 江苏科技大学 Bearing fault prediction method based on deep bidirectional long-short-time memory network
CN113643722B (en) * 2021-08-27 2024-04-19 杭州电子科技大学 Urban noise identification method based on multilayer matrix random neural network

Citations (2)

Publication number Priority date Publication date Assignee Title
CN105488563A (en) * 2015-12-16 2016-04-13 重庆大学 Deep learning oriented sparse self-adaptive neural network, algorithm and implementation device
CN107272644A (en) * 2017-06-21 2017-10-20 哈尔滨理工大学 DBN fault diagnosis method for electric submersible reciprocating oil pumping units

Family Cites Families (3)

Publication number Priority date Publication date Assignee Title
KR20160112186A (en) * 2015-03-18 2016-09-28 삼성전자주식회사 Method and apparatus for event-based learning in neural network
US10339442B2 (en) * 2015-04-08 2019-07-02 Nec Corporation Corrected mean-covariance RBMs and general high-order semi-RBMs for large-scale collaborative filtering and prediction
CN106886798A (en) * 2017-03-10 2017-06-23 北京工业大学 Image recognition method based on a matrix-variate Gaussian restricted Boltzmann machine


Non-Patent Citations (3)

Title
An Infinite Restricted Boltzmann Machine;Côté M-A, Larochelle H;《Neural Computation》;20161215;pp. 1-24 *
Image classification method based on deep learning coding models (基于深度学习编码模型的图像分类方法);赵永威,李婷,蔺博宇;《工程科学与技术》;20170115;pp. 213-220 *
Multi-class text representation and classification method based on hybrid deep belief networks (基于混合深度信念网络的多类文本表示与分类方法);翟文洁,闫琰,张博文,殷绪成;《情报工程》;20161015;pp. 30-40 *

Also Published As

Publication number Publication date
CN109978080A (en) 2019-07-05
CN108510009A (en) 2018-09-07

Similar Documents

Publication Publication Date Title
Gao et al. Flow contrastive estimation of energy-based models
CN110689086B (en) Semi-supervised high-resolution remote sensing image scene classification method based on generating countermeasure network
Hadji et al. What do we understand about convolutional networks?
Patel et al. A probabilistic theory of deep learning
CN106991372B (en) Dynamic gesture recognition method based on mixed deep learning model
Bodapati et al. Feature extraction and classification using deep convolutional neural networks
US9489568B2 (en) Apparatus and method for video sensor-based human activity and facial expression modeling and recognition
US20190087726A1 (en) Hypercomplex deep learning methods, architectures, and apparatus for multimodal small, medium, and large-scale data representation, analysis, and applications
Li et al. Facial expression recognition using deep neural networks
Bengio et al. Unsupervised feature learning and deep learning: A review and new perspectives
US20150104102A1 (en) Semantic segmentation method with second-order pooling
CN107085704A Fast facial expression recognition method based on ELM autoencoder algorithms
Sun et al. Recognition of SAR target based on multilayer auto-encoder and SNN
Balasubramanian et al. Smooth sparse coding via marginal regression for learning sparse representations
Tariyal et al. Greedy deep dictionary learning
CN109978080B (en) Image identification method based on discrimination matrix variable limited Boltzmann machine
Kanan Fine-grained object recognition with gnostic fields
Cho et al. Tikhonov-type regularization for restricted Boltzmann machines
Yao A compressed deep convolutional neural networks for face recognition
Caroppo et al. Facial Expression Recognition in Older Adults using Deep Machine Learning.
CN107563287B (en) Face recognition method and device
CN109063766B (en) Image classification method based on discriminant prediction sparse decomposition model
CN113887509B (en) Rapid multi-modal video face recognition method based on image set
Tang et al. Analysis dictionary learning for scene classification
CN107341485B (en) Face recognition method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant