CN110110794B - Image classification method for updating neural network parameters based on feature function filtering

Info

Publication number
CN110110794B
Authority
CN
China
Prior art keywords
hidden layer
parameter
parameters
weight parameter
function
Prior art date
Legal status
Active
Application number
CN201910389454.5A
Other languages
Chinese (zh)
Other versions
CN110110794A (en)
Inventor
Wen Chenglin (文成林)
Zhai Kaikai (翟凯凯)
Current Assignee
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Priority date
2019-05-10
Filing date
2019-05-10
Publication date
2021-06-29
Application filed by Hangzhou Dianzi University
2019-05-10 Priority to CN201910389454.5A
2019-08-09 Publication of CN110110794A
2021-06-29 Application granted and publication of CN110110794B
Status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/24 — Classification techniques
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/048 — Activation functions


Abstract

The invention discloses an image classification method that updates neural network parameters based on feature function filtering. The feature function filtering used in the invention requires only that the measurement error have a mean value and that the model noise have a distribution function. The invention effectively addresses problems of general neural network parameter updating methods for image classification, such as local convergence and excessive computational complexity; it realizes online, adaptive updating of the neural network parameters, and when a new image sample set is input the network parameters can be updated without revisiting old image samples, so that the network model adapts to changes in the image working conditions.

Description

Image classification method for updating neural network parameters based on feature function filtering
Technical Field
The invention belongs to the technical field of image classification in artificial intelligence, and relates to an image classification method for updating neural network parameters based on feature function filtering.
Background
Artificial intelligence is a technical science that studies and develops theories, methods, techniques, and application systems for simulating, extending, and expanding human intelligence. It is a branch of computer science that attempts to understand the essence of intelligence and to produce intelligent machines that can react in ways similar to human intelligence; its research fields include robotics, speech recognition, image recognition, natural language processing, and expert systems.
A neural network is a computational model that imitates the behavioral characteristics of animal neural networks and performs distributed, parallel information processing. Such a network processes information by adjusting the interconnections among a large number of internal nodes, depending on the complexity of the system, and has self-learning and self-adapting capabilities. It is an important component of artificial intelligence and consists of an input layer, an output layer, and one or more hidden layers between them. The structural design of a neural network mainly involves determining the number of hidden layers and the number of nodes in each hidden layer, and selecting the excitation function of each node. Once the structure is determined, the most important remaining problems are constructing an objective function and identifying the network's many parameters under a given criterion.
Parameter identification methods for neural networks, such as gradient descent and least squares, have several defects. For example, the step size in the iterative gradient descent algorithm is difficult to select and lacks a general standard; the complexity of the algorithm increases exponentially with the number of hidden layers and the number of nodes in each hidden layer; and, since gradient descent is in essence a local linearization, it produces large training errors as the nonlinearity of the objective function increases and easily converges to a local extremum.
Image classification is a very active research direction in computer vision, pattern recognition, and machine learning. Image classification and detection are widely applied: face recognition, pedestrian detection, intelligent video analysis, and pedestrian tracking in the security field; traffic-scene object recognition, vehicle counting, wrong-way detection, and license plate detection and recognition in the traffic field; and content-based image retrieval and automatic album classification on the internet. Image classification is usually implemented with a neural network, yet, as described above, the parameter updating method of the neural network is the bottleneck for classification accuracy and time complexity. A new neural network parameter updating method is therefore of real significance for image classification.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an image classification method that updates neural network parameters based on feature function filtering. The feature function filtering used in the method requires only that the measurement error have a mean value and that the model noise have a distribution function; it suffers neither from local linearization nor from local convergence, and it enables real-time, online, adaptive updating of the network parameters. Applied to image recognition and classification, the method improves classification accuracy and reduces the time complexity of parameter updating.
The invention comprises the following steps:
Step (1) establishing a neural network model mapping the sample input set x(k) = [x1(k), x2(k), …, xn(k)]^T to the output set y(k) = [y1(k), y2(k), …, ym(k)]^T, where the sample input set consists of the preprocessed feature values of each image sample, the sample output set is the classification category of each corresponding image, k indexes the k-th sample set, xn(k) is the n-th input of the k-th sample, and ym(k) is the m-th output of the k-th sample:
y(k) = Σ_{j=1}^l β_j g(Σ_{i=1}^n ω_ij x_i(k) + a_j)  (1)
where g(·) is an activation function, usually chosen as a sigmoid function, ReLU function, Gaussian function, polynomial, or the like; ω_i = [ω_i1, ω_i2, …, ω_il]^T, i = 1, …, n, and a are respectively the weight parameter and the bias parameter of the hidden layer, each ω_ij being a component of ω_i; l is the number of nodes of the single hidden layer; and β is the weight parameter of the output layer.
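For illustration only, the forward pass of equation (1) can be sketched in code as follows (a minimal sketch, not part of the claimed method; the sigmoid choice, the array shapes, and names such as forward are assumptions of this illustration):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def forward(x, W, a, B, g=sigmoid):
        """Single-hidden-layer model of equation (1).
        x: (n,) preprocessed feature values of one image sample
        W: (n, l) hidden weights; row i plays the role of omega_i
        a: (l,) hidden biases; B: (l, m) output weights beta
        """
        H = g(W.T @ x + a)   # hidden node outputs H_j(k), j = 1..l
        return B.T @ H       # y(k) = sum_j beta_j * H_j(k)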
Its loss function has the general form
E = Σ_k ‖y(k) − ŷ(k)‖²  (2)
where ŷ(k) is the picture classification result produced by the established model. The specific form of the loss function used here is
E = Σ_k ‖y(k) − Σ_{j=1}^l β_j g(Σ_{i=1}^n ω_ij x_i(k) + a_j)‖².
Step (2) initializing the weight parameters and bias parameters of the hidden layer and the weight parameters of the output layer of the network
In each iteration, with the hidden layer weight parameters and bias parameters given randomly, the problem of solving all parameters of the network is converted into solving the output layer weight parameter β by least squares. The algorithm is described in detail as follows:
When the activation function of the hidden layer is infinitely differentiable, the neural network no longer needs to solve for all parameters: the hidden layer weight parameters and hidden layer bias parameters can be selected at random and kept unchanged throughout the whole process. The model of equation (1) can then be rewritten in the following form:
y(k) = H(k)β  (3)
where
H(k) = [H_1(k) H_2(k) … H_l(k)]^T
H_j(k) = g(Σ_{i=1}^n ω_ij x_i(k) + a_j), j = 1, …, l.
Since the hidden layer weight parameters and hidden layer bias parameters are determined, H(k) is known, and the problem can be transformed into solving equation (3) for the output layer weight parameter β; the objective function is correspondingly transformed from equation (2) into the form
J(β) = Σ_k ‖H(k)β − y(k)‖².
Solving by least squares gives
β̂ = H†Y
where H† is the Moore-Penrose generalized inverse of the hidden layer output matrix H and Y stacks the sample outputs y(k).
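A minimal sketch of this initialization, assuming N samples stacked row-wise and illustrative names throughout (init_network is not from the patent):

    import numpy as np

    def init_network(X, Y, l, seed=0):
        """X: (N, n) inputs, Y: (N, m) targets, l: number of hidden nodes.
        Draw the hidden parameters at random, then solve beta by least squares."""
        rng = np.random.default_rng(seed)
        W = rng.standard_normal((X.shape[1], l))  # hidden weights, fixed once drawn
        a = rng.standard_normal(l)                # hidden biases, fixed once drawn
        H = 1.0 / (1.0 + np.exp(-(X @ W + a)))    # (N, l) hidden output matrix
        B = np.linalg.pinv(H) @ Y                 # beta = H-dagger Y
        return W, a, B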
And (3) updating the output layer weight parameter β of the neural network as each current new image sample is input.
Kalman filtering is used here to update the output layer weight parameter β in real time. To do so, state equations and measurement equations conforming to Kalman filtering must be established. Considering that the output layer weight parameter β to be estimated changes slowly under a certain random disturbance, the state equation of the Kalman filter is modeled as:
β(k+1) = A(k+1, k)β(k) + w(k)  (4)
where, to model the disturbance on the parameter to be estimated, a white noise sequence w(k) is added to the equation.
From equation (3), the measurement equation can be obtained as follows:
y(k) = Hβ(k) + v(k)  (5)
where, as in the state equation, v(k) is a white noise sequence.
In the Kalman filtering model, the process noise w(k) and the observation noise v(k) are both white noise sequences and are constant within a sampling interval, with E{w(k)w′(k)} = Q, E{v(k)v′(k)} = R and A(k+1, k) = E (the identity matrix); when w(k) and v(k) are mutually independent, E{w(k)v′(k)} = 0. β(k) is the k-th output layer weight parameter.
Then the optimal estimate of the (k+1)-th output layer weight parameter β solved from this model is:
β̂(k+1) = β̂(k+1|k) + K(k+1)[y(k+1) − Hβ̂(k+1|k)]  (6)
where β̂(k+1|k) = A(k+1, k)β̂(k) is the predicted value of the (k+1)-th output layer weight parameter β; K(k+1) is the (k+1)-th optimal gain matrix; and β̂(k+1) is the estimate of the (k+1)-th output layer weight parameter β.
And (4) updating the hidden layer weight parameter ω and the hidden layer bias parameter a through feature function filtering.
Feature function filtering is a novel non-Gaussian filtering method. In feature function filtering, when the observation equation is nonlinear with respect to the state variable, suppose the following two requirements are met:
the method comprises the following steps of 1: { w (k) } and { v (k) } are bounded stationary random processes, x (0) is the initial state, { w (k) }, { v (k) } and x (0) are independent of each other, and the distribution function of { w (k) } is known, and its characteristic function is
Figure BDA0002055961570000041
{ v (k) } mean known, | E (w (k)) > Y<+∞。
Requirement 2: h(·) is a known Borel-measurable, smooth nonlinear function.
Then a filter of the following form can be constructed:
x̂(k+1|k) = A(k)x̂(k)  (7)
x̂(k+1) = x̂(k+1|k) + U(k)[y(k+1) − h(x̂(k+1|k))]  (8)
where A(k) is the state transition matrix, x̂(k) is the estimate of the k-th state quantity, h(x̂(k+1|k)) is the predicted value of the (k+1)-th observation, and U(k) ∈ R^{n×l} is the gain matrix to be designed; the acquisition of U(k) is the core and key of the whole filter design.
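The structure of equations (7)-(8) can be sketched as follows; the gain computation is deliberately abstracted behind a callback, since U(k) comes out of the performance-index minimization described next, and the empirical characteristic function shown is only one assumed way to evaluate φ(t) from noise samples:

    import numpy as np

    def empirical_cf(samples, t):
        """Empirical characteristic function phi(t) = E[exp(j t'w)] from draws of w."""
        return np.mean(np.exp(1j * samples @ t))

    def cff_step(x_hat, A, h, y_next, gain_fn):
        """One feature function filter step, equations (7)-(8).
        gain_fn returns the designed gain U(k) of shape (n, l)."""
        x_pred = A @ x_hat                  # equation (7)
        innovation = y_next - h(x_pred)     # y(k+1) minus the predicted observation
        U = gain_fn(x_pred, innovation)     # design-specific, from minimizing J0
        return x_pred + U @ innovation      # equation (8)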
Let e(k+1) = x(k+1) − x̂(k+1). The resulting estimation error equation is
e(k+1) = s(k) + q(k+1) − U(k)[y(k+1) − h(x̂(k+1|k))]
with s(k) = A(k)e(k) and q(k+1) = G(k+1)w(k+1).
The performance index is
J0 = ∫ K(t)‖φ_e(t) − φ_d(t)‖² dt  (9)
where φ_e(t) is the characteristic function of the estimation error e(k+1) and φ_d(t) is the given target characteristic function. The weighting function K(t), a given positive definite weight matrix, is chosen to ensure that J0 is real and bounded; minimizing this performance index under a constraint on the gain matrix yields the filter gain matrix.
The gain matrix U(k) is solved as follows. Let p1(k) and p2(k) be the constituent matrices defined by equations (10) and (11) in terms of K(t) and the characteristic functions above; the performance index can then be rewritten as a quadratic form in the gain, equation (12). To obtain the gain matrix, set the derivative of the performance index with respect to the gain to zero, which yields the extreme point, equation (13); since the second derivative is positive definite, the solution obtained from equation (13) is the extreme point minimizing the performance index.
The design of the parameter updating method based on the feature function could in principle be realized in two steps: first updating the hidden layer weight parameter ω and the bias parameter a, and second updating the output layer weight parameter β. Owing to its high complexity, however, it is divided into three steps here, described as follows:
For the single hidden layer neural network whose hidden layer weight parameter ω, bias parameter a and output layer weight parameter β have been determined by the methods of steps (1), (2) and (3), the following three steps are executed in sequence each time a new picture sample is input:
Step (4-1) hidden layer weight parameter ω updating
First the hidden layer weight parameter ω is updated using feature function filtering. Assuming the hidden layer bias parameter a and the output layer weight parameter β unchanged, the optimal estimate of the hidden layer weight parameter ω for the (k+1)-th sample is ω̂(k+1).
In this step, for the update of the hidden layer weight parameter ω, write ω = [ω_1, ω_2, …, ω_n] with ω_i = [ω_i1, ω_i2, …, ω_il]^T the component vectors of ω, so that ω is an n-dimensional vector of such components; each ω_i, i = 1, …, n, must therefore be updated by its own modeling solution. When estimating each ω_i, the remaining ω_j, j = 1, …, i−1, i+1, …, n, are assumed constant. Then, with ω_i as the state variable in the feature function filtering, the following state equation and observation equation may be established:
ωi(k+1)=A·ωi(k)+w(k) (16)
y(k) = h(ω_i(k)) + v(k)  (17)
with h(·) the model of equation (1) regarded as a function of ω_i alone.
Here the model noise w(k) only needs to have a distribution function, and the measurement error v(k) only needs to have a mean value. The solving process for the optimal estimate of the k-th ω_i(k) in this model is then as follows:
(a) Calculating p1(k)
Solve for the constituent matrix p1(k) of the gain matrix U(k), as shown in equation (10), where K(t) is the weighting function, φ_d(t) is the given target characteristic function, φ_s(t) is the characteristic function of s(k) = A(k)e(k), and φ_q(t) is the characteristic function of q(k+1) = G(k+1)w(k+1).
(b) Calculating p3(k)
Here y(k) is the classification category of the k-th picture in the sample output set, and ŷ(k) = h(ω̂_i(k)) is its estimate.
(c) Calculating the gain matrix U(k)
U(k) is obtained from p1(k) and p3(k), where R(k) is the assumed positive definite weight matrix constraining U(k).
(d) Calculating the estimate ω̂_i(k) of the component ω_i(k) of the hidden layer weight parameter to be estimated at time k, using the filter equations (7) and (8) with gain U(k).
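Steps (a)-(d) can be strung together as the following sketch of the componentwise update in step (4-1); the gain computation of equations (10)-(13) is abstracted into gain_fn, since its closed form depends on the chosen target characteristic function, and all names are illustrative:

    import numpy as np

    def update_hidden_weights(W, a, B, x, y, gain_fn, A=None):
        """Update each omega_i in turn (equations (16)-(17)), holding the other
        components, the biases a, and the output weights beta fixed.
        W: (n, l) hidden weights, row i playing the role of omega_i."""
        g = lambda z: 1.0 / (1.0 + np.exp(-z))
        n, l = W.shape
        A = np.eye(l) if A is None else A
        for i in range(n):
            def h(w_i, i=i):
                # equation (1) viewed as a function of omega_i alone (equation (17))
                Wtmp = W.copy()
                Wtmp[i, :] = w_i
                return B.T @ g(Wtmp.T @ x + a)
            w_pred = A @ W[i, :]               # state prediction, equation (16)
            innovation = y - h(w_pred)
            U = gain_fn(w_pred, innovation)    # steps (a)-(c): U(k) via p1, p3, R(k)
            W[i, :] = w_pred + U @ innovation  # step (d), filter equation (8)
        return W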
Step (4-2) hidden layer bias parameter a updating
The hidden layer bias parameter a is then updated using feature function filtering. Assuming the hidden layer weight parameter ω and the output layer weight parameter β unchanged, the optimal estimate of the hidden layer bias parameter a for the (k+1)-th picture sample is â(k+1).
In this step, a state equation and an observation equation similar to those in step (4-1) are established with the hidden layer bias parameter a as the state variable, and the optimal estimate of the k-th hidden layer bias parameter a(k) is obtained by solving them.
Step (4-3) updating the output layer weight parameter beta
The output layer weight parameter β is updated using the linear Kalman filtering method. Assuming the hidden layer weight parameter ω and the hidden layer bias parameter a both unchanged, the optimal estimate of the output layer weight parameter β for the (k+1)-th sample is β̂(k+1), as given by equation (6).
The modeling and parameter solving in this step are the same as in step (3).
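Chaining steps (4-1) to (4-3) gives one full online update per new image sample, sketched below using the illustrative routines above; update_hidden_biases is an assumed helper mirroring step (4-2) with a as the state variable, and the output update reuses the scalar-output Kalman sketch:

    import numpy as np

    def online_update(W, a, B, P, x_new, y_new, gain_fn, Q, R):
        """Process one new picture sample without revisiting old samples."""
        # step (4-1): hidden weights by feature function filtering (a, beta fixed)
        W = update_hidden_weights(W, a, B, x_new, y_new, gain_fn)
        # step (4-2): hidden biases, assumed analogous with a as the state variable
        a = update_hidden_biases(W, a, B, x_new, y_new, gain_fn)
        # step (4-3): output weights by linear Kalman filtering (omega, a fixed)
        h = 1.0 / (1.0 + np.exp(-(W.T @ x_new + a)))   # hidden outputs H(k+1)
        B, P = kalman_update_beta(B, P, h, y_new, Q, R)
        return W, a, B, P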
The invention has the beneficial effects that feature function filtering is combined with Kalman filtering to update all parameters of the neural network. Applied to image classification, every time a new image sample arrives all parameters of the neural network can be updated, without revisiting old image samples, so that the network adapts to changes in the image working conditions; the accuracy of image classification is improved and the computational complexity is reduced.
Drawings
FIG. 1 is a diagram of a model of a single hidden layer neural network.
FIG. 2 is a flow chart of the computational steps of the present invention.
Detailed Description
The application of the present invention to image classification is further described below with reference to fig. 2.
The method comprises steps (1) to (4) exactly as set forth in the disclosure above, taking the single hidden layer neural network of FIG. 1 as the example network. Step (1) establishes the neural network model of equation (1), mapping the preprocessed image feature values to classification categories, with the loss function of equation (2). Step (2) randomly initializes the hidden layer weight parameter ω and bias parameter a and solves the output layer weight parameter β by least squares through equation (3). Step (3) updates β by Kalman filtering, equations (4) to (6), as each new image sample is input. Step (4) updates ω and a by feature function filtering and β by Kalman filtering, following steps (4-1) to (4-3) in sequence for every new picture sample.
By applying the method in image classification, every time a new image sample arrives all parameters of the neural network can be updated, without revisiting old image samples, to adapt to changes in the image working conditions; the accuracy of image classification is improved and the computational complexity is reduced.

Claims (4)

1. An image classification method based on updating neural network parameters by feature function filtering, characterized by comprising the following steps:
step (1) establishing a neural network model mapping the sample input set x(k) = [x1(k), x2(k), …, xn(k)]^T to the output set y(k) = [y1(k), y2(k), …, ym(k)]^T, wherein the sample input set consists of the preprocessed feature values of each image sample, the sample output set is the classification category of each corresponding image, k indexes the k-th sample set, xn(k) is the n-th input of the k-th sample, and ym(k) is the m-th output of the k-th sample;
step (2) initializing the hidden layer weight parameters, the hidden layer bias parameters and the output layer weight parameters of the network
In each iteration, with the hidden layer weight parameters and bias parameters given randomly, the problem of solving all parameters of the network is converted into solving the output layer weight parameter β by least squares;
step (3) inputting and updating an output layer weight parameter beta of the neural network through a current new image sample;
considering that the weight parameter beta of the output layer to be estimated is subjected to a certain random interference and is slowly changed, the state equation of Kalman filtering is modeled as follows:
β(k+1)=A(k+1,k)β(k)+w(k) (4)
in order to simulate the interference on the parameter to be estimated, a white noise sequence w (k) is added into the equation;
the measurement equation is obtained as follows:
y(k) = Hβ(k) + v(k)  (5)
wherein v(k) is a white noise sequence;
in the above Kalman filtering model, the process noise w(k) and the observation noise v(k) are both white noise sequences and are constant within a sampling interval, with E{w(k)w′(k)} = Q, E{v(k)v′(k)} = R and A(k+1, k) = E (the identity matrix); when w(k) and v(k) are mutually independent, E{w(k)v′(k)} = 0; β(k) is the k-th output layer weight parameter;
the optimal estimate of the (k+1)-th output layer weight parameter β solved from this model is then:
β̂(k+1) = β̂(k+1|k) + K(k+1)[y(k+1) − Hβ̂(k+1|k)]  (6)
wherein β̂(k+1|k) = A(k+1, k)β̂(k) is the predicted value of the (k+1)-th output layer weight parameter β; K(k+1) is the (k+1)-th optimal gain matrix; and β̂(k+1) is the estimate of the (k+1)-th output layer weight parameter β;
step (4) updating hidden layer weight parameters and hidden layer bias parameters through feature function filtering;
in the feature function filtering, when the observation equation is nonlinear with respect to the state variable, if the following two requirements are satisfied:
the method comprises the following steps of 1: { w (k) } and { v (k) } are bounded stationary random processes, x (0) is the initial state, { w (k) }, { v (k) } and x (0) are independent of each other, and the distribution function of { w (k) } is known, and its characteristic function is
Figure FDA0002055961560000021
{ v (k) } mean known, | E (w (k)) > Y<+∞;
Requirement 2: h(·) is a known Borel-measurable, smooth nonlinear function;
then a filter of the following form can be constructed:
x̂(k+1|k) = A(k)x̂(k)  (7)
x̂(k+1) = x̂(k+1|k) + U(k)[y(k+1) − h(x̂(k+1|k))]  (8)
wherein A(k) is the state transition matrix, x̂(k) is the estimate of the k-th state quantity, h(x̂(k+1|k)) is the predicted value of the (k+1)-th observation, and U(k) ∈ R^{n×l} is the gain matrix to be designed; the acquisition of U(k) is the core and key of the whole filter design;
let e(k+1) = x(k+1) − x̂(k+1); the resulting estimation error equation is
e(k+1) = s(k) + q(k+1) − U(k)[y(k+1) − h(x̂(k+1|k))]
with s(k) = A(k)e(k) and q(k+1) = G(k+1)w(k+1);
the performance index is
J0 = ∫ K(t)‖φ_e(t) − φ_d(t)‖² dt  (9)
wherein φ_e(t) is the characteristic function of the estimation error e(k+1) and φ_d(t) is the given target characteristic function; the weighting function K(t), a given positive definite weight matrix, is chosen to ensure that J0 is real and bounded; minimizing this performance index under a constraint on the gain matrix yields the filter gain matrix;
the gain matrix K (K +1) solving process is given below:
if p is1(k) And p2(k) Are respectively as
Figure FDA00020559615600000210
Figure FDA00020559615600000211
The performance index can be rewritten as
Figure FDA0002055961560000031
To obtain the gain matrix K (K +1), let
Figure FDA0002055961560000032
Obtaining the extreme point of the performance index
Figure FDA0002055961560000033
Due to the fact that
Figure FDA0002055961560000034
Therefore, the solution obtained by the formula (13) is the extreme point of the minimum performance index;
for the single hidden layer neural network with the determined hidden layer weight parameters, bias parameters and output layer weight parameters, when a new picture sample is input, the following three steps are sequentially executed:
step (4-1) hidden layer weight parameter updating
assuming that the hidden layer bias parameter and the output layer weight parameter are unchanged, the optimal estimate of the hidden layer weight parameter for the (k+1)-th sample is ω̂(k+1);
in this step, for the update of the hidden layer weight parameter, write ω = [ω_1, ω_2, …, ω_n] with ω_i = [ω_i1, ω_i2, …, ω_il]^T the components of the hidden layer weight parameter, which is an n-dimensional vector of such components, so each ω_i, i = 1, …, n, is updated by its own modeling solution; when estimating each ω_i, the remaining ω_j, j = 1, …, i−1, i+1, …, n, are assumed constant; then, with ω_i as the state variable in the feature function filtering, the following state equation and observation equation are established:
ωi(k+1)=A·ωi(k)+w(k) (16)
y(k) = h(ω_i(k)) + v(k)  (17)
with h(·) the model of equation (1) regarded as a function of ω_i alone;
wherein, the model noise w (k) only needs to have a distribution function, and the measurement error v (k) only needs to have a mean value;
step (4-2) hidden layer bias parameter a updating
the hidden layer bias parameter a is then updated using feature function filtering; assuming that the hidden layer weight parameter ω and the output layer weight parameter β are unchanged, the optimal estimate of the hidden layer bias parameter a for the (k+1)-th picture sample is â(k+1);
establishing a state equation and an observation equation similar to those in step (4-1) by taking the hidden layer bias parameter a as the state variable, and solving to obtain the optimal estimate of the k-th hidden layer bias parameter a(k);
step (4-3) updating the output layer weight parameter beta
updating the output layer weight parameters by using the linear Kalman filtering method; assuming that the hidden layer weight parameters and the hidden layer bias parameters are unchanged, the optimal estimate of the output layer weight parameters for the (k+1)-th sample is β̂(k+1), as given by equation (6).
2. The method of claim 1, wherein: the neural network model in step (1) is represented as:
y(k) = Σ_{j=1}^l β_j g(Σ_{i=1}^n ω_ij x_i(k) + a_j)  (1)
where g(·) is an activation function, ω_i = [ω_i1, ω_i2, …, ω_il]^T and a are respectively the weight parameter and the bias parameter of the hidden layer, l is the number of nodes of the single hidden layer, and β is the weight parameter of the output layer;
the loss function is
E = Σ_k ‖y(k) − ŷ(k)‖²  (2)
wherein ŷ(k), the image classification result after the model is established, has the specific form
ŷ(k) = Σ_{j=1}^l β_j g(Σ_{i=1}^n ω_ij x_i(k) + a_j).
3. The method of claim 1, wherein: the step (2) is specifically as follows:
when the activation function of the hidden layer is infinitely differentiable, the neural network no longer needs to solve for all parameters; the hidden layer weight parameters and hidden layer bias parameters can be selected at random and kept unchanged throughout, and the model of equation (1) can then be written in the form:
y(k) = H(k)β  (3)
wherein
H(k) = [H_1(k) H_2(k) … H_l(k)]^T
H_j(k) = g(Σ_{i=1}^n ω_ij x_i(k) + a_j), j = 1, …, l;
since the hidden layer weight parameters and hidden layer bias parameters are determined, H(k) is known, and the problem translates to solving equation (3) for the output layer weight parameter β; the objective function (2) is correspondingly transformed to the form
J(β) = Σ_k ‖H(k)β − y(k)‖²;
solving by least squares gives
β̂ = H†Y
where H† is the Moore-Penrose generalized inverse of the hidden layer output matrix H and Y stacks the sample outputs y(k).
4. The method of claim 1, wherein: the optimal estimate of the k-th ω_i(k) in the model formed by equations (16) and (17) in step (4) is solved as follows:
(a) calculating p1(k): solve for the constituent matrix p1(k) of the gain matrix U(k) as shown in equation (10), where K(t) is the weighting function, φ_d(t) is the given target characteristic function, φ_s(t) is the characteristic function of s(k) = A(k)e(k), and φ_q(t) is the characteristic function of q(k+1) = G(k+1)w(k+1);
(b) calculating p3(k), where y(k) is the classification category of the k-th picture in the sample output set and ŷ(k) = h(ω̂_i(k)) is its estimate;
(c) calculating the gain matrix U(k) from p1(k) and p3(k), where R(k) is the assumed positive definite weight matrix constraining U(k);
(d) calculating the estimate ω̂_i(k) of the component ω_i(k) of the hidden layer weight parameter to be estimated at time k, using the filter equations (7) and (8) with gain U(k).