EP0433414A1 - Continuous Bayesian estimation with a neural network architecture - Google Patents

Continuous Bayesian estimation with a neural network architecture

Info

Publication number
EP0433414A1
EP0433414A1 (application EP90909520A)
Authority
EP
European Patent Office
Prior art keywords
novum
output
threshold
prediction
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP90909520A
Other languages
English (en)
French (fr)
Inventor
Robert Leo Dawes
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
MARTINGALE RESEARCH CORPN.
Original Assignee
MARTINGALE RESEARCH CORPN
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by MARTINGALE RESEARCH CORPN filed Critical MARTINGALE RESEARCH CORPN
Publication of EP0433414A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology

Definitions

  • the present invention pertains in general to a neural network architecture, and more particularly, to an architecture which is designed to perform adaptive, continuous Bayesian estimation on unpreprocessed large dimensional data.
  • Artificial neural systems is the study of dynamical systems that carry out useful information processing by means of their state response to initial or continuous input.
  • one of the goals of artificial neural systems was the development and application of human-made systems that can carry out the kinds of information processing that brains carry out.
  • These technologies sought to develop processing capabilities such as real-time high performance recognition, knowledge recognition for inexact knowledge domains and fast, precise control of robot effector movement. Therefore, this technology was related to artificial intelligence.
  • Cognitive systems in which neural networks are implemented can be viewed in terms of an observed system and a network which are interfaced by sensor and motor transducers.
  • the neural network is a dynamic system which transforms its current state into the subsequent states under the influence of its inputs to produce outputs which generally influence the observed system.
  • a cognitive system generally attempts to anticipate its sensory input patterns by building internal models of the external dynamics and it minimizes the prediction error by employing a prediction error correction scheme, through improvement of its models, or by influencing the evolution of the observed system, or all three. Mathematically, this is a hybrid of three important and well-known problems: system identification, estimation and control. The theoretical solutions to these have been known for several decades. System identification is accomplished analytically by a number of methods, such as the "model reference" method.
  • Kalman filter provides an iterative estimation of linear plants in Gaussian noise
  • Kalman-Bucy filter provides continuously evolving estimates.
  • the multi-stage Bayesian or continuous Bayesian estimator can be utilized for non-linear plants in non-Gaussian noise. Control is approached through several routes, including the Hamilton-Jacobi theory and the method of Pontryagin.
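The iterative predict/update cycle of the Kalman filter mentioned above can be sketched for a scalar linear plant in Gaussian noise. All function names and constants below are illustrative assumptions for exposition, not the patent's notation.

```python
# Minimal scalar Kalman filter sketch: iterative estimation of a linear
# plant x' = a*x + w observed through y = c*x + v (illustrative only;
# names and constants are not taken from the patent).
def kalman_step(x_est, p_est, y, a=1.0, c=1.0, q=0.01, r=0.1):
    # Predict: propagate the estimate and its variance through the plant model.
    x_pred = a * x_est
    p_pred = a * p_est * a + q
    # Update: weight the innovation (prediction error) by the Kalman gain.
    k = p_pred * c / (c * p_pred * c + r)
    innovation = y - c * x_pred          # the "novelty" of the observation
    x_new = x_pred + k * innovation
    p_new = (1 - k * c) * p_pred
    return x_new, p_new

# Track a constant state x = 2.0 from noisy measurements.
x_est, p_est = 0.0, 1.0
for y in [2.1, 1.9, 2.05, 2.0, 1.95]:
    x_est, p_est = kalman_step(x_est, p_est, y)
```

Each iteration subtracts the predicted observation from the received one, exactly the innovations structure the novum described below exploits.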
  • the present invention disclosed and claimed herein comprises a neural network.
  • the neural network includes an observation input for receiving a timed series of observations.
  • a novelty device is then provided for comparing the observations with an internally generated prediction in accordance with the novelty filtering algorithm.
  • the novelty filter device provides on an output a suboptimal innovations process related to the received observations and the predictions.
  • the output represents a prediction error.
  • a prediction device is provided for generating the prediction for output to the novelty device.
  • This prediction device includes a geometric lattice of nodes. Each of the nodes has associated therewith a memory for storage of spatial patterns which represent a spatial history of the timed series of observations.
  • a plurality of signal inputs is provided for receiving the prediction error from the novelty device and then this received prediction error is filtered through the stored spatial patterns to produce a correlation coefficient that represents the similarity between the stored pattern and the prediction error.
  • a plurality of threshold inputs is provided at each node for receiving threshold output levels from selected other nodes.
  • a threshold memory is provided for storing threshold levels representing the prior probability for the occurrence of the stored spatial patterns prior to receiving the stored spatial patterns.
  • a CPU at each of the nodes computes an updated threshold level in accordance with a differential-difference equation which operates on the stored threshold level, the received threshold levels and the correlation coefficients to define and propagate a quantum mechanical wave particle across the geometric lattice of nodes and also store the updated threshold in the threshold memory.
  • a threshold output is provided from each of the nodes for outputting the updated threshold to other nodes.
  • the CPU computes the internally generated prediction by passing the correlation coefficients through a sigmoid function whose threshold level comprises the updated threshold level.
  • the prediction represents the probability for the occurrence of the stored spatial patterns conditioned upon the prior probability represented by the stored threshold level.
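The correlation-then-sigmoid step just described can be sketched for a single node. The function name, the gain constant and the example vectors are illustrative assumptions, not taken from the patent.

```python
import math

# Sketch of one prediction-node step (illustrative; names are not from
# the patent): the prediction error is filtered through the node's stored
# spatial pattern, and the resulting correlation coefficient is passed
# through a sigmoid whose threshold is the node's current threshold level.
def ig_node_prediction(stored_pattern, prediction_error, threshold, gain=4.0):
    # Correlation coefficient: similarity between stored pattern and error.
    c = sum(w * e for w, e in zip(stored_pattern, prediction_error))
    # Sigmoid output: probability of the stored pattern, conditioned on
    # the prior probability encoded in the threshold level.
    return 1.0 / (1.0 + math.exp(-gain * (c - threshold)))

pattern = [0.5, -0.5, 0.5]
matching_error = [0.5, -0.5, 0.5]      # strong match with the stored pattern
p_low = ig_node_prediction(pattern, matching_error, threshold=0.2)
p_high = ig_node_prediction(pattern, matching_error, threshold=1.5)
```

A low threshold (high prior probability) lets the same correlation produce a strong prediction, while a high threshold suppresses it.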
  • the prediction device is adapted such that it is operable to learn by updating the stored spatial patterns so as to correlate the prediction error with the position of the quantum mechanical wave particle over the geometrical lattice. This learning is achieved in accordance with the Hebbian learning law.
  • the novelty device includes an array of nodes with each node having a plurality of signal inputs that receive the observation inputs, and a plurality of prediction inputs for receiving the prediction outputs of the prediction device.
  • a memory is provided for storing temporal patterns that represent a timed history of the timed series of observations. The prediction observation inputs are then operated upon with a predetermined algorithm that utilizes the stored temporal patterns to provide the prediction error.
  • the novelty device also is adaptive. It learns by updating the stored temporal patterns so as to minimize the prediction error.
  • the learning algorithm utilizes the contraHebbian learning law.
  • Figure 1 illustrates a block diagram of the neural network of the present invention
  • Figure 2 illustrates a block diagram of the PA 12 illustrating the novum 14 as an array of separate neurons and the IG 16 as an array of separate neurons;
  • Figures 3a-3c illustrate the use of traveling wave packets in the IG threshold field
  • Figure 4 illustrates a dual construction of the gamma outstar avalanche, which consists of instars from the pixel array falling on each of a sequence of neurons in a "timing chain";
  • Figure 5 illustrates a recurrent two-layer neural network similar to that utilized by many neural modelers
  • Figure 6 illustrates a block diagram of the parametric avalanche which represents the innovations approach to stochastic filtering
  • Figure 7 illustrates how the novum and the IG of the PA generate and use the innovations process
  • Figures 8a and 8b illustrate schematic representations of the novum neuron and the IG neuron
  • Figure 9 illustrates a more detailed flow from the focal plane in the observation block through the novum and the IG for a two dimensional lattice
  • Figure 10 illustrates a top view of the IG lattice
  • Figure 11 illustrates a tracking system
  • Figure 12 illustrates a block diagram of a control module which employs two PA Kalman Filters for the state estimation functions
  • Figure 13 illustrates a Luenberger observer
  • Figure 14 illustrates the preferred PACM design
  • Figures 15 and 16 illustrate graphs of one example of the PA
  • Figure 17 illustrates the response of the synaptic weights in the IG:
  • Figure 18 illustrates the values of the synaptic weights on each neuron of the novum
  • Figure 19 illustrates a block diagram of one of the neurons in the IG
  • Figure 20 illustrates a block diagram of one of the neurons in the novum
  • Figure 21 illustrates an example of an application of the Parametric Avalanche
  • Figures 22 and 23 illustrate the time evolution of the angle from the vertical for the example of Figure 21 and the corresponding novum output
  • Figure 24 presents programme information for use in a neural network in accordance with the invention.
  • the observed system 10 receives control signals on its input.
  • an observing system 12 is provided to monitor the state of this observed system 10.
  • the observing system 12 is essentially the neural network.
  • the neural network is comprised of a two-layer recurrent network.
  • There is an input layer which is referred to as the novum, illustrated by a block 14.
  • the second layer provides a classification layer and is referred to as the IG as illustrated in a block 16.
  • the abbreviation IG refers to Infinitesimal Generator.
  • the two blocks 14 and 16 operate together to continuously process information and provide a retrospective classification and prediction (estimate) of the evolution of the classified observation.
  • the novum 14 provides a prediction assessment of the observed system and receives on one input thereof the output of the observed system 10. The output of the novum 14 provides a prediction error. The novum 14 also receives on the input thereof the output of the IG 16, the output of the IG 16 providing a state estimate in the form of a conditional probability distribution function. The novum 14 essentially extracts the innovations process which will be described in more detail hereinbelow. It decodes the classifications (state estimations) which it receives from the IG 16 and then subtracts that decoded version from the actual received signal to yield the novel residual.
  • the novum 14 contains the internal model of the transition by which the system 10 is observed and is operable to transform the estimated state output by the IG 16 into a prediction of the observed signal received from the observed system 10.
  • the novum 14 operates under a simple learning law wherein the output is zero when the novelty is not present; that is, when the internal model correctly predicts the observed signal, there is no novelty in the observed signal and, therefore, a zero prediction error. Therefore, the novum 14 is driven to maximize the entropy of its output which comprises its state of "homeostasis".
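The subtraction the novum performs, decoding the IG state estimate into a predicted observation and removing it from the actual input, can be sketched in a few lines. The function and variable names below are illustrative assumptions, not the patent's notation.

```python
# Sketch of the novum's whitening action (illustrative): each novum
# neuron subtracts the decoded IG prediction from its observed input,
# so a correct internal model yields zero novelty output.
def novum_output(observation, ig_state, decode_weights):
    # Decode the IG state estimate into a predicted observation.
    predicted = [sum(w * s for w, s in zip(row, ig_state))
                 for row in decode_weights]
    # Novelty = prediction error residual.
    return [y - p for y, p in zip(observation, predicted)]

ig_state = [1.0, 0.0]
decode = [[0.3, 0.0], [0.7, 0.0]]      # model predicts observation [0.3, 0.7]
perfect = novum_output([0.3, 0.7], ig_state, decode)
novel = novum_output([0.3, 0.9], ig_state, decode)
```

When the internal model predicts the observation exactly, the residual is zero (homeostasis); any mismatch appears directly as novelty.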
  • the IG 16 operates as a prediction generator. It implements the functions of Kalman gain matrix, the state transition function and the estimation procedure. The output of the IG 16 indicates the probability that prior events have occurred, given the prior history and the current observation.
  • Both the novum 14 and the IG 16 are comprised of a network of processing elements or "neurons". However, the neurons in the novum 14 are arranged in accordance with an observation matrix such that the observed system is mapped directly to the neurons in the novum 14 whereas the IG 16, which is also comprised of a network of neurons is arranged as a geometric lattice, and the neurons in the IG 16 represent points in an abstract probability space. Each of the neurons in the IG 16 has an activation level, which activation level indicates the likelihood that the events or states which each particular neuron represents have occurred, given only the current measurements.
  • the output of these neurons indicate the likelihood that those events have occurred, conditioned additionally by the prior history of the observations and the dynamical model, as supplied in the threshold level of the output sigmoid function described hereinbelow.
  • the synaptic weights of the IG neurons learn by Hebbian learning.
  • the neural network of the present invention as illustrated by the observing system 12 is referred to as a "parametric avalanche" (PA) which is operable to store dynamic patterns and recall dynamic patterns simultaneously. It performs optimal compression of time varying data and incorporates a "moving target indicator" through its novelty factorization. It can track a non-linear dynamic system subliminally before accumulating sufficient likelihood to declare a detection.
  • the PA 12 therefore possesses an internal inertia dynamics of its own, with which the external dynamics are associated by means of the learning law. This internal dynamics is governed by the Quantum Neurodynamic (QND) theory.
  • QND Quantum Neurodynamic
  • in Figure 2 there is illustrated a block diagram of the PA 12 illustrating the novum 14 as an array of separate neurons and the IG 16 as an array of separate neurons.
  • the novum 14 receives the observation of the plant from the observed system 10 on an input vector 18.
  • the novum 14 outputs the novelty on an output vector 20 which is input to the IG 16.
  • the IG 16 generates the state estimates of the plant for input to the novum 14 on an output vector 22.
  • the observed system 10 is illustrated in the form of a focal plane wherein an object, illustrated as a rocket 26, traverses the focal plane in a predetermined path. This constitutes the observation. This observation is that of a dynamic system which possesses a system inertia.
  • the focal plane of the observed system 10 is mapped onto the novum 14.
  • the IG 16 makes a prediction as to the state of the observed system and this is input to the novum 14. If the prediction is correct, then the novelty output of the novum 14 will be zero.
  • the rocket 26 traverses its predetermined path and, if the internal system model in the PA 12 is correct, the state estimates on output vector 22 will maintain the novelty output of the novum 14 at a zero state.
  • Each of the neurons in the novum 14 is represented by a neuron 28.
  • Each neuron 28 receives a single input on a line 30 from the observed system 10 and a plurality of inputs on lines 32 from each of the neurons in the IG 16, which constitute state estimates from the IG 16.
  • the neuron 28 generates internal weighting factors for each of the inputs 32, as will be described hereinbelow.
  • the neuron 28 provides a single output to the IG 16, which output goes to each of the neurons in the IG 16.
  • the IG 16 is comprised of a geometrical lattice of neurons.
  • Each of the neurons in the IG 16 is illustrated by a neuron 34.
  • Each of the neurons 34 receives inputs from the neurons 28 in the novum 14 on input lines 36.
  • Each of the neurons 34 is an independent and asynchronous processor which can generate weighting factors for each of the input lines 36 to internally generate an activation level.
  • the weighting factors provide a stored template and the activation level yields the correlation between an input signal vector (i.e., the novum output) and this stored template. It indicates the degree to which the input signal looks like the stored template across the IG 16. If the activation level is zero, it indicates that the input vector does not match the stored template. However, if there is a match, the activation level is relatively high. This activation level is modified by a threshold field which will be described in more detail hereinbelow, which then generates an output that is input to each of the neurons 28 in novum 14.
  • Each of the neurons 34 in the IG 16 have a threshold associated therewith such that an overall threshold field is provided over the lattice in the IG 16.
  • the threshold levels in the IG threshold field are governed by non-linear lattice differential equations.
  • the "natural mode" of wave propagation in the threshold field favors compact, particle-like depressions in the field which are termed Threshold Field Depressions (TFD's).
  • TFD's Threshold Field Depressions
  • the threshold field operation will be described in more detail hereinbelow. However, it can be stated that one or more particle-like wave depressions are propagated across the geometric lattice of the IG 16. This propagation "tracks" the inertia of the plant, as illustrated in the focal plane of the observed system 10. It is important to note that the TFD operates over a number of neurons in the general area of the threshold field depression. Therefore, it is the interaction of the neighboring neurons that control the propagation of the TFD.
  • the actual wave propagation is illustrated by an output wave 38 on the surface of the IG 16.
  • the wave 38 has a peak and an indicated path which it travels. Although this path is noted as being arcuate in nature, it is a relatively complex behavior which will be described in more detail hereinbelow.
  • the IG 16 is comprised of an IG' layer 42 and a threshold plane 43.
  • the IG' layer 42 represents the geometrical lattice of neurons which receive on the input thereof the output from the novum 14 on the output vector 20, with each neuron in the IG' layer 42 providing an activation level in response thereto.
  • each of the neurons 34 generates a weighting factor for each of the input lines from the novum 14 in accordance with a learning law. This results in the generation of an activation level for each neuron 34 which indicates the degree to which the input signal looks like the stored template.
  • the stored template will have been learned in a previous operation from which the weighting factors were derived.
  • the activation across the geometrical lattice of the IG' layer 42 will appear as a distribution of activation levels 44.
  • the activation levels will be virtually zero.
  • This distribution, even at a zero value, has an inertia which is a result of the wavelike motion described above.
  • the distribution of the activation levels appears as illustrated in Figure 3a.
  • there is also a threshold depression 46 in the threshold plane 43.
  • elsewhere, the threshold level in the threshold plane 43 is at a high level.
  • there is thus a threshold field depression present which has an associated inertia. In Figure 3a, this is illustrated as a threshold field depression (TFD) 46. Since the output of the novum is zero, the estimation provided by the IG 16 is correct. Therefore, the TFD 46 will be directly aligned with respect to the distribution of the activation levels 44.
  • in Figure 3b a trajectory is illustrated which moves between a beginning point 48 at a time tₚ₋₁ to an end point 50 at a time tₚ₊₁.
  • a point 52 is traversed in the center thereof at a time tₚ.
  • This observation illustrates a sequence of events which occur in a temporal manner.
  • this is a dynamic system with inertia.
  • the system must examine the output of the IG 16 to determine if the state estimates are correct. This is done through the innovations process in the novum 14. At time tₚ, which represents the next slice in time, the IG 16 must predict what the status of the system will be at this time and these state estimates are again input to the novum 14 which compares them with the observation to determine if the prediction is correct. If so, the output of the novum 14 is zero. This continues on to the next slice in time at time tₚ₊₁ and so on. Initially, it is assumed that the system has learned the trajectory from point 48 to point 50, passing through point 52. In accordance with an important aspect of the present invention, it is the generation of the state estimates in a spatio-temporal manner that is accomplished by the IG 16.
  • the TFD in this example has been initiated and is propagating across the geometrical lattice of the IG 16 in a predetermined path.
  • This path corresponds to the learned path; that is, the neurons over which the TFD is propagated are the same neurons over which it was propagated during the learning process, which will be described hereinbelow.
  • a prediction is made only in the area of the TFD, which prediction either allows the TFD to be propagated along its continued path, or which prediction modifies the path of the TFD. This latter situation may occur, for example, when the identical path is being traversed, but it is being traversed at a slower rate. Therefore, the propagation of the TFD directly corresponds to the inertia of the system whereas the geometrical lattice of the IG corresponds to the probability that the point along the path has occurred at a specific time.
  • the TFD occurs at tₚ₋₁ to yield a TFD 54 which is illustrated as concentric circles in phantom lines.
  • the concentric circles basically represent the level of the TFD with the center illustrating the lowest value of the depression.
  • Underlying the TFD is the activation level in the IG' layer 42. These two layers are illustrated together in an overlapping manner.
  • the TFD propagates from an area 54 in the IG 16 at time tₚ₋₁ to an area 56 at time tₚ. This propagation continues from the area 56 to an area 58 at time tₚ₊₁.
  • the TFD traversing from area 54 to area 56 to area 58 would track the inertia or speed of the object traversing from point 48 to point 52 to point 50 in the observed system in order for the system to provide the appropriate estimation of the state of the system. It is important to note that the inertia of the system has been embodied in the propagation of the TFD, and this TFD in conjunction with the model encoded into the underlying neurons yields the state estimation. Because of the zero activation level in the IG' layer 42 (due to zero novelty output), the wave propagation is not altered.
  • the inertia of the system will be different from that represented by the propagation of the TFD in the threshold plane 43.
  • the novum 14 will output a prediction error which will raise the activation level in front of or behind the TFD to essentially modify the threshold depression and either increase or decrease its propagation rate. For example, suppose that the inertia of the observed system at a time prior to time tₚ₋₁ is equal to the corresponding inertia of the TFD in the threshold field 43.
  • the inertia of the observed system has decreased, thus requiring the inertia of the TFD propagation to decrease or slow down. Therefore, this would result in the activation levels just behind the area 54 increasing, thus causing the propagation rate of the TFD to slow down. This would continue until the output of the novum were zero. At this point, the activation level output by the neurons in the IG would be zero as a result of the zero output of the novum 14, due to the whitening effect thereof.
  • if the inertia of the system becomes a constant value, the inertia of the TFD will be forced to that inertia and the output of the novum 14 will be forced to a zero value.
  • the PA 12 is comprised in part of a number of networks which are integrated together. These will each be described in general and then the way that these networks and learning laws are integrated will be described.
  • the "gamma outstar avalanche" of Grossberg is a well-known neural network architecture which utilizes Hebbian learning and an outstar sequencing scheme to encode and play back spatio-temporal patterns.
  • a dual construction of the gamma outstar avalanche can be made which consists of instars from the pixel array falling on each of a sequence of neurons in a "timing chain". This is illustrated in Figure 4.
  • the pixel array is represented by an array 60 with an illuminated pattern thereon.
  • the output of the pixel array 60 is input to a chain of neurons 62, with a pulse 64 represented as traveling down the chain of neurons.
  • Implementation of this type of "instar" avalanche is not a simple task nor is it obvious that by itself it would have any utility.
  • the only difficulty in launching this instar avalanche is that it requires one to send a coherent, compact pulse of activation down the chain of neurons 62 which parameterize the time axis.
  • the instar avalanche could also be launched with some very simple estimations. However, these estimations become somewhat uncomputable when it is necessary to go to higher dimensional neural lattices with more than one pulse propagated therein.
  • each of the neurons in the neuron chain 62 is encoded with a compact representation of a pattern, and its neighbors encode the causal context in which that pattern occurred.
  • the coding is associatively accessible in that the spatial patterns are concentrated into the synapses of individual neurons in the neuron chain 62.
  • the network is comprised of a first layer 66 and a second layer 68.
  • the first layer is comprised of a plurality of neurons 70 and the second layer 68 is comprised of a plurality of neurons 72.
  • Each of the neurons 70 is labelled a₁-aₙ, with the a₁, the aₙ and the aᵢ neurons 70 being illustrated, the aᵢ neuron representing an intermediate neuron.
  • the neurons 72 in layer 68 are labelled b₁-bₙ, with the b₁, the bₙ and the bⱼ neurons 72 illustrated, the bⱼ neuron 72 being an intermediate neuron.
  • Each of the neurons in the first layer 66 receives an input signal from an input vector 74.
  • Each of the neurons in the first layer 66 also receives an input signal from each of the neurons 72 in the second layer 68, with an associated weighting value for each such input signal.
  • each of the neurons in the second layer 68 receives as an input, a signal from each of the neurons in the first layer 66 and associates a weighting value therewith.
  • Each of the neurons in the second layer 68 receives an input from each of the other neurons therein and associates an appropriate weighting factor therewith.
  • the learning objective for this type of network when utilized in prior systems is to build a compact (preferably a single neuron) code in the second layer 68 to represent a class or cluster of patterns that were presented to the first layer 66 in not-necessarily compact distributed form.
  • the recall objective of these networks is to reactivate a pattern of codes in the second layer 68 which identifies the class or set of classes of which the input pattern of the first layer 66 is a representative.
  • the output objective is itself a distributed pattern, but more often in the prior systems, it is to produce a "delta function". This delta function representation is a low entropy distribution of activations, i.e., one that is unlikely to appear by chance.
  • This low entropy distribution of activations amounts to an unequivocal declaration that the input pattern belongs to the class of patterns which the lone active neuron in the second layer 68 represents. Such a representation requires no further pattern processing to communicate its decision to the human user, although it is easy if desired to employ an outstar from the active neuron in the second layer 68 to another layer to generate a distributed picture for the human user. This is essentially what is accomplished in the Hecht-Nielsen counterpropagation network. This low entropy distribution of activations is essentially how prior systems operate.
  • the desired output is also a delta function.
  • this is accomplished by finding the neurons 72 in the second layer 68 with the strongest response to the input pattern and preventing the learning algorithm from applying to any other neuron 72, unless the strongest response is obtained from a neuron 72 which presents a "bad match" to the input pattern, in which case, the learning algorithm is allowed to apply only to some other single neuron 72 in the second layer 68.
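The prior-art winner-take-all scheme just described, finding the strongest-responding second-layer neuron and restricting learning to it, can be sketched as follows. The function name, learning rate and example vectors are illustrative assumptions, not taken from the patent.

```python
# Sketch of the prior-art "delta function" recall and learning restriction
# described above (illustrative): only the second-layer neuron with the
# strongest response to the input pattern is active, and only its weights
# are allowed to learn.
def winner_take_all(weights, pattern, lr=0.5):
    # Response of each second-layer neuron is its weighted sum of the input.
    responses = [sum(w * p for w, p in zip(row, pattern)) for row in weights]
    winner = max(range(len(responses)), key=lambda i: responses[i])
    # Move only the winner's weights toward the input pattern.
    weights[winner] = [w + lr * (p - w)
                       for w, p in zip(weights[winner], pattern)]
    return winner

weights = [[1.0, 0.0], [0.0, 1.0]]
w = winner_take_all(weights, [0.9, 0.1])
```

The PA departs from this scheme: instead of a hard competitive winner, learning is gated by the low-threshold region under the traveling TFD.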
  • the PA 12 of the present invention utilizes a learning algorithm that is supervised, but in a locally computable neural form.
  • the TFD is propagated as a wave across the IG 16.
  • the propagation of compact, coherent wave-particles over a discrete lattice was discovered by accident by Fermi, Pasta and Ulam in their study of the finite heat conductivity of solids.
  • the differential (in time) difference (in space) equation which they were studying is now called the Fermi-Pasta-Ulam (FPU) equation.
  • FPU Fermi-Pasta-Ulam
  • when the FPU equation is written as a continuum in the spatial coordinate, it is a form of the Korteweg-de Vries (KdV) equation, which has been known for some time to model the shallow water solitary waves of Russell.
  • NLS Non-Linear Schroedinger
  • the NLS equation is a complex wave equation
  • h Planck's constant
  • i is the imaginary unit
  • m is a scaling constant identified with the mass of a particle
  • f is a real valued function chosen to offset the dispersion of the wave
  • U is a real scalar field which may be identified with the refractive index of the propagation medium or with an externally applied force field.
  • the NLS equation will be solved by soliton-like wave particles such as the "Gaussons" described by Bialynicki-Birula [Annals of Physics, 100, pp. 62-93, 1976].
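The NLS equation whose symbols are listed above is not typeset in this extraction. The following is an editorial reconstruction of its standard form, consistent with the terms defined here (h, i, m, f, U); the logarithmic choice of f that yields Bialynicki-Birula's Gaussons is noted as a comment.

```latex
% Standard form of the non-linear Schroedinger (NLS) equation, with the
% symbols defined above (editorial reconstruction, not the patent's text):
i\hbar \frac{\partial \psi}{\partial t}
  = -\frac{\hbar^{2}}{2m}\nabla^{2}\psi
    + f\!\left(|\psi|^{2}\right)\psi
    + U(x,t)\,\psi
% The Gausson solutions of Bialynicki-Birula arise for the logarithmic
% nonlinearity f(|\psi|^2) = -b \ln |\psi|^2.
```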
  • the potential field U(x,t) is established by the activation levels L(x,t) of the neurons of the IG 16, which are in turn determined by the signal vector n(t) which is received from the novum 14. It is easy to show, and is described hereinbelow, that after the PA has been entrained, the prediction errors carried by n(t) generate precisely the right potential field U(x,t) whose gradient vector steers the TFD along the trajectory of the observed system.
  • TFDs also act as markers when considered in conjunction with the Kalman-Bucy filtering because they mark the location of a maximum likelihood state (or feature) estimate.
  • suppose that the neural lattice in the second layer 68 of the two-layer network of Figure 5 has randomly initialized synaptic weights, i.e., uniformly in the interval [-1, +1], and that at the current time the threshold field exhibits a single TFD at some location x' in the lattice.
  • the current image will elicit a random response in the activation levels in the second layer 68, but the output signals will be nearly zero everywhere except in the neighborhood of x' because there the threshold is so low that almost anything (except a strong antimatch to the current pattern) will produce a strong output. Therefore, a signal Hebbian learning law (i.e., one in which the weight change is proportional to the product of the presynaptic signal times the output signal of the postsynaptic neuron) will capture the input pattern and store it at, and to a lesser extent near, x'. Therefore, the input pattern is stored in the nearest neighbor manner. The learning is shut down everywhere else because the baseline threshold levels of the threshold field 43 squelches random responses in the quiet range.
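The signal Hebbian capture described above, weight change proportional to the presynaptic signal times the postsynaptic output, can be sketched in a few lines. Because the threshold field squelches output everywhere except near x', only neurons under the TFD produce output and hence only they learn. All names below are illustrative assumptions.

```python
# Sketch of the signal Hebbian learning law described above (illustrative):
# the weight change is proportional to the presynaptic signal times the
# output of the postsynaptic neuron, so learning is effectively confined
# to the low-threshold neighborhood of the TFD, where outputs are nonzero.
def hebbian_update(weights, presynaptic, postsynaptic_outputs, lr=0.1):
    return [[w + lr * y * x for w, x in zip(row, presynaptic)]
            for row, y in zip(weights, postsynaptic_outputs)]

presyn = [1.0, 0.5]                    # current input pattern
outputs = [0.9, 0.0]                   # only the neuron near x' fires
weights = [[0.0, 0.0], [0.0, 0.0]]
weights = hebbian_update(weights, presyn, outputs)
```

The firing neuron captures a scaled copy of the input pattern in its weights; the squelched neuron's weights are untouched, giving the nearest-neighbor storage described above.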
  • TFDs move like particles in a geodesic across the neural lattice of the IG 16, they can do much more than a simple parameterization of a time axis as in an ordinary avalanche. They serve as idealized internal models of the parametric trajectories of features in the observed scene.
  • TFD spatial delta function
  • a single TFD could only encode N levels of the parameter by its position in an N-neuron lattice, because a delta function disappears when its "peak" is between lattice points.
  • a distributed TFD which spans as many as six or seven lattice points at a time can represent a virtual continuum of positions between adjacent neurons.
  • the quantization of the interpolation should be on the order of 2m times the word-length quantization of the activation of the m neurons under a given TFD; i.e., if the TFD amplitude at each neuron is coded into an n-bit word and the TFD spans m neurons at a time, then the interpolation ability of the peak of the TFD between neurons should be on the order of m times n bits.
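  • As an illustration of the interpolation just described, the amplitude-weighted barycenter of a distributed TFD recovers a peak position between lattice points. The sketch below assumes a Gaussian-shaped TFD and uses hypothetical names; it places a 7-point TFD whose true peak lies at lattice coordinate 3.4 and recovers that sub-lattice position:

```python
import numpy as np

def tfd_barycenter(depths, positions):
    """Estimate the sub-lattice peak position of a distributed TFD.

    depths: depression depth of the threshold field at each lattice point
    positions: integer lattice coordinates of those points
    Returns the amplitude-weighted centroid, a continuous position.
    """
    depths = np.asarray(depths, dtype=float)
    positions = np.asarray(positions, dtype=float)
    return float(np.sum(depths * positions) / np.sum(depths))

# A bell-shaped TFD spanning 7 lattice points, peaked between nodes 3 and 4.
x = np.arange(7)
peak = 3.4                       # true (sub-lattice) peak position
depths = np.exp(-0.5 * ((x - peak) / 1.2) ** 2)
est = tfd_barycenter(depths, x)  # close to 3.4, far finer than one lattice step
```

The centroid is only one way to decode a distributed peak; the patent does not commit to a particular interpolation formula.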
  • Hebbian learning law There are two relatively simple learning laws: the Hebbian learning law and the contra-Hebbian learning law.
  • the Hebbian learning law is the archetype of almost all of the so-called unsupervised learning laws, yet it is almost never used in the original form because it fails to account for the temporal latencies which characterize causal processes, which include the classical conditioning behavior of animals.
  • Some variants account for the direction of time by convolving one or both of the presynaptic or postsynaptic signals with a one-sided distribution, such as in Klopf's Drive-Reinforcement model.
  • the contra-Hebbian learning law is a special case of the well-known delta rule.
  • the delta rule adjusts the synaptic weight in accordance with Widrow's stochastic gradient method to drive the actual output yj toward a "desired" output dj.
  • the formula for the delta rule is Δw_ij = η (d_j - y_j) x_i, where x_i is the presynaptic input and η is the learning rate.
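  • The two laws can be sketched side by side. This is a minimal illustration, not the patent's implementation; `delta_rule` and `contra_hebbian` are hypothetical names, and the contra-Hebbian case simply takes the desired output to be zero, as the novum does:

```python
import numpy as np

def delta_rule(w, x, d, eta=0.1):
    """One delta-rule step: drive the actual output y = w.x toward the desired d."""
    y = float(w @ x)
    return w + eta * (d - y) * x

def contra_hebbian(w, x, eta=0.1):
    """Contra-Hebbian law: the delta rule with desired output d = 0."""
    return delta_rule(w, x, d=0.0, eta=eta)

# Repeated presentations of the same pattern drive the output toward zero,
# leaving the weights a mirror image (negative projection) of the input.
rng = np.random.default_rng(0)
w = rng.normal(size=4)
x = np.array([1.0, -0.5, 0.25, 0.0])
for _ in range(100):
    w = contra_hebbian(w, x, eta=0.2)
# w @ x is now essentially zero: the pattern has been "cancelled".
```

Each step multiplies the output by (1 - η‖x‖²), so for η‖x‖² < 2 the output converges geometrically to the desired level.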
  • the fundamental objective of every filtering problem is to determine (i.e., to estimate) the conditional probability density P(x(t) | Y(t')) for the state or "feature" or "parameter" of the observed system at time t, given all the observations Y(t') ≡ {y(s) : s ≤ t'}.
  • the PA 12 is a two-layer architecture as described above, consisting of novum 14 and the IG 16, this architecture being somewhat similar to that illustrated in Figure 5 with the exception that the output of the second layer 68 corresponding to the IG 16 is also fed back to the first layer 66, which corresponds to the novum 14, as an additional input.
  • the novum 14 provides an approximation to the innovations process of the input time series.
  • the IG 16 stores the differential model of the observed system.
  • FIG. 6 there is illustrated a block diagram of the innovations approach to stochastic filtering, which has been taken from the paper by Kailath, T., "An Innovations Approach to Least-Squares Estimation, Part I: Linear Filtering in Additive White Noise", IEEE Trans. Automat. Contr., vol. AC-13, pp. 646-655, Dec. 1968. Superimposed on that block diagram is a partition showing which functions are performed by the novum 14 and which are performed by the IG 16.
  • Prior to Applicant's present invention, all known implementations of the Kalman filter were based on the well-known iterative formulation and refinements thereof. These previous systems are based on Gaussian statistics in either linear or linearized systems. The bulk of the computational burden is taken up by the matrix operations of the Kalman gain matrix (shown as the operator "K" in Figure 6).
  • the novum 14 receives the observation of the plant (i.e., the input) and the IG 16 generates the state estimates of the plant.
  • the "algorithm" of the PA 12 is quite different from that shown in Figure 6. It is based on the more general multi-stage Bayesian estimator described in Ho, Y.C. and Lee, R.C.K., "A Bayesian Approach to Problems in Stochastic Estimation and Control", IEEE Transactions on Automatic Control, Vol. AC-9, pp. 333-339, October 1964. To describe how the PA 12 operates, the procedure described in the Ho and Lee paper will be stepped through to show how the PA 12 accomplishes each step.
  • Step 1 Evaluate P(x_{k+1} | Z_k).
  • the new threshold field, T(x,t_{k+1}), represents the conditional likelihood function for the states (features) x given the prior likelihood function for those states.
  • Step 2 Evaluate P(z_{k+1} | x_{k+1}).
  • z_{k+1} is the new observation vector, which in Figure 7 above is denoted by y(t).
  • current activations L(x) are fed through the threshold field T(x,t_{k+1}) to produce the a-priori state estimates E(x | Z_k).
  • This IG output is then passed through the synapses of the novum 14, which implement the Ĥ matrix. Since Ĥ implements the internal model of the observation matrix H, the result is the estimate of the observation as predicted by the IG 16. (Note that the observation itself is treated as a likelihood function over the receiving transducers, so this estimate is itself a likelihood function.) The novum 14 then subtracts this estimate from the current signal to produce the innovations process.
  • Step 3 Evaluate P(x_{k+1}, z_{k+1} | Z_k).
  • F(t) is the state transition operator
  • K(t) is the Kalman gain
  • n(t) is the innovations process.
  • Step 4 Evaluate P(x_{k+1} | Z_{k+1}).
  • the novelty resulting from the new observation is passed through the updated IG synapses (along with any recurrent IG signals) to produce the new activation levels L(x) in the IG 16, and then L(x) is passed through the threshold field to obtain P(x | Z_{k+1}).
  • Step 5 Select the state(s) corresponding to the maximum likelihood estimate(s) .
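  • In the linear-Gaussian case the five steps above collapse to the familiar discrete Kalman filter. The scalar sketch below is an illustration of that correspondence, not the PA's network implementation; the function name and noise levels are chosen for the example:

```python
import numpy as np

def kalman_step(x_est, P, z, F=1.0, H=1.0, Q=0.01, R=0.1):
    """One cycle of the five-step estimator for a scalar linear-Gaussian system.

    Steps 1-2: predict the prior state and the expected observation.
    Steps 3-4: combine prior and observation into the posterior P(x | Z_{k+1}).
    Step 5: for a Gaussian posterior, the maximum-likelihood state is its mean.
    """
    # Step 1: prior P(x_{k+1} | Z_k)
    x_pred = F * x_est
    P_pred = F * P * F + Q
    # Step 2: predicted observation and the innovations process
    innovation = z - H * x_pred
    # Steps 3-4: Kalman gain and posterior update
    S = H * P_pred * H + R
    K = P_pred * H / S
    x_new = x_pred + K * innovation      # Step 5: posterior mean = ML estimate
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new

# Track a constant state x = 2.0 from noisy observations.
rng = np.random.default_rng(1)
x_est, P = 0.0, 1.0
for _ in range(200):
    z = 2.0 + rng.normal(scale=0.3)
    x_est, P = kalman_step(x_est, P, z, R=0.09)
```

The point of the PA architecture is to realize this recursion without the explicit gain-matrix algebra, via the threshold field and the innovations signal.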
  • the new encoding will not affect the current signal from the IG 16 to the novum 14, nor will it deflect the physical trajectory of the threshold wave particle. But it will deflect the apparent trajectory of the threshold wave particle the next time it crosses its physical trajectory, because the processing elements in its path now encode different features. This provides the improvement in the internal model of the dynamics of the observed system. It has no effect on the current estimation effort, but it will affect the convergence rate for the next observation of the same trajectory.
  • the observer model H is contained in the IG-to-novum synapses of the novum 14, and it is established with a very short time constant utilizing delta-rule learning.
  • the threshold level over the novum 14 is level and does not vary with time. That level defines the maximum information level for the novum 14 activations, i.e., the maximum entropy level. That level is the "desired output" for each processing element (pixel or neuron) of the novum 14, which, for simplicity of computation, has been chosen to equal zero.
  • the observation of the vector Y(t) is applied to the novum 14 through hard-wired, non-learning synapses, one component Y_j(t) to each novum neuron 28.
  • the feedback signals P(x:T) from the IG 16 enter through learnable synapses on the input vector 22.
  • P(x|T) will be a traveling delta function, so that at any one time, only one of the IG input lines of each pixel will have a signal on it.
  • the delta learning algorithm will mold that synaptic weight into a mirror image of the signal component, y_i(t), falling on the pixel at the same time.
  • the observation vector y(t) is supplied to the novum through hard-wired, non-learning synapses, one component y_j(t) to each novum neuron.
  • the feedback signals P(x|T) from the IG 16 enter through learnable synapses (i.e., input lines 36).
  • P(x|T) will be a traveling delta-function, so that at any one time only one of the IG input lines 36 on each neuron 28 will have a signal on it.
  • the delta-rule learning algorithm will mold that synaptic weight into a mirror-image of the signal component, y_i(t), falling on the pixel at the same time (the mirror being at the threshold level).
  • the synaptic weights will be a spatially recorded replica of the signal waveform, and only those synapses which were connected to IG neurons 34 activated by the traveling delta-function actually partake in the representation. Others are available for encoding observations of unrelated signal patterns.
  • the motion of the wave-particles which become associated with the dynamical model of the observed system may be achieved in basically two ways. This can be done by using an appropriate bell-shaped depression in an otherwise level threshold field and simply translating it in the desired direction by incrementing indices in the data array, or, if more than one such depression is to be moving simultaneously, by vectoring the data itself. This is appropriate for any implementation of the PA 12 on general-purpose computing equipment or special-purpose uniprocessor/vector processor equipment.
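  • The first method, translating a bell-shaped depression by incrementing indices in the data array, can be sketched in a few lines. The helper names are hypothetical, and `np.roll` stands in for the index increment on a one-dimensional lattice:

```python
import numpy as np

def make_depression(n, center, depth=1.0, width=2.0, baseline=1.0):
    """A bell-shaped depression in an otherwise level threshold field."""
    x = np.arange(n)
    return baseline - depth * np.exp(-0.5 * ((x - center) / width) ** 2)

def translate(field, steps=1):
    """Move the depression by incrementing indices in the data array."""
    return np.roll(field, steps)

T = make_depression(64, center=10)   # TFD centered at lattice point 10
T_moved = translate(T, steps=5)      # same TFD, now centered at point 15
```

For several simultaneously moving depressions, each would carry its own center and velocity and be re-rendered into the field each step, which is the "vectoring the data" alternative mentioned above.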
  • the nonlinear Schroedinger (NLS) equation is one route to the extension of the required dynamics to two and three dimensions. This equation describes the motion of photons and phonons in dispersive media, such as Langmuir waves in plasma.
  • the wave-particle solutions propagate in a medium that is characterized by a nonlinear refractive index which need not be spatially uniform and which, therefore, have the requisite properties for control and modulation of the trajectories of the wave-particles.
  • this refractive index can be tied directly to the response of the IG neurons 34 to the novum "error" signal to induce gradients in the refractive index field which will deflect the soliton trajectories toward smaller errors, as required by the Kalman-Bucy filter.
  • FIGs 8a and 8b there are illustrated schematic representations of the novum neuron 28 and the IG neuron 34, as described hereinabove with respect to Figure 2.
  • the novum neuron 28 in Figure 8a receives an input from other neurons in the novum lattice on the lines 32. Additionally, it receives an input from each of the points in the local plane on input lines 30. Weighting factors are associated with each of the input lines 32 and each of the input lines 30.
  • the external input vector Y(t) will be fanned out so that every novum neuron 28 receives every component of the vector Yj (t) .
  • novum neurons 28 are identified with integer indices (such as "i") and the IG neurons 34 will be identified with the vector indices (such as "x") corresponding to their coordinates in a geometric lattice.
  • the forward flow of signals from the novum 14 to the IG 16 implements an instar avalanche; that is, the novum 14 is a pixel array, while the threshold field of the IG 16 supports the propagation of TFDs on the two- or three-dimensional IG lattice.
  • the threshold function for the IG neuron 34 at a lattice position x is given by: σ_IG(a(x,t); T(x,t)) = [1 + exp{4m(T(x,t) - a(x,t))}]^(-1)
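  • Reading the formula above as a logistic sigmoid whose slope at a = T is m, the threshold function can be coded directly. This is a sketch under that reading; `sigma_ig` is a hypothetical name:

```python
import math

def sigma_ig(a, T, m=1.0):
    """IG threshold function: a sigmoid gated by the local threshold T(x,t).

    Output rises toward 1 as the activation a exceeds the threshold T;
    the parameter m sets the steepness (the slope at a = T is m).
    """
    return 1.0 / (1.0 + math.exp(4.0 * m * (T - a)))
```

Note how a lowered threshold (a TFD passing over the neuron) raises the output for the same activation, which is exactly the gating the TFD markers perform.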
  • the "current" position of T(x,t) is illustrated schematically in the interior of the IG neuron 34 by the coordinate axis 80.
  • the learning law of the IG 16 is the Hebbian law.
  • the output signals of the IG 16 represent the conditional probability density P(x | Y(t)).
  • the feedback flow of signals from the IG 16 to the novum 14 implements an outstar avalanche except that the learning law in the novum 14 is the contra-Hebbian law. Moreover, the "time" domain is factored through the two or three dimensional IGs 16 instead of being a simple one dimensional domain. The result is that when a "pattern" is recalled, it will be the negative of the observed pattern so that if the recall is executed at the same time that the original pattern is replayed into the sensor array, the output of the novum 14 is zero from all pixels.
  • the threshold function of the novum 14 is given by:
  • a "V" waveform 82 represents the output response of the novum neuron 28.
  • the novum 14 is comprised of an input plane 84 and an output plane 86.
  • the focal plane in the observation block 10 was considered to have a plurality of pixels, with each pixel represented by F(t,y), which represents a neuron y (reference numeral 83) in the novum input plane 84. Therefore, one input of this novum neuron is the vector output 22 from the IG 16, represented as u(x).
  • Each of the novum neurons y described above, has a plurality of weighting factors associated with each of the IG inputs.
  • the IG 16 is comprised of an activation plane 42 and a threshold plane 43.
  • the activation plane is referred to as the IG' 42.
  • the IG' 42 is comprised of an input plane 98 and an output plane 100, the output plane 100 comprising the output of the IG 16.
  • Each of the neurons 34 in the IG 16 is, as described above, arranged in a geometric lattice. A particular one of the neurons 34 utilized for this example is illustrated by a specific neuron 102 in the input plane 98.
  • the neuron 102 has associated therewith a template 104 wherein the weight values are stored. For each of the neurons 34 in the IG 16, there is one weight associated with each of the novum neurons 28. Therefore, the template 104 is illustrated with a single point 106 representing the weighting factor w_IG(x,y).
  • the dot product of the output vector of the novum, n(y), and the associated weighting vector w_IG(x,y) is taken to provide a template output 108.
  • the template output is then input to a threshold block 110.
  • the threshold block 110 receives on the other input thereof the threshold function T(x), which is derived from the input vector 20 at the output of the novum 14 by way of the wave equation described hereinabove. This yields the output u(x) from the IG 16.
  • the network behavior of the PA 12 is rather more complicated than is indicated above, because the learning laws are inseparable from the dynamics of the architecture. It is easiest to explain the step function response of the network.
  • the novum 14 is illuminated with an image (applied to the hard-wired synapses), and there is a single TFD moving along a geodesic in the IG 16.
  • each pixel of the novum receives a constant input signal y n which may be positive, negative, or zero (the latter case being uninteresting) .
  • That signal generates an activation of the same level, which is passed through the novum threshold function before being fanned out to the IG 16.
  • Almost all the IG neurons 34 have zero output due to the high threshold level and the synaptic weights of zero. But in the vicinity of the TFD, whose lattice barycenter is at X(t), the threshold is low enough that the output signals P(X(t)+δx | ·) are significant.
  • the neurons at and near X(0) absorb the input pattern (y_n).
  • the process is that of a feedback control mechanism for the internal model of the "plant", which consists of the soliton wave particles on the IG 16 lattice.
  • the input to this model is the observation, but only after it has been supplemented by the "regulator” in the novum 14.
  • the regulator output is constructed to stabilize the plant, which in this case means that the TFD's are moving along their geodesics with minimum disturbance, i.e., with their own inertia.
  • the effect is that the output of the novum approximates the time derivative of the step function input and therefore, the patterns stored in the synapses of the IG trajectory X(t) record that time derivative.
  • the patterns stored in the synapses of the novum 14 record the negative of that time derivative.
  • IG neurons encode spatial patterns
  • novum neurons encode temporal signals.
  • the time derivative of a step function is also the innovations process of the step function. This does not hold true for more general signals.
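  • The step-function case can be made concrete with a toy computation. Here the stored history simply supplies the previous sample as the prediction, so the residual is the discrete time derivative of the step; this stand-in ignores the network machinery and only illustrates the identity claimed above:

```python
import numpy as np

# A step input: zero, then a constant level after t = 50.
t = np.arange(100)
y = np.where(t < 50, 0.0, 1.0)

# A crude stand-in for the novum's prediction: the previous sample
# (the recorded history), so the residual is the discrete derivative.
prediction = np.concatenate(([0.0], y[:-1]))
innovations = y - prediction   # nonzero only at the step, t = 50
```

For a step, the single spike at the transition is both the derivative and the only "new information" in the signal, which is why the identity fails for richer signals.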
  • the current internal representation of that context is the state of the threshold field of the IG 16, i.e., the positions and velocity vectors of the TFD markers. Those TFD's sensitize or condition the IG 16 for the detection of certain states/features in the input.
  • the output of the novum 14 is filtered through the synaptic templates of all the IG neurons 34 which respond with activation levels representing the a-priori (or "context free") estimate of the content of that data. These activation levels are filtered through the nonuniform IG threshold field 43 to produce the IG output distribution, which represents the conditional likelihood for the presence of states/ features in the input.
  • the IG output distribution is treated as a collection of scalar coefficients for the formation of a linear combination of the patterns that are stored in spatially distributed form in the novum synapses.
  • This construction produces the projection of the current observation into the pattern subspace spanned by the prior observations. (It is actually a "fuzzy" projection, since the TFD's are not delta functions over the IG lattice.)
  • This construction also constitutes a decoding of the abstract IG estimate and is easily seen to correspond to the method of "model based vision" in that the features that are detected by the IG 16 are used to reconstruct a model in the novum 14 for comparison against the actual observation. The correspondence even reflects hierarchical model based schemes if one allows that a network of PA modules can achieve a nesting of more and more abstract feature sets as the distance of each module from the sensory array increases.
  • the reconstructed model is NOT an estimate of the current observation, but rather an estimate of the observation that will arrive after a time interval t_loop, which is the time required for the signal to propagate forward from novum to IG and back to the novum again (because that is how the recording occurred during learning).
  • the feature detection has been performed not as a one-shot pattern recognition operation on an isolated image (though it could clearly do this as a special case), but rather as an integrated historical estimate with the temporal gain of the Kalman-Bucy filter.
  • the estimated observation is constructed, it is compared against the actual observation to produce an error pattern, which is used both to correct the ongoing estimate (through the Kalman gain operation of the variable refractive index field) and to improve the IG coding for future reduction of the error covariance (through the action of the Hebbian learning law) .
  • This error pattern is the only output of the novum 14, and since it consists of the residual after projection of the observation onto the historical subspace, it is rightly called the "innovations process" of the observed stochastic process.
  • it is only a partial or suboptimal innovations process, because no single PA module has the capacity to store the entire fully differential history of its input. This is an important technicality: a true innovations process is a Brownian motion, useless for control or error correction. But a suboptimal innovations process can be so used, albeit in a computationally intractable form.
  • the PA 12 constructs the (estimated) probability density for the state of the system, it contains all the information necessary for achieving any desired control objective so long as the observability and controllability criteria are satisfied.
  • the output of the novum 14 is already adequate to control the evolution of the internal model of the plant, contained in the IG 16, so it only needs a gain transformation to allow it to control the plant itself.
  • the mechanism by which the PA can accomplish automatic target recognition and parameter estimation is described, along with how to train and operate such a system.
  • the training procedure and the result thereof is first described, which for this system is the equivalent of defining the feature set and building the feature detectors for a model based vision scheme.
  • the target recognition and tracking mechanism will be described, which detects the features in the input signal, uses them to build a representation of the estimated target and tracks the target while it moves.
  • Training the PA 12 requires first deciding on a set of basic features which are needed to distinguish targets of interest and selecting training data that is rich in those features and low in confusing or conflicting features.
  • the IG 16 subnetwork will be initialized with "grandmother cells", each of whose synaptic weights match one sample of one key feature of the targets. Some care will have to be given to the hierarchical primacy of these features, because the most primitive features belong in a PA module (which modules will be described hereinbelow) that is closest to the sensor array, while the most abstract features belong in a deeper PA module.
  • the PA 12 will be "imprinted" with training patterns which are rich in “grandmother” images.
  • imprinting occurs when at some time t' one of the key features first appears in the spatiotemporal input pattern. Prior to this time there are no TFD's moving in the IG lattice, because no IG neuron 34 has had a high enough activation level to interact with the threshold field, and therefore the threshold field is uniformly flat. But at time t' one of the grandmother cells reacts strongly to the passing image that it is coded for, and that reaction "plucks the threshold field" to initiate the first wave motion.
  • FIG 10 there is illustrated a top view of the IG 16 lattice.
  • two grandmother cells G1 and G2 located at x1 and x2, respectively, in the IG 16 lattice.
  • G2 always follows G1 by a time interval of δt1.
  • F always follows G2 after a time δt2.
  • the distance between G1 and G2 in the lattice is large enough that the time required for a threshold disturbance to travel between them is greater than δt1.
  • F is the consistent part of the pattern: the pattern at that time consists of F+R, where R is random with zero mean. (If any part of R consistently followed G1 and G2, it would have been included in F.) At that time also, the two threshold disturbances are concentrated in hyperspheres.
  • IG neurons 34 which have a lowered threshold due to being in one of the hyperspheres (and being under a TFD) will weakly absorb the synaptic code for the feature F. The random part of the sample patterns will be cancelled by the learning law. But IG neurons 34 in both of the hyperspheres will strongly absorb F, because their thresholds will be lower (hence their outputs will be stronger) due to the superposition of pairs of TFD's. Thus, the strength of the code that is learned at any location is determined by the confluence of consistent events in the data.
  • the error signal from the novum 14 will do two things: (1) It will warp the refractive index field (RIF) to further deflect the TFD's in the direction of smaller error, and (2) it will add (via the Hebbian learning law) a correction to the synaptic patterns in the wakes of the TFD's so that a subsequent repetition of this experiment will require less of a correction — i.e., it improves the model.
  • RIF refractive index field
  • the TFD's should have proceeded on course without deflection. If the post-collision trajectories are uncoded by another training, then they will eventually receive duplicates of the coding in the geodesic trajectories. Otherwise, an inconsistency develops which can only be resolved by extending the IG model into higher dimensions. In practice, this cannot be done by physically implementing the IG on a 4-dimensional lattice; but it can be accomplished by networking a second PA module to the first.
  • Recognition occurs when an input pattern drives one or more parametrized feature detectors over their thresholds.
  • the outputs of all feature units are sent back to the novum 14, where they are treated as scalar coefficients in the linear combination of one or more spatial patterns stored in the synapses of the novum neurons 28.
  • this linear combination constitutes the prediction of the next observation, and since there is a small time delay in constructing that prediction, it is active at the time when that next input arrives. There is no problem getting the timing right, because if the delay is not the same as when the patterns were learned in the first place then the resulting error will correct the time base as part of the Kalman gain transformation.
  • model-based vision techniques The observation is processed for matches to a number of abstract features which are coded into the IG neurons 34, and these feature responses are used to regenerate a model of the observation. In this case, however, the regenerated model is not a model of what was seen, but what will be seen a short time step into the future. By the time the model is regenerated, the next observation is received and ready for comparison and processing of the error vector.
  • the feature detectors When the feature detectors are stimulated by an observation, they tug on the threshold field and initiate the motion of a TFD marker along a trajectory determined by the location of the feature IG neuron 34 and the velocity vector (if any) associated with that feature. (How the velocity vector is determined by the gradient of the "refractive index" field associated with the activation pattern generated by the observation is described hereinbelow.)
  • This marker moves under its own inertia to generate continuing predictions. That is, the IG neurons 34 whose threshold are lowered by the traveling marker generate an output signal whose intensity is determined by the combination of the synaptic template matching and the depth of the threshold; and this signal fans out to the novum 14 to contribute its decoded template to the current prediction. This prediction is subtracted from the actual observation and the residual error is transmitted from the novum 14 back to the IG 16.
  • the refractive index of the threshold medium of the IG 16 is tied directly to the activation levels of the IG 16, the activation pattern in which the TFD marker is moving will warp the medium in just the right direction to deflect the marker into compliance with the observations. This, at least qualitatively, is what is required by the Kalman-Bucy filter.
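  • The prediction-and-residual cycle of the last several paragraphs can be sketched numerically. The stored novum patterns, the IG output distribution, and the observation below are all hypothetical stand-ins; the point is only that the IG outputs act as scalar coefficients on the stored patterns, and the novum output is the residual:

```python
import numpy as np

rng = np.random.default_rng(2)

# Three stored spatial patterns (novum synaptic templates) over a
# 16-pixel field, and the IG output distribution acting as coefficients.
patterns = rng.normal(size=(3, 16))       # rows: patterns stored in novum synapses
ig_output = np.array([0.1, 0.8, 0.1])     # TFD-weighted feature responses

# The prediction of the next observation is the linear combination ...
prediction = ig_output @ patterns

# ... and the novum output is the residual after comparison with the
# actual observation (here: the second pattern plus a little noise).
observation = patterns[1] + rng.normal(scale=0.01, size=16)
innovations = observation - prediction
```

Because the observation lies close to the subspace spanned by the stored patterns, the residual is much smaller than the observation itself; it is this small error that warps the refractive index field and drives further learning.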
  • the principal advantage of the continuous estimator over stationary DSP methods and model based vision is that the latter are "single-shot" decision methods. That is, they must do the best they can with the signal-to- noise ratio that is available in a single frame (which may be the result of the integration of a number of scans) of data.
  • the continuous estimator makes decisions based on the information contained in all the relevant history of observations of the target, thus achieving the gain of massive integration while automatically compensating for (or ignoring) constituent motions in the target image.
  • a neural network architecture based on the Parametric Avalanche Kalman Filter (PAKF) is operable to observe a complex system and issue control signals to cause that system to track a desired reference trajectory.
  • the design employs a PA module to estimate the state of the "plant” and to function as the servocompensator.
  • This PA module has an adaptive feedback gain matrix to transform its state estimates into the required control signal.
  • the adaptive gain matrix monitors the effect of the control signal on the tracking error and adjusts to minimize it, thus allowing appropriate controls to develop even in the event that an actuator motor is cross-wired.
  • the objective is to design a neural network solution to the problem of asymptotic tracking and disturbance rejection. This problem is discussed in the Chen reference.
  • the asymptotic tracking problem is a generalization of the regulator problem.
  • a control input to the plant is sought which will stabilize the plant, which usually means to drive it to the zero state.
  • a control is sought which will drive the plant toward a desired trajectory called the reference trajectory, which need not be either zero or constant.
  • the stable state of the PA consists of a (possibly empty) set of TFDs whose trajectories are geodesics on the IG 16 lattice, i.e., a set of TFDs which are not being accelerated by any warping of the refractive index field due to prediction errors or any other induced accelerations.
  • the servocompensator receives the difference between the reference signal and the output of the plant, and that difference modulates the state of the servocompensator in the same way that the sensor input modulates the state of the IG 16 in the Parametric Avalanche.
  • the S/C state is supplied to a gain matrix which transforms it into a control supplement to the state feedback stabilization control (if there is any) .
  • the state feedback can be supplied by a state estimator, so long as the plant is observable and controllable. This is called the Separation Theorem, as it allows the state estimation problem to be separated from the control problem.
  • FIG. 12 there is illustrated a block diagram of a control module 114 which employs two PA Kalman Filters (PAKFs) 116 and 118 for the state estimation functions required in the tracker described above.
  • PAKFs PA Kalman Filters
  • Each PAKF 116 and 118 is followed by a gain matrix 120 and 122, respectively, to transform the state estimates into control signals.
  • This diagram is functionally the same as the tracking system described above and shown in Figure 11. However, it is a bit deceptive for two reasons. One is that the PAKF which is used for asymptotic state estimation does not employ the available control input to the plant 10 to improve its estimates, as it should. The other is that the gain matrices 120 and 122 cannot be implemented as adaptive neural networks in the positions where they are shown.
  • PAKF 116, which performs state estimation for the feedback stabilization function, sees only the output of the plant 10.
  • FIG. 13, taken from Chapter 7 of the Chen reference, shows the design of a different asymptotic state estimator which receives both the output of the plant and the control input to the plant.
  • an estimator 124 is illustrated in feedback with the plant 10.
  • the difference between the designs of Figure 12 and Figure 13 is that in the design of the Kalman filter, the plant 10 is assumed to be "driven" by noise. That is, all deviations of the plant trajectory about the geodesic are determined by the equation,
  • the PAKF simply associates incoming patterns with IG neurons 34 in the path of a TFD, so it will build a model of the control during training. But since the control signal tends to be generated independently of the plant, any attempt to train the PAKF on observed trajectories that may be pushed one way at a certain point in one trial and another way at the same point in the next trial will encounter great difficulty in constructing a good model. If, however, that control signal could be made accessible to the PAKF through an appropriate mechanism, then it could serve as an "organizer" of the novelty during training and as an accelerator of estimation convergence during recall.
  • gain matrices 120 and 122 With respect to the gain matrices 120 and 122, their basic function is to move the eigenvalues of the composite system into the left half of the complex plane, and as far left as possible without saturating the controller. What is important here is that a gain is acceptable if the composite system is asymptotically stable (in the sense described above). One gain is better than another if it drives the system toward the reference signal faster.
  • FIG. 14 there is illustrated the preferred PACM design, in which the gain matrix has disappeared because it is implemented adaptively in the novum 14.
  • Each of the novum neurons has an output n(t) which is input to the IG 16 and also to the plant 10 on a line 126.
  • the plant output, y(t) is input to an error block 128 that subtracts the value of y(t) from an external input r(t) to provide the input value e(t) to the novum 14.
  • each synapse is adjusted according to the product of its input times the output of the neuron 28.
  • the i-th component of that error happens to be available, since it is input to every element of the novum 14.
  • Our learning objective is to minimize the absolute value of this error. We therefore adapt the gain matrix with the following learning law:
  • ΔK_ij = -η n_j(t) (d/dt)|e_i(t)| sgn(K_ij).
  • the learning law needs to be modified slightly to prevent the control from saturating. Saturation occurs when n_i(t) approaches +1 or -1, which are the upper and lower asymptotes of the novum sigmoid function. Pushing the K_ij further away from zero will then have negligible effect on the control signal and may cause numeric overflow of the synaptic weights.
  • a solution is to shut down the learning by linking the rate constant η to the magnitude of n_i(t).
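The learning law and its saturation guard can be sketched as a single scalar update. This is a hedged reconstruction: the finite-difference form of d/dt|e|, the base rate `eta`, and the factor (1 - n²) that shuts learning down near the sigmoid asymptotes are illustrative choices, not the patent's exact implementation:

```python
import math

def update_gain(K, e_prev, e, n, dt, eta=0.1):
    """One step of the gain-adaptation sketch.

    K      : current synaptic gain (scalar)
    e_prev : tracking error at the previous time step
    e      : tracking error now
    n      : novum output, inside (-1, +1)
    eta    : base learning-rate constant
    """
    # d/dt |e(t)| as a finite difference: positive while the error grows
    de_abs = (abs(e) - abs(e_prev)) / dt
    # link the effective rate to the magnitude of n so that learning
    # shuts down as n approaches the sigmoid asymptotes +1 or -1
    rate = eta * (1.0 - n * n)
    dK = -rate * n * de_abs * math.copysign(1.0, K)
    return K + dK
```

A growing error magnitude pushes |K| down (the network stops "trusting" its gain), while a shrinking error pushes it back up, matching the behavior described for the PACM below.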
  • a simple example will be stepped through in detail to illustrate the action of the PACM.
  • the observation is a measure of the elevation angle of the barrel of a rapid-fire gun mounted on a moving platform.
  • the reference signal is supplied to the operator and for this example it is assumed to be initially zero (horizontal). Since this is a one-dimensional example, we suppose that the novum 14 contains a single neuron, although the IG 16 may contain several hundred in a one-dimensional lattice.
  • the PA 12 has already been trained as described hereinabove to observe the measurement and to estimate the elevation angle through normal vehicular motions and during firing of the gun, but without any stabilization.
  • the neurons 34 of the IG 16 have come to be associated with a range of elevation angles. As described hereinabove, even though the IG neurons 34 are on a discrete lattice, the likelihood estimates can interpolate between them, so that the IG estimates are practically continuous.
  • the novum 14 output is then connected to the vertical actuator and a reference signal of zero degrees is supplied so that the input to the novum 14 is the actual elevation angle of the gun.
  • y(t) is the observed elevation angle (positive being above horizontal)
  • Y(t)) is the IG estimate of the tracking error, given the history of observations
  • n(t) is the output of the novum, which is also the control signal u(t) to the actuator
  • K(t) is the (scalar) value of the synaptic weight in the novum 14 which receives the input e(t) .
  • CASE 1 n(t) is connected to the actuator "properly", so that the control acceleration of the gun elevation is directly proportional to n(t) .
  • CASE 2 Same as Case 1 except the actuator is cross-wired, so that the vertical acceleration of the gun elevation is proportional to the negative of n(t), i.e., the gun moves down when n(t) is positive.
  • the learning law will react to any large magnitude error as if it did not "trust" its gain value(s). That is because such errors are always increasing in magnitude until the control action takes effect, so during that time the matrix is being adapted in the wrong direction. But if the control action is correct, the error will begin decreasing and the gain matrix will return to its trustworthy state.
  • the learning rate constant η controls the time constants for adaptation, so it is necessary to adjust η properly to allow for the latency in the feedback loop.
  • the FPU equations are anisotropic, so that an initial disturbance results in a positive pulse moving to the left and a negative pulse moving to the right. Both of the nonperiodic boundary conditions caused some degree of reflection of the waves from the ends of the lattice. With the periodic boundary condition, we could run the simulation until the left and right waves collided, and we confirmed that they would emerge from the collision with their shapes intact.
  • Figures 15 and 16 show the result of such an experiment.
  • the boundary conditions are "WRAP", which allows the initial disturbance over neurons number 1-5 to propagate to the right and to the left from waveform 129 at time t₀.
  • the leftward disturbance wraps around and re-enters the array from the right as waveform 131 slightly later in time.
  • the waveform 129 moves to the right to form waveform 133, and waveform 131 moves to the left to form waveform 135, at a later time.
  • the disturbance moving to the right is positive and the disturbance moving to the left is negative. That is the opposite of what happens with the usual sign on the nonlinear term of the FPU equation, but the sign was reversed since it is desired that positive waves move to the right.
  • In Figure 15 the experiment proceeds up to, but not beyond, the point of the collision of the right and left waves 133 and 135.
  • Figure 16 shows the two waves 133 and 135 at the time of collision with solid curve 130 and after the collision by dotted curve 132, illustrating one of the key properties of solitons, and demonstrating the viability of one of the most important elements of the Parametric Avalanche design.
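The soliton experiment can be reproduced in miniature. The sketch below integrates a generic FPU-type lattice with WRAP (periodic) ends; the step size, nonlinearity coefficient, and initial pulse shape are illustrative choices, not the parameters of the reported runs:

```python
import numpy as np

def fpu_step(u, v, dt=0.05, alpha=0.25):
    """One explicit step of an FPU-type lattice with periodic (WRAP) ends.

    u : displacements, v : velocities. The sign of the nonlinear term
    controls which direction positive pulses travel (the text notes it
    was reversed so that positive waves move to the right).
    """
    left = np.roll(u, 1)     # u[i-1], wrapping at the ends
    right = np.roll(u, -1)   # u[i+1], wrapping at the ends
    accel = (right - 2.0 * u + left) * (1.0 - alpha * (right - left))
    v = v + dt * accel
    u = u + dt * v
    return u, v

# initial disturbance over the first five lattice sites
N = 100
u = np.zeros(N)
v = np.zeros(N)
u[:5] = [0.0, 0.7, 1.0, 0.7, 0.0]

for _ in range(400):
    u, v = fpu_step(u, v)
```

After a few hundred steps the disturbance has split and propagated well away from its initial support, wrapping around the lattice as described for waveforms 129 through 135.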
  • Figures 17 and 18 illustrate the response of the synaptic weights in the IG 16 and the novum 14, respectively, to the onset and the offset of a boxcar function which was input to pixel number 5 (only) of the novum.
  • Figure 17 shows that the onset was recorded most strongly at IG neuron number 25, which is where the moving soliton was shortly after the signal came on (at IG neuron number 20) .
  • the offset was recorded at IG neuron number 88 (the signal was turned off when the soliton was at number 80) .
  • the graph in Figure 17 shows the values of synaptic weight number 5 on each of the 100 IG neurons, and clearly illustrates the way in which the novelty in the temporal signal is distributed spatially over the neurons of the IG. Note that the synaptic weights at the equilibrium point just before the offset of the boxcar do not reach the zero level, for reasons discussed in Section 2.3.1.
  • the graph in Figure 18 illustrates the values of the synaptic weights on each neuron 28 (pixel) of the novum 14.
  • a dashed curve 134 shows all the learnable synapses on pixel number 5 of the novum 14.
  • the other pixels, which received no input, are also shown to illustrate that even though their synapses were receiving input from the IG, their weights remained at their initial values near zero (random within the interval [-.01,+.01]) . Note that these weights are the negative of those in Figure 17, and that they are partially concentrated on a single neuron, rather than being spatially distributed as in the IG.
  • the graph in Figure 17 is almost exactly the activation level of the IG at the onset of the Mexican hat.
  • the vertical axis is rescaled and relabeled as the "activation" of the neuron whose number appears on the horizontal axis.
  • neuron number 88 responded with the largest positive output, since its template aligned with the negative of the Mexican hat function.
  • the PA has the advantage that the threshold field dynamics not only control the learning of the patterns, but also the recall gain in the presence of a consistent and reinforcing history of observations.
  • Referring now to FIG. 19, there is illustrated a block diagram of one of the neurons 34 in the IG 16 which, as described above, comprises a single processing element.
  • Each of these processing elements is arranged in an array of processing elements of, for example, an M x N array for a two-dimensional system or even a higher dimensional system.
  • Each of the processing elements in the array is represented by that illustrated in Figure 19.
  • the processing element in Figure 19 receives on one set of inputs 140 the signal vector inputs from the novum 14.
  • inputs 142 receive adjacent threshold levels from selected nodes, which in the preferred embodiment, are neighboring nodes. However, it should be understood that these threshold levels can be received from selected other nodes or neurons in the IG lattice.
  • Each of the processing elements is comprised of an IG processor 144 and a threshold level 146.
  • the inputs 140 are input to the IG processor 144 and the inputs 142 are input to the threshold level 146.
  • a memory 148 is provided which is interfaced through a bidirectional bus 150 to the processing element to communicate with both the IG processor 144 and the threshold level 146.
  • a block 152 represents the portion of the processing element that computes the activation levels. This resides in the IG plane 144.
  • In the threshold plane 146 there is a block 156 that is provided for updating the threshold values.
  • there is a clock 158 that operates the processing element of Figure 19. As described above, each of the processing elements is asynchronous and operates on its own clock, which is an important aspect of the Parametric Avalanche.
  • the output of the activation block 152 is input to a threshold function block 158 which determines the output on a line 160 as a function of the threshold generated by the threshold computation block 156. As described above, the threshold is low only in the vicinity of the TFD.
  • the output of block 158 comprises the output of the IG neuron or processing element of Figure 19 and this also is fed back to the input of the compute weight update block 154 to determine new weights.
  • the output of the activation block 152 is also input to the threshold level update block 156.
  • Each of the blocks 152, 154 and 156 interface with the memory 148 which is essentially a multiport memory. This is so because each of the processes operate independently; that is, the synaptic weights are fetched from memory 148 by the activation computation block 152 while they are being updated.
  • With respect to the activation computation block 152, when a signal is received on the lines 140 from the novum, the activation computation block 152 must fetch the weights from the memory 148 in order to compute the activation level. This is then input to the threshold block 158. At the same time, the threshold output levels from each of the interconnected (preferably adjacent) nodes are received to generate the threshold level at that processing element or node. This is utilized to set the threshold level input to the node and, thus, determine the output level. As described above, if the threshold level is low, this will produce an output even if the activation level is very low. However, if the threshold level is high but the activation level is very high, this may also produce an output.
  • The first situation is when the system is initialized and nothing is stored in the template, such that the system must learn. As a soliton wave moves across the processing element and the threshold level goes down, the signal level on the output will go up due to the mismatch and the lowered threshold, and the observed image will be stored in the template in memory 148. In the second situation, a TFD moves across a processing element, but the input signal mismatches with the memory, resulting in a high activation output from activation block 152. In this case, the mismatch either does not produce an output or it does produce an output. If it does not produce an output, the memory template will stay where it is; if it does produce an output, the memory template will be transformed so that it looks like whatever signal activated it.
  • the third situation is when the soliton wave passes across the particular processing element and the threshold gets lowered and the incoming signal actually matches the template. Since it actually matches the template, the output will be high but since the template already looks like what was originally stored it will only be reinforced but not changed in form.
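The three situations above can be condensed into one update rule per IG processing element. This is a hedged sketch: the inner-product activation, hard threshold, and template-toward-input learning rule are illustrative stand-ins for blocks 152, 158, and 154 of Figure 19, not the patent's exact computations:

```python
import numpy as np

def ig_neuron_step(weights, x, theta, lr=0.05):
    """One step of an IG processing element sketch.

    weights : this neuron's stored template (synapses on inputs 140)
    x       : current signal vector from the novum
    theta   : threshold at this node (low only near the TFD)
    Returns (new_weights, output).
    """
    activation = float(weights @ x)                  # activation block 152
    output = 1.0 if activation > theta else 0.0      # threshold comparison
    if output > 0.0:
        # weight-update block 154: move the template toward the input,
        # reinforcing a match and re-forming a mismatched template
        weights = weights + lr * (x - weights)
    return weights, output
```

With a depressed threshold (the TFD overhead), even a near-zero activation produces an output and the template learns; with a high threshold and no strong match, the template is left untouched.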
  • Referring now to FIG. 20, there is illustrated a block diagram of one of the neurons 28 in the novum 14 which, as described above, comprises a single processing element.
  • Each of these processing elements is arranged in an array of processing elements of, for example, an M x N array for a two dimensional system.
  • Each of the processing elements in the array is represented by that illustrated in Figure 20.
  • the processing element in Figure 20 receives on one set of inputs 170 the signal vector inputs from the output of the observation matrix 10.
  • the processing element of Figure 20 also receives on a second set of inputs 172 the outputs from the IG 16.
  • Each of the processing elements is comprised of a computational unit and a memory 174.
  • the memory 174 is interfaced with the computational unit through a bidirectional bus 176.
  • the computational unit computes the activation energy in a computational block 178.
  • the computational unit also computes the weight updates, as represented by computational block 180.
  • the weight update computational block 180 provides the learning portion of the novum. Both the block 178 and the block 180 receive the inputs from both the signal vector inputs and IG outputs.
  • the output of the activation computation block 178 is input to a threshold function block 182, which was described above and comprises a bipolar function.
  • the output of the threshold function block 182 is input to the weight update computation block 180 and also provides the novum output on the line 184.
  • a clock 186 is provided which operates the computational unit of the novum.
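A minimal sketch of one novum processing element follows. It assumes the IG feedback acts as a prediction that cancels the expected part of the signal (consistent with the earlier observation that the novum weights become the negative of the IG's); the tanh stands in for the bipolar function of block 182, and the anti-Hebbian rule is an illustrative form of the weight update in block 180:

```python
import numpy as np

def novum_neuron_step(w_ig, s, ig_out, lr=0.05):
    """One step of a novum processing element sketch.

    w_ig   : learnable synaptic weights on the IG feedback inputs 172
    s      : this pixel's component of the signal vector (inputs 170)
    ig_out : output vector of the IG neurons
    Returns (new_weights, output).
    """
    activation = s + float(w_ig @ ig_out)   # activation block 178
    n = np.tanh(activation)                 # bipolar threshold, block 182
    # block 180: anti-Hebbian move, so the IG prediction comes to
    # cancel the expected signal and the output reports only novelty
    w_ig = w_ig - lr * n * ig_out
    return w_ig, n
```

Repeatedly presenting the same signal with the same IG context drives the output toward zero: the pattern stops being novel.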
  • a cart 186 is provided with an upright member 188 disposed on the upper surface thereof and mounted on a pivot point 190 at the lower end thereof.
  • the upper end of the member 188 has a weight 192 disposed thereon.
  • the object of this problem is to maintain the member 188 in a vertical and upright direction.
  • the Parametric Avalanche in this example is comprised of 100 neurons in an IG 194 and a single novum neuron 196.
  • the novum neuron receives as inputs the outputs of the 100 neurons in the IG 194 and it also receives a single observation input, representing the angle from the vertical relative to the cart 186.
  • the angle theta is input to the negative input of summing block 198, the positive input of which is connected to a signal REFSIG.
  • the output is input to a block 200 which receives on the other input thereof the adaptive gain input.
  • the output of block 200 provides the observation input to the novum neuron 196.
  • the control input to the cart is a horizontal acceleration and is supplied by the network.
  • the output of the novum neuron 196 is input to each of the 100 neurons in the IG 194 on an output line 204.
  • the output of the novum neuron 196 is input through a gain scaling block 206 in the cart 186 to provide a control input.
  • the threshold field is represented by a moving TFD 202 which traverses the IG neurons from left to right.
  • This IG is a one dimensional IG. Since the novum output is a smooth approximation of the derivative of the input (when the input is entirely novel) , the novum serves as a "derivative controller".
  • In Figure 22 there is illustrated the time evolution of the angle from the vertical, and the corresponding novum output.
  • Figure 23 is similar to Figure 22 except that random disturbances have been injected into the system, as will be described hereinbelow.
  • a "virgin" Parametric Avalanche generates a control signal which maintains an inverted pendulum in its upright position through the application of a horizontal acceleration to the pivot point of the pendulum.
  • the simulation consists of a loop on the time variable, in each cycle of which the error (difference) between the actual angle of the inverted pendulum (in radians away from the vertical) and the desired angle of zero radians is multiplied by a gain coefficient and then supplied as the input to the novum neuron 196 of the Parametric Avalanche model.
  • the Parametric Avalanche model is then called upon to advance its state forward one increment of time in accordance with its Quantum Neurodynamics and its learning laws, and to present the output of the novum neuron 196 as the control signal (the horizontal acceleration) for the motion of the pivot point 190.
  • Since the novum output is restricted by the sigmoid threshold function to the range from -1 to +1, it is amplified by a constant positive factor before it reaches the pendulum model.
  • the update subroutine for the adaptive adjoint gain coefficient is called upon to adjust this gain for optimum effect of the control action.
  • the pendulum model is called upon to advance its state forward one increment of time by a simple double integration of the second order difference equations, thus producing the actual pendulum angle for use in the next cycle of the loop.
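The simulation loop above can be sketched as a toy program. Heavy caveats apply: the real controller is the novum neuron of the PA, whereas here its output is approximated by a sigmoid-squashed blend of the tracking error and its derivative (a pure derivative term cannot stabilize the upright equilibrium on its own, so a small proportional part is included); `kp`, `kd`, and the amplifier value are illustrative, not from the patent:

```python
import math

def simulate_pendulum(steps=3000, dt=0.01, kp=50.0, kd=5.0):
    """Toy inverted-pendulum loop with a novum-like controller."""
    g_over_l = 9.8
    amp = 100.0                       # constant positive amplifier after the sigmoid
    theta, omega = 0.05, 0.0          # initial tilt (radians) and angular rate
    peak = abs(theta)
    for _ in range(steps):
        e = kp * theta + kd * omega   # tracking-error terms (REFSIG = 0)
        n = math.tanh(e / amp)        # novum-like output, restricted to (-1, +1)
        u = amp * n                   # amplified control: horizontal acceleration
        # double integration of the second-order difference equation
        alpha = g_over_l * math.sin(theta) - u * math.cos(theta)
        omega += dt * alpha
        theta += dt * omega
        peak = max(peak, abs(theta))
    return theta, peak
```

With the control active the pendulum settles near vertical; with the control removed it falls over, which is the behavior the Parametric Avalanche is credited with preventing.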
  • the Quantum Neurodynamics of the PA model is implemented by the "naive" method rather than by actual integration of the nonlinear Schroedinger equation.
  • a threshold depression (TFD) 202 is propagated along the one-dimensional IG lattice 194 by a global type of algorithm which is capable of interpolating the TFD 202 into one hundred equally spaced positions between each pair of the IG neurons.
  • the TFD 202 moves with a velocity that is specified by the operator at run time. There is no provision for modulation of this velocity by the "warp drive" mechanism (warping of the refractive index field), because it is assumed that the synapses of each IG neuron are randomly initialized prior to the passage of the TFD 202. Of course, those synapses will become programmed as the TFD 202 passes by, in accordance with the learning law.
  • the output of the novum will simply be a noisy version of the input signal, the variance of the noise depending on the variance of the random initialization of the synaptic weights.
  • any velocity in excess of approximately 2 IG neurons per simulation second will produce this kind of output, which is useless for control of the pendulum.
  • At lower TFD velocities, the noise disappears and the phase angle of the novum output begins to lead the phase angle of the input, moving toward the time derivative of the input signal. This allows the novum output to function as a derivative controller of the pendulum.
  • the simulations shown in the graphs of Figures 22 and 23 were produced with a TFD velocity of 1 IG neuron per simulation second.
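The "naive" TFD propagation described above can be sketched as a depression gliding along the lattice at a fixed velocity, interpolated to sub-lattice positions. The Gaussian shape, depth, and width here are illustrative choices; only the constant-velocity motion and the WRAP interpolation come from the text:

```python
import numpy as np

def tfd_thresholds(t, n_neurons=100, velocity=1.0, base=1.0,
                   depth=0.9, width=2.0):
    """Threshold field at simulation time t: a depression (TFD) centered
    at a possibly fractional lattice position, wrapping at the ends."""
    center = (velocity * t) % n_neurons          # WRAP around the lattice
    i = np.arange(n_neurons)
    # periodic distance from each neuron to the TFD center
    d = np.minimum(np.abs(i - center), n_neurons - np.abs(i - center))
    return base - depth * np.exp(-(d / width) ** 2)
```

At a velocity of 1 IG neuron per simulation second the depression sits over neuron 50 at t = 50, and after wrapping it returns there at t = 150.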
  • the input data which produced the graph of Figure 22 is contained in Table 1 and the output data which produced the graph in Figure 23 is contained in Table 2 in the list bearing the same name as the graph, but with a ".DAT" extension.
  • the main difference between the two is that in Figure 23, a random disturbance was applied to the velocity of the pivot point, as indicated by the nonzero value of the RANGE parameter.
  • a control system is designed by passing a state estimate through a gain factor to "represent" the estimate properly to the input of the plant.
  • we adopt an adjoint representation of the gain by placing it between the output of the plant and the input of the state estimator (the PA) .
  • the reason for doing this is that the information needed for adaptive gain adjustments is not compatible with neurocomputing methods when the gain is placed between the PA output and the plant input; but it is compatible with neurocomputing methods when it is placed in the observation path.
  • the UPDATE subroutine adjusts the gain so as to favor opposite signs of the input to the novum (the tracking error) and the output of the novum (the control) .
  • Since that output approximates the time derivative of the tracking error, such a gain will result in any tracking error being driven to zero by the control signal.
  • If the gain happens to be negative, then the PA will be learning a reversed image of the observations, but that is what it takes to obtain one signal from the novum that means the same thing to both the plant and the PA's model of the plant in terms of controlling its trajectory.
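A minimal sketch of such an UPDATE rule, assuming the simplest anti-correlation form (the actual subroutine is not specified in this passage): the gain is nudged so that the novum's input (the tracking error) and output (the control) tend toward opposite signs, and it is free to become negative:

```python
def update_adjoint_gain(g, e, n, rate=0.01):
    """Adjust the adjoint gain g, placed in the observation path.

    e : input to the novum (the tracking error)
    n : output of the novum (the control)
    When e and n share a sign, g is pushed the other way, favoring
    the opposite-sign relationship the text describes.
    """
    return g - rate * e * n
```

This keeps the adaptation compatible with neurocomputing methods: the update needs only locally available quantities, which is the stated reason for moving the gain out of the PA-to-plant path.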

EP90909520A 1989-06-16 1990-06-15 Fortlaufende bayesschätzung mit einer neuronalen netzwerkarchitektur Withdrawn EP0433414A1 (de)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US36746889A 1989-06-16 1989-06-16
US367468 1994-12-30

Publications (1)

Publication Number Publication Date
EP0433414A1 true EP0433414A1 (de) 1991-06-26

Family

ID=23447304

Family Applications (1)

Application Number Title Priority Date Filing Date
EP90909520A Withdrawn EP0433414A1 (de) 1989-06-16 1990-06-15 Fortlaufende bayesschätzung mit einer neuronalen netzwerkarchitektur

Country Status (4)

Country Link
EP (1) EP0433414A1 (de)
JP (1) JPH04500738A (de)
AU (1) AU5835990A (de)
WO (1) WO1990016038A1 (de)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE4100500A1 (de) * 1991-01-10 1992-07-16 Bodenseewerk Geraetetech Signalverarbeitungsanordnung zur klassifizierung von objekten aufgrund der signale von sensoren
US6054710A (en) * 1997-12-18 2000-04-25 Cypress Semiconductor Corp. Method and apparatus for obtaining two- or three-dimensional information from scanning electron microscopy
JP5541578B2 (ja) 2010-09-14 2014-07-09 株式会社リコー 光走査装置および画像形成装置
WO2019018533A1 (en) * 2017-07-18 2019-01-24 Neubay Inc NEURO-BAYESIAN ARCHITECTURE FOR THE IMPLEMENTATION OF GENERAL ARTIFICIAL INTELLIGENCE
US11556794B2 (en) * 2017-08-31 2023-01-17 International Business Machines Corporation Facilitating neural networks
US11556343B2 (en) 2017-09-22 2023-01-17 International Business Machines Corporation Computational method for temporal pooling and correlation
US11138493B2 (en) 2017-12-22 2021-10-05 International Business Machines Corporation Approaching homeostasis in a binary neural network
WO2019203945A1 (en) * 2018-04-17 2019-10-24 Hrl Laboratories, Llc A neuronal network topology for computing conditional probabilities
CN111168680B (zh) * 2020-01-09 2022-11-15 中山大学 一种基于神经动力学方法的软体机器人控制方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO9016038A1 *

Also Published As

Publication number Publication date
JPH04500738A (ja) 1992-02-06
AU5835990A (en) 1991-01-08
WO1990016038A1 (en) 1990-12-27


Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE CH DE DK ES FR GB IT LI LU NL SE

17P Request for examination filed

Effective date: 19910627

RAP1 Party data changed (applicant data changed or rights of an application transferred)

Owner name: MARTINGALE RESEARCH CORPN.

17Q First examination report despatched

Effective date: 19940620

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN

18D Application deemed to be withdrawn

Effective date: 19950103