WO2005048185A1 - Transductive neuro fuzzy inference method for personalised modelling - Google Patents

Transductive neuro fuzzy inference method for personalised modelling

Info

Publication number
WO2005048185A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
output
input
predicting
rationalised
Application number
PCT/NZ2004/000290
Other languages
French (fr)
Inventor
Nikola Kirilov Kasabov
Qun Song
Original Assignee
Auckland University Of Technology
Application filed by Auckland University Of Technology filed Critical Auckland University Of Technology
Publication of WO2005048185A1 publication Critical patent/WO2005048185A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/043 Architecture, e.g. interconnection topology based on fuzzy logic, fuzzy membership or fuzzy inference, e.g. adaptive neuro-fuzzy inference systems [ANFIS]

Definitions

  • A genetic algorithm (GA) is run on a population of TNFI models for different values of the weights, over several generations.
  • As a fitness function, the root mean square error (RMSE) of a trained model on the training data or on validation data is used.
  • The GA runs over generations of populations, and standard operations are applied, such as binary encoding of the genes (the weights), a roulette-wheel selection criterion, and multi-point crossover.
  • The model with the least error is selected as the best one, and its chromosome, the vector of weights [q1, q2, ..., qp], defines the optimum normalisation ranges for the input variables.
  • TNFIP is applied on the Mackey-Glass (MG) time series prediction task.
  • The following GA parameter values are used: for each input variable, the values from 0.16 to 1 are mapped onto a 4-bit string; the number of individuals in a population is 12; the mutation rate is 0.001; the termination criterion (the maximum number of GA generations) is 100; and the root mean square error (RMSE) on the training data is used as the fitness function.
  • The resulting weight values, the training RMSE and the testing RMSE are shown in Table 2.
  • TNFIP results with the same parameters and the same training and testing data, but without optimisation of the normalisation weights, are also shown in Table 2.
  • Zadeh-Mamdani rules, e.g.:
    IF x1 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2
    and x2 has a membership degree to a Gaussian function with a center at 0.5 and a standard deviation of 0.12
    and x3 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02
    and x4 has a membership degree to a Gaussian function with a center at 0.87 and a standard deviation of 0.2
    THEN y has a membership degree of 0.78 to a Gaussian function with a center at 0.83 and a standard deviation of 0.18, with 10 vectors being in this cluster.
  • The TNFIP is used to develop an application-oriented methodology for medical decision support systems. It is presented here through a case example: personalised (individualised) modelling for the evaluation of the renal function of patients in a renal clinic. Real data is used, and the developed TNFIP system is currently considered for use in a clinical environment.
  • the accurate evaluation of renal function is fundamental to sound nephrology practice.
  • the early detection of renal impairment will allow for the institution of appropriate diagnostic and therapeutic measures, and potentially maximise preservation of intact nephrons.
  • GFR Glomerular filtration rate
  • Screat (serum creatinine) is a metabolite that is filtered by the kidneys, with the residual released into the blood. The creatinine level in the serum is determined by the rate at which it is removed by the kidney and is therefore also a measure of kidney function.
  • Surea (serum urea) is a substance produced in the liver as a means of disposing of ammonia from protein metabolism. It is filtered by the kidney and can be reabsorbed into the bloodstream.
  • Salb (serum albumin) is the protein with the highest concentration in plasma. Decreased serum albumin may result from kidney disease, which allows albumin to escape into the urine. Decreased albumin may also be explained by malnutrition or liver disease.
  • The TNFIP method is applied for the prediction of the GFR of each new patient, where a modified Takagi-Sugeno type of fuzzy rule is used: the output function is of the MDRD type, but the coefficients are calculated for every individual patient (personalised model) with the use of the TNFIP method.
  • Results produced by the MDRD formula (a global regression model), the MLP (a globally trained connectionist model) and DENFIS (a global model that is a set of adaptive local models), all inductive reasoning systems, along with the results produced by using the transductive WKNN method, are also listed in the table.
  • The leave-one-out training-simulating tests were performed for each model on the data set, and Table 3 lists the results, including RMSE (root mean square error), MAE (mean absolute error) and Rn (the number of rules, nodes or neurons) used in each model.
  • MLP: Number of neurons in the hidden layer: 10; Learning algorithm: Levenberg-Marquardt BP algorithm.
  • DENFIS: Dthr (distance threshold): 0.15; MofN: 6; Learning epochs: 60.
  • The TNFIP system gives the best accuracy of GFR evaluation for each individual patient and overall, for the whole data set. No optimisation of the variable normalisation weights was applied (the transformation functions were assumed constant).
  • Variant 2: Using weighted normalisation for the input variables.
  • A personalised model is derived for each patient, and the input variables are weighted according to their importance for the prediction of the output for this patient. This is illustrated in Table 6 for a randomly selected single patient (one sample from the GFR data).
  • Fuzzy rules (six rules) are extracted from this personalised model, as shown in Table 7, that best describe the prediction rules for the area of the problem space where the new input vector is located. Table 7: The fuzzy rules extracted from the personalised model for the person's data from Table 6.
  • TNFIC Transductive Neuro-Fuzzy Inference Method for Classification
  • the TNFIC classifies a data set into a number of classes in the n-dimensional input space.
  • the system is a multi-input multi-output type fuzzy inference system optimized by a steepest descent algorithm (BP).
  • The fuzzy rules that constitute the system can be of Zadeh-Mamdani type, of Takagi-Sugeno type, or any non-linear function.
  • Initial variable normalisation weighting functions f1, f2, ..., fp are defined for the input variables x1, x2, ..., xp to represent their importance for the new input vector xq.
  • Search in the training data set in the input space to find the Nq training examples that are closest to xq.
  • The value for Nq can be pre-defined based on experience, or optimised through the application of an optimisation procedure. Here we assume the former approach.
  • The l-th rule has the form of:
  • The steepest descent algorithm (BP) is then used to obtain the formulas for the optimisation of the parameters nql, δql, αlj, mlj and σlj of the TNFIC such that the value of E from Eq. (29) is minimised.
  • Input variables: j = 1, 2, ..., P;
  • Example 1: TNFIC for the Classification of the Iris Data Set with Optimisation of the Variable Normalisation Weights
  • TNFIC classification results with the same parameters and the same training and testing data, but without variable weight normalisation, are also shown in Table 8. From the results, we can see that the weight of the first variable is much smaller than the weights of the other variables. The weights show the importance of the variables, and the least important variables can be removed from the input for some particular new input vectors. The same experiment is repeated without the first (least important) input variable, and the results improve, as shown in Table 8. If another variable is removed, leaving a total of 2 input variables, the test error increases, so it can be assumed that for this particular ECMC model the optimum number of input variables is 3. For different new input vectors, the normalisation weights of the input variables will differ, pointing to the different importance of these variables for the classification (or prediction) of every new input vector located in a particular part of the problem space.
  • Zadeh-Mamdani rules, e.g.: IF x2 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x2 has an importance factor of 0.5), and x3 has a membership degree to a Gaussian function with a center at 0.5 and a standard deviation of 0.12 (x3 has an importance factor of 0.92), and x4 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02 (x4 has an importance factor of 1), THEN y has a membership degree of 0.78 of belonging to class 2, defined by a Gaussian function with a center at 0.83 and a standard deviation of 0.18, with 10 vectors being in this cluster.
  • The problem used here is mortgage approval, where applicants are defined by 8 input variables: character (0 - doubtful; 1 - good); total asset; equity; mortgage loan; budget surplus; gross income; debt servicing ratio; and term of loan; and one output variable: decision (0 - disapprove; 1 - approve).
  • TNFIC models are created in a leave-one-out mode for every single sample in the data set of 91 samples, and the results are presented in Table 9. The results are compared with those obtained with the use of ECF and MLP as inductive methods.
  • A personalised decision support model is developed for every applicant that best makes the decision for that applicant, and the input variables are weighted to show their importance in this applicant's personalised model. This is illustrated in Table 10, where two of the rules that comprise the personalised decision model are shown:
  • Table 10. A personalised decision model for a loan applicant, the input variables in this model weighted through TNFI, and two of the Zadeh-Mamdani fuzzy rules that comprise the model.
  • Input vector of a randomly selected person, comprising the expression values of the 11 genes selected by M. Shipp: [341 275 20 20 725 237 314 20 20 62.6 192]
  • Outcome correctly predicted by the personalised TNFIC model: Class 2 (died within 5 years).
  • Transductive reasoning is not practical in the case of large data sets D (e.g. millions of data samples) and large numbers of variables (e.g. thousands).
  • A large data set D* defined on a large number of variables V* is transformed into several clusters of data samples, each cluster defining its own list of variables, so that for every new vector xi only the data from the cluster to which xi belongs is used as the data set D (see the general TNFI method), on a much smaller number of variables.
  • the method consists of the following steps:

Abstract

The invention provides a prediction system (100) configured to predict an output from a test input. The system includes a data transformation module configured to transform at least some of the input data to obtain a set of normalised data (110). A rationalising module is configured to apply a rationalising function to the set of normalised data to obtain a set of rationalised input data (115) and rationalised expected output data. A clustering module is configured to apply a clustering function to the set of rationalised data (115). A set of rules (125) is maintained in computer memory. An optimiser module (130) is configured to apply a transformation to the rules (125) based at least partly on the results of the clustering function. A decoder (135) is configured to transform a series of outputs and an output layer (140) is configured to display a set of outputs.

Description

TRANSDUCTIVE NEURO FUZZY INFERENCE METHOD FOR PERSONALISED MODELLING
FIELD OF INVENTION
The invention relates to a Transductive Neuro Fuzzy Inference Method and Uses for Personalised Modelling.
BACKGROUND OF THE INVENTION
Most of the learning models and systems currently available in artificial intelligence are global models, covering the whole problem space. Such models include regression functions, multilayer perceptron neural networks, ANFIS neuro-fuzzy inference systems, and so on. These models are usually difficult to update on new data without using the old data previously used to derive them; they do not take into account the partial information contained in a new vector when they are recalled on it; and they do not take into account the importance of the different variables in different parts of the problem space. Overall, creating a global model (function) that is valid for the whole problem space is a difficult task, and in most cases it is not necessary to solve it. These global models are usually derived through inductive learning methods.
In some connectionist and also fuzzy inference systems, a global model is learned that consists of many local models (e.g., rules representing clusters of data) that collectively cover the whole space and are adjusted incrementally on new data. The output for a new vector is calculated based on the activation of one or several neighbouring local models (rules). Such systems are the evolving connectionist systems.
The inductive learning and inference approach is useful when a global model ("the big picture") of the problem is needed even in its very approximate form. In some models (e.g. ECOS) it is possible to apply incremental, on-line learning to adjust this model on new data and trace its evolution. Unfortunately, despite these advances, the inductive global learning process is less suitable where personalised modelling is required, for example, in clinical and medical applications of learning systems. This problem is particularly acute in determining individual outcomes, diagnoses and treatment regimes for medical decision support systems. In such applications, the focus is not on the global model, but on the individual patient. It is not so important what the global error of a global model over the whole problem space is, but rather - the accuracy of prediction for an individual patient.
Transductive inference systems and methods have been devised to address this problem by estimating a function at a single point of the search space only. For a new input vector that needs to be processed for a prognostic task, the closest examples that form a data subset are derived from an existing data set and/or generated from an existing model. A new model is dynamically created from this subset to approximate the function at the new input vector. An example of these models is the k-nearest neighbour method, where for every new input vector v, the closest k vectors from a training (existing) data set are chosen and the predicted output for the new vector is calculated based on the outputs of these k examples.
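As an illustration, this k-nearest neighbour baseline can be sketched as follows (Python; the function and parameter names are illustrative, not from the specification):

```python
import numpy as np

def knn_transductive_predict(X_train, y_train, x_new, k=5):
    """Predict the output for x_new from the outputs of its k closest
    training vectors (the simple transductive baseline described above)."""
    dists = np.linalg.norm(X_train - x_new, axis=1)  # distance to every stored example
    nearest = np.argsort(dists)[:k]                  # indices of the k closest vectors
    return y_train[nearest].mean()                   # predicted output for x_new
```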
Unfortunately, currently available transductive inference methods and systems have one or more of the following disadvantages:
(1) The models do not estimate the importance factors for the input variables in every part of the problem space where a new vector is located; it is known that for different groups of patients, for example old versus young, or male versus female, some input variables are more important than others, and if this is taken into account, a more accurate output value can be calculated for the new input vector.
(2) The models created are opaque, making it difficult to explain in terms of rules a predicted value (or a class) for an input vector, thus limiting the explanation power of the systems.
(3) The models fail to work well in a large dimensional space of many variables (for example, as required in bioinformatics applications where thousands of genes, proteins and/or clinical variables are present).
(4) The models assume in general that the data is mapped by a linear function. Such models fail to predict non-linear data sufficiently accurately; and
(5) The models are inflexible, making them unsuitable for applications where there are different numbers of variables or where the data is characterised by missing values.
It is therefore an object of the present invention to provide a transductive inference method and system with variable importance evaluation that overcome the above-mentioned difficulties, or that at least provide the public with a useful choice.
SUMMARY OF THE INVENTION
In a first aspect the present invention provides a method for predicting an output from a test input, comprising the steps of: receiving a set of input data having expected output data; applying a transformation to at least some of the input data to obtain a set of normalised data; applying a rationalising function to the set of normalised data to obtain a set of rationalised input data and rationalised expected output data; applying a clustering function to the set of rationalised data; applying a transformation to a set of rules based at least partly on the results of the clustering function; evaluating the accuracy of the rationalised expected output data; and generating output data.
In broad terms in another aspect the present invention comprises a prediction system configured to predict an output from a test input, the system comprising: a data transformation module configured to transform at least some of the input data to obtain a set of normalised data; a rationalising module configured to apply a rationalising function to the set of normalised data to obtain a set of rationalised input data and rationalised expected output data; a clustering module configured to apply a clustering function to the set of rationalised data; a set of rules maintained in computer memory; an optimiser module configured to apply a transformation to the rules based at least partly on the results of the clustering function; a decoder configured to transform a series of outputs; and an output layer configured to display a set of outputs. The present invention also extends in a still further aspect to a neural network module for carrying out the steps of the first aspect.
The present invention also provides a method for predicting an output from a test input x comprising at least the following steps: a) provide a set D of known global inputs and expected outputs of the used variables; b) select relevant input variables and initialise importance factors for the input variables for the new input vector x (local importance factors); c) perform a transformation of the problem space into a reduced and normalised variable space based on weighted variable normalisation that reflects the local importance of the input variables for the area of the new input vector, thus producing a normalised data set D'; d) rationalise the said set D' to produce a new rationalised local set D'x of inputs and expected outputs that are closely related to the test input x in the variable importance space; e) cluster and partition the rationalised set D'x in the weighted variable normalisation problem space using a clustering algorithm; f) set the initial parameters of the classification/prediction model based on fuzzy rules according to the results of the clustering and the partitioning in steps (c) and (d); g) optimise the variable normalisation weights and the parameters of the fuzzy rules for the model based on the accuracy measured for the data set; h) iterate the process from step c) above until a maximum accuracy model is produced; and i) calculate the output for the provided test input by applying fuzzy inference over the fuzzy rules.
The present invention also provides a system for predicting an output from a test input comprising at least the following: a) an input device for receiving a test input x; b) a storage and retrieval medium to provide a set D of known global and previously stored inputs and expected outputs; c) a variable selection and data transformation module that transforms data from the original space to a weighted variable normalisation space by performing a transformation of the problem space into a normalised variable space based on weighted variable normalisation that reflects the local importance of the input variables for the area of the new input vector, thus producing a normalised data set D'; d) a rationalising module that produces a new rationalised set D'x of inputs and expected outputs that are closely related to the test input x from set D' in the weighted variable normalisation space; e) a clustering module comprising a clustering algorithm for clustering and partitioning the rationalised set D'x; f) a fuzzy rule creation module that creates fuzzy rules and sets initial parameters for the fuzzy rules according to the results of clustering and partitioning in step e); g) an optimising module that optimises the variable normalising weights and the parameters of the fuzzy rules for the model based on the accuracy measured for the data set and feeds back models into the data transformation module of step c) until the accuracy of the model is adequate; h) a fuzzy decoder for calculating the output for the provided test input by applying fuzzy inference over the fuzzy rules for a weighted input vector based on the local importance factors; and i) an output device for outputting the output result from the fuzzy decoder and for analysing the fuzzy rules and clusters.
BRIEF DESCRIPTION OF THE DRAWING
Preferred forms of the method and system of the invention will now be described with reference to the accompanying figures in which:
Figure 1 shows a schematic diagram of the main components of an embodiment of the invention; and
Figure 2 shows case study data associated with the invention.
DETAILED DESCRIPTION OF THE INVENTION
Figure 1 shows a schematic diagram of preferred aspects of one form of the invention. The system 100 includes an input layer to which known inputs and expected outputs are passed.
In the above aspects, the set of known inputs and expected outputs are preferably stored in a database maintained in computer memory. In one embodiment, the outputs are membership classes. In a medical context, the membership class may, for example, be a class of patient permitting a classification of a patient or a condition.
In an alternative embodiment, the output is one or more data values or vectors. In medical applications, typical input values are clinical and/or gene patient-specific data and outputs are preferably selected from the group consisting of membership of a group of patients, a risk of an event, a clinical variable not easily directly measured e.g. glomerular filtration rate, a prognostic outcome, a diagnostic outcome, a suggested treatment or treatment regime. In business decision support applications, typical input variables and their corresponding output variables would be: records of applicants for a bank loan and the decision (grant, or don't grant the loan); a set of economic variables and a predicted economic state; a set of financial indexes and a predicted value for an index; etc.
Transformation could be performed on the input data to obtain a set of normalised data, shown as normalised inputs 110. The system could assign an importance factor, or set of importance factors to one or more of the inputs. In one embodiment, the importance factors are normalisation weights. In such a case, the variable importance space is weighted variable normalisation space. The inputs having normalisation weights exceeding a threshold importance factor are selected for subsequent transformation and rationalisation.
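A minimal sketch of this weighted normalisation and threshold-based variable selection, assuming simple min-max scaling of each variable into [0, q_j] (the function names and the threshold parameter are illustrative):

```python
import numpy as np

def weighted_normalise(X, q):
    """Linearly map each variable j into [0, q_j]: a larger normalisation
    weight q_j gives the variable a wider interval, hence more influence
    on distances in the weighted variable normalisation space.
    Assumes no column of X is constant."""
    X_min, X_max = X.min(axis=0), X.max(axis=0)
    return np.asarray(q) * (X - X_min) / (X_max - X_min)

def select_variables(q, threshold):
    """Keep only the inputs whose normalisation weight exceeds the threshold."""
    return np.where(np.asarray(q) > threshold)[0]
```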
Normalised variable space is also known as variable importance space. Variable normalisation weights are also known as local importance factors.
In one embodiment, importance factors in local space may be initialised by assuming they are equal to importance factors determined using a global model, such as an inductive fuzzy neural network. In an alternate embodiment, the importance factors are all initialised to be equal to 1. The model produced from the optimisation is a set of rules.
A rationalising function could then be applied to the normalised inputs 110 to obtain a set of rationalised data, shown as rationalised inputs 115.
The rationalisation process may be in the form of human selection of relevant data based on experience. Alternatively, the rationalisation process may be a computational method as suggested in this invention. A simple example of such a computational method is the k-nearest neighbour (k-NN) transductive decision method described in Mitchell, T.M. (1997). Machine Learning. McGraw-Hill, and Vapnik, V. (1998). Statistical Learning Theory. John Wiley & Sons, Inc.
The rationalised set of inputs and expected outputs that are closely related to the test input represents a sub-set of the original data set that is more closely related to the test input than the original set according to a measured similarity. If desired, the rationalised set can be selected after the initial data set is transformed into the variable importance space through the normalisation procedure and the distance is measured between the new input vector and all data vectors. One embodiment selects criteria for data selection by selecting a minimum number of N closest vectors to the new vector, but all samples that differ from each other by less than 10% are also included.
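One possible reading of this selection criterion, sketched in Python (n_closest and dup_tol are illustrative parameter names; the 10% rule is interpreted here as a distance tolerance in the normalised space):

```python
import numpy as np

def rationalise(X_norm, x_new, n_closest, dup_tol=0.10):
    """Select the n_closest vectors nearest to the new input in the variable
    importance space, then also admit any remaining sample lying within
    dup_tol of an already selected one."""
    d = np.linalg.norm(X_norm - x_new, axis=1)
    order = np.argsort(d)
    chosen = list(order[:n_closest])
    for idx in order[n_closest:]:
        if min(np.linalg.norm(X_norm[idx] - X_norm[c]) for c in chosen) < dup_tol:
            chosen.append(idx)
    return np.asarray(chosen)
```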
The rationalised set can also be selected based on the distance between the new input vector and the data samples in the original input space, after which importance factors are calculated with the use of methods from the art, such as correlation analysis (for prediction models) and signal-to-noise ratio (for classification models). The rationalised set is then transformed into a set in the new space, the variable importance space.
A clustering function could then be applied to create a set of clustered inputs 120. The clustering and partitioning of the rationalised set may be accomplished by using any suitable clustering algorithm in the art, but in a preferred embodiment, clustering and partitioning is performed in the local weighted variable normalisation problem space based on the importance factors. The currently preferred algorithm is ECM, described in Kasabov, N. and Song, Q. (2002). "DENFIS: Dynamic evolving neural-fuzzy inference system and its application for time-series prediction." IEEE Trans. on Fuzzy Systems 10(2): 144-154, which is hereby incorporated by reference.
The system also maintains a set of rules, shown as fuzzy rules 105. The process of creating fuzzy rules may be undertaken separately from the clustering and partitioning or may be undertaken in the same process. The currently preferred ECM algorithm provides the requisite process as part of the partitioning and clustering process.
The system includes an optimiser 130 that is configured to apply a transformation, for example an optimising transformation, to the rules based at least partly on the clustering function described above. The parameters of the fuzzy rules may be optimised by any objective evaluation method that determines the fitness of the data. The currently preferred method is to determine an overall error for the fitness of the data. Optimisation occurs by: (1) changing the weighted normalisation intervals (importance) for the input variables; and (2) changing the parameters of the fuzzy rules, using in both cases error-minimising algorithms in the art. The currently preferred algorithm for optimisation is a steepest descent algorithm. However, there are other well-established algorithms available in the art suitable for application in the practice of the present invention. Following optimisation, control could be passed back to obtain normalised inputs 110, rationalised inputs 115 and clustered inputs 120.
The output from the system and method is calculated using fuzzy decoding algorithms in the art specific for the fuzzy rules used. A decoder 135 applies a fuzzy decoding algorithm. Outputs are then passed to an output layer 140.
For the new input vector, a set of fuzzy rules that represent the rationalised data set is presented along with their activation for the new input vector, thus providing explanation facilities and transparency of the solution.
Typically, the problem space transformation module based on weighted variable normalisation, the clustering module, the fuzzy rule creation module, the optimising module, and the fuzzy decoder form part of a computer implemented neural network 145 comprising an input transformation layer comprising one or more input nodes arranged to receive and normalise input data; a rule base layer comprising one or more rule nodes; an output layer comprising one or more output nodes; and an adaptive component arranged to aggregate two or more selected rule nodes in the rule base layer based on the input data.
The system and the method are preferably dynamic multi-input multi-output neural-fuzzy inference systems and methods, respectively, with a local generalization, in which a fuzzy inference engine is used, for example the Zadeh-Mamdani engine described in Zadeh, L.A. (1965). "Fuzzy Sets." Information and Control 8: 338-353, or the Takagi-Sugeno engine described in Takagi, T. and Sugeno, M. (1985). "Fuzzy identification of systems and its applications to modeling and control." IEEE Trans. on Systems, Man, and Cybernetics 15: 116-132. The local generalization means that in a sub-space of the whole problem space (a local area) a model is created that performs generalization in this area. Gaussian fuzzy membership functions may be applied in each fuzzy rule, for both the antecedent and the consequent parts or for the antecedent part only. A BP (back-propagation) learning algorithm may be used for optimizing the parameters of the fuzzy membership functions. However, other learning algorithms may be employed. An additional learning function may be derived for use in the model.
The distance between vectors x and y is preferably measured in the weighted variable normalisation space as the normalised Euclidean distance, defined as follows:

$$\left\| x - y \right\| = \frac{1}{q} \sqrt{\sum_{i=1}^{q} \left( x_i - y_i \right)^2} \qquad (4)$$

where x, y ∈ Rq.
For example, the distance between two data samples x = (0.3, 0.7) and z = (0.2, 0.4) (all variables being in the range [0, 1]) in the original data space is 0.1581, due to the large difference in the values of variable x2. If the variable importance factors (normalisation weights) are q1 = 0.9 and q2 = 0.3 for the two variables x1 and x2 respectively, then the transformed vectors are x' = (0.27, 0.21) and z' = (0.18, 0.12), so the distance between the two vectors in the variable importance space, applying the same formula (4), is now 0.06, because the difference between the values for x2 is no longer so important (importance 0.3, compared to the importance 0.9 of variable x1). As a partial case, an importance weight for an input variable can be 0 or close to 0, which indicates that this variable is not selected in the local model used to calculate the output value for the new input vector.
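The worked example above can be checked with a short sketch of formula (4) (illustrative code, not from the specification):

```python
import numpy as np

def weighted_distance(x, y, q):
    """Normalised Euclidean distance (4) between two samples after each
    variable is scaled by its importance weight q_j."""
    x, y, q = map(np.asarray, (x, y, q))
    return np.linalg.norm(q * x - q * y) / x.size

print(weighted_distance([0.3, 0.7], [0.2, 0.4], [1.0, 1.0]))  # ~0.1581 (original space)
print(weighted_distance([0.3, 0.7], [0.2, 0.4], [0.9, 0.3]))  # ~0.0636 (importance space)
```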
To partition the weighted variable normalisation space for creating fuzzy rules and obtaining their initial values, an ECM (Evolving Clustering Method) may be applied, such as that described in Kasabov, N. and Song, Q. (2002). "DENFIS: Dynamic evolving neural-fuzzy inference system and its application for time-series prediction." IEEE Trans. on Fuzzy Systems 10(2): 144-154, and the cluster centres and cluster radii are respectively taken as the initial values of the centres and the widths of the Gaussian membership functions. The data in a cluster may be used for creating a linear output function.
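A minimal sketch of this initialisation step, assuming ECM has already produced one centre vector and one scalar radius per cluster (the rule representation shown is an illustrative assumption):

```python
import numpy as np

def rules_from_clusters(centres, radii):
    """Seed one fuzzy rule per ECM cluster: each cluster centre becomes the
    vector of Gaussian centres m_l, and the cluster radius becomes the
    (shared) initial width sigma_l of the membership functions."""
    return [{"m": np.asarray(c), "sigma": max(float(r), 1e-3)}
            for c, r in zip(centres, radii)]
```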
The invention is described below by reference to generic methods and specific application methodologies and systems.
1. Transductive Neuro-Fuzzy Inference Method for Prediction (TNFIP): Training and Simulating Methods
We assume that the TNFIP (Transductive Neuro-Fuzzy Inference Method for Prediction) system is given an input vector xi. The following steps are implemented:
1. Define initial variable normalisation weighting functions f1, f2, ..., fp for the input variables x1, x2, ..., xp to represent their importance for the new input vector xi. In one implementation the initial values are calculated as f1 = f2 = ... = fp, i.e. all variables are of equal importance for the new input vector xi. In another implementation, the functions f1, f2, ..., fp are different linear weighted normalisation functions, so that fj(xj) = qj (xj - xjmin) / (xjmax - xjmin), j = 1, 2, ..., p, which is a linear normalisation of the variable j into the interval [0, qj]. The more important a variable is, the larger its normalisation interval will be and the more it will influence the distance measure between data samples in the transformed space.
2. Transform the initial problem space {x1, x2, ..., xp} into the weighted variable normalisation space {f1, f2, ..., fp}, where all data samples are transformed according to these functions (as a partial case, a function fj is equivalent to its weight qj). The functions (the weights) are subject to optimisation over iterations.
3. Search in the training data set in the transformed space to find the Ni training examples that are closest to xi. The value for Ni can be pre-defined based on experience, or optimised through the application of an optimisation procedure. Here we assume the former approach.
4. Calculate the distances dj, j = 1, 2, ..., Ni, between each of these data samples and xi.
5. Calculate the weights wj = 1 + (min(d) - dj), j = 1, 2, ..., Ni, where min(d) is the minimum value in the distance vector d = [d1, d2, ..., dNi].
6. Use ECM (or another clustering algorithm) to cluster and partition the input sub-space that consists of the Ni selected training examples.
7. Create fuzzy rules and set their initial parameter values according to the results of the ECM clustering procedure.
8. Optimise the parameters of the fuzzy rules following Eqs. (5)-(23).
9. Apply steps 2-6 above for a certain number of iterations (training epochs; the number can be either pre-defined or optimised), thus optimising the parameters of the fuzzy rules in the local model Mi based on the minimum least-square error.
10. Modify the transformation functions f1, f2, ..., fp to optimise them based on the minimum least-square error. Repeat steps 2 to 10 until an optimum set of functions and optimum model parameters are obtained.
11. Calculate the output value yi for the input vector xi by applying fuzzy inference over the set of fuzzy rules that constitute the local model Mi.
12. End of the procedure.
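Steps 3 to 5 of this procedure can be sketched as follows (illustrative Python; X_norm is assumed to hold the training data already mapped into the weighted variable normalisation space):

```python
import numpy as np

def neighbour_weights(X_norm, x_new, n_i=32):
    """Steps 3-5: find the N_i training examples closest to the new vector in
    the transformed space and weight each one as w_j = 1 + (min(d) - d_j),
    so the nearest example gets weight 1 and farther ones slightly less."""
    d = np.linalg.norm(X_norm - x_new, axis=1)
    nearest = np.argsort(d)[:n_i]   # indices of the N_i closest examples
    d_sel = d[nearest]
    return nearest, 1.0 + (d_sel.min() - d_sel)
```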
The objective function and the TNFIP training-simulation procedure are described below:
Consider the system having P inputs (x1, x2, ..., xP) and T outputs (y1, y2, ..., yT). Suppose that it has M fuzzy rules, defined initially through the ECM clustering procedure, and that the l-th rule has the form:

Rl: IF x1 is Fl1 and x2 is Fl2 and ... xP is FlP, THEN y1 is G1l, y2 is G2l, ..., yT is GTl; l = 1, 2, ..., M (Zadeh-Mamdani type) (5)

or:

Rl: IF x1 is Fl1 and x2 is Fl2 and ... xP is FlP, THEN y1 is n1l, y2 is n2l, ..., yT is nTl; l = 1, 2, ..., M (Takagi-Sugeno type) (6)

Here, the Flj are fuzzy sets defined by the following Gaussian type membership function:

$$F_{lj}(x_j) = \alpha_{lj} \exp\left( -\frac{(x_j - m_{lj})^2}{2\sigma_{lj}^2} \right) \qquad (7)$$

and the Gql are of a similar type to the Flj and are defined as:

$$G_{ql}(y_q) = \exp\left( -\frac{(y_q - n_{ql})^2}{2\delta_{ql}^2} \right) \qquad \text{(for Zadeh-Mamdani type)} \qquad (8)$$

or:

$$n_{ql} = b_{ql0} + b_{ql1}\, x_1 + \ldots + b_{qlP}\, x_P \qquad \text{(for Takagi-Sugeno type)} \qquad (9)$$
Using the Modified Centre Average defuzzification procedure, the output values of the system are calculated as follows:

$$f_q(x_i) = \frac{\sum_{l=1}^{M} \dfrac{n_{ql}}{\delta_{ql}^{2}} \prod_{j=1}^{P} \alpha_{lj} \exp\left(-\dfrac{(x_{ij} - m_{lj})^2}{2\sigma_{lj}^2}\right)}{\sum_{l=1}^{M} \dfrac{1}{\delta_{ql}^{2}} \prod_{j=1}^{P} \alpha_{lj} \exp\left(-\dfrac{(x_{ij} - m_{lj})^2}{2\sigma_{lj}^2}\right)} \qquad \text{(for Zadeh-Mamdani type)} \qquad (10)$$

or:

$$f_q(x_i) = \frac{\sum_{l=1}^{M} n_{ql} \prod_{j=1}^{P} \alpha_{lj} \exp\left(-\dfrac{(x_{ij} - m_{lj})^2}{2\sigma_{lj}^2}\right)}{\sum_{l=1}^{M} \prod_{j=1}^{P} \alpha_{lj} \exp\left(-\dfrac{(x_{ij} - m_{lj})^2}{2\sigma_{lj}^2}\right)} \qquad \text{(for Takagi-Sugeno type)} \qquad (11)$$

The TNFIP model minimises the following objective (error) function:

$$E = \frac{1}{2} \sum_{i=1}^{N} \sum_{q=1}^{T} w_i \left[ f_q(x_i) - y_{iq} \right]^2 \qquad (12)$$
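As reconstructed above, equation (10) can be sketched for a single output as follows (the rule dictionary layout and all names are illustrative assumptions):

```python
import numpy as np

def tnfip_output(x, rules):
    """Modified centre average defuzzification, eq. (10), for one output:
    each rule contributes its consequent centre n, weighted by 1/delta^2
    times the product of its antecedent Gaussian membership degrees."""
    num = den = 0.0
    for r in rules:
        phi = np.prod(r["alpha"] * np.exp(-(x - r["m"]) ** 2 / (2 * r["sigma"] ** 2)))
        num += r["n"] / r["delta"] ** 2 * phi
        den += 1.0 / r["delta"] ** 2 * phi
    return num / den

# Example: two rules over two normalised inputs.
rules = [dict(m=np.array([0.7, 0.5]), sigma=np.array([0.2, 0.12]),
              alpha=np.array([1.0, 1.0]), n=0.8, delta=0.18),
         dict(m=np.array([0.2, 0.3]), sigma=np.array([0.15, 0.1]),
              alpha=np.array([1.0, 1.0]), n=0.3, delta=0.2)]
print(tnfip_output(np.array([0.6, 0.45]), rules))
```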
The steepest descent algorithm (BP) is then used to obtain formulas (13)-(17) for the optimisation of the parameters nql, δql, mlj, αlj and σlj of the Zadeh-Mamdani type TNFIP such that the error function E from (12) is minimised; each formula has the generic gradient-descent form θ(k+1) = θ(k) - ηθ ∂E/∂θ. [Equations (13)-(17) were rendered as images in the source and are not individually recoverable.]

The steepest descent algorithm (BP) is also used to obtain formulas (18)-(23) for the optimisation of the parameters nql (i.e. the coefficients bql0, ..., bqlP), mlj, αlj and σlj of the Takagi-Sugeno type TNFIP such that the error function E from (12) is minimised. For example, the update for the constant term of the consequent function is:

$$b_{ql0}(k+1) = b_{ql0}(k) - \eta_b \sum_{i=1}^{N} w_i \,\Phi_l(x_i) \left[ f_q^{(k)}(x_i) - y_{iq} \right] \qquad (18)$$

where Φl(xi) denotes the normalised antecedent activation of rule l for sample xi. [Equations (19)-(23) were likewise rendered as images and are not individually recoverable.] Here ηn, ηδ, ηm, ηα and ησ are the learning rates for updating the parameters nql, δql, mlj, αlj and σlj respectively.
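Since the explicit update formulas are not reproduced here, the following sketch approximates one steepest descent step numerically; it stands in for, and is not identical to, the analytic updates (13)-(23) (all names are illustrative):

```python
import numpy as np

def descent_step(theta, error_fn, lr=0.01, eps=1e-6):
    """One steepest descent update theta <- theta - lr * dE/dtheta, with the
    gradient of the weighted error E approximated by finite differences."""
    grad = np.zeros_like(theta)
    e0 = error_fn(theta)
    for i in range(theta.size):
        probe = theta.copy()
        probe[i] += eps
        grad[i] = (error_fn(probe) - e0) / eps
    return theta - lr * grad
```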
In the TNFI training-simulating algorithm, the following indexes are used:
• Training data samples: i = 1, 2, ..., N;
• Input variables: j = 1, 2, ..., P;
• Output variables: q = 1, 2, ..., T;
• Fuzzy rules: l = 1, 2, ..., M;
• Training epochs: k = 1, 2, ...
Explanation rules can be extracted that apply to the new input vector x and explain the prognosis for this vector in the form of:
(1) Zadeh-Mamdani Rules, e.g.
IF x1 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x1 has an importance factor of 0.9) and x2 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x2 has an importance factor of 0.3) THEN y has a membership degree of 0.9 to a Gaussian function with a center at 0.8 and a standard deviation of 0.18, with 15 vectors being in this cluster.
(2) Takagi Sugeno rules, e.g.:
IF x1 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x1 has an importance factor of 0.9) and x2 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x2 has an importance factor of 0.3) THEN y is calculated as y = −0.17 + 0.73x1 + 0.58x2 (with 15 vectors being in this cluster).

2. TNFIP Methodology for Time Series Modelling and Prediction
The TNFIP modelling method is used here as part of a methodology for modelling and predicting future values of time series. The methodology is presented through a case study problem of building transductive models for the prediction of the Mackey-Glass (MG) time series data set, which has been used as a benchmark problem in the areas of neural networks, fuzzy systems and hybrid systems. This time series is created with the use of the MG time-delay differential equation defined below:

dx(t)/dt = 0.2 x(t − τ) / (1 + x^10(t − τ)) − 0.1 x(t) (24)
To obtain values at integer time points, the fourth-order Runge-Kutta method was used to find the numerical solution to the above MG equation. Here we assume that: the time step is 0.1; x(0) = 1.2; τ= 17; and x(t) = 0 for t < 0. The task is to predict the values x(t + 85) from input vectors [x(t — 18), x(t — 12), x(t - 6), x(t)] for any value of the time t. For the purpose of a comparative analysis, we also trained other connectionist models applied for inductive inference on the same task. These models are MLP and DENFIS.
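A numerical sketch of this integration is given below; holding the delayed term constant within each Runge-Kutta step is a simplification assumed here, and the function name is illustrative.

```python
import numpy as np

def mackey_glass(n_steps, dt=0.1, tau=17.0, x0=1.2):
    # Integrate dx/dt = 0.2*x(t-tau)/(1 + x(t-tau)**10) - 0.1*x(t) with a
    # fourth-order Runge-Kutta step; x(t) = 0 for t < 0 and x(0) = x0.
    delay = int(round(tau / dt))
    x = np.zeros(n_steps + 1)
    x[0] = x0

    def f(xt, x_lag):
        return 0.2 * x_lag / (1.0 + x_lag ** 10) - 0.1 * xt

    for i in range(n_steps):
        lag = x[i - delay] if i >= delay else 0.0
        k1 = f(x[i], lag)
        k2 = f(x[i] + 0.5 * dt * k1, lag)
        k3 = f(x[i] + 0.5 * dt * k2, lag)
        k4 = f(x[i] + dt * k3, lag)
        x[i + 1] = x[i] + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return x
```

Sampling x at integer values of t (every 10 steps for a step of 0.1) then yields the input vectors [x(t − 18), x(t − 12), x(t − 6), x(t)] and the targets x(t + 85) used in the experiments.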
Variant 1:
The following experiment was conducted: 200 data points, for t = 4001 to 4200, are extracted from the time series and used as training data; the following 200 data points, for t = 4201 to 4400, are used as simulating data, so that for each simulating data sample a local TNFIP model is created and tested on that sample. Figure 2 displays the target data 200, including training data 205 and simulating data 210. Table 1 lists the prediction results obtained by using the TNFIP method and two other popular methods, MLP (multilayer perceptron) and DENFIS (Dynamic Neuro-Fuzzy Inference System), in terms of RMSE (root mean square error) and MAE (mean absolute error) on the simulating data, as well as the number Rn of rules, rule nodes or neurons used in each model.
Table 1. Simulating results on MG data (no optimisation of the variable normalisation weights)
[Table 1 is rendered only as an image in the source document.]
The following parameter values were used in the models:
MLP: number of neurons in the hidden layer: 16; learning algorithm: Levenberg-Marquardt BP; learning epochs: 100.
DENFIS: Dthr (distance threshold): 0.15; MofN: 4; learning epochs: 60.
TNFIP: Ni: 32; Dthr: 0.20; learning epochs for weight and parameter optimisation for each input vector: 60.
The TNFIP transductive reasoning system performs better than the other, inductive reasoning models. This is a result of the fine-tuning of each local model in TNFIP for each simulated example, derived according to the TNFIP learning procedure. The finely tuned local models achieve better local generalisation.
In the example above, constant transformation functions were used, i.e. f1 = x1, f2 = x2, ..., fP = xP. In the example below we apply optimisation of the weighted normalisation functions with the use of a genetic algorithm (GA), following the steps below.

Variant 2:
1) A GA is run on a population of TNFI models with different values of the weights, over several generations. As a fitness function, the root mean square error (RMSE) of a trained model on the training data or on validation data is used. The GA runs over generations of populations, and standard operations are applied, such as binary encoding of the genes (weights); a roulette-wheel selection criterion; and a multi-point crossover operation.
2) The model with the least error is selected as the best one, and its chromosome (the vector of weights [q1, q2, ..., qP]) defines the optimum normalisation ranges for the input variables.
3) Variables with small weights are removed from the feature set, and the steps above are repeated to find the optimum, minimal set of variables for a particular problem and a particular TNFI model.

The above method is illustrated as follows. TNFIP is applied on the Mackey-Glass (MG) time series prediction task. The following GA parameter values are used: for each input variable, the values from 0.16 to 1 are mapped onto a 4-bit string; the number of individuals in a population is 12; the mutation rate is 0.001; the termination criterion (the maximum number of GA generations) is 100; and the root mean square error (RMSE) on the training data is used as the fitness function. The resulting weight values, the training RMSE and the testing RMSE are shown in Table 2. For comparison, TNFIP results with the same parameters, the same training data and testing data, but without optimisation of the normalisation weights, are also shown in Table 2 (a code sketch of this GA search follows the discussion of Table 2).
Table 2: Comparison between TNFIP without and with optimisation of the variable normalisation weights
[Table 2 is rendered only as an image in the source document.]
With the use of the method, better prediction results are obtained with a significantly smaller number of evolved rule nodes (clusters). This is because better clustering is achieved when different variables are normalised differently, with the normalisation reflecting their importance.
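A sketch of the GA weight search of steps 1 to 3, with the parameter values quoted above, is given below. The fitness_fn callback, which must train a TNFI model with the candidate normalisation weights and return its RMSE on the training data, is an assumption of the sketch and is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
BITS = 4                    # bits per weight, as in the MG experiment
LO, HI = 0.16, 1.0          # weight range mapped onto each bit string

def decode(chrom, n_vars):
    # Map each 4-bit group of the chromosome to a weight in [LO, HI].
    vals = chrom.reshape(n_vars, BITS) @ (2 ** np.arange(BITS)[::-1])
    return LO + (HI - LO) * vals / (2 ** BITS - 1)

def ga_weights(fitness_fn, n_vars, pop=12, gens=100, p_mut=0.001):
    # fitness_fn(weights) -> RMSE of a TNFI model trained with these
    # variable-normalisation weights (lower is better); assumed callback.
    P = rng.integers(0, 2, size=(pop, n_vars * BITS))
    for _ in range(gens):
        fit = np.array([fitness_fn(decode(c, n_vars)) for c in P])
        prob = fit.max() - fit + 1e-9            # roulette wheel: lower
        prob = prob / prob.sum()                 # RMSE, higher probability
        P = P[rng.choice(pop, size=pop, p=prob)]
        for i in range(0, pop - 1, 2):           # multi-point crossover
            a, b = np.sort(rng.choice(n_vars * BITS, 2, replace=False))
            P[i, a:b], P[i + 1, a:b] = P[i + 1, a:b].copy(), P[i, a:b].copy()
        P ^= rng.random(P.shape) < p_mut         # bit-flip mutation
    fit = np.array([fitness_fn(decode(c, n_vars)) for c in P])
    return decode(P[np.argmin(fit)], n_vars)     # best weight vector found
```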
Two types of rules can be extracted for a particular new input vector:
(1) Zadeh-Mamdani rules, e.g.:
IF x1 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x1 has an importance factor of 0.4) and x2 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x2 has an importance factor of 0.8) and x3 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02 (x3 has an importance factor of 0.28) and x4 has a membership degree to a Gaussian function with a center at 0.87 and standard deviation of 0.2 (x4 has an importance factor of 0.28) THEN y has a membership degree of 0.78 to a Gaussian function with a center at 0.83 and a standard deviation of 0.18, with 10 vectors being in this cluster.
(2) Takagi-Sugeno rules, e.g.:

IF x1 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x1 has an importance factor of 0.4) and x2 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x2 has an importance factor of 0.8) and x3 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02 (x3 has an importance factor of 0.28) and x4 has a membership degree to a Gaussian function with a center at 0.87 and standard deviation of 0.2 (x4 has an importance factor of 0.28) THEN y is calculated as y = −0.25 + 0.93x1 + 0.5x2 (with 10 vectors being in this cluster).
3. TNFIP Methodology for Personalised Medical Decision Support and Prognosis
Here, the TNFIP is used to develop an application-oriented methodology for medical decision support systems. It is presented through a case example: personalised (individualised) modelling for the evaluation of the renal function of patients in a renal clinic. Real data is used, and the developed TNFIP system is currently being considered for use in a clinical environment.
The accurate evaluation of renal function is fundamental to sound nephrology practice. The early detection of renal impairment will allow for the institution of appropriate diagnostic and therapeutic measures, and potentially maximise preservation of intact nephrons.
Glomerular filtration rate (GFR) is traditionally considered the best overall index of renal function in healthy and in diseased people. Most clinicians rely upon the clearance of creatinine (CrCl) as a convenient and inexpensive surrogate for GFR. CrCl can be determined either by timed urine collection or from serum creatinine using equations developed from regression analyses, such as the Cockcroft-Gault formula, but the accuracy of CrCl is limited by methodological imprecision and systematic bias.
Recently, the Modification of Diet in Renal Disease (MDRD) study group developed a new formula to evaluate the GFR more accurately. The formula uses six input variables: age, sex, race, Screat, Salb and Surea, and is defined as follows (the exponents below are reconstructed from the garbled source text):

GFR = 170 × Screat^−0.999 × Age^−0.176 × 0.762 (if female) × 1.180 (if race is black) × Surea^−0.170 × Salb^0.318 (25)
In the formula (25), Screat (serum creatinine) is expected to be filtered in the kidneys, with the residual released into the blood. The creatinine level in the serum is determined by the rate at which it is removed in the kidney and is therefore also a measure of kidney function. Surea (serum urea) is a substance produced in the liver as a means of disposing of ammonia from protein metabolism. It is filtered by the kidney and can be reabsorbed into the bloodstream. Salb (serum albumin) is the protein with the highest concentration in plasma. Decreased serum albumin may result from kidney disease, which allows albumin to escape into the urine. Decreased albumin may also be explained by malnutrition or liver disease.
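For illustration, equation (25) translates directly into code as below. The function name is illustrative, and the units of Screat, Surea and Salb are assumed to match those used when the MDRD formula was fitted, which the source does not restate.

```python
def mdrd_gfr(screat, age, surea, salb, female=False, black=False):
    # Six-variable MDRD estimate of GFR, transcribing equation (25).
    gfr = 170.0 * screat ** -0.999 * age ** -0.176 \
                * surea ** -0.170 * salb ** 0.318
    if female:
        gfr *= 0.762
    if black:
        gfr *= 1.180
    return gfr
```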
However, the formulae above, which constitute global and fixed models, can be misleading as to the presence and progression of renal disease. Here, the TNFIP method is applied for the prediction of the GFR of each new patient, where modified Takagi-Sugeno type fuzzy rules are used: the output function is of the MDRD type, but the coefficients are calculated for every individual patient (personalised model) with the use of the TNFIP method.
Variant 1:
Using the TNFIP on a small GFR data set (93 samples) collected in a hospital in New Zealand, we obtain more accurate results than the MDRD formula. The testing was done with the use of the leave-one-out cross-validation method over the set of 93 samples. The results are listed in Table 3.

Table 3. Comparison between the error of GFR evaluation with the use of the proposed TNFIP transductive reasoning method and the MDRD formula, the MLP, the DENFIS and the transductive weighted k-NN method (WKNN) (preliminary results)
[Table 3 is rendered only as images in the source document.]
For comparison, the results produced by the MDRD formula (a global regression model), the MLP (a globally trained connectionist model) and DENFIS (a global model that is a set of adaptive local models), all inductive reasoning systems, along with the results produced by using the transductive WKNN method, are also listed in the table. The leave-one-out training-simulating tests were performed for each model on the data set, and Table 3 lists the results, including RMSE (root mean square error), MAE (mean absolute error) and Rn (the number of rules, nodes or neurons) used in each model. In the different models, the following parameter values were used:
MLP: number of neurons in the hidden layer: 10; learning algorithm: Levenberg-Marquardt BP; learning epochs: 100.
DENFIS: Dthr (distance threshold): 0.15; MofN: 6; learning epochs: 60.
WKNN: N: 24.
TNFIP: Ni: 24; Dthr: 0.20; learning epochs: 60.
The TNFIP system gives the best accuracy of GFR evaluation for each individual patient and, overall, for the whole data set. No optimisation of the variable normalisation weights was applied (the transformation functions were assumed constant).
Variant 2: Using weighted normalisation for the input variables
The leave-one-out method is applied together with weighted normalisation of the input variables.

Table 4. Comparison of the results of the individual GFR prognosis when the TNFI method is used in its two variants: no weighting applied; and weighting of the input variables applied through a gradient-descent optimisation algorithm. The TNFIP with optimisation is superior to the TNFIP without optimisation.
[Table 4 is rendered only as an image in the source document.]
The average weighting (importance) factors for the variables across all samples, after the leave-one-out method is applied, are shown in Table 5.
Table 5. Average variable importance factors (variable normalisation weights) evaluated with the proposed method
[Table 5 is rendered only as an image in the source document.]
Through using the TNFIP method, a personalised model for each patient is derived, and the input variables are weighted according to their importance for the prediction of the output for this patient. This is illustrated in Table 6 for a randomly selected single patient (one sample from the GFR data).
Table 6. The input data, the weighted variables and the predicted GFR value obtained with the use of a personalised TNFIP model for a single patient.
[Table 6 is rendered only as an image in the source document.]
Fuzzy rules (six rules) are extracted from this personalised model, as shown in Table 7, that best describe the prediction rules for the area of the problem space where the new input vector is located.

Table 7. The fuzzy rules extracted from the personalised model for the person's data from Table 6.
[Table 7 is rendered only as images in the source document.]
4. TNFIC: Transductive Neuro-Fuzzy Inference Method for Classification
The TNFIC classifies a data set into a number of classes in the n-dimensional input space. The system is a multi-input, multi-output fuzzy inference system optimised by a steepest descent algorithm (BP). The fuzzy rules that constitute the system can be of Zadeh-Mamdani type, of Takagi-Sugeno type, or any non-linear function.
Suppose that TNFIC is given an input vector xq; the following steps are implemented:
1) Define initial variable normalisation weighting functions f1, f2, ..., fP for the input variables x1, x2, ..., xP to represent their importance for the new input vector xq. In one implementation, the initial values are calculated as f1 = x1, f2 = x2, ..., fP = xP, i.e. all variables are of equal importance for the new input vector xq. In another implementation, the functions f1, f2, ..., fP are linear weighted normalisation functions, so that fj(xj) = qj (xj − xj,min) / (xj,max − xj,min), j = 1, 2, ..., P, which is a linear normalisation of the variable xj into the interval [0, qj]. The more important a variable is, the larger its normalisation interval will be and the more it will influence the distance measure between data samples in the transformed space.
2) Transform the initial problem space {x1, x2, ..., xP} into the weighted variable normalisation space {f1, f2, ..., fP}, where all data samples are transformed according to these functions (as a partial case, a function fj is a constant equivalent to its weight qj). The functions (the weights) are subject to optimisation over iterations.
3) Search in the training data set in the input space to find the Nq training examples that are closest to xq. The value of Nq can be pre-defined based on experience, or optimised through the application of an optimisation procedure. Here we assume the former approach.
4) Calculate the distances di, i = 1, 2, ..., Nq, between each of these data samples and xq.
5) Calculate the distance weights wi = 1 − (di − min(d)), i = 1, 2, ..., Nq, where min(d) is the minimum value in the distance vector d = [d1, d2, ..., dNq].
6) Use ECM (other clustering algorithms can also be used) to cluster and partition the input sub-space that consists of the Nq selected training samples.
7) Create fuzzy rules and set their initial parameter values according to the results of the ECM clustering procedure.
8) Optimise the parameters of the fuzzy rules in the local model Mq following Equations (26)-(35).
9) Apply points 2-8 above for a number of iterations (training epochs), which can be either pre-defined or optimised, thus optimising the parameters of the fuzzy rules in the local model Mq based on the minimum least-square error.
10) Modify the transformation functions f1, f2, ..., fP to optimise them based on the minimum least-square error. Repeat points 2 to 10 until an optimum set of functions and optimum model parameters are obtained.
11) Calculate the output vector yq = [y1, y2, ..., yT] for the input vector xq by applying fuzzy inference over the set of fuzzy rules that constitute the local model Mq. If ys = max(yq), the input vector xq belongs to class s (a code sketch of this decision step follows the procedure).
12) End of the procedure.
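The decision in step 11 reduces to taking the class with the largest inferred output, as in the sketch below. The flat arrays of rule activations and consequent values are assumptions of this sketch, standing in for the full fuzzy inference over the local model Mq.

```python
import numpy as np

def classify(phi, rule_outputs):
    # phi: firing strengths of the M rules for x_q, shape [M];
    # rule_outputs: consequent values per rule and class, shape [M, T].
    y = phi @ rule_outputs / phi.sum()   # one inferred value per class
    return int(np.argmax(y)), y          # step 11: class s with max y_s
```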
The parameter optimisation procedure is described below:
Consider the system having P inputs, T outputs and fuzzy rules of the Zadeh-Mamdani type, defined initially through the ECM clustering procedure, where the l-th rule has the form:

Rl: IF x1 is Fl1 and x2 is Fl2 and ... and xP is FlP, THEN y1 is Gl1, ..., and yT is GlT.

Here, Flj are fuzzy sets defined by the following Gaussian type membership function (equation (26) is rendered as an image in the source; the form below is reconstructed from the parameter definitions):

μlj(xj) = alj exp(−(xj − mlj)² / (2σlj²)) (26)

and Gls are of a similar type as Flj and are defined as:

Gls(ys) = exp(−(ys − nls)² / (2δls²)) (27)
Using the Modified Centre Average defuzzification procedure, the output values of the system are calculated on an input vector xi = [x1, x2, ..., xP] as follows (equation (28) is rendered as an image in the source; the form below is reconstructed):

fs(xi) = [ Σ(l=1..M) (nls / δls²) Φl(xi) ] / [ Σ(l=1..M) (1 / δls²) Φl(xi) ] (28)

where Φl(xi) = Π(j=1..P) μlj(xij).
Suppose the TNFIC is given a training data pair [xi, yi]; the system minimises the following objective function (a weighted error function):

E = ½ wi Σ(s=1..T) [ fs(xi) − yis ]² (the weights wi are defined in step 5 of the TNFIC procedure) (29)
The steepest descent algorithm (BP) is then used to obtain the formulas for the optimisation of the parameters nls, δls, alj, mlj and σlj of the TNFIC such that the value of E from Eq. (29) is minimised:
[Equations (30)-(34), the steepest-descent update rules for nls, δls, alj, mlj and σlj, and equation (35), defining the rule activation Φs(xi), are rendered only as images in the source document.]
where ηn, ηδ, ηa, ηm and ησ are the learning rates for updating the parameters nls, δls, alj, mlj and σlj respectively.
In the TNFIC training-simulating algorithm, the following indexes are used:
• Training data samples: i = 1, 2, ..., N;
• Input variables: j = 1, 2, ..., P;
• Output variables: s = 1, 2, ..., T;
• Fuzzy rules: l = 1, 2, ..., M;
• Training epochs: k = 1, 2, ....
Example 1: TNFIC for the Classification of Iris data set with Optimisation of the Variable Normalisation Weights
In this section, the TNFIC with weighted variable normalisation using a genetic algorithm (GA) is applied to the Iris data for both classification and feature selection. As with the experiments in section 3, all experiments in this section are repeated 50 times with the same parameters and the results are averaged. 50% of the whole data set is randomly selected as training data and the other 50% as testing data. The initial weight intervals for the four normalised input variables are [0, 1] and are encoded in a 6-bit binary string. The following GA parameters are used for the weight optimisation: number of individuals in a population: 12; mutation rate: 0.005; termination criterion (the maximum number of GA generations): 50; fitness function: the number of created rule nodes. The resulting weight values and the number of errors on the testing data are shown in Table 8. For comparison, TNFIC classification results with the same parameters, the same training data and testing data, but without variable weight normalisation, are also shown in Table 8. From the results, we can see that the weight of the first variable is much smaller than the weights of the other variables. The weights show the importance of the variables, and the least important variables can be removed from the input for some particular new input vectors. The same experiment is repeated without the first (least important) input variable, and the results improve, as shown in Table 8. If another variable is removed, so that the total number of input variables is 2, the test error increases; it can therefore be assumed that for the particular TNFIC model the optimum number of input variables is 3 (a code sketch of this variable-removal loop follows Table 8). For different new input vectors, the normalisation weights of the input variables will be different, pointing to the different importance of these variables for the classification (or prediction) of every new input vector located in a particular part of the problem space.
Table 8: Comparison between TNFIC without and with optimisation of the variable normalisation weights
[Table 8 is rendered only as images in the source document.]
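The variable-removal loop described above can be sketched as a simple backward elimination. Both callbacks, ga_weights_fn (the GA weight optimisation) and score_fn (training a classifier and counting test errors), are assumptions of the sketch and are not part of the source.

```python
import numpy as np

def prune_variables(X, y, ga_weights_fn, score_fn):
    # Backward elimination: repeatedly drop the variable with the smallest
    # GA-optimised normalisation weight while the test error does not grow.
    # ga_weights_fn(X, y) -> one weight per remaining variable;
    # score_fn(X, y) -> number of test errors for a model on these columns.
    keep = list(range(X.shape[1]))
    best_err = score_fn(X[:, keep], y)
    while len(keep) > 1:
        w = ga_weights_fn(X[:, keep], y)
        trial = [v for i, v in enumerate(keep) if i != int(np.argmin(w))]
        err = score_fn(X[:, trial], y)
        if err > best_err:
            break                        # removing more variables hurts
        keep, best_err = trial, err
    return keep
```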
Rules can be extracted for each new input vector x that explain the decision for this vector, as illustrated below:
(1) Zadeh-Mamdani rules, e.g.:

IF x2 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x2 has an importance factor of 0.5) and x3 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x3 has an importance factor of 0.92) and x4 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02 (x4 has an importance factor of 1) THEN y has a membership degree of 0.78 to belong to class 2, defined by a Gaussian function with a center at 0.83 and a standard deviation of 0.18, with 10 vectors being in this cluster.

(2) Takagi-Sugeno rules, e.g.:

IF x2 has a membership degree of 0.68 to a Gaussian function with a center at 0.7 and a standard deviation of 0.2 (x2 has an importance factor of 0.5) and x3 has a membership degree to a Gaussian function with a center at 0.5 and standard deviation of 0.12 (x3 has an importance factor of 0.92) and x4 has a membership degree of 0.68 to a Gaussian function with a center at 0.14 and a standard deviation of 0.02 (x4 has an importance factor of 1) THEN y is calculated by the formula y = −0.3 + 0.15x2 − 0.4x3 + 0.5x4, with 10 vectors being in this cluster; if the calculated output value is greater than 1, then the new vector belongs to class 2.

5. TNFIC Methodology for Business Decision Making Systems

TNFIC is used here to develop a novel methodology for business decision support systems. The methodology is presented through a case example.
The problem used here is mortgage approval for applicants, defined by 8 input variables: character (0: doubtful; 1: good); total asset; equity; mortgage loan; budget surplus; gross income; debt servicing ratio; and term of loan, and one output variable: decision (0: disapprove; 1: approve).
TNFIC models are created in a leave-one-out mode for every single sample in the data set of 91 samples, and the results are presented in Table 9. The results are compared with those obtained with the use of ECF and MLP as inductive methods (a code sketch of the leave-one-out protocol follows Table 9).
Table 9. Comparative analysis of TNFIC, ECF and MLP on the business decision support case study data of mortgage approval
TNFIC: errors for class 1: 3; errors for class 2: 1; overall correct: 87 (95.6%)
ECF: errors for class 1: 4; errors for class 2: 2; overall correct: 85 (93.4%)
MLP: errors for class 1: 3; errors for class 2: 2; overall correct: 86 (94.5%)
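The leave-one-out protocol used for Table 9 (and for the GFR study in section 3) can be sketched as follows; the build_and_predict callback, wrapping the TNFIC procedure described above, is an assumption of the sketch.

```python
import numpy as np

def leave_one_out_accuracy(X, y, build_and_predict):
    # For every sample a personalised model is built from all the other
    # samples and tested on the held-out one.
    # build_and_predict(X_train, y_train, x_test) -> predicted class.
    hits = 0
    for i in range(len(X)):
        mask = np.arange(len(X)) != i
        hits += int(build_and_predict(X[mask], y[mask], X[i]) == y[i])
    return hits / len(X)
```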
Through using TNFIC, a personalised decision support model is developed for every applicant that best makes the decision for that applicant, and the input variables are also weighted, showing their importance within this applicant's personalised model. This is illustrated in Table 10, where two of the rules that comprise the personalised decision model are shown:
Table 10. A personalised decision model for a loan applicant: the input variables weighted through TNFI, and two of the Zadeh-Mamdani fuzzy rules that comprise the model. Input vector of applicant/sample 7: [0 0.07 0.108 0.075 0.236 0.109 0.16 0.32]
Output value: Class 0 (disapprove).
Weights for input variables: [0.99 0.97 0.98 0.99 1.0 0.98 0.97 0.99]
Number of selected training data: 24
Rule 1: if x1 is (Gaussian MF, center: −0.17, STD: 0.30) (importance 0.99), and x2 is (Gaussian MF, center: −0.00, STD: 0.30) (importance 0.97), and x3 is (Gaussian MF, center: −0.12, STD: 0.30) (importance 0.98), and x4 is (Gaussian MF, center: −0.05, STD: 0.21) (importance 0.99), and x5 is (Gaussian MF, center: −0.18, STD: 0.36) (importance 1.0), and x6 is (Gaussian MF, center: −0.05, STD: 0.48) (importance 0.98), and x7 is (Gaussian MF, center: 0.22, STD: 0.13) (importance 0.97), and x8 is (Gaussian MF, center: 0.61, STD: 0.29) (importance 0.99), then y is Class 0.

Rule 2: if x1 is (Gaussian MF, center: 0.71, STD: 0.30) (importance 0.99), and x2 is (Gaussian MF, center: −0.34, STD: 0.30) (importance 0.97), and x3 is (Gaussian MF, center: −0.07, STD: 0.30) (importance 0.98), and x4 is (Gaussian MF, center: 0.03, STD: 0.20) (importance 0.99), and x5 is (Gaussian MF, center: 0.69, STD: 0.36) (importance 1.0), and x6 is (Gaussian MF, center: 0.05, STD: 0.48) (importance 0.98), and x7 is (Gaussian MF, center: −0.01, STD: 0.12) (importance 0.97), and x8 is (Gaussian MF, center: 0.69, STD: 0.28) (importance 0.99), then y is Class 0.
6. TNFIC Methodology for Personalised Modelling and Decision Making for Medical Decision Support Systems
A methodology for personalised prognostic and classification systems is presented here through a case study example using personal gene expression data. As an example, we use public domain data on DLBCL Lymphoma cancer outcome prognosis based on gene expression data, published in Shipp, M.A., Ross, K.N., et al. (2002), "Diffuse large B-cell lymphoma outcome prediction by gene-expression profiling and supervised machine learning", Nature Medicine 8(1): 68-74. There are 7129 gene expression variables with 58 samples: 32 survivals (class 1) and 26 fatal outcomes (class 2), all traced over 5 years. After some pre-processing, 11 genes with high prognostic power were selected. The prognostic system presented in Shipp et al., based on these 11 variables and using inductive support vector machines and other techniques, resulted in 84.5% accuracy measured through the leave-one-out method. Here we suggest that by using a personalised TNFIC modelling methodology on the 11 variables, not only can a better prediction of survival be made, but a personal model can be evolved to explain the results and to be used for personalised treatment and personalised drug design (see Tables 11 and 12).
Table 11. Experimental Results of TNFIC, ECF and SVM on DLBCL Lymphoma data (Leave-one-out validation)
[Table 11 is rendered only as an image in the source document.]
Table 12. A personalised TNFIC model for the survival prediction of a randomly selected person from the data set of Shipp et al.
Input vector of a randomly selected person, comprising the expression of the 11 genes selected by Shipp et al.: [341 275 20 20 725 237 314 20 20 62.6 192]. Outcome correctly predicted by the personalised TNFIC model: Class 2 (died within 5 years).
Weights for input variables: [0.97 0.99 1.0 1.0 0.99 0.98 0.99 0.99 1.0 0.99 0.99]
Number of selected training data samples for the personalised model: 56
7. A Method for Preliminary Variable and Data Set Selection
Transductive reasoning is not practical in the case of large data sets D (e.g. millions of data samples) with a large number of variables (e.g. thousands). Here we propose that a large data set D*, given on a large set of variables V*, is transformed into several clusters of data samples, each cluster defining its own list of variables, so that for every new vector xi only the data from the cluster to which xi belongs is used as the data set D (see the general TNFI method), with a much smaller number of variables. The method consists of the following steps:
1. Starting from the whole data set D*, defined with a set V* of variables, cluster the data into m clusters C1, C2, ..., Cm using ECM or other clustering methods. Each cluster Ci contains a subset Di of mi samples.
2. For each cluster Ci, define a set of variables Vi as a subset of V* (starting from 1 variable) that results in a TNFI model for this cluster with the highest accuracy.
3. The set of m clusters, m data sets and m sets of variables will represent the initial data set D* in a more convenient format for TNFI modelling on any new data vectors, in the following manner: for every new data vector xi for which a model is created through the TNFI methods, the vector is first mapped into the clusters, and the cluster Ci to which the vector belongs is used to create the data set D = Di, with variables Vi, as a starting point of the TNFI methods.

The foregoing describes the invention including preferred forms thereof. Alterations and modifications as will be obvious to those skilled in the art are intended to be incorporated within the scope hereof, as defined by the accompanying claims.

Claims

CLAIMS:
1. A method for predicting an output from a test input comprising the steps of:
receiving a set of input data having expected output data;
applying a transformation to at least some of the input data to obtain a set of normalised data;
applying a rationalising function to the set of normalised data to obtain a set of rationalised input data and rationalised expected output data;
applying a clustering function to the set of rationalised data;
applying a transformation to a set of rules based at least partly on the results of the clustering function;
evaluating the accuracy of the rationalised expected output data; and
generating output data.
2. A method for predicting an output from a test input as claimed in claim 1 further comprising the step of selecting a subset of the set of input data.
3. A method for predicting an output from a test input as claimed in claim 2 further comprising the steps of assigning an importance factor to one or more members of the set of input data and selecting for the subset those members having an importance factor above a threshold importance factor.
4. A method for predicting an output from a test input as claimed in claim 3 wherein the importance factors assigned to respective members of the set of input data are calculated using an inductive fuzzy neural network.
5. A method for predicting an output from a test input as claimed in claim 3 wherein the same importance factor is assigned to members of the set of input data.
6. A method for predicting an output from a test input as claimed in claim 1 wherein the rationalising function comprises a transductive decision method.
7. A method for predicting an output from a test input as claimed in claim 3 wherein the clustering function is performed based at least partly on the importance factors assigned to one or more members of the set of input data.
8. A method for predicting an output from a test input as claimed in claim 3 further comprising the step of assigning a new importance factor to one or more members of the set of input data following the step of evaluating the accuracy of the rationalised expected output data.
9. A method for predicting an output from a test input as claimed in claim 3 further comprising the step of applying a further transformation to the set of rules following the step of evaluating the accuracy of the rationalised expected output data.
10. A method for predicting an output from a test input as claimed in claim 1 wherein the input data comprises clinical data.
11. A method for predicting an output from a test input as claimed in claim 1 wherein the input data comprises gene data.
12. A method for predicting an output from a test input as claimed in claim 1 wherein the input data comprises financial institution loan application data.
13. A method for predicting an output from a test input as claimed in claim 1 wherein the input data comprises economic data.
14. A prediction system configured to predict an output from a test input, the system comprising:
a data transformation module configured to transform at least some of the input data to obtain a set of normalised data;
a rationalising module configured to apply a rationalising function to the set of normalised data to obtain a set of rationalised input data and rationalised expected output data;
a clustering module configured to apply a clustering function to the set of rationalised data;
a set of rules maintained in computer memory;
an optimiser module configured to apply a transformation to the rules based at least partly on the results of the clustering function;
a decoder configured to transform a series of outputs; and
an output layer configured to display a set of outputs.
PCT/NZ2004/000290 2003-11-17 2004-11-17 Transductive neuro fuzzy inference method for personalised modelling WO2005048185A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
NZ52957003 2003-11-17
NZ529570 2003-11-17

Publications (1)

Publication Number Publication Date
WO2005048185A1 true WO2005048185A1 (en) 2005-05-26

Family

ID=34588198

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/NZ2004/000290 WO2005048185A1 (en) 2003-11-17 2004-11-17 Transductive neuro fuzzy inference method for personalised modelling

Country Status (1)

Country Link
WO (1) WO2005048185A1 (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2001078003A1 (en) * 2000-04-10 2001-10-18 University Of Otago Adaptive learning system and method
WO2003040949A1 (en) * 2001-11-07 2003-05-15 Biowulf Technologies, Llc Pre-processed Feature Ranking for a support Vector Machine

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KASABOV N. ET AL.: "Evolving connectionist systems", Retrieved from the Internet <URL:http://www.aut.ac.nz/reserach_showcase/research_activity_areas/kedri/books.shtml> *
KASABOV N. ET AL.: "Evolving fuzzy neural networks for supervised/unsupervised on-line, knowledge-based learning", IEEE TRANSACTIONS OF SYSTEMS, MAN AND CYBERNETICS, PART B - CYBERNETICS, vol. 3, no. 6, December 2001 (2001-12-01), Retrieved from the Internet <URL:http://www.aut.ac.nz/reserach_showcase/research_activity_areas/kedri/downloads/pdf/kas-smc-2001.pdf> *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2582341B1 (en) 2010-06-16 2016-04-20 Fred Bergman Healthcare Pty Ltd Method for analysing events from sensor data by optimization
CN103106535A (en) * 2013-02-21 2013-05-15 电子科技大学 Method for solving collaborative filtering recommendation data sparsity based on neural network
CN103106535B (en) * 2013-02-21 2015-05-13 电子科技大学 Method for solving collaborative filtering recommendation data sparsity based on neural network
WO2018137203A1 (en) * 2017-01-25 2018-08-02 深圳华大基因研究院 Method for determining population sample biological indicator set and predicting biological age and use thereof
US11062792B2 (en) 2017-07-18 2021-07-13 Analytics For Life Inc. Discovering genomes to use in machine learning techniques
US11139048B2 (en) 2017-07-18 2021-10-05 Analytics For Life Inc. Discovering novel features to use in machine learning techniques, such as machine learning techniques for diagnosing medical conditions
CN110991478A (en) * 2019-10-29 2020-04-10 西安建筑科技大学 Method for establishing thermal comfort model and method and system for setting user preference temperature
CN111898628A (en) * 2020-06-01 2020-11-06 淮阴工学院 Novel T-S fuzzy model identification method
CN111898628B (en) * 2020-06-01 2023-10-03 淮阴工学院 Novel T-S fuzzy model identification method
US20230196095A1 (en) * 2021-04-20 2023-06-22 Shanghaitech University Pure integer quantization method for lightweight neural network (lnn)
US11934954B2 (en) * 2021-04-20 2024-03-19 Shanghaitech University Pure integer quantization method for lightweight neural network (LNN)


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase