CN108875960A - Learning method and system for a temporal fuzzy cognitive map based on gradient descent - Google Patents
Learning method and system for a temporal fuzzy cognitive map based on gradient descent
- Publication number
- CN108875960A (application CN201810592605.2A)
- Authority
- CN
- China
- Prior art keywords
- function
- sample
- weight
- node
- moment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/02—Computing arrangements based on specific mathematical models using fuzzy logic
- G06N7/023—Learning or tuning the parameters of a fuzzy system
Abstract
The present invention relates to the field of time-series analysis and provides a learning method for temporal fuzzy cognitive maps, comprising: preprocessing and partitioning sample data to obtain a training sample set and a prediction sample set; initializing the weight functions with a time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set; optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function; selecting the weight functions that minimize the loss function to construct the optimal temporal fuzzy cognitive map network; and validating the optimal temporal fuzzy cognitive map with the prediction sample set. The present invention learns the tFCM model with batch gradient descent and initializes the weights with a time-lagged Pearson correlation coefficient, so that the search range falls within a statistically meaningful interval, reducing the possibility of falling into a local optimum.
Description
Technical field
The invention belongs to the field of time-series analysis and more particularly relates to a learning method and system for a temporal fuzzy cognitive map based on gradient descent.
Background art
A fuzzy cognitive map (FCM) is a directed weighted graph for knowledge representation and causal reasoning that has been widely applied in fields such as engineering management and medical decision support. However, the iterative reasoning of an FCM does not capture the temporal order within causal relationships. The temporal fuzzy cognitive map (tFCM) overcomes this defect of the FCM: it models the weight on each edge as a function over a discrete time domain, so it can fully express the mutual causal influences among factors in each time period.
At present, the edge weights of a tFCM are specified manually, which has two drawbacks: 1) manually estimating weights requires corresponding domain knowledge, FCM expertise, and the like, placing high demands on the modeler; 2) a tFCM is used to process time-series data, which changes dynamically, so the model needs the ability to learn; manually specified edge weights limit the adaptability of the tFCM to time-series data.
Methods exist for automatically computing and learning FCM edge weights, for example Hebbian-based learning algorithms and algorithms based on evolutionary ideas. However, existing learning methods are prone to falling into local optima and suffer from low learning accuracy.
Summary of the invention
The technical problem to be solved by the present invention is to provide a learning method and system for a temporal fuzzy cognitive map based on gradient descent, aiming to solve the problems that existing learning methods are prone to falling into local optima and have low learning accuracy.
The invention is realized as follows: a learning method for a temporal fuzzy cognitive map based on gradient descent, comprising:
Step A: obtaining initial sample data and preprocessing it to obtain sample data;
Step B: partitioning the sample data into a training sample set and a prediction sample set;
Step C: initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
Step D: optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
Step E: selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
Step F: validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, step A comprises:
Step A1: obtaining initial sample data from a preset data source;
Step A2: preprocessing the initial sample data, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment.
Further, step B comprises:
Step B1: applying a sliding-window method to the sample data to generate several sample groups of identical size;
Step B2: dividing the sample groups into a training sample set and a test sample set.
Further, in each sample group, every node at moment s has an edge connecting it with every node at the preceding s-1 moments, and step C comprises:
Step C1: initializing the weight function of each edge with the time-lagged Pearson correlation coefficient to obtain a set of weight functions, where the Pearson correlation coefficient is
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}},$$
X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n} denote variables, Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t', i and j index features, and X_j denotes the j-th feature of the data;
Step C2: constructing, according to the weight functions, a fully connected temporal fuzzy cognitive map network in each sample group of the training sample set.
Further, step D comprises:
Step D1: computing an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics;
Step D2: using the mean squared error as the loss function for the initial prediction:
for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$; then for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$; therefore the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$;
Step D3: optimizing the weights with batch gradient descent:
differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$; the weight functions are updated iteratively with increment $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations;
Step D4: optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtaining the weight functions after the iteration: if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than a preset threshold, all edges between node C_j and node C_i at every moment are deleted; the pruning operation changes with the learning rate γ and is performed only three times;
Step D5: performing the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, in step D1:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s.
The embodiment of the invention also provides a learning system for a temporal fuzzy cognitive map based on gradient descent, comprising:
a sample processing unit, for obtaining initial sample data and preprocessing it to obtain sample data;
a sample division unit, for partitioning the sample data into a training sample set and a prediction sample set;
a network construction unit, for initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit, for optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit, for selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
a network verification unit, for validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, the sample processing unit is specifically configured to obtain initial sample data from a preset data source and preprocess it, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment;
the sample division unit is specifically configured to apply a sliding-window method to the sample data to generate several sample groups of identical size and divide the sample groups into a training sample set and a test sample set.
Further, the network construction unit is specifically configured to:
first, initialize the weight function of each edge with the time-lagged Pearson correlation coefficient to obtain a set of weight functions, where the Pearson correlation coefficient is
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}},$$
X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n} denote variables, Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t', i and j index features, and X_j denotes the j-th feature of the data;
finally, construct, according to the weight functions, a fully connected temporal fuzzy cognitive map network in each sample group of the training sample set.
The function optimization unit is specifically configured to:
first, compute an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, use the mean squared error as the loss function for the initial prediction: for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$; for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$; and the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$;
then, optimize the weights with batch gradient descent: differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$, and the weight functions are updated iteratively with increment $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations;
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtain the weight functions after the iteration: if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than a preset threshold, all edges between node C_j and node C_i at every moment are deleted; the pruning operation changes with the learning rate γ and is performed only three times;
finally, perform the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, the step in which the function optimization unit computes the initial prediction with the sigmoid function also includes:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s.
Compared with the prior art, the beneficial effects of the present invention are: the embodiment of the invention preprocesses the obtained sample data, partitions it into a training sample set and a test sample set, initializes the weight functions with the time-lagged Pearson correlation coefficient, optimizes the initialized weight functions with batch gradient descent for a preset number of iterations, measures the error of the weight functions after each optimization with a loss function, selects the weight functions that minimize the loss function to construct the optimal temporal fuzzy cognitive map network, and finally validates the optimal temporal fuzzy cognitive map with the prediction sample set. The embodiment learns the tFCM model with batch gradient descent and initializes the weights with the time-lagged Pearson correlation coefficient, so that the search range falls within a statistically meaningful interval, reducing the possibility of falling into a local optimum.
Brief description of the drawings
Fig. 1 is a flowchart of a learning method for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention;
Fig. 2 is a structural schematic diagram of a learning system for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Fig. 1 shows a learning method for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention, comprising:
S101: obtaining initial sample data and preprocessing it to obtain sample data;
S102: partitioning the sample data into a training sample set and a prediction sample set;
S103: initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
S104: optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
S105: selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
S106: validating the optimal temporal fuzzy cognitive map with the prediction sample set.
The embodiment of the present invention is suited to small-scale time-series prediction and simple causality analysis. Specifically, in step S101, the initial sample data are financial data freely published by Google, Baidu, or other official websites, including details such as annual revenue, gross profit, and R&D investment for each month, quarter, and year. The initial sample data first need to be preprocessed through feature extraction, normalization, and the like. Each resulting sample is a time series representing the value of each feature at each time point, where a sample contains several features; for example, the first sample may be a company's gross profit, total revenue, R&D investment, etc. for 2010-2012, and the second sample the company's gross profit, total revenue, R&D investment, etc. for 2011-2013.
In step S102, the sample data obtained in step S101 are turned into sample groups with the sliding-window method, each sample group having size s, and the sample groups are divided into a training sample set and a prediction sample set.
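As an illustration of this sliding-window division (a minimal sketch; the function name and the use of NumPy are assumptions, not from the patent), consecutive windows of size s can be generated as follows:

```python
import numpy as np

def sliding_window_groups(series, s):
    """Split a (T, n_features) time series into overlapping
    sample groups of s consecutive time steps each."""
    return np.array([series[i:i + s] for i in range(len(series) - s + 1)])

# A 6-step series with 3 features yields 5 groups of size s = 2.
data = np.arange(18, dtype=float).reshape(6, 3)
groups = sliding_window_groups(data, s=2)
print(groups.shape)  # (5, 2, 3)
```

The groups overlap by s-1 steps, so every transition between consecutive moments appears in some training group.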
In step S103, the present embodiment assumes that, within a sample group, every node at the preceding s-1 moments has a causal influence on every node at moment s, i.e. each node at moment s has an edge connecting it with every node at the preceding s-1 moments. The time-lagged Pearson correlation coefficient is used to initialize the weight function on each edge; for example, the weight function from node C_j at moment t' to node C_i at moment s is f_ij(t'). A fully connected temporal fuzzy cognitive map (tFCM) network is thus constructed.
Specifically, the Pearson correlation coefficient is a method for studying the degree of dependence and linear correlation between two variables. Given variables X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n}, the Pearson correlation coefficient is defined by
$$r = \frac{\sum_{k=1}^{n}(x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k-\bar{x})^2}\,\sqrt{\sum_{k=1}^{n}(y_k-\bar{y})^2}}.$$
Here the variables have temporal order, so the Pearson correlation coefficient cannot be used directly. In the present embodiment it is transformed into the following form, where Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t':
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}}.$$
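The lagged coefficient can be sketched as follows (an illustrative implementation, not taken from the patent; the function name is assumed): the series X_j(t) is aligned with X_i(t+t') and the ordinary Pearson formula is applied to the aligned pairs.

```python
import numpy as np

def lagged_pearson(x_j, x_i, t_prime):
    """Corr(X_j, X_i, t'): Pearson correlation between X_j(t)
    and the series X_i shifted forward by t' steps."""
    a = np.array(x_j, dtype=float)
    b = np.array(x_i, dtype=float)
    if t_prime > 0:                  # align X_j(t) with X_i(t + t')
        a, b = a[:-t_prime], b[t_prime:]
    a -= a.mean()
    b -= b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# X_i repeats X_j with a one-step delay, so the lag-1 correlation is 1.
print(lagged_pearson([1, 2, 3, 4], [9, 1, 2, 3], t_prime=1))  # 1.0
```

Each edge weight f_ij(t') is initialized with such a coefficient, one per feature pair and lag.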
In step S104, on the training set the embodiment of the present invention mainly trains a weighted temporal fuzzy cognitive map tFCM network, which can not only predict the time series well but also fully reflect the mutual causal influences between concepts in each time period; the present embodiment assumes that within each time period all concepts influence one another. The specific steps include:
Step S401: computing an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics. Specifically, after obtaining the set of edge-weight functions through initialization, the present embodiment computes an initial prediction with the sigmoid function, based on the principles of the FCM and the temporal characteristics.
The state vector of the tFCM changes continuously over time, and the state transition function is as follows:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s;
Step S402: using the mean squared error as the loss function for the initial prediction.
Specifically, after the initial prediction is computed, the mean squared error is used as the loss function to measure the size of the error, where:
for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$;
then for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$;
and the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$.
Step S403: optimizing the weights with batch gradient descent.
Specifically, differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$; the weight functions are updated iteratively with increment (i.e. negative gradient step) $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations.
S404: optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtaining the weight functions after the iteration.
Specifically, a threshold is set; if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than this threshold, all edges between node C_j and node C_i at every moment can be deleted. The pruning operation changes with the learning rate γ and is performed only three times, i.e. a three-stage hierarchical pruning method.
S405: performing the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Specifically, completing steps S401 to S404 completes one iteration and yields the weight functions after that iteration; the next iteration is then performed until the number of iterations reaches the preset number, finally obtaining the set of optimal weight functions that minimizes the loss function.
In step S105, the weight functions that minimize the error in step S104 are selected and the tFCM network is constructed; this tFCM network can fully express the mutual influences among the factors in different time periods.
The weight-function adjustment in each iteration round of step S104 is explained below through a specific embodiment:
Given sample data {<X_i, Y_i> | i = 1, 2, …, k} and time domain T = {0, 1, 2, …, t}, the input is the matrix formed by the state vectors at each moment, and the output is the historical data of each concept at moment t+1, Y_i = {(y_1, y_2, …, y_m)}. For example, with s = 2, when the input is the data of 2010, 2011, 2012, 2013, and 2014, the output is the data of 2012, 2013, and 2014, because the data are divided into three sample groups: the data of 2010 and 2011 predict the data of 2012, the data of 2011 and 2012 predict the data of 2013, and the data of 2012 and 2013 predict the data of 2014.
In a certain sample <X_i, Y_i>, the error with respect to C_i is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$, where $\hat{y}_i$ denotes the historical data of C_i at moment t+1; the error of sample l is expressed with the mean squared error, i.e. $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$, and the overall error of all samples is defined as $E = \frac{1}{k}\sum_{l=1}^{k} E_l$. Training error and test error are both measured with E. The partial derivative of E_l with respect to f_ji(t') at moment t+1 is shown in formula (1):
$$\frac{\partial E_l}{\partial f_{ji}(t')} = \frac{2}{n}\,\big(C_i(t+1) - \hat{y}_i\big)\, C_i(t+1)\,\big(1 - C_i(t+1)\big)\, C_j(t+1-t'); \quad (1)$$
In each iteration, the increment of a weight function is shown in formula (2):
$$\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}; \quad (2)$$
Completing an iteration updates the weight function:
$$f_{ji}(t') \leftarrow f_{ji}(t') + \Delta f_{ji}(t'). \quad (3)$$
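Under sigmoid transfer, one batch pass over all samples combines the error, the gradient of formula (1), and the update of formulas (2)-(3); the sketch below is an illustration under those assumptions (names and array layout are not from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_gradient_step(samples, F, gamma):
    """One batch-gradient-descent update of the weight functions.

    samples: list of (X, y); X is (s, n) input states, y the n targets.
    F:       (s, n, n) weights, F[k-1, j, i] = f_ji(k); updated in place.
    Returns the overall error E before the update.
    """
    s, n = F.shape[0], F.shape[1]
    grad = np.zeros_like(F)
    E = 0.0
    for X, y in samples:
        z = sum(X[-k] @ F[k - 1] for k in range(1, s + 1))
        a = sigmoid(z)                              # predicted C(t+1)
        delta = (2.0 / n) * (a - y) * a * (1 - a)   # per-node dE_l/dz_i
        for k in range(1, s + 1):
            grad[k - 1] += np.outer(X[-k], delta)   # dE_l/df_ji(k)
        E += np.mean((a - y) ** 2)                  # E_l
    F += -gamma * grad / len(samples)               # Δf = -γ ∂E/∂f
    return E / len(samples)

# One node, one lag: predicting 1.0 from state 1.0 pushes the weight up.
F = np.zeros((1, 1, 1))
err = batch_gradient_step([(np.ones((1, 1)), np.array([1.0]))], F, gamma=0.1)
print(round(err, 4), F[0, 0, 0] > 0)  # 0.25 True
```

Repeating this step for the preset number of iterations, with γ lowered in three stages and the pruning of S404 applied at each stage, yields the learned weight functions.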
The learning method for temporal fuzzy cognitive map edge weights based on gradient descent proposed by the embodiment of the present invention combines the FCM graph model with temporal weight functions; its design is concise, its computation is simple, and it can easily be applied to time-series data. It extends the functionality of the FCM while retaining the FCM's interpretability and good accuracy, and it realizes automatic learning and training of the tFCM model. In the embodiment, the method can be used to build a tFCM on enterprise financial data and effectively reduces problems such as overfitting, yielding an accurate, interpretable tFCM model.
During training, the embodiment of the present invention uses gradient descent combined with the hierarchical pruning method, which to a certain extent reduces the probability of getting stuck at a local optimum and effectively reduces overfitting.
Learning the weight functions of the tFCM with gradient descent solves the following three problems:
(1) How to reduce the probability of falling into a local minimum during learning:
The embodiment of the present invention initializes the weight functions with the time-lagged Pearson correlation coefficient. All artificial-intelligence algorithms are search algorithms; among them, the gradient method is a heuristic search algorithm whose effect is to find an optimal solution near the initial point, so it is strongly influenced by the initial value: a bad initial value can cause the algorithm to fall into a local optimum. The traditional method is to train repeatedly with different random initial values and then pick the best solution. The present embodiment instead uses the lagged Pearson correlation coefficient to determine the initial values, which lets the search range fall within a statistically meaningful interval and thus reduces the possibility of falling into a local optimum.
(2) How to design the loss function: the traditional mean-squared-error formula is used as the loss function, $E = \frac{1}{k}\sum_{l=1}^{k} E_l$ with $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$ and $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$, where $\hat{y}_i$ refers to the expected output corresponding to C_i(t+1) (the historical data of C_i at moment t+1), $e_i^l$ is the error of concept i in sample l, E_l is the mean of the errors of all concepts in sample l, and E is the mean overall error of all samples.
(3) When to stop training:
There are usually two ways to terminate iteration: one is to set an error threshold and stop once the training error falls below it; the second is to use a fixed number of learning iterations. In the present embodiment, since it is uncertain how an error threshold should be set to satisfy both requirements of accuracy and reduced overfitting, a fixed number of iterations is used.
Further, in the present embodiment, besides initializing the weight functions with the time-lagged Pearson correlation coefficient, the initial functions may also be randomized; and besides presetting the number of iterations, early stopping may also be used to limit the number of iterations.
The embodiment of the present invention also provides, as shown in Fig. 2, a learning system for a temporal fuzzy cognitive map based on gradient descent, comprising:
a sample processing unit 201, for obtaining initial sample data and preprocessing it to obtain sample data;
a sample division unit 202, for partitioning the sample data into a training sample set and a prediction sample set;
a network construction unit 203, for initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit 204, for optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit 205, for selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
a network verification unit 206, for validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, the sample processing unit 201 is specifically configured to obtain initial sample data from a preset data source and preprocess it, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment;
the sample division unit 202 is specifically configured to apply a sliding-window method to the sample data to generate several sample groups of identical size and divide the sample groups into a training sample set and a test sample set.
Further, the network construction unit 203 is specifically configured to:
first, initialize the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining the weight function set; the Pearson correlation coefficient is:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
finally, construct a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions.
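A sketch of this timing-based initialization, under the assumption that Corr(Xj, Xi, t') is the ordinary Pearson coefficient between the series Xj(t) and the series Xi shifted by t' (function and variable names are illustrative):

```python
import numpy as np

def lagged_pearson(xj, xi, lag):
    """Pearson correlation between X_j(t) and X_i(t + lag)."""
    a = xj[:len(xj) - lag] - xj[:len(xj) - lag].mean()
    b = xi[lag:] - xi[lag:].mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def init_weight_functions(sample, max_lag):
    """Initial weight f_ji(lag) for every ordered feature pair (j, i)
    and every lag 1..max_lag; sample has shape (T, n_features)."""
    n = sample.shape[1]
    return {(j, i, lag): lagged_pearson(sample[:, j], sample[:, i], lag)
            for j in range(n) for i in range(n)
            for lag in range(1, max_lag + 1)}
```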
The function optimization unit 204 is specifically configured to:
first, compute an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, according to the initial predicted value, use the mean squared error as the loss function:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))², where Ĉi(t) is the predicted value and Ci(t) the actual value; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
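The loss described above, per-node squared error summed over the n nodes of a sample and then over all samples, can be sketched as follows (the one-half factor is an assumption, commonly used to simplify the gradient):

```python
def node_error(pred, actual):
    """Error of one node: half the squared difference."""
    return 0.5 * (pred - actual) ** 2

def sample_error(preds, actuals):
    """Error of one sample with n nodes: sum over its node errors."""
    return sum(node_error(p, a) for p, a in zip(preds, actuals))

def total_error(pred_samples, actual_samples):
    """Loss over all samples: sum of the per-sample errors."""
    return sum(sample_error(p, a)
               for p, a in zip(pred_samples, actual_samples))
```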
then, optimize the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
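The iterative update Δf = −γ · ∂E/∂f with a learning rate descending in three stages might be sketched as follows (the exact stage boundaries and decay factors are assumptions; the patent states only that γ decreases in three stages):

```python
def staged_learning_rate(it, total_iters, gamma0=0.1):
    """Learning rate dropping in three stages over the run (assumed:
    halved at one third and again at two thirds of the iterations)."""
    stage = min((3 * it) // total_iters, 2)    # 0, 1 or 2
    return gamma0 / (2 ** stage)

def batch_gradient_step(weights, grads, it, total_iters):
    """One batch update of every weight-function value: f <- f - gamma * dE/df."""
    gamma = staged_learning_rate(it, total_iters)
    return {k: w - gamma * grads[k] for k, w in weights.items()}
```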
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
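The pruning step, deleting every edge from Cj to Ci whose influence stays below the threshold at every moment, can be sketched as follows (weight keys follow an assumed (j, i, lag) layout):

```python
def prune_weights(weights, threshold):
    """Drop all (j, i, lag) entries of an edge j->i whose influence
    |f_ji(lag)| stays below `threshold` at every lag."""
    max_abs = {}
    for (j, i, lag), w in weights.items():
        max_abs[(j, i)] = max(max_abs.get((j, i), 0.0), abs(w))
    return {k: w for k, w in weights.items()
            if max_abs[(k[0], k[1])] >= threshold}
```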
finally, carry out the next iteration, until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, the step in which the function optimization unit 204 computes the initial predicted value with the sigmoid function includes:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
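The windowed state transition, node Ci at moment t+1 as the sigmoid of the weighted influences of all nodes over the last s moments, can be sketched as follows (the exact summation form is a reconstruction from the description, not quoted from the patent):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def next_state(history, weights, i, window):
    """Predict C_i(t+1) from the last `window` moments of all nodes.

    history[k][j] is C_j(k) for moments k = 0..t;
    weights[(j, i, lag)] is the influence f_ji at the given lag."""
    t = len(history) - 1
    n = len(history[0])
    total = 0.0
    for lag in range(1, window + 1):
        k = t + 1 - lag               # the moment contributing at this lag
        if k < 0:                     # weight is zero outside the period
            break
        for j in range(n):
            total += weights.get((j, i, lag), 0.0) * history[k][j]
    return sigmoid(total)
```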
An embodiment of the present invention further provides a terminal, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that, when the processor executes the computer program, each step of the learning method for the temporal fuzzy cognitive map based on gradient descent shown in Figure 1 is realized.
An embodiment of the present invention also provides a readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, each step of the learning method for the temporal fuzzy cognitive map based on gradient descent shown in Figure 1 is realized.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing module, or the modules may exist physically alone, or two or more modules may be integrated in one module. The integrated module may be realized in the form of hardware, or in the form of a software function module.
If the integrated module is realized in the form of a software function module and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention, in essence the part that contributes beyond the existing technology, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code: a USB flash disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (10)
1. A learning method for a temporal fuzzy cognitive map based on gradient descent, characterized by including:
step A, obtaining initial sample data and pre-processing the initial sample data to obtain sample data;
step B, dividing the sample data to obtain a training sample set and a prediction sample set;
step C, initializing weight functions using a timing-based Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
step D, optimizing the initialized weight functions using a batch gradient descent method for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
step E, choosing the weight functions that minimize the loss function and constructing an optimal temporal fuzzy cognitive map network, the optimal temporal fuzzy cognitive map expressing the mutual influence of the nodes in the sample data over different time periods;
step F, verifying the optimal temporal fuzzy cognitive map using the prediction sample set.
2. The learning method according to claim 1, characterized in that step A includes:
step A1, obtaining the initial sample data from a preset data source;
step A2, applying pre-processing, including feature extraction and normalization, to the initial sample data to obtain the sample data, the sample data indicating the value of each feature at each moment.
3. The learning method according to claim 1, characterized in that step B includes:
step B1, applying a sliding window method to the sample data to generate several sample groups of identical size;
step B2, dividing the sample groups into a training sample set and a test sample set.
4. The learning method according to claim 1, characterized in that, within a sample group, each node at moment s has edge connections with all nodes at the preceding s−1 moments, and step C includes:
step C1, initializing the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining a weight function set, the Pearson correlation coefficient being:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
step C2, constructing a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions.
5. The learning method according to claim 4, characterized in that step D includes:
step D1, computing an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
step D2, using the mean squared error as the loss function according to the initial predicted value:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))², where Ĉi(t) is the predicted value and Ci(t) the actual value; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
step D3, optimizing the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
step D4, optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
step D5, carrying out the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
6. The learning method according to claim 5, characterized in that, in step D1:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
7. A learning system for a temporal fuzzy cognitive map based on gradient descent, characterized by including:
a sample processing unit, configured to obtain initial sample data and pre-process the initial sample data to obtain sample data;
a sample division unit, configured to divide the sample data to obtain a training sample set and a prediction sample set;
a network construction unit, configured to initialize weight functions using a timing-based Pearson correlation coefficient and construct a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit, configured to optimize the initialized weight functions using a batch gradient descent method for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit, configured to choose the weight functions that minimize the loss function and construct an optimal temporal fuzzy cognitive map network, the optimal temporal fuzzy cognitive map expressing the mutual influence of the nodes in the sample data over different time periods;
a network verification unit, configured to verify the optimal temporal fuzzy cognitive map using the prediction sample set.
8. The learning system according to claim 7, characterized in that the sample processing unit is specifically configured to: obtain the initial sample data from a preset data source, and apply pre-processing, including feature extraction and normalization, to the initial sample data to obtain the sample data, the sample data indicating the value of each feature at each moment;
the sample division unit is specifically configured to: apply a sliding window method to the sample data to generate several sample groups of identical size, and divide the sample groups into a training sample set and a test sample set.
9. The learning system according to claim 7, characterized in that the network construction unit is specifically configured to:
first, initialize the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining a weight function set, the Pearson correlation coefficient being:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
finally, construct a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions;
the function optimization unit is specifically configured to:
first, compute an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, according to the initial predicted value, use the mean squared error as the loss function:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))²; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
then, optimize the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
finally, carry out the next iteration, until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
10. The learning system according to claim 9, characterized in that the step in which the function optimization unit computes the initial predicted value with the sigmoid function further includes:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810592605.2A CN108875960A (en) | 2018-06-11 | 2018-06-11 | A kind of learning method and system of the timing ambiguity Cognitive Map based on gradient decline |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108875960A true CN108875960A (en) | 2018-11-23 |
Family
ID=64337748
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810592605.2A Pending CN108875960A (en) | 2018-06-11 | 2018-06-11 | A kind of learning method and system of the timing ambiguity Cognitive Map based on gradient decline |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108875960A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325340A (en) * | 2020-02-17 | 2020-06-23 | 南方科技大学 | Information network relation prediction method and system |
CN111401605A (en) * | 2020-02-17 | 2020-07-10 | 北京石油化工学院 | Interpretable prediction method for atmospheric pollution |
CN111401559A (en) * | 2020-02-17 | 2020-07-10 | 北京石油化工学院 | Fuzzy cognitive map formed by haze and multi-dimensional time sequence mining method thereof |
CN111401605B (en) * | 2020-02-17 | 2023-05-02 | 北京石油化工学院 | Interpreted prediction method for atmospheric pollution |
CN111401559B (en) * | 2020-02-17 | 2023-05-05 | 北京石油化工学院 | Fuzzy cognitive map formed by haze and multidimensional time sequence mining method thereof |
CN111325340B (en) * | 2020-02-17 | 2023-06-02 | 南方科技大学 | Information network relation prediction method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181123 |