CN108875960A - Learning method and system for a temporal fuzzy cognitive map based on gradient descent - Google Patents
Learning method and system for a temporal fuzzy cognitive map based on gradient descent
- Publication number
- CN108875960A (application CN201810592605.2A)
- Authority
- CN
- China
- Prior art keywords
- function
- sample
- weight
- node
- moment
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/02—Computing arrangements based on specific mathematical models using fuzzy logic
- G06N7/023—Learning or tuning the parameters of a fuzzy system
Abstract
The present invention relates to the field of time-series analysis and provides a learning method for temporal fuzzy cognitive maps, comprising: preprocessing and partitioning sample data to obtain a training sample set and a prediction sample set; initializing the weight functions with a time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set; optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function; selecting the weight functions that minimize the loss function to construct the optimal temporal fuzzy cognitive map network; and validating the optimal temporal fuzzy cognitive map with the prediction sample set. The present invention learns the tFCM model with batch gradient descent and initializes the weights with a time-lagged Pearson correlation coefficient, so that the search range falls within a statistically meaningful interval, reducing the possibility of falling into a local optimum.
Description
Technical field
The invention belongs to the field of time-series analysis and more particularly relates to a learning method and system for a temporal fuzzy cognitive map based on gradient descent.
Background art
A fuzzy cognitive map (FCM) is a directed weighted graph for knowledge representation and causal reasoning that has been widely applied in fields such as engineering management and medical decision support. However, the iterative reasoning of an FCM does not capture the temporal order within causal relationships. The temporal fuzzy cognitive map (tFCM) overcomes this defect of the FCM: it models the weight on each edge as a function over a discrete time domain, so it can fully express the mutual causal influences among factors in each time period.
At present, the edge weights of a tFCM are specified manually, which has two drawbacks: 1) manually estimating weights requires corresponding domain knowledge, FCM expertise, and the like, placing high demands on the modeler; 2) a tFCM is used to process time-series data, which changes dynamically, so the model needs the ability to learn; manually specified edge weights limit the adaptability of the tFCM to time-series data.
Methods exist for automatically computing and learning FCM edge weights, for example Hebbian-based learning algorithms and algorithms based on evolutionary ideas. However, existing learning methods are prone to falling into local optima and suffer from low learning accuracy.
Summary of the invention
The technical problem to be solved by the present invention is to provide a learning method and system for a temporal fuzzy cognitive map based on gradient descent, aiming to solve the problems that existing learning methods are prone to falling into local optima and have low learning accuracy.
The invention is realized as follows: a learning method for a temporal fuzzy cognitive map based on gradient descent, comprising:
Step A: obtaining initial sample data and preprocessing it to obtain sample data;
Step B: partitioning the sample data into a training sample set and a prediction sample set;
Step C: initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
Step D: optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
Step E: selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
Step F: validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, step A comprises:
Step A1: obtaining initial sample data from a preset data source;
Step A2: preprocessing the initial sample data, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment.
Further, step B comprises:
Step B1: applying a sliding-window method to the sample data to generate several sample groups of identical size;
Step B2: dividing the sample groups into a training sample set and a test sample set.
Further, in each sample group, every node at moment s has an edge connecting it with every node at the preceding s-1 moments, and step C comprises:
Step C1: initializing the weight function of each edge with the time-lagged Pearson correlation coefficient to obtain a set of weight functions, where the Pearson correlation coefficient is
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}},$$
X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n} denote variables, Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t', i and j index features, and X_j denotes the j-th feature of the data;
Step C2: constructing, according to the weight functions, a fully connected temporal fuzzy cognitive map network in each sample group of the training sample set.
Further, step D comprises:
Step D1: computing an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics;
Step D2: using the mean squared error as the loss function for the initial prediction:
for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$; then for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$; therefore the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$;
Step D3: optimizing the weights with batch gradient descent:
differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$; the weight functions are updated iteratively with increment $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations;
Step D4: optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtaining the weight functions after the iteration: if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than a preset threshold, all edges between node C_j and node C_i at every moment are deleted; the pruning operation changes with the learning rate γ and is performed only three times;
Step D5: performing the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, in step D1:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s.
The embodiment of the invention also provides a learning system for a temporal fuzzy cognitive map based on gradient descent, comprising:
a sample processing unit, for obtaining initial sample data and preprocessing it to obtain sample data;
a sample division unit, for partitioning the sample data into a training sample set and a prediction sample set;
a network construction unit, for initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit, for optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit, for selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
a network verification unit, for validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, the sample processing unit is specifically configured to obtain initial sample data from a preset data source and preprocess it, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment;
the sample division unit is specifically configured to apply a sliding-window method to the sample data to generate several sample groups of identical size and divide the sample groups into a training sample set and a test sample set.
Further, the network construction unit is specifically configured to:
first, initialize the weight function of each edge with the time-lagged Pearson correlation coefficient to obtain a set of weight functions, where the Pearson correlation coefficient is
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}},$$
X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n} denote variables, Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t', i and j index features, and X_j denotes the j-th feature of the data;
finally, construct, according to the weight functions, a fully connected temporal fuzzy cognitive map network in each sample group of the training sample set.
The function optimization unit is specifically configured to:
first, compute an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, use the mean squared error as the loss function for the initial prediction: for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$; for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$; and the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$;
then, optimize the weights with batch gradient descent: differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$, and the weight functions are updated iteratively with increment $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations;
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtain the weight functions after the iteration: if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than a preset threshold, all edges between node C_j and node C_i at every moment are deleted; the pruning operation changes with the learning rate γ and is performed only three times;
finally, perform the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, the step in which the function optimization unit computes the initial prediction with the sigmoid function also includes:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s.
Compared with the prior art, the beneficial effects of the present invention are: the embodiment of the invention preprocesses the obtained sample data, partitions it into a training sample set and a test sample set, initializes the weight functions with the time-lagged Pearson correlation coefficient, optimizes the initialized weight functions with batch gradient descent for a preset number of iterations, measures the error of the weight functions after each optimization with a loss function, selects the weight functions that minimize the loss function to construct the optimal temporal fuzzy cognitive map network, and finally validates the optimal temporal fuzzy cognitive map with the prediction sample set. The embodiment learns the tFCM model with batch gradient descent and initializes the weights with the time-lagged Pearson correlation coefficient, so that the search range falls within a statistically meaningful interval, reducing the possibility of falling into a local optimum.
Brief description of the drawings
Fig. 1 is a flowchart of a learning method for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention;
Fig. 2 is a structural schematic diagram of a learning system for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention.
Specific embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
Fig. 1 shows a learning method for a temporal fuzzy cognitive map based on gradient descent provided by an embodiment of the present invention, comprising:
S101: obtaining initial sample data and preprocessing it to obtain sample data;
S102: partitioning the sample data into a training sample set and a prediction sample set;
S103: initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
S104: optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
S105: selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
S106: validating the optimal temporal fuzzy cognitive map with the prediction sample set.
The embodiment of the present invention is suited to small-scale time-series prediction and simple causality analysis. Specifically, in step S101, the initial sample data are financial data freely published by Google, Baidu, or other official websites, including details such as annual revenue, gross profit, and R&D investment for each month, quarter, and year. The initial sample data first need to be preprocessed through feature extraction, normalization, and the like. Each resulting sample is a time series representing the value of each feature at each time point, where a sample contains several features; for example, the first sample may be a company's gross profit, total revenue, R&D investment, etc. for 2010-2012, and the second sample the company's gross profit, total revenue, R&D investment, etc. for 2011-2013.
In step S102, the sample data obtained in step S101 are turned into sample groups with the sliding-window method, each sample group having size s, and the sample groups are divided into a training sample set and a prediction sample set.
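As an illustration of this sliding-window division (a minimal sketch; the function name and the use of NumPy are assumptions, not from the patent), consecutive windows of size s can be generated as follows:

```python
import numpy as np

def sliding_window_groups(series, s):
    """Split a (T, n_features) time series into overlapping
    sample groups of s consecutive time steps each."""
    return np.array([series[i:i + s] for i in range(len(series) - s + 1)])

# A 6-step series with 3 features yields 5 groups of size s = 2.
data = np.arange(18, dtype=float).reshape(6, 3)
groups = sliding_window_groups(data, s=2)
print(groups.shape)  # (5, 2, 3)
```

The groups overlap by s-1 steps, so every transition between consecutive moments appears in some training group.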
In step S103, the present embodiment assumes that, within a sample group, every node at the preceding s-1 moments has a causal influence on every node at moment s, i.e. each node at moment s has an edge connecting it with every node at the preceding s-1 moments. The time-lagged Pearson correlation coefficient is used to initialize the weight function on each edge; for example, the weight function from node C_j at moment t' to node C_i at moment s is f_ij(t'). A fully connected temporal fuzzy cognitive map (tFCM) network is thus constructed.
Specifically, the Pearson correlation coefficient is a method for studying the degree of dependence and linear correlation between two variables. Given variables X = {x_1, x_2, …, x_n} and Y = {y_1, y_2, …, y_n}, the Pearson correlation coefficient is defined by
$$r = \frac{\sum_{k=1}^{n}(x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum_{k=1}^{n}(x_k-\bar{x})^2}\,\sqrt{\sum_{k=1}^{n}(y_k-\bar{y})^2}}.$$
Here the variables have temporal order, so the Pearson correlation coefficient cannot be used directly. In the present embodiment it is transformed into the following form, where Corr(X_j, X_i, t') denotes the degree of correlation between X_j(t) and X_i(t+t') after a lag of t':
$$\mathrm{Corr}(X_j, X_i, t') = \frac{\sum_{t}\big(X_j(t)-\bar{X}_j\big)\big(X_i(t+t')-\bar{X}_i\big)}{\sqrt{\sum_{t}\big(X_j(t)-\bar{X}_j\big)^2}\,\sqrt{\sum_{t}\big(X_i(t+t')-\bar{X}_i\big)^2}}.$$
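The lagged coefficient can be sketched as follows (an illustrative implementation, not taken from the patent; the function name is assumed): the series X_j(t) is aligned with X_i(t+t') and the ordinary Pearson formula is applied to the aligned pairs.

```python
import numpy as np

def lagged_pearson(x_j, x_i, t_prime):
    """Corr(X_j, X_i, t'): Pearson correlation between X_j(t)
    and the series X_i shifted forward by t' steps."""
    a = np.array(x_j, dtype=float)
    b = np.array(x_i, dtype=float)
    if t_prime > 0:                  # align X_j(t) with X_i(t + t')
        a, b = a[:-t_prime], b[t_prime:]
    a -= a.mean()
    b -= b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# X_i repeats X_j with a one-step delay, so the lag-1 correlation is 1.
print(lagged_pearson([1, 2, 3, 4], [9, 1, 2, 3], t_prime=1))  # 1.0
```

Each edge weight f_ij(t') is initialized with such a coefficient, one per feature pair and lag.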
In step S104, on the training set the embodiment of the present invention mainly trains a weighted temporal fuzzy cognitive map tFCM network, which can not only predict the time series well but also fully reflect the mutual causal influences between concepts in each time period; the present embodiment assumes that within each time period all concepts influence one another. The specific steps include:
Step S401: computing an initial prediction from the set of weight functions with the sigmoid function, based on the principles of the fuzzy cognitive map (FCM) and the temporal characteristics. Specifically, after obtaining the set of edge-weight functions through initialization, the present embodiment computes an initial prediction with the sigmoid function, based on the principles of the FCM and the temporal characteristics.
The state vector of the tFCM changes continuously over time, and the state transition function is as follows:
if at moment t+1 node C_i is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} f_{ji}(1)\, C_j(t)\Big),$$
where C_i(t+1) is the predicted value of the influence of all factors j on i over the preceding time, obtained by multiplying each moment by a different weight function, C_j(t) denotes the value of factor j at time t, f_ji(t) denotes the influence of j on i at different moments, and f is the sigmoid function;
if solving the state value of node C_i at moment t+1 requires considering the accumulated influence on this concept of all concepts before moment t, the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{t} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period (0, t);
if only a given period s is considered, then f_ji(t') = 0 when t' > s, and the state transition function is
$$C_i(t+1) = f\Big(\sum_{j=1}^{n} \sum_{k=1}^{s} f_{ji}(k)\, C_j(t+1-k)\Big),$$
where k indexes the period s;
Step S402: using the mean squared error as the loss function for the initial prediction.
Specifically, after the initial prediction is computed, the mean squared error is used as the loss function to measure the size of the error, where:
for node C_i of sample l the error is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$;
then for a sample l with n nodes the error is $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$;
and the error over all k samples is $E = \frac{1}{k}\sum_{l=1}^{k} E_l$.
Step S403: optimizing the weights with batch gradient descent.
Specifically, differentiating the loss function of node C_i with respect to a weight function yields $\frac{\partial E_l}{\partial f_{ji}(t')}$; the weight functions are updated iteratively with increment (i.e. negative gradient step) $\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}$, where γ is the preset learning rate, which decreases in three stages during the iterations.
S404: optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce overfitting, and obtaining the weight functions after the iteration.
Specifically, a threshold is set; if over the entire time period the influence f_ij(t') of node C_j on the node C_i to be predicted is always smaller than this threshold, all edges between node C_j and node C_i at every moment can be deleted. The pruning operation changes with the learning rate γ and is performed only three times, i.e. a three-stage hierarchical pruning method.
S405: performing the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Specifically, completing steps S401 to S404 completes one iteration and yields the weight functions after that iteration; the next iteration is then performed until the number of iterations reaches the preset number, finally obtaining the set of optimal weight functions that minimizes the loss function.
In step S105, the weight functions that minimize the error in step S104 are selected and the tFCM network is constructed; this tFCM network can fully express the mutual influences among the factors in different time periods.
The weight-function adjustment in each iteration round of step S104 is explained below through a specific embodiment:
Given sample data {<X_i, Y_i> | i = 1, 2, …, k} and time domain T = {0, 1, 2, …, t}, the input is the matrix formed by the state vectors at each moment, and the output is the historical data of each concept at moment t+1, Y_i = {(y_1, y_2, …, y_m)}. For example, with s = 2, when the input is the data of 2010, 2011, 2012, 2013, and 2014, the output is the data of 2012, 2013, and 2014, because the data are divided into three sample groups: the data of 2010 and 2011 predict the data of 2012, the data of 2011 and 2012 predict the data of 2013, and the data of 2012 and 2013 predict the data of 2014.
In a certain sample <X_i, Y_i>, the error with respect to C_i is $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$, where $\hat{y}_i$ denotes the historical data of C_i at moment t+1; the error of sample l is expressed with the mean squared error, i.e. $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$, and the overall error of all samples is defined as $E = \frac{1}{k}\sum_{l=1}^{k} E_l$. Training error and test error are both measured with E. The partial derivative of E_l with respect to f_ji(t') at moment t+1 is shown in formula (1):
$$\frac{\partial E_l}{\partial f_{ji}(t')} = \frac{2}{n}\,\big(C_i(t+1) - \hat{y}_i\big)\, C_i(t+1)\,\big(1 - C_i(t+1)\big)\, C_j(t+1-t'); \quad (1)$$
In each iteration, the increment of a weight function is shown in formula (2):
$$\Delta f_{ji}(t') = -\gamma\,\frac{\partial E}{\partial f_{ji}(t')}; \quad (2)$$
Completing an iteration updates the weight function:
$$f_{ji}(t') \leftarrow f_{ji}(t') + \Delta f_{ji}(t'). \quad (3)$$
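Under sigmoid transfer, one batch pass over all samples combines the error, the gradient of formula (1), and the update of formulas (2)-(3); the sketch below is an illustration under those assumptions (names and array layout are not from the patent):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def batch_gradient_step(samples, F, gamma):
    """One batch-gradient-descent update of the weight functions.

    samples: list of (X, y); X is (s, n) input states, y the n targets.
    F:       (s, n, n) weights, F[k-1, j, i] = f_ji(k); updated in place.
    Returns the overall error E before the update.
    """
    s, n = F.shape[0], F.shape[1]
    grad = np.zeros_like(F)
    E = 0.0
    for X, y in samples:
        z = sum(X[-k] @ F[k - 1] for k in range(1, s + 1))
        a = sigmoid(z)                              # predicted C(t+1)
        delta = (2.0 / n) * (a - y) * a * (1 - a)   # per-node dE_l/dz_i
        for k in range(1, s + 1):
            grad[k - 1] += np.outer(X[-k], delta)   # dE_l/df_ji(k)
        E += np.mean((a - y) ** 2)                  # E_l
    F += -gamma * grad / len(samples)               # Δf = -γ ∂E/∂f
    return E / len(samples)

# One node, one lag: predicting 1.0 from state 1.0 pushes the weight up.
F = np.zeros((1, 1, 1))
err = batch_gradient_step([(np.ones((1, 1)), np.array([1.0]))], F, gamma=0.1)
print(round(err, 4), F[0, 0, 0] > 0)  # 0.25 True
```

Repeating this step for the preset number of iterations, with γ lowered in three stages and the pruning of S404 applied at each stage, yields the learned weight functions.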
The learning method for temporal fuzzy cognitive map edge weights based on gradient descent proposed by the embodiment of the present invention combines the FCM graph model with temporal weight functions; its design is concise, its computation is simple, and it can easily be applied to time-series data. It extends the functionality of the FCM while retaining the FCM's interpretability and good accuracy, and it realizes automatic learning and training of the tFCM model. In the embodiment, the method can be used to build a tFCM on enterprise financial data and effectively reduces problems such as overfitting, yielding an accurate, interpretable tFCM model.
During training, the embodiment of the present invention uses gradient descent combined with the hierarchical pruning method, which to a certain extent reduces the probability of getting stuck at a local optimum and effectively reduces overfitting.
Learning the weight functions of the tFCM with gradient descent solves the following three problems:
(1) How to reduce the probability of falling into a local minimum during learning:
The embodiment of the present invention initializes the weight functions with the time-lagged Pearson correlation coefficient. All artificial-intelligence algorithms are search algorithms; among them, the gradient method is a heuristic search algorithm whose effect is to find an optimal solution near the initial point, so it is strongly influenced by the initial value: a bad initial value can cause the algorithm to fall into a local optimum. The traditional method is to train repeatedly with different random initial values and then pick the best solution. The present embodiment instead uses the lagged Pearson correlation coefficient to determine the initial values, which lets the search range fall within a statistically meaningful interval and thus reduces the possibility of falling into a local optimum.
(2) How to design the loss function: the traditional mean-squared-error formula is used as the loss function, $E = \frac{1}{k}\sum_{l=1}^{k} E_l$ with $E_l = \frac{1}{n}\sum_{i=1}^{n} e_i^l$ and $e_i^l = \big(C_i(t+1) - \hat{y}_i\big)^2$, where $\hat{y}_i$ refers to the expected output corresponding to C_i(t+1) (the historical data of C_i at moment t+1), $e_i^l$ is the error of concept i in sample l, E_l is the mean of the errors of all concepts in sample l, and E is the mean overall error of all samples.
(3) When to stop training:
There are usually two ways to terminate iteration: one is to set an error threshold and stop once the training error falls below it; the second is to use a fixed number of learning iterations. In the present embodiment, since it is uncertain how an error threshold should be set to satisfy both requirements of accuracy and reduced overfitting, a fixed number of iterations is used.
Further, in the present embodiment, besides initializing the weight functions with the time-lagged Pearson correlation coefficient, the initial functions may also be randomized; and besides presetting the number of iterations, early stopping may also be used to limit the number of iterations.
The embodiment of the present invention also provides, as shown in Fig. 2, a learning system for a temporal fuzzy cognitive map based on gradient descent, comprising:
a sample processing unit 201, for obtaining initial sample data and preprocessing it to obtain sample data;
a sample division unit 202, for partitioning the sample data into a training sample set and a prediction sample set;
a network construction unit 203, for initializing the weight functions with the time-lagged Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit 204, for optimizing the initialized weight functions with batch gradient descent for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit 205, for selecting the weight functions that minimize the loss function and constructing the optimal temporal fuzzy cognitive map network, where the optimal temporal fuzzy cognitive map expresses the mutual influences among the nodes of the sample data in different time periods;
a network verification unit 206, for validating the optimal temporal fuzzy cognitive map with the prediction sample set.
Further, the sample processing unit 201 is specifically configured to obtain initial sample data from a preset data source and preprocess it, including feature extraction and normalization, to obtain the sample data, where the sample data represent the value of each feature at each moment;
the sample division unit 202 is specifically configured to apply a sliding-window method to the sample data to generate several sample groups of identical size and divide the sample groups into a training sample set and a test sample set.
Further, the network construction unit 203 is specifically configured to:
first, initialize the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining the weight function set; the Pearson correlation coefficient is:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
finally, construct a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions.
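A sketch of this timing-based initialization, under the assumption that Corr(Xj, Xi, t') is the ordinary Pearson coefficient between the series Xj(t) and the series Xi shifted by t' (function and variable names are illustrative):

```python
import numpy as np

def lagged_pearson(xj, xi, lag):
    """Pearson correlation between X_j(t) and X_i(t + lag)."""
    a = xj[:len(xj) - lag] - xj[:len(xj) - lag].mean()
    b = xi[lag:] - xi[lag:].mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

def init_weight_functions(sample, max_lag):
    """Initial weight f_ji(lag) for every ordered feature pair (j, i)
    and every lag 1..max_lag; sample has shape (T, n_features)."""
    n = sample.shape[1]
    return {(j, i, lag): lagged_pearson(sample[:, j], sample[:, i], lag)
            for j in range(n) for i in range(n)
            for lag in range(1, max_lag + 1)}
```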
The function optimization unit 204 is specifically configured to:
first, compute an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, according to the initial predicted value, use the mean squared error as the loss function:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))², where Ĉi(t) is the predicted value and Ci(t) the actual value; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
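The loss described above, per-node squared error summed over the n nodes of a sample and then over all samples, can be sketched as follows (the one-half factor is an assumption, commonly used to simplify the gradient):

```python
def node_error(pred, actual):
    """Error of one node: half the squared difference."""
    return 0.5 * (pred - actual) ** 2

def sample_error(preds, actuals):
    """Error of one sample with n nodes: sum over its node errors."""
    return sum(node_error(p, a) for p, a in zip(preds, actuals))

def total_error(pred_samples, actual_samples):
    """Loss over all samples: sum of the per-sample errors."""
    return sum(sample_error(p, a)
               for p, a in zip(pred_samples, actual_samples))
```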
then, optimize the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
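The iterative update Δf = −γ · ∂E/∂f with a learning rate descending in three stages might be sketched as follows (the exact stage boundaries and decay factors are assumptions; the patent states only that γ decreases in three stages):

```python
def staged_learning_rate(it, total_iters, gamma0=0.1):
    """Learning rate dropping in three stages over the run (assumed:
    halved at one third and again at two thirds of the iterations)."""
    stage = min((3 * it) // total_iters, 2)    # 0, 1 or 2
    return gamma0 / (2 ** stage)

def batch_gradient_step(weights, grads, it, total_iters):
    """One batch update of every weight-function value: f <- f - gamma * dE/df."""
    gamma = staged_learning_rate(it, total_iters)
    return {k: w - gamma * grads[k] for k, w in weights.items()}
```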
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
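The pruning step, deleting every edge from Cj to Ci whose influence stays below the threshold at every moment, can be sketched as follows (weight keys follow an assumed (j, i, lag) layout):

```python
def prune_weights(weights, threshold):
    """Drop all (j, i, lag) entries of an edge j->i whose influence
    |f_ji(lag)| stays below `threshold` at every lag."""
    max_abs = {}
    for (j, i, lag), w in weights.items():
        max_abs[(j, i)] = max(max_abs.get((j, i), 0.0), abs(w))
    return {k: w for k, w in weights.items()
            if max_abs[(k[0], k[1])] >= threshold}
```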
finally, carry out the next iteration, until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
Further, the step in which the function optimization unit 204 computes the initial predicted value with the sigmoid function includes:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
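The windowed state transition, node Ci at moment t+1 as the sigmoid of the weighted influences of all nodes over the last s moments, can be sketched as follows (the exact summation form is a reconstruction from the description, not quoted from the patent):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def next_state(history, weights, i, window):
    """Predict C_i(t+1) from the last `window` moments of all nodes.

    history[k][j] is C_j(k) for moments k = 0..t;
    weights[(j, i, lag)] is the influence f_ji at the given lag."""
    t = len(history) - 1
    n = len(history[0])
    total = 0.0
    for lag in range(1, window + 1):
        k = t + 1 - lag               # the moment contributing at this lag
        if k < 0:                     # weight is zero outside the period
            break
        for j in range(n):
            total += weights.get((j, i, lag), 0.0) * history[k][j]
    return sigmoid(total)
```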
An embodiment of the present invention further provides a terminal, including a memory, a processor, and a computer program stored in the memory and runnable on the processor, characterized in that, when the processor executes the computer program, each step of the learning method for the temporal fuzzy cognitive map based on gradient descent shown in Figure 1 is realized.
An embodiment of the present invention also provides a readable storage medium on which a computer program is stored, characterized in that, when the computer program is executed by a processor, each step of the learning method for the temporal fuzzy cognitive map based on gradient descent shown in Figure 1 is realized.
In addition, the functional modules in the embodiments of the present invention may be integrated in one processing module, or the modules may exist physically alone, or two or more modules may be integrated in one module. The integrated module may be realized in the form of hardware, or in the form of a software function module.
If the integrated module is realized in the form of a software function module and sold or used as an independent product, it may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention, in essence the part that contributes beyond the existing technology, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods of the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code: a USB flash disk, a mobile hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention shall be included in the protection scope of the present invention.
Claims (10)
1. A learning method for a temporal fuzzy cognitive map based on gradient descent, characterized by including:
step A, obtaining initial sample data and pre-processing the initial sample data to obtain sample data;
step B, dividing the sample data to obtain a training sample set and a prediction sample set;
step C, initializing weight functions using a timing-based Pearson correlation coefficient and constructing a fully connected temporal fuzzy cognitive map network on the training sample set;
step D, optimizing the initialized weight functions using a batch gradient descent method for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
step E, choosing the weight functions that minimize the loss function and constructing an optimal temporal fuzzy cognitive map network, the optimal temporal fuzzy cognitive map expressing the mutual influence of the nodes in the sample data over different time periods;
step F, verifying the optimal temporal fuzzy cognitive map using the prediction sample set.
2. The learning method according to claim 1, characterized in that step A includes:
step A1, obtaining the initial sample data from a preset data source;
step A2, applying pre-processing, including feature extraction and normalization, to the initial sample data to obtain the sample data, the sample data indicating the value of each feature at each moment.
3. The learning method according to claim 1, characterized in that step B includes:
step B1, applying a sliding window method to the sample data to generate several sample groups of identical size;
step B2, dividing the sample groups into a training sample set and a test sample set.
4. The learning method according to claim 1, characterized in that, within a sample group, each node at moment s has edge connections with all nodes at the preceding s−1 moments, and step C includes:
step C1, initializing the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining a weight function set, the Pearson correlation coefficient being:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
step C2, constructing a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions.
5. The learning method according to claim 4, characterized in that step D includes:
step D1, computing an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
step D2, using the mean squared error as the loss function according to the initial predicted value:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))², where Ĉi(t) is the predicted value and Ci(t) the actual value; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
step D3, optimizing the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
step D4, optimizing the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
step D5, carrying out the next iteration until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
6. The learning method according to claim 5, characterized in that, in step D1:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
7. A learning system for a temporal fuzzy cognitive map based on gradient descent, characterized by including:
a sample processing unit, configured to obtain initial sample data and pre-process the initial sample data to obtain sample data;
a sample division unit, configured to divide the sample data to obtain a training sample set and a prediction sample set;
a network construction unit, configured to initialize weight functions using a timing-based Pearson correlation coefficient and construct a fully connected temporal fuzzy cognitive map network on the training sample set;
a function optimization unit, configured to optimize the initialized weight functions using a batch gradient descent method for a preset number of iterations, measuring the error of the weight functions after each optimization with a loss function;
a network determination unit, configured to choose the weight functions that minimize the loss function and construct an optimal temporal fuzzy cognitive map network, the optimal temporal fuzzy cognitive map expressing the mutual influence of the nodes in the sample data over different time periods;
a network verification unit, configured to verify the optimal temporal fuzzy cognitive map using the prediction sample set.
8. The learning system according to claim 7, characterized in that the sample processing unit is specifically configured to: obtain the initial sample data from a preset data source, and apply pre-processing, including feature extraction and normalization, to the initial sample data to obtain the sample data, the sample data indicating the value of each feature at each moment;
the sample division unit is specifically configured to: apply a sliding window method to the sample data to generate several sample groups of identical size, and divide the sample groups into a training sample set and a test sample set.
9. The learning system according to claim 7, characterized in that the network construction unit is specifically configured to:
first, initialize the weight function of each edge using the timing-based Pearson correlation coefficient, obtaining a weight function set, the Pearson correlation coefficient being:
Corr(Xj, Xi, t') = Σt (Xj(t) − X̄j)(Xi(t+t') − X̄i) / √( Σt (Xj(t) − X̄j)² · Σt (Xi(t+t') − X̄i)² )
where X = {x1, x2, …, xn} and Y = {y1, y2, …, yn} denote variables, X̄ denotes a mean value, Corr(Xj, Xi, t') denotes the degree of correlation between Xj(t) and Xi(t+t') after a lag of t', i and j index features, and Xj denotes the j-th feature of the data;
finally, construct a fully connected temporal fuzzy cognitive map network from each sample group in the training sample set according to the weight functions;
the function optimization unit is specifically configured to:
first, compute an initial predicted value from the weight function set with the sigmoid function, according to the principle of the fuzzy cognitive map (FCM) and the temporal characteristics;
then, according to the initial predicted value, use the mean squared error as the loss function:
the error for node Ci of sample l is E_{l,i} = ½(Ĉi(t) − Ci(t))²; for a sample l with n nodes, the error is E_l = Σ_{i=1}^{n} E_{l,i}; therefore, the error over all samples is E = Σ_l E_l;
then, optimize the weights using the batch gradient descent method:
taking the derivative of the loss function of node Ci with respect to a weight function f_ji(t') yields the gradient ∂E/∂f_ji(t');
the weight functions are updated iteratively with increment Δf_ji(t') = −γ · ∂E/∂f_ji(t'), where γ is the set learning rate, which decreases in three stages over the iterations;
then, optimize the weights by pruning, so as to reduce the dimension of the parameter space and reduce over-fitting, obtaining the post-iteration weight functions:
if, over the entire time period, the influence f_ji(t') of node Cj on the node Ci to be predicted always stays below a preset threshold, the edges between node Cj and node Ci at all moments are deleted; the pruning follows the changes of the learning rate γ and is carried out only three times;
finally, carry out the next iteration, until the number of iterations reaches the preset number, obtaining the set of optimal weight functions that minimizes the loss function.
10. The learning system according to claim 9, characterized in that the step in which the function optimization unit computes the initial predicted value with the sigmoid function further includes:
if, at moment t+1, node Ci is influenced only by all nodes at the previous moment, the state value of this concept is solved with the state transition function expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} f_ji(t) · Cj(t) )
where g is the sigmoid function, Ci(t+1) is the predicted value obtained from the influence of all factors j on i over the preceding t moments, each moment multiplied by a different weight function, Cj(t) denotes the value of factor j at moment t, and f_ji(t) denotes the influence of j on i at different moments;
if the state value of node Ci at moment t+1 is to be solved taking into account the accumulated influence of all concepts before moment t on this concept, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=0}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period (0, t);
if only a given period s is considered, so that f(t+1−t') = 0 when t' > s, the state transition function is expressed as:
Ci(t+1) = g( Σ_{j=1}^{n} Σ_{k=t+1−s}^{t} f_ji(t+1−k) · Cj(k) )
with the same meanings as above, k indexing the period s.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810592605.2A CN108875960A (en) | 2018-06-11 | 2018-06-11 | A kind of learning method and system of the timing ambiguity Cognitive Map based on gradient decline |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108875960A true CN108875960A (en) | 2018-11-23 |
Family
ID=64337748
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810592605.2A Pending CN108875960A (en) | 2018-06-11 | 2018-06-11 | A kind of learning method and system of the timing ambiguity Cognitive Map based on gradient decline |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108875960A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111325340A (en) * | 2020-02-17 | 2020-06-23 | 南方科技大学 | Information network relation prediction method and system |
CN111401605A (en) * | 2020-02-17 | 2020-07-10 | 北京石油化工学院 | Interpretable prediction method for atmospheric pollution |
CN111401559A (en) * | 2020-02-17 | 2020-07-10 | 北京石油化工学院 | Fuzzy cognitive map formed by haze and multi-dimensional time sequence mining method thereof |
CN111401605B (en) * | 2020-02-17 | 2023-05-02 | 北京石油化工学院 | Interpreted prediction method for atmospheric pollution |
CN111401559B (en) * | 2020-02-17 | 2023-05-05 | 北京石油化工学院 | Fuzzy cognitive map formed by haze and multidimensional time sequence mining method thereof |
CN111325340B (en) * | 2020-02-17 | 2023-06-02 | 南方科技大学 | Information network relation prediction method and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181123 |