CN109241291A - Knowledge graph optimal path query system and method based on deep reinforcement learning - Google Patents

Knowledge graph optimal path query system and method based on deep reinforcement learning

Info

Publication number
CN109241291A
Authority
CN
China
Prior art keywords
layer
entity
network
value
optimal path
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810791353.6A
Other languages
Chinese (zh)
Other versions
CN109241291B (en)
Inventor
黄震华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Original Assignee
South China Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University filed Critical South China Normal University
Priority to CN201810791353.6A priority Critical patent/CN109241291B/en
Publication of CN109241291A publication Critical patent/CN109241291A/en
Application granted granted Critical
Publication of CN109241291B publication Critical patent/CN109241291B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention proposes a knowledge graph optimal path query method based on deep reinforcement learning, comprising two modules, module one and module two. Module one is an offline training module for the knowledge graph optimal path model, and module two is an online application module for the knowledge graph optimal path model. The offline training module is equipped with a deep reinforcement learning component; deep reinforcement learning training is performed on the current entity to obtain the next entity, and the training is then repeated with the next entity as the current entity, yielding an optimal path model. A start entity and a target entity are then input to the optimal path model obtained by module one, and the optimal path is finally obtained. The invention increases the generalization ability of the model and improves computational accuracy; the logical structure of the invention is clear and its computation flexible, and in particular the reinforcement learning and deep learning can be computed in a distributed fashion, improving operational efficiency.

Description

Knowledge graph optimal path query system and method based on deep reinforcement learning
Technical field
The present invention relates to the field of computing, and in particular to a knowledge graph optimal path query system and method based on deep reinforcement learning.
Background art
A knowledge graph (Knowledge Graph) is intended to describe and portray the various entities (Entity) that exist in the real world and the relations (Relation) between entities. It is usually organized and represented as a directed graph: the nodes of the graph denote entities, the edges are formed by relations, and a relation connects two entities and characterizes whether there is an association between them as described by that relation. If there is an edge between two entities, they are related; otherwise there is no association. In practical applications, each entity relation in the knowledge graph (i.e., each edge of the graph) is annotated with a value between 0 and 1 reflecting the degree of correlation between the entities. Depending on the application requirements, this value can represent confidence, closeness, distance, cost, and so on; such a knowledge graph is therefore called a probabilistic knowledge graph.
Optimal path query between the entities of a probabilistic knowledge graph, which retrieves the relationship between two entities, is extremely important in the knowledge graph field. It is one of the core technologies of applications such as knowledge extraction, entity retrieval, knowledge graph network optimization, and relation analysis between knowledge graph entities. For data queries and retrievals of this complexity, an effective data organization and an efficient query processing method are needed to compute the results required by users accurately and effectively. Improving query efficiency and reducing processing cost is therefore highly desirable and also extremely challenging. The topological structure of a probabilistic knowledge graph is a weighted directed graph.
At present, the mainstream graph optimal path query methods include Dijkstra's algorithm, the Floyd algorithm, and the Bellman-Ford algorithm. However, with the arrival of the big data era, the query efficiency of these methods can no longer meet the time range acceptable to users or fit within the memory a machine can accommodate; they are powerless for optimal path queries over large data volumes.
It has further been found that, for a large-scale data network such as a probabilistic knowledge graph, reducing query time often requires a strategy of trading space for time, storing the query results with the highest query frequency. The Landmarks-BFS method sorts entities by users' query frequency over the probabilistic knowledge graph, prunes the optimal paths between common entities, and stores the optimal paths between entities in a set. This method reduces the search space, but it ignores the dispersion of nodes in the network, so its query accuracy is not high. In addition, acceleration techniques have been applied to query data preprocessing, such as parallel query methods based on bidirectional search, goal-directed query methods, and hierarchical query methods. These techniques meet the requirements in query efficiency; however, since pruning discards some intermediate points, query accuracy declines. Improper pruning may cause the query to miss the shortest path, while too little pruning between two points easily degenerates into breadth-first search, with low time efficiency and poor scalability. Accurately querying the shortest path in a probabilistic knowledge graph thus requires striking a balance between time and space: query time should meet users' requirements while query quality is also guaranteed.
Summary of the invention
To overcome at least one of the above drawbacks (deficiencies) of the prior art, the present invention provides an optimal path query method between probabilistic knowledge graph entities that has high accuracy and strong generalization ability, is fast, and is easy to extend.
To solve the above technical problems, the technical scheme of the present invention is as follows:
A knowledge graph optimal path query system based on deep reinforcement learning comprises two modules, module one and module two. Module one is an offline training module for the knowledge graph optimal path model, and module two is an online application module for the knowledge graph optimal path model. The offline training module is equipped with a deep reinforcement learning component; deep reinforcement learning training is performed on the current entity to obtain the next entity, and the training is then repeated with the next entity as the current entity, yielding an optimal path model. The start entity and the target entity are then input to the optimal path model obtained by module one, and the optimal path is finally obtained. Through the cooperation between the two modules, the goals of high accuracy, strong generalization ability, high speed, and easy extension are achieved.
Further, the deep reinforcement learning component consists of an encoder, a network component, and a logistic regression component. The network component includes a conversion component and a training component; the conversion component includes a CNN neural network and an FC neural network, and the training component includes a reinforcement learning Policy (policy) network and a reinforcement learning Value (value) network.
Further, the reinforcement learning Policy network is composed of five fully connected neural network layers. The node counts of the first four layers of the Policy network decrease layer by layer, and the fifth layer has k neurons. Dropout is applied between the first and second layers and between the second and third layers of the Policy network to prevent overfitting, with the tanh activation function. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function. The fourth and fifth layers are fully connected to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity;
The reinforcement learning Value network is likewise composed of five fully connected layers. From the first layer to the fourth layer the fully connected layers decrease in size layer by layer, and the fifth layer has only one neuron. Dropout is applied between the first and second layers and between the second and third layers of the Value network to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function. The fourth and fifth layers are fully connected, and the output is the cumulative reward predicted by the Value network from the current state to the target state.
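For illustration only, the following is a minimal PyTorch sketch of these two five-layer networks; the layer widths follow the values given in the embodiment below (256/64/32/16/k for the Policy network and 256/128/64/32/1 for the Value network), while the dropout rate, the exact placement of batch normalization, and k = 10 are assumptions.

```python
import torch
import torch.nn as nn

class PolicyNet(nn.Module):
    """Five fully connected layers with decreasing widths; the last layer
    outputs probabilities over k candidate relations (the action choice)."""
    def __init__(self, in_dim=512, k=10):
        super().__init__()
        self.fc1, self.fc2 = nn.Linear(in_dim, 256), nn.Linear(256, 64)
        self.fc3, self.fc4 = nn.Linear(64, 32), nn.Linear(32, 16)
        self.fc5 = nn.Linear(16, k)
        self.drop = nn.Dropout(p=0.5)    # dropout between layers 1-2 and 2-3
        self.bn = nn.BatchNorm1d(32)     # batch normalization between layers 3 and 4

    def forward(self, x):
        x = self.drop(torch.tanh(self.fc1(x)))
        x = self.drop(torch.tanh(self.fc2(x)))
        x = self.bn(torch.sigmoid(self.fc3(x)))
        x = torch.sigmoid(self.fc4(x))
        return torch.softmax(self.fc5(x), dim=-1)   # probabilities of k relations

class ValueNet(nn.Module):
    """Five fully connected layers decreasing to a single neuron estimating
    the cumulative reward from the current state to the target state."""
    def __init__(self, in_dim=512):
        super().__init__()
        self.fc1, self.fc2 = nn.Linear(in_dim, 256), nn.Linear(256, 128)
        self.fc3, self.fc4 = nn.Linear(128, 64), nn.Linear(64, 32)
        self.fc5 = nn.Linear(32, 1)
        self.drop = nn.Dropout(p=0.5)
        self.bn = nn.BatchNorm1d(64)

    def forward(self, x):
        x = self.drop(torch.tanh(self.fc1(x)))
        x = self.drop(torch.tanh(self.fc2(x)))
        x = self.bn(torch.sigmoid(self.fc3(x)))
        x = torch.relu(self.fc4(x))
        return self.fc5(x)               # scalar value estimate
```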
The present invention also proposes a knowledge graph optimal path query method based on deep reinforcement learning, which specifically includes the following steps:
S1. First sort the entity relations in the probabilistic knowledge graph in descending order of users' access frequency within a unit time, select n relations, and generate the required data sample set;
S2. Input the data sample set into the deep reinforcement learning component for training;
S3. Carry out the training of the three stages in the deep reinforcement learning component, namely stage 1, stage 2, and stage 3, respectively;
Stage 1: convert the entity into an initial word vector using the encoder, then further process the encoded initial word vector through a 1-10 layer CNN convolutional neural network to convert it into the word vector required by the deep reinforcement learning component;
Stage 2: predict the relation the current entity will traverse next based on the reinforcement learning Policy network;
Stage 3: perform value calculation on the selected strategy based on the reinforcement learning Value network;
S4. After the training of step S3, obtain the optimal path model for queries;
S5. Input the start entity and the target entity, convert each into a word vector, then merge the two word vectors and input them to the optimal path model of step S4 until the target entity is found, finally obtaining an optimal query path whose starting point is the start entity and whose end point is the target entity.
Further, in step S1, n relations are chosen with n not less than 1/10 of the total number of entity relations in the probabilistic knowledge graph; γ = n/2 relations are randomly selected from these n relations, and these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, constitute the data sample set required for model training.
Further, stage 1 of step S3 converts the input entities e_1 and e_2 into two word vectors G_θ(e_1) and G_θ(e_2) through the encoder and the network component, where θ is the set of network parameters to be optimized. A similarity calculation is carried out on the two word vectors G_θ(e_1) and G_θ(e_2) obtained in stage 1 to find their cosine distance, as shown in the following formula:
D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos,
During training, the data samples received by the two networks can be represented as {(F, e_1, e_2)}, where F is the label of each data sample, and the training loss function is constructed as shown in the following formula:
where n is the total number of training samples.
Further, the loss function L(θ) needs to be minimized, and the loss function L(θ) can be refined as:
where L_s denotes the loss function between identical entities and L_u the loss function between different entities; L_u needs to be made as small as possible and L_s as large as possible.
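The formulas for L(θ), L_s, and L_u appear only as images in the published document. As a rough illustration only, a standard Siamese contrastive loss that achieves the stated end effect (small distances for identical entities, large distances for different entities) might look like the following sketch, where the margin and the exact split into L_s and L_u are assumptions:

```python
import torch
import torch.nn.functional as F

def cosine_distance(g1, g2):
    """D_theta(e1, e2): cosine distance between the two encoder outputs."""
    return 1.0 - F.cosine_similarity(g1, g2, dim=-1)

def siamese_loss(g1, g2, label, margin=1.0):
    """Sketch of L(theta). The label F is 1 for identical entities and 0
    otherwise. Same-entity pairs have their distance pushed small; pairs of
    different entities are pushed apart up to the (assumed) margin."""
    d = cosine_distance(g1, g2)
    l_same = label * d.pow(2)
    l_diff = (1 - label) * torch.clamp(margin - d, min=0).pow(2)
    return (l_same + l_diff).mean()
```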
Further, stage 2 and stage 3 of step S3 are carried out in the training component of the deep reinforcement learning component. The training component includes a policy network and a value network; stage 2 performs policy training and stage 3 performs value training, and the parameter sets of the two networks are optimized, namely the parameter θ_p of the Policy network and the parameter θ_v of the Value network. In both trainings, a four-tuple <state, reward, action, model> is used, where the state is represented by an entity in the probabilistic knowledge graph.
Further, the policy function and the value function are obtained through goal-driven deep reinforcement learning by the policy network and the value network: the policy function is fitted by a neural network as a nonlinear function approximator, giving the policy function f(e_t, g | θ_p); the value function, which fits the reward from the current node to the target node, is likewise obtained by a neural network as a nonlinear function approximator, giving the value function h(e_t, g | θ_v).
Further, the reward obtained from the value function is multiplied by the estimate of the strategy given by the policy function to represent the loss function of the policy network, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ·h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor. The derivative of L_f with respect to the parameter θ_p is taken, and the parameter θ_p of the Policy network is updated by gradient ascent, giving the following formula:
where ∇ denotes the derivative operation, an entropy term of the policy function f(e_t, g | θ_p) appears in the update, and β ∈ (0, 1) is the learning rate;
If the product of the current strategy and the reward brought by choosing that strategy is positive, the parameter θ_p of the Policy network is updated in the positive direction so that the probability of predicting that state next time increases; if the product is negative, the parameter θ_p of the Policy network is updated in the reverse direction so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates.
Further, the absolute value of the difference between the obtained value function h(e_t, g | θ_v) and the actual reward of the current entity, r_t + γ·h(e_{t+1}, g | θ_v), is calculated to obtain the loss function of the value network, as shown in the following formula:
L_h = |(r_t + γ·h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor. The derivative of L_h with respect to the parameter θ_v is taken, and the parameter θ_v of the Value network is updated by gradient descent, giving the following formula:
where ∇ denotes the derivative operation. If the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) is greater than the threshold l given by the user, the parameter θ_v of the Value network is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) no longer fluctuates outside the range [−l, l] of the user-given threshold.
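As a rough illustration of these two update rules, the sketch below performs one joint update in PyTorch; the optimizers, the entropy weighting, and the use of automatic differentiation in place of the hand-written gradient formulas (which appear as images in the published document) are assumptions:

```python
import torch

def actor_critic_step(policy_net, value_net, opt_p, opt_v,
                      x_t, x_next, action, r_t, gamma=0.9, beta=0.01):
    """One update of theta_p (ascent on L_f, plus an entropy bonus) and
    theta_v (descent on L_h). x_t / x_next are the fused current/target
    entity vectors; action indexes the chosen relation; r_t is its confidence."""
    probs = policy_net(x_t)                     # f(e_t, g | theta_p)
    v_t = value_net(x_t).squeeze(-1)            # h(e_t, g | theta_v)
    with torch.no_grad():
        v_next = value_net(x_next).squeeze(-1)  # h(e_{t+1}, g | theta_v)
    advantage = r_t + gamma * v_next - v_t      # r_t + gamma*h(e_{t+1}) - h(e_t)

    # L_f = log f * advantage; the entropy term discourages settling on a
    # suboptimal strategy too early. The sign is flipped because the
    # optimizer minimizes while the text updates theta_p by gradient ascent.
    log_prob = torch.log(probs.gather(-1, action.unsqueeze(-1)).squeeze(-1))
    entropy = -(probs * torch.log(probs + 1e-8)).sum(-1)
    loss_f = -(log_prob * advantage.detach() + beta * entropy).mean()
    opt_p.zero_grad(); loss_f.backward(); opt_p.step()

    # L_h = |target - prediction|, minimized by gradient descent.
    loss_h = (r_t + gamma * v_next - v_t).abs().mean()
    opt_v.zero_grad(); loss_h.backward(); opt_v.step()
    return loss_f.item(), loss_h.item()
```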
Compared with the prior art, the beneficial effects of the technical solution of the present invention are:
(1) The invention proposes probabilistic knowledge graphs and applies a randomized 0-1 treatment to entity relations, so that optimal path queries on the knowledge graph better match actual application requirements.
(2) Since the present invention trains by means of reinforcement learning, it on the one hand reduces the problem in existing deep learning methods whereby irrational label design degrades the final computation; on the other hand, by saving the shortest path from the current entity to a given entity in each step of the iterative process, this approach reduces the search space, making the model more adaptable and more accurate.
(3) The present invention is based on deep learning technology and merges the start word vector and the target word vector through two pre-trained convolutional neural networks with identical structure and shared weights, avoiding the need to restart training when the target entity changes, which increases the generalization ability of the model and improves computational accuracy.
(4) The logical structure inside each module of the present invention is clear, the computation is flexible, and there is good loose coupling. The network structure can be set flexibly to meet computational needs, without being limited by specific development tools and programming software; moreover, it can be quickly extended into distributed and parallel development environments. In particular, the reinforcement learning and deep learning can be computed in a distributed fashion, improving operational efficiency.
Description of the drawings
Fig. 1 is the technical framework diagram of a knowledge graph optimal path query method based on deep reinforcement learning.
Fig. 2 is the logical structure diagram of the deep reinforcement learning component.
Specific embodiment
The attached figures are only for illustrative purposes and cannot be understood as limiting the patent;
To those skilled in the art, it is to be understood that certain known structures and their explanations may be omitted in the drawings.
The following further describes the technical solution of the present invention with reference to the accompanying drawings and examples.
Embodiment 1
The invention proposes a knowledge graph optimal path query system based on deep reinforcement learning, as shown in Fig. 1, comprising two modules, module one and module two. Module one is the offline training module for the knowledge graph optimal path model, and module two is the online application module for the knowledge graph optimal path model. The offline training module is equipped with a deep reinforcement learning component that performs training on the current entity: module one converts the data and trains on it to obtain the next entity on the optimal route from the current entity to the target entity, then repeats the training with that next entity, eventually obtaining a trained optimal path model. In module two, the target entity and the start entity are converted and input into the optimal path model generated by module one, reinforcement is applied again, and the optimal query path is finally obtained. Through the cooperation between the two modules, the goals of high accuracy, strong generalization ability, high speed, and easy extension are achieved.
Module one first constructs the data sample set for offline training of the optimal path model, as follows: the entity relations in the probabilistic knowledge graph are first sorted in descending order of users' access frequency over the most recent m unit-time windows, and the top n relations are then selected, with n not less than 1/8 of the total number of entity relations in the probabilistic knowledge graph; γ = n/2 relations are then randomly selected from these n relations, and these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, constitute the data sample set required for model training.
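A minimal sketch of this sampling step follows; the relation record fields (.id, .head, .tail) and the shape of the frequency counter are assumptions:

```python
import random
from collections import Counter

def build_sample_set(relations, access_counts):
    """Rank relations by user access frequency over the last m unit-time
    windows, keep the top n (here at least 1/8 of all relations), then draw
    gamma = n/2 of them at random; each sample pairs a relation with the two
    entities it connects."""
    freq = Counter(access_counts)                       # relation id -> count
    ranked = sorted(relations, key=lambda r: freq[r.id], reverse=True)
    n = max(len(relations) // 8, 2)                     # n >= 1/8 of the total
    gamma = n // 2                                      # gamma = n / 2
    chosen = random.sample(ranked[:n], gamma)
    return [(r, r.head, r.tail) for r in chosen]
```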
On this basis, module one inputs each constructed data sample into the deep reinforcement learning component shown in Fig. 2 for training, searches for the maximum-probability next relation associated with the current entity, and, after merging, obtains the reward value of the next entity corresponding to the selected relation in order to update the parameters of the deep reinforcement learning component. Module one iterates this process and continuously updates the component parameters until the current entity is the target entity or the number of iterations exceeds the maximum iteration threshold given by the user, at which point a candidate path from the start entity to the target entity has been obtained. Module one then calculates the total reward of the current candidate path and compares it with the total rewards of the full paths queried before; if the reward of the current path is higher than those of the previous query paths, it is taken as the optimal path of the query, thus yielding the optimal path model. The above process is executed repeatedly until the parameters of the deep reinforcement learning component converge.
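A loose sketch of this outer loop, where step_fn (one policy prediction plus entity transition) and total_reward (sum of relation confidences along a path) stand in for the components described above and are assumptions:

```python
def training_episode(s, g, step_fn, total_reward, best_path, c_max=100):
    """Roll out one candidate path from start entity s toward target g, at
    most c_max steps (the user-given maximum iteration threshold), then keep
    it as the optimal path if its total reward beats the best full path so far."""
    path, current = [s], s
    for _ in range(c_max):
        if current == g:
            break
        current = step_fn(current, g)    # predict a relation, move to next entity
        path.append(current)
    if best_path is None or total_reward(path) > total_reward(best_path):
        best_path = path                 # new optimal path for this query
    return best_path
```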
The deep reinforcement learning component of module one, as shown in Fig. 2, consists of a word2vec (word embedding) encoder, a CNN (Convolutional Neural Network), an FC (fully connected) neural network, a reinforcement learning Policy (policy) network, a reinforcement learning Value (value) network, and a logistic regression component. The training process of the deep reinforcement learning component is broadly divided into three stages: stage 1 uses the word2vec encoder to convert entities into initial word vectors, which are then further processed by a multi-layer CNN and converted into the word vectors required by the deep reinforcement learning component; stage 2 predicts, based on the reinforcement learning Policy network, the relation the current entity will traverse next; stage 3 performs value calculation on the selected strategy based on the reinforcement learning Value network.
In stage 1, the present invention first inputs c entities and converts them into c corresponding word vectors through the word2vec word embedding encoder; the dimensions of these c word vectors are identical. Then, 2 word vectors are randomly selected from the c entity word vectors and input into the multi-layer CNN. The multi-layer CNN has 8 layers in total: the first layer performs convolution on the 2 input entity word vectors respectively; the second layer performs max pooling on the convolution output of the first layer; the third and fourth layers continue to convolve the data obtained by the second-layer pooling; then, after the max pooling layer of the fifth layer, the data passes in turn through the sixth and seventh layers for convolution, and finally the average pooling layer of the eighth layer yields the two final word vectors. In particular, after the second and fifth layers complete max pooling, batch normalization is applied to their outputs. The word vectors obtained by the eighth layer are thus the output of stage 1. The training task of the multi-layer CNN is to calculate the distance between the two word vectors obtained at the eighth layer, making the word vector distance of positive samples as small as possible and that of negative samples as large as possible. In addition, the two multi-layer convolutional neural networks have identical structure and shared network weights.
Stage 2 mainly trains the reinforcement learning Policy (policy) network. The present invention first takes the word vector of the current entity and the word vector of the target entity as input and passes them through a fully connected layer; the resulting output vector serves as the input word vector of the Policy network. The Policy network is composed of five fully connected layers; the node counts of the first four layers decrease layer by layer, and the fifth layer has k neurons. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting, with the tanh activation function. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function. The fourth and fifth layers are fully connected to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity. The output of the Policy network is the relation with the maximum probability, taken as the behavior (Action) obtained by the Policy network. The k relations are selected as follows: first select the k_1 relations with the highest confidence, then randomly choose k − k_1 from the remaining relations, and sort them in descending order of confidence, obtaining the k maximum-confidence relations output by the Policy network. The training task of the Policy network is to select the best strategy as far as possible, so that the next entity reached by the selected relation brings the maximum reward.
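A minimal sketch of this candidate-relation selection; k = 10 and k_1 = 7 follow the example values used later in stage 2 of the embodiment, and the relation record's confidence field is an assumption:

```python
import random

def select_candidate_relations(outgoing, k=10, k1=7):
    """Candidate set fed to the Policy network: the k1 highest-confidence
    outgoing relations plus k - k1 random picks from the rest, re-sorted in
    descending order of confidence. (If fewer than k exist, the embodiment
    pads the remaining slots with zeros.)"""
    ranked = sorted(outgoing, key=lambda r: r.confidence, reverse=True)
    extra = random.sample(ranked[k1:], min(k - k1, max(len(ranked) - k1, 0)))
    return sorted(ranked[:k1] + extra, key=lambda r: r.confidence, reverse=True)
```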
Stage 3 mainly trains the reinforcement learning Value (value) network. The input of the Value network is identical to that of the Policy network: the word vector of the current entity and the word vector of the target entity, passed through the fully connected layer to obtain the output vector. The Value network is composed of five fully connected layers; from the first layer to the fourth layer the fully connected layers decrease in size layer by layer, and the fifth layer has only one neuron. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid. Batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function. The fourth and fifth layers are fully connected, and the output is the cumulative reward predicted by the Value network from the current state to the target state. The training task of the Value network is to make the reward predicted in the current state have as small an error as possible from the sum of the confidence of the relation given by the Policy network and the reward predicted in the next state.
Module two takes a start entity and a target entity in the probabilistic knowledge graph as input; each passes in turn through the word2vec word embedding encoder and the 8-layer CNN to be converted into a one-dimensional word vector. The two one-dimensional word vectors are then merged and used as the input of the reinforcement learning Policy network and Value network. The Policy network and the Value network work in concert and, starting from the start entity, each time provide the next entity on the optimal route from the current entity to the target entity, until the target entity is found. Finally an optimal query path is obtained whose starting point is the start entity and whose end point is the target entity.
The present invention also proposes a knowledge graph optimal path query method based on deep reinforcement learning, which specifically includes the following steps:
S1. First sort the entity relations in the probabilistic knowledge graph in descending order of users' access frequency over the most recent m unit-time windows, then choose the top n relations, with n not less than 1/8 of the total number of entity relations in the probabilistic knowledge graph; randomly select γ = n/2 relations from these n relations, and let these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, constitute the data sample set required for model training.
S2. Then use the word2vec word embedding encoder of *** company to convert the input current entity and target entity into two one-dimensional word vectors of length 512 respectively.
S3. Then carry out the training of the three stages in the deep reinforcement learning component, namely stage 1, stage 2, and stage 3, respectively.
Stage 1: construct two CNN convolutional neural networks with identical structure and shared weights, as follows:
The first layer of the CNN contains 512 neurons and uses 2 convolution kernels of size 2 × 1 with the sliding stride fixed at 2; this layer mainly convolves the one-dimensional word vector (of length 512) produced by the preceding word2vec word embedding encoder, obtaining 2 one-dimensional vectors of length 256. Then, the second layer of the CNN applies max pooling to the 2 one-dimensional word vectors output by the first layer, using 2 kernels of size 2 × 1 with a sliding stride of 1, thus obtaining 2 one-dimensional vectors of length 256; on this basis, batch normalization is applied to these 2 one-dimensional vectors. Next, the third layer of the CNN uses 4 convolution kernels of size 4 × 1 to convolve the 2 batch-normalized one-dimensional vectors output by the second layer, with the sliding stride fixed at 4, obtaining 8 one-dimensional vectors of length 64. Then, the fourth layer of the CNN uses 1 convolution kernel of size 4 × 1 with a sliding stride of 1 to convolve the 8 one-dimensional vectors output by the third layer again, likewise obtaining 8 one-dimensional vectors of length 64. Then, the fifth layer of the CNN applies max pooling to the fourth layer's 8 one-dimensional vectors, with kernel size equal to 2 × 1, 4 kernels, and a sliding stride of 2, thus obtaining 32 one-dimensional vectors of length 32; on this basis, batch normalization is applied to these 32 one-dimensional vectors. Then, the sixth layer of the network uses 2 convolution kernels of size 4 × 1 to convolve the 32 batch-normalized one-dimensional vectors output by the fifth layer, with the sliding stride fixed at 2, thus obtaining 64 one-dimensional vectors of length 16. Then, the seventh layer of the network uses 4 convolution kernels of size 4 × 1 with a sliding stride of 4 to convolve the 64 one-dimensional vectors output by the sixth layer, obtaining 256 one-dimensional vectors of length 4. Finally, the eighth layer of the network applies average pooling and finally obtains 256 one-dimensional vectors of 4 dimensions; these 256 one-dimensional vectors are then fully connected to 512 neurons, thereby obtaining a one-dimensional vector of length 512.
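As a loose sketch only, the PyTorch module below mirrors the layer order above, with kernel sizes and strides taken from the embodiment where the arithmetic is unambiguous; channel counts and padding are assumptions, since the patent's pooling layers multiply the number of feature maps (which standard pooling does not), so that expansion is folded into the convolutions here:

```python
import torch
import torch.nn as nn

class EntityEncoder(nn.Module):
    """8-layer 1-D CNN over a length-512 word2vec vector, followed by the
    512-neuron fully connected projection. Layer comments give the target
    shapes stated in the embodiment."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 2, kernel_size=2, stride=2),                  # L1: -> 2 x 256
            nn.MaxPool1d(kernel_size=2, stride=1, padding=1),          # L2: max pooling
            nn.BatchNorm1d(2),                                         # batch norm after L2
            nn.Conv1d(2, 8, kernel_size=4, stride=4),                  # L3: -> 8 x 64
            nn.Conv1d(8, 8, kernel_size=4, stride=1, padding='same'),  # L4: -> 8 x 64
            nn.MaxPool1d(kernel_size=2, stride=2),                     # L5: max pooling -> 32
            nn.BatchNorm1d(8),                                         # batch norm after L5
            nn.Conv1d(8, 64, kernel_size=4, stride=2, padding=1),      # L6: -> 64 x 16
            nn.Conv1d(64, 256, kernel_size=4, stride=4),               # L7: -> 256 x 4
            nn.AdaptiveAvgPool1d(4),                                   # L8: average pooling
            nn.Flatten(),                                              # 256 * 4 = 1024
        )
        self.fc = nn.Linear(1024, 512)        # full connection to 512 neurons

    def forward(self, x):                     # x: (batch, 1, 512)
        return self.fc(self.net(x))
```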
After the two CNN convolutional neural networks with identical structure and shared weights are constructed, the present invention trains them and optimizes their parameters using the entities and relations in the probabilistic knowledge graph, as follows:
The inputs of the two CNN convolutional neural networks are two entities e_1 and e_2 respectively, and the outputs are two one-dimensional vectors of length 512, G_θ(e_1) and G_θ(e_2), where θ is the set of network parameters to be optimized. A similarity calculation is then carried out on the two one-dimensional vectors to find their cosine distance: D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos. If the two entities e_1 and e_2 differ greatly, D_θ(e_1, e_2) is large; if e_1 and e_2 are identical or similar, D_θ(e_1, e_2) is small.
Therefore, during training, the data samples received by the two CNN convolutional neural networks can be represented as (F, e_1, e_2), where F is the label of each data sample: if e_1 and e_2 denote the same entity, then F = 1, otherwise F = 0. The training loss function is thus constructed as:
where n is the total number of training samples.
On this basis, L_s denotes the loss function between identical entities and L_u the loss function between different entities. To achieve the goal of minimizing the loss function L(θ), L_u needs to be made as small as possible and L_s as large as possible. The training loss function L(θ) can thus be refined as:
During training, minimizing the loss function L(θ) finally makes the distance between identical entities as small as possible and the distance between different entities as large as possible, increasing the discrimination of the samples. In addition, during training, 1,000,000 sample entities are chosen, from which 250,000 pairs of identical entities are randomly drawn as positive samples and 250,000 pairs of different entities as negative samples; these are mixed and then input into the network for training.
After the calculation by the two CNN convolutional neural networks, the one-dimensional vectors of length 512 corresponding to the current entity and the target entity are obtained. The two one-dimensional vectors are then given a further full-connection operation: the two length-512 one-dimensional vectors are directly concatenated into a one-dimensional vector of length 1024, which is then fed into a fully connected layer of 512 neurons, finally obtaining a one-dimensional vector of length 512. We use it to represent the merged current entity and target entity;
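A minimal sketch of this merge step, assuming PyTorch tensors of length 512 per entity:

```python
import torch
import torch.nn as nn

merge_fc = nn.Linear(1024, 512)   # the 512-neuron fully connected merge layer

def merge(current_vec, target_vec):
    """Concatenate the two length-512 entity vectors into a length-1024
    vector, then project back to length 512 to represent the fused current
    entity and target entity."""
    fused = torch.cat([current_vec, target_vec], dim=-1)   # length 1024
    return merge_fc(fused)
```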
Stage 2 and stage 3 mainly train the Policy network and the Value network in the deep reinforcement learning component and optimize the parameter sets of the two networks, namely the parameter θ_p of the Policy network and the parameter θ_v of the Value network. The above two stages are iterated continuously to search for the next optimal strategy and dynamically update the parameters θ_p and θ_v, until the globally optimal strategy is obtained. Each round of iteration finds a target entity within a finite number of steps and updates the parameters θ_p and θ_v. In particular, module one sets a maximum iteration count c_max; if the current iteration count exceeds it, iteration stops.
To this end, the present invention first defines, based on the probabilistic knowledge graph, the four-tuple <state, reward, action, model> required in the training of the two networks. The state is represented by an entity in the probabilistic knowledge graph, such as the current entity e_t, the target entity g, and the start entity s. The reward from the current entity e_t to the next entity e_{t+1} is denoted r_t, where r_t equals the confidence of the relation between e_t and e_{t+1}. The action, denoted m, is the agent's action selection and corresponds to the relation between the current entity and the next entity in the probabilistic knowledge graph. Finally, the model denotes the policy function or value function of goal-driven deep reinforcement learning in the Policy network or Value network: for the policy function, the present invention fits a neural network as a nonlinear function approximator, i.e., the policy function is f(e_t, g | θ_p); for the value function, the present invention likewise fits a neural network as a nonlinear function approximator to the reward from the current node to the target node, i.e., the value function is h(e_t, g | θ_v).
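The four-tuple might be carried through training as a small record like the following sketch (the field types are assumptions):

```python
from dataclasses import dataclass

@dataclass
class Transition:
    """The <state, reward, action, model> four-tuple of the embodiment."""
    state: str      # current entity e_t (the state is an entity in the graph)
    reward: float   # r_t, the confidence of the relation e_t -> e_{t+1}
    action: int     # m, the agent's chosen relation to the next entity
    model: str      # which function produced the step: policy f or value h
```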
Stage 2: first randomly initialize the parameter set θ_p of the Policy network. The Policy network then receives the one-dimensional vector corresponding to the current entity and target entity as input. The first layer of the Policy network has 256 neurons, fully connected to the one-dimensional vector (of length 512) corresponding to the current entity and target entity; the second layer has 64 neurons; the third layer has 32 neurons; the fourth layer has 16 neurons; the fifth layer has 10 neurons, representing the values of the 10 output entities and the probabilities of selecting these 10 entities. These 10 entities are jointly composed of the 7 highest-confidence entities among the next-layer entities of the current entity and 3 entities randomly selected from the remaining entities; if the number of next-layer entities is less than 10, the surplus elements are filled with 0. The first, second, and third layers all use the tanh activation function, while the fourth and fifth layers use the sigmoid activation function. Meanwhile, dropout and batch normalization are applied between layers to improve prediction accuracy. Finally, the 10 neurons of the fifth layer output the probabilities of the 10 relations selected by the Policy network, and the maximum-probability relation is then obtained as the behavior selection through the softmax function.
During the training of stage 2, the reward obtained based on the value function and the estimate of the strategy given by the current policy function are multiplied to represent the loss function of the Policy network, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ·h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor. Then the derivative of L_f with respect to the parameter θ_p is taken, and the parameter θ_p is updated by gradient ascent, which gives:
where ∇ denotes the derivative operation, an entropy term of the policy function f(e_t, g | θ_p) appears in the update, and β ∈ (0, 1) is the learning rate. The purpose of adding the entropy term is to prevent the Policy network from obtaining a suboptimal strategy too early and falling into a local optimum. If the product of the current strategy and the reward brought by choosing that strategy is positive, θ_p is updated in the positive direction so that the probability of predicting that state next time increases; if the product is negative, θ_p is updated in the reverse direction so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates;
Stage 3: first randomly initialize the parameter set θ_v of the Value network. Then, like the Policy network, the Value network receives the one-dimensional vector corresponding to the current entity and target entity as input. The first layer of the Value network has 256 neurons, fully connected to the one-dimensional vector (of length 512) corresponding to the current entity and target entity; the second layer has 128 neurons; the third layer has 64 neurons; the fourth layer has 32 neurons; the fifth layer has one neuron representing the value of the current state. Dropout is applied between the first and second layers and between the second and third layers to prevent overfitting. The first and second layers both use the tanh activation function, and the third and fourth layers both use the sigmoid activation function. Batch normalization is applied between the third and fourth layers to enhance the generalization ability of the model. The fourth and fifth layers use a fully connected neural network to finally obtain the predicted value.
During the training of stage 3, the absolute value of the difference between the actual reward of the current entity, r_t + γ·h(e_{t+1}, g | θ_v), and the predicted reward h(e_t, g | θ_v) is calculated and used as the loss function of the Value network, as shown in the following formula:
L_h = |(r_t + γ·h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor. Then the derivative of L_h with respect to the parameter θ_v is taken, and the parameter θ_v is updated by gradient descent, which gives:
where ∇ denotes the derivative operation. If the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) is greater than the threshold l given by the user, θ_v is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) no longer fluctuates outside the range [−l, l] of the user-given threshold;
S4. During the iterative process, the parameters of the deep reinforcement learning component are continuously updated until the current entity is the target entity or the number of iterations exceeds the maximum iteration threshold given by the user, at which point a candidate path from the start entity to the target entity has been obtained. Then, the module calculates the total reward of the current candidate path and compares it with the total rewards of the full paths queried before; if the reward of the current path is higher than those of the previous query paths, it is taken as the optimal path model of the query. The above process is executed repeatedly until the parameters of the deep reinforcement learning component converge.
S5. Input two entities in the probabilistic knowledge graph, namely the start entity s and the target entity g, and convert each of them into a one-dimensional vector of length 512 through the trained word2vec word embedding encoder. Then merge the two vectors into a one-dimensional vector of length 1024 and use it as the input of the trained multi-layer CNN convolutional neural networks, obtaining the length-512 one-dimensional vectors corresponding to the start entity and the target entity respectively. On this basis, the two one-dimensional vectors are then passed through the fully connected layer to generate a new vector of length 1024, which serves as the input of the trained reinforcement learning Policy network and Value network. The Policy network and the Value network work in concert and, starting from the start entity, each time provide the next entity on the optimal route from the current entity to the target entity, until the target entity is found. Finally an optimal query path Path(s, g) is obtained whose starting point is the start entity s and whose end point is the target entity g.
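Pulling S5 together, the online query might run as in the following sketch, where encode (word2vec plus the shared-weight CNN), merge, candidate_relations, and the step bound are assumptions standing in for the components described above (the Value network's online role is folded out for brevity):

```python
import torch

def query_optimal_path(s, g, encode, merge, policy_net, candidate_relations,
                       max_steps=100):
    """Module two's online query: starting from start entity s, repeatedly
    let the Policy network pick the maximum-probability relation toward the
    target entity g, following it until g is reached. Returns Path(s, g)."""
    path, current = [s], s
    g_vec = encode(g)                          # length-512 target vector
    while current != g and len(path) <= max_steps:
        x = merge(encode(current), g_vec)      # fused input for the networks
        probs = policy_net(x).squeeze(0)       # probabilities over k relations
        candidates = candidate_relations(current)
        if not candidates:                     # dead end: no outgoing relations
            break
        best = candidates[int(torch.argmax(probs[:len(candidates)]))]
        current = best.tail                    # follow the chosen relation
        path.append(current)
    return path
```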
Finally, it should be noted that the above embodiments are only used to illustrate the technical scheme of the present invention and not to limit it. Although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that the technical scheme of the present invention may be modified or equivalently replaced without departing from the spirit and scope of the technical scheme of the present invention, all of which should be covered by the scope of the claims of the present invention.

Claims (10)

1. A knowledge graph optimal path query system based on deep reinforcement learning, characterized in that it comprises two modules, module one and module two, wherein module one is an offline training module for the knowledge graph optimal path model and module two is an online application module for the knowledge graph optimal path model; the offline training module is equipped with a deep reinforcement learning component; deep reinforcement learning training is performed on the current entity to obtain the next entity, and the training is repeated with the next entity as the current entity to obtain an optimal path model; the start entity and the target entity are then input to the optimal path model obtained by module one, and the optimal path is finally obtained.
2. The knowledge graph optimal path query system based on deep reinforcement learning according to claim 1, characterized in that the deep reinforcement learning component consists of an encoder, a network component, and a logistic regression component; the network component includes a conversion component and a training component; the conversion component includes a CNN neural network and an FC neural network, and the training component includes a reinforcement learning Policy network and a reinforcement learning Value network.
3. The knowledge graph optimal path query system based on deep reinforcement learning according to claim 2, characterized in that the reinforcement learning Policy network is composed of five fully connected neural network layers; the node counts of the first four layers of the Policy network decrease layer by layer, and the fifth layer has k neurons; dropout is applied between the first and second layers and between the second and third layers of the Policy network to prevent overfitting, with the tanh activation function; batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the sigmoid activation function; the fourth and fifth layers are fully connected to obtain the probabilities of the k predicted relations, which serve as the action selection for the next entity;
the reinforcement learning Value network is composed of five fully connected layers; from the first layer to the fourth layer the fully connected layers decrease in size layer by layer, and the fifth layer has only one neuron; dropout is applied between the first and second layers and between the second and third layers of the Value network to prevent overfitting; the activation functions of the first and second layers are both tanh, and the activation function of the third layer is sigmoid; batch normalization is used between the third and fourth layers to enhance the generalization ability of the model, with the relu activation function; the fourth and fifth layers are fully connected, and the output is the cumulative reward predicted by the Value network from the current state to the target state.
4. A knowledge graph optimal path query method based on deep reinforcement learning, characterized by comprising the following steps:
S1. First sort the entity relations in the probabilistic knowledge graph in descending order of users' access frequency within a unit time, select n relations, and generate the required data sample set;
S2. Input the data sample set into the deep reinforcement learning component for training;
S3. Carry out the training of the three stages in the deep reinforcement learning component, namely stage 1, stage 2, and stage 3, respectively;
Stage 1: convert the entity into an initial word vector using the encoder, then further process the encoded initial word vector through a 1-10 layer CNN convolutional neural network to convert it into the word vector required by the deep reinforcement learning component;
Stage 2: predict the relation the current entity will traverse next based on the reinforcement learning Policy network;
Stage 3: perform value calculation on the selected strategy based on the reinforcement learning Value network;
S4. After the training of step S3, obtain the optimal path model for queries;
S5. Input the start entity and the target entity, convert each into a word vector, then merge the two word vectors and input them to the optimal path model of step S4 until the target entity is found, finally obtaining an optimal query path whose starting point is the start entity and whose end point is the target entity.
5. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 4, characterized in that n relations are chosen in step S1 with n not less than 1/10 of the total number of entity relations in the probabilistic knowledge graph; γ = n/2 relations are randomly selected from these n relations, and these γ relations in the probabilistic knowledge graph, together with the two entities connected by each relation, constitute the data sample set required for model training.
6. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 4, characterized in that stage 1 of step S3 converts the input entities e_1 and e_2 into two word vectors G_θ(e_1) and G_θ(e_2) through the encoder and the network component, where θ is the set of network parameters to be optimized; a similarity calculation is carried out on the two word vectors G_θ(e_1) and G_θ(e_2) obtained in stage 1 to find their cosine distance, as shown in the following formula:
D_θ(e_1, e_2) = ||G_θ(e_1) − G_θ(e_2)||_cos,
during training, the data samples received by the two networks can be represented as {(F, e_1, e_2)}, where F is the label of each data sample, and the training loss function is constructed as shown in the following formula:
where n is the total number of training samples;
stage 2 and stage 3 of step S3 are carried out in the training component of the deep reinforcement learning component; stage 2 performs policy training and stage 3 performs value training; the parameter sets of the two networks, namely the parameter θ_p of the Policy network and the parameter θ_v of the Value network, are optimized during training, and a four-tuple <state, reward, action, model> is used, where the state is represented by an entity in the probabilistic knowledge graph.
7. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 6, characterized in that the loss function L(θ) needs to be minimized, and the loss function L(θ) can be refined as:
where L_s denotes the loss function between identical entities and L_u the loss function between different entities; L_u needs to be made as small as possible and L_s as large as possible.
8. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 6, characterized in that the policy function and the value function are obtained through goal-driven deep reinforcement learning by the policy network and the value network: the policy function is fitted by a neural network as a nonlinear function approximator, giving the policy function f(e_t, g | θ_p); the value function, which fits the reward from the current node to the target node, is likewise obtained by a neural network as a nonlinear function approximator, giving the value function h(e_t, g | θ_v).
9. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 8, characterized in that the reward obtained from the value function is multiplied by the estimate of the strategy given by the policy function to represent the loss function of the policy network, as shown in the following formula:
L_f = log f(e_t, g | θ_p) × (r_t + γ·h(e_{t+1}, g | θ_v) − h(e_t, g | θ_v)),
where γ ∈ (0, 1) denotes the discount factor; the derivative of L_f with respect to the parameter θ_p is taken, and the parameter θ_p of the Policy network is updated by gradient ascent, giving the following formula:
where ∇ denotes the derivative operation, an entropy term of the policy function f(e_t, g | θ_p) appears in the update, and β ∈ (0, 1) is the learning rate;
if the product of the current strategy and the reward brought by choosing that strategy is positive, the parameter θ_p of the Policy network is updated in the positive direction so that the probability of predicting that state next time increases; if the product is negative, the parameter θ_p of the Policy network is updated in the reverse direction so that the probability of predicting that state next time is as small as possible, until the strategy predicted by the current network no longer fluctuates.
10. The knowledge graph optimal path query method based on deep reinforcement learning according to claim 8, characterized in that the absolute value of the difference between the obtained value function h(e_t, g | θ_v) and the actual reward of the current entity, r_t + γ·h(e_{t+1}, g | θ_v), is calculated to obtain the loss function of the value network, as shown in the following formula:
L_h = |(r_t + γ·h(e_{t+1}, g | θ_v)) − h(e_t, g | θ_v)|,
where γ ∈ (0, 1) denotes the discount factor; the derivative of L_h with respect to the parameter θ_v is taken, and the parameter θ_v of the Value network is updated by gradient descent, giving the following formula:
where ∇ denotes the derivative operation; if the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) is greater than the threshold l given by the user, the parameter θ_v of the Value network is updated so that the prediction error is as small as possible, until the error between the predicted reward h(e_t, g | θ_v) and the calculated reward r_t + γ·h(e_{t+1}, g | θ_v) no longer fluctuates outside the range [−l, l] of the user-given threshold.
CN201810791353.6A 2018-07-18 2018-07-18 Knowledge graph optimal path query system and method based on deep reinforcement learning Active CN109241291B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810791353.6A CN109241291B (en) 2018-07-18 2018-07-18 Knowledge graph optimal path query system and method based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810791353.6A CN109241291B (en) 2018-07-18 2018-07-18 Knowledge graph optimal path query system and method based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN109241291A true CN109241291A (en) 2019-01-18
CN109241291B CN109241291B (en) 2022-02-15

Family

ID=65072112

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810791353.6A Active CN109241291B (en) 2018-07-18 2018-07-18 Knowledge graph optimal path query system and method based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN109241291B (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109818786A * 2019-01-20 2019-05-28 北京工业大学 An application-aware distributed multi-resource combination optimal path selection method for cloud data centers
CN109829579A (en) * 2019-01-22 2019-05-31 平安科技(深圳)有限公司 Minimal path calculation method, device, computer equipment and storage medium
CN109947098A * 2019-03-06 2019-06-28 天津理工大学 A distance-first optimal route selection method based on a machine learning strategy
CN110288878A (en) * 2019-07-01 2019-09-27 科大讯飞股份有限公司 Adaptive learning method and device
CN110347857A * 2019-06-06 2019-10-18 武汉理工大学 Semantic annotation method for remote sensing images based on reinforcement learning
CN110391843A * 2019-06-19 2019-10-29 北京邮电大学 Transmission quality prediction and route selection method and system for multi-domain optical networks
CN110825890A (en) * 2020-01-13 2020-02-21 成都四方伟业软件股份有限公司 Method and device for extracting knowledge graph entity relationship of pre-training model
CN110825821A (en) * 2019-09-30 2020-02-21 深圳云天励飞技术有限公司 Personnel relationship query method and device, electronic equipment and storage medium
CN110956254A (en) * 2019-11-12 2020-04-03 浙江工业大学 Case reasoning method based on dynamic knowledge representation learning
CN110990548A (en) * 2019-11-29 2020-04-10 支付宝(杭州)信息技术有限公司 Updating method and device of reinforcement learning model
CN111382359A (en) * 2020-03-09 2020-07-07 北京京东振世信息技术有限公司 Service strategy recommendation method and device based on reinforcement learning and electronic equipment
CN111401557A (en) * 2020-06-03 2020-07-10 超参数科技(深圳)有限公司 Agent decision making method, AI model training method, server and medium
CN111563209A (en) * 2019-01-29 2020-08-21 株式会社理光 Intention identification method and device and computer readable storage medium
CN111581343A (en) * 2020-04-24 2020-08-25 北京航空航天大学 Reinforced learning knowledge graph reasoning method and device based on graph convolution neural network
CN111597209A (en) * 2020-04-30 2020-08-28 清华大学 Database materialized view construction system, method and system creation method
CN111611339A (en) * 2019-02-22 2020-09-01 北京搜狗科技发展有限公司 Recommendation method and device for inputting related users
CN112801731A (en) * 2021-01-06 2021-05-14 广东工业大学 Federal reinforcement learning method for order taking auxiliary decision
CN112966591A (en) * 2021-03-03 2021-06-15 河北工业职业技术学院 Knowledge map deep reinforcement learning migration system for mechanical arm grabbing task
CN113255347A (en) * 2020-02-10 2021-08-13 阿里巴巴集团控股有限公司 Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment
CN114248265A (en) * 2020-09-25 2022-03-29 广州中国科学院先进技术研究所 Multi-task intelligent robot learning method and device based on meta-simulation learning
CN114626530A (en) * 2022-03-14 2022-06-14 电子科技大学 Reinforced learning knowledge graph reasoning method based on bilateral path quality assessment
CN115099401A (en) * 2022-05-13 2022-09-23 清华大学 Learning method, device and equipment of continuous learning framework based on world modeling
CN115936091A (en) * 2022-11-24 2023-04-07 北京百度网讯科技有限公司 Deep learning model training method and device, electronic equipment and storage medium
CN117009548A (en) * 2023-08-02 2023-11-07 广东立升科技有限公司 Knowledge graph supervision system based on secret equipment maintenance

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170124497A1 * 2015-10-28 2017-05-04 Fractal Industries, Inc. System for automated capture and analysis of business information for reliable business venture outcome prediction
CN106776729A * 2016-11-18 2017-05-31 同济大学 Method for building a large-scale knowledge graph path query predictor
CN106598856A * 2016-12-14 2017-04-26 广东威创视讯科技股份有限公司 Path detection method and path detection device
CN106934012A * 2017-03-10 2017-07-07 上海数眼科技发展有限公司 Natural language question answering method and system based on knowledge graph
CN107577805A * 2017-09-26 2018-01-12 华南理工大学 Business service system for log big data analysis
CN107944025A * 2017-12-12 2018-04-20 北京百度网讯科技有限公司 Information pushing method and device
CN108073711A * 2017-12-21 2018-05-25 北京大学深圳研究生院 Relation extraction method and system based on knowledge graph

Cited By (39)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109818786B * 2019-01-20 2021-11-26 北京工业大学 Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center
CN109818786A * 2019-01-20 2019-05-28 北京工业大学 Method for optimally selecting distributed multi-resource combined path capable of sensing application of cloud data center
CN109829579A * 2019-01-22 2019-05-31 平安科技(深圳)有限公司 Shortest path calculation method, device, computer equipment and storage medium
CN111563209A * 2019-01-29 2020-08-21 株式会社理光 Intent recognition method and device, and computer-readable storage medium
CN111611339A * 2019-02-22 2020-09-01 北京搜狗科技发展有限公司 Recommendation method and device for input-related users
CN109947098A * 2019-03-06 2019-06-28 天津理工大学 Distance-first optimal route selection method based on a machine learning strategy
CN110347857A * 2019-06-06 2019-10-18 武汉理工大学 Semantic annotation method for remote sensing images based on reinforcement learning
CN110391843A * 2019-06-19 2019-10-29 北京邮电大学 Transmission quality prediction and path selection method and system for multi-domain optical network
CN110391843B * 2019-06-19 2021-01-05 北京邮电大学 Transmission quality prediction and path selection method and system for multi-domain optical network
CN110288878A * 2019-07-01 2019-09-27 科大讯飞股份有限公司 Adaptive learning method and device
CN110288878B * 2019-07-01 2021-10-08 科大讯飞股份有限公司 Adaptive learning method and device
CN110825821A * 2019-09-30 2020-02-21 深圳云天励飞技术有限公司 Personnel relationship query method and device, electronic equipment and storage medium
CN110825821B * 2019-09-30 2022-11-22 深圳云天励飞技术有限公司 Personnel relationship query method and device, electronic equipment and storage medium
CN110956254A * 2019-11-12 2020-04-03 浙江工业大学 Case reasoning method based on dynamic knowledge representation learning
CN110990548A * 2019-11-29 2020-04-10 支付宝(杭州)信息技术有限公司 Method and device for updating a reinforcement learning model
CN110990548B * 2019-11-29 2023-04-25 支付宝(杭州)信息技术有限公司 Method and device for updating a reinforcement learning model
CN110825890A * 2020-01-13 2020-02-21 成都四方伟业软件股份有限公司 Method and device for extracting knowledge graph entity relations using a pre-trained model
CN113255347B * 2020-02-10 2022-11-15 阿里巴巴集团控股有限公司 Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment
CN113255347A * 2020-02-10 2021-08-13 阿里巴巴集团控股有限公司 Method and equipment for realizing data fusion and method for realizing identification of unmanned equipment
CN111382359B * 2020-03-09 2024-01-12 北京京东振世信息技术有限公司 Service policy recommendation method and device based on reinforcement learning, and electronic equipment
CN111382359A * 2020-03-09 2020-07-07 北京京东振世信息技术有限公司 Service policy recommendation method and device based on reinforcement learning, and electronic equipment
CN111581343A * 2020-04-24 2020-08-25 北京航空航天大学 Reinforcement learning knowledge graph reasoning method and device based on graph convolutional neural network
CN111581343B * 2020-04-24 2022-08-30 北京航空航天大学 Reinforcement learning knowledge graph reasoning method and device based on graph convolutional neural network
CN111597209A * 2020-04-30 2020-08-28 清华大学 Database materialized view construction system and method, and system creation method
CN111597209B * 2020-04-30 2023-11-14 清华大学 Database materialized view construction system and method, and system creation method
CN111401557B * 2020-06-03 2020-09-18 超参数科技(深圳)有限公司 Agent decision-making method, AI model training method, server and medium
CN111401557A * 2020-06-03 2020-07-10 超参数科技(深圳)有限公司 Agent decision-making method, AI model training method, server and medium
CN114248265A * 2020-09-25 2022-03-29 广州中国科学院先进技术研究所 Multi-task intelligent robot learning method and device based on meta-imitation learning
CN114248265B * 2020-09-25 2023-07-07 广州中国科学院先进技术研究所 Multi-task intelligent robot learning method and device based on meta-imitation learning
CN112801731A * 2021-01-06 2021-05-14 广东工业大学 Federated reinforcement learning method for order-taking decision support
CN112966591B * 2021-03-03 2023-01-20 河北工业职业技术学院 Knowledge graph deep reinforcement learning transfer system for robotic arm grasping tasks
CN112966591A * 2021-03-03 2021-06-15 河北工业职业技术学院 Knowledge graph deep reinforcement learning transfer system for robotic arm grasping tasks
CN114626530A * 2022-03-14 2022-06-14 电子科技大学 Reinforcement learning knowledge graph reasoning method based on bilateral path quality assessment
CN115099401A * 2022-05-13 2022-09-23 清华大学 Learning method, device and equipment for a continual learning framework based on world modeling
CN115099401B * 2022-05-13 2024-04-26 清华大学 Learning method, device and equipment for a continual learning framework based on world modeling
CN115936091A * 2022-11-24 2023-04-07 北京百度网讯科技有限公司 Deep learning model training method and device, electronic equipment and storage medium
CN115936091B * 2022-11-24 2024-03-08 北京百度网讯科技有限公司 Deep learning model training method and device, electronic equipment and storage medium
CN117009548A * 2023-08-02 2023-11-07 广东立升科技有限公司 Knowledge graph supervision system for confidential equipment maintenance
CN117009548B * 2023-08-02 2023-12-26 广东立升科技有限公司 Knowledge graph supervision system for confidential equipment maintenance

Also Published As

Publication number Publication date
CN109241291B 2022-02-15

Similar Documents

Publication Publication Date Title
CN109241291A Knowledge graph optimal path query system and method based on deep reinforcement learning
Han et al. A survey on metaheuristic optimization for random single-hidden layer feedforward neural network
Leng et al. Design for self-organizing fuzzy neural networks based on genetic algorithms
Nagib et al. Path planning for a mobile robot using genetic algorithms
CN108537366B (en) Reservoir scheduling method based on optimal convolution bidimensionalization
CN113239897B (en) Human body action evaluation method based on space-time characteristic combination regression
Chouikhi et al. Single- and multi-objective particle swarm optimization of reservoir structure in echo state network
Zhang et al. Evolving neural network classifiers and feature subset using artificial fish swarm
CN104732067A Industrial process modeling and forecasting method oriented to flow objects
WO2022147583A2 (en) System and method for optimal placement of interacting objects on continuous (or discretized or mixed) domains
Raiaan et al. A systematic review of hyperparameter optimization techniques in Convolutional Neural Networks
Zuo et al. Domain selection of transfer learning in fuzzy prediction models
Fofanah et al. Experimental Exploration of Evolutionary Algorithms and their Applications in Complex Problems: Genetic Algorithm and Particle Swarm Optimization Algorithm
CN116611504A (en) Neural architecture searching method based on evolution
CN115620046A (en) Multi-target neural architecture searching method based on semi-supervised performance predictor
Parsa et al. Multi-objective hyperparameter optimization for spiking neural network neuroevolution
Kavipriya et al. Adaptive weight deep convolutional neural network (AWDCNN) classifier for predicting student’s performance in job placement process
Park et al. DAG-GCN: Directed Acyclic Causal Graph Discovery from Real World Data using Graph Convolutional Networks
Guzman et al. Adaptive model predictive control by learning classifiers
de Oliveira et al. An evolutionary extreme learning machine based on fuzzy fish swarms
Phatai et al. Cultural algorithm initializes weights of neural network model for annual electricity consumption prediction
Zhang et al. Bandit neural architecture search based on performance evaluation for operation selection
Ikushima et al. Differential evolution neural network optimization with individual dependent mechanism
Srinivasan et al. Electricity price forecasting using evolved neural networks
Chen et al. Deep Recurrent Policy Networks for Planning Under Partial Observability

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant