CN106503035A - Data processing method and device for a knowledge graph - Google Patents

Data processing method and device for a knowledge graph

Publication number
CN106503035A
Authority
CN
China
Prior art keywords
entities
eigenvector
initial entity
value
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610825067.8A
Other languages
Chinese (zh)
Inventor
袁丽
甘信军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Group Co Ltd
Original Assignee
Hisense Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Group Co Ltd
Priority to CN201610825067.8A
Publication of CN106503035A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

Embodiments of the present invention provide a data processing method and device for a knowledge graph. The method includes: selecting a currently processed target entity from the other entities; obtaining a first feature vector of the initial entity and a second feature vector of the target entity; calculating, from the first feature vector and the second feature vector, a feature value corresponding to the target entity; and, for the maximum feature value, updating the category information and relation information of the initial entity and the other entities using its corresponding first feature vector and second feature vector. In the embodiments, the entities and relation information in the knowledge graph are converted into a vectorized representation, which facilitates predicting the category information and relation information of entities and supports more intelligent semantic analysis and processing. Semantic analysis and processing are automated rather than built on rules, which reduces manual maintenance cost and widens the range of application.

Description

Data processing method and device for a knowledge graph
Technical field
The present invention relates to the technical field of data processing, and in particular to a data processing method for a knowledge graph and a data processing device for a knowledge graph.
Background art
A knowledge graph, also called a mapping knowledge domain, is known in library and information science as knowledge domain visualization or knowledge domain mapping. It is a family of graphs that display the development process and structural relationships of knowledge, using visualization techniques to describe knowledge resources and their carriers and to mine, analyze, construct, draw, and display knowledge and the interconnections within it.
Specifically, a knowledge graph combines the theories and methods of disciplines such as applied mathematics, graphics, information visualization, and information science with methods such as bibliometric citation analysis and co-occurrence analysis, and uses visual graphs to present the core structure, development history, frontier fields, and overall knowledge framework of a discipline. It is a modern theory aimed at multidisciplinary integration, and it presents a complex knowledge domain through data mining, information processing, knowledge measurement, and graph drawing.
As knowledge graph research has developed, knowledge graphs have proved to be a good aid to natural language processing and semantic analysis. However, as knowledge accumulates, the data volume of a knowledge graph grows and its structure becomes more and more complex, so accurate semantic analysis requires query logic and rules to be added and built continually. When information is partly missing or incomplete at the time the knowledge graph is built, completing the knowledge graph with rules is extremely difficult and tedious.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a data processing method for a knowledge graph, and a corresponding data processing device for a knowledge graph, that overcome the above problems or at least partially solve them.
To solve the above problems, an embodiment of the present invention discloses a data processing method for a knowledge graph. The knowledge graph includes an initial entity and other entities, the initial entity and the other entities having category information and relation information. The method includes:
selecting a currently processed target entity from the other entities;
obtaining a first feature vector of the initial entity and a second feature vector of the target entity;
calculating, from the first feature vector and the second feature vector, a feature value corresponding to the target entity;
determining the maximum feature value among the feature values;
for the maximum feature value, updating the category information and relation information of the initial entity and the other entities using its corresponding first feature vector and second feature vector.
Preferably, the initial entity and the other entities each have corresponding word vector data, and the step of selecting a currently processed target entity from the other entities includes:
calculating a transition probability value using the word vector data of the initial entity and the word vector data of the other entities;
judging whether the transition probability value is greater than a first preset threshold;
when the transition probability value is greater than the first preset threshold, determining that the other entity corresponding to the transition probability value is a target entity.
Preferably, the step of calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity includes:
calculating, from the first feature vector and the second feature vector, the conditional probability value corresponding to the target entity;
multiplying the conditional probability values together to obtain a cumulative-product conditional probability value;
taking the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
summing the logarithmic conditional probability values to obtain the feature value.
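The four sub-steps above can be written compactly. The following is a sketch only, and the symbols are assumptions introduced here for illustration: n is an initial entity, N(n) is its set of target entities, f(n) is the first feature vector, P(t | f(n)) is the conditional probability value of target entity t, and F is the feature value.

```latex
\log \prod_{t \in N(n)} P\bigl(t \mid f(n)\bigr)
  \;=\; \sum_{t \in N(n)} \log P\bigl(t \mid f(n)\bigr),
\qquad
F \;=\; \sum_{n} \log \prod_{t \in N(n)} P\bigl(t \mid f(n)\bigr)
```

The left-hand identity is the standard rule that the logarithm of a product equals the sum of the logarithms, which is why the product can be replaced by a sum in practice.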
Preferably, the step of updating, for the maximum feature value, the category information and relation information of the initial entity and the other entities using its corresponding first feature vector and second feature vector includes:
obtaining, for the maximum feature value, its corresponding first feature vector and second feature vector;
labeling the initial entity and the other entities with category information according to the first feature vector and the second feature vector;
adding relation information to the initial entity and the other entities according to the first feature vector and the second feature vector.
Preferably, the step of labeling the initial entity and the other entities with category information according to the first feature vector and the second feature vector includes:
training a first classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
calculating the category information of the initial entity and the other entities using the first classifier;
labeling the initial entity and the other entities with the category information.
Preferably, the step of adding relation information to the initial entity and the other entities according to the first feature vector and the second feature vector includes:
training a second classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
calculating the relation information of the initial entity and the other entities using the second classifier;
adding the relation information to the initial entity and the other entities.
Preferably, the method further includes:
for a feature value that is not the maximum, updating its corresponding first feature vector and second feature vector.
Preferably, the step of determining the maximum feature value among the feature values includes:
recording the number of executions of the step of calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity;
judging whether the number of executions is greater than a second preset threshold;
when the number of executions is greater than the second preset threshold, selecting the maximum feature value among the feature values.
An embodiment of the present invention also discloses a data processing device for a knowledge graph. The knowledge graph includes an initial entity and other entities, the initial entity and the other entities having category information and relation information. The device includes:
a target entity selection module, configured to select a currently processed target entity from the other entities;
a first and second feature vector acquisition module, configured to obtain a first feature vector of the initial entity and a second feature vector of the target entity;
a feature value calculation module, configured to calculate, from the first feature vector and the second feature vector, a feature value corresponding to the target entity;
a maximum feature value determination module, configured to determine the maximum feature value among the feature values;
a category information and relation information update module, configured to update, for the maximum feature value, the category information and relation information of the initial entity and the other entities using its corresponding first feature vector and second feature vector.
Preferably, the target entity selection module includes:
a transition probability value calculation submodule, configured to calculate a transition probability value using the word vector data of the initial entity and the word vector data of the other entities;
a first preset threshold judgment submodule, configured to judge whether the transition probability value is greater than a first preset threshold;
a target entity determination submodule, configured to determine, when the transition probability value is greater than the first preset threshold, that the other entity corresponding to the transition probability value is a target entity.
Preferably, the feature value calculation module includes:
a conditional probability value calculation submodule, configured to calculate, from the first feature vector and the second feature vector, the conditional probability value corresponding to the target entity;
a cumulative-product conditional probability value obtaining submodule, configured to multiply the conditional probability values together to obtain a cumulative-product conditional probability value;
a logarithmic conditional probability value obtaining submodule, configured to take the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
a feature value obtaining submodule, configured to sum the logarithmic conditional probability values to obtain the feature value.
Preferably, the category information and relation information update module includes:
a first and second feature vector extraction submodule, configured to extract, for the maximum feature value, its corresponding first feature vector and second feature vector;
a category information labeling submodule, configured to label the initial entity and the other entities with category information according to the first feature vector and the second feature vector;
a relation information adding submodule, configured to add relation information to the initial entity and the other entities according to the first feature vector and the second feature vector.
Preferably, the category information labeling submodule includes:
a first classifier training unit, configured to train a first classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
a category information calculation unit, configured to calculate the category information of the initial entity and the other entities using the first classifier;
a category information labeling unit, configured to label the initial entity and the other entities with the category information.
Preferably, the relation information adding submodule includes:
a second classifier training unit, configured to train a second classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
a relation information calculation unit, configured to calculate the relation information of the initial entity and the other entities using the second classifier;
a relation information adding unit, configured to add the relation information to the initial entity and the other entities.
Preferably, the device further includes:
a first and second feature vector update module, configured to update, for a feature value that is not the maximum, its corresponding first feature vector and second feature vector.
Preferably, the maximum feature value determination module includes:
an execution count recording submodule, configured to record the number of executions of the step of calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity;
a second preset threshold judgment submodule, configured to judge whether the number of executions is greater than a second preset threshold;
a maximum feature value selection submodule, configured to select the maximum feature value among the feature values when the number of executions is greater than the second preset threshold.
The embodiments of the present invention have the following advantages:
In the embodiments of the present invention, a currently processed target entity is selected from the multiple other entities; the maximum feature value is selected using the first feature vector and the second feature vector; and the category information and relation information of the initial entity and the multiple other entities are updated using the first feature vector and second feature vector corresponding to that maximum. The entities (points) and relation information (edges) of the knowledge graph are converted into a vectorized representation, so that the complex network structure of the knowledge graph is mapped to low-dimensional feature vectors. This facilitates predicting the category information and relation information of entities and supports more intelligent semantic analysis and processing. Semantic analysis and processing are automated rather than built on rules, which reduces manual maintenance cost and widens the range of application.
Further, the embodiments of the present invention add relation information and category information to the initial entity and the multiple other entities according to the first feature vector and the second feature vector. Because the entities and relation information of the knowledge graph are represented as vectors, the category information of an entity can be completed automatically from its vectorized representation, and relation information can be added between two entities automatically, which greatly reduces the workload and cost of manually maintaining the knowledge graph.
Description of the drawings
Fig. 1 is a flow chart of the steps of embodiment one of a data processing method for a knowledge graph according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of a prior-art knowledge graph;
Fig. 3 is a flow chart of the steps of embodiment two of a data processing method for a knowledge graph according to an embodiment of the present invention;
Fig. 4A is a first schematic diagram of a target entity set of a knowledge graph according to an embodiment of the present invention;
Fig. 4B is a second schematic diagram of a target entity set of a knowledge graph according to an embodiment of the present invention;
Fig. 5 is a structural block diagram of an embodiment of a data processing device for a knowledge graph according to an embodiment of the present invention.
Detailed description of the embodiments
To make the above objects, features, and advantages of the embodiments of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
One of the core ideas of the present invention is to map the entities of a given knowledge graph to low-dimensional vectorized representations, and then to use those vectors to add missing category information to entities, or missing relation information between entities, in the original knowledge graph. The embodiments of the present invention represent the entities of the knowledge graph as vectors, mapping the complex network structure of the knowledge graph to a low-dimensional vectorized representation rather than a rule-based representation, which facilitates predicting the category information and relation information of entities.
Referring to Fig. 1, a flow chart of the steps of embodiment one of a data processing method for a knowledge graph according to an embodiment of the present invention is shown. The knowledge graph may include an initial entity and other entities, the initial entity and the other entities having category information and relation information. The method may specifically include the following steps:
Step 101, selecting a currently processed target entity from the other entities;
A knowledge graph consists of entities (points) and relation information (edges). Each point has corresponding attribute values, which may include category information, and two points are connected by an edge. Building a knowledge graph is the process of continually adding points or edges to it by writing rules, so that the knowledge graph keeps expanding.
For a given knowledge graph, one entity of the knowledge graph is selected as the initial entity and treated as the starting point. A knowledge graph may include many entities: besides the initial entity, there are multiple other entities associated with it. A knowledge graph aims to describe the various entities that exist in the real world. Each entity is identified by a globally unique ID (identifier), and in the embodiments of the present invention each entity is represented by a feature vector. Entities in the knowledge graph are connected by relation information, and each entity has corresponding category information. Category information characterizes the intrinsic properties of an entity, while relation information connects two entities and characterizes the association between them. A knowledge graph can also be regarded as one huge graph whose nodes represent entities and whose edges consist of relation information.
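The structure described above (entities with a unique ID and category information, connected by edges carrying relation information) could be held in memory roughly as follows. This is a minimal sketch, not the patent's data model: the class names, the `other_entities` helper, the entity `work_1`, and the relation label `acted in` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Entity:
    entity_id: str                   # globally unique identifier
    category: Optional[str] = None   # category information; None when it is missing


@dataclass
class KnowledgeGraph:
    entities: Dict[str, Entity] = field(default_factory=dict)
    # Each edge maps an (entity, entity) pair to its relation information.
    relations: Dict[Tuple[str, str], str] = field(default_factory=dict)

    def add_entity(self, entity_id: str, category: Optional[str] = None) -> None:
        self.entities[entity_id] = Entity(entity_id, category)

    def add_relation(self, a: str, b: str, relation: str) -> None:
        self.relations[(a, b)] = relation

    def other_entities(self, initial_id: str) -> List[str]:
        """Entities connected to the initial entity by at least one edge."""
        found: List[str] = []
        for a, b in self.relations:
            if a == initial_id and b not in found:
                found.append(b)
            elif b == initial_id and a not in found:
                found.append(a)
        return found


kg = KnowledgeGraph()
kg.add_entity("Deng Chao", category="actor")
kg.add_entity("work_1")                               # hypothetical entity, category missing
kg.add_relation("Deng Chao", "work_1", "acted in")    # hypothetical relation label
```

Leaving `category` as `None` marks exactly the kind of gap that the method later tries to fill from the feature vectors.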
Fig. 2 shows an example of a prior-art knowledge graph centered on the entities "Deng Chao" and "Sun Li"; the other entities are film and television works related to "Deng Chao" and "Sun Li". Assume that Deng Chao is selected as the initial entity, denoted f(n); the entities connected to "Deng Chao" are then the other entities. For example, the category information of the entity "Deng Chao" is actor, and the relation information between the entity "Deng Chao" and the entity "Sun Li" is husband and wife.
As an example of a concrete application of the present invention, the currently processed target entity may be selected from the multiple other entities by calculating transition probability values. A transition probability value expresses the weight of the connection between entities. Other quantities that express such a weight, for example a transition probability matrix or an adjacency matrix, may also be used. These are only examples of expressing the weight of the relation information between entities; any value that can express that weight may serve as the transition probability value, and the embodiments of the present invention place no specific limitation on this.
Step 102, obtaining the first feature vector of the initial entity and the second feature vector of the target entity;
In the embodiments of the present invention, every entity in the given knowledge graph is assigned an initial vector. The initial vector may be multidimensional with randomly assigned values, so at first it cannot express the connections and structural features between entities; after it is updated by the method of the embodiments, however, it can express the connections and structural features between different entities well. The initial vector may be the first feature vector of the initial entity or the second feature vector of a target entity. The first feature vector of the initial entity is a vector of more than one dimension whose dimensionality is set manually, and so is the second feature vector of the target entity.
Applying the embodiments of the present invention, the initial entity in the knowledge graph may be defined as a first feature vector of more than one dimension; the dimensionality may be 100 or 50. For example, the vectorized representation f(n) of the initial entity "Deng Chao" is the first feature vector, i.e. f(Deng Chao) = [0.543, 0.381, 0.328, ..., 0.182], where the dimensionality is 100 (the first feature vector contains 100 numbers). Assuming the target entity is "Sun Li", the second feature vector of the target entity "Sun Li" is f(Sun Li) = [0.337, 0.169, 0.401, ..., 0.403], also with 100 dimensions.
Of course, the dimensionality may be determined by those skilled in the art according to the actual situation, and the present invention places no limitation on this.
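As an illustration of the random initial vectors described above, the sketch below assigns each entity a vector of a configurable dimensionality. The function name, the uniform range [0, 1), and the fixed seed are assumptions made for the example; the text only says that the values are assigned at random and that the dimensionality (for example 100 or 50) is set manually.

```python
import random


def init_feature_vectors(entity_ids, dim=100, seed=0):
    """Assign every entity a random vector of `dim` numbers in [0, 1)."""
    rng = random.Random(seed)   # seeded only so the example is repeatable
    return {eid: [rng.random() for _ in range(dim)] for eid in entity_ids}


# "work_1" is a hypothetical entity ID used only for illustration.
vectors = init_feature_vectors(["Deng Chao", "work_1"], dim=100)
first_feature_vector = vectors["Deng Chao"]
```

At this point the numbers carry no meaning; they only become informative once they have been adjusted to raise the feature value.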
Step 103, calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity;
In a preferred embodiment of a concrete application of the present invention, the conditional probability value of the second feature vector of a target entity, given the first feature vector of the initial entity, may be calculated from the first feature vector and the second feature vector. Because one initial entity in a knowledge graph is usually connected to multiple target entities, conditional probability values are obtained for multiple different target entities. The conditional probability value may be computed with a softmax function. The conditional probability values are multiplied together to obtain their cumulative product, the logarithm of that product is taken, and the results are summed to obtain the feature value.
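A minimal sketch of this calculation for one initial entity follows. It assumes that the softmax is taken over the dot products between the first feature vector and each candidate second feature vector; the text names the softmax function but does not give the scoring function, so the dot product and all function names here are assumptions.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conditional_probabilities(first_vec, second_vecs):
    """Softmax over the dot product of each second feature vector with the first."""
    scores = {eid: dot(vec, first_vec) for eid, vec in second_vecs.items()}
    top = max(scores.values())   # subtract the maximum for numerical stability
    exps = {eid: math.exp(s - top) for eid, s in scores.items()}
    total = sum(exps.values())
    return {eid: e / total for eid, e in exps.items()}


def feature_value(first_vec, second_vecs, target_ids):
    """Logarithm of the cumulative product of the targets' conditional probabilities."""
    probs = conditional_probabilities(first_vec, second_vecs)
    product = 1.0
    for eid in target_ids:       # cumulative product
        product *= probs[eid]
    return math.log(product)     # logarithm of the product
```

Summing `feature_value` over every entity treated as an initial entity would give the overall value. With many targets the product can underflow to zero, so summing the individual logarithms directly is the more robust way to compute the same quantity.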
Step 104, determining the maximum feature value among the feature values;
Applying the embodiments of the present invention, the number of executions of the step of calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity is recorded; this can be understood as the number of feature values obtained. Whether the number of executions is greater than a second preset threshold is judged, and when it is, the maximum feature value among the feature values is selected. The second preset threshold is set by those skilled in the art according to the actual situation, and the embodiments of the present invention place no limitation on it.
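The count-then-select rule of step 104 might look like the sketch below. The record layout (one tuple per execution of step 103, with the feature value first) and the function name are assumptions made for the example.

```python
def select_max_feature_value(history, second_threshold):
    """Return the record with the largest feature value once enough have been computed.

    `history` holds one (feature_value, vectors) tuple per execution of step 103.
    Returns None while the execution count has not exceeded the threshold.
    """
    if len(history) <= second_threshold:
        return None
    return max(history, key=lambda record: record[0])


history = [(-3.0, "vectors_1"), (-1.5, "vectors_2"), (-2.0, "vectors_3")]
best = select_max_feature_value(history, second_threshold=2)
```

Here `best` is the second record, because three executions exceed the threshold of two and -1.5 is the largest of the three feature values.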
Step 105, for the maximum feature value, updating the category information and relation information of the initial entity and the multiple other entities using its corresponding first feature vector and second feature vector.
In the embodiments of the present invention, after a feature value is calculated in step 103, different feature values can be obtained by continually adjusting the value of each dimension of the first feature vector and the second feature vector. Among these feature values, the first feature vector and second feature vector corresponding to the maximum feature value are selected as the vectorized representations of the initial entity and the other entities (target entities). The reason is that once all entities in the knowledge graph are represented as vectors, if those vectors express the relation information and structural features of the entities in the knowledge graph well, the conditional probability value of the set of adjacent entities around the entity "Deng Chao" occurring will be at its maximum.
Because the feature value is obtained from the conditional probabilities through certain operations, a feature value to be maximized over all entities n in the graph can be defined; the feature value may include an objective function value. When the feature value is not the maximum, the corresponding first feature vector and second feature vector are not the optimal solution.
In that case, the values of one or more dimensions of the first feature vector and the second feature vector need to be adjusted further (increased or decreased) to obtain different feature values, and the first feature vector values and the multiple second feature vector values corresponding to the maximum feature value are chosen as the vectorized representations of the entities. Otherwise, the first feature vector and the second feature vector are updated, and the method returns to the step of calculating, from the first feature vector and the second feature vector, the feature value corresponding to the target entity.
It should be noted that a number of iterations may be set in the embodiments of the present invention, and the maximum feature value within that number of iterations is selected, which ensures that hardware performance is not impaired. The number of iterations may be determined by those skilled in the art according to the actual situation, and the present invention places no limitation on this.
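The adjust-and-recompute loop described above can be sketched as a bounded random search. This is one possible reading, not the patent's procedure: the text says only that one or more dimensions are increased or decreased and that the maximum within a set number of iterations is kept, so the single-dimension perturbation, the step size, and the function name are assumptions. Gradient-based updates are a common alternative in practice.

```python
import random


def search_feature_vectors(objective, vectors, iterations=200, step=0.05, seed=0):
    """Perturb one dimension per iteration and keep the vectors with the largest value.

    `objective` maps a {entity_id: vector} dict to a feature value.
    """
    rng = random.Random(seed)
    best = {eid: list(vec) for eid, vec in vectors.items()}   # work on a copy
    best_value = objective(best)
    entity_ids = sorted(best)
    for _ in range(iterations):                               # bounded iteration count
        candidate = {eid: list(vec) for eid, vec in best.items()}
        eid = rng.choice(entity_ids)
        i = rng.randrange(len(candidate[eid]))
        candidate[eid][i] += rng.choice((-step, step))        # increase or decrease
        value = objective(candidate)
        if value > best_value:                                # keep only improvements
            best, best_value = candidate, value
    return best, best_value
```

Any function that scores a set of vectors can be passed as `objective`, including the log-probability feature value of step 103.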
Further, the maximum feature value within the number of iterations is selected, and the first feature vector and second feature vector corresponding to it are extracted. The category information and relation information of the initial entity and the multiple other entities in the knowledge graph are then updated through these vectors: a first classifier and a second classifier are trained using the first feature vector and second feature vector together with the known category information and relation information of the initial entity and the multiple other entities, and the classifiers are used to fill in unknown category information and relation information.
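The text does not say what kind of classifier is trained. As a stand-in, the sketch below uses a small nearest-centroid classifier written in plain Python; the class name, the toy two-dimensional vectors, and the labels `film`, `acted in`, and `directed` are hypothetical. The same class serves as the first classifier (entity vector to category) and the second classifier (pair feature to relation).

```python
class NearestCentroidClassifier:
    """Predict the label whose mean training vector is closest to the input."""

    def fit(self, vectors, labels):
        sums, counts = {}, {}
        for vec, label in zip(vectors, labels):
            acc = sums.setdefault(label, [0.0] * len(vec))
            for i, x in enumerate(vec):
                acc[i] += x
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [x / counts[label] for x in acc] for label, acc in sums.items()
        }
        return self

    def predict(self, vec):
        def squared_distance(centroid):
            return sum((a - b) ** 2 for a, b in zip(vec, centroid))
        return min(self.centroids, key=lambda label: squared_distance(self.centroids[label]))


# First classifier: entity feature vector -> category information.
first_classifier = NearestCentroidClassifier().fit(
    [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]], ["actor", "actor", "film"]
)
# Second classifier: feature built from a pair of entity vectors -> relation information.
second_classifier = NearestCentroidClassifier().fit(
    [[0.5, 0.0], [0.0, 0.5]], ["acted in", "directed"]
)
```

An entity whose category is missing is labeled with `first_classifier.predict(vector)`, and a missing edge is proposed with `second_classifier.predict(pair_feature)`.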
It should be noted that in the embodiments of the present invention, the relation information between two entities may be represented by the product of their two feature vectors, or by the mean of the two feature vectors, or by an L1-norm or L2-norm regularization of the two feature vectors; the embodiments of the present invention place no limitation on this.
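The options listed above can be sketched as element-wise operations that turn two entity vectors into one vector for the pair. The L1 and L2 variants here (absolute difference and squared difference per dimension) are one plausible reading of the text, not a definition it gives, and the function names are assumptions.

```python
def hadamard(u, v):
    """Element-wise product of the two feature vectors."""
    return [a * b for a, b in zip(u, v)]


def average(u, v):
    """Element-wise mean of the two feature vectors."""
    return [(a + b) / 2 for a, b in zip(u, v)]


def l1_difference(u, v):
    """Element-wise absolute difference (assumed L1-style operator)."""
    return [abs(a - b) for a, b in zip(u, v)]


def l2_difference(u, v):
    """Element-wise squared difference (assumed L2-style operator)."""
    return [(a - b) ** 2 for a, b in zip(u, v)]


# First three dimensions of the two example vectors given in the description.
pair_feature = hadamard([0.543, 0.381, 0.328], [0.337, 0.169, 0.401])
```

Each operator keeps the dimensionality of the entity vectors, so the resulting pair vector can be fed to the relation classifier without further reshaping.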
In the embodiments of the present invention, a currently processed target entity is selected from the multiple other entities, and the feature value corresponding to the target entity is calculated from the first feature vector and the second feature vector. If the feature value is not the maximum, the corresponding first feature vector and second feature vector are adjusted and the method returns to the step of calculating the feature value. Once the maximum feature value is selected, the category information and relation information of the initial entity and the multiple other entities are updated using its corresponding first feature vector and second feature vector. The entities (points) and relation information (edges) of the knowledge graph are thereby converted into a vectorized representation, so that the complex network structure of the knowledge graph is mapped to low-dimensional feature vectors. This facilitates predicting the category information and relation information of entities and supports more intelligent semantic analysis and processing. Semantic analysis and processing are automated rather than built on rules, which reduces manual maintenance cost and widens the range of application.
With reference to Fig. 3, show the embodiment of the present invention a kind of data processing method embodiment two of knowledge mapping the step of Flow chart, knowledge mapping include that initial solid and other entities multiple, initial solid and other entities multiple have classification information And relation information, initial solid and other entities are respectively provided with corresponding term vector data, the substantially side of embodiment of the method two The extension of method embodiment one, specifically may include steps of:
Step 201, calculate transition probability values using the word vector data of the initial entity and the word vector data of the other entities;
In the embodiment of the present invention, the word vector data of the initial entity and of the multiple other entities are obtained, and multiple transition probability values are calculated according to a specific formula. The word vector data can be obtained by training a language model; common approaches include N-gram models, maximum entropy Markov models and the like, and the embodiment of the present invention places no restriction on this.
Step 202, judge whether the transition probability value is greater than a first preset threshold;
Wherein, the first preset threshold can be any manually set value. For example, the first preset threshold can be set to 0; when the transition probability value is greater than 0, step 203 is executed.
Step 203, when the transition probability value is greater than the first preset threshold, determine that the other entity corresponding to the transition probability value is a target entity;
Specifically, when a transition probability value is greater than the first preset threshold, the other entity corresponding to that transition probability value is determined to be a target entity. In this way, the target entities that have a particular association with the initial entity can be determined.
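Steps 201 to 203 can be sketched as a simple threshold filter. This is a minimal illustration only: the function name `select_target_entities` and the probability values are hypothetical, and the threshold of 0 is the example value given above.

```python
def select_target_entities(transition_probs, threshold=0.0):
    """Return the entities whose transition probability exceeds the threshold."""
    return [entity for entity, p in transition_probs.items() if p > threshold]


# Hypothetical transition probabilities from the initial entity "Deng Chao".
probs = {"Sun Li": 0.42, "Mermaid": 0.31, "Mi Yue Zhuan": 0.0}
targets = select_target_entities(probs, threshold=0.0)
```

An entity with no relation to the initial entity has transition probability 0 and is therefore never selected when the threshold is 0.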
Step 204, obtain the first feature vector of the initial entity, and obtain the second feature vector of the target entity;
In practice, the default first feature vector of the initial entity and the second feature vector of the target entity are obtained. For example, the first feature vector is f(Deng Chao) = [0.543, 0.381, 0.328, … 0.182]; assuming the target entity is "Sun Li", the target entity is represented by the second feature vector f(Sun Li) = [0.337, 0.169, 0.401, … 0.403].
Step 205, calculate the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector;
In a preferred embodiment of the embodiment of the present invention, the step of calculating the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector includes the following sub-steps:
Sub-step S2051, calculate the conditional probability value corresponding to the target entity according to the first feature vector and the second feature vector;
Sub-step S2052, multiply the conditional probability values together to obtain a cumulative-product conditional probability value;
Sub-step S2053, take the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
Sub-step S2054, sum the logarithmic conditional probability values to obtain the characteristic value.
In a specific application, the conditional probability value of the second feature vector of the target entity, given the first feature vector of the initial entity, is calculated according to the first feature vector and the second feature vector. Because an initial entity in the knowledge graph is usually connected to multiple target entities, conditional probability values for multiple different target entities can be obtained.
Further, the conditional probability values are multiplied together to obtain their product; taking the logarithm of the product and then summing yields the characteristic value, wherein the characteristic value may include an objective function value.
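Sub-steps S2051 to S2054 can be sketched as follows. This is an illustrative sketch, not the patent's implementation: it assumes the conditional probability is a softmax over dot products of feature vectors (as the worked example later in the description states), and the helper names `dot`, `conditional_prob` and `characteristic_value` are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conditional_prob(f_source, f_target, all_vectors):
    """S2051: softmax over dot products, P(target | f(source))."""
    z = sum(math.exp(dot(f_source, v)) for v in all_vectors)
    return math.exp(dot(f_source, f_target)) / z


def characteristic_value(vectors, target_sets):
    """Sum over entities of the log of the product of conditional probabilities."""
    all_vectors = list(vectors.values())
    total = 0.0
    for entity, targets in target_sets.items():
        product = 1.0
        for t in targets:  # S2052: cumulative product
            product *= conditional_prob(vectors[entity], vectors[t], all_vectors)
        total += math.log(product)  # S2053 (logarithm) and S2054 (sum)
    return total


# Toy check: with all-zero vectors every conditional probability is 1/|V|.
toy_vectors = {"a": [0.0, 0.0], "b": [0.0, 0.0], "c": [0.0, 0.0]}
toy_value = characteristic_value(toy_vectors, {"a": ["b", "c"]})
```

For large target sets the product can underflow; summing the logarithms of the individual probabilities gives the same value and is numerically safer.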
Step 206, determine the maximum characteristic value among the characteristic values;
In a preferred embodiment of the embodiment of the present invention, the step of determining the maximum characteristic value among the characteristic values includes the following sub-steps:
Sub-step S2061, record the number of executions of the step of calculating the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector;
Sub-step S2062, judge whether the number of executions is greater than a second preset threshold;
Sub-step S2063, when the number of executions is greater than the second preset threshold, select the maximum characteristic value among the characteristic values.
Specifically, the number of executions of step 205 is recorded. When the number of executions is greater than the second preset threshold, multiple characteristic values have been obtained; the maximum characteristic value is selected from them and the next step is carried out.
Step 207, for the maximum characteristic value, obtain its corresponding first feature vector and second feature vector;
Step 208, mark classification information for the initial entity and the other entities according to the first feature vector and the second feature vector;
In the embodiment of the present invention, the step of marking classification information for the initial entity and the other entities according to the first feature vector and the second feature vector includes:
Sub-step S2081, train a first classifier using the first feature vector and second feature vector of the initial entity and the other entities;
Sub-step S2082, calculate the classification information of the initial entity and the other entities using the first classifier;
Sub-step S2083, mark the initial entity and the other entities with the classification information.
Wherein, the first classifier may use an algorithm such as a decision tree, logistic regression, naive Bayes or a neural network. By training the first classifier on the known classification information of the initial entity and the other entities, the classification information of the initial entity and the other entities can be calculated.
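As an illustration of sub-steps S2081 to S2083, the sketch below trains a small binary logistic regression (one of the classifier types listed above) on entity feature vectors. Everything concrete here is an assumption made for the example: the two-dimensional vectors, the labels (1 for a person, 0 for a film or television work), and the function names `train_logistic` and `predict`.

```python
import math


def train_logistic(features, labels, lr=0.5, epochs=200):
    """Fit binary logistic regression by per-sample gradient updates."""
    w, b = [0.0] * len(features[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1.0 / (1.0 + math.exp(-z))
            err = y - p  # gradient of the log-likelihood with respect to z
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b


def predict(model, x):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0


# Hypothetical feature vectors of entities with known classification information.
known_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
known_labels = [1, 1, 0, 0]
model = train_logistic(known_vectors, known_labels)  # S2081
label_a = predict(model, [0.85, 0.15])               # S2082
label_b = predict(model, [0.15, 0.85])
```

The predicted labels would then be written back to the entities that lack classification information (S2083).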
Step 209, add relation information for the initial entity and the other entities according to the first feature vector and the second feature vector.
In the embodiment of the present invention, the step of adding relation information for the initial entity and the other entities according to the first feature vector and the second feature vector includes:
Sub-step S2091, train a second classifier using the first feature vector and second feature vector of the initial entity and the other entities;
Sub-step S2092, calculate the relation information of the initial entity and the other entities using the second classifier;
Sub-step S2093, add the relation information to the initial entity and the other entities.
Wherein, the second classifier may use an algorithm such as a decision tree, logistic regression, naive Bayes or a neural network; the embodiment of the present invention places no restriction on this. By training the second classifier on the known relation information of the initial entity and the other entities, the relation information of the initial entity and the other entities can be calculated. The relation information is then added to the initial entity and the other entities in the knowledge graph.
In a preferred embodiment of the embodiment of the present invention, the method further includes the following steps:
Step S11, record the number of executions of the return to the step of calculating the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector;
Step S12, judge whether the number of executions is greater than the second preset threshold;
Step S13, when the number of executions is greater than the second preset threshold, proceed to the step of updating the classification information and relation information of the initial entity and the other entities using the corresponding first feature vector and second feature vector.
Wherein, the second preset threshold can be a manually set number of iterations. For example, if the number of times the characteristic value of the target entity is to be calculated is 1,000,000, the computation stops after 1,000,000 executions, the maximum characteristic value is chosen, and the corresponding first feature vector and second feature vector are extracted as the vectorized representation in the knowledge graph.
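The iteration control described in steps S11 to S13 can be sketched as a loop that counts executions and keeps the best result seen so far. This is a sketch under assumptions: `evaluate` and `adjust` are hypothetical callables standing in for the characteristic value calculation and the feature vector adjustment, and the toy example uses a single number in place of the feature vectors.

```python
def run_until_threshold(evaluate, adjust, vectors, second_threshold):
    """Evaluate repeatedly; stop once the execution count exceeds the threshold."""
    best_value, best_vectors = float("-inf"), vectors
    count = 0
    while True:
        value = evaluate(vectors)
        count += 1  # S11: record the number of executions
        if value > best_value:
            best_value, best_vectors = value, vectors
        if count > second_threshold:  # S12 and S13
            break
        vectors = adjust(vectors)
    return best_value, best_vectors, count


# Toy stand-ins: the "vectors" are one number x and the value peaks at x == 3.
result = run_until_threshold(
    evaluate=lambda x: -(x - 3) ** 2,
    adjust=lambda x: x + 1,
    vectors=0,
    second_threshold=5,
)
```

Keeping only the running maximum avoids storing every characteristic value, which matters when the threshold is as large as the 1,000,000 iterations mentioned above.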
In the embodiment of the present invention, transition probability values are calculated using the word vector data of the initial entity and the word vector data of the other entities; when a transition probability value is greater than the first preset threshold, the other entity corresponding to that transition probability value is determined to be a target entity; for the maximum characteristic value, the corresponding first feature vector and second feature vector are extracted; classification information is marked for the initial entity and the other entities according to the first feature vector and the second feature vector; and relation information is added for the initial entity and the other entities according to the first feature vector and the second feature vector. The entities and relation information in the knowledge graph are represented as vectors, and from these vectorized representations the classification information of entities can be completed automatically and relation information can be added between two entities automatically, which greatly reduces the workload and cost of maintaining the knowledge graph.
To enable those skilled in the art to better understand the embodiment of the present invention, a specific example is described below.
I. Construction of the objective function in the vectorized representation of the knowledge graph
To represent the knowledge graph in vectorized form, the embodiment of the present invention proposes constructing an objective function that maximizes a conditional probability value; a concrete graph is taken as an example.
Referring to Fig. 2, a knowledge graph of the prior art is shown. Fig. 2 shows part of a film and television knowledge graph, in which the relations and structure among Deng Chao, Sun Li and film and television works can be seen. When the entities in the graph are represented as vectors, the information and features of the knowledge graph should still be implied. Take the entity "Deng Chao" as an example. Suppose all entities in the graph are represented as vectors, that is, the entity "Deng Chao" is represented by f(Deng Chao), the entity "Sun Li" by f(Sun Li), and so on. If the vectorized representation of each entity embodies that entity's structural features and relation information in the graph well, then the conditional probability, computed from these representations, that the target entities around Deng Chao occur should reach its maximum. Here the neighbouring target entities, such as Mermaid and Sun Li, form the target entity set N(Deng Chao), and the conditional probability value P(N(Deng Chao) | f(Deng Chao)) that this set occurs will be maximal. Accordingly, a maximized objective function value (characteristic value) can be defined over all entities n in the graph: $\max_f \sum_{n \in V} \log P(N(n) \mid f(n))$. Furthermore, the target entities around the entity "Deng Chao" are mutually independent (for example, Mermaid and Chinese Partner are independent of each other), so the conditional probability values that the individual target entities around "Deng Chao" occur are mutually independent, which gives
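The formula that followed here is not reproduced in this text. Under the independence assumption just stated, a consistent form would be a product over the individual target entities; the index symbol $n_i$ is introduced here for illustration only.

```latex
P\bigl(N(n) \mid f(n)\bigr) = \prod_{n_i \in N(n)} P\bigl(n_i \mid f(n)\bigr)
```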
The conditional probability value that a given target entity of the entity "Deng Chao" occurs can be expressed as a softmax function of the product of the two entities' respective feature vectors. For example,
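The example formula is likewise not reproduced in this text. Assuming the "product" is the dot product of the two feature vectors and that the softmax is normalized over all entities in $V$ (both are assumptions consistent with the skip-gram analogy drawn in the description), the probability of the target entity "Sun Li" given "Deng Chao" would take the form:

```latex
P\bigl(\text{Sun Li} \mid f(\text{Deng Chao})\bigr)
  = \frac{\exp\bigl(f(\text{Sun Li}) \cdot f(\text{Deng Chao})\bigr)}
         {\sum_{v \in V} \exp\bigl(f(v) \cdot f(\text{Deng Chao})\bigr)}
```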
In general, given a knowledge graph G = (V, E), V denotes all entities in the graph and E denotes all relation information in the graph. This is analogous to the skip-gram model for word vectors, in which the probability that a word occurs is related to the words in its context. When computing the feature representation of the entities in the graph, the maximized objective function (the function of the characteristic value) is defined from the conditional probability value that the target entity set of an entity occurs:
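The displayed formula is missing at this point in the text. Written out from the inline expression given elsewhere in the description, the objective is:

```latex
\max_{f} \sum_{n \in V} \log P\bigl(N(n) \mid f(n)\bigr)
```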
Wherein f(n) is the vectorized representation of entity n, with dimension d; the parameters of f(n) are adjusted by training the model to maximize the characteristic value (objective function value). The vectorized representation model therefore has |V| × d parameters to estimate. N(n) is the target entity set of entity n. P(N(n) | f(n)) is the conditional probability value that the target entity set of entity n occurs when all entities in the graph are represented as vectors. If the vectorized representations of all entities express the relations and structural features of each entity in the graph well, the conditional probability values of the target entity sets of all entities in V reach their maximum, that is, the above characteristic value (objective function value) is maximized.
II. Vectorized representation of the entities of the knowledge graph
Assuming the knowledge graph of Fig. 2 is given, the process of producing its vectorized representation is as follows:
1. Initialize the first feature vector and the second feature vectors
The feature vector parameters f(n) of all entities in the graph are randomly initialized. The dimension d of the vectorized representation is set to 100, the size of the target entity set N(n) is k, and the number of training iterations is denoted iterations.
The vectorized representations of all entities in the graph can be randomly initialized as 100-dimensional vectors:
First feature vector f(Deng Chao) = [0.543, 0.381, 0.328, … 0.182]
Second feature vector f(Sun Li) = [0.337, 0.169, 0.401, … 0.403]
Second feature vector ...
Second feature vector ...
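The random initialization can be sketched as below. The sampling distribution is not specified in the description; uniform values in [0, 1) are an assumption suggested by the example vectors, and the function name `init_feature_vectors`, the seed and the entity list are illustrative.

```python
import random


def init_feature_vectors(entities, dim=100, seed=0):
    """Give every entity a random feature vector of length dim."""
    rng = random.Random(seed)  # seeded so that a run can be reproduced
    return {e: [rng.random() for _ in range(dim)] for e in entities}


entities = ["Deng Chao", "Sun Li", "Mermaid", "Chinese Partner"]
f = init_feature_vectors(entities)
```

With |V| entities and d = 100, this creates the |V| × d parameters that training later adjusts.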
2. Obtain the target entity set of every entity in the knowledge graph
The most common graph search algorithms are breadth-first search (BFS) and depth-first search (DFS). However, collecting target entities with BFS easily leads to repeated sampling and leaves most of the graph untraversed, while collecting target entities with DFS easily yields sampled entities that are too far from the source entity and therefore unrepresentative. The embodiment of the present invention uses a sampling method that combines BFS and DFS.
Referring to Fig. 4A and Fig. 4B, the size of the target entity set is defined as k, that is, the size of N(n) is k. The transition probability value of a piece of relation information in the graph is the transition probability P(v_i | v_{i-1}) from the sampled entity v_{i-1} to the next entity v_i, defined as the normalized word vector similarity of the two words. The word vector model is obtained by training on the large corpus used to build the knowledge graph. The larger the transition probability value, the higher the correlation between the two entities, the more representative the target entity, and the greater the probability of moving to that target entity. The embodiment of the present invention uses cosine similarity to represent the transition probability, and W in the formula normalizes the transition probability value:
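The formula itself is not reproduced in this text. One form consistent with the description, with $c(\cdot)$ denoting the word vector of an entity as in the example that follows and the probability set to zero where no relation information exists, would be the following; the exact form in the original is an assumption here.

```latex
P(v_i \mid v_{i-1}) =
\begin{cases}
\dfrac{\cos\bigl(c(v_{i-1}),\, c(v_i)\bigr)}{W}, & (v_{i-1}, v_i) \in E \\[1ex]
0, & \text{otherwise}
\end{cases}
```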
For example, to obtain the transition probability of the relation information between Deng Chao and Sun Li in the figure, the word vector of Deng Chao, c(Deng Chao) = [0.500, 0.249, 0.069, … 0.325], and the word vector of Sun Li, c(Sun Li) = [0.196, 0.121, 0.207, … 0.843], are substituted into the above formula.
For the transition probability between Deng Chao and Mi Yue Zhuan, since there is no relation information between the two entities,
P(Deng Chao | Mi Yue Zhuan) = 0
Similarly, the transition probabilities of all relation information in the graph can be calculated.
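The transition probability calculation can be sketched as follows. The three-dimensional word vectors are hypothetical stand-ins (the vectors in the description are elided), and taking W as the sum of cosine similarities over an entity's neighbours is an assumption, since the description says only that W normalizes the value.

```python
import math


def cosine(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def transition_probs(source, neighbors, word_vectors):
    """P(v | source) for every other entity; zero when no relation exists."""
    sims = {v: cosine(word_vectors[source], word_vectors[v]) for v in neighbors[source]}
    w = sum(sims.values())  # normalizing constant W (assumed: sum over neighbours)
    return {v: (sims[v] / w if v in sims else 0.0)
            for v in word_vectors if v != source}


# Hypothetical word vectors and adjacency, for illustration only.
word_vectors = {
    "Deng Chao": [0.5, 0.2, 0.1],
    "Sun Li": [0.2, 0.1, 0.8],
    "Mermaid": [0.4, 0.3, 0.1],
    "Mi Yue Zhuan": [0.1, 0.9, 0.2],
}
neighbors = {"Deng Chao": ["Sun Li", "Mermaid"]}
p = transition_probs("Deng Chao", neighbors, word_vectors)
```

This normalization assumes non-negative cosine similarities; word vectors with negative similarity would need a different weighting.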
Sampling flow for the target entity set N(n):
(1) Starting from the initial entity n, sample the next entity at random from a multinomial distribution weighted by the transition probability values of the relation information leaving the entity; the larger the transition probability between two entities, the more likely that entity is to be sampled.
(2) Starting from the current entity, perform a new round of sampling to obtain the next entity. Note that if the walk moves to an entity that has already been sampled into N(n), that entity is not counted again, and the sampling process continues from it.
(3) Repeat this process until k entities have been sampled, which yields the target entity set N(n).
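The three sampling steps above can be sketched as a weighted random walk. This is a sketch under assumptions: the adjacency below is inferred from the traversal orders given for Fig. 4A and Fig. 4B, the uniform weights are placeholders for real transition probabilities, and the `max_steps` guard and the function name `sample_target_set` are additions for illustration.

```python
import random


def sample_target_set(start, out_probs, k, rng, max_steps=10_000):
    """Walk the graph from start until k distinct target entities are collected."""
    targets = []
    current = start
    for _ in range(max_steps):  # guard against graphs with fewer than k entities
        if len(targets) == k:
            break
        candidates = list(out_probs[current])
        weights = [out_probs[current][c] for c in candidates]
        current = rng.choices(candidates, weights=weights, k=1)[0]  # step (1)
        # Step (2): revisited entities (and the start entity) are not counted,
        # but the walk still continues from them.
        if current != start and current not in targets:
            targets.append(current)
    return targets  # step (3): k entities, if the graph allows it


edges = [("Deng Chao", "Sun Li"), ("Deng Chao", "Mermaid"),
         ("Deng Chao", "Chinese Partner"), ("Deng Chao", "Scoundrel Angel"),
         ("Sun Li", "Zhen Huan Zhuan"), ("Sun Li", "Scoundrel Angel"),
         ("Sun Li", "Mi Yue Zhuan")]
out_probs = {}
for a, b in edges:  # undirected graph with uniform placeholder weights
    out_probs.setdefault(a, {})[b] = 1.0
    out_probs.setdefault(b, {})[a] = 1.0

n_deng_chao = sample_target_set("Deng Chao", out_probs, k=4, rng=random.Random(0))
```

Calling the function again with a differently seeded generator gives a different set N(n), which is how the description obtains a fresh sample in each iteration.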
Referring to Fig. 4A and Fig. 4B, two different target entity sets in the embodiment of the present invention are shown, wherein the transition probability values of the relation information are initialized from the cosine similarity of the two words under the word vector model. As shown in Fig. 4A, starting from the entity "Deng Chao", one possible traversal order sampled from the random multinomial distribution is Deng Chao → Sun Li → Zhen Huan Zhuan → Sun Li → Scoundrel Angel → Deng Chao → Mermaid, so N(Deng Chao) = {Sun Li, Zhen Huan Zhuan, Scoundrel Angel, Mermaid}. As shown in Fig. 4B, in the next iteration, again starting from Deng Chao, one possible traversal order is Deng Chao → Chinese Partner → Deng Chao → Mermaid → Deng Chao → Sun Li → Mi Yue Zhuan, so N(Deng Chao) = {Chinese Partner, Mermaid, Sun Li, Mi Yue Zhuan}. Sampling is repeated for iterations rounds, giving iterations different sampled results for N(Deng Chao); in the same way, iterations different sampled results N(n) can be obtained for every other entity in the graph.
3. Adjust the parameters of the vectorized representations f(n) to maximize the characteristic value (objective function value)
From the target entity set N(n) obtained for each entity by the sampling in step 2, the conditional probability value P(N(n) | f(n)) of the target entity set of the corresponding entity is calculated, which in turn gives the characteristic value (objective function value). The stochastic gradient descent (SGD) algorithm is used to iteratively adjust and optimize the parameters of the vectorized representations of all entities in the graph over iterations rounds so as to maximize the characteristic value (objective function value), so that f(n) can represent the structural features of entity n in the graph and its relational features with neighbouring entities.
Taking the entity "Deng Chao" as an example, assume that in one iteration the target entity set obtained by the sampling in step 2 is N(Deng Chao) = {Sun Li, Zhen Huan Zhuan, Scoundrel Angel, Mermaid}. The conditional probability value of each entity in the target entity set is calculated as follows:
The conditional probability value of the target entity set of the entity "Deng Chao" is calculated as follows:
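The formulas for these two calculations are not reproduced in this text. Assuming the softmax form and the independence assumption stated in the construction of the objective function, each target entity $t$ in the set and the set as a whole would be computed as:

```latex
P\bigl(t \mid f(\text{Deng Chao})\bigr)
  = \frac{\exp\bigl(f(t) \cdot f(\text{Deng Chao})\bigr)}
         {\sum_{v \in V} \exp\bigl(f(v) \cdot f(\text{Deng Chao})\bigr)},
\qquad
P\bigl(N(\text{Deng Chao}) \mid f(\text{Deng Chao})\bigr)
  = \prod_{t \in N(\text{Deng Chao})} P\bigl(t \mid f(\text{Deng Chao})\bigr)
```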
Similarly, the conditional probability value P(N(n) | f(n)) of the target entity set of every other entity n in the graph can be calculated.
The objective function (the function of the characteristic value) $\sum_{n \in V} \log P(N(n) \mid f(n))$ is then obtained.
Stochastic gradient descent (SGD) is used to continually optimize and adjust the vectorized representation f(n) of every entity n in the graph so that the characteristic value (objective function value) is maximized.
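A single update of this kind can be sketched as below. It is a deliberately reduced illustration rather than the full SGD procedure: it takes one gradient step on the term log P(N(n) | f(n)) for a single entity n, updates only f(n), and excludes n itself from the softmax normalization; these simplifications, the two-dimensional vectors and the function names are all assumptions.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def log_likelihood(n, targets, f):
    """log P(N(n) | f(n)) with a softmax over all entities other than n."""
    others = [v for v in f if v != n]
    log_z = math.log(sum(math.exp(dot(f[n], f[v])) for v in others))
    return sum(dot(f[n], f[t]) - log_z for t in targets)


def ascent_step(n, targets, f, lr=0.1):
    """Return a copy of f with f[n] moved one gradient step uphill."""
    others = [v for v in f if v != n]
    z = sum(math.exp(dot(f[n], f[v])) for v in others)
    dim = len(f[n])
    # Expected feature vector under the current softmax distribution.
    expected = [sum(math.exp(dot(f[n], f[v])) / z * f[v][i] for v in others)
                for i in range(dim)]
    grad = [sum(f[t][i] for t in targets) - len(targets) * expected[i]
            for i in range(dim)]
    updated = dict(f)
    updated[n] = [x + lr * g for x, g in zip(f[n], grad)]
    return updated


f = {"Deng Chao": [0.1, 0.2], "Sun Li": [0.9, 0.1],
     "Mermaid": [0.8, 0.3], "Mi Yue Zhuan": [0.1, 0.9]}
targets = ["Sun Li", "Mermaid"]
before = log_likelihood("Deng Chao", targets, f)
after = log_likelihood("Deng Chao", targets, ascent_step("Deng Chao", targets, f))
```

Maximizing the objective by stepping along the gradient is equivalent to applying gradient descent to its negative, which is how the SGD terminology in the description fits a maximization problem.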
After iterations rounds of iterative optimization, the final vectorized representation model of the knowledge graph is obtained. The first feature vector and second feature vector corresponding to the maximum objective function value are extracted as the vectorized representations of the entities (the initial entity or the other entities), and these feature vectors are used to complete the classification information of entities or to add relation information.
Further, once the vectorized representation model of the knowledge graph has been optimized and the feature vector of each entity has been obtained, the relation information between two entities (u, v) can be vectorized as e(u, v) = f(u) f(v), and/or e(u, v) = (f(u) + f(v))/2, and/or e(u, v) = |f(u) - f(v)|, and/or e(u, v) = |f(u) - f(v)|^2.
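The four ways of vectorizing a relation can be sketched as element-wise operations on two feature vectors. Treating the product, absolute difference and squared difference as element-wise is an assumption, and the mean is taken as (f(u) + f(v))/2, in line with the "mean of the two feature vectors" mentioned earlier in the description; the function names are illustrative.

```python
def hadamard(fu, fv):
    """Element-wise product of the two feature vectors."""
    return [a * b for a, b in zip(fu, fv)]


def average(fu, fv):
    """Element-wise mean of the two feature vectors."""
    return [(a + b) / 2 for a, b in zip(fu, fv)]


def l1_distance(fu, fv):
    """Element-wise absolute difference."""
    return [abs(a - b) for a, b in zip(fu, fv)]


def l2_distance(fu, fv):
    """Element-wise squared difference."""
    return [(a - b) ** 2 for a, b in zip(fu, fv)]


fu, fv = [1.0, 2.0], [3.0, 0.5]
edge_features = {
    "product": hadamard(fu, fv),
    "mean": average(fu, fv),
    "l1": l1_distance(fu, fv),
    "l2": l2_distance(fu, fv),
}
```

Any of these edge vectors can serve as the input to the second classifier when predicting relation information between two entities.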
It should be noted that, for brevity, the method embodiments are described as a series of combined actions. Those skilled in the art should understand, however, that the embodiments of the present invention are not limited by the described order of actions, because according to the embodiments of the present invention some steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions involved are not necessarily required by the embodiments of the present invention.
Referring to Fig. 5, a structural block diagram of an embodiment of a data processing device for a knowledge graph according to the embodiment of the present invention is shown. The knowledge graph includes an initial entity and multiple other entities, and the initial entity and the multiple other entities have classification information and relation information. The device may specifically include the following modules:
Target entity selection module 301, configured to select the currently processed target entity from the other entities;
First and second feature vector acquisition module 302, configured to obtain the first feature vector of the initial entity and the second feature vector of the target entity;
Characteristic value calculation module 303, configured to calculate the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector;
Maximum characteristic value determination module 304, configured to determine the maximum characteristic value among the characteristic values;
Classification information and relation information update module 305, configured to, for the maximum characteristic value, update the classification information and relation information of the initial entity and the other entities using its corresponding first feature vector and second feature vector.
Preferably, the target entity selection module includes:
Transition probability value calculation submodule, configured to calculate transition probability values using the word vector data of the initial entity and the word vector data of the other entities;
First preset threshold judgment submodule, configured to judge whether the transition probability value is greater than the first preset threshold;
Target entity determination submodule, configured to determine, when the transition probability value is greater than the first preset threshold, that the other entity corresponding to the transition probability value is a target entity.
Preferably, the characteristic value calculation module includes:
Conditional probability value calculation submodule, configured to calculate the conditional probability value corresponding to the target entity according to the first feature vector and the second feature vector;
Cumulative-product conditional probability value acquisition submodule, configured to multiply the conditional probability values together to obtain a cumulative-product conditional probability value;
Logarithmic conditional probability value acquisition submodule, configured to take the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
Characteristic value acquisition submodule, configured to sum the logarithmic conditional probability values to obtain the characteristic value.
Preferably, the first and second feature vector update module includes:
First and second feature vector extraction submodule, configured to extract, for the maximum characteristic value, the corresponding first feature vector and second feature vector;
Classification information marking submodule, configured to mark classification information for the initial entity and the other entities according to the first feature vector and the second feature vector;
Relation information adding submodule, configured to add relation information for the initial entity and the other entities according to the first feature vector and the second feature vector.
Preferably, the classification information marking submodule includes:
First classifier training unit, configured to train a first classifier using the first feature vector and second feature vector of the initial entity and the other entities;
Classification information calculation unit, configured to calculate the classification information of the initial entity and the other entities using the first classifier;
Classification information marking unit, configured to mark the initial entity and the other entities with the classification information.
Preferably, the relation information adding submodule includes:
Second classifier training unit, configured to train a second classifier using the first feature vector and second feature vector of the initial entity and the other entities;
Relation information calculation unit, configured to calculate the relation information of the initial entity and the other entities using the second classifier;
Relation information adding unit, configured to add the relation information to the initial entity and the other entities.
In a preferred embodiment of the embodiment of the present invention, the device further includes:
First and second vector update module, configured to update, for a characteristic value that is not the maximum, its corresponding first feature vector and second feature vector.
Preferably, the maximum characteristic value determination module includes:
Execution count recording submodule, configured to record the number of executions of the step of calculating the characteristic value corresponding to the target entity according to the first feature vector and the second feature vector;
Second preset threshold judgment submodule, configured to judge whether the number of executions is greater than the second preset threshold;
Maximum characteristic value selection submodule, configured to select the maximum characteristic value among the characteristic values when the number of executions is greater than the second preset threshold.
Because the device embodiment is substantially similar to the method embodiment, its description is relatively brief; for related details, refer to the corresponding parts of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar the embodiments may be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention may be provided as a method, a device, or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The embodiments of the present invention are described with reference to flow charts and/or block diagrams of the method, terminal device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flow charts and/or block diagrams, and combinations of flows and/or blocks in the flow charts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a means for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, and the instruction means implements the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps is executed on the computer or other programmable terminal device to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal device thereby provide steps for implementing the functions specified in one or more flows of the flow chart and/or one or more blocks of the block diagram.
Although preferred embodiments of the embodiments of the present invention have been described, those skilled in the art can make other changes and modifications to these embodiments once they grasp the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another and do not necessarily require or imply any actual relation or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variant are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of additional identical elements in the process, method, article or terminal device that includes the element.
The method and the device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, the specific implementation and the scope of application may change in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (16)

1. A data processing method for a knowledge graph, characterized in that the knowledge graph includes an initial entity and other entities, the initial entity and the other entities having classification information and relation information, the method comprising:
selecting a currently processed target entity from the other entities;
obtaining a first feature vector of the initial entity, and obtaining a second feature vector of the target entity;
calculating a feature value corresponding to the target entity according to the first feature vector and the second feature vector;
determining the maximum feature value among the feature values;
for the maximum feature value, updating the classification information and relation information of the initial entity and the other entities using the corresponding first feature vector and second feature vector.
2. The method according to claim 1, characterized in that the initial entity and the other entities each have corresponding word vector data, and the step of selecting a currently processed target entity from the other entities comprises:
calculating a transition probability value using the word vector data of the initial entity and the word vector data of the other entities;
judging whether the transition probability value is greater than a first preset threshold;
when the transition probability value is greater than the first preset threshold, determining the other entity corresponding to the transition probability value to be the target entity.
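By way of non-limiting illustration, the selection step of claim 2 can be sketched in Python as follows. The claim does not state how the transition probability value is derived from the vector data of the entities; this sketch assumes a softmax over cosine similarities. The function names `cosine`, `transition_probabilities`, and `select_targets`, and the sample vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def transition_probabilities(initial_vec, other_vecs):
    """Assumed definition: softmax over cosine similarities to the initial entity."""
    sims = {name: cosine(initial_vec, vec) for name, vec in other_vecs.items()}
    total = sum(math.exp(s) for s in sims.values())
    return {name: math.exp(s) / total for name, s in sims.items()}


def select_targets(initial_vec, other_vecs, first_threshold):
    """Keep every other entity whose transition probability exceeds the threshold."""
    probs = transition_probabilities(initial_vec, other_vecs)
    return [name for name, p in probs.items() if p > first_threshold]


# Usage with made-up two-dimensional vectors.
others = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [-1.0, 0.0]}
print(select_targets([1.0, 0.0], others, 0.3))  # ['a']
```

Any other normalized similarity could replace the softmax here; only the comparison against the first preset threshold is taken from the claim.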
3. The method according to claim 1 or 2, characterized in that the step of calculating the feature value corresponding to the target entity according to the first feature vector and the second feature vector comprises:
calculating conditional probability values corresponding to the target entity according to the first feature vector and the second feature vector;
multiplying the conditional probability values together to obtain a cumulative-product conditional probability value;
taking the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
summing the logarithmic conditional probability values to obtain the feature value.
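As a non-limiting illustration, the calculation in claim 3 (conditional probabilities, their product, the logarithm of that product, and the sum of the logarithms) can be sketched in Python. The claim does not define the conditional probability itself; this sketch assumes a logistic function of the dot product of the two vectors, and assumes the second vectors arrive in groups whose probabilities are multiplied together. The names `conditional_probability` and `feature_value` are hypothetical.

```python
import math


def conditional_probability(first_vec, second_vec):
    """Assumed form: logistic function of the dot product of the two vectors."""
    dot = sum(a * b for a, b in zip(first_vec, second_vec))
    return 1.0 / (1.0 + math.exp(-dot))


def feature_value(first_vec, target_groups):
    """Sum over groups of log(product of conditional probabilities in the group).

    target_groups is a list of groups; each group is a list of second vectors.
    """
    total = 0.0
    for group in target_groups:
        product = 1.0
        for second_vec in group:
            product *= conditional_probability(first_vec, second_vec)  # cumulative product
        total += math.log(product)  # logarithm, then accumulate
    return total


# With an all-zero first vector every probability is 0.5, so the result is
# log(0.5 * 0.5) + log(0.5) = log(0.125).
groups = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0]]]
print(feature_value([0.0, 0.0], groups))
```

Because the logarithm of a product equals the sum of the logarithms, an implementation could sum `math.log` of each probability directly, which avoids underflow when many small probabilities are multiplied.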
4. The method according to claim 1, characterized in that the step of updating, for the maximum feature value, the classification information and relation information of the initial entity and the other entities using the corresponding first feature vector and second feature vector comprises:
for the maximum feature value, obtaining the corresponding first feature vector and second feature vector;
labeling the initial entity and the other entities with classification information according to the first feature vector and the second feature vector;
adding relation information to the initial entity and the other entities according to the first feature vector and the second feature vector.
5. The method according to claim 4, characterized in that the step of labeling the initial entity and the other entities with classification information according to the first feature vector and the second feature vector comprises:
training a first classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
calculating the classification information of the initial entity and the other entities using the first classifier;
labeling the initial entity and the other entities with the classification information.
6. The method according to claim 4, characterized in that the step of adding relation information to the initial entity and the other entities according to the first feature vector and the second feature vector comprises:
training a second classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
calculating the relation information of the initial entity and the other entities using the second classifier;
adding the relation information to the initial entity and the other entities.
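Claims 5 and 6 do not name a classifier type or a training set. As a non-limiting illustration, the Python sketch below uses a nearest-centroid classifier as a stand-in and assumes that some entity pairs already carry labels to train on. Each input is assumed to be the concatenation of a first vector and a second vector; the names `train_centroids` and `predict` and the sample labels are hypothetical.

```python
def train_centroids(vectors, labels):
    """Average the training vectors of each label into one centroid per label."""
    sums, counts = {}, {}
    for vec, label in zip(vectors, labels):
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, x in enumerate(vec):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}


def predict(centroids, vec):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(vec, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Each row stands for a concatenated (first vector, second vector) pair.
train_x = [[0.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 1.0],
           [5.0, 5.0, 5.0, 5.0], [5.0, 6.0, 5.0, 6.0]]
train_y = ["person", "person", "place", "place"]
model = train_centroids(train_x, train_y)
print(predict(model, [0.2, 0.3, 0.1, 0.4]))  # person
```

The same two functions could serve as the second classifier of claim 6 by training on relation labels instead of category labels.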
7. The method according to claim 1, characterized in that the method further comprises:
for feature values other than the maximum, updating the corresponding first feature vector and second feature vector.
8. The method according to claim 1, characterized in that the step of determining the maximum feature value among the feature values comprises:
recording the number of executions of the step of calculating the feature value corresponding to the target entity according to the first feature vector and the second feature vector;
judging whether the number of executions is greater than a second preset threshold;
when the number of executions is greater than the second preset threshold, selecting the maximum feature value among the feature values.
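The stopping rule of claim 8 can be illustrated, without limitation, by the Python sketch below: the calculation step is repeated, its executions are counted, and the maximum is chosen only once the count exceeds the second preset threshold. The function name `find_max_feature_value` and the `compute_step` callback are hypothetical placeholders for whatever calculation an implementation uses.

```python
def find_max_feature_value(compute_step, second_threshold):
    """Repeat the calculation until the execution count exceeds the threshold,
    then return (index, value) of the largest value seen."""
    values = []
    executions = 0
    while executions <= second_threshold:
        values.append(compute_step(executions))  # one execution of the step
        executions += 1                          # record the execution count
    # Here executions > second_threshold, so the maximum may be selected.
    best_index = max(range(len(values)), key=values.__getitem__)
    return best_index, values[best_index]


# A made-up calculation step whose value peaks at the third execution.
print(find_max_feature_value(lambda i: -(i - 2) ** 2, 3))  # (2, 0)
```

In a fuller implementation the vectors associated with the non-maximum values would be updated between executions, as recited in claim 7.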
9. A data processing device for a knowledge graph, characterized in that the knowledge graph includes an initial entity and other entities, the initial entity and the other entities having classification information and relation information, the device comprising:
a target entity selection module, configured to select a currently processed target entity from the other entities;
a first and second feature vector acquisition module, configured to obtain a first feature vector of the initial entity and to obtain a second feature vector of the target entity;
a feature value calculation module, configured to calculate a feature value corresponding to the target entity according to the first feature vector and the second feature vector;
a maximum feature value determination module, configured to determine the maximum feature value among the feature values;
a classification information and relation information update module, configured to update, for the maximum feature value, the classification information and relation information of the initial entity and the other entities using the corresponding first feature vector and second feature vector.
10. The device according to claim 9, characterized in that the target entity selection module comprises:
a transition probability value calculation submodule, configured to calculate a transition probability value using the word vector data of the initial entity and the word vector data of the other entities;
a first preset threshold judgment submodule, configured to judge whether the transition probability value is greater than a first preset threshold;
a target entity determination submodule, configured to determine, when the transition probability value is greater than the first preset threshold, the other entity corresponding to the transition probability value to be the target entity.
11. The device according to claim 9 or 10, characterized in that the feature value calculation module comprises:
a conditional probability value calculation submodule, configured to calculate conditional probability values corresponding to the target entity according to the first feature vector and the second feature vector;
a cumulative-product conditional probability value obtaining submodule, configured to multiply the conditional probability values together to obtain a cumulative-product conditional probability value;
a logarithmic conditional probability value obtaining submodule, configured to take the logarithm of the cumulative-product conditional probability value to obtain a logarithmic conditional probability value;
a feature value obtaining submodule, configured to sum the logarithmic conditional probability values to obtain the feature value.
12. The device according to claim 9, characterized in that the first and second feature vector update module comprises:
a first and second feature vector extraction submodule, configured to extract, for the maximum feature value, the corresponding first feature vector and second feature vector;
a classification information labeling submodule, configured to label the initial entity and the other entities with classification information according to the first feature vector and the second feature vector;
a relation information adding submodule, configured to add relation information to the initial entity and the other entities according to the first feature vector and the second feature vector.
13. The device according to claim 12, characterized in that the classification information labeling submodule comprises:
a first classifier training unit, configured to train a first classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
a classification information calculation unit, configured to calculate the classification information of the initial entity and the other entities using the first classifier;
a classification information labeling unit, configured to label the initial entity and the other entities with the classification information.
14. The device according to claim 12, characterized in that the relation information adding submodule comprises:
a second classifier training unit, configured to train a second classifier using the first feature vector and the second feature vector of the initial entity and the other entities;
a relation information calculation unit, configured to calculate the relation information of the initial entity and the other entities using the second classifier;
a relation information adding unit, configured to add the relation information to the initial entity and the other entities.
15. The device according to claim 9, characterized in that the device further comprises:
a first and second vector update module, configured to update, for feature values other than the maximum, the corresponding first feature vector and second feature vector.
16. The device according to claim 9, characterized in that the maximum feature value determination module comprises:
an execution count recording submodule, configured to record the number of executions of the step of calculating the feature value corresponding to the target entity according to the first feature vector and the second feature vector;
a second preset threshold judgment submodule, configured to judge whether the number of executions is greater than a second preset threshold;
a maximum feature value selection submodule, configured to select, when the number of executions is greater than the second preset threshold, the maximum feature value among the feature values.
CN201610825067.8A 2016-09-14 2016-09-14 A kind of data processing method of knowledge mapping and device Pending CN106503035A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610825067.8A CN106503035A (en) 2016-09-14 2016-09-14 A kind of data processing method of knowledge mapping and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610825067.8A CN106503035A (en) 2016-09-14 2016-09-14 A kind of data processing method of knowledge mapping and device

Publications (1)

Publication Number Publication Date
CN106503035A true CN106503035A (en) 2017-03-15

Family

ID=58291416

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610825067.8A Pending CN106503035A (en) 2016-09-14 2016-09-14 A kind of data processing method of knowledge mapping and device

Country Status (1)

Country Link
CN (1) CN106503035A (en)

Cited By (32)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108694178B (en) * 2017-04-06 2020-11-27 北京国双科技有限公司 Method and device for recommending judicial knowledge
CN108694178A (en) * 2017-04-06 2018-10-23 北京国双科技有限公司 A kind of method and device for recommending judicial cognizance
CN107193959A (en) * 2017-05-24 2017-09-22 南京大学 A kind of business entity's sorting technique towards plain text
CN107193959B (en) * 2017-05-24 2020-11-27 南京大学 Pure text-oriented enterprise entity classification method
CN108959328B (en) * 2017-05-27 2021-12-21 株式会社理光 Knowledge graph processing method and device and electronic equipment
CN108959328A (en) * 2017-05-27 2018-12-07 株式会社理光 Processing method, device and the electronic equipment of knowledge mapping
CN107665230A (en) * 2017-06-21 2018-02-06 海信集团有限公司 Training method and device for the users' behavior model of Intelligent housing
CN107229878B (en) * 2017-06-28 2019-09-24 海南大学 A kind of resource security protection method based on data map, Information Atlas and knowledge mapping that the safety that investment determines is definable
CN107229878A (en) * 2017-06-28 2017-10-03 海南大学 A kind of resource security protection method based on data collection of illustrative plates, Information Atlas and knowledge mapping for putting into the security definable determined
CN107272521A (en) * 2017-08-05 2017-10-20 曲阜师范大学 A kind of Intelligent hardware control method of knowledge mapping driving
CN107943874A (en) * 2017-11-13 2018-04-20 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN107943874B (en) * 2017-11-13 2019-08-23 平安科技(深圳)有限公司 Knowledge mapping processing method, device, computer equipment and storage medium
CN108461151A (en) * 2017-12-15 2018-08-28 北京大学深圳研究生院 A kind of the logic Enhancement Method and device of knowledge mapping
CN108461151B (en) * 2017-12-15 2021-06-15 北京大学深圳研究生院 Logic enhancement method and device of knowledge graph
CN108197108A (en) * 2017-12-29 2018-06-22 智搜天机(北京)信息技术有限公司 The method and its system that terminal contact based on AI intelligently extends
CN108563780A (en) * 2018-04-25 2018-09-21 北京比特智学科技有限公司 Course content recommends method and apparatus
CN109033160A (en) * 2018-06-15 2018-12-18 东南大学 A kind of knowledge mapping dynamic updating method
CN111078844A (en) * 2018-10-18 2020-04-28 上海交通大学 Task-based dialog system and method for software crowdsourcing
CN111078844B (en) * 2018-10-18 2023-03-14 上海交通大学 Task-based dialog system and method for software crowdsourcing
CN109582802B (en) * 2018-11-30 2020-11-03 国信优易数据股份有限公司 Entity embedding method, device, medium and equipment
CN109582802A (en) * 2018-11-30 2019-04-05 国信优易数据有限公司 A kind of entity embedding grammar, device, medium and equipment
CN113779266B (en) * 2018-12-17 2023-10-13 北京百度网讯科技有限公司 Knowledge graph-based information processing method and device
CN113779266A (en) * 2018-12-17 2021-12-10 北京百度网讯科技有限公司 Knowledge graph-based information processing method and device
CN109634939A (en) * 2018-12-28 2019-04-16 中国农业银行股份有限公司 A kind of the determination method, apparatus and electronic equipment of missing values
CN109828965A (en) * 2019-01-09 2019-05-31 北京小乘网络科技有限公司 A kind of method and electronic equipment of data processing
CN112351441A (en) * 2019-08-06 2021-02-09 ***通信集团广东有限公司 Data processing method and device and electronic equipment
CN112351441B (en) * 2019-08-06 2023-08-15 ***通信集团广东有限公司 Data processing method and device and electronic equipment
WO2021032002A1 (en) * 2019-08-20 2021-02-25 星环信息科技(上海)股份有限公司 Big data processing method based on heterogeneous distributed knowledge graph, device, and medium
CN110704743A (en) * 2019-09-30 2020-01-17 北京科技大学 Semantic search method and device based on knowledge graph
CN110704743B (en) * 2019-09-30 2022-02-18 北京科技大学 Semantic search method and device based on knowledge graph
CN113222771A (en) * 2020-07-10 2021-08-06 杭州海康威视数字技术股份有限公司 Method and device for determining target group based on knowledge graph and electronic equipment
CN113222771B (en) * 2020-07-10 2023-10-20 杭州海康威视数字技术股份有限公司 Method and device for determining target group based on knowledge graph and electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315