CN113673249B - Entity identification method, device, equipment and storage medium

Publication number
CN113673249B
Authority
CN
China
Prior art keywords
character
feature
vector
entity
word
Prior art date
Legal status
Active
Application number
CN202110984162.3A
Other languages
Chinese (zh)
Other versions
CN113673249A (en)
Inventor
匡俊
曹雪智
陈凤娇
郭林森
徐灏
谢睿
张富峥
王仲远
Current Assignee
Beijing Sankuai Online Technology Co Ltd
Original Assignee
Beijing Sankuai Online Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Sankuai Online Technology Co Ltd
Priority to CN202110984162.3A
Publication of CN113673249A
Application granted
Publication of CN113673249B

Classifications

    • G06F40/295 Named entity recognition (G06F40/20 Natural language analysis; G06F40/279 Recognition of textual entities; G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking)
    • G06F16/367 Ontology (G06F16/36 Creation of semantic tools, e.g. ontology or thesauri)
    • G06N3/044 Recurrent networks, e.g. Hopfield networks (G06N3/04 Architecture, e.g. interconnection topology)
    • G06N3/047 Probabilistic or stochastic networks (G06N3/04 Architecture, e.g. interconnection topology)


Abstract

The application discloses an entity identification method, apparatus, device and storage medium, and belongs to the technical field of artificial intelligence. The method comprises the following steps: performing first feature mapping on an input text to obtain an initial feature vector of the input text; acquiring a first class of feature vectors and a second class of feature vectors based on a knowledge graph, where the first class of feature vectors are feature representations of word-level knowledge in the knowledge graph, and the same word belonging to different types in the knowledge graph corresponds to different feature representations, while the second class of feature vectors are feature representations of character-level knowledge in the knowledge graph, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; and performing entity recognition on the input text based on the initial feature vector, the first class of feature vectors and the second class of feature vectors. The method and the device can avoid word segmentation errors and entity type confusion, improve the accuracy of entity identification, and achieve a good entity identification effect.

Description

Entity identification method, device, equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for entity identification.
Background
Natural Language Processing (NLP) is an important direction in the field of artificial intelligence technology. NLP involves entity recognition, which refers to recognizing entities with specific meanings, such as trade names, brand names and attribute names, from unstructured text; the entity recognition result can then be applied to downstream tasks, such as knowledge graph construction and user intention recognition.
The entity recognition effect is affected by factors such as word segmentation errors and entity type confusion. A word segmentation error means that the input text cannot be accurately segmented; for example, "Nanjing Yangtze River Bridge" may be incorrectly segmented as "Nanjing mayor / Jiang Bridge". Entity type confusion refers to the inability to correctly identify the type of a word in the input text when the word has multiple meanings; for example, "strawberry" may be used as a trade name, or as an attribute name representing a flavor. Entity type confusion for "strawberry" means failing to correctly recognize whether it is used as a trade name or as an attribute name in the input text.
Based on the above description, it can be seen that reducing word segmentation errors and avoiding entity type confusion are key to ensuring the entity recognition effect; therefore, how to improve the accuracy of entity recognition has become a research hotspot for those skilled in the art.
Disclosure of Invention
The embodiment of the application provides an entity identification method, device, equipment and storage medium, which can avoid word segmentation errors and entity type confusion, improve the accuracy of entity identification and have a good entity identification effect. The technical scheme is as follows:
in one aspect, an entity identification method is provided, and the method includes:
performing first feature mapping on an input text to obtain an initial feature vector of the input text;
acquiring a first class of feature vectors and a second class of feature vectors based on a knowledge graph; the first-class feature vectors are feature representations of word-level knowledge in the knowledge graph, and the same words belonging to different types in the knowledge graph correspond to different feature representations; the second type of feature vector is the feature representation of character-level knowledge in the knowledge graph, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; a term includes one or more characters;
and performing entity recognition on the input text based on the initial feature vector of the input text, the first class of feature vector and the second class of feature vector.
In another aspect, an entity identifying apparatus is provided, the apparatus including:
the processing module is configured to perform first feature mapping on an input text to obtain an initial feature vector of the input text;
an acquisition module configured to acquire a first class of feature vectors and a second class of feature vectors based on a knowledge graph; the first-class feature vectors are feature representations of word-level knowledge in the knowledge graph, and the same words belonging to different types in the knowledge graph correspond to different feature representations; the second type of feature vector is the feature representation of the character-level knowledge in the knowledge graph, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; a term includes one or more characters;
an identification module configured to perform entity identification on the input text based on the initial feature vector of the input text, the first class of feature vectors, and the second class of feature vectors.
In some embodiments, the obtaining module includes:
the first processing unit is configured to convert the knowledge graph into a first node sequence in a random walk mode; the first node sequence is used for indicating word-word migration paths, and each node in the first node sequence is used for indicating a word and a type corresponding to the word;
the second processing unit is configured to perform character level splitting on each word in the first node sequence to obtain a second node sequence; the second node sequence is used for indicating character-character walking paths, and each node in the second node sequence is used for indicating a character and a type corresponding to the character;
a third processing unit configured to generate the first class feature vector based on the first node sequence; a fourth processing unit configured to generate the second class feature vector based on the second node sequence.
In some embodiments, the third processing unit is configured to:
generating a third node sequence based on different types corresponding to each word in the first node sequence; the third node sequence is used for indicating a wandering path between word-types;
and performing second feature mapping on the third node sequence to obtain the first class of feature vectors.
In some embodiments, the fourth processing unit is configured to:
generating a fourth node sequence based on different types corresponding to each character in the second node sequence; the fourth node sequence is used for indicating a wandering path between character-types;
and performing second feature mapping on the fourth node sequence to obtain the second class of feature vectors.
In some embodiments, the identification module comprises:
the fusion unit is configured to fuse an initial feature vector of a character, feature representations of different types corresponding to the character, and word feature representations corresponding to the character to obtain a final feature vector of the character for any character in the input text;
a recognition unit configured to input final feature vectors of all characters in the input text into an entity recognition model; performing entity recognition on the input text based on the entity recognition model;
the word characteristic representation corresponding to the characters refers to characteristic representations of different types corresponding to target words, and the target words refer to words matched with the characters in the knowledge graph.
In some embodiments, the target term includes at least one of:
words with the beginning positions matched with the characters in the knowledge graph;
words and phrases in the knowledge graph, the middle positions of which are matched with the characters;
words with end positions matched with the characters in the knowledge graph;
and the individual words in the knowledge graph are matched with the characters.
In some embodiments, the fusion unit is configured to:
fusing the initial feature vector of the character and the feature representations of different types corresponding to the character to obtain a first intermediate vector;
fusing the initial feature vector of the character and the word feature representation corresponding to the character; generating a second intermediate vector based on the fused feature vector;
and performing feature splicing on the initial feature vector of the character, the first intermediate vector and the second intermediate vector to obtain a final feature vector of the character.
In some embodiments, the entity recognition model includes a Bidirectional Long Short-Term Memory (Bi-LSTM) network and a Conditional Random Field (CRF) layer; the identification unit is configured to:
based on the bidirectional LSTM, encoding the final feature vector of each character in the input text to obtain an implicit vector of each character in the input text;
fusing an implicit vector of the character and different types of feature representations corresponding to the character to obtain a third intermediate vector for any character in the input text;
performing feature splicing on the implicit vector of the character and the third intermediate vector to obtain an output vector of the character;
decoding the output vector of each character in the input text based on the CRF layer to obtain an entity recognition result of the input text; the entity recognition result comprises a word segmentation result and a type corresponding to each entity in the word segmentation result.
In some embodiments, the fusion unit is configured to:
determining at least one probability value based on the initial feature vector of the character and the feature representations of the character corresponding to different types; acquiring the first intermediate vector according to the at least one probability value;
wherein the at least one probability value is used to indicate a probability that the character belongs to different types, and the probability value corresponding to the type that is more consistent with the current context is larger.
In another aspect, a computer device is provided, the device comprising a processor and a memory, the memory having stored therein at least one program code, the at least one program code being loaded and executed by the processor to implement the entity identification method described above.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, the at least one program code being loaded and executed by a processor to implement the entity identification method described above.
In another aspect, a computer program product or a computer program is provided, the computer program product or the computer program comprising computer program code, the computer program code being stored in a computer-readable storage medium, the computer program code being read by a processor of a computer device from the computer-readable storage medium, the computer program code being executed by the processor such that the computer device performs the entity identification method described above.
In the embodiment of the application, in the process of entity recognition, the feature representation of character-level knowledge and the feature representation of word-level knowledge in a knowledge graph are introduced, and both types of feature representation are related to the entity type; that is, the same word belonging to different types in the knowledge graph corresponds to different feature representations, and the same character belonging to different types corresponds to different feature representations. In other words, in the process of entity recognition, word-level knowledge is introduced based on the knowledge graph, character-level knowledge is introduced at the same time, and entity types are further considered; this can effectively avoid word segmentation errors and entity type confusion, improve the accuracy of entity recognition, and achieve a good entity recognition effect.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic diagram of an implementation environment related to an entity identification method provided in an embodiment of the present application;
fig. 2 is a flowchart of an entity identification method according to an embodiment of the present application;
fig. 3 is a flowchart of another entity identification method provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a learning knowledge vector representation provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of an entity recognition model provided by an embodiment of the present application;
fig. 6 is a schematic structural diagram of an entity identification apparatus according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a computer device according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of another computer device provided in the embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The terms "first," "second," and the like, in this application, are used for distinguishing between similar items and items that have substantially the same function or similar functionality, and it should be understood that "first," "second," and "nth" do not have any logical or temporal dependency, nor do they define a quantity or order of execution. It will be further understood that, although the following description uses the terms first, second, etc. to describe various elements, these elements should not be limited by these terms.
These terms are only used to distinguish one element from another element. For example, a first element can be termed a second element, and, similarly, a second element can also be termed a first element, without departing from the scope of various examples. The first element and the second element may both be elements, and in some cases, may be separate and distinct elements.
In this application, "at least one" means one or more; for example, at least one element may be one element, two elements, three elements, and the like. "A plurality of" means two or more; for example, a plurality of elements may be two elements, three elements, or any integer number of elements of two or more.
Some key terms or abbreviations referred to in the embodiments of the present application are described below.
Knowledge graph: the knowledge is structurally characterized in a graph form, so that artificial intelligence can process, understand and even reason the human knowledge, and higher cognitive intelligence is achieved. The knowledge graph represents knowledge in a graph structure form, an Entity (Entity) or a Concept (Concept) serves as a node, and an edge is used for indicating a relationship between the node and the node.
Entity: from the perspective of data processing, objects that objectively exist in the real world can be referred to as entities; in other words, an entity is any distinguishable and identifiable object in the real world. For example, an entity may refer to a person, such as a teacher or a student, or to an object, such as a book or a warehouse. Furthermore, in addition to objective objects, an entity may refer to abstract things, such as shows or soccer games. For example, in an O2O (online-to-offline) scenario, the title of a product is generally text customized by a merchant, in which the name, attributes and specification of the product are mixed together with a lot of noise, which is not conducive to matching the user's intention. Entity recognition can extract this information from the commodity title, so that the user intention can be understood more accurately.
An implementation environment of the entity identification method provided by the embodiment of the present application is described below.
The entity identification method provided by the embodiment of the application is applied to computer equipment. The computer device may be embodied as a terminal or as a server. In another expression, the entity identification method may be executed by the terminal alone, the server alone, or the terminal and the server in combination, and the present application is not limited herein.
Exemplarily, fig. 1 is a schematic diagram of an implementation environment related to an entity identification method provided in an embodiment of the present application. The implementation environment includes: a terminal 101 and a server 102. Wherein the terminal 101 provides a user input interface and the user inputs text through the user input interface provided by the terminal 101. The server 102 is responsible for receiving the text sent by the terminal 101 and performing entity identification on the text.
In some embodiments, the terminal 101 may be, but is not limited to, a smartphone, a tablet, a laptop, a desktop computer, a smart speaker, a smart watch, and the like. The terminal 101 and the server 102 are directly or indirectly connected by wired or wireless communication, and the present application is not limited thereto.
In some embodiments, the terminal 101 generally refers to one of a plurality of terminals, and the embodiment of the present application is illustrated by the terminal 101. Those skilled in the art will appreciate that the number of terminals 101 can be greater. For example, the number of the terminals 101 is dozens or hundreds, or more, and the implementation environment of the entity identification method includes other terminals. The number and the type of the terminals are not limited in the embodiments of the present application.
In some embodiments, the server 102 may be a stand-alone physical server or a server cluster composed of a plurality of physical servers.
In some embodiments, the application scenarios of the entity identification method include, but are not limited to: user intention identification, knowledge graph construction, content recommendation and the like in the intelligent question and answer process, and the application is not limited herein.
Fig. 2 is a flowchart of an entity identification method according to an embodiment of the present application, where an execution subject of the entity identification method is a computer device. Referring to fig. 2, the entity identification method includes the steps of:
201. the computer equipment carries out first feature mapping on the input text to obtain an initial feature vector of the input text.
In the embodiment of the application, the input text is the text to be subjected to entity recognition, the text can be input by a user, and the user intention can be better understood by performing entity recognition on the text.
In some embodiments, the first feature mapping of the input text refers to feature mapping of the input text through a Language Model (LM), which may be, for example, a neural network Language Model, and the application is not limited herein.
It should be noted that the feature vectors obtained by the language model are collectively referred to as language model vectors herein.
202. The computer equipment acquires a first class of feature vectors and a second class of feature vectors based on the knowledge graph; the first-class feature vector is the feature representation of the word level knowledge in the knowledge map, and the same words belonging to different types in the knowledge map correspond to different feature representations; the second type of feature vector is the feature representation of the character-level knowledge in the knowledge-graph, and the same character belonging to different types in the knowledge-graph corresponds to different feature representations.
The embodiment of the application provides an entity identification method based on knowledge graph enhancement, which learns the characteristic representation of knowledge based on relationship information (also called topological structure information) and entity type information in a knowledge graph and introduces the characteristic representation into an entity identification process so as to solve the problems of word segmentation errors, entity type confusion and the like, thereby ensuring the entity identification effect. Namely, the problems of word segmentation errors, entity type confusion and the like caused by knowledge loss are relieved by utilizing the relation information and the entity type information in the knowledge graph.
In some embodiments, the embodiments of the present application utilize a graph embedding method to learn feature representations of knowledge from the knowledge graph. Illustratively, since Chinese generally employs character-level input, the embodiments of the present application learn feature representations of character-level knowledge and word-level knowledge simultaneously from the knowledge graph. The above feature representation of word-level knowledge is also referred to herein as the first class of feature vectors or the knowledge vector representation of a word; the above feature representation of character-level knowledge is also referred to herein as the second class of feature vectors or the knowledge vector representation of a character.
The first point to be noted is that entity types include, but are not limited to: the commodity "PROD", the brand "BR", the attribute "ATTR" or "AT", the person "PER", the location "LOC", and the like. A word includes one or more characters; for example, the Chinese word for strawberry, "草莓", includes the characters "草" (grass) and "莓" (berry).
The second point to be noted is that each node in the knowledge-graph is used to indicate an entity and a type corresponding to the entity. And under the condition that a certain entity corresponds to N types, the knowledge graph comprises N nodes corresponding to the entity.
The third point to be explained is that, because entity type information is introduced, the same word belonging to different types in the knowledge graph corresponds to different feature representations, and the same character belonging to different types in the knowledge graph corresponds to different feature representations. For example, the word "strawberry" has multiple types: it can be used both as a trade name and as an attribute name representing a flavor. The same word "strawberry" therefore corresponds to two different feature representations; that is, "strawberry" as a trade name corresponds to one feature representation, and "strawberry" as an attribute name corresponds to another feature representation.
A fourth point to be noted is that the step 202 may be executed before the step 201, and the present application is not limited herein.
203. The computer device performs entity recognition on the input text based on the initial feature vector, the first class of feature vector and the second class of feature vector of the input text.
In the embodiment of the application, after the initial feature vector, the first class feature vector and the second class feature vector of the input text are obtained, entity recognition is completed based on an entity recognition model.
In some embodiments, the entity recognition model includes Bi-LSTM and CRF layers. The CRF layer is used to constrain the reasonableness of the type labels predicted by the Bi-LSTM. The Bi-LSTM alone can predict the probability that each character belongs to each type label and then take the type label with the highest probability as the final prediction for that character; however, predicting in this way ignores the dependency between type labels. A CRF layer is therefore added after the output layer of the Bi-LSTM, so that the entity recognition model can take the dependency between type labels into account.
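As an illustration of this architecture, the following is a minimal sketch of a Bi-LSTM + CRF tagger in PyTorch; it relies on the third-party pytorch-crf package for the CRF layer, and the class name and layer sizes are illustrative assumptions rather than the patent's reference implementation.

```python
# Minimal Bi-LSTM + CRF tagger sketch (assumed shapes; not the patent's code).
import torch
import torch.nn as nn
from torchcrf import CRF  # pip install pytorch-crf


class BiLstmCrfTagger(nn.Module):
    def __init__(self, input_dim: int, hidden_dim: int, num_tags: int):
        super().__init__()
        # Bi-LSTM encodes each character's (knowledge-fused) input vector.
        self.lstm = nn.LSTM(input_dim, hidden_dim // 2,
                            bidirectional=True, batch_first=True)
        self.emit = nn.Linear(hidden_dim, num_tags)  # per-character label scores
        self.crf = CRF(num_tags, batch_first=True)   # models label transitions

    def loss(self, x, tags, mask):
        # x: (batch, seq_len, input_dim); tags: (batch, seq_len)
        emissions = self.emit(self.lstm(x)[0])
        return -self.crf(emissions, tags, mask=mask)  # negative log-likelihood

    def decode(self, x, mask):
        emissions = self.emit(self.lstm(x)[0])
        return self.crf.decode(emissions, mask=mask)  # best label sequences
```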
In the embodiment of the application, in the process of entity recognition, the feature representation of character-level knowledge and the feature representation of word-level knowledge in the knowledge graph are introduced, and both types of feature representation are related to the entity type; that is, the same word belonging to different types in the knowledge graph corresponds to different feature representations, and the same character belonging to different types corresponds to different feature representations. In the process of entity recognition, word-level knowledge is introduced based on the knowledge graph, character-level knowledge is introduced at the same time, and entity types are further considered; this entity recognition approach can effectively avoid word segmentation errors and entity type confusion, improve the accuracy of entity recognition, and achieve a good entity recognition effect.
Fig. 3 is a flowchart of an entity identification method according to an embodiment of the present application, where an execution subject of the entity identification method is a computer device. Referring to fig. 3, the entity identification method includes the steps of:
301. the computer equipment converts the knowledge graph into a first node sequence in a random walk mode; the first node sequence is used for indicating a word-word wandering path, and each node in the first node sequence is used for indicating a word and a type corresponding to the word.
Random walk, also called random wandering, means that future steps and directions cannot be predicted based on past performance; its ideal mathematical state is close to Brownian motion.
In the embodiment of the present application, the basic idea of random walk is as follows. The knowledge graph is traversed starting from one of its nodes. At any node, the traverser walks to a neighbor node of the current node with probability 1-a, and randomly jumps to any node in the knowledge graph with probability a (the jump probability). A probability distribution is obtained after each walk, describing the probability that each node in the knowledge graph is visited. This probability distribution is used as the input for the next walk, and the process is iterated; when certain preconditions are met, the probability distribution converges to a stationary distribution.
By performing random walks on the knowledge graph, the present embodiment obtains word-word walking paths such as the one shown in fig. 4; such a path is also referred to herein as a first node sequence. Referring to fig. 4, each node in the first node sequence is used to indicate a word and the type corresponding to the word. For example, the word "yogurt" is of the commodity type.
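For concreteness, the following is a minimal sketch of this walk procedure under an assumed data layout: the graph is a dict that maps each (name, type) node to its list of neighbour nodes, and a is the jump probability described above.

```python
# Random walk with restart over a knowledge graph (assumed dict-based layout).
import random


def random_walks(graph, walk_len, num_walks, a=0.15):
    """graph maps a (name, type) node to a list of neighbour nodes."""
    nodes = list(graph)
    walks = []
    for _ in range(num_walks):
        node = random.choice(nodes)
        walk = [node]
        for _ in range(walk_len - 1):
            if random.random() < a or not graph[node]:
                node = random.choice(nodes)        # jump to any node
            else:
                node = random.choice(graph[node])  # step to a neighbour
            walk.append(node)
        walks.append(walk)
    return walks
```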
302. The computer equipment splits each word in the first node sequence at a character level to obtain a second node sequence; the second node sequence is used for indicating character-character walking paths, and each node in the second node sequence is used for indicating a character and a type corresponding to the character.
In addition, in order to learn the feature representation of the character-level knowledge and the feature representation of the word-level knowledge synchronously, the embodiment of the application performs character-level splitting on each word in the first node sequence, that is, splits the characters of each word, and further obtains a walking path between characters and characters.
With continued reference to fig. 4, the character-level split turns the word path "dili-yogurt-strawberry-ice cream" into a character path in which each word is replaced by its individual characters; in the figure, for example, "yogurt" (酸奶) becomes the characters "sour" (酸) and "milk" (奶), each node name being a literal rendering of one Chinese character. As shown in fig. 4, each node in the second node sequence is used to indicate a character and the type corresponding to the character; characters such as "sour" and "milk" are of the commodity type.
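A minimal sketch of this character-level split, assuming each node is represented as a (word, type) pair:

```python
# Expand each (word, type) node into one (character, type) node per character.
def split_to_characters(word_sequence):
    """e.g. [('酸奶', 'PROD'), ('草莓', 'PROD')] -> [('酸', 'PROD'), ('奶', 'PROD'), ...]"""
    return [(ch, node_type)
            for word, node_type in word_sequence
            for ch in word]
```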
303. The computer equipment generates the first type of feature vector based on the first node sequence; and generating the second class of feature vectors based on the second node sequence.
In the embodiment of the present application, in order to introduce entity type information into the entity recognition process, two additional paths, namely the "word/type" path and the "character/type" path shown in fig. 4, are constructed on the basis of the word-word walking paths and the character-character walking paths. The "word/type" path is derived from the word-word walking paths, and the "character/type" path is derived from the character-character walking paths.
In some embodiments, the computer device generates a "word/type" path, also referred to herein as a third node sequence, based on the different types corresponding to each word in the first node sequence, where the third node sequence is used to indicate word-type walking paths. Illustratively, referring to fig. 4, the word "strawberry" corresponds to three types, namely brand, commodity and attribute; that is, there are three walking paths in total between the word "strawberry" and its types.
In addition, the computer device generates a "character/type" path, also referred to herein as a fourth node sequence, based on the different types corresponding to each character in the second node sequence, where the fourth node sequence is used to indicate character-type walking paths. Illustratively, referring to fig. 4, the character "berry" corresponds to three types, namely brand, commodity and attribute; that is, there are three walking paths in total between the character "berry" and its types.
Correspondingly, a second feature mapping is performed on the third node sequence to obtain the first class of feature vectors, and a second feature mapping is performed on the fourth node sequence to obtain the second class of feature vectors. In some embodiments, the second feature mapping is a word vector (word2vec) method; that is, the embodiment of the present application applies the word2vec method to the constructed "word/type" paths and "character/type" paths to obtain the first class of feature vectors and the second class of feature vectors.
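The following is a minimal sketch of such a second feature mapping using gensim's word2vec on the node sequences. Joining the surface form and its type into one token (a format assumed here for illustration) makes the vectors type-sensitive, so that, for example, "草莓/PROD" and "草莓/ATTR" receive different representations.

```python
# Train type-sensitive knowledge vectors with word2vec over node sequences.
from gensim.models import Word2Vec


def train_knowledge_vectors(node_sequences, dim=64):
    # One sequence, e.g.: [('草莓', 'BR'), ('草莓', 'PROD'), ('草莓', 'ATTR')]
    corpus = [[f"{name}/{ntype}" for name, ntype in seq]
              for seq in node_sequences]
    model = Word2Vec(sentences=corpus, vector_size=dim,
                     window=5, min_count=1, sg=1)  # skip-gram
    return model.wv  # e.g. model.wv['草莓/PROD'] vs model.wv['草莓/ATTR']
```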
Steps 301 to 303 above describe the process of learning the knowledge vector representations. After the knowledge vector representations have been learned, knowledge-enhanced entity recognition can be performed. As shown in fig. 5, the knowledge-graph-enhanced entity recognition model can be divided into three major parts: the input layer fusing the knowledge vector representations (corresponding to the input encoding on the right side of fig. 5), the coding layer fusing the knowledge vector representations (corresponding to the Bi-LSTM on the right side of fig. 5), and the CRF layer (corresponding to the decoding on the right side of fig. 5). This is explained in detail by the following steps 304-305.
304. The computer equipment carries out first feature mapping on the input text to obtain an initial feature vector of the input text.
Step 304 may refer to step 201, which is not described herein again.
305. The computer device performs entity recognition on the input text based on the initial feature vector, the first class of feature vector and the second class of feature vector of the input text.
In the embodiment of the present application, entity recognition is performed on the input text based on the initial feature vector of the input text, the first class feature vector and the second class feature vector, including but not limited to the following steps.
3051. For any character in the input text, fusing the initial feature vector of the character, the feature representations of the different types corresponding to the character, and the word feature representation corresponding to the character to obtain the final feature vector of the character.
The word characteristic representation corresponding to the characters refers to characteristic representation of different types corresponding to target words, and the target words refer to words matched with the characters in the knowledge graph. In some embodiments, the target term includes at least one of: words with the beginning positions matched with the characters in the knowledge graph; the middle position in the knowledge graph is matched with the words of the character; words with the ending positions matched with the characters in the knowledge graph; and the individual words matched with the characters in the knowledge map.
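A minimal sketch of collecting these target-word sets by matching the input text against a lexicon of knowledge-graph words (the data layout and the max_len bound are assumptions):

```python
# For each character i, collect B_i / M_i / E_i / S_i: knowledge-graph words
# whose beginning / middle / end position, or single character, matches it.
def match_target_words(text, lexicon, max_len=8):
    sets = [{'B': set(), 'M': set(), 'E': set(), 'S': set()} for _ in text]
    for start in range(len(text)):
        for end in range(start + 1, min(start + max_len, len(text)) + 1):
            word = text[start:end]
            if word not in lexicon:
                continue
            if len(word) == 1:
                sets[start]['S'].add(word)
            else:
                sets[start]['B'].add(word)
                sets[end - 1]['E'].add(word)
                for mid in range(start + 1, end - 1):
                    sets[mid]['M'].add(word)
    return sets
```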
This step corresponds to the input layer represented by the fused knowledge vector.
Taking any character $c_i$ as an example, where the character $c_i$ is the $i$-th character in the input text and $i$ is a positive integer, the initial feature vector of the character $c_i$ is denoted herein as $x_i^{lm}$, and the final feature vector of the character $c_i$ is denoted herein as

$$x_i = f_I\big(x_i^{lm},\ \{e_{c_i,t}^{kg}\}_{t \in T_{kg}},\ \{e_w^{kg}\}_{w \in W_i}\big)$$

where $t \in T_{kg}$, $t$ denotes an entity type, and $T_{kg}$ refers to the set of all entity types; $w \in W_i$, and $W_i$ refers to the above target words, i.e., the words in the knowledge graph that match the character $c_i$; $e_w^{kg}$ refers to the word feature representation corresponding to the character $c_i$; $e_{c_i,t}^{kg}$ refers to the feature representation of the character $c_i$ for type $t$. In addition, $f_I$ is a function that fuses the different vectors.
In some embodiments, step 3051, namely fusing the initial feature vector of the character, the feature representations of the different types corresponding to the character, and the word feature representation corresponding to the character, can be further subdivided into the following steps:
3051-1, fusing the initial feature vector of the character $c_i$ and the feature representations of the different types corresponding to $c_i$ to obtain a first intermediate vector.
This step can be expressed as the following formula:

$$h_i^{type} = f_A\big(x_i^{lm},\ \{e_{c_i,t}^{kg}\}_{t \in T_{kg}}\big)$$

where $f_A$ is a function that fuses the language model representation and the type-sensitive knowledge vector representations, and $h_i^{type}$ represents the first intermediate vector.
3051-2, fusing the initial feature vector of the character $c_i$ and the word feature representation corresponding to $c_i$, and generating a second intermediate vector based on the fused feature vector.
This step can be expressed as the following formulas:

$$e_i^{w} = \big[v(B_i);\ v(M_i);\ v(E_i);\ v(S_i)\big]$$

$$h_i^{w} = f_A\big(x_i^{lm},\ e_i^{w}\big)$$

where $e_i^{w}$ refers to the above-mentioned fused feature vector; $h_i^{w}$ refers to the second intermediate vector; $B_i$, $M_i$, $E_i$ and $S_i$ refer to the aforementioned target words, i.e., the sets of words in the knowledge graph whose beginning position, middle position, end position, or single character matches the character $c_i$; and $v(m)$ refers to a vectorized representation of the corresponding word set, $m \in \{B_i, M_i, E_i, S_i\}$.
In some embodiments, fusing the initial feature vector of the character $c_i$ and the feature representations of the different types corresponding to $c_i$ to obtain the first intermediate vector is further refined as: determining at least one probability value based on the initial feature vector of $c_i$ and the feature representations of the different types corresponding to $c_i$; and acquiring the first intermediate vector according to the at least one probability value. The at least one probability value is used to indicate the probability that the character $c_i$ belongs to each type, and the type that is more consistent with the current context has the larger probability value.
Put another way, the function $f_A$ selects the knowledge of the correct type in different contexts based on an attention mechanism:

$$a_{i,t} = \frac{\exp\big(x_i^{lm}\, W_{CA}\, E_{i,t}^{\top}\big)}{\sum_{t' \in T_{kg}} \exp\big(x_i^{lm}\, W_{CA}\, E_{i,t'}^{\top}\big)}, \qquad f_A = \sum_{t \in T_{kg}} a_{i,t}\, E_{i,t}$$

where $x_i^{lm}$ refers to the language model vector; $E_{i,t}$ is either the feature representation $e_{c_i,t}^{kg}$ of the character $c_i$ for type $t$ or the word feature representation $e_w^{kg}$ corresponding to $c_i$; $W_{CA}$ denotes the weight matrix that maps $E_{i,t}$ into an implicit space and is a parameter obtained through training; and $a_{i,t}$ refers to the probability, inferred from the context, that the character $c_i$ is of type $t$, where a type more consistent with the current context corresponds to a higher probability.
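A minimal PyTorch sketch of this attention-based selection, following the formula above; the module name and tensor shapes are illustrative assumptions.

```python
# Attention over a character's candidate type embeddings, weighted by how well
# each type fits the character's language model (context) vector.
import torch
import torch.nn as nn


class TypeAttentionFusion(nn.Module):
    def __init__(self, lm_dim, kg_dim):
        super().__init__()
        # W_CA: maps each type embedding E_{i,t} into the LM space.
        self.w_ca = nn.Linear(kg_dim, lm_dim, bias=False)

    def forward(self, x_lm, type_embs):
        # x_lm: (lm_dim,); type_embs: (num_types, kg_dim)
        scores = self.w_ca(type_embs) @ x_lm  # one score per candidate type
        a = torch.softmax(scores, dim=0)      # a_{i,t}: P(type t | context)
        return a @ type_embs                  # weighted sum of type knowledge
```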
3051-3, performing feature splicing on the initial feature vector of the character, the first intermediate vector and the second intermediate vector to obtain a final feature vector of the character.
This step can be expressed as the following formula:

$$x_i = \big[x_i^{lm};\ h_i^{type};\ h_i^{w}\big]$$

where $[\,;\,]$ denotes the vector concatenation (also called splicing) operation.
3052. Inputting the final characteristic vectors of all characters in the input text into an entity recognition model; and performing entity recognition on the input text based on the entity recognition model.
In some embodiments, step 3052, namely performing entity recognition on the input text based on the entity recognition model, can be further subdivided into the following steps:
3052-1, based on the two-way LSTM included in the entity recognition model, encoding the final feature vector of each character in the input text to obtain an implicit vector of each character in the input text.
This step corresponds to the coding layer represented by the fused knowledge vector.
After obtaining the input vectors $x_1, \ldots, x_n$ that incorporate the knowledge vector representations, the embodiment of the present application continues to encode them with the bidirectional LSTM, so as to obtain context-based implicit vectors. This step can be expressed as the following formula:

$$\big(h_1, \ldots, h_i, h_{i+1}, \ldots, h_n\big) = \mathrm{BiLSTM}\big(x_1, \ldots, x_i, x_{i+1}, \ldots, x_n\big)$$

where $x_1$ refers to the final feature vector of the character $c_1$ in the input text, $x_{i+1}$ to the final feature vector of the character $c_{i+1}$, and $x_n$ to the final feature vector of the character $c_n$; $n$ is the number of characters included in the input text; $h_1$ is the implicit vector of the character $c_1$ in the input text, $h_i$ the implicit vector of the character $c_i$, $h_{i+1}$ the implicit vector of the character $c_{i+1}$, and $h_n$ the implicit vector of the character $c_n$.
3052-2, for any character in the input text, fusing the implicit vector of the character and the feature representations of different types corresponding to the character to obtain a third intermediate vector.
In the embodiment of the application, in order to further strengthen the influence of the knowledge graph on the entity recognition result, the implicit vector is fused with the knowledge vector representation after it is obtained. Considering that entity recognition needs to judge the entity type, i.e., the knowledge vector representation is type-sensitive, when the knowledge vector representation and the implicit vector are fused, the knowledge vector representation corresponding to the currently most suitable type is selected based on an attention mechanism, similarly to the input layer that fuses the knowledge vector representation.
This step can be expressed as the following formula:

$$\tilde{h}_i^{type} = f_A\big(h_i,\ \{e_{c_i,t}^{kg}\}_{t \in T_{kg}}\big)$$

where $\tilde{h}_i^{type}$ refers to the third intermediate vector.
3052-3, performing feature splicing on the implicit vector of the character and the third intermediate vector to obtain an output vector of the character.
Finally, the third intermediate vector $\tilde{h}_i^{type}$ and the implicit vector $h_i$ are combined to obtain the final output vector $o_i$ of the character $c_i$. This step can be expressed as the following equation:

$$o_i = \big[h_i;\ \tilde{h}_i^{type}\big]$$
3052-4, decoding the output vector of each character in the input text based on the CRF layer to obtain an entity recognition result of the input text.
Taking the character $c_i$ in the input text as an example, the final prediction result can be expressed as the following formula:

$$P\big(y_i \mid o_i, y_{i-1}\big) \propto \exp\big(o_i\,\theta_{y_i} + T_{y_{i-1},\,y_i}\big)$$

where $y_i$ denotes the predicted type label of the character $c_i$, $y_{i-1}$ denotes the predicted type label of the character $c_{i-1}$, $T$ is the probability transition matrix, and $\theta$ is a linear mapping matrix.
The first point to be explained is that the entity recognition result of the input text includes a word segmentation result and the type corresponding to each entity in the word segmentation result. For example, for the input text "意大利肉酱面" ("Italian meat-sauce pasta"), the corresponding entity recognition result labels each character of the input text as "意: B-AT 大: I-AT 利: E-AT 肉: B-PROD 酱: I-PROD 面: E-PROD", where "AT" represents an attribute, "PROD" represents a commodity, and "B", "I" and "E" represent the beginning, middle and end positions of a word, respectively. For example, "肉: B-PROD" indicates that the character is at the beginning of a trade name, and "酱: I-PROD" indicates that the character is at a middle position of the trade name.
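For illustration, a minimal sketch (with a hypothetical helper name) that converts such per-character labels back into typed entity spans:

```python
# Convert per-character "position-type" labels (B/I/E/S + type) into entities.
def tags_to_entities(chars, tags):
    entities, buf, cur = [], [], None
    for ch, tag in zip(chars, tags):
        pos, _, typ = tag.partition('-')
        if pos == 'B':
            buf, cur = [ch], typ
        elif pos in ('I', 'E') and typ == cur:
            buf.append(ch)
            if pos == 'E':
                entities.append((''.join(buf), cur))
                buf, cur = [], None
        elif pos == 'S':
            entities.append((ch, typ))
    return entities


# tags_to_entities('意大利肉酱面',
#                  ['B-AT', 'I-AT', 'E-AT', 'B-PROD', 'I-PROD', 'E-PROD'])
# -> [('意大利', 'AT'), ('肉酱面', 'PROD')]
```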
The second point to be noted is that table 1 below shows experimental results obtained by performing experiments on data sets using different entity identification methods. Wherein KANER represents the entity identification method provided by the embodiment of the application; r represents relation information introduced into the knowledge-graph, T represents entity type information introduced into the knowledge-graph, and A represents selection of an appropriate entity type using an attention mechanism. The following experimental results show that the entity identification method provided by the embodiment of the application can obviously improve the entity identification effect, and the relationship information, the entity type information and the attention mechanism in the knowledge graph can play positive roles in the entity identification task.
TABLE 1 (experimental results of the compared entity recognition methods; the table body appears only as an image in the source and is not reproduced here)
In the embodiment of the application, in the process of entity identification, the characteristic representation of character-level knowledge and the characteristic representation of word-level knowledge in a knowledge graph are introduced, and the two types of characteristic representations are related to the entity type; that is, the same words belonging to different types in the knowledge graph correspond to different feature representations, and the same characters belonging to different types correspond to different feature representations; in the entity recognition process, word-level knowledge is introduced based on topological information in the knowledge graph, meanwhile, character-level knowledge is further introduced, entity types are further considered based on entity type knowledge in the knowledge graph, word segmentation errors and entity type confusion can be effectively avoided through the entity recognition mode, entity recognition accuracy is improved, and an entity recognition effect is good.
Fig. 6 is a schematic structural diagram of an entity identification apparatus according to an embodiment of the present application. Referring to fig. 6, the apparatus includes:
a processing module 601, configured to perform first feature mapping on an input text to obtain an initial feature vector of the input text;
an obtaining module 602 configured to obtain a first class of feature vectors and a second class of feature vectors based on a knowledge graph; the first-class feature vectors are feature representations of word-level knowledge in the knowledge graph, and the same words belonging to different types in the knowledge graph correspond to different feature representations; the second type of feature vector is the feature representation of the character-level knowledge in the knowledge graph, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; a term includes one or more characters;
an identification module 603 configured to perform entity identification on the input text based on the initial feature vector of the input text, the first class feature vector, and the second class feature vector.
In the entity recognition apparatus provided by the embodiment of the application, in the process of entity recognition, the feature representation of character-level knowledge and the feature representation of word-level knowledge in a knowledge graph are introduced, and both types of feature representation are related to the entity type; that is, the same word belonging to different types in the knowledge graph corresponds to different feature representations, and the same character belonging to different types corresponds to different feature representations. In the process of entity recognition, word-level knowledge is introduced based on the knowledge graph, character-level knowledge is introduced at the same time, and entity types are further considered, which effectively avoids word segmentation errors and entity type confusion and improves the accuracy of entity recognition.
In some embodiments, the obtaining module includes:
the first processing unit is configured to convert the knowledge graph into a first node sequence in a random walk mode; the first node sequence is used for indicating word-word migration paths, and each node in the first node sequence is used for indicating a word and a type corresponding to the word;
the second processing unit is configured to perform character level splitting on each word in the first node sequence to obtain a second node sequence; the second node sequence is used for indicating character-character walking paths, and each node in the second node sequence is used for indicating a character and a type corresponding to the character;
a third processing unit configured to generate the first class feature vector based on the first node sequence; a fourth processing unit configured to generate the second class feature vector based on the second node sequence.
In some embodiments, the third processing unit is configured to:
generating a third node sequence based on different types corresponding to each word in the first node sequence; the third node sequence is used for indicating a wandering path between word-types;
and performing second feature mapping on the third node sequence to obtain the first class of feature vectors.
In some embodiments, the fourth processing unit is configured to:
generating a fourth node sequence based on different types corresponding to each character in the second node sequence; the fourth node sequence is used for indicating a wandering path between character-types;
and performing second feature mapping on the fourth node sequence to obtain the second class of feature vectors.
In some embodiments, the identification module comprises:
the fusion unit is configured to fuse an initial feature vector of a character, feature representations of different types corresponding to the character, and word feature representations corresponding to the character to obtain a final feature vector of the character for any character in the input text;
a recognition unit configured to input final feature vectors of all characters in the input text into an entity recognition model; performing entity recognition on the input text based on the entity recognition model;
the word characteristic representation corresponding to the characters refers to characteristic representations of different types corresponding to target words, and the target words refer to words matched with the characters in the knowledge graph.
In some embodiments, the target term includes at least one of:
words with the beginning positions matched with the characters in the knowledge graph;
words and phrases in the knowledge graph, the middle positions of which are matched with the characters;
words with end positions matched with the characters in the knowledge graph;
and the individual words in the knowledge graph are matched with the characters.
In some embodiments, the fusion unit is configured to:
fusing the initial feature vector of the character and the feature representations of different types corresponding to the character to obtain a first intermediate vector;
fusing the initial feature vector of the character and the word feature representation corresponding to the character; generating a second intermediate vector based on the fused feature vector;
and performing feature splicing on the initial feature vector of the character, the first intermediate vector and the second intermediate vector to obtain a final feature vector of the character.
In some embodiments, the identification unit is configured to:
based on the bidirectional LSTM, encoding the final feature vector of each character in the input text to obtain an implicit vector of each character in the input text;
fusing an implicit vector of the character and different types of feature representations corresponding to the character to obtain a third intermediate vector for any character in the input text;
performing feature splicing on the implicit vector of the character and the third intermediate vector to obtain an output vector of the character;
decoding the output vector of each character in the input text based on the CRF layer to obtain an entity recognition result of the input text; the entity recognition result comprises a word segmentation result and a type corresponding to each entity in the word segmentation result.
In some embodiments, the fusion unit is configured to:
determining at least one probability value based on the initial feature vector of the character and the feature representations of the character corresponding to different types; acquiring the first intermediate vector according to the at least one probability value;
wherein the at least one probability value is used to indicate a probability that the character belongs to different types, and the probability value corresponding to the type that is more consistent with the current context is larger.
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present disclosure, and are not described in detail herein.
It should be noted that: in the entity identification device provided in the foregoing embodiment, when entity identification is performed, only the division of each functional module is illustrated, and in practical applications, the function distribution may be completed by different functional modules as needed, that is, the internal structure of the device is divided into different functional modules to complete all or part of the functions described above. In addition, the embodiment of the entity identification apparatus and the embodiment of the entity identification method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
Fig. 7 shows a block diagram of a computer device 700 provided in an exemplary embodiment of the present application. The computer device 700 may be embodied as a terminal. Generally, the computer device 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and so on. The processor 701 may be implemented in at least one hardware form among a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor: the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which handles rendering and drawing of the content to be displayed on the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for handling machine-learning computations.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in the memory 702 is used to store at least one program code for execution by the processor 701 to implement the entity identification methods provided by the method embodiments herein.
In some embodiments, the computer device 700 may also optionally include: a peripheral interface 703 and at least one peripheral. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Various peripheral devices may be connected to peripheral interface 703 via a bus, signal line, or circuit board. Specifically, the peripheral device includes: at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power source 709.
The peripheral interface 703 may be used to connect at least one peripheral related to I/O (Input/Output) to the processor 701 and the memory 702. In some embodiments, processor 701, memory 702, and peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 704 is used for receiving and transmitting RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 704 communicates with communication networks and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 704 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol; the networks reached in this way include, but are not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication) related circuits, which is not limited in this application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, it can also capture touch signals on or over its surface; such a touch signal may be input to the processor 701 as a control signal for processing. At this point, the display screen 705 may also provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the computer device 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the computer device 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display disposed on a curved or folded surface of the computer device 700. The display screen 705 may even be arranged in a non-rectangular irregular pattern, i.e., an irregularly shaped screen. The display screen 705 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), or other materials.
The camera assembly 706 is used to capture images or video. Optionally, the camera assembly 706 includes a front camera and a rear camera. Generally, the front camera is disposed on the front panel of the terminal and the rear camera on the rear surface. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused for a background blurring function, and the main camera and the wide-angle camera for panoramic shooting, VR (Virtual Reality) shooting, or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash, which can be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash combines a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
The audio circuitry 707 may include a microphone and a speaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs them to the processor 701 for processing or to the radio frequency circuit 704 for voice communication. For stereo acquisition or noise reduction, there may be multiple microphones located at different positions on the computer device 700; the microphone may also be an array microphone or an omnidirectional acquisition microphone. The speaker converts electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker can be a traditional diaphragm speaker or a piezoelectric ceramic speaker; a piezoelectric ceramic speaker can convert an electrical signal into sound waves audible to humans, or into sound waves inaudible to humans for purposes such as distance measurement. In some embodiments, the audio circuitry 707 may also include a headphone jack.
The positioning component 708 is used to determine the current geographic location of the computer device 700 for navigation or LBS (Location Based Service). The positioning component 708 may be based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 709 is used to supply power to the various components of the computer device 700. The power supply 709 may be alternating current, direct current, a disposable battery, or a rechargeable battery. When the power supply 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery, charged through a wired line, or a wireless rechargeable battery, charged through a wireless coil. The rechargeable battery may also support fast-charge technology.
In some embodiments, the computer device 700 also includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, fingerprint sensor 714, optical sensor 715, and proximity sensor 716.
The acceleration sensor 711 may detect the magnitude of acceleration on the three coordinate axes of a coordinate system established with respect to the computer device 700. For example, the acceleration sensor 711 may detect the components of gravitational acceleration on the three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in landscape or portrait view according to the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used to collect game or user motion data.
The gyro sensor 712 may detect the body orientation and rotation angle of the computer device 700, and may cooperate with the acceleration sensor 711 to capture the user's 3D motion with respect to the computer device 700. From the data collected by the gyro sensor 712, the processor 701 may implement functions such as motion sensing (e.g., changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
Pressure sensors 713 may be disposed on a side bezel of computer device 700 and/or underneath display screen 705. When the pressure sensor 713 is disposed on a side frame of the computer device 700, a user's holding signal to the computer device 700 may be detected, and the processor 701 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at a lower layer of the display screen 705, the processor 701 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 705. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 714 is used for collecting a fingerprint of a user, and the processor 701 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 714, or the fingerprint sensor 714 identifies the identity of the user according to the collected fingerprint. When the user identity is identified as a trusted identity, the processor 701 authorizes the user to perform relevant sensitive operations, including unlocking a screen, viewing encrypted information, downloading software, paying, changing settings, and the like. The fingerprint sensor 714 may be disposed on the front, back, or side of the computer device 700. When a physical key or vendor Logo is provided on the computer device 700, the fingerprint sensor 714 may be integrated with the physical key or vendor Logo.
The optical sensor 715 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 715. Specifically, when the ambient light intensity is high, the display brightness of the display screen 705 is increased; when the ambient light intensity is low, the display brightness of the display screen 705 is adjusted down. In another embodiment, processor 701 may also dynamically adjust the shooting parameters of camera assembly 706 based on the ambient light intensity collected by optical sensor 715.
The proximity sensor 716, also known as a distance sensor, is typically disposed on the front panel of the computer device 700 and is used to capture the distance between the user and the front of the device. In one embodiment, when the proximity sensor 716 detects that this distance is gradually decreasing, the processor 701 controls the display screen 705 to switch from the bright-screen state to the dark-screen state; when the distance is gradually increasing, the processor 701 controls the display screen 705 to switch from the dark-screen state back to the bright-screen state.
Those skilled in the art will appreciate that the configuration illustrated in Fig. 7 does not limit the computer device 700, which may include more or fewer components than those illustrated, combine some components, or employ a different arrangement of components.
Fig. 8 is a schematic structural diagram of a computer device 800 according to an embodiment of the present application. The computer device 800 may be embodied as a server. The computer device 800 may vary considerably in configuration and performance, and may include one or more processors (CPUs) 801 and one or more memories 802, where at least one program code is stored in the memory 802 and is loaded and executed by the processor 801 to implement the entity identification method provided by the above method embodiments. Certainly, the computer device 800 may further have a wired or wireless network interface, a keyboard, an input/output interface, and other components to facilitate input and output, and may further include other components for implementing device functions, which are not described herein again.
In some embodiments, there is also provided a computer readable storage medium, such as a memory, comprising program code executable by a processor in a computer device to perform the entity identification method in the above embodiments. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In some embodiments, there is also provided a computer program product or a computer program comprising computer program code stored in a computer-readable storage medium. A processor of a computer device reads the computer program code from the computer-readable storage medium and executes it, causing the computer device to perform the entity identification method described above.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (11)

1. An entity identification method, characterized in that the method comprises:
performing first feature mapping on an input text to obtain an initial feature vector of the input text;
converting the knowledge graph into a first node sequence in a random walk manner; the first node sequence is used for indicating word-word walk paths, and each node in the first node sequence is used for indicating a word and a type corresponding to the word;
performing character-level splitting on each word in the first node sequence to obtain a second node sequence; the second node sequence is used for indicating character-character walk paths, and each node in the second node sequence is used for indicating a character and a type corresponding to the character;
generating a first class of feature vectors based on the first node sequence;
generating a second class of feature vectors based on the second node sequence;
the first class of feature vectors are feature representations of word-level knowledge in the knowledge graph; the first class of feature vectors are related to entity types, and the same word belonging to different types in the knowledge graph corresponds to different feature representations; the second class of feature vectors are feature representations of character-level knowledge in the knowledge graph; the second class of feature vectors are also related to entity types, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; a word includes one or more characters;
and performing entity recognition on the input text based on the initial feature vector of the input text, the first class of feature vector and the second class of feature vector.
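Purely as an illustration of the claimed preprocessing (the claim does not prescribe an implementation), the following Python sketch produces the two node sequences; the graph, type table, and function names are invented for the example.

```python
import random
from typing import Dict, List, Tuple

def random_walk(graph: Dict[str, List[str]], types: Dict[str, str],
                start: str, length: int) -> List[Tuple[str, str]]:
    """First node sequence: (word, type) pairs along a word-word walk path."""
    path, node = [(start, types[start])], start
    for _ in range(length - 1):
        neighbors = graph.get(node)
        if not neighbors:
            break
        node = random.choice(neighbors)
        path.append((node, types[node]))
    return path

def split_to_characters(walk: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Second node sequence: each word split into (character, type) nodes."""
    return [(char, word_type) for word, word_type in walk for char in word]
```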
2. The entity identification method according to claim 1, wherein the generating the first class of feature vectors based on the first node sequence comprises:
generating a third node sequence based on the different types corresponding to each word in the first node sequence; the third node sequence is used for indicating walk paths between words and types;
and performing second feature mapping on the third node sequence to obtain the first class of feature vectors.
3. The entity identification method according to claim 1, wherein the generating the second class of feature vectors based on the second node sequence comprises:
generating a fourth node sequence based on the different types corresponding to each character in the second node sequence; the fourth node sequence is used for indicating walk paths between characters and types;
and performing second feature mapping on the fourth node sequence to obtain the second class of feature vectors.
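One hedged reading of the second feature mapping in claims 2 and 3: form composite node#type tokens along the walk paths and learn embeddings over them, so that the same word (or character) under different types receives different feature representations. The use of gensim's Word2Vec here is an assumption; the claims do not name an embedding model.

```python
from gensim.models import Word2Vec

def with_types(walk):
    # walk: [(node, node_type), ...] -> composite tokens keyed on node AND type,
    # so "apple#brand" and "apple#dish" get distinct vectors (hypothetical types).
    return [f"{node}#{node_type}" for node, node_type in walk]

# corpus_of_walks is assumed to be a list of walks produced as sketched above.
corpus_of_walks = [[("apple", "brand"), ("phone", "product")]]
sequences = [with_types(w) for w in corpus_of_walks]
model = Word2Vec(sentences=sequences, vector_size=128, window=5, min_count=1, sg=1)
vector = model.wv["apple#brand"]  # type-aware feature vector
```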
4. The entity recognition method of claim 1, wherein the entity recognition of the input text based on the initial feature vector of the input text, the first class feature vector and the second class feature vector comprises:
for any character in the input text, fusing an initial feature vector of the character, feature representations of different types corresponding to the character and word feature representations corresponding to the character to obtain a final feature vector of the character;
inputting the final feature vectors of all characters in the input text into an entity recognition model; performing entity recognition on the input text based on the entity recognition model;
the word feature representation corresponding to a character refers to the feature representations of different types corresponding to a target word, and a target word refers to a word in the knowledge graph that matches the character.
5. The entity identification method of claim 4, wherein the target term comprises at least one of:
words in the knowledge graph whose beginning position matches the character;
words in the knowledge graph whose middle position matches the character;
words in the knowledge graph whose end position matches the character;
and single-character words in the knowledge graph that match the character.
6. The entity identification method according to claim 4, wherein the fusing the initial feature vector of the character, the feature representations corresponding to different types of the character, and the word feature representation corresponding to the character to obtain the final feature vector of the character comprises:
fusing the initial feature vector of the character and the feature representations of different types corresponding to the character to obtain a first intermediate vector;
fusing the initial feature vector of the character and the word feature representation corresponding to the character; generating a second intermediate vector based on the fused feature vector;
and performing feature splicing on the initial feature vector of the character, the first intermediate vector and the second intermediate vector to obtain a final feature vector of the character.
7. The entity recognition method of claim 4, wherein the entity recognition model comprises a bidirectional long short-term memory (LSTM) network and a conditional random field (CRF) layer; the entity recognition of the input text based on the entity recognition model comprises:
based on the LSTM, encoding the final feature vector of each character in the input text to obtain an implicit vector of each character in the input text;
for any character in the input text, fusing the implicit vector of the character and the feature representations of different types corresponding to the character to obtain a third intermediate vector;
performing feature splicing on the implicit vector of the character and the third intermediate vector to obtain an output vector of the character;
decoding the output vector of each character in the input text based on the CRF layer to obtain an entity recognition result of the input text; the entity recognition result comprises a word segmentation result and a type corresponding to each entity in the word segmentation result.
8. The entity recognition method according to claim 6, wherein the fusing the initial feature vector of the character and the different types of feature representations corresponding to the character to obtain a first intermediate vector comprises:
determining at least one probability value based on the initial feature vector of the character and the feature representations of the character corresponding to different types; acquiring the first intermediate vector according to the at least one probability value;
wherein the at least one probability value is used to indicate the probability that the character belongs to each of the different types, and the type more consistent with the current context receives the larger probability value.
9. An entity identification apparatus, the apparatus comprising:
the processing module is configured to perform first feature mapping on an input text to obtain an initial feature vector of the input text;
the acquisition module is configured to: convert the knowledge graph into a first node sequence in a random walk manner, the first node sequence being used for indicating word-word walk paths, and each node in the first node sequence being used for indicating a word and a type corresponding to the word; perform character-level splitting on each word in the first node sequence to obtain a second node sequence, the second node sequence being used for indicating character-character walk paths, and each node in the second node sequence being used for indicating a character and a type corresponding to the character; generate a first class of feature vectors based on the first node sequence; and generate a second class of feature vectors based on the second node sequence; wherein the first class of feature vectors are feature representations of word-level knowledge in the knowledge graph, the first class of feature vectors are related to entity types, and the same word belonging to different types in the knowledge graph corresponds to different feature representations; the second class of feature vectors are feature representations of character-level knowledge in the knowledge graph, the second class of feature vectors are also related to entity types, and the same character belonging to different types in the knowledge graph corresponds to different feature representations; a word includes one or more characters;
an identification module configured to perform entity identification on the input text based on the initial feature vector of the input text, the first class of feature vectors, and the second class of feature vectors.
10. A computer device, characterized in that the device comprises a processor and a memory, in which at least one program code is stored, which is loaded and executed by the processor to implement the entity identification method according to any of claims 1 to 8.
11. A computer-readable storage medium, having stored therein at least one program code, which is loaded and executed by a processor, to implement the entity identification method according to any one of claims 1 to 8.
CN202110984162.3A 2021-08-25 2021-08-25 Entity identification method, device, equipment and storage medium Active CN113673249B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110984162.3A CN113673249B (en) 2021-08-25 2021-08-25 Entity identification method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113673249A (en) 2021-11-19
CN113673249B (en) 2022-08-16

Family

ID=78546326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110984162.3A Active CN113673249B (en) 2021-08-25 2021-08-25 Entity identification method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113673249B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114357177A (en) * 2021-12-08 2022-04-15 中国长城科技集团股份有限公司 Knowledge hypergraph generation method and device, terminal device and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795569B (en) * 2019-10-08 2021-06-15 北京百度网讯科技有限公司 Method, device and equipment for generating vector representation of knowledge graph
CN111695345B (en) * 2020-06-12 2024-02-23 腾讯科技(深圳)有限公司 Method and device for identifying entity in text
CN113255294B (en) * 2021-07-14 2021-10-12 北京邮电大学 Named entity recognition model training method, recognition method and device
CN113297854A (en) * 2021-07-27 2021-08-24 平安科技(深圳)有限公司 Method, device and equipment for mapping text to knowledge graph entity and storage medium

Also Published As

Publication number Publication date
CN113673249A (en) 2021-11-19


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant