CN112528662A - Entity category identification method, device, equipment and storage medium based on meta-learning - Google Patents
- Publication number: CN112528662A
- Application number: CN202011472865.XA
- Authority: CN (China)
- Prior art keywords: sample, data, entity, query, training
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F40/295 - Named entity recognition
- G06F40/289 - Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/211 - Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
- G06F40/284 - Lexical analysis, e.g. tokenisation or collocates
- G06N20/00 - Machine learning
Abstract
The present application relates to the field of artificial intelligence, and in particular to a method, an apparatus, a device, and a storage medium for entity category identification based on meta-learning. The method comprises the following steps: acquiring a newly added entity category, and querying a reference sample corresponding to the newly added entity category; acquiring data to be identified; and inputting the reference sample and the data to be identified into a pre-generated entity category identification model so as to identify the newly added entity category corresponding to the reference sample in each piece of data to be identified, wherein the entity category identification model is trained based on meta-learning. This method preserves the accuracy of entity category identification. In addition, the application also relates to blockchain technology: the newly added entity category can be stored in a blockchain node.
Description
Technical Field
The present application relates to the field of artificial intelligence technologies, and in particular, to a method, an apparatus, a device, and a storage medium for entity category identification based on meta-learning.
Background
At present, there is substantial research on named entity recognition in the field of artificial intelligence, but data sets for named entity recognition are few, and Chinese named entity recognition data sets in particular are rare. In addition, although relatively mature named entity recognition models exist on the market, these models can only distinguish three common entity categories: person names, organizations, and addresses. They cannot handle newly appearing entity categories.
The traditional open-source Chinese named entity recognition data sets are mainly the MSRA, People's Daily, Weibo (microblog), CLUENER, and BOSON data sets. None of these five data sets is very large, and the number of entity categories is limited: across all five there are no more than 30 entity categories in total. However, the real-world taxonomy of entity categories is far larger than 30, and the conventional approach of labeling a full data set for every entity category is impractical. Typically, when a new entity category appears, there are often only 10 to 100 samples of the new category, and retraining the model on these samples alone is not feasible, since the model would suffer from class imbalance and overfitting.
Therefore, there is a need for a method that can accurately identify the corresponding entity class in the data when a new entity class appears.
Disclosure of Invention
In view of the above technical problems, it is necessary to provide a meta-learning based entity category identification method, apparatus, device, and storage medium capable of accurately identifying entity categories.
A meta-learning based entity class identification method, the method comprising:
acquiring a newly added entity type, and inquiring a reference sample corresponding to the newly added entity type;
acquiring data to be identified;
and inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model so as to recognize a new entity class corresponding to the reference sample in each data to be recognized, wherein the entity class recognition model is obtained by training based on a meta-learning mode.
In one embodiment, the inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model to recognize a new entity class corresponding to the reference sample in each data to be recognized includes:
serializing words in the reference sample and the data to be recognized, and performing high-order feature representation on the serialized words;
carrying out average pooling operation on the words after the high-order feature representation to obtain vector representations of the reference sample and the data to be identified;
processing the vectorization representation of the data to be identified through the vectorization representation of the reference sample to obtain high-level features of the data to be identified;
and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
In one embodiment, the training mode of the entity class identification model includes:
obtaining sample data, and constructing a plurality of meta-training samples according to the sample data;
and training according to the meta-training sample to obtain an entity class identification model.
In one embodiment, the obtaining sample data and constructing a plurality of meta-training samples according to the sample data includes:
obtaining sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups;
determining a first quantity of sample data in the extracted at least one group as support samples, and determining a second quantity of sample data as query samples;
composing a meta-training sample from the support samples and the query samples;
and repeating the random extraction of at least one group from the groups to obtain a plurality of meta-training samples.
In one embodiment, the obtaining sample data and grouping the sample data according to entity categories includes:
obtaining sample data grouped according to an initial entity type, and grouping the sample data in the initial entity type according to a target entity type;
carrying out standardization processing on sample data grouped according to the target entity type;
and merging the standardized target entity categories corresponding to the initial entity categories to obtain groups corresponding to the target entity categories.
In one embodiment, the training according to the meta-training sample to obtain the entity class recognition model includes:
serializing words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing average pooling operation on the words after the high-order feature representation to obtain vector representations of the support sample and the query sample;
processing the vectorization representation of the query sample through the vectorization representation of the support sample according to an entity category identification model to obtain the high-level features of the query sample;
processing the high-level characteristics of the query sample to obtain a newly added entity category corresponding to the support sample in the query sample;
inputting the newly added entity category corresponding to the support sample in the obtained query sample, together with the real entity category of the query sample, into a conditional random field layer to calculate a loss function;
and training the entity class recognition model through the loss function.
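A linear-chain CRF negative log-likelihood is one common realization of the conditional random field layer loss described above; the following is a minimal numpy sketch under that assumption, not the patent's exact layer:

```python
import numpy as np

def crf_nll(emissions, transitions, tags):
    """Negative log-likelihood of a gold tag sequence under a linear-chain
    CRF: log-partition (forward algorithm) minus the gold path score."""
    seq_len, num_tags = emissions.shape
    # Score of the gold tag path.
    gold = emissions[0, tags[0]]
    for t in range(1, seq_len):
        gold += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    # Log-partition via the forward algorithm.
    alpha = emissions[0].copy()
    for t in range(1, seq_len):
        # alpha[j] = logsumexp_i(alpha[i] + transitions[i, j]) + emissions[t, j]
        alpha = np.logaddexp.reduce(alpha[:, None] + transitions, axis=0) + emissions[t]
    return np.logaddexp.reduce(alpha) - gold

# With all scores equal, every one of the 3^2 paths is equally likely,
# so the loss is log(9).
uniform = crf_nll(np.zeros((2, 3)), np.zeros((3, 3)), [0, 1])
```

Minimizing this quantity over the emission and transition scores is what "training the entity category identification model through the loss function" amounts to under this reading.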
In one embodiment, the processing the vectorized representation of the query sample through the vectorized representation of the support sample includes:
processing the vectorized representation of the query sample by means of the vectorized representation of the support sample according to a formula with the following components:
the high-level feature of the query word qj is obtained after modeling with the support samples; the atten function calculates the contribution of each support sample to recognizing the named entities of the query sample; two vectors are spliced into a new vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of the support samples, its range determined by the number of support samples.
An entity class identification apparatus based on meta-learning, the apparatus comprising:
the new entity type acquisition module is used for acquiring a new entity type and inquiring a reference sample corresponding to the new entity type;
the data to be identified acquisition module is used for acquiring data to be identified;
and the entity identification module is used for inputting the reference sample and the data to be identified into a pre-generated entity class identification model so as to identify a newly added entity class corresponding to the reference sample in each data to be identified, wherein the entity class identification model is obtained by training based on a meta-learning mode.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the method in any of the above embodiments when the processor executes the computer program.
A computer storage medium having stored thereon a computer program which, when executed by a processor, carries out the steps of the method in any of the above embodiments.
According to the above meta-learning based entity category identification method, apparatus, device, and storage medium, a reference sample is determined according to the newly added entity category, and the reference sample and the data to be identified are input into a pre-generated entity category identification model so as to identify the newly added entity category corresponding to the reference sample in each piece of data to be identified. No manual intervention and no specialized artificial intelligence expertise are required, which greatly reduces labor cost. When a newly added entity category appears, the model does not need to be retrained; only a few reference samples are needed to identify, in the data to be identified, whether the entity category exists.
Drawings
FIG. 1 is a flow diagram illustrating a method for entity class identification based on meta-learning in one embodiment;
FIG. 2 is a flow chart illustrating a meta-learning based entity class identification method according to another embodiment;
FIG. 3 is a block diagram of an entity class identification apparatus based on meta-learning in one embodiment;
FIG. 4 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In an embodiment, as shown in fig. 1, an entity category identification method based on meta-learning is provided. This embodiment is illustrated by applying the method to a terminal; it is to be understood that the method may also be applied to a server, or to a system comprising a terminal and a server, implemented through interaction between the terminal and the server. In this embodiment, the method includes the following steps:
s102: and acquiring the new entity type, and inquiring a reference sample corresponding to the new entity type.
Specifically, the newly added entity category may be the name of a new entity, and there may be at least one newly added entity category. The reference samples are samples belonging to the newly added entity category; the number of reference samples may be 10 or more, but should not be too large. The server may establish a correspondence between the newly added entity category and its reference samples, for example by grouping. In addition, it should be noted that the same reference sample may belong to multiple newly added entity categories, that is, one reference sample may be labeled with multiple entity categories.
S104: and acquiring data to be identified.
Specifically, the data to be identified is data that needs to be subjected to entity classification processing, and may be newly added data or past data.
S106: and inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model so as to recognize a new entity class corresponding to the reference sample in each data to be recognized, wherein the entity class recognition model is obtained by training based on a meta-learning mode.
Specifically, the entity category identification model is trained based on meta-learning: a plurality of meta-training tasks are constructed from sample data, and the model is then trained on the constructed meta-training tasks. Each meta-training task provides a small number of support samples and a larger number of query samples, so that the trained model can identify the entity categories of data to be identified from only a few samples of a newly added entity category.
The server inputs the reference sample and the data to be identified into a pre-generated entity type identification model, so that the entity type identification model identifies a newly added entity type corresponding to the reference sample in each data to be identified.
The process of identifying the newly added entity type corresponding to the reference sample in each piece of data to be identified by the entity type identification model may include a process of processing the reference sample and the piece of data to be identified, a step of calculating a high-level feature representation of the piece of data to be identified by using the processed reference sample, and a step of processing according to the high-level feature representation to determine the newly added entity type corresponding to the reference sample in each piece of data to be identified.
The process of processing the reference sample and the data to be identified may include: serializing the words in the reference sample and the data to be identified, performing high-order feature representation on the serialized words, and finally processing the high-order representations through an average pooling operation to obtain vector representations corresponding to the reference sample and the data to be identified.
The step in which the server calculates the high-level feature representation of the data to be identified by means of the processed reference samples may be performed according to a formula with the following components:
the high-level feature of the query word qj, obtained after modeling with the reference samples, models to some extent the relationship between the reference sample and the data to be identified; the atten function calculates the contribution of each reference sample to named entity identification in the data to be identified; two vectors are concatenated into a new, longer vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of the reference sample, its range determined by the number of reference samples.
Finally, the step of processing the high-level feature representation to determine the newly added entity category corresponding to the reference sample in each piece of data to be identified includes: inputting the high-level features into a predetermined fully connected layer, which maps the feature vector of each word to a preset dimension, for example 3 dimensions, representing the labels O, B, and I respectively: the word does not belong to the category, belongs to the category and is at the beginning of the entity, or belongs to the category and is inside the entity.
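A minimal sketch of this fully connected O/B/I mapping, with a hypothetical feature dimension and randomly initialized stand-in weights, may look as follows:

```python
import numpy as np

# Hypothetical sizes; the patent does not fix the feature dimension.
FEAT_DIM = 4
LABELS = ["O", "B", "I"]  # outside / beginning of entity / inside entity

def decode_bio(token_feats, W, b):
    """Map each token's high-level feature vector to a 3-way O/B/I label
    via a fully connected layer followed by argmax."""
    logits = token_feats @ W + b  # shape: (num_tokens, 3)
    return [LABELS[i] for i in logits.argmax(axis=1)]

rng = np.random.default_rng(0)
token_feats = rng.standard_normal((5, FEAT_DIM))  # 5 tokens of data to be identified
W = rng.standard_normal((FEAT_DIM, 3))            # stand-in for trained weights
b = np.zeros(3)
labels = decode_bio(token_feats, W, b)
```

In a trained model the projection weights would come from the fully connected layer learned during meta-training rather than from a random generator.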
It should be emphasized that, in order to further ensure privacy and security, the newly added entity categories and the data to be identified may also be stored in a node of a blockchain.
According to this meta-learning based entity category identification method, a reference sample is determined according to the newly added entity category, and the reference sample and the data to be identified are input into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference sample in each piece of data to be identified. No manual intervention and no specialized artificial intelligence expertise are required, which greatly reduces labor cost. When a newly added entity category appears, the model does not need to be retrained; only a few reference samples are needed to identify, in the data to be identified, whether the entity category exists.
In one embodiment, inputting the reference sample and the data to be identified into a pre-generated entity category identification model to identify the newly added entity category corresponding to the reference sample in each piece of data to be identified includes: serializing the words in the reference sample and the data to be identified, and performing high-order feature representation on the serialized words; carrying out an average pooling operation on the words after high-order feature representation to obtain vector representations of the reference sample and the data to be identified; processing the vectorized representation of the data to be identified through the vectorized representation of the reference sample to obtain high-level features of the data to be identified; and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
Specifically, assume that the word sequence of a reference sample is w1, ..., wn; the input for the reference sample is then [CLS] w1 ... wn [SEP]. Likewise, if the word sequence of the data to be identified is u1, ..., um, the input for the data to be identified is [CLS] u1 ... um [SEP].
After the reference sample and the data to be identified are input into BERT, the high-order feature representation of each word can be obtained: if wi and uj denote the i-th word of the reference sample and the j-th word of the data to be identified, then si and qj denote the high-order feature representations of these two words output by BERT.
After obtaining the high-order feature representations of the words, an average pooling operation is used to obtain a uniform vector representation representing the entire reference sample and the data to be recognized:
srep=MEAN_POOLINGi(si)
qrep=MEAN_POOLINGj(qj)
The srep thus obtained represents a feature representation of the entire reference sample, and qrep represents a feature representation of the entire data to be identified.
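The average pooling step above admits a one-line sketch; the token states here are random stand-ins for BERT hidden states:

```python
import numpy as np

def mean_pool(token_states):
    """Average per-token hidden states into one sentence-level vector,
    i.e. s_rep = MEAN_POOLING_i(s_i) and q_rep = MEAN_POOLING_j(q_j)."""
    return np.asarray(token_states).mean(axis=0)

# Random stand-ins for BERT outputs: 7 tokens with hidden size 8.
s_states = np.random.default_rng(1).standard_normal((7, 8))
s_rep = mean_pool(s_states)
```

In the actual model, `token_states` would be the per-word high-order features produced by BERT for one reference sample or one piece of data to be identified.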
After obtaining the feature representations of the entire reference sample and the data to be identified, a higher-level feature representation of the data to be identified may be obtained from the reference sample according to a formula with the following components: the high-level feature of qj, obtained after modeling with the reference samples, models to some extent the relationship between the reference sample and the data to be identified; the atten function calculates the contribution of each reference sample to named entity identification in the data to be identified; two vectors are concatenated into a new, longer vector; T is a real number that controls the sharpness of the distribution produced by the atten function; and k is the index of the reference sample, its range determined by the number of reference samples.
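Since the formula image itself is not reproduced in the text, the following is only one plausible reading of the described attention step, with dot-product scoring as an assumption: score each reference-sample representation against the sentence-level query representation, sharpen with temperature T, and concatenate the weighted sum onto the word feature qj:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def enrich_query_word(q_j, q_rep, s_reps, T=1.0):
    """Attention over reference-sample representations: 'atten' is assumed
    to be a dot product, T controls the sharpness of the distribution, and
    the weighted sum is concatenated onto q_j to form a longer vector."""
    s_reps = np.asarray(s_reps)
    scores = s_reps @ q_rep / T   # contribution score of each reference sample k
    weights = softmax(scores)
    context = weights @ s_reps    # weighted sum of reference representations
    return np.concatenate([q_j, context])

q_j = np.ones(4)                          # word-level query feature
q_rep = np.ones(8)                        # pooled query representation
s_reps = np.arange(24.0).reshape(3, 8)    # 3 pooled reference representations
enriched = enrich_query_word(q_j, q_rep, s_reps)
```

Smaller T makes the attention distribution sharper, concentrating the context vector on the single most relevant reference sample.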
The server obtains the final feature representation of each word in the query sample (i.e., the data to be identified), and then maps the feature vector of each word to 3 dimensions through a fully connected layer. The three dimensions represent the labels O, B, and I: the word does not belong to the category, belongs to the category and is at the beginning of the entity, or belongs to the category and is inside the entity. This yields the newly added entity category corresponding to the reference sample in the data to be identified.
In one embodiment, the training of the entity category identification model includes: acquiring sample data and constructing a plurality of meta-training samples from the sample data; and training on the meta-training samples to obtain the entity category identification model.
Specifically, the sample data may be pre-collected samples that have already been classified. The meta-training samples are obtained by processing the sample data. Each meta-training sample may include a plurality of support samples and a plurality of query samples, where the support samples may cover several groups of sample data, i.e., sample data belonging to different categories, and the query samples are drawn from the corresponding groups. The number of meta-training samples can be set as needed, for example ten thousand. The model is then trained on the meta-training samples, for example sequentially, until the accuracy of the entity category identification model reaches expectations. The accuracy can be computed on the meta-training samples: the support samples and query samples of a meta-training sample are input into the model to determine the entity categories of the query samples, and these are compared with the real entity categories of the query samples; when the expected accuracy is reached, training is complete.
In one embodiment, obtaining sample data and constructing a plurality of meta-training samples from the sample data includes: obtaining the sample data, grouping it according to entity categories, and randomly extracting at least one group from the groups; determining a first quantity of sample data in the extracted groups as support samples and a second quantity as query samples; composing a meta-training sample from the support samples and the query samples; and repeating the random extraction to obtain a plurality of meta-training samples.
Specifically, the server first obtains original sample data, then processes it to obtain a data set corresponding to each category, and then begins constructing the training set. To train the meta-learning model, a series of meta-training samples is first constructed, according to the following rule:
Randomly extract a number of classes, for example 3, from the processed entity classes (for example 12 in total), denoted l1, l2, l3. From these 3 classes, randomly draw a first number of samples per class, for example 10, as support samples, and a second number per class, for example 100, as query samples, giving a total of 30 support samples and 300 query samples.
The server assembles the data thus constructed into a meta-training task, whose purpose is to train the model to classify the query samples given the support samples. To train the model, the server may construct 10000 such meta-training tasks.
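The episode construction described above (e.g., 3 classes out of 12, with 10 support and 100 query samples per class) can be sketched as a sampler; the class names and data here are illustrative stand-ins:

```python
import random

def sample_episode(grouped, n_way=3, k_support=10, k_query=100, rng=None):
    """Build one meta-training task: pick n_way entity classes at random,
    then draw k_support support and k_query query samples from each."""
    rng = rng or random.Random()
    classes = rng.sample(sorted(grouped), n_way)
    support, query = [], []
    for c in classes:
        picked = rng.sample(grouped[c], k_support + k_query)
        support += [(x, c) for x in picked[:k_support]]
        query += [(x, c) for x in picked[k_support:]]
    return support, query

# 12 toy classes with 110 samples each, mirroring the example in the text.
grouped = {f"class-{i}": list(range(110)) for i in range(12)}
support, query = sample_episode(grouped, rng=random.Random(0))
```

Repeating this sampler 10000 times would produce the series of meta-training tasks described above.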
In one embodiment, obtaining sample data, and grouping the sample data according to entity categories includes: obtaining sample data grouped according to the initial entity category, and grouping the sample data in the initial entity category according to the target entity category; carrying out standardization processing on sample data grouped according to the target entity type; and merging the standardized target entity categories corresponding to the initial entity categories to obtain groups corresponding to the target entity categories.
Specifically, the initial entity categories come from the open-source MSRA, People's Daily, Weibo (microblog), CLUENER, and BOSON data sets collected from the Internet. Since the labeling formats of these data sets are not uniform, the data are preprocessed into a unified BIO labeling format.
The target entity categories are those labeled within the initial entity categories, such as PER (person name), LOC (location), ORG (organization), TIM (time), COM (company name), ADD (specific address), GAME (game name), GOV (government department), SCENCE (scenic spot), BOOK (book), MOVIE (movie), and PRODUCT (product). The server counts the entity categories labeled in each of the MSRA, People's Daily, Weibo, CLUENER, and BOSON data sets. For example, MSRA labels three entity categories, PER, LOC, and ORG; if L(MSRA) denotes the set of entity categories labeled by MSRA, then L(MSRA) = {PER, LOC, ORG}. Similarly, the server obtains L(People's Daily) = {PER, LOC, ORG, TIM}, L(Weibo) = {PER, ORG, LOC}, L(CLUENER) = {PER, LOC, ORG, COM, ADD, GAME, GOV, SCENCE, BOOK, MOVIE}, and L(BOSON) = {PER, LOC, ORG, COM, TIM, PRODUCT}.
According to the sets of labeled entity categories obtained in the previous step, each data set is split into new data sets, one per entity category. For example, for the MSRA data set, with L(MSRA) = {PER, LOC, ORG}, consider the PER category first: all PER positive samples in MSRA are retained, all other positive samples (such as LOC and ORG) are relabeled as negative samples, and the originally negative samples in MSRA remain unchanged. The newly obtained data set thus contains only PER positive samples, all other categories' positive samples having become negative; finally, sentences consisting entirely of negative samples are removed. This data set is recorded as MSRA-PER, and similarly the server can obtain the MSRA-ORG and MSRA-LOC data sets, as well as CLUENER-PER, CLUENER-ADD, and so on. As a concrete example, in the sentence "I am called AB, live in CD, and work at EF", AB is a PER entity, CD is a LOC entity, and EF is an ORG entity; these all count as positive samples, while the remaining words ("I am called", "live in", "work at") count as negative samples.
The server then obtains five data sets related to PER: MSRA-PER, People's Daily-PER, CLUENER-PER, Weibo-PER, and BOSON-PER. By the above construction, these five data sets contain only entities of the PER category, entities of other categories having become negative samples, so the five PER-related data sets are merged into a new data set recorded as ZH-PER. Similarly, the server can obtain ZH-LOC, ZH-ORG, ZH-TIM, ZH-ADD, ZH-COM, ZH-BOOK, and so on, for a total of 12 data sets.
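The per-category projection described above (keep one category's spans as positives, relabel every other entity as O, and drop sentences with no positive span, as in building MSRA-PER from MSRA) can be sketched over BIO-tagged sentences:

```python
def project_to_class(dataset, keep):
    """Keep only `keep`-category spans as positives in a BIO-tagged dataset.
    Every other entity tag becomes 'O', and sentences left with no positive
    span are dropped, as in the construction of MSRA-PER."""
    projected = []
    for tokens, tags in dataset:
        new_tags = [t if t != "O" and t.split("-", 1)[-1] == keep else "O"
                    for t in tags]
        if any(t != "O" for t in new_tags):
            projected.append((tokens, new_tags))
    return projected

# Toy sentence mirroring the example: AB is PER, CD is LOC, EF is ORG.
data = [(["I", "am", "AB", "live", "CD", "work", "EF"],
         ["O", "O", "B-PER", "O", "B-LOC", "O", "B-ORG"])]
per_only = project_to_class(data, "PER")
```

Running the same projection for each target category over each source data set, then concatenating the results per category, yields data sets of the ZH-PER, ZH-LOC, etc. kind.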
In one embodiment, referring to fig. 2, fig. 2 is a flowchart of the training process of the entity class identification model in an embodiment, where training according to the meta-training samples to obtain the entity class identification model includes: serializing the words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the support sample and the query sample; processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity class identification model to obtain the high-level features of the query sample; processing the high-level features of the query sample to obtain the newly added entity category corresponding to the support sample in the query sample; inputting the obtained newly added entity category corresponding to the support sample in the query sample and the real entity category of the query sample into a conditional random field layer to calculate a loss function; and training the entity class identification model through the loss function.
Specifically, after the meta-training tasks are built, the server begins building the model. This application adopts the Chinese pre-trained language model BERT to encode the feature representation of a sentence, and the main structure of the model is as follows:
Let the word sequence of the support sample be x_1^s, x_2^s, …, x_n^s, so that the input of the support sample is [CLS] x_1^s x_2^s … x_n^s [SEP]; let the word sequence of the query sample be x_1^q, x_2^q, …, x_m^q, so that the input of the query sample is [CLS] x_1^q x_2^q … x_m^q [SEP].
After inputting the support and query samples into BERT, the server obtains a high-order feature representation for each word of these samples:

s_i = BERT(x_1^s, …, x_n^s)_i, q_j = BERT(x_1^q, …, x_m^q)_j

where x_i^s and x_j^q are the ith word of the support sample and the jth word of the query sample respectively, and s_i and q_j are the high-order feature representations of the two words output by BERT.
After obtaining the high-order feature representations of the words, the server uses an average pooling operation to obtain a single vector representation of the entire sample:
srep=MEAN_POOLINGi(si)
qrep=MEAN_POOLINGj(qj)
The s_rep thus obtained is the feature representation of the entire support sample, and q_rep is the feature representation of the entire query sample.
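The MEAN_POOLING step above can be sketched in plain Python; the list-of-vectors input format is an assumption (in practice s_i and q_j would be BERT hidden states):

```python
# Minimal sketch of the MEAN_POOLING step: average the per-word
# high-order feature vectors into one fixed-size sample vector.

def mean_pooling(word_vectors):
    """word_vectors: list of equal-length feature vectors (one per word)."""
    n = len(word_vectors)
    dim = len(word_vectors[0])
    return [sum(vec[d] for vec in word_vectors) / n for d in range(dim)]

# Toy stand-ins for BERT outputs s_i (3-dimensional for brevity).
s = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
s_rep = mean_pooling(s)  # -> [2.0, 2.0, 2.0]
```

The same operation applied to the q_j vectors yields q_rep.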
After obtaining the feature representation of the entire sample, the server obtains a higher-order feature representation of the query sample from the support sample:
where q̃_j is the higher-level feature representation of q_j obtained after modeling against the support samples; it models, to some extent, the relationship between the support samples and the query sample. The atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample. [q_j; s_rep^k] represents the concatenation of two vectors into a new, longer vector, and T is a real number that controls how sharp the distribution obtained by atten is. k is the serial number of a support sample; since 10 support samples are chosen for each category, the maximum value of k is 10.
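The exact attention formula is not reproduced in the text, so the following is a hedged sketch of one plausible reading: each concatenated vector [q_j; s_rep^k] is scored, the scores are divided by the temperature T and normalized with a softmax (a smaller T gives a sharper distribution), and the support representations are combined with the resulting weights. The scoring function and the combination step are assumptions.

```python
import math

def atten_combine(q_j, support_reps, score, T=1.0):
    """Combine support-sample representations for one query word q_j.

    score: a function mapping the concatenated vector [q_j; s_rep_k]
    to a scalar contribution score (its exact form is not given in the
    text, so it is passed in as a parameter here).
    """
    # Score each support sample from the concatenation [q_j; s_rep_k].
    scores = [score(q_j + s_k) / T for s_k in support_reps]
    # Softmax: T controls how sharp this distribution is.
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    # Weighted combination of the support representations.
    dim = len(support_reps[0])
    return [sum(w * s_k[d] for w, s_k in zip(weights, support_reps))
            for d in range(dim)]

# Toy usage: 2 support representations, a simple sum-based score.
q = [1.0, 0.0]
supports = [[1.0, 1.0], [0.0, 2.0]]
combined = atten_combine(q, supports, score=sum, T=0.5)
# -> [0.5, 1.5]: both concatenations score equally here, so the
#    weights are 0.5 each and the result is the elementwise mean.
```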
Thus the server obtains the final feature representation of each word in the query sample, and then maps each word's feature vector to 3 dimensions through a fully connected layer. The three dimensions represent the word's label being O, B, or I, i.e. not belonging to the category, belonging to the category and at the beginning of the entity, or belonging to the category and inside the entity. After each word is mapped to 3 dimensions through the fully connected layer, a conditional random field (CRF) layer is used to calculate the final loss, and the model is trained by this loss function.
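What the three labels encode can be illustrated by a small decoder that recovers entity spans from a predicted O/B/I tag sequence; the CRF layer itself is omitted, and the function name and toy sentence (modeled on the AB/CD/EF example above) are illustrative assumptions:

```python
def decode_bio(tokens, tags):
    """Recover entity spans from an O/B/I tag sequence.

    B marks a token that belongs to the category and starts an entity;
    I marks a token that belongs to the category and continues it;
    O marks a token that does not belong to the category.
    """
    entities, current = [], []
    for tok, tag in zip(tokens, tags):
        if tag == "B":                 # start a new entity
            if current:
                entities.append("".join(current))
            current = [tok]
        elif tag == "I" and current:   # continue the current entity
            current.append(tok)
        else:                          # O (or a stray I): close any open entity
            if current:
                entities.append("".join(current))
            current = []
    if current:
        entities.append("".join(current))
    return entities

# Toy usage: the PER entity "AB" in a short sentence.
tokens = ["My", "name", "is", "A", "B"]
tags   = ["O", "O", "O", "B", "I"]
print(decode_bio(tokens, tags))  # prints ['AB']
```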
In one embodiment, serializing the words in the support sample and the query sample and performing high-order feature representation on the serialized words includes: processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain the high-level features of the query sample:
where q̃_j is the high-level feature of the query sample obtained after modeling against the support samples; the atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample; [q_j; s_rep^k] represents the concatenation of the two vectors into a new vector; T is a real number used to control the sharpness of the distribution obtained by the atten function; and k is the serial number of a support sample, whose value is related to the number of support samples.
It should be understood that, although the steps in the flowcharts of fig. 1 and 2 are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 1 and 2 may include multiple sub-steps or stages that are not necessarily performed at the same moment but may be performed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps, or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 3, there is provided a meta-learning based entity class identification apparatus, including a newly added entity category acquisition module 100, a to-be-identified data acquisition module 200, and an entity identification module 300, wherein:
the newly added entity category acquisition module 100 is configured to acquire a newly added entity category and query a reference sample corresponding to the newly added entity category;
a to-be-identified data acquisition module 200, configured to acquire to-be-identified data;
the entity identification module 300 is configured to input the reference sample and the data to be identified into a pre-generated entity class identification model, so as to identify a new entity class corresponding to the reference sample in each data to be identified, where the entity class identification model is obtained by training based on meta-learning.
In one embodiment, the entity identification module 300 may include:
the conversion unit is used for serializing the words in the reference sample and the data to be identified and performing high-order feature representation on the serialized words;
the first vector quantization unit is used for carrying out average pooling operation on the words after the high-order characteristic representation to obtain vector representations of a reference sample and data to be identified;
the first high-level feature representation unit is used for processing the vectorization representation of the data to be identified by referring to the vectorization representation of the sample to obtain the high-level features of the data to be identified;
and the identification unit is used for processing the high-level characteristics to obtain the newly added entity type of the corresponding reference sample in the data to be identified.
In one embodiment, the entity class identification device based on meta-learning includes:
the sample acquisition module is used for acquiring sample data and constructing multiple meta-training samples according to the sample data;
and the training module is used for training according to the meta-training samples to obtain the entity class recognition model.
In one embodiment, the sample acquiring module may include:
the grouping unit is used for acquiring sample data, grouping the sample data according to the entity type and randomly extracting at least one group from the groups;
the extraction unit is used for determining that the first quantity of sample data in the extracted at least one group is a support sample and the second quantity of sample data is a query sample;
the combination unit is used for obtaining a group of meta-training samples according to the support samples and the query samples;
and the cycling unit is used for repeating the random extraction of at least one group from the groups to obtain multiple meta-training samples.
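The episode construction performed by the grouping, extraction, combination, and cycling units can be sketched as follows; the function name, the episode dictionary layout, and the fixed seed are illustrative assumptions (the description elsewhere chooses 10 support samples per category):

```python
import random

def build_meta_training_samples(groups, n_episodes,
                                support_size=10, query_size=5, seed=0):
    """groups: dict mapping entity category -> list of sample sentences.

    For each episode, randomly pick one category group, take a first
    quantity of its samples as the support set and a second quantity
    as the query set.
    """
    rng = random.Random(seed)
    episodes = []
    for _ in range(n_episodes):
        category = rng.choice(list(groups))
        samples = rng.sample(groups[category], support_size + query_size)
        support, query = samples[:support_size], samples[support_size:]
        episodes.append({"category": category,
                         "support": support, "query": query})
    return episodes

# Toy usage: two category groups with dummy sentence placeholders.
groups = {
    "PER": [f"per_{i}" for i in range(20)],
    "LOC": [f"loc_{i}" for i in range(20)],
}
episodes = build_meta_training_samples(groups, n_episodes=3)
```

Repeating the random extraction, as the cycling unit does, simply corresponds to running the loop for more episodes.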
In one embodiment, the grouping unit may include:
the grouping subunit is used for acquiring sample data grouped according to the initial entity category and grouping the sample data in the initial entity category according to the target entity category;
the standardization subunit is used for carrying out standardization processing on the sample data grouped according to the target entity type;
and the merging subunit is used for merging the standardized target entity categories corresponding to the initial entity categories to obtain the groups corresponding to the target entity categories.
In one embodiment, the training module may include:
the second vectorization unit is used for serializing the words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the support sample and the query sample;
the second high-level feature representation unit is used for processing the vectorization representation of the query sample through the vectorization representation of the support sample according to the entity type identification model to obtain the high-level features of the query sample;
the category identification unit is used for processing the high-level characteristics of the query sample to obtain the category of a newly added entity corresponding to the support sample in the query sample;
the loss function generating unit is used for inputting the newly added entity type of the corresponding support sample in the obtained query sample and the real entity type of the query sample into the random field layer to calculate and obtain a loss function;
and the training unit is used for training the entity class recognition model through a loss function.
In one embodiment, the second vectorization unit is further configured to process the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain the high-level features of the query sample:
where q̃_j is the high-level feature of the query sample obtained after modeling against the support samples; the atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample; [q_j; s_rep^k] represents the concatenation of the two vectors into a new vector; T is a real number used to control the sharpness of the distribution obtained by the atten function; and k is the serial number of a support sample, whose value is related to the number of support samples.
For the specific definition of the entity class identification device based on meta-learning, reference may be made to the above definition of the entity class identification method based on meta-learning, which is not described herein again. The modules in the entity category identification device based on meta-learning can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 4. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a meta-learning based entity class identification method.
Those skilled in the art will appreciate that the structure shown in fig. 4 is merely a block diagram of part of the structure related to the disclosed solution and does not limit the computer devices to which the solution applies; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, there is provided a computer device comprising a memory storing a computer program and a processor implementing the following steps when the processor executes the computer program: acquiring a new entity type, and inquiring a reference sample corresponding to the new entity type; acquiring data to be identified; and inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model so as to recognize a new entity class corresponding to the reference sample in each data to be recognized, wherein the entity class recognition model is obtained by training based on a meta-learning mode.
In one embodiment, inputting the reference sample and the data to be recognized into the pre-generated entity class recognition model to recognize the newly added entity class corresponding to the reference sample in each piece of data to be recognized, as implemented when the processor executes the computer program, includes: serializing the words in the reference sample and the data to be recognized, and performing high-order feature representation on the serialized words; performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the reference sample and the data to be recognized; processing the vectorized representation of the data to be recognized through the vectorized representation of the reference sample to obtain the high-level features of the data to be recognized; and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be recognized.
In one embodiment, the manner in which the entity class identification model is trained when the computer program is executed by the processor comprises: acquiring sample data, and constructing a multi-element training sample according to the sample data; and training according to the meta-training sample to obtain an entity class identification model.
In one embodiment, acquiring sample data and constructing multiple meta-training samples from the sample data, as implemented when the processor executes the computer program, includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; determining a first quantity of sample data in the extracted at least one group as support samples and a second quantity of sample data as query samples; obtaining a group of meta-training samples according to the support samples and the query samples; and repeating the random extraction of at least one group from the groups to obtain multiple meta-training samples.
In one embodiment, obtaining sample data implemented by a processor executing a computer program, grouping the sample data by entity class, comprises: obtaining sample data grouped according to the initial entity category, and grouping the sample data in the initial entity category according to the target entity category; carrying out standardization processing on sample data grouped according to the target entity type; and merging the standardized target entity categories corresponding to the initial entity categories to obtain groups corresponding to the target entity categories.
In one embodiment, training according to the meta-training samples to obtain the entity class recognition model, as implemented when the processor executes the computer program, includes: serializing the words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the support sample and the query sample; processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity class recognition model to obtain the high-level features of the query sample; processing the high-level features of the query sample to obtain the newly added entity category corresponding to the support sample in the query sample; inputting the obtained newly added entity category corresponding to the support sample in the query sample and the real entity category of the query sample into a conditional random field layer to calculate a loss function; and training the entity class recognition model through the loss function.
In one embodiment, serializing the words in the support sample and the query sample and performing high-order feature representation on the serialized words, as implemented when the processor executes the computer program, includes: processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain the high-level features of the query sample:
where q̃_j is the high-level feature of the query sample obtained after modeling against the support samples; the atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample; [q_j; s_rep^k] represents the concatenation of the two vectors into a new vector; T is a real number used to control the sharpness of the distribution obtained by the atten function; and k is the serial number of a support sample, whose value is related to the number of support samples.
In one embodiment, a computer storage medium is provided, having a computer program stored thereon, the computer program, when executed by a processor, implementing the steps of: acquiring a new entity type, and inquiring a reference sample corresponding to the new entity type; acquiring data to be identified; and inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model so as to recognize a new entity class corresponding to the reference sample in each data to be recognized, wherein the entity class recognition model is obtained by training based on a meta-learning mode.
In one embodiment, inputting the reference sample and the data to be recognized into the pre-generated entity class recognition model to recognize the newly added entity class corresponding to the reference sample in each piece of data to be recognized, as implemented when the computer program is executed by the processor, includes: serializing the words in the reference sample and the data to be recognized, and performing high-order feature representation on the serialized words; performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the reference sample and the data to be recognized; processing the vectorized representation of the data to be recognized through the vectorized representation of the reference sample to obtain the high-level features of the data to be recognized; and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be recognized.
In one embodiment, the manner in which the entity class identification model is trained when the computer program is executed by the processor includes: acquiring sample data, and constructing a multi-element training sample according to the sample data; and training according to the meta-training sample to obtain an entity class identification model.
In one embodiment, acquiring sample data and constructing multiple meta-training samples from the sample data, as implemented when the computer program is executed by the processor, includes: acquiring sample data, grouping the sample data according to entity categories, and randomly extracting at least one group from the groups; determining a first quantity of sample data in the extracted at least one group as support samples and a second quantity of sample data as query samples; obtaining a group of meta-training samples according to the support samples and the query samples; and repeating the random extraction of at least one group from the groups to obtain multiple meta-training samples.
In one embodiment, obtaining sample data, the grouping of sample data by entity class performed when the computer program is executed by the processor, comprises: obtaining sample data grouped according to the initial entity category, and grouping the sample data in the initial entity category according to the target entity category; carrying out standardization processing on sample data grouped according to the target entity type; and merging the standardized target entity categories corresponding to the initial entity categories to obtain groups corresponding to the target entity categories.
In one embodiment, training according to the meta-training samples to obtain the entity class recognition model, as implemented when the computer program is executed by the processor, includes: serializing the words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing an average pooling operation on the words after the high-order feature representation to obtain vectorized representations of the support sample and the query sample; processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the entity class recognition model to obtain the high-level features of the query sample; processing the high-level features of the query sample to obtain the newly added entity category corresponding to the support sample in the query sample; inputting the obtained newly added entity category corresponding to the support sample in the query sample and the real entity category of the query sample into a conditional random field layer to calculate a loss function; and training the entity class recognition model through the loss function.
In one embodiment, serializing the words in the support sample and the query sample and performing high-order feature representation on the serialized words, as implemented when the computer program is executed by the processor, includes: processing the vectorized representation of the query sample through the vectorized representation of the support sample according to the following formula to obtain the high-level features of the query sample:
where q̃_j is the high-level feature of the query sample obtained after modeling against the support samples; the atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample; [q_j; s_rep^k] represents the concatenation of the two vectors into a new vector; T is a real number used to control the sharpness of the distribution obtained by the atten function; and k is the serial number of a support sample, whose value is related to the number of support samples.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks associated with one another by cryptographic methods, where each data block contains information on a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
It will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing related hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as there is no contradiction in a combination of these technical features, it should be considered within the scope of this specification.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not therefore be construed as limiting the scope of the patent. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these all fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.
Claims (10)
1. A meta-learning based entity class identification method, the method comprising:
acquiring a newly added entity type, and inquiring a reference sample corresponding to the newly added entity type;
acquiring data to be identified;
and inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model so as to recognize a new entity class corresponding to the reference sample in each data to be recognized, wherein the entity class recognition model is obtained by training based on a meta-learning mode.
2. The method of claim 1, wherein the inputting the reference sample and the data to be recognized into a pre-generated entity class recognition model to recognize a new entity class corresponding to the reference sample in each data to be recognized comprises:
serializing words in the reference sample and the data to be recognized, and performing high-order feature representation on the serialized words;
carrying out average pooling operation on the words after the high-order feature representation to obtain vector representations of the reference sample and the data to be identified;
processing the vectorization representation of the data to be identified through the vectorization representation of the reference sample to obtain high-level features of the data to be identified;
and processing the high-level features to obtain the newly added entity category corresponding to the reference sample in the data to be identified.
3. The method according to claim 1 or 2, wherein the training of the entity class recognition model comprises:
obtaining sample data, and constructing a multi-element training sample according to the sample data;
and training according to the meta-training sample to obtain an entity class identification model.
4. The method of claim 3, wherein obtaining sample data and constructing multi-element training samples from the sample data comprises:
obtaining sample data, grouping the sample data according to entity types, and randomly extracting at least one group from the groups;
determining a first quantity of sample data in the extracted at least one group as a support sample, and determining a second quantity of sample data as a query sample;
obtaining a group of meta-training samples according to the support samples and the query samples;
repeating the random extraction of at least one group from the groups to obtain multiple meta-training samples.
5. The method of claim 4, wherein the obtaining sample data, the grouping the sample data by entity class, comprises:
obtaining sample data grouped according to an initial entity type, and grouping the sample data in the initial entity type according to a target entity type;
carrying out standardization processing on sample data grouped according to the target entity type;
and merging the standardized target entity categories corresponding to the initial entity categories to obtain groups corresponding to the target entity categories.
6. The method of claim 5, wherein the training according to the meta-training samples to obtain an entity class recognition model comprises:
serializing words in the support sample and the query sample, performing high-order feature representation on the serialized words, and performing average pooling operation on the words after the high-order feature representation to obtain vector representations of the support sample and the query sample;
processing the vectorization representation of the query sample through the vectorization representation of the support sample according to an entity category identification model to obtain the high-level features of the query sample;
processing the high-level characteristics of the query sample to obtain a newly added entity category corresponding to the support sample in the query sample;
inputting the newly added entity type corresponding to the support sample in the obtained query sample and the real entity type of the query sample into a random field layer to obtain a loss function through calculation;
and training the entity class recognition model through the loss function.
7. The method of claim 6, wherein the serializing words in the support samples and the query samples and performing high-order feature representation on the serialized words comprises:
processing the vectorized representation of the query sample by the vectorized representation of the support sample according to the following formula to obtain the high-level features of the query sample:
where q̃_j is the high-level feature of the query sample obtained after modeling against the support samples; the atten function is used to calculate the contribution degree of each support sample to the identification of the named entities of the query sample; [q_j; s_rep^k] represents the concatenation of the two vectors into a new vector; T is a real number used to control the sharpness of the distribution obtained by the atten function; and k is the serial number of a support sample, whose value is related to the number of support samples.
8. An entity class identification device based on meta-learning, the device comprising:
the new entity type acquisition module is used for acquiring a new entity type and inquiring a reference sample corresponding to the new entity type;
the data to be identified acquisition module is used for acquiring data to be identified;
and the entity identification module is used for inputting the reference sample and the data to be identified into a pre-generated entity class identification model so as to identify a newly added entity class corresponding to the reference sample in each data to be identified, wherein the entity class identification model is obtained by training based on a meta-learning mode.
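The device of claim 8 feeds a reference sample for each newly added entity category, together with the data to be identified, into the trained model. The patent does not specify the matching rule, so the following is only a minimal sketch assuming a nearest-reference (dot-product) comparison between already-encoded vectors; `identify_entity_category` and the category names are hypothetical:

```python
def identify_entity_category(reference_vecs, data_vec):
    # For each candidate newly added entity category, score the
    # data-to-identify vector against that category's reference-sample
    # vector (dot product) and return the best-matching category.
    best_cat, best_score = None, float("-inf")
    for category, ref_vec in reference_vecs.items():
        score = sum(r * d for r, d in zip(ref_vec, data_vec))
        if score > best_score:
            best_cat, best_score = category, score
    return best_cat

# Hypothetical reference vectors for two newly added entity categories.
refs = {"DRUG": [1.0, 0.2], "GENE": [0.1, 1.0]}
print(identify_entity_category(refs, [0.9, 0.3]))  # → DRUG
```

The point of the meta-learning setup is that a new category needs only such a reference sample, not retraining of the model.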
9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer storage medium on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011472865.XA CN112528662A (en) | 2020-12-15 | 2020-12-15 | Entity category identification method, device, equipment and storage medium based on meta-learning |
PCT/CN2021/109617 WO2022127124A1 (en) | 2020-12-15 | 2021-07-30 | Meta learning-based entity category recognition method and apparatus, device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011472865.XA CN112528662A (en) | 2020-12-15 | 2020-12-15 | Entity category identification method, device, equipment and storage medium based on meta-learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112528662A true CN112528662A (en) | 2021-03-19 |
Family
ID=74999881
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011472865.XA Pending CN112528662A (en) | 2020-12-15 | 2020-12-15 | Entity category identification method, device, equipment and storage medium based on meta-learning |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112528662A (en) |
WO (1) | WO2022127124A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2022127124A1 (en) * | 2020-12-15 | 2022-06-23 | 深圳壹账通智能科技有限公司 | Meta learning-based entity category recognition method and apparatus, device and storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101846824B1 (en) * | 2017-12-11 | 2018-04-09 | 가천대학교 산학협력단 | Automated Named-entity Recognizing Systems, Methods, and Computer-Readable Mediums |
CN109783604A (en) * | 2018-12-14 | 2019-05-21 | 平安科技(深圳)有限公司 | Information extracting method, device and computer equipment based on a small amount of sample |
CN110825875A (en) * | 2019-11-01 | 2020-02-21 | 科大讯飞股份有限公司 | Text entity type identification method and device, electronic equipment and storage medium |
CN111797394A (en) * | 2020-06-24 | 2020-10-20 | 广州大学 | APT organization identification method, system and storage medium based on stacking integration |
CN112001179A (en) * | 2020-09-03 | 2020-11-27 | 平安科技(深圳)有限公司 | Named entity recognition method and device, electronic equipment and readable storage medium |
CN112052684A (en) * | 2020-09-07 | 2020-12-08 | 南方电网数字电网研究院有限公司 | Named entity identification method, device, equipment and storage medium for power metering |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109885825A (en) * | 2019-01-07 | 2019-06-14 | 平安科技(深圳)有限公司 | Name entity recognition method, device and computer equipment based on attention mechanism |
CN111860580B (en) * | 2020-06-09 | 2024-02-20 | 北京百度网讯科技有限公司 | Identification model acquisition and category identification method, device and storage medium |
CN111767400B (en) * | 2020-06-30 | 2024-04-26 | 平安国际智慧城市科技股份有限公司 | Training method and device for text classification model, computer equipment and storage medium |
CN111859937A (en) * | 2020-07-20 | 2020-10-30 | 上海汽车集团股份有限公司 | Entity identification method and device |
CN112528662A (en) * | 2020-12-15 | 2021-03-19 | 深圳壹账通智能科技有限公司 | Entity category identification method, device, equipment and storage medium based on meta-learning |
2020
- 2020-12-15 CN CN202011472865.XA patent/CN112528662A/en active Pending

2021
- 2021-07-30 WO PCT/CN2021/109617 patent/WO2022127124A1/en active Application Filing
Non-Patent Citations (4)
Title |
---|
YIMING ZHANG ET AL.: "KADetector: Automatic Identification of Key Actors in Online Hack Forums Based on Structured Heterogeneous Information Network", IEEE, 31 December 2018 (2018-12-31) *
年福东; 束建华; 吕刚: "Few-shot learning algorithm based on adaptive feature comparison" (in Chinese), Journal of Xi'an University (Natural Science Edition), no. 04, 15 October 2020 (2020-10-15) *
王浩畅; 李钰; 赵铁军: "A multi-agent meta-learning framework for biomedical named entity recognition" (in Chinese), Chinese Journal of Computers, no. 07, 15 July 2010 (2010-07-15) *
王蕾; 谢云; 周俊生; 顾彦慧; 曲维光: "Segment-level Chinese named entity recognition based on neural networks" (in Chinese), Journal of Chinese Information Processing, no. 03, 15 March 2018 (2018-03-15) *
Also Published As
Publication number | Publication date |
---|---|
WO2022127124A1 (en) | 2022-06-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
REG | Reference to a national code | | Ref country code: HK; Ref legal event code: DE; Ref document number: 40045440; Country of ref document: HK
SE01 | Entry into force of request for substantive examination ||