CN113268452A - Entity extraction method, device, equipment and storage medium - Google Patents

Entity extraction method, device, equipment and storage medium

Info

Publication number
CN113268452A
Authority
CN
China
Prior art keywords
entity
extraction model
data set
module
inputting
Prior art date
Legal status
Granted
Application number
CN202110569742.6A
Other languages
Chinese (zh)
Other versions
CN113268452B (en)
Inventor
罗永贵
刘霄晨
肖劲
尹芳
张晓璐
马晶
Current Assignee
Lianren Healthcare Big Data Technology Co Ltd
Original Assignee
Lianren Healthcare Big Data Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Lianren Healthcare Big Data Technology Co Ltd
Priority to CN202110569742.6A
Publication of CN113268452A
Application granted
Publication of CN113268452B
Legal status: Active
Anticipated expiration

Classifications

    • G - PHYSICS
    • G16 - INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H - HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 - ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 - File systems; File servers
    • G06F16/11 - File system administration, e.g. details of archiving or snapshots
    • G06F16/116 - Details of conversion of file system types or formats
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 - Pattern recognition
    • G06F18/20 - Analysing
    • G06F18/21 - Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 - Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 - Handling natural language data
    • G06F40/20 - Natural language analysis
    • G06F40/279 - Recognition of textual entities

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention discloses an entity extraction method, device, equipment and storage medium. The method comprises the following steps: acquiring an unlabeled data set and a labeled data set corresponding to the unlabeled data set, determining new words in the unlabeled data set, and forming a new word data set; converting each piece of unlabeled data in the unlabeled data set into a vector in a preset format, and inputting the vector into an entity extraction model to be trained; enhancing the feature information output by the feature extraction module based on the new word data set, and inputting the enhanced feature information into a prediction module to obtain a predicted entity; and generating a loss function based on the predicted entity and the labeled data set, and iteratively adjusting the parameters of the entity extraction model to obtain a target entity extraction model. With this technical scheme, entity boundary information can be effectively learned from the new word data set during entity extraction, thereby improving the accuracy of entity extraction.

Description

Entity extraction method, device, equipment and storage medium
Technical Field
The embodiment of the invention relates to the technical field of data processing, in particular to a method, a device, equipment and a storage medium for entity extraction.
Background
Electronic medical records contain a large number of specialized medical terms, and because each doctor has individual writing habits, the same medical term is often expressed in different ways. As a result, electronic medical records contain many out-of-vocabulary (OOV) words, which makes entity extraction from them difficult and challenging.
The current common approach is to train a model on single characters or words, using massive amounts of data to improve the model's generalization ability and thereby alleviate the difficulty of recognizing out-of-vocabulary words. However, entity extraction from electronic medical records faces a vocabulary with virtually unlimited permutations and combinations, and the ambiguity of Chinese word segmentation further reduces the accuracy of entity recognition and extraction.
Disclosure of Invention
The embodiment of the invention provides a method, a device, equipment and a storage medium for entity extraction, so as to improve the accuracy of entity extraction.
In a first aspect, an embodiment of the present invention provides a method for training an entity extraction model, including:
acquiring an unlabeled data set and a labeled data set corresponding to the unlabeled data set, and determining new words in the unlabeled data set to form a new word data set;
converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module;
based on the new word data set, enhancing the feature information output by the feature extraction module, and inputting the enhanced feature information to the prediction module to obtain a prediction entity;
and generating a loss function based on the predicted entity and the labeled data set, and carrying out iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model.
In a second aspect, an embodiment of the present invention further provides an entity extraction method, including:
acquiring data to be processed, and converting the data to be processed into a preset format vector;
and inputting the preset format vector to a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed, wherein the entity extraction model is obtained by training based on the entity extraction model training method provided by any embodiment of the invention.
In a third aspect, an embodiment of the present invention further provides a training apparatus for an entity extraction model, including:
the new word determining module is used for acquiring an unlabeled data set and a labeled data set corresponding to the unlabeled data set, determining new words in the unlabeled data set and forming a new word data set;
the vector input module is used for converting each unmarked data in the unmarked data set into a preset format vector and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module;
the information enhancement module is used for enhancing the feature information output by the feature extraction module based on the new word data set, and inputting the enhanced feature information to the prediction module to obtain a prediction entity;
and the model generation module generates a loss function based on the predicted entity and the labeled data set, and performs iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model.
In a fourth aspect, an embodiment of the present invention further provides an entity extraction apparatus, including:
the data conversion module is used for acquiring data to be processed and converting the data to be processed into a preset format vector;
and the entity identification module is used for inputting the preset format vector to a pre-trained entity extraction model and identifying a target entity corresponding to the data to be processed, wherein the entity extraction model is obtained by training based on the entity extraction model training method provided by any embodiment of the invention.
In a fifth aspect, an embodiment of the present invention further provides an electronic device, where the electronic device includes:
one or more processors;
storage means for storing one or more programs;
when executed by the one or more processors, the one or more programs cause the one or more processors to implement a method of training an entity extraction model, and/or an entity extraction method, as provided by any of the embodiments of the invention.
In a sixth aspect, embodiments of the present invention further provide a storage medium containing computer-executable instructions, which when executed by a computer processor, are configured to perform the method for training an entity extraction model and/or the method for entity extraction described in any of the embodiments of the present invention.
According to the technical scheme of the embodiment of the invention, new words in an unlabeled data set are determined by acquiring the unlabeled data set and a labeled data set corresponding to the unlabeled data set, so as to form a new word data set; converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module; based on the new word data set, enhancing the feature information output by the feature extraction module, and inputting the enhanced feature information to the prediction module to obtain a prediction entity; and generating a loss function based on the predicted entity and the labeled data set, and carrying out iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model. According to the technical scheme, when the entity is extracted, the entity boundary information can be effectively learned by means of the new word data set, so that the accuracy of entity extraction is improved.
Drawings
Fig. 1 is a flowchart of a training method for an entity extraction model according to an embodiment of the present invention.
Fig. 2 is a flowchart of an entity extraction method according to a second embodiment of the present invention.
Fig. 3 is a schematic diagram of an entity extraction structure based on new words according to a second embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a training apparatus for an entity extraction model according to a third embodiment of the present invention.
Fig. 5 is a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention. It should be further noted that, for the convenience of description, only some of the structures related to the present invention are shown in the drawings, not all of the structures.
Example one
Fig. 1 is a flowchart of a method for training an entity extraction model according to an embodiment of the present invention, where this embodiment is applicable to a case where an entity extraction model is trained according to a data set, and the method may be executed by an apparatus for training an entity extraction model according to an embodiment of the present invention, where the apparatus may be implemented by software and/or hardware, and the apparatus may be configured on an electronic computing device, and specifically includes the following steps:
s110, obtaining an unlabeled data set and a labeled data set corresponding to the unlabeled data set, determining a new word in the unlabeled data set, and forming a new word data set.
In this embodiment, the labeled data set is a data set whose samples have been annotated by a user with a labeling tool, and the corresponding unlabeled data set is a data set whose samples have not been annotated. The type and content of the unlabeled data set are not specifically limited here. Optionally, the unlabeled data set may be an unlabeled electronic medical record data set, and the labeled data set may be formed by selecting a small number of samples from the large unlabeled electronic medical record data set and annotating them. New words are identified in the data set by a new word discovery algorithm to form a new word data set, where a new word is a word not registered in the dictionary, i.e. an out-of-dictionary word; new words may include, but are not limited to, abbreviations, proper nouns, derivatives, and compound words. New word discovery algorithms fall mainly into statistics-based methods, rule-based methods, and hybrids of the two.
Optionally, the new word discovery algorithm may be a rule-based method: a rule base, professional lexicon, or pattern base is established by analyzing the word-formation and occurrence characteristics of vocabulary, and new words are discovered by rule matching. The algorithm may also be a statistics-based method; the statistical model used may be an N-gram, where the value of N varies with the recognition samples and the requirements. For example, when N is 2, the statistical model is a bigram, which considers only the grammar and data information obtained from two adjacent words. The new word discovery algorithm may also fuse statistics and rules to achieve more accurate new word discovery.
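The statistics-based branch of new word discovery can be sketched as a frequency-plus-association filter over adjacent character pairs (a bigram model). This is a minimal illustration under stated assumptions, not the patent's concrete algorithm: the function name, the restriction to two-character candidates, and the pointwise-mutual-information thresholds are all illustrative.

```python
import math
from collections import Counter

def discover_new_words(corpus, dictionary, min_count=2, min_pmi=1.0):
    """Find candidate new words: frequent character bigrams with high
    pointwise mutual information (PMI) that are absent from the dictionary."""
    chars = Counter()
    bigrams = Counter()
    for sentence in corpus:
        chars.update(sentence)
        bigrams.update(sentence[i:i + 2] for i in range(len(sentence) - 1))
    total_chars = sum(chars.values())
    total_bigrams = sum(bigrams.values())
    new_words = []
    for bg, count in bigrams.items():
        if count < min_count or bg in dictionary:
            continue  # too rare, or already a registered word
        p_bg = count / total_bigrams
        p_a = chars[bg[0]] / total_chars
        p_b = chars[bg[1]] / total_chars
        # high PMI means the two characters co-occur far more than chance
        if math.log(p_bg / (p_a * p_b)) >= min_pmi:
            new_words.append(bg)
    return new_words
```

A production system would extend candidates beyond bigrams and typically combine PMI with boundary entropy, but the filtering idea is the same.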
And S120, converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module.
In this embodiment, converting each piece of unlabeled data in the unlabeled data set into a vector in a preset format means producing a vector adapted to the input requirements of the entity extraction model, so that the model to be trained can extract effective features; the preset format is determined according to the input requirements of the entity extraction model. Methods for this conversion may include, but are not limited to, character-to-vector (char2vec) models, word-to-vector (word2vec) models, and other vector conversion models.
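The conversion step can be illustrated as a simple embedding lookup. The tiny vocabulary and embedding table below are hypothetical stand-ins for a pre-trained char2vec model; only the lookup mechanics reflect the text above.

```python
def to_char_vectors(text, char2id, embeddings):
    """Look up a pre-trained embedding row for every character of a record,
    producing the preset-format vector sequence expected by the model.
    Characters missing from the vocabulary map to index 0 (<UNK>)."""
    return [embeddings[char2id.get(ch, 0)] for ch in text]

# Hypothetical tiny vocabulary; a real char2vec model is trained on the corpus.
char2id = {"<UNK>": 0, "头": 1, "痛": 2}
embeddings = [[0.0, 0.0], [0.1, 0.9], [0.8, 0.2]]  # one row per character id
```

A word-vector (word2vec) pipeline works the same way, except the text is first segmented into words and the lookup table is keyed by words.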
The entity extraction model can be trained in advance on a large number of unlabeled data sets and the labeled data sets corresponding to them. The trained entity extraction model comprises a feature extraction module and a prediction module: the feature extraction module learns the contextual relationship between each entity word in the unlabeled data and the other entity words, and the prediction module predicts the entity type, which may include, but is not limited to, a person name, a place name, or an organization. The model parameters are trained by continuously adjusting them so that the deviation between the model output and the labeled data set gradually decreases and stabilizes, thereby generating the entity extraction model.
The model parameters of the entity extraction model may adopt a random initialization principle, or may also adopt a fixed value initialization principle according to experience, which is not specifically limited in this embodiment. By carrying out initialization assignment on the weight and the offset value of each node of the model, the convergence speed and the performance of the model can be improved.
On the basis of the above embodiment, the entity extraction model comprises a first extraction model based on character vectors and/or a second extraction model based on word vectors. Converting each piece of unlabeled data in the unlabeled data set into a preset-format vector and inputting it into the entity extraction model to be trained includes: converting each piece of unlabeled data in the unlabeled data set into character vectors and inputting the character vectors into the first extraction model to be trained; and/or converting each piece of unlabeled data in the unlabeled data set into word vectors and inputting the word vectors into the second extraction model to be trained.
A character vector is the vectorized representation of a single character; methods for converting each piece of unlabeled data into character vectors may include a character embedding model (char2vec) and the like. A word vector is the vectorized representation of a word; methods for converting each piece of unlabeled data into word vectors may include a word embedding model (word2vec) and the like. Both the character-vector-based first extraction model and the word-vector-based second extraction model may be Named Entity Recognition (NER) models. The NER model may include, but is not limited to, deep learning models such as LSTM-CRF, BERT-BiLSTM-CRF, and IDCNN/BiLSTM-CRF, which are not limited in this embodiment.
Exemplarily, the unlabeled data set can be trained with the character embedding model char2vec to obtain character vectors, and the pre-trained character vectors are then input into an LSTM-CRF model; alternatively, the unlabeled data set can be trained with the word embedding model word2vec to obtain word vectors, and the pre-trained word vectors are then input into an LSTM-CRF model.
S130, based on the new word data set, the feature information output by the feature extraction module is enhanced, and the enhanced feature information is input to the prediction module to obtain a prediction entity.
In this embodiment, the feature information output by the feature extraction module is enhanced based on the new word data set: the feature information is further refined, and the latent word information in the new word data set is incorporated as features, making the feature information more complete. This reduces recognition errors caused by ambiguity and improves the accuracy with which the prediction module predicts entities.
Optionally, the feature information includes an emission matrix and a transition probability matrix, and correspondingly, the enhancing processing of the feature information output by the feature extraction module based on the new word data set includes: and determining an enhancement coefficient based on the number of new words in the new word data set, and enhancing the transition probability matrix based on the enhancement coefficient.
The emission matrix and the transition probability matrix are both output by the feature extraction module. The emission matrix describes the score of each entity category at the current position; the transition probability matrix describes the score of moving from the entity category at the current position to the entity category at the next position. From the emission matrix and the transition probability matrix, a path score can be calculated, which can be understood as the probability of the entity type of each word in the current unlabeled data. An enhancement coefficient is determined based on the number of new words in the new word data set, the transition probability matrix is then adjusted with this coefficient, and the latent word information in the new word data set is combined into the feature information, thereby reducing recognition errors caused by ambiguity and improving the accuracy of the prediction module in predicting entities.
It should be emphasized that the transition probability matrix and the emission matrix are output by the feature extraction module and used as inputs to the prediction module; both can be initialized randomly, so that their parameters are updated along with model training.
For example, the path score may be denoted by S, the emission score corresponding to the emission matrix by E, and the transition score corresponding to the transition probability matrix by T. The emission probability may be referred to as the emission score, and the transition probability as the transition score. The path score is then given by the following formula:
S=E+T
The transition score T of the path score is multiplied by the enhancement coefficient (1 + γ·exp(N/10000)), where γ is a hyperparameter in the range 0 < γ < 1 and N is the number of new words in the new word data set.
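The enhancement step can be sketched as follows, assuming the coefficient is applied uniformly to every entry of the transition matrix (the patent text does not spell out the application granularity, so that uniformity is an assumption):

```python
import math

def enhance_transition(transition, n_new_words, gamma=0.5):
    """Multiply every transition score by the enhancement coefficient
    (1 + gamma * exp(N / 10000)) from the formula above, where N is the
    number of new words in the new word data set and 0 < gamma < 1."""
    coeff = 1.0 + gamma * math.exp(n_new_words / 10000.0)
    return [[score * coeff for score in row] for row in transition]
```

With γ = 0.5 and an empty new word data set the coefficient is 1.5, and it grows as more new words are discovered, so richer new-word evidence weights the transition scores more heavily in the path score.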
S140, generating a loss function based on the predicted entity and the labeled data set, and carrying out iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model.
In this embodiment, the loss function may be a log-likelihood loss function. Specifically, the loss function is generated by calculating the emission score and transition score corresponding to the emission matrix and transition probability matrix in the feature information, normalizing them to obtain a maximum likelihood probability, and converting that probability into logarithmic form. Iterative parameter adjustment of the entity extraction model through the loss function reduces and stabilizes the deviation between the predicted entity and the labeled data set, yielding the target entity extraction model. The normalization method is not limited in this embodiment; optionally, it may be the softmax function.
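The log-likelihood loss described above can be illustrated with a brute-force linear-chain CRF sketch: the softmax over path scores, taken in log form, gives the negative log-likelihood of the gold path. This is a generic textbook formulation rather than the patent's implementation, and real systems compute the partition term with the forward algorithm instead of enumerating every path.

```python
import math
from itertools import product

def crf_neg_log_likelihood(emissions, transitions, gold_path):
    """Negative log-likelihood of the gold label path under a linear-chain
    CRF: log of the sum of exp(path score) over all paths, minus the gold
    path's score (a log-softmax over path scores)."""
    n_steps, n_tags = len(emissions), len(emissions[0])

    def path_score(path):
        s = sum(emissions[i][t] for i, t in enumerate(path))
        s += sum(transitions[a][b] for a, b in zip(path, path[1:]))
        return s

    # Brute-force partition function over all tag paths (exponential cost;
    # real implementations use the forward algorithm instead).
    log_z = math.log(sum(math.exp(path_score(p))
                         for p in product(range(n_tags), repeat=n_steps)))
    return log_z - path_score(gold_path)
```

The loss is minimal when the gold path's score dominates all alternatives, which is exactly the training signal used for the iterative parameter adjustment.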
The training process of the model is executed iteratively until the required number of training iterations and training precision are met or a convergence state is reached; the entity extraction model is then determined to be fully trained, and the target entity extraction model is obtained.
Optionally, the prediction entity is a prediction entity output by the first extraction model, or a prediction entity output by the second extraction model, or a prediction entity obtained by fusing a prediction entity output by the first extraction model and a prediction entity output by the second extraction model.
The predicted entity obtained by the entity extraction model may be the predicted entity output by the character-vector-based first extraction model, the predicted entity output by the word-vector-based second extraction model, or a fusion of the two. It should be noted that the predicted entity is determined by ranking candidates according to the entity-type probabilities produced by the entity extraction model. Fusing the predicted entity output by the first extraction model with that output by the second extraction model means fusing their entity-type probabilities. For example, if the entity-type probability from the character-vector-based first extraction model is 0.8 and that from the word-vector-based second extraction model is 0.6, averaging the two gives a fused entity-type probability of 0.7, an equilibrium point between the two predictions that helps ensure the reliability of the predicted entity. By fusing the two models' predictions, the entity extraction model can exploit character-vector information while also incorporating the contextual information carried by word vectors, improving its recognition accuracy.
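The 0.8/0.6 fusion example above amounts to a per-entity-type average of the two models' probabilities. The dictionary representation below is an assumption about how the model outputs are structured, made purely for illustration:

```python
def fuse_predictions(probs_char_model, probs_word_model):
    """Fuse the entity-type probabilities of the character-vector model and
    the word-vector model by simple averaging, as in the 0.8/0.6 -> 0.7
    example; assumes both models score the same set of entity types."""
    return {etype: (probs_char_model[etype] + probs_word_model[etype]) / 2
            for etype in probs_char_model}
```

Other fusion rules (weighted averages, max, learned gating) are possible; averaging is the one the example describes.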
The embodiment of the invention provides a training method of an entity extraction model, which comprises the steps of determining new words in an unlabeled data set by acquiring the unlabeled data set and a labeled data set corresponding to the unlabeled data set to form a new word data set; converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module; based on the new word data set, enhancing the feature information output by the feature extraction module, and inputting the enhanced feature information to the prediction module to obtain a prediction entity; and generating a loss function based on the predicted entity and the labeled data set, and carrying out iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model. According to the technical scheme, when the entity is extracted, the entity boundary information can be effectively learned by means of the new word data set, so that the accuracy of entity extraction is improved.
Example two
Fig. 2 is a flowchart of an entity extraction method according to a second embodiment of the present invention, where this embodiment is applicable to a case of performing entity extraction by using an entity extraction model, and the method may be executed by an entity extraction apparatus according to the second embodiment of the present invention, where the apparatus may be implemented by software and/or hardware, and the apparatus may be configured on an electronic computing device, and specifically includes the following steps:
s210, obtaining data to be processed, and converting the data to be processed into a preset format vector.
S220, inputting the preset format vector to a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed.
In this embodiment, the data to be processed is unlabeled data. The unlabeled data is acquired and converted into a preset-format vector, which may include, but is not limited to, character vectors and word vectors; the vector is then input into the pre-trained entity extraction model to obtain the target entity corresponding to the data.
In an optional implementation manner of the embodiment of the present invention, after acquiring the data to be processed, the method further includes: determining new words in the data to be processed; the inputting the preset format vector into a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed includes: and inputting the preset format vector to a feature extraction module of the entity extraction model to obtain feature information, enhancing the feature information based on the new words, and inputting the enhanced feature information to a prediction module of the entity extraction model to obtain a target entity.
The entity extraction model may use a Long Short-Term Memory network (LSTM) as the feature extraction module, or an LSTM-based variant network, which is not limited in this embodiment. The prediction module may adopt a Conditional Random Field (CRF), and the optimal target entity can be selected from the CRF's predictions by the Viterbi algorithm.
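Viterbi decoding over the CRF's emission and transition scores can be sketched as follows. This is a generic textbook implementation, not the patent's code; it recovers the single highest-scoring tag path rather than summing over all paths as the training loss does.

```python
def viterbi_decode(emissions, transitions):
    """Viterbi decoding: recover the highest-scoring tag path under
    per-position emission scores plus tag-to-tag transition scores."""
    n_tags = len(emissions[0])
    scores = list(emissions[0])   # best score of a path ending in each tag
    back = []                     # back-pointers, one row per later position
    for emit in emissions[1:]:
        prev = scores
        scores, pointers = [], []
        for t in range(n_tags):
            # best previous tag for a path that lands on tag t here
            best_s = max(range(n_tags), key=lambda s: prev[s] + transitions[s][t])
            scores.append(prev[best_s] + transitions[best_s][t] + emit[t])
            pointers.append(best_s)
        back.append(pointers)
    # follow back-pointers from the best final tag to recover the path
    best = max(range(n_tags), key=lambda t: scores[t])
    path = [best]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return list(reversed(path))
```

Because the enhancement coefficient scales the transition scores before decoding, it directly shifts which path Viterbi selects, which is how the new-word information influences the predicted entity boundaries.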
In an optional implementation of the embodiment of the present invention, the entity extraction model comprises a first extraction model based on character vectors and/or a second extraction model based on word vectors. Inputting the preset-format vector into the pre-trained entity extraction model to obtain the target entity corresponding to the data to be processed includes: inputting the character vectors converted from the data to be processed into the first extraction model to obtain a first entity; and/or inputting the word vectors converted from the data to be processed into the second extraction model to obtain a second entity; and determining the first entity or the second entity as the target entity, or fusing the first entity and the second entity to obtain the target entity.
For example, as shown in fig. 3, the data to be processed may be denoted S1, and new words in S1 are found by a new word discovery algorithm to form a new word set, denoted S3. The data to be processed S1 is converted into character vectors (char embeddings) by the char2vec method; S1 is also segmented into words by dictionary-based Chinese word segmentation, and the words are converted into word vectors (word embeddings) by the word2vec method. The first extraction model may be denoted M1 and the second extraction model M2. The character vectors are input into model M1, the transition probability matrix in M1 is enhanced through S3, and the enhanced transition probability matrix is input into the CRF layer of M1 for Viterbi decoding to obtain the first entity; the word vectors are input into model M2, the transition probability matrix in M2 is enhanced through S3, and the enhanced transition probability matrix is input into the CRF layer of M2 for Viterbi decoding to obtain the second entity. The first entity and the second entity obtained from models M1 and M2 are then fused to obtain the target entity.
The embodiment of the present invention provides an entity extraction method: data to be processed is acquired and converted into a preset format vector, and the preset format vector is input into a pre-trained entity extraction model to obtain the target entity corresponding to the data to be processed. With this technical scheme, entity boundary information can be effectively learned from the new word data set during entity extraction, which improves the accuracy of entity extraction.
Embodiment Three
Fig. 4 is a schematic structural diagram of a training apparatus for an entity extraction model according to a third embodiment of the present invention, where the training apparatus for an entity extraction model provided in this embodiment may be implemented by software and/or hardware, and may be configured in a terminal and/or a server to implement the training method for an entity extraction model in the third embodiment of the present invention. The device may specifically comprise: a new word determination module 310, a vector input module 320, an information enhancement module 330, and a model generation module 340.
The new word determining module 310 is configured to obtain an unlabeled data set and a labeled data set corresponding to the unlabeled data set, determine new words in the unlabeled data set, and form a new word data set; the vector input module 320 is configured to convert each unlabeled data item in the unlabeled data set into a preset format vector and input the preset format vector into the entity extraction model to be trained, where the entity extraction model includes a feature extraction module and a prediction module; the information enhancement module 330 is configured to enhance the feature information output by the feature extraction module based on the new word data set and input the enhanced feature information into the prediction module to obtain a predicted entity; and the model generation module 340 is configured to generate a loss function based on the predicted entity and the labeled data set and iteratively adjust the parameters of the entity extraction model to obtain a target entity extraction model.
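The model generation module's "loss function plus iterative parameter adjustment" can be sketched abstractly as a generic gradient-descent loop. This is a skeleton only: `grad_fn` is a hypothetical stand-in for the gradient of the loss between predicted and labeled entities (e.g. a CRF negative log-likelihood), and the optimizer, learning rate, and step budget are all assumptions the patent does not spell out.

```python
import numpy as np

def iterative_parameter_adjustment(params, grad_fn, data, lr=0.1, steps=100):
    """Generic outer loop: for each (input, label) pair, compute the
    gradient of the loss between predicted and labeled entities and
    adjust the parameters; repeat until the step budget is exhausted."""
    params = np.array(params, dtype=float)
    for _ in range(steps):
        for x, y in data:
            params = params - lr * grad_fn(params, x, y)
    return params
```

As a usage illustration, plugging in the gradient of a squared loss drives a scalar parameter to the value that fits the labeled data, which is the role the patent assigns to the model generation module at a much larger scale.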
On the basis of any optional technical scheme in the embodiment of the present invention, optionally, the feature information includes an emission matrix and a transition probability matrix; the information enhancement module 330 may be configured to:
determining an enhancement coefficient based on the number of new words in the new word data set, and enhancing the transition probability matrix based on the enhancement coefficient.
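As a sketch, the enhancement could look like the following. The patent states only that the coefficient is determined from the number of new words in the new word data set; the logarithmic formula and the `alpha` parameter below are assumptions chosen so the coefficient grows slowly and equals 1 when no new words are found.

```python
import math
import numpy as np

def enhance_transition_matrix(transition, n_new_words, alpha=0.05):
    """Scale the tag-transition score matrix by a coefficient that
    grows with the number of discovered new words. The exact
    functional form is an assumption; the patent leaves it open."""
    coef = 1.0 + alpha * math.log1p(n_new_words)
    return transition * coef
```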
On the basis of any optional technical solution in the embodiment of the present invention, optionally, the entity extraction model includes a first extraction model based on character vectors and/or a second extraction model based on word vectors;
the vector input module 320 may include:
the character vector conversion unit is used for converting each unlabeled data item in the unlabeled data set into character vectors and inputting the character vectors into the first extraction model to be trained; and/or,
the word vector conversion unit is used for converting each unlabeled data item in the unlabeled data set into word vectors and inputting the word vectors into the second extraction model to be trained.
On the basis of any optional technical solution in the embodiment of the present invention, optionally, the prediction entity is a prediction entity output by a first extraction model, or a prediction entity output by a second extraction model, or a prediction entity obtained by fusing a prediction entity output by a first extraction model and a prediction entity output by a second extraction model.
The embodiment further provides an entity extraction apparatus, which may include:
the data conversion module is used for acquiring data to be processed and converting the data to be processed into a preset format vector;
and the entity identification module is used for inputting the preset format vector to a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed, wherein the entity extraction model is obtained by training based on the entity extraction model training method according to any one of claims 1 to 4.
On the basis of any optional technical solution in the embodiment of the present invention, optionally, after the data to be processed is obtained, the data conversion module may further include:
the new word determining unit is used for determining new words in the data to be processed;
the entity identification module may be to:
inputting the preset format vector into the feature extraction module of the entity extraction model to obtain feature information, enhancing the feature information based on the new words, and inputting the enhanced feature information into the prediction module of the entity extraction model to obtain the target entity.
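The prediction module's decoding over the enhanced feature information can be sketched with standard Viterbi decoding over an emission matrix and a (possibly enhanced) transition matrix, matching the CRF-layer decoding the description refers to. The array shapes and score convention below are illustrative assumptions; only the decode itself is standard.

```python
import numpy as np

def viterbi_decode(emissions, transition):
    """Viterbi decoding: emissions has shape (n_tokens, n_tags),
    transition has shape (n_tags, n_tags); returns the highest-
    scoring tag index sequence."""
    n_tokens, n_tags = emissions.shape
    score = emissions[0].copy()
    backpointers = np.zeros((n_tokens, n_tags), dtype=int)
    for t in range(1, n_tokens):
        # Broadcast: previous score (column) + transition + emission (row).
        candidates = score[:, None] + transition + emissions[t][None, :]
        backpointers[t] = candidates.argmax(axis=0)
        score = candidates.max(axis=0)
    # Trace the best path backwards from the highest final score.
    best = [int(score.argmax())]
    for t in range(n_tokens - 1, 0, -1):
        best.append(int(backpointers[t, best[-1]]))
    return best[::-1]
```

Enhancing the transition matrix before this step biases the decoder toward tag transitions consistent with the discovered new-word boundaries, which is where the boundary information enters the prediction.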
On the basis of any optional technical solution in the embodiment of the present invention, optionally, the entity extraction model includes a first extraction model based on character vectors and/or a second extraction model based on word vectors;
the entity identification module is specifically operable to:
inputting the character vectors converted from the data to be processed into the first extraction model to obtain a first entity; and/or inputting the word vectors converted from the data to be processed into the second extraction model to obtain a second entity;
determining the first entity or the second entity as the target entity, or fusing the first entity and the second entity to obtain the target entity.
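One plausible fusion rule for combining the two models' entity sets is a union that prefers the longer span on overlap. The patent does not specify the fusion strategy, so this rule, and the `(start, end, label)` span representation, are assumptions for illustration.

```python
def fuse_entities(first, second):
    """Union the two sets of (start, end, label) spans; when spans
    overlap, keep the longer one (assumed tie-breaking rule)."""
    candidates = sorted(set(first) | set(second),
                        key=lambda s: s[1] - s[0], reverse=True)
    kept = []
    for start, end, label in candidates:
        # Keep a span only if it does not overlap any already-kept span.
        if all(end <= ks or start >= ke for ks, ke, _ in kept):
            kept.append((start, end, label))
    return sorted(kept)
```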
The training device for the entity extraction model provided by the embodiment of the invention can execute the training method for the entity extraction model provided by any embodiment of the invention, and has corresponding functional modules and beneficial effects of the execution method.
Embodiment Four
Fig. 5 is a schematic structural diagram of an electronic device according to a fourth embodiment of the present invention. Fig. 5 illustrates a block diagram of an exemplary electronic device 12 suitable for implementing embodiments of the present invention. The electronic device 12 shown in fig. 5 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in FIG. 5, electronic device 12 is embodied in the form of a general purpose computing device. The components of electronic device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 that couples various system components including the system memory 28 and the processing unit 16.
Bus 18 represents one or more of any of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, such architectures include, but are not limited to, Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus.
Electronic device 12 typically includes a variety of computer system readable media. Such media may be any available media that is accessible by electronic device 12 and includes both volatile and nonvolatile media, removable and non-removable media.
The system memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM)30 and/or cache memory 32. The electronic device 12 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 34 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 5, and commonly referred to as a "hard drive"). Although not shown in FIG. 5, a magnetic disk drive for reading from and writing to a removable, nonvolatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, nonvolatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In these cases, each drive may be connected to bus 18 by one or more data media interfaces. System memory 28 may include at least one program product having a set (e.g., at least one) of program modules that are configured to carry out the functions of embodiments of the invention.
A program/utility 40 having a set (at least one) of program modules 42 may be stored, for example, in system memory 28, such program modules 42 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each of which examples or some combination thereof may comprise an implementation of a network environment. Program modules 42 generally carry out the functions and/or methodologies of the described embodiments of the invention.
Electronic device 12 may also communicate with one or more external devices 14 (e.g., keyboard, pointing device, display 24, etc.), with one or more devices that enable a user to interact with electronic device 12, and/or with any devices (e.g., network card, modem, etc.) that enable electronic device 12 to communicate with one or more other computing devices. Such communication may be through an input/output (I/O) interface 22. Also, the electronic device 12 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet) via the network adapter 20. As shown in FIG. 5, the network adapter 20 communicates with the other modules of the electronic device 12 via the bus 18. It should be appreciated that although not shown in FIG. 5, other hardware and/or software modules may be used in conjunction with electronic device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
The processing unit 16 executes programs stored in the system memory 28 to perform various functional applications and data processing, such as a training method for an entity extraction model and/or an entity extraction method provided by the present embodiment.
Embodiment Five
Embodiments of the present invention also provide a storage medium containing computer-executable instructions, which when executed by a computer processor, perform a method for training an entity extraction model and/or a method for entity extraction.
Computer storage media for embodiments of the invention may employ any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (11)

1. A method for training an entity extraction model is characterized by comprising the following steps:
acquiring an unlabeled data set and a labeled data set corresponding to the unlabeled data set, and determining new words in the unlabeled data set to form a new word data set;
converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module;
based on the new word data set, enhancing the feature information output by the feature extraction module, and inputting the enhanced feature information to the prediction module to obtain a prediction entity;
and generating a loss function based on the predicted entity and the labeled data set, and carrying out iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model.
2. The method of claim 1, wherein the feature information comprises an emission matrix and a transition probability matrix;
the enhancing processing of the feature information output by the feature extraction module based on the new word data set comprises:
and determining an enhancement coefficient based on the number of new words in the new word data set, and enhancing the transition probability matrix based on the enhancement coefficient.
3. The method according to any one of claims 1-2, wherein the entity extraction model comprises a first extraction model based on character vectors and/or a second extraction model based on word vectors;
the converting each unmarked data in the unmarked data set into a preset format vector, and inputting the preset format vector into an entity extraction model to be trained includes:
converting each unlabeled data item in the unlabeled data set into character vectors, and inputting the character vectors into a first extraction model to be trained; and/or,
and converting each unlabeled data in the unlabeled data set into a word vector, and inputting the word vector to a second extraction model to be trained.
4. The method according to claim 3, wherein the prediction entity is a prediction entity output by the first extraction model, or a prediction entity output by the second extraction model, or a prediction entity obtained by fusing a prediction entity output by the first extraction model and a prediction entity output by the second extraction model.
5. An entity extraction method, comprising:
acquiring data to be processed, and converting the data to be processed into a preset format vector;
inputting the preset format vector into a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed, wherein the entity extraction model is obtained by training based on the entity extraction model training method according to any one of claims 1 to 4.
6. The method of claim 5, wherein after acquiring the data to be processed, the method further comprises:
determining new words in the data to be processed;
the inputting the preset format vector into a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed includes:
and inputting the preset format vector to a feature extraction module of the entity extraction model to obtain feature information, enhancing the feature information based on the new words, and inputting the enhanced feature information to a prediction module of the entity extraction model to obtain a target entity.
7. The method according to claim 5 or 6, wherein the entity extraction model comprises a first extraction model based on character vectors and/or a second extraction model based on word vectors;
the inputting the preset format vector into a pre-trained entity extraction model to obtain a target entity corresponding to the data to be processed includes:
inputting the character vectors converted from the data to be processed into the first extraction model to obtain a first entity; and/or inputting the word vectors converted from the data to be processed into the second extraction model to obtain a second entity;
and determining the first entity or the second entity as a target entity, or fusing the first entity and the second entity to obtain the target entity.
8. An apparatus for training an entity extraction model, comprising:
the new word determining module is used for acquiring an unlabeled data set and a labeled data set corresponding to the unlabeled data set, determining new words in the unlabeled data set and forming a new word data set;
the vector input module is used for converting each unmarked data in the unmarked data set into a preset format vector and inputting the preset format vector into an entity extraction model to be trained, wherein the entity extraction model comprises a feature extraction module and a prediction module;
the information enhancement module is used for enhancing the feature information output by the feature extraction module based on the new word data set, and inputting the enhanced feature information to the prediction module to obtain a prediction entity;
and the model generation module generates a loss function based on the predicted entity and the labeled data set, and performs iterative parameter adjustment on the entity extraction model to obtain a target entity extraction model.
9. An entity extraction apparatus, comprising:
the data conversion module is used for acquiring data to be processed and converting the data to be processed into a preset format vector;
and the entity identification module is used for inputting the preset format vector to a pre-trained entity extraction model and identifying a target entity corresponding to the data to be processed, wherein the entity extraction model is obtained by training based on the entity extraction model training method according to any one of claims 1 to 4.
10. An electronic device, characterized in that the electronic device comprises:
one or more processors;
storage means for storing one or more programs;
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of training an entity extraction model as claimed in any one of claims 1 to 4, and/or the method of entity extraction as claimed in any one of claims 5 to 7.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out a method of training an entity extraction model according to any one of claims 1 to 4, and/or a method of entity extraction according to any one of claims 5 to 7.
CN202110569742.6A 2021-05-25 2021-05-25 Entity extraction method, device, equipment and storage medium Active CN113268452B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110569742.6A CN113268452B (en) 2021-05-25 2021-05-25 Entity extraction method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113268452A true CN113268452A (en) 2021-08-17
CN113268452B CN113268452B (en) 2024-02-02

Family

ID=77232623

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110569742.6A Active CN113268452B (en) 2021-05-25 2021-05-25 Entity extraction method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113268452B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115438190A (en) * 2022-09-06 2022-12-06 国家电网有限公司 Power distribution network fault decision-making assisting knowledge extraction method and system

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107818080A (en) * 2017-09-22 2018-03-20 新译信息科技(北京)有限公司 Term recognition methods and device
CN109271631A (en) * 2018-09-12 2019-01-25 广州多益网络股份有限公司 Segmenting method, device, equipment and storage medium
CN111090987A (en) * 2019-12-27 2020-05-01 北京百度网讯科技有限公司 Method and apparatus for outputting information
CN111295670A (en) * 2019-04-25 2020-06-16 阿里巴巴集团控股有限公司 Identification of entities in electronic medical records
WO2020133039A1 (en) * 2018-12-27 2020-07-02 深圳市优必选科技有限公司 Entity identification method and apparatus in dialogue corpus, and computer device
CN111639498A (en) * 2020-04-21 2020-09-08 平安国际智慧城市科技股份有限公司 Knowledge extraction method and device, electronic equipment and storage medium
CN111651994A (en) * 2020-06-03 2020-09-11 浙江同花顺智能科技有限公司 Information extraction method and device, electronic equipment and storage medium
WO2020232861A1 (en) * 2019-05-20 2020-11-26 平安科技(深圳)有限公司 Named entity recognition method, electronic device and storage medium
WO2021051871A1 (en) * 2019-09-18 2021-03-25 平安科技(深圳)有限公司 Text extraction method, apparatus, and device, and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
YANG Danhao; WU Yuexin; FAN Chunxiao: "A Keyword Extraction Model for Chinese Short Texts Based on an Attention Mechanism", Computer Science, no. 01 *
TIAN Jiayuan; YANG Donghua; WANG Hongzhi: "Research on Medical Named Entity Recognition for Internet Resources", Journal of Frontiers of Computer Science and Technology, no. 06 *
HUANG Sheng; LI Wei; ZHANG Jian: "A Resume Information Entity Extraction Method Based on Deep Learning", Computer Engineering and Design, no. 12 *


Also Published As

Publication number Publication date
CN113268452B (en) 2024-02-02

Similar Documents

Publication Publication Date Title
US7493251B2 (en) Using source-channel models for word segmentation
CN110245348B (en) Intention recognition method and system
CN101707873B (en) Large language models in machine translation
US8744834B2 (en) Optimizing parameters for machine translation
CN110276023B (en) POI transition event discovery method, device, computing equipment and medium
US8380488B1 (en) Identifying a property of a document
JP4974470B2 (en) Representation of deleted interpolation N-gram language model in ARPA standard format
JP7301922B2 (en) Semantic retrieval method, device, electronic device, storage medium and computer program
US20090326916A1 (en) Unsupervised chinese word segmentation for statistical machine translation
CN111739514B (en) Voice recognition method, device, equipment and medium
CN113053367B (en) Speech recognition method, speech recognition model training method and device
CN111079432B (en) Text detection method and device, electronic equipment and storage medium
JP7133002B2 (en) Punctuation prediction method and apparatus
CN112287680B (en) Entity extraction method, device and equipment of inquiry information and storage medium
CN109710951B (en) Auxiliary translation method, device, equipment and storage medium based on translation history
CN111488742B (en) Method and device for translation
CN107111607B (en) System and method for language detection
CN111401078A (en) Running method, device, equipment and medium of neural network text translation model
US11681880B2 (en) Auto transformation of network data models using neural machine translation
CN111326144A (en) Voice data processing method, device, medium and computing equipment
CN112214595A (en) Category determination method, device, equipment and medium
CN113268452B (en) Entity extraction method, device, equipment and storage medium
CN111460224B (en) Comment data quality labeling method, comment data quality labeling device, comment data quality labeling equipment and storage medium
WO2023116572A1 (en) Word or sentence generation method and related device
US20220230633A1 (en) Speech recognition method and apparatus

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant