CN112232070A - Natural language processing model construction method, system, electronic device and storage medium - Google Patents

Natural language processing model construction method, system, electronic device and storage medium

Info

Publication number
CN112232070A
CN112232070A (application number CN202011124616.1A)
Authority
CN
China
Prior art keywords
word vector
natural language
language processing
loss function
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011124616.1A
Other languages
Chinese (zh)
Inventor
张鹏涛
景艳山
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Minglue Zhaohui Technology Co Ltd
Original Assignee
Beijing Minglue Zhaohui Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Minglue Zhaohui Technology Co Ltd filed Critical Beijing Minglue Zhaohui Technology Co Ltd
Priority to CN202011124616.1A priority Critical patent/CN112232070A/en
Publication of CN112232070A publication Critical patent/CN112232070A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • G06F40/151Transformation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention provides a natural language processing model construction method, system, electronic device, and storage medium. The technical scheme of the method extracts information with a joint extraction method and mines information of different granularities, including word vectors and the word vectors corresponding to part-of-speech information, among others. In addition, the method performs negative sampling on the original training data to obtain a batch of negative samples, which alleviates the low-resource problem of the model and increases the recognition difficulty during training. The invention improves the effect of extracting information from unstructured text and improves the robustness of the model.

Description

Natural language processing model construction method, system, electronic device and storage medium
Technical Field
The invention belongs to the field of data processing, and particularly relates to a natural language processing model construction method and system, electronic equipment and a storage medium.
Background
A large amount of unstructured text, particularly news text, exists in the field of language processing. Such texts contain a large number of entities, and different relationships exist among different entities. Effectively extracting information from these unstructured texts can assist automatic text understanding and the construction of knowledge graphs.
The prior art mainly comprises template-based methods, pipeline-based information extraction methods, and semi-supervised information extraction methods. However, the encoders of these methods are not powerful enough and the encoded features are not rich enough; entity recognition and relation classification cannot be trained simultaneously, so joint training cannot be performed on the original data and the error accumulation between models cannot be resolved; and the data cannot be effectively augmented, so the model performance degrades greatly when data is scarce.
Disclosure of Invention
The embodiments of the application provide a natural language processing model construction method, system, electronic device, and storage medium, and aim at least to solve the problem of poor extraction performance on current unstructured text information.
In a first aspect, an embodiment of the present application provides a method for constructing a natural language processing model, including:
s101, marking original training data in a text sample;
s102, carrying out negative sampling on the original training data to obtain negative example data;
s103, combining the original training data and the negative example data into final training data;
s104, obtaining part-of-speech information in the text sample by using a natural language processing tool, and training a first word vector according to the final training data;
s105, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
s106, obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
s107, obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
and S108, adding the entity classification loss function and the relation classification loss function to obtain a combined loss function, and performing back propagation gradient operation on the combined loss function to obtain a natural language processing model.
Preferably, the original training data includes text, entities in the text, corresponding lengths of the text, tag sets of the entities, positions of the text pairs, and tag sets of the text pairs.
Preferably, the step S102 includes designating the number of entities and the number of relationships of the negative example, acquiring different entities, and forming corresponding relationships according to the different entities.
Preferably, the step S104 includes using Word2Vec when converting the first Word vector; the first word vector is a word vector corresponding to the part of speech information.
Preferably, the step S105 includes using RoBERTa when transforming the Word vector and using Word2Vec when transforming the second Word vector.
Preferably, the step S106 includes inputting the CLS vector information, the vector information of the entity length, and the vector information after the entity is maximally pooled into the entity classification model, splicing, and performing classification processing using a softmax function.
Preferably, the step S107 includes: and inputting the vector information of the head entity, the vector information of the tail entity and the vector information obtained by performing maximum pooling on the context information between the two entities into a relational classification model, splicing, and performing classification processing by using a softmax function.
In a second aspect, an embodiment of the present application provides a natural language processing model building system, which is suitable for the above natural language processing model building method, and includes:
a pretreatment unit: marking original training data in a text sample, carrying out negative sampling on the original training data to obtain negative example data, and combining the original training data and the negative example data into final training data;
a vector conversion unit: using a natural language processing tool to obtain part-of-speech information in the text sample, training a first word vector according to the final training data, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
an entity classification loss function acquisition unit: obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
a relationship classification loss function acquisition unit: obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
a model construction unit: and adding the entity classification loss function and the relation classification loss function to obtain a joint loss function, and performing back propagation gradient operation on the joint loss function to obtain a natural language processing model.
In some of these embodiments, the raw training data includes text, entities in the text, lengths to which the text corresponds, a set of labels for the entities, locations for pairs of the text, and a set of labels for pairs of the text.
In some embodiments, the preprocessing unit includes an entity number and a relationship number of a specified negative case, acquires different entities, and forms corresponding relationships according to the different entities.
In some of these embodiments, the vector conversion unit includes using Word2Vec in converting the first Word vector; the first word vector is a word vector corresponding to the part of speech information.
In some of these embodiments, the vector conversion unit includes using RoBERTa when converting the Word vector and using Word2Vec when converting the second Word vector.
In some embodiments, the entity classification loss function obtaining unit inputs the CLS vector information, the vector information of the entity length, and the vector information after the entity is maximally pooled into the entity classification model, and performs the splicing, and performs the classification processing using a softmax function.
In some embodiments, the relationship classification loss function obtaining unit includes: and inputting the vector information of the head entity, the vector information of the tail entity and the vector information obtained by performing maximum pooling on the context information between the two entities into a relational classification model, splicing, and performing classification processing by using a softmax function.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the natural language processing model construction method described in the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements a natural language processing model building method as described in the first aspect above.
Compared with the related art, the natural language processing model construction method provided by the embodiment of the application comprises the following steps:
1. Information is extracted with a joint extraction method, so entity information can be better utilized, the error accumulation caused by a pipeline approach is reduced, and a single model is finally obtained, which is convenient to deploy.
2. Text information can be encoded more fully by adopting encoding information of different granularities, including pos-level information, RoBERTa-level information, word2vec-based word-level information, and the CLS information in RoBERTa.
3. Negative sampling is performed on the original samples to obtain a batch of negative samples, which alleviates the low-resource problem of the model and increases the recognition difficulty during training, thereby improving the robustness of the model.
The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below to provide a more thorough understanding of the application.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flow diagram of a method for constructing a natural language processing model according to an embodiment of the present application;
FIG. 2 is a framework diagram of a natural language processing model building system according to an embodiment of the present application;
FIG. 3 is a block diagram of an electronic device according to an embodiment of the present application;
in the above figures:
11. a pre-processing unit; 12. a vector conversion unit; 13. an entity classification loss function acquisition unit; 14. a relation classification loss function obtaining unit; 15. a model construction unit; 20. a bus; 21. a processor; 22. a memory; 23. a communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number and may refer to the singular or the plural. In this application, the terms "including," "comprising," "having," and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
A large amount of unstructured text exists in the language processing field. These texts contain a large number of entities, and different relationships exist among different entities. Effectively extracting information from such unstructured text can assist automatic text understanding and knowledge graph construction. The embodiments of the invention provide a natural language processing model construction method, system, electronic device, and storage medium, which are applicable to information extraction from unstructured text, for example unstructured text in the news field.
Some of the terms of art to which the invention relates are described below:
information Extraction (IE) refers to extracting corresponding entities and relationships between entities from a piece of text, and specific techniques include named entity identification (NER) and relationship classification (RE). Named Entity Recognition (NER) is a very fundamental task in the fields of NLP, knowledge-graph, etc., aimed at locating and classifying named entities in text into predefined categories such as people, organizations, locations, temporal expressions, quantities, monetary values, percentages, etc. The effect of named entity recognition directly determines the effect of downstream tasks. The relation classification (RE) is a form of text classification, and performs a classification operation on the extracted entity pairs and the text information to obtain a relation of the entity pairs.
Word2vec is a group of related models used to generate word vectors. These models are shallow, two-layer neural networks trained to reconstruct the word contexts of a language: the network takes a word as input and predicts the words in adjacent positions, and under the bag-of-words assumption used in word2vec the order of the words is unimportant. After training is completed, the word2vec model can be used to map each word to a vector that represents word-to-word relationships; this vector is the hidden layer of the neural network.
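A minimal sketch of training such word vectors with gensim's Word2Vec is shown below. The library choice, the toy corpus, and the hyperparameters are illustrative assumptions; the patent does not name a specific Word2Vec implementation.

```python
from gensim.models import Word2Vec

# Hypothetical tokenised training sentences (toy corpus for illustration only).
sentences = [
    ["beijing", "is", "the", "capital", "of", "china"],
    ["the", "company", "is", "headquartered", "in", "beijing"],
]

model = Word2Vec(
    sentences,
    vector_size=100,   # dimensionality of the word vectors
    window=5,          # context window size
    min_count=1,       # keep every word in this toy corpus
    sg=1,              # 1 = skip-gram, 0 = CBOW
)

vector = model.wv["beijing"]                        # 100-dimensional word vector
similar = model.wv.most_similar("beijing", topn=3)  # nearest words by cosine similarity
```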
BERT is a pre-training model released by Google; it is trained with a fixed masking scheme and a relatively small batch size, so its potential was not fully exploited. In 2019, Facebook published a deeper and larger pre-training model, RoBERTa. RoBERTa adopts a larger batch size, more data, longer training time, and longer sentences, uses a dynamic masking method, and removes the NSP task of BERT, so RoBERTa obtains better performance and surpasses BERT on every metric.
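For illustration, RoBERTa token vectors and the leading CLS vector can be obtained with the Hugging Face transformers library; the checkpoint name "roberta-base" and the code below are assumptions used purely as a sketch, not the patent's exact setup.

```python
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

text = "Beijing Minglue Zhaohui Technology is located in Beijing."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = encoder(**inputs)

token_vectors = outputs.last_hidden_state   # (1, seq_len, hidden_size) per-token vectors
cls_vector = token_vectors[:, 0, :]         # vector at the leading <s>/[CLS] position
```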
Softmax is a function used in the classification process to implement multi-class classification. It maps the output neurons to real numbers between 0 and 1, and the normalization guarantees that they sum to 1, so the probabilities of the multiple classes also sum exactly to 1.
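The following is a small, numerically stable implementation of this normalization.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp)            # values in (0, 1) that sum to 1

print(softmax(np.array([2.0, 1.0, 0.1])))   # approx. [0.659, 0.242, 0.099]
```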
Max pooling is a common pooling operation that reduces the amount of data by keeping maximum values; it is generally performed by dividing the input into several rectangular regions and outputting the maximum value of each region. Besides max pooling, average pooling is also commonly used. Pooling reduces the computation passed to the upper hidden layers, is largely unaffected by tilt or rotation of the target, and is a sampling method that effectively reduces the data dimensionality.
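Below is a minimal sketch of max pooling over a span of token vectors, which is how it is applied later to compress an entity span or the context between two entities into a single fixed-size vector; the numbers are illustrative.

```python
import numpy as np

# Hypothetical encoder output: 6 tokens, each a 4-dimensional vector.
token_vectors = np.array([
    [0.1, 0.5, -0.2, 0.0],
    [0.4, 0.1,  0.3, 0.2],
    [0.2, 0.7,  0.1, -0.1],
    [0.0, 0.3,  0.6, 0.5],
    [0.9, 0.2, -0.4, 0.1],
    [0.3, 0.0,  0.2, 0.4],
])

span = token_vectors[1:4]     # tokens belonging to one entity span
pooled = span.max(axis=0)     # element-wise maximum over the span
print(pooled)                 # [0.4, 0.7, 0.6, 0.5]
```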
Referring to fig. 1, a flowchart of a method for constructing a natural language processing model according to an embodiment of the present application includes the following steps:
s101, marking original training data in a text sample;
s102, carrying out negative sampling on the original training data to obtain negative example data;
s103, combining the original training data and the negative example data into final training data;
s104, obtaining part-of-speech information in the text sample by using a natural language processing tool, and training a first word vector according to the final training data;
s105, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
s106, obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
s107, obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
and S108, adding the entity classification loss function and the relation classification loss function to obtain a combined loss function, and performing back propagation gradient operation on the combined loss function to obtain a natural language processing model.
To solve the problem of error accumulation caused by a pipeline approach in the prior art, the embodiment of the invention provides a joint method that learns the entities in the text and the relationships between entity pairs at the same time, i.e., the entity classification loss function and the relation classification loss function are considered simultaneously.
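A minimal sketch of this joint objective is shown below: the two losses are added and back-propagated together, as in step S108. The model interface, the choice of cross-entropy losses, and the batch fields are hypothetical placeholders, not the patent's exact implementation.

```python
import torch
import torch.nn as nn

entity_criterion = nn.CrossEntropyLoss()
relation_criterion = nn.CrossEntropyLoss()

def training_step(model, optimizer, batch):
    # Assumed model interface: returns entity logits and relation logits for the batch.
    entity_logits, relation_logits = model(batch)
    entity_loss = entity_criterion(entity_logits, batch["entity_labels"])
    relation_loss = relation_criterion(relation_logits, batch["relation_labels"])

    joint_loss = entity_loss + relation_loss   # add the two losses (step S108)
    optimizer.zero_grad()
    joint_loss.backward()                      # back-propagate the gradient of the joint loss
    optimizer.step()
    return joint_loss.item()
```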
It should be noted that the steps illustrated in the above-described flow diagrams or in the flow diagrams of the figures may be performed in a computer system, such as a set of computer-executable instructions, and that, although a logical order is illustrated in the flow diagrams, in some cases, the steps illustrated or described may be performed in an order different than here.
The original training data comprises a text, entities in the text, lengths corresponding to the text, a label set of the entities, positions of text pairs and a label set of the text pairs.
The step S102 includes designating the number of entities and the number of relationships of the negative examples, acquiring different entities, and forming corresponding relationships according to the different entities.
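A possible sketch of this negative sampling is shown below. The concrete sampling strategy, span limits, and data structures are assumptions; the patent only specifies that the numbers of negative entities and relations are designated and that corresponding relations are formed from different entities.

```python
import random

def sample_negatives(tokens, gold_entities, gold_relations,
                     num_neg_entities=10, num_neg_relations=10, max_span_len=5):
    """Sample negative entity spans and negative relation pairs (illustrative only)."""
    gold_spans = {(e["start"], e["end"]) for e in gold_entities}
    gold_pairs = {(r["head"], r["tail"]) for r in gold_relations}
    neg_entities, neg_relations = [], []

    # Negative entities: random token spans that are not annotated as entities.
    for _ in range(100 * num_neg_entities):
        if len(neg_entities) >= num_neg_entities:
            break
        start = random.randrange(len(tokens))
        end = min(len(tokens), start + random.randint(1, max_span_len))
        if (start, end) not in gold_spans:
            neg_entities.append({"start": start, "end": end, "label": "none"})

    # Negative relations: ordered pairs of gold entities with no annotated relation.
    for _ in range(100 * num_neg_relations):
        if len(neg_relations) >= num_neg_relations or len(gold_entities) < 2:
            break
        head, tail = random.sample(range(len(gold_entities)), 2)
        if (head, tail) not in gold_pairs:
            neg_relations.append({"head": head, "tail": tail, "label": "no_relation"})

    return neg_entities, neg_relations
```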
Wherein, the step S104 comprises using Word2Vec when converting the first Word vector; the first word vector is a word vector corresponding to the part of speech information.
RoBERTa is adopted as the encoder. RoBERTa adopts measures such as a larger batch size, more data, longer training time, longer sentences, and dynamic masking, so its effect is better than that of BERT on every data set. In addition, the length of each entity is taken into account: each entity is assigned a width vector. Since the BERT-series models tokenize at the byte/subword level and consider neither word-level nor pos-level information, the embodiment of the invention proposes to blend word-vector information and pos-level information into the feature encoding.
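A minimal sketch of this multi-granularity fusion for a single token is shown below; the vector dimensions and the simple concatenation strategy are illustrative assumptions.

```python
import torch

roberta_dim, word_dim, pos_dim = 768, 100, 50   # assumed dimensions

roberta_vec = torch.randn(roberta_dim)   # subword-level vector from the RoBERTa encoder
word_vec = torch.randn(word_dim)         # word-level vector from the Word2Vec model
pos_vec = torch.randn(pos_dim)           # vector of the token's part-of-speech tag

token_feature = torch.cat([roberta_vec, word_vec, pos_vec], dim=-1)
print(token_feature.shape)               # torch.Size([918])
```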
Wherein, the step S105 comprises using RoBERTA when converting the Word vector and using Word2Vec when converting the second Word vector.
In some of these embodiments, the pre-training model used to convert the word vector may also be any of XLNet, ALBERT, or T5.
The step S106 includes inputting the CLS vector information, the vector information of the entity length, and the max-pooled vector information of the entity into the entity classification model, concatenating them, and performing classification using a softmax function.
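A minimal sketch of such an entity classifier is given below; the layer sizes, the width-embedding design, and the single linear layer are illustrative assumptions rather than the patent's exact configuration.

```python
import torch
import torch.nn as nn

class EntityClassifier(nn.Module):
    def __init__(self, hidden_size=768, width_dim=25, max_width=30, num_entity_types=5):
        super().__init__()
        self.width_embedding = nn.Embedding(max_width, width_dim)           # entity-length vector
        self.classifier = nn.Linear(hidden_size * 2 + width_dim, num_entity_types)

    def forward(self, cls_vector, span_vectors, span_width):
        # cls_vector: (batch, hidden); span_vectors: (batch, span_len, hidden); span_width: (batch,)
        pooled_span = span_vectors.max(dim=1).values       # max pooling over the entity span
        width_vec = self.width_embedding(span_width)        # (batch, width_dim)
        features = torch.cat([cls_vector, width_vec, pooled_span], dim=-1)
        logits = self.classifier(features)
        return torch.softmax(logits, dim=-1)                 # entity type probabilities
```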
Wherein the step S107 includes: inputting the vector information of the head entity, the vector information of the tail entity, and the max-pooled vector of the context information between the two entities into the relation classification model, concatenating them, and performing classification using a softmax function.
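A corresponding sketch of the relation classifier follows, again with illustrative dimensions and a single linear layer assumed for simplicity.

```python
import torch
import torch.nn as nn

class RelationClassifier(nn.Module):
    def __init__(self, hidden_size=768, num_relations=10):
        super().__init__()
        self.classifier = nn.Linear(hidden_size * 3, num_relations)

    def forward(self, head_vector, tail_vector, context_vectors):
        # head_vector / tail_vector: (batch, hidden)
        # context_vectors: (batch, context_len, hidden) tokens between the two entities
        pooled_context = context_vectors.max(dim=1).values   # max pooling over the context
        features = torch.cat([head_vector, tail_vector, pooled_context], dim=-1)
        logits = self.classifier(features)
        return torch.softmax(logits, dim=-1)                  # relation probabilities
```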
The embodiment of the application provides a natural language processing model construction system, which is suitable for the natural language processing model construction method. As used hereinafter, the terms "module," "unit," "subunit," and the like may implement a combination of software and/or hardware for a predetermined function. Although the means described in the embodiments below are preferably implemented in software, an implementation in hardware or a combination of software and hardware is also possible and contemplated.
Fig. 2 is a framework diagram of a natural language processing model building system according to an embodiment of the present application, and includes a preprocessing unit 11, a vector transformation unit 12, an entity classification loss function obtaining unit 13, a relationship classification loss function obtaining unit 14, and a model building unit 15, where:
the preprocessing unit 11: marking original training data in a text sample, carrying out negative sampling on the original training data to obtain negative example data, and combining the original training data and the negative example data into final training data;
the vector conversion unit 12: using a natural language processing tool to obtain part-of-speech information in the text sample, training a first word vector according to the final training data, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
entity classification loss function acquisition unit 13: obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
the relationship classification loss function acquisition unit 14: obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
the model construction unit 15: and adding the entity classification loss function and the relation classification loss function to obtain a joint loss function, and performing back propagation gradient operation on the joint loss function to obtain a natural language processing model.
In some of these embodiments, the raw training data includes text, entities in the text, lengths to which the text corresponds, a set of labels for the entities, locations for pairs of the text, and a set of labels for pairs of the text.
In some embodiments, the preprocessing unit 11 includes an entity number and a relationship number of a specified negative example, obtains different entities, and forms corresponding relationships according to the different entities.
In some of these embodiments, the vector conversion unit 12 includes using Word2Vec in converting the first Word vector; the first word vector is a word vector corresponding to the part of speech information.
In some of these embodiments, the vector conversion unit 12 includes RoBERTa for converting the Word vector and Word2Vec for converting the second Word vector.
In some embodiments, the entity classification loss function obtaining unit 13 inputs and splices CLS vector information, vector information of entity length, and vector information after the entity is maximally pooled into an entity classification model, and performs classification processing using a softmax function.
In some of these embodiments, the relationship classification loss function obtaining unit 14 includes: and inputting the vector information of the head entity, the vector information of the tail entity and the vector information obtained by performing maximum pooling on the context information between the two entities into a relational classification model, splicing, and performing classification processing by using a softmax function.
The above units may be functional units or program units, and may be implemented by software or hardware. For units implemented by hardware, the units may be located in the same processor; or the units may be located in different processors in any combination.
In addition, the method for constructing the natural language processing model according to the embodiment of the present application described in conjunction with fig. 1 may be implemented by an electronic device. Fig. 3 is a schematic diagram of a hardware structure of an electronic device according to an embodiment of the present application.
The computer device may comprise a processor 21 and a memory 22 in which computer program instructions are stored.
Specifically, the processor 21 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 22 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 22 may include a Hard Disk Drive (Hard Disk Drive, abbreviated to HDD), a floppy Disk Drive, a Solid State Drive (SSD), flash memory, an optical Disk, a magneto-optical Disk, magnetic tape, or a Universal Serial Bus (USB) Drive or a combination of two or more of these. Memory 22 may include removable or non-removable (or fixed) media, where appropriate. The memory 22 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 22 is a Non-Volatile (Non-Volatile) memory. In particular embodiments, Memory 22 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically rewritable ROM (EAROM), or FLASH Memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), an Extended data output Dynamic Random-Access Memory (EDODRAM), a Synchronous Dynamic Random-Access Memory (SDRAM), and the like.
The memory 22 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 21.
The processor 21 realizes any one of the natural language processing model construction methods in the above embodiments by reading and executing computer program instructions stored in the memory 22.
In some of these embodiments, the computer device may also include a communication interface 23 and a bus 20. As shown in FIG. 3, the processor 21, the memory 22, and the communication interface 23 are connected via the bus 20 to complete mutual communication.
The communication interface 23 may implement data communication with other components, such as external devices, image/data acquisition devices, databases, external storage, and image/data processing workstations.
The bus 20 includes hardware, software, or both to couple the components of the electronic device to one another. The bus 20 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, the bus 20 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. The bus 20 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The computer device may execute a natural language processing model building method in the embodiment of the present application.
In addition, in combination with the natural language processing model construction method in the foregoing embodiment, the embodiment of the present application may be implemented by providing a computer-readable storage medium. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the natural language processing model construction methods of the embodiments described above.
And the aforementioned storage medium includes: various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A natural language processing model construction method is characterized by comprising the following steps:
s101, marking original training data in a text sample;
s102, carrying out negative sampling on the original training data to obtain negative example data;
s103, combining the original training data and the negative example data into final training data;
s104, obtaining part-of-speech information in the text sample by using a natural language processing tool, and training a first word vector according to the final training data;
s105, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
s106, obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
s107, obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
and S108, adding the entity classification loss function and the relation classification loss function to obtain a combined loss function, and performing back propagation gradient operation on the combined loss function to obtain a natural language processing model.
2. The method of constructing a natural language processing model of claim 1, wherein the raw training data includes text, entities in the text, lengths to which the text corresponds, a set of labels for the entities, locations of pairs of the text, and a set of labels for the pairs of the text.
3. The method for constructing a natural language processing model according to claim 1, wherein the step S102 comprises specifying the number of entities and the number of relationships of the negative examples, obtaining different entities, and composing corresponding relationships according to the different entities.
4. The natural language processing model building method of claim 1, wherein the step S104 includes using Word2Vec when transforming the first Word vector; the first word vector is a word vector corresponding to the part of speech information.
5. The method of constructing a natural language processing model of claim 1 wherein said step S105 comprises using RoBERTa for translating said Word vector and using Word2Vec for translating said second Word vector.
6. The method for constructing a natural language processing model according to claim 1, wherein the step S106 comprises inputting and splicing CLS vector information, vector information of entity length, and vector information after the entity is maximally pooled into the entity classification model, and performing classification processing using a softmax function.
7. The natural language processing model building method of claim 1, wherein the step S107 includes: and inputting the vector information of the head entity, the vector information of the tail entity and the vector information obtained by performing maximum pooling on the context information between the two entities into a relational classification model, splicing, and performing classification processing by using a softmax function.
8. A natural language processing model building system, comprising:
a pretreatment unit: marking original training data in a text sample, carrying out negative sampling on the original training data to obtain negative example data, and combining the original training data and the negative example data into final training data;
a vector conversion unit: using a natural language processing tool to obtain part-of-speech information in the text sample, training a first word vector according to the final training data, converting words in the text sample into a word vector and a second word vector, and combining the word vector and the second word vector;
an entity classification loss function acquisition unit: obtaining an entity classification loss function of the text sample according to the first word vector, the word vector and the second word vector;
a relationship classification loss function acquisition unit: obtaining a relation classification loss function of the text sample according to the relation information of the first word vector, the second word vector and the text sample;
a model construction unit: and adding the entity classification loss function and the relation classification loss function to obtain a joint loss function, and performing back propagation gradient operation on the joint loss function to obtain a natural language processing model.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the natural language processing model building method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the natural language processing model construction method according to any one of claims 1 to 7.
CN202011124616.1A 2020-10-20 2020-10-20 Natural language processing model construction method, system, electronic device and storage medium Pending CN112232070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011124616.1A CN112232070A (en) 2020-10-20 2020-10-20 Natural language processing model construction method, system, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011124616.1A CN112232070A (en) 2020-10-20 2020-10-20 Natural language processing model construction method, system, electronic device and storage medium

Publications (1)

Publication Number Publication Date
CN112232070A true CN112232070A (en) 2021-01-15

Family

ID=74118087

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011124616.1A Pending CN112232070A (en) 2020-10-20 2020-10-20 Natural language processing model construction method, system, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN112232070A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860871A (en) * 2021-03-17 2021-05-28 网易(杭州)网络有限公司 Natural language understanding model training method, natural language understanding method and device
CN113762381A (en) * 2021-09-07 2021-12-07 上海明略人工智能(集团)有限公司 Emotion classification method, system, electronic device and medium
CN113761872A (en) * 2021-09-07 2021-12-07 上海明略人工智能(集团)有限公司 Data detection method, system, electronic device and medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581387A (en) * 2020-05-09 2020-08-25 电子科技大学 Entity relation joint extraction method based on loss optimization

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111581387A (en) * 2020-05-09 2020-08-25 电子科技大学 Entity relation joint extraction method based on loss optimization

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
杨绍雄 (YANG Shaoxiong): "Research on a Distantly Supervised Deep Entity Relation Extraction Method Based on Improved Semantic Hypotheses", China Masters' Theses Full-text Database, Information Science and Technology, No. 12, 15 December 2018 (2018-12-15), page 13 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860871A (en) * 2021-03-17 2021-05-28 网易(杭州)网络有限公司 Natural language understanding model training method, natural language understanding method and device
CN112860871B (en) * 2021-03-17 2022-06-14 网易(杭州)网络有限公司 Natural language understanding model training method, natural language understanding method and device
CN113762381A (en) * 2021-09-07 2021-12-07 上海明略人工智能(集团)有限公司 Emotion classification method, system, electronic device and medium
CN113761872A (en) * 2021-09-07 2021-12-07 上海明略人工智能(集团)有限公司 Data detection method, system, electronic device and medium
CN113762381B (en) * 2021-09-07 2023-12-19 上海明略人工智能(集团)有限公司 Emotion classification method, system, electronic equipment and medium

Similar Documents

Publication Publication Date Title
CN112115267B (en) Training method, device, equipment and storage medium of text classification model
CN111460820B (en) Network space security domain named entity recognition method and device based on pre-training model BERT
CN112232070A (en) Natural language processing model construction method, system, electronic device and storage medium
CN111191032B (en) Corpus expansion method, corpus expansion device, computer equipment and storage medium
CN112507704B (en) Multi-intention recognition method, device, equipment and storage medium
CN112417859A (en) Intention recognition method, system, computer device and computer-readable storage medium
US20220300708A1 (en) Method and device for presenting prompt information and storage medium
CN113722438A (en) Sentence vector generation method and device based on sentence vector model and computer equipment
CN112417878A (en) Entity relationship extraction method, system, electronic equipment and storage medium
CN111767714B (en) Text smoothness determination method, device, equipment and medium
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
CN115759254A (en) Question-answering method, system and medium based on knowledge-enhanced generative language model
CN113449081A (en) Text feature extraction method and device, computer equipment and storage medium
CN112199954A (en) Disease entity matching method and device based on voice semantics and computer equipment
CN112528653A (en) Short text entity identification method and system
CN112035622A (en) Integrated platform and method for natural language processing
CN116561320A (en) Method, device, equipment and medium for classifying automobile comments
CN110705258A (en) Text entity identification method and device
CN115859999A (en) Intention recognition method and device, electronic equipment and storage medium
CN113255334A (en) Method, system, electronic device and storage medium for calculating word vector
CN114116975A (en) Multi-intention identification method and system
CN113536773A (en) Commodity comment sentiment analysis method and system, electronic equipment and storage medium
CN112329445A (en) Disorder code judging method, disorder code judging system, information extracting method and information extracting system
CN112149389A (en) Resume information structured processing method and device, computer equipment and storage medium
CN110928987A (en) Legal provision retrieval method based on neural network hybrid model and related equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination