CN112926325A - Chinese character relation extraction construction method based on BERT neural network - Google Patents

Chinese character relation extraction construction method based on BERT neural network

Info

Publication number
CN112926325A
CN112926325A (application CN202110186063.0A)
Authority
CN
China
Prior art keywords
relation
relationship
service
sentence
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110186063.0A
Other languages
Chinese (zh)
Inventor
刘登涛
张建
王谦超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Technology
Original Assignee
Beijing University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Technology filed Critical Beijing University of Technology
Priority to CN202110186063.0A priority Critical patent/CN112926325A/en
Publication of CN112926325A publication Critical patent/CN112926325A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a Chinese person-relation extraction method based on a BERT neural network, aimed at the technical problem that person relations are difficult to extract from unstructured Chinese text. The system comprises: a data collection module for acquiring, cleaning, and labeling training data; a feature-acquisition and relation-extraction module covering sentence context encoding, named entity recognition, and entity relation prediction; and a relation storage module and application service module, which store the person-entity relation graph extracted from the text in a graph database, design core business-service APIs (application programming interfaces) on a service-oriented architecture (SOA), provide data-interaction services, and package the system application in Docker containers so that the system platform is highly portable and extensible, finally providing a textual person-relation knowledge-graph function. The invention effectively mitigates the low recall and inaccurate relations of existing person-relation extraction, so as to achieve a better relation-extraction platform service.

Description

Chinese character relation extraction construction method based on BERT neural network
Technical Field
The invention belongs to the field of computer natural language processing and relates to a method for extracting person relationships from Chinese text based on a BERT neural network. Unlike the currently mainstream approach based on traditional dependency syntax analysis, this is a deep-learning extraction method. Compared with traditional Chinese person-relation extraction, a BERT-based deep-learning system can to some extent relieve the defects of traditional extraction models: feature selection is completed by model training, and a high-performance relation-extraction model is obtained without manual intervention. On the final text extraction, both precision and recall are high.
Background
With the advent of the big-data era and the continuous development of information technology, today's society has entered the Artificial Intelligence (AI) era. AI application products keep appearing and play important roles in many fields, shining in brand-new scenarios that combine AI with traditional industries, such as AI + agriculture, AI + medical care, AI + autonomous driving, and AI + education. Within AI there is a very important subfield, Natural Language Processing (NLP), and person-relation extraction, as one of the important tasks of NLP, has very broad application prospects.
With the big-data era, data volumes keep growing and the problem of information overload becomes increasingly serious; facing rapidly growing, cross-domain, massively heterogeneous, and fragmented text data, acquiring key information quickly, efficiently, and accurately is of great significance. Extracting key information from data for analysis has therefore become a research hotspot, and information extraction needs high-quality semantic entities for support. Entity relation extraction plays an important role here: it can extract specific entities, events, relations, and other information from massive unstructured data on the web, convert them into representations that match how humans understand the world, form structured data stored in a database, and provide organized, queryable, usable knowledge to all kinds of users. In the process from big data to big knowledge [3], person-relation extraction research has important significance for intelligent semantic search, person knowledge-graph construction, question-answering systems, and other fields.
(1) Application in intelligent semantic search
With the development of the internet, people increasingly rely on search engines; when they meet an unsolved problem or unknown information, they habitually turn to search engines such as Baidu, ***, Bing, or 360 for the desired answer. As search-engine technology develops, retrieval is no longer simple keyword matching: the engine associates queries with semantic knowledge, finds and returns the information the user really needs, supports exact search of numbers and letters rather than fuzzy matching, and keeps its database synchronized in real time through additions and deletions. Entity-relation extraction is one of the technologies behind the convenience search engines provide. For example, when the query "what weapon does Zhu Bajie's senior fellow apprentice use" is entered in the search box, the search engine directly returns the name (Sun Wukong) and aliases of that fellow apprentice despite the complexity of the query. This shows that while producing the result the engine performs semantic analysis on the user's input, reads the question through entity-relation extraction, extracts the question's subject, links it to the corresponding nodes in a knowledge graph, deduplicates the results, and pushes the exact answer the user wants.
(2) Application of figure knowledge graph construction
The first step of person knowledge-graph construction is information extraction, and person-relation extraction is one of its core parts: through person recognition and relation extraction, valid relation triples can be obtained, and finally a knowledge graph between people is constructed to support large-scale knowledge mining and reasoning services. For example, in a family character graph one can easily see the characters directly related to the protagonist, their profiles, and their relationships to the protagonist; in historical-figure graphs, using the graph to grasp and evaluate historical figures plays an important role in the study and exploration of history; in a novel's character graph, the relationships among characters can be sorted out and clarified more quickly, leading to a deeper understanding of the novel.
Person-relation extraction research is important for such systems because it enables the automated construction of such knowledge graphs. Research on person-relation extraction algorithms and their underlying theory therefore has important practical significance.
(3) Application in question-answering system
Before answering a question provided by the user, the question-answering system must analyze the question's content and extract the semantic relation between the person pair so that the system can automatically give a reasonable answer. For example, when answering a question like "what weapon does Zhu Bajie's senior fellow apprentice use?", a person-relation extraction technique builds the connection between the persons; through the triple <Zhu Bajie, fellow apprentice, Sun Wukong> the question can be converted into "what weapon does Sun Wukong use?", and the answer can then be returned automatically by querying, instead of returning a large result set of web pages through information retrieval as in traditional methods.
In conclusion, entity relation extraction research can be widely applied in intelligent semantic search, person knowledge-graph construction, question-answering systems, and other fields; it promotes the development of natural language processing, artificial intelligence, and related technologies and advances academic research.
At present, most Chinese person-relation extraction methods are based on dependency syntax analysis: one-to-one dependency relations are modeled through the grammar rules of different languages in order to extract person relations. However, such methods require manually defining a large number of grammar rules; the feature information produced by dependency parsing is very raw, and with the intervention of subjective human factors, the re-processed features that must be fed in accumulate serious errors that hurt extraction accuracy. Applying the deep-learning idea to the person-relation extraction task therefore has important research significance: a BERT-based deep-learning system can to some extent relieve the defects of relation-extraction models based on dependency syntax analysis, since feature selection is completed by model training and a high-performance relation-extraction model is obtained without manual intervention, with high precision and recall on final text extraction.
The invention relates to a Chinese person-relation extraction method based on a BERT neural network. The text preprocessing layer has two parts: the first is acquisition of the training corpus, person-name NER recognition, and text masking; the second is BERT Embedding, which converts the text into word vectors. The feature-extraction layer is composed of BI-LSTM, Dropout, residual mapping, and Multi-head Self-Attention; it extracts features from the preprocessed text and captures the sentence's deeper high-dimensional semantic features. The relation-prediction layer locates the relation-word position index via softmax and matches it against a relation dictionary; after the input text passes through feature extraction, this layer finally determines the relation word between two persons and forms the relation triple.
The invention adopts the following technical scheme and implementation steps:
1. technical scheme for extracting overall process of Chinese character relationship
Chinese person-relation extraction is a very important part of the information-extraction field. Its main task is to extract the person relations contained in unstructured text, so that much follow-up research can be carried out on the extracted relations. The overall flow framework of the invention's Chinese person-relation extraction is shown in FIG. 1.
The formal definition is as follows: for a sentence S containing person relations, S = {W1, W2, W3, W4, ..., Wn}, where W denotes a character of the sentence and n denotes each character's position index. Person-relation extraction extracts the relation between two person entities W_{i-j} and W_{k-l} in the sentence; if the sentence contains more than 2 person entities, the relation of every pairwise combination is returned separately. Each extracted relation is expressed in the form of a triple, as shown in the following formula:
R(S) = < W_{i-j}, relation, W_{k-l} >
Each person-relation triple so obtained is one relation instance in the sentence S. For example, from the sentence "Wang Gang and his wife Wei Xiaohong and his cousin Wang Xiaoqiang go to the dining room to eat", three persons "Wang Gang", "Wei Xiaohong", and "Wang Xiaoqiang" can be identified, and the pairwise combinations of the three yield three triples: <"Wang Gang", "wife", "Wei Xiaohong">, <"Wang Gang", "cousin", "Wang Xiaoqiang">, and <"Wei Xiaohong", "unknown", "Wang Xiaoqiang">, where "unknown" in the third triple indicates that the two corresponding persons have no relation in this sentence.
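The pairwise-combination step above can be sketched in a few lines; this is a minimal illustration of the enumeration, not the patent's implementation, and the names are the (reconstructed) example names from the text:

```python
from itertools import combinations

def candidate_pairs(entities):
    """Return every unordered pair of person entities found in a sentence.

    For a sentence with k recognized names, the model must judge
    C(k, 2) pairwise relations; pairs with no relation in the sentence
    are labeled "unknown" downstream.
    """
    return list(combinations(entities, 2))

pairs = candidate_pairs(["Wang Gang", "Wei Xiaohong", "Wang Xiaoqiang"])
```

For three entities this yields exactly the three pairs whose triples are listed above.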
2. Constructing a character relation data corpus:
Because publicly available Chinese person-relation extraction corpora are very scarce and there is currently no authoritative Chinese person-relation extraction dataset, this project uses a distant-supervision method to obtain a training corpus. 10,000 person-relation pairs were downloaded from Hudong Baike (FIG. 2); to avoid infringing others' privacy, all names displayed in this project are replaced with fictitious persons. A crawler then appends each person-relation pair to a Baidu search URL, performs the search, and crawls the retrieved text corpus into a database for storage. After manual cleaning and labeling to construct the person-relation corpus, 30,000 sentences containing person relations were finally obtained as the dataset of this project's Chinese person-relation extraction experiment; after the dataset is randomly shuffled, 28,000 sentences serve as the training set and 2,000 as the test set. A partial dataset sample is shown in FIG. 3.
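The shuffle-and-split described above can be sketched as follows; the fixed seed and the sentence placeholders are assumptions for illustration only:

```python
import random

def split_corpus(sentences, test_size=2000, seed=42):
    """Shuffle the labeled sentences, then hold out a fixed-size test set,
    mirroring the 28000/2000 split described above (seed is an assumption)."""
    data = list(sentences)
    random.Random(seed).shuffle(data)
    return data[test_size:], data[:test_size]

corpus = [f"sentence-{i}" for i in range(30000)]   # placeholder corpus
train, test = split_corpus(corpus)
```

Shuffling before the split keeps the relation-class mix of the test set close to that of the training set.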
3. Chinese character relation extraction model
The invention proposes a deep-learning BERT + BI-LSTM + Multi-head Self-Attention + FC Chinese person-relation extraction model. In extracting Chinese person relations, the BERT pre-trained model plays a crucial role: the BERT network is based on the encoder of the Transformer network, i.e., a large-scale network built on the Attention mechanism, which can effectively solve the long-term dependence, non-parallelizable training, and low efficiency brought by sequence models. However, the BERT network model alone cannot efficiently extract the corpus's feature information, because although BERT is pre-trained on open-source corpora, the whole network's parameters are not trained for the person-relation extraction task; the subsequent BI-LSTM + Multi-head Self-Attention + FC model part is therefore needed to extract high-dimensional semantic and other features of the incoming text.
The main idea of the BERT-based bidirectional long short-term memory network model is to add a BI-LSTM layer and a Multi-head Self-Attention layer on top of the BERT model, finally splicing on a nonlinear mapping layer and a softmax output layer. Its structure, shown in FIG. 3, consists of 6 layers, from top to bottom:
1. Input layer: this layer preprocesses the training corpus and feeds it into the model. First the number of person names in the sentence is determined, and the sentence is copied until the number of copies equals the number of pairwise combinations of those names. Then a mask operation is applied: in each copy, the two names whose relation is to be judged are masked, i.e., each name is replaced by a run of '#' of the same length as the name. Masking the names serves two main purposes. First, marking: several names may appear in one sentence, and the distinct masked positions mark which two persons' relation is being judged. Second, given the known positions of the names in the sentence, the relation between the two persons depends on the sentence's expression and semantics, not on the names of the two person entities themselves; masking the two positions therefore loses no information from the text sentence while removing noise caused by different ways of addressing a name.
For example, the input sentence "Wang Gang and his wife Wei Xiaohong and his cousin Wang Xiaoqiang go to the dining room to eat" is preprocessed into the three inputs shown in Table 1; the three sentences are then separately passed into the subsequent model to obtain the relation between the two persons masked in each sentence.
(Table 1: the three masked variants of the example sentence, one per person pair)
TABLE 1
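The copy-and-mask preprocessing can be sketched as follows; this is a minimal illustration under the assumption that names never overlap as substrings, and the Chinese names are reconstructed from the example:

```python
from itertools import combinations

def mask_pair(sentence, name_a, name_b, mask_char="#"):
    """Replace the two names whose relation is being judged with '#' runs
    of the same length, as the input layer describes."""
    masked = sentence.replace(name_a, mask_char * len(name_a))
    return masked.replace(name_b, mask_char * len(name_b))

def expand_sentence(sentence, names):
    """Copy the sentence once per pairwise name combination and mask each pair."""
    return [mask_pair(sentence, a, b) for a, b in combinations(names, 2)]

rows = expand_sentence("王刚和妻子魏晓红以及堂弟王小强去食堂吃饭",
                       ["王刚", "魏晓红", "王小强"])
```

Each masked copy keeps the third name visible, so the model sees exactly which pair is under judgment.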
2. BERT embedding layer:
In this layer the BERT network maps each character of the incoming sentence, after weighted summation, into a low-dimensional vector space, i.e., produces a word vector for each character; it extracts contextual features of the corpus to a certain extent, and the resulting word vector W of each character of the input sentence serves as the input of the next BI-LSTM layer.
3. BI-LSTM layer:
This layer takes the text word vectors output by the BERT embedding layer and extracts features in both the forward and backward directions by constructing forward and backward LSTM neural networks, which helps with long-range dependence of sentences in person-relation extraction. Its main purpose is to feed the feature information behind the BERT network into a BI-LSTM network model and extract high-dimensional semantic features.
4. Dropout layer
During model training the model has many parameters but comparatively few training samples, so the trained model easily overfits. In the forward pass, the Dropout layer stops the activation of a neuron with a certain probability p; for example, with p set to 0.5, only a random 50% of all neurons are trained in each pass. The model thus does not over-rely on certain local features, generalizes better, and overfitting is effectively relieved, which also achieves a regularization effect to a certain extent.
The third (BI-LSTM) and fourth (Dropout) layers are cycled three times in total, and each cycle adds a residual structure.
5. Multi-head Self-Attention layer:
Layer-by-layer stacked Self-Attention is used to obtain a new representation of each character that takes context information into account.
6. Nonlinear mapping layer and softmax output layer:
This layer's main work is to pass the depth feature vectors produced by the upper modules through a softmax function classifier, predict the classification of the person relation, and compare it with the manually labeled relation classification. Concretely, the post-Attention result of the upper module goes through two fully connected layers for dimension conversion; the output is then mapped by softmax to real numbers between 0 and 1 whose normalized sum is 1. The position of the maximum value in the softmax result is taken as the final output: it represents the index, within the corresponding corpus sentence, of the first character of the relation word between the two persons.
Most traditional models treat person-relation prediction as classification, but the traditional approach of defining classes in advance and then classifying has a notable defect that affects extraction accuracy: class imbalance in the training-set samples. In the Chinese person-relation extraction field, data for some classes is hard to obtain; for example, data for distant relatives or superior-subordinate relations is far scarcer than data for spouses, parents, and siblings, so the final crawled training corpus is severely imbalanced and the model trains poorly. This algorithm therefore adopts a reading-comprehension approach and returns the position index of the relation word in the sentence: the final output of the softmax function classifier is the index of the first character of the relation word between the two persons in the incoming sentence. Then, starting from that index, the next three characters and then two characters of the sentence are looked up in a prepared relation dictionary until a complete relation word is matched, finally yielding the relation of the current two-person triple.
Description of the drawings:
FIG. 1 is a flow chart of the Chinese character relationship extraction based on the BERT neural network
FIG. 2 is an example of Chinese character relationship pairs
FIG. 3 is an example of Chinese character relational corpora
FIG. 4 BI-LSTM network diagram
FIG. 5 is a diagram of a Chinese character relationship extraction model architecture based on the BERT neural network
FIG. 6 is a partial extraction result display diagram
FIG. 7 is a diagram of a process for visualizing extraction of relationships between people
FIG. 8 is a system architecture diagram of an embodiment of the present invention
FIG. 9 is a diagram of the structure of a website showing page
The specific implementation mode is as follows:
a Chinese character relationship extraction method based on a BERT neural network.
Comprises the following steps:
In the text preprocessing layer, the incoming text is expanded according to the number of person entities and the mask operation is applied.
In the BERT layer, during the Embedding process, a given Chinese sentence S composed of n characters, S = {W1, W2, W3, W4, W5, ..., Wn}, has each character Wi converted by BERT into a corresponding word vector ei of dimension 768x1, i.e., E = {e1, e2, e3, e4, e5, ..., en}.
In the BI-LSTM layer (FIG. 4), since the BI-LSTM network contains two sub-LSTM networks, forward and backward respectively, each LSTM pass applies equations (1)-(6), where et is the input at the current time step, ht-1 is the memory output of the previous step, Wf, Wi, Wo are model parameters, and a Dropout layer of 0.5 is used. The forget gate ft, input gate it, and output gate ot are computed, with the sigmoid function normalizing the output to between 0 and 1. From the input data, the two sub-models finally compute the outputs {hL0, hL1, hL2, hL3, ..., hLn} and {hR0, hR1, hR2, hR3, ..., hRn}; the forward and backward hidden vectors at each time step are spliced, giving {[hL0:hRn], [hL1:hRn-1], [hL2:hRn-2], ..., [hLn-1:hR1], [hLn:hR0]}, i.e., the BI-LSTM output h = {h0, h1, h2, h3, ..., hn}. L denotes the left-to-right LSTM network and R the right-to-left LSTM network.
ft = sigmoid(Dropout(Wf, 0.5)[ht-1, et] + bf) (1)
it = sigmoid(Dropout(Wi, 0.5)[ht-1, et] + bi) (2)
ot = sigmoid(Dropout(Wo, 0.5)[ht-1, et] + bo) (3)
C't = tanh(Wc[ht-1, et] + bc) (4)
Ct = ft * Ct-1 + it * C't (5)
ht = ot * tanh(Ct) (6)
The output of each BI-LSTM layer is continuously fed into the next BI-LSTM layer in residual form via equation (7), passing through the BI-LSTM layer three times in total.
hl+1=hl+LSTM(hl,Wl) (7)
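Equations (1)-(7) can be sketched as a minimal NumPy LSTM step plus a bidirectional run; this is an illustrative re-implementation, not the patent's code, and the Dropout(W, 0.5) on the gate weights in (1)-(3) is omitted so the sketch stays deterministic:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(e_t, h_prev, c_prev, W, b):
    """One LSTM time step following equations (1)-(6)."""
    z = np.concatenate([h_prev, e_t])       # [h_{t-1}, e_t]
    f_t = sigmoid(W["f"] @ z + b["f"])      # forget gate, eq (1)
    i_t = sigmoid(W["i"] @ z + b["i"])      # input gate, eq (2)
    o_t = sigmoid(W["o"] @ z + b["o"])      # output gate, eq (3)
    c_tilde = np.tanh(W["c"] @ z + b["c"])  # candidate cell state, eq (4)
    c_t = f_t * c_prev + i_t * c_tilde      # cell state update, eq (5)
    h_t = o_t * np.tanh(c_t)                # hidden output, eq (6)
    return h_t, c_t

def run_lstm(inputs, hidden, rng):
    """Run one direction over a sequence; a BI-LSTM runs this twice,
    left-to-right and right-to-left, and splices the outputs."""
    inp = inputs.shape[1]
    W = {k: rng.standard_normal((hidden, hidden + inp)) * 0.1 for k in "fioc"}
    b = {k: np.zeros(hidden) for k in "fioc"}
    h, c = np.zeros(hidden), np.zeros(hidden)
    outs = []
    for e_t in inputs:
        h, c = lstm_step(e_t, h, c, W, b)
        outs.append(h)
    return np.stack(outs)

rng = np.random.default_rng(0)
seq = rng.standard_normal((5, 4))          # 5 time steps, 4-dim toy inputs
forward = run_lstm(seq, hidden=8, rng=rng)
backward = run_lstm(seq[::-1], hidden=8, rng=rng)[::-1]
h_bi = np.concatenate([forward, backward], axis=1)   # spliced BI-LSTM output
```

The residual connection of equation (7) would simply add `h_bi` (suitably projected) to the input of the next BI-LSTM pass.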
In the Multi-head Self-Attention layer, the q (query), k (key), and v (value) of each group are computed from the Attention {Q, K, V} matrices. Because the Multi-head Self-Attention adopted here has several heads, several groups of {Qi, Ki, Vi} parameter matrices with different parameters are required: the {qi, ki, vi} corresponding to each head are computed via equations (8)-(10), the attention value yi of each head via equations (11)-(13), and the normalized result is multiplied by the matrix v via equation (14) to obtain the weighted-sum representation Ai. The outputs Ai of the heads are then spliced to obtain the output A = {[A1h1:A1h2:A1h3:...:A1hm], [A2h1:A2h2:A2h3:...:A2hm], [A3h1:A3h2:A3h3:...:A3hm], ..., [Anh1:Anh2:Anh3:...:Anhm]}, where the leading subscript 1..n denotes the output of each head and the trailing subscript 1..m denotes each position of that head's output.
qi = Qi * h (8)
ki = Ki * h (9)
vi = Vi * h (10)
s(qi, kj) = qi · kjT (11)
Score(qi, kj) = s(qi, kj) / √dk (12)
yi=softmax(Score(qi,kj)) (13)
Ai=Self-Attention(q,k,v)=yi*v (14)
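Equations (8)-(14) can be sketched in NumPy; this is a minimal illustration in which the random projection matrices stand in for the learned {Qi, Ki, Vi} parameters, with 12 heads and d_k = 32 matching the parameter settings described later:

```python
import numpy as np

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def multi_head_self_attention(h, heads, d_k, rng):
    """Multi-head self-attention following equations (8)-(14)."""
    n, d = h.shape
    head_outputs = []
    for _ in range(heads):
        Qi = rng.standard_normal((d, d_k)) * 0.1
        Ki = rng.standard_normal((d, d_k)) * 0.1
        Vi = rng.standard_normal((d, d_k)) * 0.1
        q, k, v = h @ Qi, h @ Ki, h @ Vi        # eqs (8)-(10)
        score = (q @ k.T) / np.sqrt(d_k)        # eqs (11)-(12), scaled scores
        y = softmax(score)                      # eq (13)
        head_outputs.append(y @ v)              # eq (14): Ai = yi * v
    return np.concatenate(head_outputs, axis=1) # splice the head outputs A

rng = np.random.default_rng(1)
h = rng.standard_normal((20, 64))               # toy BI-LSTM output: 20 positions
A = multi_head_self_attention(h, heads=12, d_k=32, rng=rng)
```

Concatenating the per-head outputs gives one row per sentence position with heads x d_k features, which the fully connected layers then reduce.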
In the nonlinear mapping layer and softmax output layer, the upper layer's output A first passes through two fully connected layers via equations (15)-(16), where W and b are model parameters; softmax is then computed, and argmax finds the position of the softmax maximum to obtain the final prediction. This layer's main work is to pass the depth feature word vectors obtained above through the softmax function classifier, predict the person-relation classification, and compare it with the manually labeled relation classifications.
o′=softmax(W2(W1*Ai+b1)+b2) (15)
o=numpy.argmax(o′) (16)
In the model initialization part, for the relation-extraction model, the initial weights of all sub-layers are set to random matrices drawn from a normal distribution with mean 0 and variance 1. For the BERT embedding layer's parameters, the maximum input length of BERT is set to 50, because analysis of the training sentences collated in section 3.2 shows that 99.94% of the data texts are shorter than 50 characters; texts longer than 50 characters are truncated and split, and the pieces are passed into the model separately for judgment. For the BI-LSTM layer's parameters, the initial weight of the data-accepting layer is set to 50 x 768, i.e., the 50-character input length multiplied by the 768-dimensional word vectors output by the BERT model; the BI-LSTM hidden-layer dimension is set to 128, and all Dropout layer parameters to 0.5, i.e., neurons are retained with probability 0.5. For the Multi-head Self-Attention layer's parameters, the number of heads is set to 12 and the dimension of the Q, K, V matrices to 32.
For the parameter settings of the nonlinear mapping layer and softmax output layer, the output dimension of the first two fully connected layers is 512 and that of the last layer is 51, representing the probabilities over the 50 sentence index positions returned after softmax plus 1 dimension representing an unknown relation.
During training some hyper-parameters are set. Adam (ε = 1e-6, ρ = 0.95) is used as the optimization algorithm for stochastic gradient descent (SGD): the initial learning rate is set to 1.0, and the first and second moments of the gradient dynamically adjust the learning rate so that gradient descent stays stable. To avoid gradient explosion, the gradient norm is clipped to 0.9. The batch_size parameter is set to 32, i.e., each batch contains 32 texts during batch training. epochs is set to 30, i.e., the full corpus is looped over at most 30 times while training the model. The early-stopping parameter is set to 5: during training, validation accuracy is monitored for recent improvement, and if no improvement in model effect occurs 5 times in a row, training stops early, preventing the model from overfitting. The loss function is categorical_crossentropy, i.e., the cross-entropy loss: cross-entropy measures the divergence of two different probability distributions over the same random variable and is often used as the gap between the true and predicted probability distributions; the smaller the cross-entropy, the better the model's prediction. Table 2 shows the hyper-parameter values of all network structures set during model training.
Parameter                  Value
BERT longest input         50
Bi-LSTM input_size         (50, 768)
Bi-LSTM hidden_size        128
Bi-LSTM num_layers         2
Bi-LSTM dropout            0.5
Attention heads            12
Attention Q, K, V          32
FC1, FC2                   512
FC3                        51
Learning rate              1.0
Gradient clipping          0.9
batch_size                 32
epochs                     30
early_stopping             5
TABLE 2
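The training-time behaviour described above — the categorical cross-entropy loss and the patience-5 early stopping — can be sketched as follows. The function names and the framework-agnostic parameter dictionary are illustrative only; a real run would pass these values to a deep-learning framework's optimizer and callbacks:

```python
import numpy as np

# Hyper-parameters from Table 2 (sketch; framework-agnostic)
HPARAMS = {
    "optimizer": "adam",          # eps = 1e-6, rho = 0.95 per the text
    "learning_rate": 1.0,
    "clipnorm": 0.9,              # gradient norm clipping threshold
    "batch_size": 32,
    "epochs": 30,
    "early_stopping_patience": 5,
    "loss": "categorical_crossentropy",
}

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between the true one-hot distribution and the
    predicted distribution; smaller values mean a better fit."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))

def should_stop(val_accuracies, patience=5):
    """Early stopping: stop when validation accuracy has not improved
    for `patience` consecutive epochs."""
    if len(val_accuracies) <= patience:
        return False
    best_before = max(val_accuracies[:-patience])
    return all(a <= best_before for a in val_accuracies[-patience:])

y_true = np.array([[0.0, 1.0, 0.0]])
y_pred = np.array([[0.1, 0.8, 0.1]])
print(round(categorical_crossentropy(y_true, y_pred), 4))  # 0.2231
```

For a confident prediction on the correct class the loss approaches 0, matching the text's observation that smaller cross-entropy means better predictions.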
Finally, in the dictionary-matching layer, starting from the obtained position index o in the input sentence, three-character and then two-character words in the prepared relation dictionary are traversed until a complete relation is matched, finally yielding the relation triple for the current pair of characters. Partial extraction results are shown in figure 5. The test results show that the model can accurately identify the characters appearing in a single sentence and accurately analyze the relationship between every pair of different characters.
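A minimal sketch of this dictionary-matching step, assuming a toy relation dictionary (the real dictionary, the predicted index o, and all names here are illustrative, not taken from the patent):

```python
# Starting from the predicted index position o in the sentence, try to match
# a three-character word, then a two-character word, against the relation
# dictionary, and emit the (person1, relation, person2) triple on success.
RELATION_DICT = {"父亲", "母亲", "好朋友", "女朋友"}  # toy relation dictionary

def match_relation(sentence: str, o: int):
    for length in (3, 2):                      # prefer the longer match
        candidate = sentence[o:o + length]
        if candidate in RELATION_DICT:
            return candidate
    return None

def build_triple(sentence: str, person1: str, person2: str, o: int):
    relation = match_relation(sentence, o)
    return (person1, relation, person2) if relation else None

s = "李雷是韩梅梅的好朋友"
print(build_triple(s, "李雷", "韩梅梅", s.index("好")))  # ('李雷', '好朋友', '韩梅梅')
```

If no dictionary word matches at position o, no triple is produced, corresponding to the model's "unknown relation" output class.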
The knowledge storage module uses a graph database to provide storage and display services for the character-relationship extraction results, storing the relationships between text character entities and feature entities, as shown in fig. 6. The system adopts a service-oriented design pattern for the platform: the core service line is divided according to an SOA architecture (a specific embodiment is shown in figure 7), and the display-page architecture of the website is shown in figure 8, comprising modules such as overall display, platform services, a management module and a user portal; the API interfaces are designed to the RESTful standard. For platform scalability and high-concurrency support, Redis is used for distributed caching. The knowledge service applications of the system platform are packaged with Docker container technology, which makes distributed applications convenient to deploy and gives the knowledge service system high portability and extensibility.
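One common way to persist the extracted triples in a graph database is to generate idempotent Cypher MERGE statements (Neo4j syntax). The node label, relationship type, and property names below are assumptions for the sketch, since the patent only specifies "a graph database":

```python
def triple_to_cypher(subj: str, rel: str, obj: str) -> str:
    """Build an idempotent MERGE statement for one (person, relation, person)
    triple, so that re-importing the same triple never duplicates nodes."""
    return (
        f"MERGE (a:Person {{name: '{subj}'}}) "
        f"MERGE (b:Person {{name: '{obj}'}}) "
        f"MERGE (a)-[:RELATION {{type: '{rel}'}}]->(b)"
    )

q = triple_to_cypher("李雷", "好朋友", "韩梅梅")
print(q)
```

In a deployed system each statement would be sent to the database through a driver session; using MERGE rather than CREATE keeps repeated extraction runs from inflating the graph.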
In one embodiment, the method further comprises a character-relationship visualization step, wherein the character relationships are displayed through a Web application framework on a display terminal, comprising the following steps:
receiving an HTTPS request from a client through the browser;
forwarding the HTTP request to the web server gateway;
specifying the information location through a uniform resource locator and dispatching the request to a view function;
the view function requesting data from the data storage layer using an HttpRequest object;
the data storage layer calling the database, extracting the data required by the view function from the database, processing the data in the view function and passing it through the template language to the presentation layer, which returns the HTTP response to the browser for display to the user.
The character-relationship visualization step further comprises: displaying the character relationships as a whole, or constructing a character-relationship network to display the character relationships.
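The request flow listed above can be sketched framework-agnostically. In practice a framework such as Django would supply the URL dispatcher, the HttpRequest object, and the template language; all of these are simulated here with plain functions and toy data, so every name below is an assumption for illustration:

```python
# Toy data layer standing in for the graph database query.
DATABASE = {"/relations/李雷": [("李雷", "好朋友", "韩梅梅")]}

def data_storage_layer(path: str):
    """Data storage layer: pull the rows the view needs from the database."""
    return DATABASE.get(path, [])

def render_template(context: dict) -> str:
    """Presentation layer: the template language turns the context into a
    response body (here, one 'subject -[relation]-> object' line per triple)."""
    lines = [f"{s} -[{r}]-> {o}" for s, r, o in context["triples"]]
    return "\n".join(lines) or "no data"

def view_function(path: str) -> str:
    """View: request data from the storage layer, process it, and hand the
    result to the presentation layer, which produces the HTTP response."""
    context = {"triples": data_storage_layer(path)}
    return render_template(context)

print(view_function("/relations/李雷"))  # 李雷 -[好朋友]-> 韩梅梅
```

The browser-to-gateway and URL-dispatch hops are omitted; they would route the incoming path string to `view_function` in a real deployment.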
Finally, the present Chinese character relationship extraction method and system based on the BERT neural network can automatically learn and extract a large number of features from the training data, overcoming the low recall caused by the scarcity of hand-crafted features in traditional Chinese character relationship extraction methods.

Claims (8)

1. A Chinese character relationship extraction system based on a BERT neural network, characterized by comprising:
the data collection module is used for acquiring training data, cleaning the data and labeling the data, and the collected data is used as the training data to obtain character entity labels and relationship labels among the character entity labels in each sentence;
the feature acquisition and relation extraction module, which comprises a context coding module, a named entity recognition module and a relation label embedding representation module, wherein the context coding module performs context coding on each sentence, the named entity recognition module performs named entity recognition on the context semantic representation of each sentence, and the context semantic representation and the named entity recognition results are combined to predict all entity relations in each sentence;
the relation storage module and the application service module, which store the character entity relationship graph extracted from the text character relations in a graph database, design core business service API interfaces based on an SOA (service-oriented architecture), provide data interaction services, package the system with Docker containers, and finally provide the text character relationship knowledge graph function.
2. The Chinese character relationship extraction system as claimed in claim 1, wherein the data collection module is specifically configured to:
firstly collect a large amount of internet data containing character relations as training data; then use a crawler to splice each character-relation pair onto a Baidu search URL for searching, crawl the retrieved text corpora into a database for storage, and manually clean and label the retrieved corpora to construct a character-relation corpus.
3. The Chinese character relationship extraction system as claimed in claim 1, wherein the feature acquisition and relation extraction module is specifically configured to:
carry out named entity recognition on the characters appearing in each sentence of the cleaned training data;
perform BERT Embedding word-vector conversion on each sentence of the training data;
carry out context coding on the converted word-vector sentences to obtain the context semantic representation of each sentence;
and for each sentence of the data to be predicted, combine the contextual semantic features with the named entity recognition results to predict character entities pairwise, predicting all character entity relations in each sentence.
4. The Chinese character relationship extraction system as claimed in claim 1, wherein the relation storage module and the application service module are specifically configured to:
provide a Chinese character relationship knowledge graph storage service using a graph database, storing character entity relations extracted from Chinese text; split the Chinese character relationship extraction system by service function based on an SOA (service-oriented architecture), wherein the designed API interfaces comprise a user portal module interface, a management module interface, an overall display module interface and a platform service module interface; provide user data storage and system access-log storage services using the MongoDB database; provide an API access cache service using the Redis distributed cache database; package the knowledge service applications with Docker container technology for convenient distributed deployment; and perform sentence segmentation, named entity recognition, BERT word-vector conversion, feature extraction and relation inference on document information submitted by a user, construct relation triples from the character entities and the inferred relations, build the knowledge graph from the constructed triples, and provide character-relationship query and text relation extraction services.
5. A Chinese character relationship extraction method based on a BERT neural network is characterized by comprising the following steps:
acquiring training data, cleaning the data and labeling the data, and acquiring character entity labels and relationship labels among the character entity labels in each sentence by using the collected data as the training data;
carrying out context coding on each sentence, performing named entity recognition on the context semantic representation of each sentence, combining the context semantic representation, the named entity recognition results and the relation label embedding representation, and predicting all entity relations in each sentence;
and storing the character entity relationship graph extracted from the text character relations in a graph database, designing core business service API interfaces based on an SOA (service-oriented architecture) to provide data interaction services, packaging the system with Docker containers, and finally providing the text character relationship knowledge graph function.
6. The Chinese character relationship extraction method of claim 5, wherein:
firstly, a large amount of internet data containing character relations is collected as training data; then a crawler splices each character-relation pair onto a Baidu search URL for searching, the retrieved text corpora are crawled into a database for storage, and the corpora are manually cleaned and labelled to construct a character-relation corpus.
7. The Chinese character relationship extraction method of claim 5, wherein:
named entity recognition is carried out on the characters appearing in each sentence of the cleaned training data;
BERT Embedding word-vector conversion is performed on each sentence of the training data;
context coding is carried out on the converted word-vector sentences to obtain the context semantic representation of each sentence;
and for each sentence of the data to be predicted, the contextual semantic features are combined with the named entity recognition results to predict character entities pairwise, predicting all character entity relations in each sentence.
8. The Chinese character relationship extraction method of claim 5, wherein:
a Chinese character relationship knowledge graph storage service is provided using a graph database, storing character entity relations extracted from Chinese text; the Chinese character relationship extraction system is split by service function based on an SOA (service-oriented architecture), wherein the designed API interfaces comprise a user portal module interface, a management module interface, an overall display module interface and a platform service module interface; user data storage and system access-log storage services are provided using the MongoDB database; an API access cache service is provided using the Redis distributed cache database; the knowledge service applications are packaged with Docker container technology for convenient distributed deployment; and sentence segmentation, named entity recognition, BERT word-vector conversion, feature extraction and relation inference are performed on document information submitted by a user, relation triples are constructed from the character entities and the inferred relations, the knowledge graph is built from the constructed triples, and character-relationship query and text relation extraction services are provided.
CN202110186063.0A 2021-02-14 2021-02-14 Chinese character relation extraction construction method based on BERT neural network Pending CN112926325A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110186063.0A CN112926325A (en) 2021-02-14 2021-02-14 Chinese character relation extraction construction method based on BERT neural network

Publications (1)

Publication Number Publication Date
CN112926325A true CN112926325A (en) 2021-06-08

Family

ID=76169754

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110186063.0A Pending CN112926325A (en) 2021-02-14 2021-02-14 Chinese character relation extraction construction method based on BERT neural network

Country Status (1)

Country Link
CN (1) CN112926325A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125367A (en) * 2019-12-26 2020-05-08 华南理工大学 Multi-character relation extraction method based on multi-level attention mechanism
CN111159407A (en) * 2019-12-30 2020-05-15 北京明朝万达科技股份有限公司 Method, apparatus, device and medium for training entity recognition and relation classification model
CN111538849A (en) * 2020-04-29 2020-08-14 华中科技大学 Character relation graph construction method and system based on deep learning
CN111581376A (en) * 2020-04-17 2020-08-25 中国船舶重工集团公司第七一四研究所 Automatic knowledge graph construction system and method
CN112364091A (en) * 2020-11-09 2021-02-12 北京工商大学 Method and system for visually inquiring character relationship based on knowledge graph


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DENGTAO LIU,等: "Chinese Character Relationship Extraction Method Based on BERT", 《INTERNATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE AND COMPUTER APPLICATIONS》, 28 June 2021 (2021-06-28), pages 883 - 887, XP033951013, DOI: 10.1109/ICAICA52286.2021.9497946 *
LIU Feng; GAO Sai; YU Bihui; GUO Fangda: "Entity Relation Classification Based on Multi-head Attention and Bi-LSTM", Computer Systems & Applications (计算机系统应用), no. 06, 15 June 2019 (2019-06-15), pages 118-124 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113283243A (en) * 2021-06-09 2021-08-20 广东工业大学 Entity and relation combined extraction method
CN113283243B (en) * 2021-06-09 2022-07-26 广东工业大学 Entity and relationship combined extraction method
CN114218924A (en) * 2021-07-27 2022-03-22 广东电力信息科技有限公司 Text intention and entity combined identification method based on BERT model
CN113642336A (en) * 2021-08-27 2021-11-12 青岛全掌柜科技有限公司 Insurance automatic question-answering method and system based on SaaS
CN113642336B (en) * 2021-08-27 2024-03-08 青岛全掌柜科技有限公司 SaaS-based insurance automatic question-answering method and system
CN116628174A (en) * 2023-02-17 2023-08-22 广东技术师范大学 End-to-end relation extraction method and system for fusing entity and relation information
CN116662578A (en) * 2023-08-02 2023-08-29 中国标准化研究院 End-to-end-based large-scale knowledge graph construction and storage method and system
CN116662578B (en) * 2023-08-02 2023-10-31 中国标准化研究院 End-to-end-based large-scale knowledge graph construction and storage method and system

Similar Documents

Publication Publication Date Title
CN109492157B (en) News recommendation method and theme characterization method based on RNN and attention mechanism
CN109885672B (en) Question-answering type intelligent retrieval system and method for online education
CN112926325A (en) Chinese character relation extraction construction method based on BERT neural network
CN107180045B (en) Method for extracting geographic entity relation contained in internet text
CN110175227A (en) A kind of dialogue auxiliary system based on form a team study and level reasoning
CN109829104A (en) Pseudo-linear filter model information search method and system based on semantic similarity
CN109960786A (en) Chinese Measurement of word similarity based on convergence strategy
CN115796181A (en) Text relation extraction method for chemical field
CN110765277A (en) Online equipment fault diagnosis platform of mobile terminal based on knowledge graph
CN115599899B (en) Intelligent question-answering method, system, equipment and medium based on aircraft knowledge graph
CN116975256B (en) Method and system for processing multisource information in construction process of underground factory building of pumped storage power station
CN116077942B (en) Method for realizing interactive content recommendation
CN111581364A (en) Chinese intelligent question-answer short text similarity calculation method oriented to medical field
CN111951079A (en) Credit rating method and device based on knowledge graph and electronic equipment
Korade et al. Strengthening Sentence Similarity Identification Through OpenAI Embeddings and Deep Learning.
CN117786052A (en) Intelligent power grid question-answering system based on domain knowledge graph
CN116450776A (en) Oil-gas pipe network law and regulation and technical standard retrieval system based on knowledge graph
CN116484023A (en) Method and system for constructing power industry knowledge base based on artificial intelligence
CN114238653B (en) Method for constructing programming education knowledge graph, completing and intelligently asking and answering
CN111581326B (en) Method for extracting answer information based on heterogeneous external knowledge source graph structure
CN112507097B (en) Method for improving generalization capability of question-answering system
CN115905554A (en) Chinese academic knowledge graph construction method based on multidisciplinary classification
Nahar et al. A Comparative Selection of Best Activation Pair Layer in Convolution Neural Network for Sentence Classification using Deep Learning Model
Yarovyi et al. Dictionary data structure for a text analysis task using cross-references
Liu et al. A resource retrieval method of multimedia recommendation system based on deep learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination