CN116108206A - Combined extraction method of financial data entity relationship and related equipment - Google Patents

Combined extraction method of financial data entity relationship and related equipment

Info

Publication number
CN116108206A
Authority
CN
China
Prior art keywords
entity
relation
vector
result
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310390977.8A
Other languages
Chinese (zh)
Other versions
CN116108206B (en)
Inventor
雷琪
孔冠卿
李仪
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University filed Critical Central South University
Priority to CN202310390977.8A priority Critical patent/CN116108206B/en
Publication of CN116108206A publication Critical patent/CN116108206A/en
Application granted granted Critical
Publication of CN116108206B publication Critical patent/CN116108206B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295Named entity recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
  • Character Input (AREA)

Abstract

The invention provides a method for jointly extracting financial data entity relationships and related equipment, comprising the following steps: acquiring characteristic information of a financial text to be processed, and carrying out partition filtering to obtain an input feature vector and a word vector; inputting the input feature vector and the word vector into a first extraction model to obtain an early recognition result and an early classification result; inputting the input feature vector, the early recognition result and the early classification result into a first mutual information module to obtain an entity information stream, and inputting the word vector, the early recognition result and the early classification result into a second mutual information module to obtain a relation information stream; inputting the entity information stream and the relation information stream into a second extraction model to obtain an entity recognition result and a relation classification result; and decoding the recognition and classification results to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, which are formed into triplets to obtain the entity relation extraction result of the financial text. The accuracy of financial data entity relation extraction is thereby improved.

Description

Combined extraction method of financial data entity relationship and related equipment
Technical Field
The invention relates to the technical field of financial text natural language processing, in particular to a method and related equipment for jointly extracting financial data entity relations.
Background
With the development of the economy, all types of economic and financial activity data have grown explosively, and analyzing these data can uncover a large amount of hidden knowledge to better serve the financial industry.
Relation extraction is a common class of tasks in natural language processing. For two entities that have a relationship, referred to respectively as a subject and an object, relation extraction finds the relationship that exists between the subject and the object in unstructured or semi-structured data and expresses it as an entity-relation triplet, i.e., <subject, relation, object>.
In recent years, relation extraction techniques represented by deep learning use large pre-trained models and massive corpora to extract and map text features, representing the semantic information of entities and relations in the text as low-dimensional continuous space vectors, which are then computed and processed to predict the complex semantic information corresponding to the relations between the entities.
Currently, deep-learning-based relation extraction techniques fall mainly into two categories: pipeline extraction and joint extraction. Pipeline extraction performs named entity recognition and relation classification with two separate models, which usually causes error propagation and weak interaction between the two tasks; joint extraction completes the two tasks of named entity recognition and relation classification within one model by sharing the same encoder. General joint extraction methods usually suffer from two problems. The first is feature contradiction: the feature information output by a single encoder is not friendly to both the entity recognition and relation classification tasks, and the information required by the two tasks may overlap or contradict. The second concerns the interaction mode: joint extraction usually realizes only one-way interaction, either from entity recognition to relation classification or from relation classification to entity recognition, yet the two tasks promote each other, so how to realize bidirectional interaction between the tasks is worth exploring.
However, most models focus only on the text feature extraction process, and relation prediction modeling based on text feature vectors still suffers from problems such as poor generalization performance and weak semantic interpretability. Existing relation extraction models are also generally generic: little research targets the Chinese financial field, most analysis is performed on English texts, and when joint extraction models designed for English are applied directly to Chinese financial data, the triplet extraction accuracy is too low for practical use.
Disclosure of Invention
The invention provides a joint extraction method of a financial data entity relationship and related equipment, and aims to improve the accuracy of the extraction of the financial data entity relationship.
In order to achieve the above objective, the present invention provides a method for jointly extracting relationships between financial data entities, including:
step 1, obtaining characteristic information of a financial text to be processed;
step 2, carrying out partition filtering on the characteristic information to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relationship classification task;
step 3, inputting the input feature vector and the word vector into a first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
Step 4, inputting the input feature vector, the early recognition result and the early classification result into a first mutual information module for calculation to obtain an entity information stream;
step 5, inputting the word vector, the early recognition result and the early classification result into a second mutual information module for calculation to obtain a relation information flow;
step 6, inputting the entity information flow and the relation information flow into a second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
step 7, respectively decoding the entity recognition result and the relation classification result to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, and forming the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain the entity relation extraction result of the financial text.
Further, step 1 includes:
acquiring a financial text to be processed;
and extracting features of the financial text through a FinBERT pre-training model to obtain deep feature information of the financial text.
Further, step 2 includes:
respectively inputting deep feature information of the financial text into two convolutional neural networks, and carrying out local feature extraction on the deep feature information by controlling the sizes of a convolutional window and a convolutional step length to obtain multi-granularity entity partition task features and multi-granularity relation partition task features;
And respectively establishing a local attention module at the output end of each convolutional neural network by utilizing the multi-granularity entity partition task characteristics, the multi-granularity relation partition task characteristics and the deep characteristic information, and filtering the deep characteristic information through a scoring function of the local attention module to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relation classification task.
Further, the multi-granularity entity partition task features are computed as follows:

c_i^e = f(W_e · X_{i:i+k-1} + b)

A_e = [c_1^e, c_2^e, ..., c_{⌊(n-k+2p)/s⌋+1}^e]

The multi-granularity relation partition task features are computed as follows:

c_i^r = f(W_r · X_{i:i+k-1} + b)

A_r = [c_1^r, c_2^r, ..., c_{⌊(n-k+2p)/s⌋+1}^r]

where A_e denotes the multi-granularity entity partition task feature obtained when the convolution window size is k, the convolution step size is s and the zero padding is p; A_r denotes the multi-granularity relation partition task feature obtained under the corresponding window, step and padding settings; c_i^e denotes the feature vector obtained by each slide of the convolution window when computing the multi-granularity entity partition task features; c_i^r denotes the feature vector obtained by each slide of the convolution window when computing the multi-granularity relation partition task features; X_{i:i+k-1} denotes the word vector matrix of the i-th to (i+k-1)-th words of the sentence; k denotes the number of words contained in the window; d denotes the dimension of a single word vector; n denotes the length of the sentence; and b denotes the offset.
Further, filtering the deep feature information by a scoring function of the local attention module includes:
defining the multi-granularity entity partition task features as the query vector, namely:

Q_e = A_e

obtaining the key vector and the value vector from the deep feature information:

K = H · W_K

V = H · W_V

where H denotes the deep feature information, K denotes the resulting key vector, V denotes the resulting value vector, and W_K and W_V denote trainable matrices;

the input feature vector H^e related to the entity recognition task is:

H^e = softmax(Q_e · K^T / √d) · V

where softmax denotes the activation function, K^T denotes the transpose of the key vector K, and d denotes the dimension of a single word vector.
Further, filtering the deep feature information by a scoring function of the local attention module includes:
defining the multi-granularity relation partition task features as the query vector, namely:

Q_r = A_r

obtaining the key vector and the value vector from the deep feature information:

K = H · W_K

V = H · W_V

where H denotes the deep feature information, K denotes the resulting key vector, V denotes the resulting value vector, and W_K and W_V denote trainable matrices;

the word vector H^r related to the relation classification task is:

H^r = softmax(Q_r · K^T / √d) · V

where softmax denotes the activation function, K^T denotes the transpose of the key vector K, and d denotes the dimension of a single word vector.
The invention also provides a joint extraction device of the financial data entity relationship, which comprises:
the acquisition module is used for acquiring the characteristic information of the financial text to be processed;
the partition filtering module is used for performing partition filtering on the characteristic information to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relationship classification task;
the early recognition classification module is used for inputting the input feature vector and the word vector into the first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
the first calculation module is used for inputting the input feature vector, the early recognition result and the early classification result into the first mutual information module for calculation to obtain an entity information flow;
the second calculation module is used for inputting the word vector, the early recognition result and the early classification result into the second mutual information module for calculation to obtain a relation information flow;
the identification classification module is used for inputting the entity information flow and the relation information flow into the second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
And the extraction module is used for respectively decoding the entity recognition result and the relation classification result to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, and forming the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain the entity relation extraction result of the financial text.
The invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the joint extraction method of the financial data entity relationship when being executed by a processor.
The invention also provides a terminal device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the joint extraction method of the financial data entity relationship when executing the computer program.
The scheme of the invention has the following beneficial effects:
(1) According to the invention, the feature information of the financial text to be processed is acquired, the feature information is subjected to partition filtering at the output ends of the two convolutional neural networks, so that the input feature vector related to the entity recognition task and the word vector related to the relation classification task are obtained, and the features are distinguished according to the task at the data feature level, so that the problem of feature contradiction is avoided;
(2) The first extraction model serves as early prediction: the input feature vector and the word vector are input into the first extraction model, entity recognition is carried out on the input feature vector to obtain an early recognition result, and relation classification is carried out on the word vector to obtain an early classification result; the input feature vector, the early recognition result and the early classification result are input into the first mutual information module for calculation to obtain the entity information stream; the word vector, the early recognition result and the early classification result are input into the second mutual information module for calculation to obtain the relation information stream; the calculations of the first and second mutual information modules filter out the noise information in the early recognition result and the early classification result to obtain the entity information stream and the relation information stream; the entity information stream and the relation information stream are input into the second extraction model for entity recognition and relation classification to obtain the entity recognition result and the relation classification result; because the entity information stream and the relation information stream calculated by the mutual information modules contain the early relation classification and entity recognition results, feeding them into the second extraction model for the named entity recognition task and the relation classification task realizes bidirectional interaction between entity recognition and relation classification;
(3) The entity recognition result and the relation classification result are respectively decoded to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, and the subject recognition result, the object recognition result and the relation between the subject and the object are formed into a triplet to obtain the entity relation extraction result of the financial text; the accuracy of financial data entity relation extraction is thereby improved.
Other advantageous effects of the present invention will be described in detail in the detailed description section which follows.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a flow chart of an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages to be solved more apparent, the following detailed description will be given with reference to the accompanying drawings and specific embodiments. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In the description of the present invention, it should be noted that the directions or positional relationships indicated by the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc. are based on the directions or positional relationships shown in the drawings, are merely for convenience of describing the present invention and simplifying the description, and do not indicate or imply that the devices or elements referred to must have a specific orientation, be configured and operated in a specific orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
In the description of the present invention, it should be noted that, unless explicitly stated and limited otherwise, the terms "mounted," "connected," and "connected" are to be construed broadly, and may be, for example, a locked connection, a removable connection, or an integral connection; can be mechanically or electrically connected; can be directly connected or indirectly connected through an intermediate medium, and can be communication between two elements. The specific meaning of the above terms in the present invention will be understood in specific cases by those of ordinary skill in the art.
In addition, the technical features of the different embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Aiming at the existing problems, the invention provides a joint extraction method of financial data entity relations and related equipment, which filters out feature information irrelevant or contradictory to the tasks through partition filtering, performs early prediction on the filtered feature information with a table-filling-based tagging scheme, explicitly models the interaction between tasks through mutual information modules, and avoids a complicated entity-pair relation judgment process.
As shown in fig. 1 and 2, an embodiment of the present invention provides a method for jointly extracting a relationship between financial data entities, including:
step 1, obtaining characteristic information of a financial text to be processed;
step 2, carrying out partition filtering on the characteristic information to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relationship classification task;
step 3, inputting the input feature vector and the word vector into a first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
step 4, inputting the input feature vector, the early recognition result and the early classification result into a first mutual information module for calculation to obtain an entity information stream;
step 5, inputting the word vector, the early recognition result and the early classification result into a second mutual information module for calculation to obtain a relation information flow;
step 6, inputting the entity information flow and the relation information flow into a second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
step 7, respectively decoding the entity recognition result and the relation classification result to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, and forming the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain the entity relation extraction result of the financial text.
Specifically, step 1 includes:
acquiring a financial text to be processed;
and extracting features of the financial text through a FinBERT pre-training model to obtain deep feature information of the financial text.
Specifically, the FinBERT pre-training model is an open-source Chinese BERT pre-training model for the financial field, used to extract feature vectors (word vectors) from financial texts. It encodes financial entities such as bonds, stock futures contracts, stocks and securities better than general models and is therefore more suitable for downstream tasks in the financial field.
The BERT model is a large-scale pre-training model for natural language processing published in 2018. The BERT model is a multi-layer bidirectional Transformer-Encoder structure; the Transformer uses a self-attention mechanism and position encoding instead of an LSTM (long short-term memory) network, so as to solve the problems that sequence-to-sequence models parallelize poorly and struggle with long-distance dependencies. Considering that the Decoder structure in the Transformer cannot access the information to be predicted, BERT uses only the Encoder structure and pre-trains a 12-layer or 24-layer bidirectional Transformer encoder network with two tasks, the Masked Language Model (MLM) and Next Sentence Prediction (NSP). MLM can be understood as a cloze task: 15% of the words in a sentence are randomly masked and then predicted from the context, so that the model learns the overall text semantics. NSP selects sentence pairs A and B, where for 50% of the pairs B is the next sentence of A and the remaining 50% are randomly selected, allowing the model to better learn the correlation between sentences.
Initially there were only BERT models for English text, and even though some BERT models for the general Chinese domain were released subsequently, the FinBERT pre-training model shows significantly better performance on downstream tasks in multiple financial fields than other BERT models. The FinBERT pre-training model is therefore used for feature extraction on the financial text.
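As a rough illustration of how step 1 might be realized (a sketch only, assuming a PyTorch/Transformers stack; the checkpoint path and sample text below are placeholders, not part of the disclosure):

```python
# Sketch of step 1: obtaining deep feature information H for a financial text
# with a BERT-style encoder. "path/to/finbert-chinese" is a placeholder, not
# the exact checkpoint used by the method.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_NAME = "path/to/finbert-chinese"  # hypothetical FinBERT checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
encoder = AutoModel.from_pretrained(MODEL_NAME)

text = "示例财经文本"  # placeholder for the financial text to be processed
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**inputs)

# One feature vector per token: shape (1, sequence_length, hidden_size).
H = outputs.last_hidden_state
```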
Specifically, step 2 includes:
respectively inputting deep feature information of the financial text into two convolutional neural networks, and carrying out local feature extraction on the deep feature information by controlling the sizes of a convolutional window and a convolutional step length to obtain multi-granularity entity partition task features and multi-granularity relation partition task features;
and respectively establishing a local attention module at the output end of each convolutional neural network by utilizing the multi-granularity entity partition task characteristics, the multi-granularity relation partition task characteristics and the deep characteristic information, and filtering the deep characteristic information through a scoring function of the local attention module to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relation classification task.
In embodiments of the present invention, partition filtering is accomplished by establishing a local attention mechanism. First, two conventional CNNs (convolutional neural networks) are used, one to obtain local information such as entity boundaries and the other to obtain sequence information between entities. Taking the financial clause "among all the asset-backed securities held by the fund, the market value of a single security invested in must not exceed 20% of the net value of the fund assets" as an example, one CNN model is used to acquire the boundary information of financial entities such as "asset-backed securities" and "net value of the fund assets", while the other model is used to acquire local sequence feature information between entity pairs, such as "market value" and "exceed". By controlling the convolution window and the convolution step size of the CNNs, multi-granularity entity recognition task features and multi-granularity relation classification task features are obtained, completing the feature partition.
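A minimal sketch of this two-branch convolution step, assuming PyTorch; the window sizes, stride, padding and feature dimension below are illustrative choices rather than values fixed by the disclosure:

```python
import torch
import torch.nn as nn

class PartitionCNN(nn.Module):
    """One convolutional branch that slides a window of `window` words over
    the deep features H (batch, seq_len, dim) and returns local features;
    window size, stride and zero padding are the controllable quantities
    described above."""
    def __init__(self, dim: int, window: int, stride: int = 1, padding: int = 0):
        super().__init__()
        self.conv = nn.Conv1d(dim, dim, kernel_size=window,
                              stride=stride, padding=padding)

    def forward(self, H: torch.Tensor) -> torch.Tensor:
        # Conv1d works on (batch, channels, length), so transpose in and out.
        return self.conv(H.transpose(1, 2)).transpose(1, 2)

# Two separate branches that do not share weights: a small window for entity
# boundary information and a larger window for inter-entity sequence
# information. The dimension 768 is an illustrative assumption.
entity_cnn = PartitionCNN(dim=768, window=3, padding=1)
relation_cnn = PartitionCNN(dim=768, window=5, padding=2)
```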
The multi-granularity entity partition task features are computed as follows:

c_i^e = f(W_e · X_{i:i+k-1} + b)

A_e = [c_1^e, c_2^e, ..., c_{⌊(n-k+2p)/s⌋+1}^e]

The multi-granularity relation partition task features are computed as follows:

c_i^r = f(W_r · X_{i:i+k-1} + b)

A_r = [c_1^r, c_2^r, ..., c_{⌊(n-k+2p)/s⌋+1}^r]

where A_e denotes the multi-granularity entity partition task feature obtained when the convolution window size is k, the convolution step size is s and the zero padding is p; A_r denotes the multi-granularity relation partition task feature obtained under the corresponding window, step and padding settings; c_i^e denotes the feature vector obtained by each slide of the convolution window when computing the multi-granularity entity partition task features; c_i^r denotes the feature vector obtained by each slide of the convolution window when computing the multi-granularity relation partition task features; X_{i:i+k-1} denotes the word vector matrix of the i-th to (i+k-1)-th words of the sentence; k denotes the number of words contained in the window; d denotes the dimension of a single word vector; n denotes the length of the sentence; and b denotes the offset.
Specifically, filtering deep feature information by a scoring function of a local attention module includes:
after feature partitioning is completed, multi-granularity entity recognition task features and multi-granularity relation classification task features are respectively defined as query vectors, and different weights are distributed for words in different positions through a scoring function of an attention mechanism, so that features irrelevant to or contradictory to tasks in feature information of an original input text are filtered, and feature filtering is realized.
In order to obtain relevant input feature vectors of entity identification tasks, the embodiment of the invention defines the multi-granularity entity identification task features as query vectors, namely:
Q_e = A_e

obtaining the key vector and the value vector from the deep feature information:

K = H · W_K

V = H · W_V

where H denotes the deep feature information, K denotes the resulting key vector, V denotes the resulting value vector, and W_K and W_V denote trainable matrices.

The input feature vector H^e related to the entity recognition task is:

H^e = softmax(Q_e · K^T / √d) · V

where softmax denotes the activation function, K^T denotes the transpose of the key vector K, and d denotes the dimension of a single word vector.
To obtain word vectors related to the relationship classification task, multi-granularity relationship classification task features are defined as query vectors, namely:
Q_r = A_r

obtaining the key vector and the value vector from the deep feature information:

K = H · W_K

V = H · W_V

where H denotes the deep feature information, K denotes the resulting key vector, V denotes the resulting value vector, and W_K and W_V denote trainable matrices.

The word vector H^r related to the relation classification task is:

H^r = softmax(Q_r · K^T / √d) · V

where softmax denotes the activation function, K^T denotes the transpose of the key vector K, and d denotes the dimension of a single word vector.
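A compact sketch of the partition filter as scaled dot-product attention, matching the scoring function reconstructed above; PyTorch is assumed and the parameterization of the trainable matrices is an implementation assumption:

```python
import math
import torch
import torch.nn as nn

class LocalAttentionFilter(nn.Module):
    """The partition-task feature acts as the query; keys and values are
    obtained from the deep features H through trainable matrices, and the
    softmax-scored attention output is the filtered, task-related vector."""
    def __init__(self, dim: int):
        super().__init__()
        self.W_k = nn.Linear(dim, dim, bias=False)
        self.W_v = nn.Linear(dim, dim, bias=False)

    def forward(self, Q: torch.Tensor, H: torch.Tensor) -> torch.Tensor:
        K, V = self.W_k(H), self.W_v(H)                    # keys, values
        scores = Q @ K.transpose(-1, -2) / math.sqrt(Q.size(-1))
        return torch.softmax(scores, dim=-1) @ V           # filtered output

# Usage sketch: one instance filters the entity-task features, a second
# instance (with its own weights) filters the relation-task features.
```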
Specifically, step 3 includes: inputting the input feature vector and the word vector into the first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result.
The first extraction model is an early relation extraction model and the second extraction model is the final relation extraction model. In order to extract multiple triplets from a financial text at one time, both the first extraction model and the second extraction model adopt the table filling method for the named entity recognition task; the two models have the same structure but do not share parameters. Because relying only on the implicit mechanism of error back-propagation makes the interaction between the entity recognition and relation classification tasks uncontrollable, early task prediction is performed with the lower layers of the neural network, and the interaction between the tasks is modeled explicitly through iterative information propagation so as to improve the reasoning capacity of the model.
Table filling (Table Filling) is a common labeling method in relation extraction. In the extraction models, the entity recognition and relation classification tasks adopt a parallel encoding scheme, i.e., entity recognition and relation classification share a single encoding layer.
In the embodiment of the invention, owing to the specificity of the financial field (financial texts usually use words precisely and have strict terminology, entities are usually not nested, a single word can serve as an entity, and so on) and because the relations between entities in financial texts are mostly one-way, the financial entities are subdivided into subjects and objects, which better describes the relations between the entities of financial triplets. Taking the financial clause "the fund invests in the constituent stocks of the target index at no less than 80% of its non-cash assets and no less than 90% of the net value of the fund assets" as an example, the triplets in this clause are <constituent stocks of the target index, asset ratio, non-cash assets> and <constituent stocks of the target index, asset ratio, net value of the fund assets>.
For the above reasons, when performing the named entity recognition task, assuming the length of the input text is l, an l × l table is used: the position (subject head, object head) is labeled "HB-TB", the position (subject head, object tail) is labeled "HB-TE", and the position (subject tail, object tail) is labeled "HE-TE". In this way, a subject can be obtained from the HB-TE and HE-TE labels during decoding, and an object can be obtained from the HB-TB and HB-TE labels, solving the problem of extracting subjects and objects. For relation classification, the table filling method is also adopted: assuming the length of the input text is l and the number of predefined relations is m, an l × m matrix is used; if w_i and w_j denote the subject head and the object head (the i-th and j-th words respectively) and the relation between the financial entities is r_k, then the corresponding positions are labeled "1" and all other positions are labeled "0". By decoding the label "1", the positions of the subject, the object and their relation can be determined.
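To make the tagging scheme concrete, a small sketch of building the l × l entity table for one gold subject-object pair; the span indices and tag ids below are purely illustrative:

```python
# Builds the l x l table of entity tags for one (subject, object) pair.
def build_entity_table(seq_len, subject_span, object_span, tag2id):
    """subject_span / object_span are (head, tail) token indices."""
    table = [[tag2id["O"]] * seq_len for _ in range(seq_len)]
    sh, st = subject_span
    oh, ot = object_span
    table[sh][oh] = tag2id["HB-TB"]  # subject head - object head
    table[sh][ot] = tag2id["HB-TE"]  # subject head - object tail
    table[st][ot] = tag2id["HE-TE"]  # subject tail - object tail
    return table

tags = {"O": 0, "HB-TB": 1, "HB-TE": 2, "HE-TE": 3}
example = build_entity_table(10, subject_span=(2, 4), object_span=(6, 8), tag2id=tags)
```

Decoding reverses this construction: an HB-TE cell fixes the subject head and the object tail, the HE-TE cell in the same column then gives the subject tail, and the HB-TB cell in the same row gives the object head.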
In particular, using the input feature vector H^e, the representation of the i-th word pair (w_i, w_j) is obtained, and its label is learned through two fully connected layers; the expressions are as follows:

h_{ij} = Selu(W_1 · [h_i^e ; h_j^e] + b_1)

P_{ij} = softmax(W_2 · h_{ij} + b_2)

where Selu and softmax represent the Selu activation function and the softmax activation function respectively, W_1 and W_2 represent trainable weights, and b_1 and b_2 indicate the offsets.

Finally, the cross entropy loss function is used as the loss function of the entity recognition task:

L_NER = - Σ_{i,j} Σ_c y_{ij}^c · log P_{ij}^c

where c represents the word pair tag type, P_{ij}^c represents the predicted value of the word pair tag, and y_{ij}^c represents the true value of the word pair tag.
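A sketch of the word-pair tagging head (two fully connected layers with a SELU non-linearity, softmax over tag types at decoding time); the hidden sizes and the way pair representations are built are assumptions of this sketch:

```python
import torch
import torch.nn as nn

class EntityTableTagger(nn.Module):
    """Scores every word pair (w_i, w_j) of the l x l table with two fully
    connected layers; softmax over the last dimension gives the tag
    distribution, while cross-entropy is computed on the logits as usual."""
    def __init__(self, dim: int, num_tags: int = 4):
        super().__init__()
        self.fc1 = nn.Linear(2 * dim, dim)
        self.fc2 = nn.Linear(dim, num_tags)

    def forward(self, He: torch.Tensor) -> torch.Tensor:
        b, l, d = He.shape
        rows = He.unsqueeze(2).expand(b, l, l, d)    # h_i repeated along j
        cols = He.unsqueeze(1).expand(b, l, l, d)    # h_j repeated along i
        pair = torch.cat([rows, cols], dim=-1)       # (b, l, l, 2d)
        return self.fc2(torch.selu(self.fc1(pair)))  # logits per tag

# Training: nn.CrossEntropyLoss() over the flattened table, matching the
# cross-entropy loss described above.
```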
For the relation classification task, the word vector H^r related to the relation classification task is taken as input; similarly, relation classification is performed in a table filling manner on word pairs consisting of a subject head and an object head.

Assuming the relation between the subject and the object is r_k, the prediction for the i-th word pair (w_i, w_j) is as follows:

g_{ij} = Relu(W_3 · [h_i^r ; h_j^r] + b_3)

P_{ijk} = sigmoid(W_4 · g_{ij} + b_4)

where Relu represents the Relu activation function, sigmoid represents the sigmoid activation function, W_3 and W_4 represent trainable weights, and b_3 and b_4 indicate the offsets.

A binary cross entropy loss function is selected as the loss function for the relation classification, which is defined as follows:

L_RC = - Σ_{i,j} Σ_k [ y_{ijk} · log P_{ijk} + (1 - y_{ijk}) · log(1 - P_{ijk}) ]

where r_k represents the tag (relation) type, P_{ijk} represents the predicted value of the tag, and y_{ijk} represents the true value of the tag.
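An analogous sketch for the relation table: ReLU followed by a sigmoid per predefined relation, trained with binary cross-entropy; dimensions and the pair construction are again illustrative assumptions:

```python
import torch
import torch.nn as nn

class RelationTableScorer(nn.Module):
    """For each candidate (subject head, object head) word pair, outputs a
    probability for every predefined relation type."""
    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.fc1 = nn.Linear(2 * dim, dim)
        self.fc2 = nn.Linear(dim, num_relations)

    def forward(self, Hr: torch.Tensor) -> torch.Tensor:
        b, l, d = Hr.shape
        pair = torch.cat(
            [Hr.unsqueeze(2).expand(b, l, l, d),
             Hr.unsqueeze(1).expand(b, l, l, d)],
            dim=-1,
        )
        return torch.sigmoid(self.fc2(torch.relu(self.fc1(pair))))

bce = nn.BCELoss()  # binary cross-entropy over the relation table, as above
```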
in the embodiment of the invention, the early prediction is carried out through the first extraction model to obtain the early recognition result of entity recognition and the early classification result of relation classification, which are respectively recorded as
Figure SMS_136
The embodiment of the invention uses the mutual information modules to filter out the noise information in the early entity recognition and relation classification predictions, and then inputs the filtered information into the entity recognition and relation classification tasks of the second extraction model respectively, thereby realizing bidirectional interaction between entity recognition and relation classification. Entity recognition and relation classification often facilitate each other. For example, for the financial text "Silver Life may invest in bonds, money market instruments and asset-backed securities", it is easier to determine the "invest" relation if entity information such as "Silver Life" and "bonds" is known. Likewise, if the "invest" relation is known, entity pairs such as "Silver Life" and "bonds" or "Silver Life" and "asset-backed securities" are favored over pairs such as "securities" and "asset-backed securities" during entity recognition. Therefore, partial entity information and partial relation information are obtained through early prediction and mutual information and then fed into the second extraction model, realizing bidirectional interaction between entity recognition and relation classification and improving the accuracy of triplet extraction. Meanwhile, before the bidirectional interaction, the partition filtering module filters out the features irrelevant to the tasks, which improves the accuracy of early prediction, reduces the noise information in the early results, accelerates model convergence, and makes the overall structure more reasonable.
In the embodiment of the present invention, step 4 and step 5 specifically include: the input feature vector H^e, the early recognition result y_NER^early and the early classification result y_RC^early are input into the first mutual information module MI_1, in which the filtered entity information stream Z_e is calculated; the word vector H^r, the early recognition result y_NER^early and the early classification result y_RC^early are input into the second mutual information module MI_2, in which the filtered relation information stream Z_r is calculated.

Because the entity information stream Z_e and the relation information stream Z_r contain the early relation classification and entity recognition information, using the entity information stream Z_e as the input of the entity recognition task in the second extraction model realizes the interaction from relation classification (RC) to entity recognition (NER), and using the relation information stream Z_r as the input of the relation classification task in the second extraction model realizes the interaction from entity recognition (NER) to relation classification (RC), thereby realizing bidirectional interaction between entity recognition and relation classification.
The loss of this part is calculated by minimizing mutual information; two loss terms of the same form are used, one for the entity information stream Z_e and one for the relation information stream Z_r. In these terms, I(· ; ·) represents mutual information, p(z | x) represents the probability that the output is z when the input is x, and q is a variational approximation of p.
Because the two formulas are solved in similar ways, in the embodiment of the invention only the first formula is solved. The filtered entity information stream Z_e is obtained by sampling: first, two neural networks are used to model and solve p and its variational approximation q, calculating the corresponding mean μ and standard deviation σ, and the entity information stream Z_e is then sampled from the Gaussian distribution N(μ, σ²).
For p, two multi-layer perceptrons MLP_μ and MLP_σ are used, which take [H^e ; y_NER^early ; y_RC^early] as input and calculate the mean μ_p and the standard deviation σ_p of the entity information stream Z_e.

For q, the calculation is performed using gated recurrent units (GRU cells), GRU_μ and GRU_σ, which are used to calculate the mean μ_q and the standard deviation σ_q of the entity information stream Z_e, respectively.
The structure of the GRU cells is as follows:

z_t = sigmoid(W_z · [h_{t-1} ; x_t])

r_t = sigmoid(W_r · [h_{t-1} ; x_t])

h̃_t = tanh(W_h · [r_t ⊙ h_{t-1} ; x_t])

h_t = (1 - z_t) ⊙ h_{t-1} + z_t ⊙ h̃_t

where [x ; y] denotes the splicing (concatenation) of vector x and vector y, ⊙ denotes element-wise multiplication, and W_z, W_r and W_h denote the learnable parameters.

After obtaining the means μ_p, μ_q and the standard deviations σ_p, σ_q of p and q, the mutual information loss L_MI is calculated by the KL divergence formula.
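A sketch of the sampling and KL computation in the mutual information module, assuming PyTorch; the use of log standard deviations and the network shapes are implementation assumptions rather than details fixed by the disclosure:

```python
import torch
import torch.nn as nn

class GaussianHead(nn.Module):
    """Maps an input to the mean and log standard deviation of a Gaussian,
    from which the filtered information stream is drawn by reparameterization
    (a stand-in for the MLP / GRU heads described above)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.mu = nn.Linear(in_dim, out_dim)
        self.log_sigma = nn.Linear(in_dim, out_dim)

    def forward(self, x: torch.Tensor):
        mu, log_sigma = self.mu(x), self.log_sigma(x)
        z = mu + torch.exp(log_sigma) * torch.randn_like(mu)  # sampled stream
        return z, mu, log_sigma

def gaussian_kl(mu_p, log_sigma_p, mu_q, log_sigma_q):
    """KL( N(mu_p, sigma_p^2) || N(mu_q, sigma_q^2) ), summed over elements."""
    var_p = torch.exp(2 * log_sigma_p)
    var_q = torch.exp(2 * log_sigma_q)
    return 0.5 * torch.sum(
        (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
        + 2.0 * (log_sigma_q - log_sigma_p)
    )
```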
Specifically, step 6 includes: inputting the entity information stream and the relation information stream into the second extraction model, carrying out entity recognition on the entity information stream to obtain an entity recognition result, and carrying out relation classification on the relation information stream to obtain a relation classification result. The entity recognition task and the relation classification task are similar to those of step 3, except that the input of the entity recognition task is the entity information stream Z_e and the input of the relation classification task is the relation information stream Z_r.
Specifically, step 7 includes: decoding the entity recognition result and the relation classification result respectively to obtain the subject recognition result s, the object recognition result o and the relation r between the subject and the object; the subject recognition result s, the object recognition result o and the relation r between the subject and the object are then formed into a triplet to obtain the entity relation extraction result <s, r, o> of the financial text.
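Finally, a toy sketch of assembling the decoded results into triplets; the decoded dictionaries below are hypothetical stand-ins for the actual table-decoding output:

```python
def assemble_triples(subjects, objects, relations):
    """subjects/objects map span ids to surface strings; relations maps a
    (subject_id, object_id) pair to its relation label. Returns the
    <subject, relation, object> triplets."""
    return [(subjects[s], rel, objects[o]) for (s, o), rel in relations.items()]

# Hypothetical decoded results for the fund clause quoted earlier.
subjects = {0: "constituent stocks of the target index"}
objects = {0: "non-cash assets", 1: "net value of the fund assets"}
relations = {(0, 0): "asset ratio", (0, 1): "asset ratio"}
print(assemble_triples(subjects, objects, relations))
```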
According to the embodiment of the invention, the feature information of the financial text to be processed is acquired, and local attention modules established on two convolutional neural networks carry out partition filtering on the feature information of the financial text to obtain the input feature vector related to the entity recognition task and the word vector related to the relation classification task, so that the features are distinguished by task at the data feature level and the problem of feature contradiction is avoided. The first extraction model is used for early prediction: the input feature vector and the word vector are input into the first extraction model, entity recognition is carried out on the input feature vector to obtain an early recognition result, and relation classification is carried out on the word vector to obtain an early classification result. The input feature vector, the early recognition result and the early classification result are input into the first mutual information module for calculation to obtain the entity information stream, and the word vector, the early recognition result and the early classification result are input into the second mutual information module for calculation to obtain the relation information stream; the calculations of the first and second mutual information modules filter out the noise information in the early recognition and classification results. The entity information stream is input into the second extraction model for entity recognition to obtain the entity recognition result, and the relation information stream is input into the second extraction model for relation classification to obtain the relation classification result; because the entity information stream and the relation information stream contain the early entity recognition and relation classification results, feeding them into the second extraction model realizes bidirectional interaction between entity recognition and relation classification. Finally, the entity recognition result is decoded to obtain the subject recognition result and the object recognition result, the relation classification result is decoded to obtain the relation between the subject-object pair, and the subject recognition result, the object recognition result and the relation between the subject-object pair are formed into triplets to obtain the entity relation extraction result of the financial text; the accuracy of financial data entity relation extraction is thereby improved.
The invention also provides a joint extraction device of the financial data entity relationship, which comprises:
the acquisition module is used for acquiring the characteristic information of the financial text to be processed;
the partition filtering module is used for performing partition filtering on the characteristic information to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relationship classification task;
the early recognition classification module is used for inputting the input feature vector and the word vector into the first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
the first calculation module is used for inputting the input feature vector, the early recognition result and the early classification result into the first mutual information module for calculation to obtain an entity information flow;
the second calculation module is used for inputting the word vector, the early recognition result and the early classification result into the second mutual information module for calculation to obtain a relation information flow;
the identification classification module is used for inputting the entity information flow and the relation information flow into the second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
And the extraction module is used for respectively decoding the entity recognition result and the relation classification result to obtain a subject recognition result, an object recognition result and the relation between the subject and the object, and forming the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain the entity relation extraction result of the financial text.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the embodiments of the present invention. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
The invention also provides a computer readable storage medium, wherein the computer readable storage medium stores a computer program, and the computer program realizes the joint extraction method of the financial data entity relationship when being executed by a processor.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the implementation of all or part of the flow of the method of the foregoing embodiments of the present invention may be accomplished by a computer program to instruct related hardware, where the computer program may be stored in a computer readable storage medium, and the computer program may implement the steps of each of the foregoing method embodiments when executed by a processor. Wherein the computer program comprises computer program code which may be in source code form, object code form, executable file or some intermediate form etc. The computer readable medium may include at least: any entity or device capable of carrying computer program code to construct an apparatus/terminal equipment, recording medium, computer Memory, read-Only Memory (ROM), random access Memory (RAM, random Access Memory), electrical carrier signals, telecommunications signals, and software distribution media. Such as a U-disk, removable hard disk, magnetic or optical disk, etc. In some jurisdictions, computer readable media may not be electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The invention also provides a terminal device, which comprises a memory, a processor and a computer program stored in the memory and capable of running on the processor, wherein the processor realizes the joint extraction method of the financial data entity relationship when executing the computer program.
It should be noted that the terminal device may be a mobile phone, a tablet computer, a notebook computer, an Ultra mobile personal computer (UMPC, ultra-mobile Personal Computer), a netbook, a personal digital assistant (PDA, personal Digital Assistant), or the like, and the terminal device may be a station (ST, stand) in a WLAN, for example, a cellular phone, a cordless phone, a session initiation protocol (SIP, session Initiation Protocol) phone, a wireless local loop (WLL, wireless Local Loop) station, a personal digital processing (PDA, personal Digital Assistant) device, a handheld device having a wireless communication function, a computing device, or other processing device connected to a wireless modem, a computer, a laptop computer, a handheld communication device, a handheld computing device, a satellite wireless device, or the like. The embodiment of the invention does not limit the specific type of the terminal equipment.
The processor may be a central processing unit (CPU, central Processing Unit), but may also be other general purpose processors, digital signal processors (DSP, digital Signal Processor), application specific integrated circuits (ASIC, application Specific Integrated Circuit), off-the-shelf programmable gate arrays (FPGA, field-Programmable Gate Array) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components, or the like. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
In some embodiments, the memory may be an internal storage unit of the terminal device, such as a hard disk or internal memory of the terminal device. In other embodiments, the memory may also be an external storage device of the terminal device, such as a plug-in hard disk, a SmartMedia Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) provided on the terminal device. Further, the memory may include both an internal storage unit and an external storage device of the terminal device. The memory is used to store an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory may also be used to temporarily store data that has been output or is to be output.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present invention, specific functions and technical effects thereof may be found in the method embodiment section, and will not be described herein.
While the foregoing is directed to the preferred embodiments of the present invention, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present invention, and such modifications and adaptations are intended to be comprehended within the scope of the present invention.

Claims (9)

1. A method for joint extraction of financial data entity relationships, comprising:
step 1, obtaining characteristic information of a financial text to be processed;
step 2, carrying out partition filtering on the characteristic information to obtain an input characteristic vector related to the entity recognition task and a word vector related to the relationship classification task;
step 3, inputting the input feature vector and the word vector into a first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
step 4, inputting the input feature vector, the early recognition result and the early classification result into a first mutual information module for calculation to obtain an entity information flow;
step 5, inputting the word vector, the early recognition result and the early classification result into a second mutual information module for calculation to obtain a relation information flow;
step 6, inputting the entity information flow and the relation information flow into a second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
and step 7, decoding the entity recognition result and the relation classification result respectively to obtain a subject recognition result, an object recognition result and a relation between the subject and the object, and combining the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain an entity relation extraction result of the financial text.
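By way of non-limiting illustration only, the seven steps recited above can be read as a two-round extraction pipeline with a mutual-information bridge between the rounds. The following minimal Python/PyTorch sketch shows one possible arrangement of the data flow; every module name in it (encoder, partition filter, mutual-information bridges, decoder) is a hypothetical placeholder, not a component disclosed verbatim by this patent:

```python
import torch.nn as nn

class JointExtractor(nn.Module):
    """Illustrative skeleton of the claimed two-round joint extraction flow."""

    def __init__(self, encoder, partition_filter, extractor1,
                 mi_entity, mi_relation, extractor2, decoder):
        super().__init__()
        self.encoder = encoder                    # step 1: feature information of the financial text
        self.partition_filter = partition_filter  # step 2: split into entity / relation task features
        self.extractor1 = extractor1              # step 3: early recognition and early classification
        self.mi_entity = mi_entity                # step 4: first mutual information module
        self.mi_relation = mi_relation            # step 5: second mutual information module
        self.extractor2 = extractor2              # step 6: final entity recognition and relation classification
        self.decoder = decoder                    # step 7: decode subjects, objects and relations into triplets

    def forward(self, text):
        h = self.encoder(text)                                         # step 1
        ent_feats, rel_feats = self.partition_filter(h)                # step 2
        early_ent, early_rel = self.extractor1(ent_feats, rel_feats)   # step 3
        ent_flow = self.mi_entity(ent_feats, early_ent, early_rel)     # step 4
        rel_flow = self.mi_relation(rel_feats, early_ent, early_rel)   # step 5
        ent_out, rel_out = self.extractor2(ent_flow, rel_flow)         # step 6
        return self.decoder(ent_out, rel_out)                          # step 7: (subject, relation, object)
```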
2. The method of claim 1, wherein the step 1 includes:
acquiring a financial text to be processed;
and extracting features of the financial text through a FinBERT pre-training model to obtain deep feature information of the financial text.
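As a rough sketch of this step only, the snippet below obtains per-token deep features from a FinBERT checkpoint via the Hugging Face transformers library. The checkpoint name "ProsusAI/finbert" is an assumption for illustration; the claim does not identify which published FinBERT weights are used, and a Chinese financial-domain checkpoint would be the more natural fit for Chinese financial text:

```python
import torch
from transformers import AutoTokenizer, AutoModel

# Assumed checkpoint for illustration only; the claim does not name specific weights.
tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
finbert = AutoModel.from_pretrained("ProsusAI/finbert")

text = "Company A acquired a 30% stake in Company B."
inputs = tokenizer(text, return_tensors="pt")

with torch.no_grad():
    outputs = finbert(**inputs)

# Deep feature information: one contextual vector per token, shape (1, seq_len, hidden_size).
deep_features = outputs.last_hidden_state
```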
3. The method of claim 2, wherein the step 2 includes:
respectively inputting the deep feature information of the financial text into two convolutional neural networks, and performing local feature extraction on the deep feature information by controlling the convolution window size and the convolution step size, to obtain multi-granularity entity partition task features and multi-granularity relation partition task features;
and respectively establishing a local attention module at the output end of each convolutional neural network by utilizing the multi-granularity entity partition task characteristics, the multi-granularity relation partition task characteristics and the deep characteristic information, and filtering the deep characteristic information through a scoring function of the local attention module to obtain an input characteristic vector related to an entity identification task and a word vector related to a relation classification task.
4. The method of claim 3, wherein:

the multi-granularity entity partition task features are computed as:

$$c^{e}_{i} = \mathrm{CNN}^{e}\left(X_{i:i+k-1}\right)$$

$$H^{e}_{k,s,p} = \left[c^{e}_{1};\ c^{e}_{2};\ \ldots;\ c^{e}_{\lfloor (n+2p-k)/s \rfloor + 1}\right]$$

the multi-granularity relation partition task features are computed as:

$$c^{r}_{i} = \mathrm{CNN}^{r}\left(X_{i:i+k-1}\right)$$

$$H^{r}_{k,s,p} = \left[c^{r}_{1};\ c^{r}_{2};\ \ldots;\ c^{r}_{\lfloor (n+2p-k)/s \rfloor + 1}\right]$$

wherein $H^{e}_{k,s,p}$ denotes the multi-granularity entity partition task feature obtained with convolution window $k$, convolution step size $s$ and zero padding $p$; $H^{r}_{k,s,p}$ denotes the multi-granularity relation partition task feature obtained with convolution window $k$, convolution step size $s$ and zero padding $p$; $c^{e}_{i}$ denotes the feature vector obtained at each slide of the convolution window when computing the multi-granularity entity partition task features; $c^{r}_{i}$ denotes the feature vector obtained at each slide of the convolution window when computing the multi-granularity relation partition task features; $\mathrm{CNN}^{e}$ and $\mathrm{CNN}^{r}$ denote the two convolutional neural networks; $X_{i:i+k-1}$ denotes the word vector matrix of the $i$-th to the $(i+k-1)$-th words; $k$ denotes the number of words contained in the window; $d$ denotes the dimension of a single word vector; $n$ denotes the length of the sentence; and $i$ denotes the offset.
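A minimal sketch of this computation, assuming a PyTorch Conv1d implementation: each (window size k, stride s, zero padding p) configuration slides over the sentence's word vectors and yields floor((n + 2p - k)/s) + 1 feature vectors, one granularity per configuration. The concrete (k, s, p) values and channel sizes below are illustrative assumptions, not values disclosed by the claim:

```python
import torch
import torch.nn as nn

d, n = 768, 32                          # word vector dimension and sentence length (illustrative)
deep_features = torch.randn(1, n, d)    # stand-in for the encoder output

# One Conv1d per (window k, stride s, zero padding p) configuration; each yields
# entity partition task features at a different granularity. The (k, s, p) values
# here are assumptions chosen only to show the mechanics.
entity_convs = nn.ModuleList([
    nn.Conv1d(in_channels=d, out_channels=d, kernel_size=k, stride=s, padding=p)
    for (k, s, p) in [(2, 1, 1), (3, 1, 1), (5, 1, 2)]
])

x = deep_features.transpose(1, 2)       # Conv1d expects (batch, channels, length)
entity_task_feats = [torch.relu(conv(x)).transpose(1, 2) for conv in entity_convs]
# Each element has shape (1, L, d) with L = (n + 2p - k) // s + 1,
# i.e. one feature vector per slide of the convolution window.
# A second, separately parameterised bank of convolutions would produce the
# relation partition task features in the same way.
```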
5. The method of claim 4, wherein filtering the deep feature information by the scoring function of the local attention module comprises:

defining the multi-granularity entity partition task features as the query vector, namely:

$$Q_{e} = H^{e}_{k,s,p}$$

and obtaining a key vector and a value vector from the deep feature information:

$$K_{e} = W_{K}\, H$$

$$V_{e} = W_{V}\, H$$

wherein $H$ denotes the deep feature information, $K_{e}$ denotes the obtained key vector, $V_{e}$ denotes the obtained value vector, and $W_{K}$ and $W_{V}$ denote trainable matrices;

the input feature vector $S_{e}$ related to the entity recognition task is:

$$S_{e} = \sigma\!\left(\frac{Q_{e}\, K_{e}^{\top}}{\sqrt{d}}\right) V_{e}$$

wherein $\sigma(\cdot)$ denotes an activation function, $K_{e}^{\top}$ denotes the transpose of the key vector $K_{e}$, and $d$ denotes the dimension of a single word vector.
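The scoring function recited above is essentially a scaled dot-product attention in which the partition task features supply the query while the key and value are trainable projections of the deep feature information. A minimal sketch follows, assuming a PyTorch implementation and assuming softmax as the activation function (the claim only recites "an activation function"); the same form, with its own parameters, would serve the relation branch of claim 6:

```python
import math
import torch
import torch.nn as nn

class LocalAttentionFilter(nn.Module):
    """Sketch of the claimed scoring function: task features act as the query;
    key and value are trainable projections of the deep feature information."""

    def __init__(self, d):
        super().__init__()
        self.w_k = nn.Linear(d, d, bias=False)   # trainable matrix producing the key vector
        self.w_v = nn.Linear(d, d, bias=False)   # trainable matrix producing the value vector

    def forward(self, task_feats, deep_feats):
        q = task_feats                            # query: multi-granularity partition task features
        k = self.w_k(deep_feats)                  # key vector obtained from the deep features
        v = self.w_v(deep_feats)                  # value vector obtained from the deep features
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        return torch.softmax(scores, dim=-1) @ v  # filtered, task-related feature vectors

# Illustrative usage for the entity branch (shapes are assumptions); a separate
# instance with its own parameters would play the same role for the relation branch.
attn = LocalAttentionFilter(d=768)
entity_input = attn(torch.randn(1, 30, 768), torch.randn(1, 32, 768))  # -> (1, 30, 768)
```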
6. The method of claim 5, wherein filtering the deep feature information by the scoring function of the local attention module further comprises:

defining the multi-granularity relation partition task features as the query vector, namely:

$$Q_{r} = H^{r}_{k,s,p}$$

and obtaining a key vector and a value vector from the deep feature information:

$$K_{r} = W'_{K}\, H$$

$$V_{r} = W'_{V}\, H$$

wherein $H$ denotes the deep feature information, $K_{r}$ denotes the obtained key vector, $V_{r}$ denotes the obtained value vector, and $W'_{K}$ and $W'_{V}$ denote trainable matrices;

the word vector $S_{r}$ related to the relation classification task is:

$$S_{r} = \sigma\!\left(\frac{Q_{r}\, K_{r}^{\top}}{\sqrt{d}}\right) V_{r}$$

wherein $\sigma(\cdot)$ denotes an activation function, $K_{r}^{\top}$ denotes the transpose of the key vector $K_{r}$, and $d$ denotes the dimension of a single word vector.
7. A joint extraction device for financial data entity relationship, comprising:
The acquisition module is used for acquiring the characteristic information of the financial text to be processed;
the partition filtering module is used for carrying out partition filtering on the characteristic information to obtain an input characteristic vector related to the entity identification task and a word vector related to the relation classification task;
the early recognition classification module is used for inputting the input feature vector and the word vector into a first extraction model, carrying out entity recognition on the input feature vector to obtain an early recognition result, and carrying out relation classification on the word vector to obtain an early classification result;
the first calculation module is used for inputting the input feature vector, the early recognition result and the early classification result into the first mutual information module for calculation to obtain an entity information flow;
the second calculation module is used for inputting the word vector, the early recognition result and the early classification result into the second mutual information module for calculation to obtain a relation information flow;
the identification classification module is used for inputting the entity information flow and the relation information flow into a second extraction model, carrying out entity identification on the entity information flow to obtain an entity identification result, and carrying out relation classification on the relation information flow to obtain a relation classification result;
and the extraction module is used for decoding the entity recognition result and the relation classification result respectively to obtain a subject recognition result, an object recognition result and a relation between the subject and the object, and combining the subject recognition result, the object recognition result and the relation between the subject and the object into a triplet to obtain the entity relation extraction result of the financial text.
8. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method for joint extraction of financial data entity relationships according to any one of claims 1 to 6.
9. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the method for joint extraction of financial data entity relationships according to any one of claims 1 to 6 when executing the computer program.
CN202310390977.8A 2023-04-13 2023-04-13 Combined extraction method of financial data entity relationship and related equipment Active CN116108206B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310390977.8A CN116108206B (en) 2023-04-13 2023-04-13 Combined extraction method of financial data entity relationship and related equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310390977.8A CN116108206B (en) 2023-04-13 2023-04-13 Combined extraction method of financial data entity relationship and related equipment

Publications (2)

Publication Number Publication Date
CN116108206A true CN116108206A (en) 2023-05-12
CN116108206B CN116108206B (en) 2023-06-27

Family

ID=86267680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310390977.8A Active CN116108206B (en) 2023-04-13 2023-04-13 Combined extraction method of financial data entity relationship and related equipment

Country Status (1)

Country Link
CN (1) CN116108206B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
US20190155898A1 (en) * 2017-11-23 2019-05-23 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and device for extracting entity relation based on deep learning, and server
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
CN112949312A (en) * 2021-03-26 2021-06-11 中国美术学院 Product knowledge fusion method and system
WO2021174774A1 (en) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relationship extraction method, computer device, and readable storage medium
US20220121822A1 (en) * 2020-10-21 2022-04-21 Beijing Wodong Tianjun Information Technology Co., Ltd. System and method for relation extraction with adaptive thresholding and localized context pooling
WO2023004528A1 (en) * 2021-07-26 2023-02-02 深圳市检验检疫科学研究院 Distributed system-based parallel named entity recognition method and apparatus
CN115687634A (en) * 2022-09-06 2023-02-03 华中科技大学 Financial entity relationship extraction system and method combining priori knowledge
CN115712709A (en) * 2022-11-18 2023-02-24 哈尔滨工业大学 Multi-modal dialog question-answer generation method based on multi-relationship graph model
CN115759098A (en) * 2022-11-14 2023-03-07 中国科学院空间应用工程与技术中心 Chinese entity and relation combined extraction method and system for space text data

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190155898A1 (en) * 2017-11-23 2019-05-23 Beijing Baidu Netcom Science And Technology Co. Ltd. Method and device for extracting entity relation based on deep learning, and server
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A kind of text implication relation recognition methods for merging more granular informations
CN111274394A (en) * 2020-01-16 2020-06-12 重庆邮电大学 Method, device and equipment for extracting entity relationship and storage medium
WO2021174774A1 (en) * 2020-07-30 2021-09-10 平安科技(深圳)有限公司 Neural network relationship extraction method, computer device, and readable storage medium
US20220121822A1 (en) * 2020-10-21 2022-04-21 Beijing Wodong Tianjun Information Technology Co., Ltd. System and method for relation extraction with adaptive thresholding and localized context pooling
CN112949312A * 2021-03-26 2021-06-11 China Academy of Art Product knowledge fusion method and system
US20220309248A1 (en) * 2021-03-26 2022-09-29 China Academy of Art Method and system for product knowledge fusion
WO2023004528A1 * 2021-07-26 2023-02-02 Shenzhen Academy of Inspection and Quarantine Distributed system-based parallel named entity recognition method and apparatus
CN115687634A * 2022-09-06 2023-02-03 Huazhong University of Science and Technology Financial entity relationship extraction system and method combining prior knowledge
CN115759098A * 2022-11-14 2023-03-07 Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences Chinese entity and relation joint extraction method and system for space text data
CN115712709A * 2022-11-18 2023-02-24 Harbin Institute of Technology Multi-modal dialogue question-answer generation method based on a multi-relational graph model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZHIHENG YAN et al.: "A Partition Filter Network for Joint Entity and Relation Extraction", HTTPS://ARXIV.ORG/PDF/2018.12202.PDF, pages 1-13 *
MA Jianhong; LI Zhenzhen; ZHU Huaizhong; WEI Zimo: "Joint Entity and Relation Extraction Method with a Feedback Mechanism", Computer Science, no. 12, pages 248-255 *

Also Published As

Publication number Publication date
CN116108206B (en) 2023-06-27

Similar Documents

Publication Publication Date Title
CN109918560B (en) Question and answer method and device based on search engine
Tan et al. Out-of-domain detection for low-resource text classification tasks
CN109783655A Cross-modal retrieval method and device, computer equipment and storage medium
CN114330354B (en) Event extraction method and device based on vocabulary enhancement and storage medium
US20230130006A1 (en) Method of processing video, method of quering video, and method of training model
CN112784066B (en) Knowledge graph-based information feedback method, device, terminal and storage medium
CN112016313B (en) Spoken language element recognition method and device and warning analysis system
WO2024041479A1 (en) Data processing method and apparatus
CN110941951B (en) Text similarity calculation method, text similarity calculation device, text similarity calculation medium and electronic equipment
CN112287069A (en) Information retrieval method and device based on voice semantics and computer equipment
CN113707299A (en) Auxiliary diagnosis method and device based on inquiry session and computer equipment
CN113505601A (en) Positive and negative sample pair construction method and device, computer equipment and storage medium
CN114741468B (en) Text deduplication method, device, equipment and storage medium
CN114818718A (en) Contract text recognition method and device
Sortino et al. Transformer-based image generation from scene graphs
CN114238656A Reinforcement learning-based event graph completion method and related equipment thereof
CN110019952B (en) Video description method, system and device
CN111444335A (en) Method and device for extracting central word
CN116108206B (en) Combined extraction method of financial data entity relationship and related equipment
Wei et al. A multichannel biomedical named entity recognition model based on multitask learning and contextualized word representations
CN116092101A (en) Training method, image recognition method apparatus, device, and readable storage medium
CN115640378A (en) Work order retrieval method, server, medium and product
CN115358817A (en) Intelligent product recommendation method, device, equipment and medium based on social data
CN114911940A (en) Text emotion recognition method and device, electronic equipment and storage medium
CN113807920A (en) Artificial intelligence based product recommendation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant