CN111241843A - Semantic relation inference system and method based on composite neural network

Semantic relation inference system and method based on composite neural network

Info

Publication number
CN111241843A
CN111241843A (application CN201811446102.0A)
Authority
CN
China
Prior art keywords
texts
neural network
vectors
training
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811446102.0A
Other languages
Chinese (zh)
Other versions
CN111241843B (en)
Inventor
何广
朱琦
林鹏飞
袁源
覃玲华
毛仕文
陈开添
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Group Guangdong Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Group Guangdong Co Ltd filed Critical China Mobile Communications Group Co Ltd
Priority to CN201811446102.0A priority Critical patent/CN111241843B/en
Publication of CN111241843A publication Critical patent/CN111241843A/en
Application granted granted Critical
Publication of CN111241843B publication Critical patent/CN111241843B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The embodiment of the invention provides a semantic relation inference system and method based on a composite neural network. The system comprises a feature extraction unit, a training unit and a decision unit, wherein the training unit comprises a Siamese long short-term memory (LSTM) neural network model, a decomposable attention model and an enhanced sequential inference model. The training unit is used for receiving the word vectors, training the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model respectively on the word vectors of the two texts to be matched, and outputting the result vectors produced by the models to the decision unit. The decision unit is used for receiving the result vectors input by the training unit, integrating them through a gradient boosting decision tree, and outputting the semantic relation of the two texts to be matched. The embodiment of the invention can improve the accuracy of synonym/near-synonym relation detection.

Description

Semantic relation inference system and method based on composite neural network
Technical Field
The embodiment of the invention relates to the technical field of natural language processing, in particular to a semantic relation inference system and a semantic relation inference method based on a composite neural network.
Background
With the rise of deep learning, semantic analysis based on neural networks has become a research hotspot, and detecting synonym/near-synonym relations has become the key to inferring the contextual relation between short texts.
At present, methods for improving the accuracy of semantic relation inference mainly extract large numbers of hand-crafted features, usually selected specifically for the business scenario and the data at hand; for example, common business synonyms are normalized. However, the accuracy gains obtained this way are usually difficult to transfer to another data set, and manual feature extraction consumes most of the time spent on system construction.
Disclosure of Invention
The embodiment of the invention provides a semantic relation inference system and a semantic relation inference method based on a composite neural network, which are used for solving the problem of low semantic relation inference accuracy in the prior art.
In a first aspect, an embodiment of the present invention provides a semantic relation inference system based on a composite neural network, where the system includes a feature extraction unit, a training unit, and a decision unit, and the training unit includes a Siamese long short-term memory (LSTM) neural network model, a decomposable attention model, and an enhanced sequential inference model, where:
the feature extraction unit is used for extracting word vectors of input texts and outputting the word vectors to the training unit;
the training unit is used for receiving the word vectors, training the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model respectively on the word vectors of the two texts to be matched, and outputting the result vectors produced by the models to the decision unit;
and the decision unit is used for receiving the result vectors input by the training unit, integrating the result vectors through a gradient boosting decision tree, and outputting the semantic relation of the two texts to be matched.
In a second aspect, an embodiment of the present invention provides a semantic relation inference method for a composite neural network, where the method includes:
extracting a word vector of an input text;
training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
and integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
In a third aspect, an embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor executes the computer program to implement the method provided in the second aspect.
In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium, on which a computer program is stored, which when executed by a processor implements the method provided in the second aspect.
According to the embodiment of the invention, the word vectors are trained respectively on the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model, and the semantic relation of the word vectors is then judged through the gradient boosting decision tree, so that the detection accuracy of synonym/near-synonym relations can be improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic structural diagram of a semantic relationship inference system based on a composite neural network according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the network structure of a Siamese LSTM neural network model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the network structure of a decomposable attention model according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of the network structure of an enhanced sequential inference model according to an embodiment of the present invention;
FIG. 5 is a flowchart illustrating a semantic relationship inference method based on a composite neural network according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Fig. 1 shows a schematic structural diagram of a semantic relation inference system based on a composite neural network according to an embodiment of the present invention.
As shown in fig. 1, the system includes a feature extraction unit 11, a training unit 12, and a decision unit 13; the training unit includes a Siamese LSTM neural network model 121, a decomposable attention model 122, and an enhanced sequential inference model 123, wherein:
the feature extraction unit is used for extracting word vectors of input texts and outputting the word vectors to the training unit;
specifically, the embodiment of the present invention may use a pre-trained word vector model or train itself on the original text to generate a word vector.
The training unit is used for receiving the word vectors, training the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model respectively on the word vectors of the two texts to be matched, and outputting the result vectors produced by the models to the decision unit;
specifically, the embodiment of the invention inputs the word vectors of two texts to be matched into the training unit, and trains three models in the training unit respectively by using the data set. And finally, outputting the result vectors output by the three models to a decision unit as embedded vectors.
And the decision unit is used for receiving the result vectors input by the training unit, integrating them through a gradient boosting decision tree, and outputting the semantic relation of the two texts to be matched.
Specifically, the decision unit adopts a gradient boosting decision tree to integrate the embedded vectors input by the training unit into a final judgment of the semantic relation between the word vectors of the two texts, i.e. whether the two texts are synonymous or near-synonymous, thereby obtaining the semantic relation of the texts.
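As a minimal sketch of the decision unit (scikit-learn is an assumed choice; the patent only specifies a gradient boosting decision tree), the embedded vectors of the three models can be concatenated and fed to a GBDT classifier; hyperparameters below are illustrative:

# Sketch of the decision unit: integrate the three result vectors with a
# gradient boosting decision tree.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

def train_decision_unit(siamese_vecs, attention_vecs, esim_vecs, labels):
    # One row per text pair; labels: 1 = synonymous/near-synonymous, 0 = not.
    X = np.hstack([siamese_vecs, attention_vecs, esim_vecs])
    gbdt = GradientBoostingClassifier(n_estimators=200, max_depth=3)
    gbdt.fit(X, labels)
    return gbdt

At inference time, gbdt.predict(X_new) yields the semantic-relation label for new text pairs.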
According to the embodiment of the invention, the word vectors are trained respectively on the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model, and the semantic relation of the word vectors is then judged through the gradient boosting decision tree, so that the detection accuracy of synonym/near-synonym relations can be improved.
Meanwhile, automatic feature extraction by the neural networks reduces the workload of manual feature selection and construction during system building, so the method has a wider range of application and can infer semantic relations more conveniently and quickly.
On the basis of the above embodiment, the Siamese LSTM neural network model includes:
the first input module, used for inputting the word vectors of the two texts to be matched into two long short-term memory neural networks respectively, to obtain the final hidden states of the two texts;
the first training module, used for training with the normalized difference of the final hidden states of the two texts as the prediction label;
and the first output module, used for concatenating the final hidden states of the two trained texts and outputting the result to the decision unit.
Fig. 2 is a schematic network structure diagram of the Siamese LSTM neural network model provided by an embodiment of the present invention.
As shown in fig. 2, the Siamese long short-term memory neural network (Siamese LSTM) model provided by the embodiment of the present invention includes two long short-term memory neural networks (LSTM-A and LSTM-B), and the training process is as follows:
respectively inputting two texts to be matched into two LSTM networks;
and taking the normalized difference of the final hidden states of LSTM-A and LSTM-B as the prediction label, which is trained to match the label provided by the data set; the prediction label is calculated as:
exp(-||h_A - h_B||_1)
where h_A and h_B are the final hidden states of LSTM-A and LSTM-B respectively.
After training is finished, the final hidden states of LSTM-A and LSTM-B are concatenated at inference time and input into the final gradient boosting decision tree model.
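A minimal PyTorch sketch of this branch, assuming the hidden size and a mean-squared-error training loss (the patent fixes neither); the exp(-||h_A - h_B||_1) similarity follows the formula above:

# Sketch of the Siamese LSTM branch: two LSTMs encode the two texts, the
# prediction label is exp(-L1 distance of the final hidden states), and the
# concatenated hidden states are later passed to the decision tree.
import torch
import torch.nn as nn

class SiameseLSTM(nn.Module):
    def __init__(self, dim=300, hidden=128):
        super().__init__()
        self.lstm_a = nn.LSTM(dim, hidden, batch_first=True)  # LSTM-A
        self.lstm_b = nn.LSTM(dim, hidden, batch_first=True)  # LSTM-B

    def forward(self, text_a, text_b):
        _, (h_a, _) = self.lstm_a(text_a)
        _, (h_b, _) = self.lstm_b(text_b)
        h_a, h_b = h_a[-1], h_b[-1]            # final hidden states
        pred = torch.exp(-torch.norm(h_a - h_b, p=1, dim=1))  # in (0, 1]
        # pred is trained against the data-set label (e.g. with MSE);
        # [h_a; h_b] is the embedded vector passed to the decision tree.
        return pred, torch.cat([h_a, h_b], dim=1)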
On the basis of the above embodiment, the decomposable attention model includes:
the second input module, used for inputting the word vectors of the two texts to be matched into a decomposable attention matrix to obtain the aligned word vectors for the positions of the two sets of word vectors;
the second training unit, used for inputting the comparison result of each aligned word vector and the original word vector at the corresponding position into a feedforward neural network for training;
and the second output unit, used for concatenating the pooled position-comparison vectors of the two trained texts and outputting them to the decision unit.
Fig. 3 shows a network structure diagram of the decomposable attention model provided by the embodiment of the present invention.
As shown in fig. 3, the training process of the decomposable attention model is as follows:
Each word vector is weighted by a neural network; these weights constitute the decomposable attention (Decomposable Attention). Denote the word vector at position i of text A as a_i, the word vector at position j of text B as b_j, and the neural network as a function F(·). Each element of the attention matrix is then:
e_ij = F(a_i)^T F(b_j)
The attention matrix is aggregated with the original word vectors to compute the aligned word vector for each text word-vector position (β_i aligned to position i of text A, α_j aligned to position j of text B):
β_i = Σ_j (exp(e_ij) / Σ_k exp(e_ik)) · b_j
α_j = Σ_i (exp(e_ij) / Σ_k exp(e_kj)) · a_i
Each aligned word vector obtained above is compared with the original word vector at the corresponding position; the comparison concatenates the two vectors and inputs them into a feedforward neural network.
The comparison results over the word-vector positions are then integrated across the text range using global average pooling (Global Average Pooling).
The pooled vectors of the two texts are concatenated and output to a final linear layer to obtain the final inference result.
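The following PyTorch sketch captures the branch just described (layer sizes and the two-class output are assumptions): attend corresponds to F(·), the two softmax products compute β and α, and the comparison network is followed by global average pooling and the final linear layer:

# Sketch of the decomposable attention branch.
import torch
import torch.nn as nn

class DecomposableAttention(nn.Module):
    def __init__(self, dim=300, hidden=200, classes=2):
        super().__init__()
        self.attend = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())    # F(.)
        self.compare = nn.Sequential(nn.Linear(2 * dim, hidden), nn.ReLU())
        self.out = nn.Linear(2 * hidden, classes)   # final linear layer

    def forward(self, a, b):                   # a: (B, La, d), b: (B, Lb, d)
        e = self.attend(a) @ self.attend(b).transpose(1, 2)   # e_ij matrix
        beta = torch.softmax(e, dim=2) @ b     # aligned vectors for text A
        alpha = torch.softmax(e, dim=1).transpose(1, 2) @ a   # for text B
        # Compare aligned and original vectors, then global average pooling.
        v_a = self.compare(torch.cat([a, beta], dim=-1)).mean(dim=1)
        v_b = self.compare(torch.cat([b, alpha], dim=-1)).mean(dim=1)
        v = torch.cat([v_a, v_b], dim=1)
        return self.out(v), v   # logits for training, embedding for the GBDT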
Fig. 4 is a schematic network structure diagram of the enhanced sequential inference model provided by an embodiment of the present invention.
As shown in fig. 4, the enhanced sequential inference model provided in the embodiment of the present invention comprises a Siamese LSTM layer, a decomposable attention layer and a further long short-term memory neural network. It receives the paired word vectors, trains the Siamese LSTM, the decomposable attention layer and the LSTM in turn, applies global max pooling and global average pooling over the text range to the hidden states output by the LSTM, and outputs the concatenation of the max-pooled and average-pooled vectors of the paired word vectors to the decision unit.
The training process of the enhanced sequence inference model is as follows:
inputting two texts to be matched into two LSTM networks (LSTM-A1 and LSTM-B1);
the hidden state at each step of this first-layer LSTM is used as the local context encoding of the word at that position, denoted a_i for text A and b_j for text B;
the elements e_ij of the attention matrix are computed from the local encodings, and the aligned local encoding at each position is obtained as:
a'_i = Σ_j (exp(e_ij) / Σ_k exp(e_ik)) · b_j
b'_j = Σ_i (exp(e_ij) / Σ_k exp(e_kj)) · a_i
the aligned local encodings and the original local encodings are combined and transformed to obtain the integrated local encodings;
the integrated local encodings are input in sequence into the next-layer LSTM; the hidden states it outputs are then subjected to global max pooling and global average pooling across the text range to obtain a global text representation;
the max-pooled vectors and the average-pooled vectors corresponding to the two texts are concatenated; the concatenated vector is either output to a neural network acting as the final decision device and trained, or output directly to the final gradient boosting decision tree model.
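A condensed PyTorch sketch of this branch (an ESIM-style reading of the description; layer sizes and the [x; x'; x - x'; x * x'] integration are assumptions borrowed from the standard enhanced sequential inference model):

# Sketch of the enhanced sequential inference branch: first-layer LSTMs,
# soft alignment, integration, a second LSTM, then max- and mean-pooling.
import torch
import torch.nn as nn

class EnhancedSequentialInference(nn.Module):
    def __init__(self, dim=300, hidden=128):
        super().__init__()
        self.encode_a = nn.LSTM(dim, hidden, batch_first=True)  # LSTM-A1
        self.encode_b = nn.LSTM(dim, hidden, batch_first=True)  # LSTM-B1
        self.project = nn.Sequential(nn.Linear(4 * hidden, hidden), nn.ReLU())
        self.compose = nn.LSTM(hidden, hidden, batch_first=True)  # next layer

    def _pool(self, x):
        # Global max pooling and global average pooling over the text range.
        return torch.cat([x.max(dim=1).values, x.mean(dim=1)], dim=1)

    def forward(self, a, b):
        a_enc, _ = self.encode_a(a)            # local context encodings a_i
        b_enc, _ = self.encode_b(b)            # local context encodings b_j
        e = a_enc @ b_enc.transpose(1, 2)      # attention matrix e_ij
        a_alig = torch.softmax(e, dim=2) @ b_enc   # aligned local encodings
        b_alig = torch.softmax(e, dim=1).transpose(1, 2) @ a_enc
        # Integrate aligned and original encodings, then compose.
        m_a = self.project(torch.cat([a_enc, a_alig, a_enc - a_alig, a_enc * a_alig], -1))
        m_b = self.project(torch.cat([b_enc, b_alig, b_enc - b_alig, b_enc * b_alig], -1))
        v_a, _ = self.compose(m_a)
        v_b, _ = self.compose(m_b)
        return torch.cat([self._pool(v_a), self._pool(v_b)], dim=1)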
The embodiment of the invention outputs the embedded vectors of the three models of the training unit to the final gradient boosting decision tree model, which outputs the final result.
Fig. 5 is a flow chart illustrating a semantic relation inference method based on a composite neural network according to an embodiment of the present invention.
As shown in fig. 5, the semantic relationship inference method based on a composite neural network provided in the embodiment of the present invention specifically includes the following steps:
s11, extracting word vectors of the input text;
Specifically, the embodiment of the present invention may use a pre-trained word-vector model, or train one on the original text, to generate the word vectors.
S12, training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
Specifically, in the embodiment of the invention the word vectors of the two texts to be matched are input into the training unit, and the three models in the training unit are trained separately with the data set. Finally, the result vectors output by the three models are passed to the decision unit as embedded vectors (embedding vectors).
And S13, integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
Specifically, the decision unit adopts a gradient boosting decision tree to integrate the embedded vectors input by the training unit into a final judgment of the semantic relation between the word vectors of the two texts, i.e. whether the two texts are synonymous or near-synonymous, thereby obtaining the semantic relation of the texts.
According to the embodiment of the invention, the word vectors are trained respectively on the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model, and the semantic relation of the word vectors is then judged through the gradient boosting decision tree, so that the detection accuracy of synonym/near-synonym relations can be improved.
Meanwhile, automatic feature extraction by the neural networks reduces the workload of manual feature selection and construction during system building, so the method has a wider range of application and can infer semantic relations more conveniently and quickly.
On the basis of the above embodiment, S12 specifically includes the training step of the Siamese LSTM neural network model:
inputting the word vectors of the two texts to be matched into two long short-term memory neural networks respectively, to obtain the final hidden states of the two texts;
training with the normalized difference of the final hidden states of the two texts as the prediction label;
and concatenating the final hidden states of the two trained texts, and outputting the result to the decision unit.
Referring to fig. 2, the Siamese long short-term memory neural network (Siamese Long Short-Term Memory, Siamese LSTM) model provided by the embodiment of the present invention includes two long short-term memory neural networks (LSTM-A and LSTM-B), and the training process is as follows:
respectively inputting two texts to be matched into two LSTM networks;
and taking the normalized difference of the final hidden states of LSTM-A and LSTM-B as the prediction label, which is trained to match the label provided by the data set; the prediction label is calculated as:
exp(-||h_A - h_B||_1)
where h_A and h_B are the final hidden states of LSTM-A and LSTM-B respectively.
After training is finished, the final hidden states of LSTM-A and LSTM-B are concatenated at inference time and input into the final gradient boosting decision tree model.
On the basis of the above embodiment, the decomposable attention model includes:
the second input module, used for inputting the word vectors of the two texts to be matched into a decomposable attention matrix to obtain the aligned word vectors for the positions of the two sets of word vectors;
the second training unit, used for inputting the comparison result of each aligned word vector and the original word vector at the corresponding position into a feedforward neural network for training;
and the second output unit, used for concatenating the pooled position-comparison vectors of the two trained texts and outputting them to the decision unit.
On the basis of the above embodiment, S12 specifically includes the training step of the decomposable attention model:
inputting the word vectors of the two texts to be matched into a decomposable attention matrix to obtain the aligned word vectors for the positions of the two sets of word vectors;
inputting the comparison result of each aligned word vector and the original word vector at the corresponding position into a feedforward neural network for training;
and concatenating the pooled position-comparison vectors of the two trained texts, and outputting them to the decision unit.
Referring to fig. 3, the training process of the decomposable attention model provided by the embodiment of the present invention is as follows:
Each word vector is weighted by a neural network; these weights constitute the decomposable attention (Decomposable Attention). Denote the word vector at position i of text A as a_i, the word vector at position j of text B as b_j, and the neural network as a function F(·). Each element of the attention matrix is then:
e_ij = F(a_i)^T F(b_j)
The attention matrix is aggregated with the original word vectors to compute the aligned word vector for each text word-vector position (β_i aligned to position i of text A, α_j aligned to position j of text B):
β_i = Σ_j (exp(e_ij) / Σ_k exp(e_ik)) · b_j
α_j = Σ_i (exp(e_ij) / Σ_k exp(e_kj)) · a_i
Each aligned word vector obtained above is compared with the original word vector at the corresponding position; the comparison concatenates the two vectors and inputs them into a feedforward neural network.
The comparison results over the word-vector positions are then integrated across the text range using global average pooling (Global Average Pooling).
The pooled vectors of the two texts are concatenated and output to a final linear layer to obtain the final inference result.
On the basis of the above embodiment, S12 specifically includes the training step of the enhanced sequential inference model:
inputting the word vectors of the two texts to be matched into a Siamese LSTM neural network to obtain the hidden state of each step for the two texts;
taking the hidden state of each step of the Siamese LSTM as the local context encoding of the corresponding text, and inputting it into a decomposable attention matrix to obtain the aligned local encodings of the two texts;
inputting the aligned local encodings of the two texts into a long short-term memory neural network to obtain the hidden states of the two texts;
and concatenating the pooled hidden-state vectors of the two texts, and outputting the result to the decision unit.
Referring to fig. 4, the training process of the enhanced sequential inference model provided in the embodiment of the present invention is as follows:
inputting the two texts to be matched into two LSTM networks respectively, which may be denoted LSTM-A1 and LSTM-B1;
the hidden state at each step of this first-layer LSTM is used as the local context encoding of the word at that position, denoted a_i for text A and b_j for text B;
the elements e_ij of the attention matrix are computed from the local encodings, and the aligned local encoding at each position is obtained as:
a'_i = Σ_j (exp(e_ij) / Σ_k exp(e_ik)) · b_j
b'_j = Σ_i (exp(e_ij) / Σ_k exp(e_kj)) · a_i
the aligned local encodings and the original local encodings are combined and transformed to obtain the integrated local encodings;
the integrated local encodings are input in sequence into the next-layer LSTM; the hidden states it outputs are then subjected to global max pooling and global average pooling across the text range to obtain a global text representation;
the max-pooled vectors and the average-pooled vectors corresponding to the two texts are concatenated; the concatenated vector is either output to a neural network acting as the final decision device and trained, or output directly to the final gradient boosting decision tree model.
The embodiment of the invention outputs the embedded vectors of the three models of the training unit to the final gradient boosting decision tree model, which outputs the final result.
An embodiment of the present invention further provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the method shown in fig. 5 is implemented.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
As shown in fig. 6, the electronic device provided by the embodiment of the present invention includes a memory 21, a processor 22, a bus 23, and a computer program stored on the memory 21 and executable on the processor 22. The memory 21 and the processor 22 complete communication with each other through the bus 23.
The processor 22 is used to call the program instructions in the memory 21 to implement the method of fig. 5 when executing the program.
For example, the processor implements the following method when executing the program:
extracting a word vector of an input text;
training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
and integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
According to the electronic equipment provided by the embodiment of the invention, the word vectors are trained respectively by the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model, and the semantic relation of the word vectors is judged by the gradient boosting decision tree, so that the detection accuracy of synonym/near-synonym relations can be improved.
An embodiment of the present invention further provides a non-transitory computer readable storage medium, on which a computer program is stored, and the program, when executed by a processor, implements the steps of fig. 5.
For example, the processor implements the following method when executing the program:
extracting a word vector of an input text;
training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
and integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
The non-transitory computer-readable storage medium provided by the embodiment of the invention can improve the accuracy of synonym/near-synonym relation detection by training the word vectors respectively through a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model and judging the semantic relation of the word vectors through the gradient boosting decision tree.
An embodiment of the present invention discloses a computer program product, the computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions, which when executed by a computer, enable the computer to perform the methods provided by the above-mentioned method embodiments, for example, including:
extracting a word vector of an input text;
training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
and integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A semantic relation inference system based on a composite neural network, characterized by comprising a feature extraction unit, a training unit and a decision unit, wherein the training unit comprises a Siamese long short-term memory (LSTM) neural network model, a decomposable attention model and an enhanced sequential inference model, and wherein:
the feature extraction unit is used for extracting word vectors of input texts and outputting the word vectors to the training unit;
the training unit is used for receiving the word vectors, training the Siamese LSTM neural network model, the decomposable attention model and the enhanced sequential inference model respectively on the word vectors of the two texts to be matched, and outputting the result vectors produced by the models to the decision unit;
and the decision unit is used for receiving the result vectors input by the training unit, integrating the result vectors through a gradient boosting decision tree, and outputting the semantic relation of the two texts to be matched.
2. The system of claim 1, wherein the Siamese LSTM neural network model comprises:
the first input module, used for inputting the word vectors of the two texts to be matched into two long short-term memory neural networks respectively, to obtain the final hidden states of the two texts;
the first training module, used for training with the normalized difference of the final hidden states of the two texts as the prediction label;
and the first output module, used for concatenating the final hidden states of the two trained texts and outputting the result to the decision unit.
3. The system of claim 1, wherein the decomposable attention model comprises:
the second input module, used for inputting the word vectors of the two texts to be matched into a decomposable attention matrix to obtain the aligned word vectors for the positions of the two sets of word vectors;
the second training unit, used for inputting the comparison result of each aligned word vector and the original word vector at the corresponding position into a feedforward neural network for training;
and the second output unit, used for concatenating the pooled position-comparison vectors of the two trained texts and outputting them to the decision unit.
4. The system of claim 1, wherein the enhanced sequential inference model comprises:
the third input module, used for inputting the word vectors of the two texts to be matched into a Siamese LSTM neural network to obtain the hidden state of each step for the two texts;
the fourth input module, used for inputting the hidden state of each step of the Siamese LSTM, as the local context encoding of the corresponding text, into a decomposable attention matrix to obtain the aligned local encodings of the two texts;
the fifth input module, used for inputting the aligned local encodings of the two texts into a long short-term memory neural network to obtain the hidden states of the two texts;
and the third output unit, used for concatenating the pooled hidden-state vectors of the two texts and outputting the result to the decision unit.
5. A semantic relationship inference method based on a composite neural network, the method comprising:
extracting a word vector of an input text;
training a Siamese LSTM neural network model, a decomposable attention model and an enhanced sequential inference model respectively on the word vectors;
and integrating the result vectors output by the models and outputting the semantic relation of the two texts to be matched.
6. The method of claim 5, further comprising:
training the Siamese LSTM neural network model:
inputting the word vectors of the two texts to be matched into two long short-term memory neural networks respectively, to obtain the final hidden states of the two texts;
training with the normalized difference of the final hidden states of the two texts as the prediction label;
and concatenating the final hidden states of the two trained texts, and outputting the result to the decision unit.
7. The method of claim 5, further comprising:
training the decomposable attention model:
inputting the word vectors of the two texts to be matched into a decomposable attention matrix to obtain the aligned word vectors for the positions of the two sets of word vectors;
inputting the comparison result of each aligned word vector and the original word vector at the corresponding position into a feedforward neural network for training;
and concatenating the pooled position-comparison vectors of the two trained texts, and outputting them to the decision unit.
8. The method of claim 5, further comprising:
training the enhanced sequential inference model:
inputting the word vectors of the two texts to be matched into a Siamese LSTM neural network to obtain the hidden state of each step for the two texts;
taking the hidden state of each step of the Siamese LSTM as the local context encoding of the corresponding text, and inputting it into a decomposable attention matrix to obtain the aligned local encodings of the two texts;
inputting the aligned local encodings of the two texts into a long short-term memory neural network to obtain the hidden states of the two texts;
and concatenating the pooled hidden-state vectors of the two texts, and outputting the result to the decision unit.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the composite neural network-based semantic relationship inference method according to any one of claims 5 to 8.
10. A non-transitory computer readable storage medium, on which a computer program is stored, wherein the computer program, when being executed by a processor, implements the steps of the composite neural network-based semantic relationship inference method according to any one of claims 5 to 8.
CN201811446102.0A 2018-11-29 2018-11-29 Semantic relation inference system and method based on composite neural network Active CN111241843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811446102.0A CN111241843B (en) 2018-11-29 2018-11-29 Semantic relation inference system and method based on composite neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811446102.0A CN111241843B (en) 2018-11-29 2018-11-29 Semantic relation inference system and method based on composite neural network

Publications (2)

Publication Number Publication Date
CN111241843A true CN111241843A (en) 2020-06-05
CN111241843B CN111241843B (en) 2023-09-22

Family

ID=70872518

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811446102.0A Active CN111241843B (en) 2018-11-29 2018-11-29 Semantic relation inference system and method based on composite neural network

Country Status (1)

Country Link
CN (1) CN111241843B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080221878A1 (en) * 2007-03-08 2008-09-11 Nec Laboratories America, Inc. Fast semantic extraction using a neural network architecture
CN107578106A (en) * 2017-09-18 2018-01-12 中国科学技术大学 A kind of neutral net natural language inference method for merging semanteme of word knowledge
CN107871144A (en) * 2017-11-24 2018-04-03 税友软件集团股份有限公司 Invoice trade name sorting technique, system, equipment and computer-readable recording medium
CN108304911A (en) * 2018-01-09 2018-07-20 中国科学院自动化研究所 Knowledge Extraction Method and system based on Memory Neural Networks and equipment

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112529390A (en) * 2020-12-02 2021-03-19 平安医疗健康管理股份有限公司 Task allocation method and device, computer equipment and storage medium
CN113643241A (en) * 2021-07-15 2021-11-12 北京迈格威科技有限公司 Interaction relation detection method, interaction relation detection model training method and device
CN113468288A (en) * 2021-07-23 2021-10-01 平安国际智慧城市科技股份有限公司 Content extraction method of text courseware based on artificial intelligence and related equipment
CN113468288B (en) * 2021-07-23 2024-04-16 平安国际智慧城市科技股份有限公司 Text courseware content extraction method based on artificial intelligence and related equipment

Also Published As

Publication number Publication date
CN111241843B (en) 2023-09-22


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant