CN114492451A - Text matching method and device, electronic equipment and computer readable storage medium - Google Patents

Text matching method and device, electronic equipment and computer readable storage medium

Info

Publication number
CN114492451A
Authority
CN
China
Prior art keywords
text
vector
input
layer
outputting
Prior art date
Legal status
Granted
Application number
CN202111580884.9A
Other languages
Chinese (zh)
Other versions
CN114492451B (en)
Inventor
吕乐宾
蒋宁
王洪斌
吴海英
权佳成
Current Assignee
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd
Priority to CN202111580884.9A
Publication of CN114492451A
Application granted
Publication of CN114492451B
Legal status: Active
Anticipated expiration

Classifications

    • G06F40/30: Handling natural language data; Semantic analysis
    • G06F16/334: Information retrieval of unstructured textual data; Querying; Query processing; Query execution
    • G06N3/044: Neural networks; Architecture; Recurrent networks, e.g. Hopfield networks
    • G06N3/045: Neural networks; Architecture; Combinations of networks
    • G06N3/08: Neural networks; Learning methods
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a text matching method, a text matching device, electronic equipment and a computer-readable storage medium. The method comprises the following steps: inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result. Through the method, the accuracy of text matching can be improved.

Description

Text matching method and device, electronic equipment and computer readable storage medium
Technical Field
The present application relates to the field of text processing technologies, and in particular, to a text matching method and apparatus, an electronic device, and a computer-readable storage medium.
Background
The text matching task is an important research direction in Natural Language Processing (NLP), and plays an important role in tasks such as Information Retrieval (IR), Question Answering (QA), and Paraphrase Recognition (PR). Traditional text matching methods rely on predefined templates and manually extracted rules.
With the development of deep learning, deep neural networks have been widely applied to natural language processing tasks to reduce the cost and time consumed by manually extracting features. Given two pieces of text Q and D, the text matching task extracts the semantic information and similarity features present in the texts to produce a similarity value for the two texts; from the final similarity value it can be determined whether the contents of the two texts are similar descriptions.
At present, text matching suffers from insufficient accuracy.
Disclosure of Invention
In order to solve the above problems, the present application provides a text matching method, apparatus, electronic device, and computer-readable storage medium, which can improve accuracy of text matching.
In order to solve the technical problem, the application adopts a technical scheme that: there is provided a text matching method, the method comprising: inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
In order to solve the above technical problem, another technical solution adopted by the present application is: there is provided a text matching apparatus including: the text matching unit is used for inputting a first text and a second text to be matched into the text matching model for text matching processing and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
In order to solve the above technical problem, another technical solution adopted by the present application is: there is provided an electronic device comprising a processor and a memory coupled to the processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to implement the method according to the above technical solution.
In order to solve the above technical problem, another technical solution adopted by the present application is: there is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the method as provided in the above solution.
In the application, a first text vector and a second text vector are output by performing cross attention learning on a first text and a second text; respectively performing representation learning on the input first text vector and the input second text vector, and outputting a third text vector and a fourth text vector; and splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector, and outputting a matching result to perform text matching.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed in the description of the embodiments are briefly introduced below. It is apparent that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained by those skilled in the art from these drawings without creative effort. In the drawings:
FIG. 1 is a schematic flow chart diagram of a first embodiment of a text matching method provided by the present application;
FIG. 2 is a flowchart illustrating a second embodiment of a text matching method provided by the present application;
FIG. 3 is a schematic structural diagram of an embodiment of a first interaction layer provided herein;
FIG. 4 is a schematic diagram of a structure of an embodiment of a distribution layer provided herein;
FIG. 5 is a schematic structural diagram of an embodiment of a second interaction layer provided herein;
FIG. 6 is a schematic structural diagram of an embodiment of a text matching model provided herein;
FIG. 7 is a schematic block diagram of an embodiment of a granular network provided herein;
FIG. 8 is a schematic diagram of the present application in comparison with the related art;
FIG. 9 is another schematic diagram comparing the technical solution of the present application with the related art;
FIG. 10 is another schematic diagram comparing the present embodiment with the related art;
FIG. 11 is another schematic diagram comparing the present embodiment with the related art;
FIG. 12 is a flowchart illustrating a third embodiment of a text matching method provided by the present application;
FIG. 13 is a schematic structural diagram of an embodiment of a text matching apparatus provided in the present application;
FIG. 14 is a schematic structural diagram of an embodiment of an electronic device provided in the present application;
FIG. 15 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application. It is to be understood that the specific embodiments described herein are merely illustrative of the application and are not limiting of the application. It should be further noted that, for the convenience of description, only some of the structures related to the present application are shown in the drawings, not all of the structures. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
In order to better understand the scheme of the embodiments of the present application, the following first introduces the related terms and concepts that may be involved in the embodiments of the present application.
Text Matching: a deep learning task that can be simply understood as calculating the similarity between two sentences; it is mainly applied to information retrieval, intelligent question answering and the like.
Convolutional Neural Network (CNN): a class of feedforward neural networks that contain convolution computations and have a deep structure, one of the representative algorithms of deep learning.
RNN (Recurrent Neural Network): an artificial neural network with a tree-like hierarchical structure in which network nodes process input information recursively in the order of their connections.
Attention (attention mechanism): a mechanism that imitates the way humans focus on important information and ignore unimportant information. It can assign different weight values to the information encoded at different time steps of an input text sequence, representing different degrees of attention paid by the model.
Representation attention (represented-attention): a form of attention that calculates the importance of the representation vector of each word in a text sentence.
Cross-attention: calculates the importance of each word in sentence A to the whole of sentence B; conversely, the importance of each word in sentence B to the whole of sentence A can also be calculated.
LSTM (Long Short-Term Memory): a long short-term memory network is a kind of neural network for processing sequence data. Compared with a general neural network, it can better handle data that varies over a sequence.
Bi-LSTM (Bi-directional Long Short-Term Memory, bidirectional Long-Term Memory network): a neural network that processes sequence data simultaneously from two directions, improved over LSTM.
Glove: a word embedding method in natural language processing.
Referring to fig. 1, fig. 1 is a schematic flow chart of a first embodiment of a text matching method provided in the present application. The method comprises the following steps:
step 11: and acquiring a first text and a second text to be matched.
In some embodiments, the first text and the second text may be in the form of a question-and-answer pair, where the first text may be a question and the second text may be an answer. Alternatively, the first text may be an answer and the second text may be a question.
In some embodiments, the first text and the second text may be in the form of an information search, wherein the first text is a search text and the second text is a text to be matched.
In some embodiments, the first text and the second text may be in the form of paraphrase recognition, where the first text is a first expression text and the second text is a second expression text.
Step 12: inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
In some embodiments, the text matching model includes a first interaction layer, a distribution layer, and a second interaction layer.
The first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector.
In some embodiments, in the first interaction layer, single-character vectors, word vectors and phrase vectors may be extracted from the first text to form a first phrase vector set, and single-character vectors, word vectors and phrase vectors may be extracted from the second text to form a second phrase vector set. The first phrase vector set and the second phrase vector set are then compared for similarity, i.e., cross attention learning is performed, to obtain and output the first text vector and the second text vector.
For example, the synonyms between the first and second phrase vector sets, the position of those synonyms in the original text vectors, and the grammatical roles of the synonyms in the text vectors, such as subject, predicate, object, attributive, adverbial or complement, can be associated with each other to determine the cross attention between the first text and the second text, thereby obtaining the first text vector and the second text vector.
The distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector.
Representation learning is performed on the input first text vector and the input second text vector so that each of them carries corresponding representation attention, from which the corresponding third text vector and fourth text vector are obtained. Here, the representation attention represents the contribution degree of a word vector in a text vector to the text vector as a whole: if the word is a subject, a predicate or an object, its contribution degree is high; if it is, for example, an auxiliary word, its contribution degree is low.
Because the representation attention represents the contribution degree of the words in the text to the text, the third text vector is obtained from the first text vector through representation learning, and each word in the third text vector carries its own contribution degree to the text.
Similarly, the second text vector is a fourth text vector obtained through representation learning, and each word in the fourth text vector has its own contribution degree to the text.
And the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating the text similarity of the fifth text vector and outputting a matching result.
Since the fifth text vector is converted from the first text and the second text, the text similarity calculated from the fifth text vector may represent a matching result between the first text and the second text.
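To make the overall data flow concrete, the following is a minimal structural sketch of the three-layer model, assuming a PyTorch implementation; the module and variable names are illustrative only and are not taken from this application.

```python
import torch.nn as nn

class TextMatchingModel(nn.Module):
    # Composes the three layers described above; the sub-modules passed in are
    # placeholders for the first interaction layer, the distribution layer and
    # the second interaction layer.
    def __init__(self, first_interaction, distribution, second_interaction):
        super().__init__()
        self.first_interaction = first_interaction
        self.distribution = distribution
        self.second_interaction = second_interaction

    def forward(self, first_text, second_text):
        # First interaction layer: cross attention learning on the two input texts
        first_vec, second_vec = self.first_interaction(first_text, second_text)
        # Distribution layer: representation learning on each text vector
        third_vec, fourth_vec = self.distribution(first_vec, second_vec)
        # Second interaction layer: splice, compute text similarity, output matching result
        return self.second_interaction(third_vec, fourth_vec)
```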
In the embodiment, a first text vector and a second text vector are output by performing cross attention learning on a first text and a second text; respectively performing representation learning on the input first text vector and the input second text vector, and outputting a third text vector and a fourth text vector; and splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector, and outputting a matching result to perform text matching.
Referring to fig. 2, fig. 2 is a schematic flowchart of a second embodiment of the text matching method provided in the present application. The method comprises the following steps:
step 21: and acquiring a first text and a second text to be matched.
In some embodiments, the first text and the second text may be in the form of a question-and-answer pair, where the first text may be a question and the second text may be an answer. Alternatively, the first text may be an answer and the second text may be a question.
In some embodiments, the first text and the second text may be in the form of an information search, wherein the first text is a search text and the second text is a text to be matched.
In some embodiments, the first text and the second text may be in the form of paraphrase recognition, where the first text is a first expression text and the second text is a second expression text.
Step 22: and inputting the first text and the second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text.
In some embodiments, the text matching model includes a first interaction layer, a distribution layer, and a second interaction layer.
The first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector.
Referring to fig. 3, the first interaction layer includes a first embedding layer, a second embedding layer, a similar matrix layer, and a processing layer.
The first embedding layer is used for carrying out word embedding processing on an input first text and outputting a first processed text.
In some embodiments, the first text may be subjected to word embedding processing, i.e., vector conversion, to obtain a corresponding first processed text. For example, the continuous bag-of-words model is used, and the first text is input into the continuous bag-of-words model, so that the continuous bag-of-words model outputs the corresponding vector. Vector conversion can also be implemented using Skip-Gram.
The second embedding layer is used for carrying out word embedding processing on the input second text and outputting a second processed text.
In some embodiments, the second text may be subjected to word embedding processing, i.e., vector conversion, to obtain a corresponding second processed text. For example, the continuous bag-of-words model is used, and the second text is input into the continuous bag-of-words model, so that the continuous bag-of-words model outputs the corresponding vector. Vector conversion can also be implemented using Skip-Gram.
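As an illustration of such word embedding, the following minimal sketch assumes the gensim library is available; the toy sentences, model parameters and variable names are illustrative, not part of this application.

```python
from gensim.models import Word2Vec

# Tiny illustrative corpus; in practice the first text and second text would be tokenized here.
sentences = [["first", "text", "to", "be", "matched"], ["second", "text", "to", "be", "matched"]]

cbow_model = Word2Vec(sentences, vector_size=100, sg=0, min_count=1)      # sg=0: continuous bag-of-words
skipgram_model = Word2Vec(sentences, vector_size=100, sg=1, min_count=1)  # sg=1: Skip-Gram

vector = cbow_model.wv["text"]  # embedding vector for one token of the processed text
```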
The similarity matrix layer is used for carrying out similarity processing on the input first processed text and the input second processed text and outputting a first weight vector and a second weight vector.
The similarity matrix layer is specifically used for determining a similarity matrix of the first processed text and the second processed text; and performing row normalization processing on the similarity matrix to obtain a first weight vector, and performing column normalization processing on the similarity matrix to obtain a second weight vector.
Because the similarity matrix is a two-dimensional matrix and is distributed in rows and columns, normalization processing is carried out on each row of the similarity matrix to obtain a weight vector corresponding to each row, and then the weight vectors corresponding to each row are summed to obtain a first weight vector. And normalizing each column of the similarity matrix to obtain a weight vector corresponding to each column, and then summing the weight vectors corresponding to each column to obtain a second weight vector.
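A rough sketch of this normalization and summation step is shown below, assuming PyTorch and using softmax as the normalization; the function and variable names are illustrative.

```python
import torch
import torch.nn.functional as F

def attention_weights(similarity: torch.Tensor):
    # similarity: (X, Y) similarity matrix between the X words of the first
    # processed text and the Y words of the second processed text.
    # Row normalization, then summing over rows -> first weight vector (length Y).
    first_weight = F.softmax(similarity, dim=1).sum(dim=0)
    # Column normalization, then summing over columns -> second weight vector (length X).
    second_weight = F.softmax(similarity, dim=0).sum(dim=1)
    return first_weight, second_weight
```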
The processing layer is used for fusing the input second weight vector with the first processing text and outputting a first text vector, and fusing the first weight vector with the second processing text and outputting a second text vector.
The distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector.
Referring to fig. 4, the distribution layer includes a first granular network, a second granular network, a first memory network, a second memory network, a first attention layer, and a second attention layer.
The first granularity network is used for performing multi-granularity extraction on an input first text vector to obtain a plurality of different first granularity information, and splicing the first granularity information to obtain a first spliced vector.
The first granularity network is specifically used for performing feature extraction on an input first text vector by using a plurality of groups of convolution windows with different sizes to obtain a plurality of different first granularity information, and splicing the first granularity information to obtain a first spliced vector.
Because the scales corresponding to the granularity information are different, more characteristic information can be acquired.
The first memory network is used for extracting features of the input first splicing vector and outputting the extracted first feature vector.
The first attention layer is used for performing representation learning on the input first feature vector and outputting a third text vector.
The second granularity network is used for performing multi-granularity extraction on the input second text vector to obtain a plurality of different second granularity information, and splicing the second granularity information to obtain a second spliced vector.
The second granularity network is specifically used for extracting features of input second text vectors by using multiple groups of convolution windows with different sizes to obtain multiple different second granularity information, and splicing the second granularity information to obtain a second spliced vector.
The second memory network is used for extracting the features of the input second splicing vector and outputting the extracted second feature vector.
And the second attention layer is used for performing representation learning on the input second feature vector and outputting a fourth text vector.
And the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating the text similarity of the fifth text vector and outputting a matching result.
Referring to fig. 5, the second interactive layer includes a splice layer and a full link layer.
And the splicing layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector.
And the full connection layer is used for performing text similarity calculation on the input fifth text vector and outputting a matching result.
In the embodiment, a first text vector and a second text vector are output by performing cross attention learning on a first text and a second text; respectively performing representation learning on the input first text vector and the input second text vector, and outputting a third text vector and a fourth text vector; and splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector, and outputting a matching result to perform text matching.
In an application scenario, described with reference to fig. 6 and fig. 7, two texts to be matched, i.e. a first text and a second text, are first obtained. Word embedding is then performed on the first text in the first embedding layer and on the second text in the second embedding layer.
For example, an embedding lookup function is provided in the first embedding layer and the second embedding layer; the function builds a lookup matrix from pre-trained Glove word vectors and maps each word of the first text and the second text into a high-dimensional vector space to obtain the corresponding word-embedded text.
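A minimal sketch of such an embedding lookup is given below, assuming PyTorch and that a pre-trained Glove matrix has already been loaded; the file handling and variable names are illustrative.

```python
import torch
import torch.nn as nn

def embed_text(tokens, word_to_id, glove_matrix):
    # glove_matrix: (vocab_size, embed_dim) tensor of pre-trained Glove vectors;
    # word_to_id maps each token to its row index in the lookup matrix.
    embedding = nn.Embedding.from_pretrained(glove_matrix, freeze=True)
    ids = torch.tensor([word_to_id[t] for t in tokens])
    return embedding(ids)  # (sequence_length, embed_dim) word-embedded text
```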
And then calculating the similarity between each word of the two word embedded texts in the similarity matrix layer to obtain a similarity matrix, then carrying out normalization processing on the similarity matrix according to columns, and summing all the columns to obtain a second weight vector.
And respectively carrying out normalization processing on the similar matrixes according to rows and summing all the rows to obtain a first weight vector.
And then multiplying the second weight vector by the word embedded text output by the first embedded layer in the first processing layer to realize the weighting processing of the word embedded text corresponding to the first text to obtain the first text vector.
And multiplying the first weight vector by the word embedded text output by the second embedded layer at the second processing layer to realize the weighting processing of the word embedded text corresponding to the second text to obtain a second text vector.
Specifically, the similarity between each pair of words in the first text and the second text is calculated by using word vectors after word embedding is performed on the first text and the second text to obtain a similarity matrix, the similarity matrix is normalized and summed according to rows and columns to obtain attention weights of the first text and the second text respectively, and the attention weights are used for weighting the original word embedded text to obtain new texts, such as the first text vector and the second text vector. The weighted first text vector and the weighted second text vector are easier to grasp the key parts of a piece of text in the processes of information extraction and text representation.
In the text matching process, the contribution degree of words in the text to the matching task is different, and different weight information needs to be given to different words in order to better play the role of important words in the text representation process, so the application introduces an attention mechanism, and adds attention weights from the other party to the first text and the second text respectively. This cross-attention consists of the attention of the first text to the second text (Q2D) and the attention of the second text to the first text (D2Q), and each weight value in the vector indirectly represents the overall importance of each word in the text to all words in another piece of text.
The description is made in conjunction with the following:
Suppose a first text Q_x = {Q_1, …, Q_X} with text length X and a second text D_y = {D_1, …, D_Y} with text length Y are given, and let M_xy denote the similarity matrix after the interaction of the first text and the second text. The attention can then be calculated as follows:
M_xy = Linear(Q_x · D_y + bias);
A_Q2D = sum_col(σ_col(M_xy) · D_y);
A_D2Q = sum_row(σ_row(M_xy) · Q_x);
where bias denotes the bias added after the linear function, · denotes the dot-product operation between tensors, σ denotes the softmax activation function, sum(·) denotes the sum of a tensor computed along the specified axis, A_D2Q denotes the obtained attention vector of the first text, i.e. the first weight vector mentioned above, and A_Q2D denotes the obtained attention vector of the second text, i.e. the second weight vector mentioned above. The two attention vectors are then combined with the word-embedded text in the corresponding first processing layer or second processing layer to obtain a new weighted text representation, calculated as follows:
Q_ATT = Q · A_Q2D;
D_ATT = D · A_D2Q;
Q_ATT and D_ATT respectively denote the weighted word-embedded text, where Q_ATT can represent the first text vector of the above embodiment and D_ATT the second text vector of the above embodiment.
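One possible reading of these formulas is sketched below, assuming PyTorch; treating Linear(·) as an element-wise scale and bias on the similarity scores, and the choice of reduction axes, are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossAttentionLayer(nn.Module):
    def __init__(self):
        super().__init__()
        # stands in for Linear(...) + bias applied to the similarity scores
        self.scale = nn.Parameter(torch.ones(1))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, q, d):
        # q: (X, dim) word-embedded first text; d: (Y, dim) word-embedded second text
        m = self.scale * (q @ d.t()) + self.bias          # similarity matrix M_xy, shape (X, Y)
        a_q2d = (F.softmax(m, dim=0) @ d).sum(dim=1)      # (X,): one weight per word of the first text
        a_d2q = (F.softmax(m, dim=1).t() @ q).sum(dim=1)  # (Y,): one weight per word of the second text
        q_att = q * a_q2d.unsqueeze(-1)                   # weighted first text vector, Q_ATT
        d_att = d * a_d2q.unsqueeze(-1)                   # weighted second text vector, D_ATT
        return q_att, d_att
```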
Then, multi-granularity information extraction is performed on the first text vector by using a first granularity network, and multi-granularity information extraction is performed on the second text vector by using a second granularity network. And combining the obtained multiple pieces of granularity information.
Specifically, the following description is made with reference to fig. 7:
in fig. 7, the weighted word-embedded text is convolved in groups with three different sets of convolution windows, each set of convolution windows extracting a feature representation of a different granularity. For example, convolution window n1 and convolution window n2 are in one set, convolution window m1 and convolution window m2 are in one set, and convolution window p1 and convolution window p2 are in one set.
The triplet information of the convolution window n1 is (100, 1, 8), the triplet information of the convolution window n2 is (8, 1, 96), the triplet information of the convolution window m1 is (100, 1, 8), the triplet information of the convolution window m2 is (8, 2, 96), the triplet information of the convolution window p1 is (100, 1, 8), and the triplet information of the convolution window p2 is (8, 3, 96).
Wherein, the triplet information respectively represents: inputting the characteristic dimension, the convolution kernel size and outputting the characteristic dimension.
Specifically, the calculation formulas are given in the original filing as equation images (not reproduced here). In those formulas, i, j ∈ {1, 2, 3}; the first-layer symbols denote the tensor obtained from the original text vector after information extraction and dimensionality reduction in the first layer of the granularity network, namely the outputs of convolution window n1, convolution window p1 and convolution window m1; the second-layer symbols denote the results of granularity information extraction and dimension enlargement in the second layer, namely the outputs of convolution window n2, convolution window p2 and convolution window m2; W_multi denotes a granularity sliding window, which gradually extracts granularity information from the text sequence as the window slides; and σ denotes the ReLU activation function. The last step of the granularity network adopts the residual connection operation of the ResNeXt network: the information after feature extraction is connected with the original information in a connection layer, where concat(·) denotes the tensor splicing operation, axis denotes the axis parameter, and the resulting symbols denote the text information of each granularity in the first text vector and the second text vector after residual connection, namely the output of the corresponding connection layer. Finally, the granularity network splices all the granularity information by rows to obtain the representation tensors Q_all and D_all of the first splicing vector and the second splicing vector.
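The granularity network can be sketched as follows, assuming PyTorch; the channel sizes follow the triplets listed above (100 to 8 to 96), while the padding, cropping and residual-style concatenation details are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class GranularityNetwork(nn.Module):
    def __init__(self, in_dim=100, mid_dim=8, out_dim=96):
        super().__init__()
        # Three groups of convolution windows with kernel sizes 1, 2 and 3,
        # corresponding to the window pairs (n1, n2), (m1, m2) and (p1, p2).
        self.groups = nn.ModuleList([
            nn.Sequential(
                nn.Conv1d(in_dim, mid_dim, kernel_size=1),
                nn.ReLU(),
                nn.Conv1d(mid_dim, out_dim, kernel_size=k, padding=k // 2),
                nn.ReLU(),
            )
            for k in (1, 2, 3)
        ])

    def forward(self, x):
        # x: (batch, seq_len, in_dim) weighted word-embedded text
        x_t = x.transpose(1, 2)                     # (batch, in_dim, seq_len)
        branches = []
        for group in self.groups:
            feat = group(x_t)[..., : x_t.size(-1)]  # crop so all lengths match
            # residual-style connection: keep the original information alongside the features
            branches.append(torch.cat([feat, x_t], dim=1))
        # splice all granularity information along the feature axis -> Q_all / D_all
        return torch.cat(branches, dim=1).transpose(1, 2)
```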
and then, inputting the multi-granularity information obtained by combining the first text vector and the second text vector into a first memory network and a second memory network respectively, performing full-text semantic learning, calculating the contribution degree of the granularity information of each word to full-text semantics to obtain a weight vector, namely representing attention, and combining the representing attention with the multi-granularity information to realize attention to different granularity information.
That is, a first stitching vector is input to a first memory network, and a second stitching vector is input to a second memory network.
The first Memory network and the second Memory network may be Bi-directional Long Short-Term Memory (Bi-LSTM).
Specifically, a Bi-LSTM network is used to perform representation learning and dimension compression on the multi-granularity text features of the spliced first splicing vector and the spliced second splicing vector, respectively. The Bi-LSTM network can express sequence features at a higher and more abstract level, so that the global information of the sequence can be better captured, rather than being limited to extracting similarity features between words or phrases. Therefore, by performing representation learning on the feature-extracted information, the Bi-LSTM network can acquire global information within each granularity and global information between granularities. Specifically, the output of the Bi-LSTM network can be expressed using the following equations:
Q_rep = Bi-LSTM(Q_all);
D_rep = Bi-LSTM(D_all).
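A minimal sketch of this step, assuming PyTorch; the hidden size and the batch-first layout are illustrative choices.

```python
import torch.nn as nn

class MemoryNetwork(nn.Module):
    # Bi-LSTM representation learning over the spliced multi-granularity features.
    def __init__(self, feature_dim, hidden_size=128):
        super().__init__()
        self.bi_lstm = nn.LSTM(feature_dim, hidden_size, bidirectional=True, batch_first=True)

    def forward(self, spliced):
        # spliced: (batch, seq_len, feature_dim), i.e. Q_all or D_all
        outputs, _ = self.bi_lstm(spliced)
        return outputs  # (batch, seq_len, 2 * hidden_size), i.e. Q_rep or D_rep
```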
where Q_rep represents the output of the first memory network, corresponding to the first feature vector, and D_rep represents the output of the second memory network, corresponding to the second feature vector. Because the phrase information extracted by the first granularity network and the second granularity network is of different importance to the matching task, and phrase information of low importance may become noise for semantic understanding, a corresponding attention layer is added after each Bi-LSTM network, namely the first attention layer and the second attention layer. The attention layer uses a representation attention mechanism: it calculates, with a fully connected layer, the importance degree of each time step of the representation information to the global information, uses the importance degree as the weight value of the representation information, and applies the weight to the representation information. In this way, phrase information of high importance can play a larger role when the two sentences interact, while the role of phrase information of low importance is suppressed. The vector weights used in the first and second attention layers are defined as follows:
(The weight formulas are given in the original filing as equation images and are not reproduced here.) In those formulas, i ∈ {1, 2, …, X} and j ∈ {1, 2, …, Y}, where X and Y respectively denote the lengths of the first feature vector and the second feature vector; the corresponding symbols denote the vector representations of the i-th and j-th time steps of the two pieces of input text; W_i and W_j denote learnable parameters; σ(·) denotes the sigmoid activation function; and the results denote the weight values of the i-th and j-th time steps of the first feature vector and the second feature vector, i.e. the vector weights.
Then, these weight values are used to weight the input first feature vector and second feature vector to obtain the final outputs Q_out and D_out. The outputs of the first and second attention layers are expressed using the following formulas:
Q_out = Q_rep_att · Q_rep;
D_out = D_rep_att · D_rep.
where Q_out denotes the output of the first attention layer, i.e. the third text vector described above, and D_out denotes the output of the second attention layer, i.e. the fourth text vector described above.
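The representation attention step can be sketched as follows, assuming PyTorch; using a single fully connected layer with a sigmoid to produce one weight per time step is the reading adopted here, and the names are illustrative.

```python
import torch
import torch.nn as nn

class RepresentationAttention(nn.Module):
    def __init__(self, rep_dim):
        super().__init__()
        self.fc = nn.Linear(rep_dim, 1)  # importance of each time step to the global information

    def forward(self, rep):
        # rep: (batch, seq_len, rep_dim), i.e. Q_rep or D_rep
        weights = torch.sigmoid(self.fc(rep))  # (batch, seq_len, 1) per-time-step weight values
        return weights * rep                   # weighted representation, i.e. Q_out or D_out
```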
The two processed text vectors are then combined in the splicing layer to obtain a fifth text vector, text similarity calculation is performed on the input fifth text vector through the fully connected layer, and the matching result is output.
Specifically, the represented third text vector and the fourth text vector are tiled and connected in the first dimension, and the tiled third text vector and the fourth text vector are input into a full connection layer of the neural network to calculate the matching score of the two text vectors. Specifically, the following formula is adopted:
Z_rep = concat(Q_out, D_out, axis = -1);
Score = σ(W · Z_rep + b).
where Q_out and D_out are the weighted representation sequences of the first text and the second text respectively, i.e. the third text vector and the fourth text vector, concat(·) denotes the splicing function, Z_rep denotes the spliced tensor, W denotes a learnable parameter, σ(·) denotes the linear activation function, and Score is the finally output matching score of the two pieces of text.
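A sketch of this scoring step, assuming PyTorch; pooling each sequence to a fixed-size vector before splicing is an assumption made here so that texts of different lengths can share one fully connected layer.

```python
import torch
import torch.nn as nn

class MatchingHead(nn.Module):
    def __init__(self, rep_dim):
        super().__init__()
        self.fc = nn.Linear(2 * rep_dim, 1)

    def forward(self, q_out, d_out):
        # q_out: (batch, X, rep_dim) third text vector; d_out: (batch, Y, rep_dim) fourth text vector
        z_rep = torch.cat([q_out.mean(dim=1), d_out.mean(dim=1)], dim=-1)  # spliced tensor Z_rep
        return self.fc(z_rep)  # matching score of the two pieces of text
```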
The matching score at this time may indicate a degree of matching of the first text and the second text.
In other embodiments, the text matching model may be trained in the above manner, and after the matching score is obtained, the weight of the entire text matching model is updated by a loss function according to the deviation between the matching score and the actual value.
For example, during training, the loss function is calculated as follows:
Loss=max(0,margin+y′-y)。
For two given input sequences, the difference between the final correct prediction score y and the incorrect prediction score y' can be used to represent the similarity relationship between the two prediction results, and margin is a user-specified coefficient. The higher y and the lower y', i.e. the larger y - y', the better the text matching model performs; however, the score difference counts at most up to margin, and a larger difference brings no additional reward.
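For illustration, the loss can be written as the following sketch, assuming PyTorch; margin = 1.0 is an arbitrary example value.

```python
import torch

def margin_ranking_loss(correct_score, incorrect_score, margin=1.0):
    # Loss = max(0, margin + y' - y), where y is the score of the correct pair
    # and y' the score of the incorrect pair.
    return torch.clamp(margin + incorrect_score - correct_score, min=0.0)
```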
In an application scenario, the technical solution of the present application is tested, and is explained with reference to fig. 8 to fig. 11:
The experiments of the present application use Microsoft's WikiQA dataset, Stanford's SNLI dataset and the tweet-based SemEval-2016 Task 3 dataset to compare the three text matching tasks of question answering, textual entailment recognition and stance detection.
WikiQA is a publicly available open-domain question answering dataset containing 3047 questions and 29258 answers extracted from the query logs of Bing, of which 1473 sentences are labeled as answers to the corresponding questions. Each question is associated with multiple candidate answers from Wikipedia according to users' clicking behavior, giving 29258 question-answer pairs in total. The manually labeled correct answers to the questions are then taken as candidate answers, and thus 1473 sentences are labeled as correct answers. The training set contains 20K sentence pairs, the test set contains 6.1K sentence pairs, and the validation set contains 2.7K sentence pairs; query sentences contain 6.89 words on average, and document sentences contain 22.73 words on average.
The SNLI dataset is a dataset published by Stanford University for the textual entailment recognition task in natural language processing. The SNLI dataset is manually annotated and comprises 570K text pairs, with a training set of 550K, a validation set of 10K and a test set of 10K, covering three types of text pairs: entailment, contradiction and neutral. Query sentences in the SNLI dataset contain 12.85 words on average, and document sentences contain 7.41 words.
SemEval-2016 Task 3 contains two subtasks, "question-answer similarity" and "question-question similarity"; the experiments compare results on the "question-answer similarity" data. In the "question-answer similarity" task, a particular question is given and the answers are ranked according to their relevance to the question. In the stance detection task, the goal is to determine the stance toward a given (pre-selected) target, which may not be an opinion target mentioned in the original text, so a deeper understanding of and reasoning about the sentence is required of the model. In the experiments, the "external answers" are selected as the candidate answers of this task, with labels divided into three classes: Good, Potentially Useful and Bad. Query sentences of this dataset contain 39.29 words on average, and document sentences contain 36.85 words.
Fig. 8 shows the comparison of the experimental results of the mainstream models and the technical solution of the present application on the WikiQA question-answer dataset, where FMMI denotes the technical solution of the present application. Compared with a textual entailment recognition dataset, the question-answer dataset depends on the model's understanding of the semantics of the text data at multiple levels, so a model that extracts text semantics better shows a better effect on this dataset. The test results of the FMMI of the present application are superior to those of the other models on all three metrics: NDCG@3, NDCG@5 and MAP. It can be seen that, by extracting and weighting information of multiple granularities, FMMI can capture semantic information at a higher level than approaches that directly capture single-granularity similarity information between words.
Fig. 9 illustrates the experimental comparison of the mainstream models with the FMMI of the present application on the SNLI dataset, together with the training process of each model on that dataset. Compared with a question-answer dataset, the textual entailment recognition task depends both on the model's extraction of the overall text semantic information and on its acquisition of local text feature information, so models that are good at feature extraction, such as MatchPyramid, DUET and CONV_KNRM, do not perform well on the WikiQA dataset but show good experimental results on the SNLI dataset. Since FMMI can capture high-level text semantic information and can also focus on local feature information, it shows the best effect on the SNLI dataset.
FIG. 10 shows the experimental comparison of each model with the FMMI model on the semeval2016-task3 dataset, together with the training process of each model on that dataset. This dataset selects an external answer (i.e. the text to be matched is not explicitly mentioned in the original text) as the candidate target, so it is difficult for a model to judge relevance by comparing simple features of two sentences; the task instead tests the model's understanding of and reasoning about the original text, so representation-based models, such as ARC-II and MV_LSTM, may be more competent for it. The FMMI of the present application is still superior to the other matching models on this dataset, which shows that, compared with other models, FMMI can better determine the stance of a statement in the original text toward a given target in the stance detection task.
FIG. 11 shows the performance of FMMI on the WikiQA dataset when one necessary processing unit is removed: FMMI-IA, FMMI-LSTM, FMMI-RA and FMMI-Granet respectively represent the initial model with the interactive attention unit, the Bi-LSTM representation unit, the representation attention unit and the granularity network unit removed. The contribution degree of each unit to the overall model can be seen from the experimental results. The largest contribution comes from the interactive attention mechanism introduced in the front stage of the model: weighting each word of the original text effectively improves the effect of information extraction in the rear stage of the model and the text representation capability. Next is the Bi-LSTM unit used for text representation, which plays an important role in the semantic representation and extraction processes; then the representation attention, which weights the information after granularity extraction so that the model pays attention to useful phrase information; and finally the multi-granularity network, which improves the accuracy of text matching to a certain extent by acquiring more granularity information.
Referring to fig. 12, fig. 12 is a schematic flowchart of a text matching method according to a third embodiment of the present application. The method comprises the following steps:
step 121: and acquiring a first training text and a second training text.
Wherein the first training text and the second training text are marked with the similarity true value.
Step 122: and inputting the first training text and the second training text into a text matching model, and outputting a similarity output value of the first training text and the second training text, wherein the similarity output value is used as a matching result.
Step 123: and determining a loss function based on the deviation of the similarity output value and the similarity true value.
Step 124: and modifying the text matching model by using a loss function.
For example, during training, the loss function is calculated as follows:
Loss=max(0,margin+y′-y)。
For two given input sequences, the difference between the final correct prediction score y and the incorrect prediction score y' can be used to represent the similarity relationship between the two prediction results, and margin is a user-specified coefficient. The higher y and the lower y', i.e. the larger y - y', the better the text matching model performs; however, the score difference counts at most up to margin, and a larger difference brings no additional reward.
Referring to fig. 13, fig. 13 is a schematic structural diagram of an embodiment of a text matching apparatus provided in the present application. The text matching apparatus 130 includes a text matching unit 131.
The text matching unit 131 is configured to input the first text and the second text to be matched into the text matching model for text matching processing, and output a matching result of the first text and the second text.
Wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer;
the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector;
the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector;
and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
It can be understood that the text matching apparatus 130 is further configured to implement the method according to any of the embodiments, and please refer to any of the above technical solutions specifically, which is not described herein again.
Referring to fig. 14, fig. 14 is a schematic structural diagram of an embodiment of an electronic device provided in the present application. The electronic device 140 comprises a processor 141 and a memory 142 coupled to the processor 141, wherein the memory 142 stores a computer program, and the processor 141 is configured to execute the computer program to implement the following method:
inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
It can be understood that the processor 141 is further configured to execute a computer program to implement the method according to any of the above embodiments, which is specifically referred to any of the above technical solutions and is not described herein again.
Referring to fig. 15, fig. 15 is a schematic structural diagram of an embodiment of a computer-readable storage medium provided in the present application. The computer-readable storage medium 150 stores a computer program 151, the computer program 151, when executed by a processor, implementing the method of:
inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text; wherein the text matching model comprises a first interaction layer, a distribution layer and a second interaction layer; the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector; the distribution layer is used for respectively performing representation learning on the input first text vector and the input second text vector and outputting a third text vector and a fourth text vector; and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting a matching result.
It can be understood that, when being executed by a processor, the computer program 151 is further configured to implement the method according to any of the embodiments, which is specifically referred to any of the above technical solutions and is not described herein again.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the above-described device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above description is only intended to illustrate embodiments of the present application and is not intended to limit the scope of the present application. Any equivalent structural or process modification made on the basis of the content of the present specification and the accompanying drawings, or any direct or indirect application in other related technical fields, likewise falls within the scope of the present application.

Claims (10)

1. A method of text matching, the method comprising:
inputting a first text and a second text to be matched into a text matching model for text matching processing, and outputting a matching result of the first text and the second text;
wherein the text matching model comprises a first interaction layer, a distribution layer, and a second interaction layer;
the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector;
the distribution layer is used for performing representation learning on the input first text vector and the input second text vector, respectively, and outputting a third text vector and a fourth text vector;
and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting the matching result.
2. The method of claim 1, wherein the first interaction layer comprises: a first embedding layer, a second embedding layer, a similarity matrix layer, and a processing layer;
the first embedding layer is used for carrying out word embedding processing on the input first text and outputting a first processed text;
the second embedding layer is used for carrying out word embedding processing on the input second text and outputting a second processed text;
the similarity matrix layer is used for carrying out similarity processing on the input first processed text and the input second processed text and outputting a first weight vector and a second weight vector;
and the processing layer is used for fusing the input second weight vector with the first processed text and outputting a first text vector, and fusing the input first weight vector with the second processed text and outputting a second text vector.
3. The method according to claim 2, characterized in that the similarity matrix layer is specifically configured to:
determining a similarity matrix of the first processed text and the second processed text;
and performing row normalization processing on the similarity matrix to obtain the first weight vector, and performing column normalization processing on the similarity matrix to obtain the second weight vector.
4. The method of claim 1, wherein the distribution layer comprises: a first granular network, a second granular network, a first memory network, a second memory network, a first attention layer, and a second attention layer;
the first granular network is used for performing multi-granularity extraction on the input first text vector to obtain a plurality of pieces of different first granularity information, and splicing the pieces of first granularity information to obtain a first spliced vector;
the first memory network is used for extracting features of the input first splicing vector and outputting an extracted first feature vector;
the first attention layer is used for performing representation learning on the input first feature vector and outputting the third text vector;
the second granular network is used for performing multi-granularity extraction on the input second text vector to obtain a plurality of pieces of different second granularity information, and splicing the pieces of second granularity information to obtain a second spliced vector;
the second memory network is used for extracting features of the input second spliced vector and outputting an extracted second feature vector;
and the second attention layer is used for performing representation learning on the input second feature vector and outputting the fourth text vector.
5. The method of claim 4, wherein the first granular network is specifically configured to:
performing feature extraction on the input first text vector by using a plurality of groups of convolution windows of different sizes to obtain a plurality of pieces of different first granularity information, and splicing the pieces of first granularity information to obtain the first spliced vector;
the second granular network is specifically configured to:
and performing feature extraction on the input second text vector by using a plurality of groups of convolution windows of different sizes to obtain a plurality of pieces of different second granularity information, and splicing the pieces of second granularity information to obtain the second spliced vector.
6. The method of claim 1, wherein the second interaction layer comprises: a splicing layer and a fully connected layer;
the splicing layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector;
and the fully connected layer is used for performing text similarity calculation on the input fifth text vector and outputting the matching result.
7. The method of claim 1, further comprising:
acquiring a first training text and a second training text;
inputting the first training text and the second training text into the text matching model, and outputting a similarity output value of the first training text and the second training text, wherein the similarity output value is used as the matching result;
determining a loss function based on the deviation between the similarity output value and the similarity true value;
and correcting the text matching model by using the loss function.
8. A text matching apparatus, characterized in that the text matching apparatus comprises:
the text matching unit is used for inputting a first text and a second text to be matched into a text matching model for text matching processing and outputting a matching result of the first text and the second text;
wherein the text matching model comprises a first interaction layer, a distribution layer, and a second interaction layer;
the first interaction layer is used for performing cross attention learning on the input first text and the input second text and outputting a first text vector and a second text vector;
the distribution layer is used for performing representation learning on the input first text vector and the input second text vector, respectively, and outputting a third text vector and a fourth text vector;
and the second interaction layer is used for splicing the input third text vector and the input fourth text vector to obtain a fifth text vector, calculating text similarity of the fifth text vector and outputting the matching result.
9. An electronic device, characterized in that the electronic device comprises a processor and a memory coupled to the processor, in which memory a computer program is stored, the processor being adapted to execute the computer program to implement the method according to any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, implements the method according to any one of claims 1-7.
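As a hedged illustration of the training procedure recited in claim 7, the sketch below computes a loss from the deviation between the model's similarity output value and the similarity true value and uses it to correct the model. The mean-squared-error loss, the Adam optimizer, and the data-loader interface are assumptions made for this example; claim 7 only requires a deviation-based loss.

```python
import torch
import torch.nn as nn


def train_epoch(model: nn.Module, loader, lr: float = 1e-3) -> float:
    """One illustrative training pass: similarity output vs. similarity true value."""
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.MSELoss()  # loss from the deviation between output and true value
    total = 0.0
    for first_text, second_text, true_sim in loader:
        pred_sim = model(first_text, second_text)     # similarity output value
        loss = criterion(pred_sim, true_sim.float())  # deviation-based loss
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()                              # correct the text matching model
        total += loss.item()
    return total / max(len(loader), 1)
```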
CN202111580884.9A 2021-12-22 2021-12-22 Text matching method, device, electronic equipment and computer readable storage medium Active CN114492451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111580884.9A CN114492451B (en) 2021-12-22 2021-12-22 Text matching method, device, electronic equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111580884.9A CN114492451B (en) 2021-12-22 2021-12-22 Text matching method, device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN114492451A true CN114492451A (en) 2022-05-13
CN114492451B CN114492451B (en) 2023-10-24

Family

ID=81494049

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111580884.9A Active CN114492451B (en) 2021-12-22 2021-12-22 Text matching method, device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN114492451B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115456176A (en) * 2022-10-10 2022-12-09 延边大学 Text matching method and system based on knowledge enhancement
CN116383491A (en) * 2023-03-21 2023-07-04 北京百度网讯科技有限公司 Information recommendation method, apparatus, device, storage medium, and program product

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A text entailment relation recognition method fusing multi-granularity information
CN109815484A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 Semantic similarity matching method based on cross attention mechanism, and apparatus therefor
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Multi-granularity sentence interaction natural language inference model fusing the Attention mechanism
CN112183085A (en) * 2020-09-11 2021-01-05 杭州远传新业科技有限公司 Machine reading understanding method and device, electronic equipment and computer storage medium
WO2021051574A1 (en) * 2019-09-16 2021-03-25 平安科技(深圳)有限公司 English text sequence labelling method and system, and computer device
CN112905827A (en) * 2021-02-08 2021-06-04 中国科学技术大学 Cross-modal image-text matching method and device and computer readable storage medium
CN113191357A (en) * 2021-05-18 2021-07-30 中国石油大学(华东) Multilevel image-text matching method based on graph attention network
CN113239181A (en) * 2021-05-14 2021-08-10 廖伟智 Scientific and technological literature citation recommendation method based on deep learning
WO2021164199A1 (en) * 2020-02-20 2021-08-26 齐鲁工业大学 Multi-granularity fusion model-based intelligent semantic Chinese sentence matching method, and device

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299262A (en) * 2018-10-09 2019-02-01 中山大学 A text entailment relation recognition method fusing multi-granularity information
CN109815484A (en) * 2018-12-21 2019-05-28 平安科技(深圳)有限公司 Semantic similarity matching method based on cross attention mechanism, and apparatus therefor
WO2020124959A1 (en) * 2018-12-21 2020-06-25 平安科技(深圳)有限公司 Semantic similarity matching method based on cross attention mechanism, and apparatus therefor
CN109858032A (en) * 2019-02-14 2019-06-07 程淑玉 Multi-granularity sentence interaction natural language inference model fusing the Attention mechanism
WO2021051574A1 (en) * 2019-09-16 2021-03-25 平安科技(深圳)有限公司 English text sequence labelling method and system, and computer device
WO2021164199A1 (en) * 2020-02-20 2021-08-26 齐鲁工业大学 Multi-granularity fusion model-based intelligent semantic Chinese sentence matching method, and device
CN112183085A (en) * 2020-09-11 2021-01-05 杭州远传新业科技有限公司 Machine reading understanding method and device, electronic equipment and computer storage medium
CN112905827A (en) * 2021-02-08 2021-06-04 中国科学技术大学 Cross-modal image-text matching method and device and computer readable storage medium
CN113239181A (en) * 2021-05-14 2021-08-10 廖伟智 Scientific and technological literature citation recommendation method based on deep learning
CN113191357A (en) * 2021-05-18 2021-07-30 中国石油大学(华东) Multilevel image-text matching method based on graph attention network

Non-Patent Citations (7)

* Cited by examiner, † Cited by third party
Title
DASOU: "A Comprehensive Overview of Text Similarity/Matching, with Code: an In-Depth Article Not to Be Missed", pages 1, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/180460887> *
FIGO: "Image-Text Matching Based on a Cross-Attention Mechanism", pages 1, Retrieved from the Internet <URL:https://zhuanlan.zhihu.com/p/367534387> *
KUANG-HUEI LEE et al.: "Stacked Cross Attention for Image-Text Matching", Computer Vision – ECCV 2018, pages 212 *
XING XU et al.: "Cross-Modal Attention With Semantic Consistence for Image–Text Matching", IEEE Transactions on Neural Networks and Learning Systems (Volume 31, Issue 12, December 2020), pages 5412-5425 *
吕秀程: "Text Stance Analysis Based on Attention Mechanism and Multi-Task Adversarial Learning", China Master's Theses Full-text Database, Information Science and Technology Series, no. 1, pages 138-2497 *
程淑玉; 郭泽颖; 刘威; 印鉴: "Research on Multi-Granularity Sentence Interaction Natural Language Inference Fusing Attention", Journal of Chinese Computer Systems (小型微型计算机系统), no. 06, pages 1215-1220 *
陈浩: "Research on Text Matching Based on Ensemble Deep Learning", China Master's Theses Full-text Database, Information Science and Technology Series, no. 8, pages 138-777 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115456176A (en) * 2022-10-10 2022-12-09 延边大学 Text matching method and system based on knowledge enhancement
CN116383491A (en) * 2023-03-21 2023-07-04 北京百度网讯科技有限公司 Information recommendation method, apparatus, device, storage medium, and program product
CN116383491B (en) * 2023-03-21 2024-05-24 北京百度网讯科技有限公司 Information recommendation method, apparatus, device, storage medium, and program product

Also Published As

Publication number Publication date
CN114492451B (en) 2023-10-24

Similar Documents

Publication Publication Date Title
CN111538908B (en) Search ranking method and device, computer equipment and storage medium
CN107562792B (en) question-answer matching method based on deep learning
CN111415740B (en) Method and device for processing inquiry information, storage medium and computer equipment
CN109033068B (en) Method and device for reading and understanding based on attention mechanism and electronic equipment
CN108875074B (en) Answer selection method and device based on cross attention neural network and electronic equipment
CN109710915B (en) Method and device for generating repeated statement
CN108846077B (en) Semantic matching method, device, medium and electronic equipment for question and answer text
CN108549658B (en) Deep learning video question-answering method and system based on attention mechanism on syntax analysis tree
CN112069302B (en) Training method of conversation intention recognition model, conversation intention recognition method and device
CN110929515A (en) Reading understanding method and system based on cooperative attention and adaptive adjustment
CN114565104A (en) Language model pre-training method, result recommendation method and related device
CN112580369B (en) Sentence repeating method, method and device for training sentence repeating model
CN109522561B (en) Question and sentence repeated recognition method, device and equipment and readable storage medium
CN114492451B (en) Text matching method, device, electronic equipment and computer readable storage medium
WO2019220113A1 (en) Device and method for natural language processing
WO2019106965A1 (en) Information processing device, information processing method, and program
CN113408430B (en) Image Chinese description system and method based on multi-level strategy and deep reinforcement learning framework
CN109145083B (en) Candidate answer selecting method based on deep learning
CN112699215B (en) Grading prediction method and system based on capsule network and interactive attention mechanism
CN112000788B (en) Data processing method, device and computer readable storage medium
US20220383119A1 (en) Granular neural network architecture search over low-level primitives
CN117236410A (en) Trusted electronic file large language model training and reasoning method and device
CN115510226A (en) Emotion classification method based on graph neural network
CN110852071A (en) Knowledge point detection method, device, equipment and readable storage medium
CN114648032A (en) Training method and device of semantic understanding model and computer equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant