CN113935329A - Asymmetric text matching method based on adaptive feature recognition and denoising - Google Patents
- Publication number
- Publication number: CN113935329A (application number CN202111192675.7A)
- Authority
- CN
- China
- Prior art keywords
- representation
- hash
- document
- matching
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F40/30 — Handling natural language data; semantic analysis
- G06F18/2415 — Pattern recognition; classification techniques based on parametric or probabilistic models, e.g. likelihood ratio
- G06N3/044 — Neural networks; recurrent networks, e.g. Hopfield networks
- G06N3/045 — Neural networks; combinations of networks
- G06N3/048 — Neural networks; activation functions
- G06N3/084 — Neural network learning methods; backpropagation, e.g. using gradient descent
Abstract
The invention relates to an asymmetric text matching method based on adaptive feature recognition and denoising, belonging to the technical field of natural language processing. The method explicitly identifies discriminative features and filters out irrelevant features in a context-aware manner for each asymmetric text pair. Specifically, a matching-adaptive twin cell is first designed to adaptively recognize discriminative features and derive a corresponding hybrid representation for each text pair. A locally constrained hash denoiser is then proposed to perform feature-level denoising of the redundant long text by learning discriminative low-dimensional binary codes, enabling better relevance learning. Extensive experiments on real data sets from four different downstream tasks show that, compared with the latest state-of-the-art methods, the disclosed method achieves substantial performance gains and provides support for downstream tasks such as information retrieval and answer selection.
Description
Technical Field
The invention relates to an asymmetric text matching method based on adaptive feature recognition and denoising, and belongs to the technical field of natural language processing.
Background
Text Matching (TM) is a valuable but challenging task in information retrieval and natural language processing. Given a pair of documents, TM aims to predict their semantic relationship. Note that efficient matching algorithms are an indispensable component of many information retrieval systems, question-answering systems, and dialogue systems. In most application scenarios, the matched sequence pairs (e.g., query-document, keyword-document, and question-answer pairs) differ greatly in the amount of information they carry, i.e., asymmetric text matching. For example, in the InsuranceQA dataset the two documents in a matching pair contain on average 7.15 and 95.54 words respectively, a difference of an order of magnitude. This asymmetry between short queries and long documents makes the task both important and difficult. Asymmetric text matching has become an increasing requirement of many downstream tasks, such as information retrieval and natural language processing. Here, asymmetric means that the documents involved in a match contain different amounts of information, e.g., a short query against relatively long documents.
Early solutions fall into two categories: representation-based models and interaction-based models. The former, including DSSM, SNRM, and ARC-I, process each document independently, using recurrent neural networks (RNN) and long short-term memory networks (LSTM) to learn latent representations of document pairs. The latter instead capture fine-grained interaction signals between the two documents; it is generally believed that such interaction signals greatly improve relevance learning. Examples include DRMM, KNRM, and ARC-II. Recently, with the advent of deep pre-trained language models (LMs) such as BERT, LM-based deep relevance models have pushed the state of the art substantially forward. Specifically, LMs are pre-trained on a large-scale corpus and then applied to TM tasks by computing contextual semantic representations of sentence pairs, with the goal of further eliminating lexical mismatch between documents and queries. Although these efforts achieve significant performance gains, their main drawback is that they omit further feature recognition and denoising between asymmetric texts, which could improve matching performance.
Disclosure of Invention
The invention provides an asymmetric text matching method based on adaptive feature recognition and denoising, and designs a matching-adaptive twin cell (MAGS) that adaptively recognizes discriminative features so as to derive a corresponding hybrid representation for each text pair.
The technical scheme of the invention is as follows: the asymmetric text matching method based on adaptive feature recognition and denoising comprises the following specific steps:
Step1, preprocessing the question-answer matching data sets and the query-document matching data set;
Step2, representing each asymmetric text pair preprocessed in Step1 with a BERT-based context encoder; adaptively identifying discriminative features with the matching-adaptive twin cell, thereby deriving a corresponding hybrid representation for each asymmetric text pair; applying the proposed locally constrained hash denoiser, which performs feature-level denoising of the redundant long text by learning discriminative low-dimensional binary codes; and finally, obtaining the matching score of the asymmetric text pair with a similarity predictor.
As a further aspect of the present invention, in Step1, the question-answer matching data sets include InsuranceQA, WikiQA, and YahooQA, and the query-document matching data set is MS MARCO; the preprocessing includes matching and deleting special characters in the text with regular expressions.
As a further aspect of the present invention, in Step2, representing each asymmetric text pair preprocessed in Step1 with a BERT-based context encoder includes:
BERT is selected as the context encoder. Following the BERT input format, a special token [CLS] is placed at the beginning of each sequence, i.e., {[CLS], q_1, q_2, …, q_l} and {[CLS], d_1, d_2, …, d_t}. The BERT-based context encoder is described as follows:
U_Q = BERT([CLS], q_1, q_2, …, q_l) (1)
V_D = BERT([CLS], d_1, d_2, …, d_t) (2)
where U_Q ∈ R^{l×d} and V_D ∈ R^{t×d} are the context representations of query Q and document D, respectively, and d is the output dimension of BERT; to reduce the number of parameters, prevent overfitting, and facilitate information interaction across the text pair, the query and the document share one context encoder.
As a further aspect of the present invention, in Step2, adaptively identifying discriminative features with the matching-adaptive twin cell to derive a corresponding hybrid representation for each asymmetric text pair comprises:
The feature recognition process is simulated with the adaptive matching twin cell, called MAGS. It is a parallel architecture with two MAG subunits, a query-side MAG and a document-side MAG; since the two are identical, only the query-side MAG is described:
Given the extracted context representations U_Q = [u_1, …, u_l] of the query and V_D = [v_1, …, v_t] of the document, where l and t are the lengths of the query text and the document text respectively, the MAG identifies discriminative features and synthesizes them into relevance features. Specifically, word-level similarity is first calculated as follows:
S = U_Q V_D^T (3)
where S ∈ R^{l×t} is the similarity matrix over all word pairs of the two sequences. These similarity scores are then normalized and, based on V_D, a reference representation is derived for each word in query Q:
R_Q = softmax(S) V_D (4)
The purpose of this operation is to perform soft feature selection over V_D according to S; that is, the information in document D that is relevant to Q is transferred into the representation of Q.
However, the information in Q that this reference representation misses also supports further relevance learning; a complementary feature is therefore constructed as the difference between the original and reference representations: D_Q = U_Q − R_Q. Furthermore, to identify the discriminative features, a gating pattern similar to S is first used to pick out the important features in the two semantic signals R_Q and D_Q, as follows:
E = σ(W_1 S + B_1) (5)
F^(r) = R_Q ⊙ E (6)
F^(d) = D_Q ⊙ (1 − E) (7)
where σ(·) denotes the sigmoid activation function, W_1 and B_1 are a transformation matrix and a bias matrix respectively, and ⊙ is the element-wise product. The two parts F_i^(r) and F_i^(d) are then further combined through an attention mechanism similar to Eq. (5):
p_i = σ(W_2 S_i + B_2) (8)
F_i^(c) = (p_i ⊙ F_i^(r)) ⊕ ((1 − p_i) ⊙ F_i^(d)) (9)
where S_i, F_i^(r), and F_i^(d) are the i-th rows of the matrices S, F^(r), and F^(d) respectively, the symbol ⊕ is the vector concatenation operation, d denotes the output dimension of BERT, and W_2 and B_2 are likewise a transformation matrix and a bias matrix. A highway network is then used to generate the discriminative feature of each word:
p_i = relu(W_3 F_i^(c) + b_3) (10)
g_i = sigmoid(W_4 F_i^(c) + b_4) (11)
i_i = (1 − g_i) ⊙ F_i^(c) + g_i ⊙ p_i (12)
where W_3, W_4 ∈ R^{2d×2d} and W_5 ∈ R^{d×2d} are parameter matrices and b_3, b_4, b_5 are bias vectors; the synthesized hybrid discriminative features are stacked into a matrix:
H_Q = [W_5 i_1, …, W_5 i_l]^T ∈ R^{l×d} (13)
As a further aspect of the present invention, in Step2 a locally constrained hash denoiser is proposed; performing feature-level denoising of the redundant long text by learning discriminative low-dimensional binary codes specifically comprises:
The locally constrained hash denoiser defines an encoding function F_en, a hash function F_h, and a decoding function F_de. (1) The encoding function F_en converts the representation H_D into a low-dimensional matrix B ∈ R^{t×h}. Here a feed-forward network FFN(·), implemented as a three-layer multi-layer perceptron (MLP), models F_en. Furthermore, to filter semantic noise and mitigate the vanishing-gradient problem, relu(·) is chosen as the activation function of the second layer (the others use tanh(·)), which can skip unnecessary features while preserving discriminative cues. The encoding process is summarized as:
B = F_en(H_D) = FFN(H_D) (14)
(2) The hash function F_h learns a discriminative binary matrix representation for denoising and efficient matching. The sgn(·) function is the natural choice for binarization, but sgn(·) is not differentiable; therefore the approximation function tanh(·) replaces sgn(·) to support model training. Specifically, the hash function is expressed as follows:
B_D = F_h(B) = tanh(αB) (15)
Note that the hyper-parameter α is introduced to make the hash function more flexible and to generate balanced, discriminative hash codes; to push the values of B_D toward {−1, 1}, an additional constraint is defined:
L_1 = ||B_D − B^(b)||_F^2 (16)
where B^(b) = sgn(B) denotes the binary matrix representation of H_D, ||·||_F denotes the Frobenius norm, and B_D is the context representation of document D after the hash denoiser, i.e., the binary code generated by the hash function;
(3) The decoding function F_de reconstructs H_D from B_D; it consists of a three-layer multi-layer perceptron that decodes the binary matrix B_D back to the original H_D, so the reconstructed sequence matrix Ĥ_D is defined as follows:
Ĥ_D = F_de(B_D) = FFN^T(B_D) (17)
where FFN^T(·) is the decoder function; to reduce the loss of semantics during reconstruction, the mean squared error is added as a constraint when training the model:
L_2 = MSE(H_D, Ĥ_D) (18)
It is emphasized that, instead of also performing hash denoising on H_Q, the matrix representation H_Q of query Q is updated with a single MLP layer to match the dimension h of the hash denoiser:
H_Q = MLP(H_Q) (19).
As a further aspect of the present invention, in Step2, obtaining the matching score of an asymmetric text pair with the similarity predictor comprises:
For the representation H_Q of query Q (after the MLP projection) and the binary representation B_D of document D (after the hash denoiser), the match score G(Q, D) between query Q and document D is estimated by the MaxSim operator, as follows:
G(Q, D) = Σ_{i=1}^{l} max_{1≤j≤t} Norm(H_Q^i) · Norm(B_D^j) (20)
where Norm(·) denotes L2 normalization, so that the inner product of any two hidden representations lies in [−1, 1], i.e., it is equivalent to cosine similarity; H_Q^i is the vector representation of the i-th word in H_Q, and B_D^j is the j-th vector of B_D.
As a further aspect of the present invention, in Step2, the model optimization includes:
In the training phase, a negative sampling strategy based on the triplet hinge loss is used:
L_3 = max{0, 0.1 − G(Q, D) + G(Q, D^−)} (21)
where D^− is the corresponding negative document sampled from the training set and G(Q, D) is the matching score between query Q and document D;
Finally, the hinge loss is combined with the two constraints of the hash denoiser; that is, the final optimization objective is the linear fusion of L_1, L_2, and L_3:
min_θ L = L_3 + δL_1 + γL_2 (22)
where δ and γ are tunable hyper-parameters that control the importance of the two constraints respectively, θ is the parameter set, and the parameters are updated in an end-to-end fashion on mini-batches using Adam; B_D is the context representation of document D after the hash denoiser, i.e., the binary code generated by the hash function, and B^(b) is the hash code generated for document D by the sgn(·) sign function.
The beneficial effects of the invention are:
For each asymmetric text pair, the invention explicitly distinguishes discriminative features and filters out irrelevant features in a context-aware manner. Specifically, the matching-adaptive twin cell (MAGS) is first designed to adaptively recognize discriminative features, thereby deriving a corresponding hybrid representation for each text pair. The invention further proposes a locally constrained hash denoiser that performs feature-level denoising of the redundant long text by learning discriminative low-dimensional binary codes, thereby achieving better relevance learning. Extensive experiments on real data sets from four different downstream tasks show that the proposed invention achieves substantial performance gains compared with the latest state-of-the-art alternatives.
Drawings
FIG. 1 is a schematic diagram of the model of the present invention;
FIG. 2 is a diagram of the adaptive matching twin cell structure of the present invention;
FIG. 3 is a line graph of the hyper-parameter sensitivity analysis of the present invention.
Detailed Description
Example 1: as shown in fig. 1-3, the asymmetric text matching method based on adaptive feature recognition and denoising specifically includes the following steps:
Step1, preprocessing the question-answer matching data sets and the query-document matching data set;
Step1.1, preprocess the question-answer matching data sets (InsuranceQA, WikiQA, and YahooQA) and the query-document matching data set (MS MARCO); special characters in the text are matched and deleted with regular expressions. The query-document matching data set (MS MARCO) is a collection of 8.8 million web-page paragraphs with about 4 million tuples of query, positive paragraph, and negative paragraph. The present invention reports results on the MS MARCO Dev set, which contains about 6,900 queries; the sizes of the question-answer matching data sets are shown in Table 1:
TABLE 1 Statistics of the QA data sets (the InsuranceQA test set includes Test1 and Test2)
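The special-character deletion step above can be sketched with a regular expression. The patent does not disclose the exact pattern, so the one below (keep word characters and basic punctuation, replace everything else with whitespace) is only an illustrative assumption:

```python
import re

def clean_text(text: str) -> str:
    """Drop special characters by regex matching (illustrative pattern;
    the patent does not specify the exact expression used)."""
    text = re.sub(r"[^\w\s.,?!'-]", " ", text)  # replace special symbols
    return re.sub(r"\s+", " ", text).strip()    # collapse whitespace runs
```

Stray markup and symbols such as `<b>` or `@#$` are reduced to whitespace, while sentence punctuation survives.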
Step2, represent each asymmetric text pair preprocessed in Step1 with a BERT-based context encoder; adaptively identify discriminative features with the matching-adaptive twin cell, thereby deriving a corresponding hybrid representation for each asymmetric text pair; apply the proposed locally constrained hash denoiser, which performs feature-level denoising of the redundant long text by learning discriminative low-dimensional binary codes; and finally, obtain the matching score of the asymmetric text pair with a similarity predictor.
As a further aspect of the present invention, in Step2, representing each asymmetric text pair preprocessed in Step1 with a BERT-based context encoder includes:
BERT is selected as the context encoder. Following the BERT input format, a special token [CLS] is placed at the beginning of each sequence, i.e., {[CLS], q_1, q_2, …, q_l} and {[CLS], d_1, d_2, …, d_t}. The BERT-based context encoder is described as follows:
U_Q = BERT([CLS], q_1, q_2, …, q_l) (1)
V_D = BERT([CLS], d_1, d_2, …, d_t) (2)
where U_Q ∈ R^{l×d} and V_D ∈ R^{t×d} are the context representations of query Q and document D, respectively, and d is the output dimension of BERT; to reduce the number of parameters, prevent overfitting, and facilitate information interaction across the text pair, the query and the document share one context encoder.
As a further aspect of the present invention, in Step2, adaptively identifying discriminative features with the matching-adaptive twin cell to derive a corresponding hybrid representation for each asymmetric text pair comprises:
A human can clearly identify the relationship between two sequences (e.g., query-document, keyword-document, and question-answer) at a glance. For example, a trained researcher can easily classify papers in his/her research direction from the title and abstract alone, because he/she subconsciously identifies the distinguishing features while ignoring the features irrelevant to the decision.
The adaptive matching twin cell (called MAGS) is used to mimic this feature recognition process; it is a parallel architecture with two MAG subunits, a query-side MAG and a document-side MAG. Since the query-side and document-side MAGs are the same, for simplicity the present invention mainly describes the query-side MAG (FIG. 2 illustrates the overall architecture):
Given the extracted context representations U_Q = [u_1, …, u_l] of the query and V_D = [v_1, …, v_t] of the document, where l and t are the lengths of the query text and the document text respectively, the MAG identifies discriminative features and synthesizes them into relevance features. Specifically, word-level similarity is first calculated as follows:
S = U_Q V_D^T (3)
where S ∈ R^{l×t} is the similarity matrix over all word pairs of the two sequences. These similarity scores are then normalized and, based on V_D, a reference representation is derived for each word in query Q:
R_Q = softmax(S) V_D (4)
The purpose of this operation is to perform soft feature selection over V_D according to S; that is, the information in document D that is relevant to Q is transferred into the representation of Q.
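As a sanity check on the shapes, the similarity and soft-selection steps above can be sketched in a few lines of numpy, with random matrices standing in for the BERT outputs U_Q and V_D:

```python
import numpy as np

def reference_representation(U_Q, V_D):
    """Word-level similarity S = U_Q V_D^T, then a row-wise softmax
    over S soft-selects features of V_D, yielding a reference
    representation R_Q for each query word."""
    S = U_Q @ V_D.T                                  # (l, t) similarity matrix
    A = np.exp(S - S.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                # softmax over document words
    R_Q = A @ V_D                                    # (l, d) reference representation
    return S, R_Q
```

Each row of R_Q is a convex combination of the document's word vectors, weighted by its similarity to the corresponding query word.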
However, the information in Q that this reference representation misses also supports further relevance learning; a complementary feature is therefore constructed as the difference between the original and reference representations: D_Q = U_Q − R_Q. Furthermore, to identify the discriminative features, a gating pattern similar to S is first used to pick out the important features in the two semantic signals R_Q and D_Q, as follows:
E = σ(W_1 S + B_1) (5)
F^(r) = R_Q ⊙ E (6)
F^(d) = D_Q ⊙ (1 − E) (7)
where σ(·) denotes the sigmoid activation function, W_1 and B_1 are a transformation matrix and a bias matrix respectively, and ⊙ is the element-wise product. The two parts F_i^(r) and F_i^(d) are then further combined through an attention mechanism similar to Eq. (5):
p_i = σ(W_2 S_i + B_2) (8)
F_i^(c) = (p_i ⊙ F_i^(r)) ⊕ ((1 − p_i) ⊙ F_i^(d)) (9)
where S_i, F_i^(r), and F_i^(d) are the i-th rows of the matrices S, F^(r), and F^(d) respectively, the symbol ⊕ is the vector concatenation operation, d denotes the output dimension of BERT, and W_2 and B_2 are likewise a transformation matrix and a bias matrix. A highway network is then used to generate the discriminative feature of each word:
p_i = relu(W_3 F_i^(c) + b_3) (10)
g_i = sigmoid(W_4 F_i^(c) + b_4) (11)
i_i = (1 − g_i) ⊙ F_i^(c) + g_i ⊙ p_i (12)
where W_3, W_4 ∈ R^{2d×2d} and W_5 ∈ R^{d×2d} are parameter matrices and b_3, b_4, b_5 are bias vectors; the synthesized hybrid discriminative features are stacked into a matrix:
H_Q = [W_5 i_1, …, W_5 i_l]^T ∈ R^{l×d} (13)
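The query-side MAG pipeline can be sketched end-to-end in numpy. This is an assumption-laden illustration: the weight matrices are random stand-ins for learned parameters, biases are omitted for brevity, and the orientation of W_1 and W_2 (mapping the t-dimensional similarity rows into d dimensions so the gates align with R_Q and D_Q) is a shape-consistency assumption rather than something the patent states:

```python
import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

def query_mag(U_Q, V_D, seed=0):
    """Sketch of the query-side MAG: similarity, soft selection,
    gated split, fusion, highway network, and final projection.
    Weights are random stand-ins for learned parameters."""
    rng = np.random.default_rng(seed)
    l, d = U_Q.shape
    t = V_D.shape[0]
    S = U_Q @ V_D.T                                   # similarity, (l, t)
    A = np.exp(S - S.max(1, keepdims=True))
    A /= A.sum(1, keepdims=True)
    R_Q = A @ V_D                                     # reference representation
    D_Q = U_Q - R_Q                                   # complementary signal
    W1 = rng.normal(0, 0.1, (t, d))
    E = sigmoid(S @ W1)                               # (l, d) feature gate
    F_r = R_Q * E                                     # gated reference part
    F_d = D_Q * (1.0 - E)                             # gated complement part
    W2 = rng.normal(0, 0.1, (t, d))
    P = sigmoid(S @ W2)                               # fusion gate
    F_c = np.concatenate([P * F_r, (1 - P) * F_d], axis=1)  # (l, 2d)
    W3 = rng.normal(0, 0.1, (2 * d, 2 * d))
    W4 = rng.normal(0, 0.1, (2 * d, 2 * d))
    Ph = np.maximum(0.0, F_c @ W3)                    # relu branch
    G = sigmoid(F_c @ W4)                             # highway gate
    I = (1 - G) * F_c + G * Ph                        # highway output, (l, 2d)
    W5 = rng.normal(0, 0.1, (2 * d, d))
    return I @ W5                                     # H_Q in R^{l×d}
```

The document-side MAG follows the same flow with Q and D swapped and its own parameters.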
Document-side MAG: similar to the query-side MAG, the document-side MAG unit swaps the roles of Q and D in the same flow, but the parameters of the two subunits are not shared. The invention uses H_D to denote the discriminative features derived by the document-side MAG.
As a further scheme of the present invention, Step2 proposes a locally constrained hash denoising device, where the feature level denoising of a redundant long text by learning a distinct low-dimensional binary code specifically includes:
Since document D is much longer than query Q, the discriminative feature extraction performed by the document-side MAG still leaves considerable semantic noise. Here, the invention employs a locally constrained hash denoiser to further filter out irrelevant features. More specifically, the locally constrained hash denoiser defines an encoding function F_en, a hash function F_h, and a decoding function F_de;
(1) The encoding function F_en converts the representation H_D into a low-dimensional matrix B ∈ R^{t×h}. Here a feed-forward network FFN(·), implemented as a three-layer multi-layer perceptron (MLP), models F_en. Furthermore, to filter semantic noise and mitigate the vanishing-gradient problem, relu(·) is chosen as the activation function of the second layer (the others use tanh(·)), which can skip unnecessary features while preserving discriminative cues. The encoding process is summarized as:
B = F_en(H_D) = FFN(H_D) (14)
(2) The hash function F_h learns a discriminative binary matrix representation for denoising and efficient matching. The sgn(·) function is the natural choice for binarization, but sgn(·) is not differentiable; therefore the approximation function tanh(·) replaces sgn(·) to support model training. Specifically, the hash function is expressed as follows:
B_D = F_h(B) = tanh(αB) (15)
Note that the hyper-parameter α is introduced to make the hash function more flexible and to generate balanced, discriminative hash codes; to push the values of B_D toward {−1, 1}, an additional constraint is defined:
L_1 = ||B_D − B^(b)||_F^2 (16)
where B^(b) = sgn(B) denotes the binary matrix representation of H_D, ||·||_F denotes the Frobenius norm, and B_D is the context representation of document D after the hash denoiser, i.e., the binary code generated by the hash function;
(3) The decoding function F_de reconstructs H_D from B_D; it consists of a three-layer multi-layer perceptron that decodes the binary matrix B_D back to the original H_D, so the reconstructed sequence matrix Ĥ_D is defined as follows:
Ĥ_D = F_de(B_D) = FFN^T(B_D) (17)
where FFN^T(·) is the decoder function; to reduce the loss of semantics during reconstruction, the mean squared error is added as a constraint when training the model:
L_2 = MSE(H_D, Ĥ_D) (18)
It is emphasized that, instead of also performing hash denoising on H_Q, the matrix representation H_Q of query Q is updated with a single MLP layer to match the dimension h of the hash denoiser:
H_Q = MLP(H_Q) (19).
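The encode-hash-decode cycle and its two constraints can be sketched with single-layer stand-ins for the three-layer MLPs; W_enc and W_dec below are illustrative assumptions, not the trained model:

```python
import numpy as np

def hash_denoiser(H_D, W_enc, W_dec, alpha=5.0):
    """Sketch of the locally constrained hash denoiser.
    W_enc (d×h) and W_dec (h×d) are single-layer stand-ins for the
    three-layer MLP encoder and decoder."""
    B = np.tanh(H_D @ W_enc)                 # encode to B in R^{t×h}
    B_D = np.tanh(alpha * B)                 # differentiable soft binarization
    B_b = np.sign(B)                         # hard binary code sgn(B)
    L1 = np.sum((B_D - B_b) ** 2)            # binarization constraint (F-norm^2)
    H_rec = B_D @ W_dec                      # decode back toward R^{t×d}
    L2 = np.mean((H_D - H_rec) ** 2)         # MSE reconstruction constraint
    return B_D, L1, L2
```

A larger alpha drives B_D closer to the hard codes ±1 and therefore shrinks the binarization constraint L1, which is exactly the flexibility the hyper-parameter provides.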
As a further aspect of the present invention, in Step2, obtaining the matching score of an asymmetric text pair with the similarity predictor comprises:
For the representation H_Q of query Q (after the MLP projection) and the binary representation B_D of document D (after the hash denoiser), the match score G(Q, D) between query Q and document D is estimated by the MaxSim operator, as follows:
G(Q, D) = Σ_{i=1}^{l} max_{1≤j≤t} Norm(H_Q^i) · Norm(B_D^j) (20)
where Norm(·) denotes L2 normalization, so that the inner product of any two hidden representations lies in [−1, 1], i.e., it is equivalent to cosine similarity; H_Q^i is the vector representation of the i-th word in H_Q, and B_D^j is the j-th vector of B_D.
As a further aspect of the present invention, in Step2, the purpose of model optimization is to guide the learning of ADDAX and help estimate the matching scores of asymmetric text pairs; the model optimization comprises:
In the training phase, a negative sampling strategy based on the triplet hinge loss is used:
L_3 = max{0, 0.1 − G(Q, D) + G(Q, D^−)} (21)
where D^− is the corresponding negative document sampled from the training set and G(Q, D) is the matching score between query Q and document D;
Finally, the hinge loss is combined with the two constraints of the hash denoiser; that is, the final optimization objective is the linear fusion of L_1, L_2, and L_3:
min_θ L = L_3 + δL_1 + γL_2 (22)
where δ and γ are tunable hyper-parameters that control the importance of the two constraints respectively, θ is the parameter set, and the parameters are updated in an end-to-end fashion on mini-batches using Adam; B_D is the context representation of document D after the hash denoiser, i.e., the binary code generated by the hash function, and B^(b) is the hash code generated for document D by the sgn(·) sign function.
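The training objective reduces to a few arithmetic lines; the linear-fusion form L = L_3 + δL_1 + γL_2 is the natural reading of the objective described above, with the default weights taken from the experiment section:

```python
def triplet_hinge(score_pos, score_neg, margin=0.1):
    """Triplet hinge loss with the 0.1 margin used in the patent."""
    return max(0.0, margin - score_pos + score_neg)

def total_loss(L1, L2, L3, delta=1e-6, gamma=0.003):
    """Hinge loss plus the weighted hash-denoiser constraints; the
    delta and gamma defaults follow the experiment section."""
    return L3 + delta * L1 + gamma * L2
```

The hinge term is zero whenever the positive document already outscores the negative one by at least the margin, so gradients flow only through violated triplets.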
To verify the effectiveness of the invention, the evaluation metrics, detailed experimental parameter settings, and baseline models used for comparison are introduced below, and the experimental results are analyzed and discussed.
1. The evaluation metrics of the invention are MRR (Mean Reciprocal Rank), P@1 (Precision at 1), and MAP (Mean Average Precision). In the experiments, BERT_base is selected as the context encoder in ADDAX. More specifically, the hidden dimension is set to h = 300. The mini-batch sizes for InsuranceQA, WikiQA, YahooQA, and MS MARCO were set to 32, 64, and 64, respectively. The dropout rate was set to 0.1. The learning rates for InsuranceQA, MS MARCO, WikiQA, and YahooQA were 5e-6, 5e-6, 1e-5, and 9e-6, respectively. The number of training epochs was 60 for InsuranceQA, 18 for WikiQA, and 9 for YahooQA. In addition, the number of iterations on MS MARCO is 200,000. The values of α, δ, and γ are set to 5, 1e-6, and 0.003, respectively.
2. Since asymmetric text matching has become an increasing demand in many downstream tasks, such as information retrieval and answer selection, experiments are conducted on four real data sets, covering question-answer matching and document retrieval tasks, to evaluate the effectiveness of the proposed ADDAX. At the same time, the invention compares ADDAX with two types of state-of-the-art baselines: the first type performs question-answer matching, and the other performs document retrieval.
Question-answer matching: the selected baseline models for answer selection can be divided into four categories: (a) traditional single models: IARNN-GATE, AP-CNN, RNN-POA, AP-BiLSTM, HD-LSTM, AP-LSTM, Multihop-Sequential-LSTM, HyperQA, MULT, TFM + HN, LSTM-CNN + HN; (b) single models incorporating external knowledge: KAN, CKANN; (c) ensemble models: SUM(BASE,PTK), LRXNET, SD (BiLSTM + TFM); (d) BERT-based models: HAS, BERT-pooling and BERT-attention.
Document retrieval: the invention first takes BM25 as a baseline, which is a representative traditional retrieval method. Interaction-based neural ranking models are also included, such as KNRM, fastText + ConvKNRM and Duet. Furthermore, since the proposed ADDAX uses BERT as the context encoder, the invention selects several recent methods based on pre-trained language models, including the BERT-base reranker, DeepCT, docT5query, ColBERT, TCT-ColBERT, COIL-tok and COIL-full. In addition, two dense retrievers, CLEAR and ADORE + STAR, are added for performance comparison.
3. To verify the effectiveness of the proposed ADDAX, and considering the different task properties and data characteristics, the state-of-the-art models compared on the four data sets differ. Table 2 summarizes the performance of the 22 answer-selection methods on question-answer matching for the corresponding three data sets. The experimental results are discussed separately for each data set.
Table 2 shows the performance comparison between the proposed ADDAX and several state-of-the-art baselines on the QA data sets; results that are not available are indicated by "--". The best results are highlighted in bold.
Results on insuranceQA. Table 2 summarizes the experimental results on the insuranceQA data set. It is observed that traditional single models, such as AP-CNN, AP-BiLSTM, Multihop-Sequential-LSTM and IARNN-GATE, have much lower P@1 values on both test sets than MULT, LSTM-CNN + HN and TFM + HN. Furthermore, it is not surprising that BERT-based methods (e.g., BERT-pooling, BERT-attention and HAS) consistently yield better performance than the single models: because BERT is pre-trained on large-scale corpora, it can leverage rich common knowledge to help eliminate lexical mismatches. These phenomena are consistent with conclusions drawn from previous work. The single models that incorporate external knowledge (such as KAN, CKANN and CKANN-L) are superior to the traditional single models and the BERT-based models, because they can extract relevant information from external knowledge and knowledge graphs (KGs), enriching the semantic signals and verifying the effectiveness of incorporating external knowledge. At the same time, it can be seen that the performance of ADDAX is significantly better than nearly all baselines on the insuranceQA data set (except CKANN on Test 2).
Results on wikiQA. From Table 2, MAP and MRR performance on wikiQA is analyzed for a total of 17 methods. It is observed that the single models with external knowledge obtain no significant advantage compared with some plain single models (e.g., MULT and Multihop-Sequential-LSTM). For example, MULT achieves a 1.13% MAP performance gain over CKANN. Possible causes of this phenomenon are: (1) the lack of wikiQA training data leads to insufficient relevance learning; (2) the integration of irrelevant external knowledge may produce semantic noise. Second, regarding the ensemble models, SUM(BASE,PTK) and LRXNET are superior to SD (BiLSTM + TFM) in both MAP and MRR values. Clearly, the ensemble models achieve better matching performance than the traditional single models and the models with external knowledge. This observation indicates that integrating multiple models is critical to improving generalization capability. Third, compared with the state-of-the-art BERT-based methods, BERT-pooling consistently performs worse than BERT-attention and HAS. This observation is consistent across the three data sets, which indicates that interaction modeling plays an important role in text matching. In contrast, ADDAX achieves better performance than all baselines on the wikiQA data set.
Results on yahooQA. From the results in Table 2, a performance pattern similar to the insuranceQA data set is observed. ADDAX is clearly superior to all baselines in MAP and MRR. Specifically, the MAP value of the proposed ADDAX is improved by 3.23% compared with CKANN (the best baseline).
Table 3 Experimental results on MS MARCO. The best performance is highlighted in bold. Δ% indicates the relative improvement of ADDAX over all baseline models.
Experimental results on MS MARCO. Table 3 reports the performance comparison of different document retrieval models on MS MARCO. From Table 3, first, the performance of the traditional query-document matching technique (i.e., BM25) is consistently much worse than that of the deep learning solutions (e.g., KNRM and fastText + ConvKNRM), which is not surprising. Second, among all neural matching models, the methods based on pre-trained language models (e.g., BERT-base, ColBERT and COIL-full) achieve better matching accuracy than KNRM, fastText + ConvKNRM and Duet. This is because the powerful language expression capability of pre-trained language models greatly alleviates the vocabulary mismatch problem. Note that DeepCT and docT5query, while able to break the term-frequency constraint with pre-trained language models, still perform poorly in semantic matching. Furthermore, it is worth noting that the dense retrievers are almost competitive with the models based on pre-trained language models. Third, ADDAX always achieves the best performance on the MS MARCO data set, improving MRR@10 by 1.2%-17.4% over all baselines. Overall, the above comparisons performed on two different tasks and data sets consistently show that the proposed ADDAX achieves significant performance gains. These results demonstrate that the adaptive matching twin cells and the hash denoiser used in ADDAX can improve asymmetric text matching accuracy by performing feature recognition and denoising.
4. To verify that each module of the model contributes to the whole, the following comparison and ablation experiments are designed. More specifically, ADDAX is compared with the following variants: (a) w/o MAG, removing the adaptive matching twin cells; (b) w/o FD, without the feature recognition described in equations 5-9; (c) w/o HW, removing the fusion of the two semantic signals by the highway network and directly adding them instead; (d) w/o HD, without the locally constrained hash denoiser.
Table 4 Ablation experiment results
Table 4 reports the results of these experiments on the MS MARCO and wikiQA data sets. It can be seen that eliminating the adaptive matching twin cells leads to the largest performance reduction, followed by the hash denoiser. In particular, the MAP and MRR values of w/o MAG on wikiQA decrease by 8.77% and 8.48%, respectively, and the MRR@10 value on MS MARCO decreases by 1.25%. This suggests that the adaptive matching twin cells play a crucial role in identifying discriminative features in ADDAX to improve matching accuracy. In addition, w/o HD also causes performance degradation, which demonstrates the effectiveness of performing feature-level denoising at the document end. More specifically, for each structure designed in the MAGs, the following conclusions are drawn: (i) a performance degradation of w/o FD can be observed, which indicates that adaptively highlighting the different kinds of semantic signals is important; (ii) the performance of w/o HW is also somewhat degraded, which shows that the highway network synthesizes the hybrid discriminative features more effectively.
5. In this section, sensitivity analysis is performed on four important hyper-parameters (α, δ, γ and h) on the wikiQA test set. From Fig. 3, it can be seen that increasing α to 5 (see Fig. 3(c)) improves matching performance by learning a more robust hash function. Further, Fig. 3(b) plots the performance curve obtained by changing the value of δ. It is observed that ADDAX is insensitive to δ in the range [1e-7, 1e-5], and better matching accuracy is obtained when δ = 1e-6. Fig. 3(a) plots the performance curve obtained by varying the value of γ; when γ is greater or smaller than 0.003, performance becomes worse.
To select the most suitable low-dimensional space h, experiments are conducted by adjusting h among {64, 128, 256, 300, 512}. The results are shown in Fig. 3(d). ADDAX consistently achieves better matching accuracy on the wikiQA data set when h = 300. As h becomes smaller or larger, a certain performance degradation of ADDAX occurs. This may be because a small h produces insufficient semantic signal, while a large value inevitably leads to over-fitting of the model.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (7)
1. The asymmetric text matching method based on the self-adaptive feature recognition and denoising is characterized by comprising the following steps: the method comprises the following specific steps:
step1, preprocessing the question-answer matching data set and the query-document matching data set;
step2, performing context representation on each asymmetric text pair preprocessed in Step1 by using a BERT-based context encoder; adaptively identifying discriminative features based on adaptive matching twin cells, thereby deriving a corresponding hybrid representation for each asymmetric text pair; proposing a locally constrained hash denoiser, which performs feature-level denoising on the redundant long text by learning discriminative low-dimensional binary codes; and finally, obtaining the matching score of the asymmetric text pair by using a similarity predictor.
2. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 1, wherein: in Step1, the question-answer matching data sets comprise insuranceQA, wikiQA and yahooQA, and the query-document matching data set adopts MS MARCO; the preprocessing comprises matching and deleting special characters in the text by using regular expressions.
3. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 1, wherein: in Step2, the context representation of each asymmetric text pair preprocessed in Step1 by a BERT-based context encoder comprises:
selecting BERT as the context encoder and following the format of BERT input, a special token [CLS] is placed at the beginning of each sequence, i.e., {[CLS], q1, q2, …, ql} and {[CLS], d1, d2, …, dt}; the BERT-based context encoder is described as follows:
UQ=BERT([CLS],q1,q2,…,ql) (1)
VD=BERT([CLS],d1,d2,…,dt) (2)
wherein UQ ∈ R^(l×d) and VD ∈ R^(t×d) represent the context representations of query Q and document D, respectively; d represents the output dimension of BERT; to reduce the number of parameters, prevent over-fitting, and facilitate information interaction across text pairs, the query and the document share one context encoder.
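The shared-encoder idea of equations 1-2, where one set of weights encodes both query and document, can be illustrated with a toy numpy encoder standing in for BERT. This is only a sketch: the embedding table, single transformation, and token ids are all illustrative stand-ins, not the actual BERT computation.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab, d = 20, 8
E = rng.normal(size=(vocab, d))  # toy embedding table standing in for BERT
W = rng.normal(size=(d, d))      # single shared transformation

def encode(token_ids):
    # Both query and document reuse E and W: one shared context encoder,
    # fewer parameters, and representations that live in the same space.
    return np.tanh(E[token_ids] @ W)

U_Q = encode([1, 2, 3])     # {[CLS], q1, q2} as toy token ids (id 1 = [CLS])
V_D = encode([1, 4, 5, 6])  # {[CLS], d1, d2, d3}
```

Because the weights are shared, the [CLS] row of both outputs is identical, which is exactly what enables direct interaction between the two representations.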
4. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 1, wherein: in Step2, adaptively identifying discriminative features based on the adaptive matching twin cells, so as to derive a corresponding hybrid representation for each asymmetric text pair, comprises:
the feature recognition process is modeled using the adaptive matching twin cells, called MAGs; the adaptive matching twin cell is a parallel architecture with two MAG subunits, namely a query-side MAG and a document-side MAG; since the query-side and document-side MAGs are identical, the query-side MAG is described:
given the extracted context representation UQ = [u1, …, ul] of the query and the context representation VD = [v1, …, vt] of the document, where l and t respectively denote the length of the query text and the length of the document text, the MAG identifies the discriminative features and synthesizes them into relevance features; specifically, the word-level similarity is first calculated as follows:

S = UQ VD^T (3)
wherein S ∈ R^(l×t) is the similarity matrix of all word pairs in the two sequences; these similarity scores are then normalized, and a reference representation is derived for each word in query Q based on VD:
RQ = softmax(S)VD (4)
the purpose of this operation is to perform soft feature selection on VD according to S; that is, the relevant information in document D is transferred to the representation of Q;
however, beyond this reference representation, the information in Q itself also contributes to further relevance learning; the supplementary features are first constructed by taking the difference between the reference representation and the original representation: DQ = UQ − RQ; furthermore, to identify the discriminative features, a similarity pattern based on S is first used to identify the important features in the two semantic signals RQ and DQ, as follows:
E=σ(W1S+B1) (5)
F(r)=RQ⊙E (6)
F(d)=DQ⊙(1-E) (7)
where σ (-) denotes a sigmoid activation function, W1And B1Respectively, transformation matrix and offsetMatrix, and |, is an element bitwise product operation; then, the two parts are further connected, i.e. Fi (r)And Fi (d)By an attention mechanism similar to equation 5:
pi=σ(W2Si+B2) (8)
wherein Si, Fi(r) and Fi(d) respectively correspond to the i-th row of the matrices S, F(r) and F(d), ⊕ denotes the vector concatenation operation, d denotes the output dimension of BERT, and W2 and B2 are likewise a transformation matrix and a bias matrix, respectively; then, a highway network is used to generate the discriminative features of each word:
pi = relu(W3 Fi(c) + b3) (10)
gi = sigmoid(W4 Fi(c) + b4) (11)
ii = (1 − gi) ⊙ Fi(c) + gi ⊙ pi (12)
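The MAG computations above can be sketched end to end in numpy. This is a sketch under stated assumptions: dot-product word similarity for the similarity matrix, W1 applied on the right of S so the shapes work out, and a plain concatenation of the two gated signals in place of the attention combination of equations 8-9; all weights are random stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

l, t, d = 4, 6, 8              # query length, document length, hidden size
U_Q = rng.normal(size=(l, d))  # context representation of the query
V_D = rng.normal(size=(t, d))  # context representation of the document

S = U_Q @ V_D.T                # word-pair similarities (dot product assumed)

# reference representation: soft feature selection over V_D  (Eq. 4)
R_Q = softmax(S, axis=1) @ V_D
# supplementary features: difference from the original representation
D_Q = U_Q - R_Q

# gate E highlights important parts of both signals (Eqs. 5-7);
# W1 projects similarity rows to the feature dimension (shape assumed)
W1, B1 = rng.normal(size=(t, d)), np.zeros(d)
E = sigmoid(S @ W1 + B1)
F_r = R_Q * E                  # gated reference signal
F_d = D_Q * (1.0 - E)          # complementary gated difference signal

# highway network fuses the combined signal (Eqs. 10-12)
F_c = np.concatenate([F_r, F_d], axis=1)          # (l, 2d), simplified mix
W3, b3 = rng.normal(size=(2 * d, 2 * d)), np.zeros(2 * d)
W4, b4 = rng.normal(size=(2 * d, 2 * d)), np.zeros(2 * d)
p = np.maximum(F_c @ W3 + b3, 0.0)                # relu transform
g = sigmoid(F_c @ W4 + b4)                        # carry/transform gate
out = (1.0 - g) * F_c + g * p                     # per-word discriminative features
```

The gate g lets each feature either pass through unchanged or take the transformed path, which is the standard highway-network behavior the claim describes.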
5. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 1, wherein: in Step2, proposing the locally constrained hash denoiser, which performs feature-level denoising on the redundant long text by learning discriminative low-dimensional binary codes, specifically comprises:
the locally constrained hash denoiser defines an encoding function Fen, a hash function Fh and a decoding function Fde;
(1) the encoding function Fen maps the representation HD into a low-dimensional matrix B ∈ R^(t×h); here, a feed-forward network FFN(·) implemented by a three-layer multi-layer perceptron (MLP) is used to model Fen; furthermore, in order to filter semantic noise and alleviate the gradient vanishing problem, relu(·) is chosen as the activation function of the second layer, which can skip unnecessary features and preserve discriminative clues; the encoding process is summarized as follows:
B=Fen(HD)=FFN(HD) (14)
(2) the hash function Fh is used to learn a discriminative binary matrix representation for denoising and efficient matching; the sgn(·) function is the ideal choice for binarization, but sgn(·) is non-differentiable; therefore, the approximation function tanh(·) is used to replace sgn(·) to support model training; specifically, the hash function is expressed as follows:
BD=Fh(B)=tanh(αB) (15)
note that the hyper-parameter α is introduced to make the hash function more flexible and to generate balanced, discriminative hash codes; to ensure that the values in B belong to {−1, 1}, an additional constraint is defined:

L1 = ||BD − B(b)||F^2 (16)
wherein B(b) = sgn(B) denotes the binary matrix representation of HD, ||·||F denotes the F-norm, and BD represents the context representation of document D after passing through the hash denoiser, namely the binary code generated by the hash function;
(3) the decoding function Fde reconstructs HD from BD; it consists of a three-layer multi-layer perceptron that decodes the binary matrix BD back to the original HD; thus, the reconstructed sequence matrix ĤD is defined as follows:

ĤD = Fde(BD) = FFNT(BD) (17)
wherein FFNT functions as the decoder; to reduce the loss of semantics during the reconstruction process, the mean square error (MSE) between HD and ĤD is added as a constraint when training the model:

L2 = MSE(HD, ĤD) (18)
it is emphasized that hash denoising is also performed on HQ; the matrix representation HQ of query Q is updated using a single MLP layer to match the dimension h of the hash denoiser:
HQ = MLP(HQ) (19).
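The encode-hash-decode pipeline of the denoiser can be sketched in numpy. This is a sketch under assumptions: the first-layer activation, the exact mirrored decoder, and the squared-F-norm form of the balance constraint L1 are all illustrative choices consistent with, but not specified by, the claim; only the second-layer relu, the tanh(αB) hash, and the MSE constraint are stated in the text.

```python
import numpy as np

rng = np.random.default_rng(1)
t, d, h = 6, 8, 4              # document length, BERT dim, hash dim (toy sizes)
H_D = rng.normal(size=(t, d))  # document representation to denoise

# three-layer MLP encoder F_en; relu is the second layer's activation (Eq. 14)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, h))
W3 = rng.normal(size=(h, h))
x = np.tanh(H_D @ W1)          # layer 1 (activation assumed)
x = np.maximum(x @ W2, 0.0)    # layer 2: relu skips unnecessary features
B = x @ W3                     # layer 3: low-dimensional matrix B in R^{t x h}

# hash function F_h: tanh(alpha * B) approximates sgn during training (Eq. 15)
alpha = 5.0
B_D = np.tanh(alpha * B)

# binary codes via sgn, and a balance constraint pulling tanh toward {-1, 1}
B_b = np.sign(B)
L1 = np.linalg.norm(B_D - B_b) ** 2

# decoder F_de (rough mirror of the encoder) and MSE reconstruction constraint
H_rec = np.tanh(B_D @ W3.T) @ W2.T @ W1.T
L2 = np.mean((H_D - H_rec) ** 2)
```

Minimizing L1 drives the relaxed codes toward true ±1 bits, while minimizing L2 keeps the codes informative enough to rebuild the original representation.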
6. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 1, wherein: in Step2, obtaining the matching score of an asymmetric text pair by using the similarity predictor comprises:
given the context representation of query Q after passing through the hash denoiser and the context representation of document D after passing through the hash denoiser, the matching score G(Q, D) between query Q and document D is estimated by the MaxSim operator, as follows:
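The MaxSim scoring step can be sketched as follows. The ColBERT-style form, where each query token takes its maximum dot-product similarity over all document codes and the per-token maxima are summed, is an assumption; the toy representations below are illustrative only.

```python
import numpy as np

def maxsim_score(Q_rep, D_codes):
    # MaxSim (assumed ColBERT-style): max similarity per query token
    # over all document codes, summed over the query tokens (Eq. 20).
    sim = Q_rep @ D_codes.T          # (l, t) token-pair similarities
    return float(sim.max(axis=1).sum())

Q_rep = np.array([[1.0, 0.0],
                  [0.0, 1.0]])      # toy query representation after projection
D_codes = np.array([[1.0, 1.0],
                    [-1.0, 1.0]])   # toy binary document codes
score = maxsim_score(Q_rep, D_codes)
```

Because each query token only needs its best-matching document code, the operator tolerates the residual noise in the remaining document positions.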
7. The asymmetric text matching method based on adaptive feature recognition and denoising as claimed in claim 5, wherein: in Step2, model optimization includes:
in the training phase, a negative sampling strategy based on the triplet hinge loss is used:
L3=max{0,0.1-G(Q,D)+G(Q,D-)} (21)
wherein D- is the corresponding negative sample document sampled from the training set, and G(Q, D) is the matching score between query Q and document D;
finally, the hinge loss is combined with the two constraints in the hash denoiser; that is, the final optimization objective is a linear fusion of L1, L2 and L3:

L(θ) = L3 + δL1 + γL2 (22)
where δ and γ are tunable hyper-parameters that respectively control the importance of the two constraints, and θ is the parameter set; the parameters are updated in an end-to-end fashion on mini-batches using Adam; BD is the context representation of document D after passing through the hash denoiser, namely the binary code generated by the hash function, and B(b) is the hash code generated from document D by the sgn sign function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111192675.7A CN113935329B (en) | 2021-10-13 | 2021-10-13 | Asymmetric text matching method based on adaptive feature recognition and denoising |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113935329A true CN113935329A (en) | 2022-01-14 |
CN113935329B CN113935329B (en) | 2022-12-13 |
Family
ID=79278623
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111192675.7A Active CN113935329B (en) | 2021-10-13 | 2021-10-13 | Asymmetric text matching method based on adaptive feature recognition and denoising |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113935329B (en) |
Citations (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254210A1 (en) * | 2011-03-28 | 2012-10-04 | Siva Kiran Dhulipala | Systems and methods of utf-8 pattern matching |
CN102855276A (en) * | 2012-07-20 | 2013-01-02 | 北京大学 | Method for judging polarity of comment text and application of method |
CN104657350A (en) * | 2015-03-04 | 2015-05-27 | 中国科学院自动化研究所 | Hash learning method for short text integrated with implicit semantic features |
US9424598B1 (en) * | 2013-12-02 | 2016-08-23 | A9.Com, Inc. | Visual search in a controlled shopping environment |
US20160323442A1 (en) * | 2015-05-01 | 2016-11-03 | Vyng, Inc. | Methods and systems for management of video and ring tones among mobile devices |
CN106250777A (en) * | 2016-07-26 | 2016-12-21 | 合肥赛猊腾龙信息技术有限公司 | In the leakage-preventing system of data, a kind of document fingerprint extracts and matching process |
CN106599129A (en) * | 2016-12-02 | 2017-04-26 | 山东科技大学 | Multi-beam point cloud data denoising method considering terrain characteristics |
CN106776553A (en) * | 2016-12-07 | 2017-05-31 | 中山大学 | A kind of asymmetric text hash method based on deep learning |
EP3185221A1 (en) * | 2015-12-23 | 2017-06-28 | Friedrich Kisters | Authentication apparatus and method for optical or acoustic character recognition |
CN107184203A (en) * | 2017-07-03 | 2017-09-22 | 重庆大学 | Electrocardiosignal Feature point recognition method based on adaptive set empirical mode decomposition |
US20170293678A1 (en) * | 2016-04-11 | 2017-10-12 | Nuance Communications, Inc. | Adaptive redo for trace text input |
US20180048762A1 (en) * | 2015-05-01 | 2018-02-15 | Vyng, Inc. | Methods and systems for management of media content associated with message context on mobile computing devices |
CN108074310A (en) * | 2017-12-21 | 2018-05-25 | 广东汇泰龙科技有限公司 | Voice interactive method and intelligent lock administration system based on sound identification module |
CN108244744A (en) * | 2016-12-29 | 2018-07-06 | ***通信有限公司研究院 | A kind of method of moving state identification, sole and footwear |
CN108319672A (en) * | 2018-01-25 | 2018-07-24 | 南京邮电大学 | Mobile terminal malicious information filtering method and system based on cloud computing |
CN108491430A (en) * | 2018-02-09 | 2018-09-04 | 北京邮电大学 | It is a kind of based on the unsupervised Hash search method clustered to characteristic direction |
CN109033478A (en) * | 2018-09-12 | 2018-12-18 | 重庆工业职业技术学院 | A kind of text information law analytical method and system for search engine |
CN109119085A (en) * | 2018-08-24 | 2019-01-01 | 深圳竹云科技有限公司 | A kind of relevant audio recognition method of asymmetric text based on wavelet analysis and super vector |
CN109858018A (en) * | 2018-12-25 | 2019-06-07 | 中国科学院信息工程研究所 | A kind of entity recognition method and system towards threat information |
CN109960732A (en) * | 2019-03-29 | 2019-07-02 | 广东石油化工学院 | A kind of discrete Hash cross-module state search method of depth and system based on robust supervision |
CN109992648A (en) * | 2019-04-10 | 2019-07-09 | 北京神州泰岳软件股份有限公司 | The word-based depth text matching technique and device for migrating study |
CN110019685A (en) * | 2019-04-10 | 2019-07-16 | 北京神州泰岳软件股份有限公司 | Depth text matching technique and device based on sequence study |
CN110020002A (en) * | 2018-08-21 | 2019-07-16 | 平安普惠企业管理有限公司 | Querying method, device, equipment and the computer storage medium of event handling scheme |
CN110147531A (en) * | 2018-06-11 | 2019-08-20 | 广州腾讯科技有限公司 | A kind of recognition methods, device and the storage medium of Similar Text content |
CN110166478A (en) * | 2019-05-30 | 2019-08-23 | 陕西交通电子工程科技有限公司 | Content of text safe transmission method, device, computer equipment and storage medium |
CN110321562A (en) * | 2019-06-28 | 2019-10-11 | 广州探迹科技有限公司 | A kind of short text matching process and device based on BERT |
CN110390023A (en) * | 2019-07-02 | 2019-10-29 | 安徽继远软件有限公司 | A kind of knowledge mapping construction method based on improvement BERT model |
CN110472230A (en) * | 2019-07-11 | 2019-11-19 | 平安科技(深圳)有限公司 | The recognition methods of Chinese text and device |
CN110610001A (en) * | 2019-08-12 | 2019-12-24 | 大箴(杭州)科技有限公司 | Short text integrity identification method and device, storage medium and computer equipment |
CN110688861A (en) * | 2019-09-26 | 2020-01-14 | 沈阳航空航天大学 | Multi-feature fusion sentence-level translation quality estimation method |
CN110717325A (en) * | 2019-09-04 | 2020-01-21 | 北京三快在线科技有限公司 | Text emotion analysis method and device, electronic equipment and storage medium |
CN111078911A (en) * | 2019-12-13 | 2020-04-28 | 宁波大学 | Unsupervised hashing method based on self-encoder |
US20200151503A1 (en) * | 2018-11-08 | 2020-05-14 | Adobe Inc. | Training Text Recognition Systems |
CN111209401A (en) * | 2020-01-03 | 2020-05-29 | 西安电子科技大学 | System and method for classifying and processing sentiment polarity of online public opinion text information |
CN111460077A (en) * | 2019-01-22 | 2020-07-28 | 大连理工大学 | Cross-modal Hash retrieval method based on class semantic guidance |
CN111460176A (en) * | 2020-05-11 | 2020-07-28 | 南京大学 | Multi-document machine reading understanding method based on Hash learning |
CN111581956A (en) * | 2020-04-08 | 2020-08-25 | 国家计算机网络与信息安全管理中心 | Sensitive information identification method and system based on BERT model and K nearest neighbor |
CN111737706A (en) * | 2020-05-11 | 2020-10-02 | 华南理工大学 | Front-end portrait encryption and identification method with biological feature privacy protection function |
CN112199520A (en) * | 2020-09-19 | 2021-01-08 | 复旦大学 | Cross-modal Hash retrieval algorithm based on fine-grained similarity matrix |
CN112732748A (en) * | 2021-01-07 | 2021-04-30 | 西安理工大学 | Non-invasive household appliance load identification method based on adaptive feature selection |
CN112906716A (en) * | 2021-02-25 | 2021-06-04 | 北京理工大学 | Noisy SAR image target identification method based on wavelet de-noising threshold self-learning |
CN112925888A (en) * | 2019-12-06 | 2021-06-08 | 上海大岂网络科技有限公司 | Method and device for training question-answer response and small sample text matching model |
CN112989055A (en) * | 2021-04-29 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Text recognition method and device, computer equipment and storage medium |
CN113064959A (en) * | 2020-01-02 | 2021-07-02 | 南京邮电大学 | Cross-modal retrieval method based on deep self-supervision sorting Hash |
CN113076398A (en) * | 2021-03-30 | 2021-07-06 | 昆明理工大学 | Cross-language information retrieval method based on bilingual dictionary mapping guidance |
CN113239181A (en) * | 2021-05-14 | 2021-08-10 | 廖伟智 | Scientific and technological literature citation recommendation method based on deep learning |
Patent Citations (47)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20120254210A1 (en) * | 2011-03-28 | 2012-10-04 | Siva Kiran Dhulipala | Systems and methods of utf-8 pattern matching |
EP2691883A1 (en) * | 2011-03-28 | 2014-02-05 | Citrix Systems Inc. | Systems and methods of utf-8 pattern matching |
CN102855276A (en) * | 2012-07-20 | 2013-01-02 | 北京大学 | Method for judging polarity of comment text and application of method |
US9424598B1 (en) * | 2013-12-02 | 2016-08-23 | A9.Com, Inc. | Visual search in a controlled shopping environment |
CN104657350A (en) * | 2015-03-04 | 2015-05-27 | 中国科学院自动化研究所 | Hash learning method for short text integrated with implicit semantic features |
US20160323442A1 (en) * | 2015-05-01 | 2016-11-03 | Vyng, Inc. | Methods and systems for management of video and ring tones among mobile devices |
US20180048762A1 (en) * | 2015-05-01 | 2018-02-15 | Vyng, Inc. | Methods and systems for management of media content associated with message context on mobile computing devices |
EP3185221A1 (en) * | 2015-12-23 | 2017-06-28 | Friedrich Kisters | Authentication apparatus and method for optical or acoustic character recognition |
US20170293678A1 (en) * | 2016-04-11 | 2017-10-12 | Nuance Communications, Inc. | Adaptive redo for trace text input |
CN106250777A (en) * | 2016-07-26 | 2016-12-21 | 合肥赛猊腾龙信息技术有限公司 | In the leakage-preventing system of data, a kind of document fingerprint extracts and matching process |
CN106599129A (en) * | 2016-12-02 | 2017-04-26 | 山东科技大学 | Multi-beam point cloud data denoising method considering terrain characteristics |
CN106776553A (en) * | 2016-12-07 | 2017-05-31 | 中山大学 | A kind of asymmetric text hash method based on deep learning |
CN108244744A (en) * | 2016-12-29 | 2018-07-06 | ***通信有限公司研究院 | A kind of method of moving state identification, sole and footwear |
CN107184203A (en) * | 2017-07-03 | 2017-09-22 | 重庆大学 | Electrocardiosignal Feature point recognition method based on adaptive set empirical mode decomposition |
CN108074310A (en) * | 2017-12-21 | 2018-05-25 | 广东汇泰龙科技有限公司 | Voice interactive method and intelligent lock administration system based on sound identification module |
CN108319672A (en) * | 2018-01-25 | 2018-07-24 | 南京邮电大学 | Mobile terminal malicious information filtering method and system based on cloud computing |
CN108491430A (en) * | 2018-02-09 | 2018-09-04 | 北京邮电大学 | It is a kind of based on the unsupervised Hash search method clustered to characteristic direction |
CN110147531A (en) * | 2018-06-11 | 2019-08-20 | 广州腾讯科技有限公司 | A kind of recognition methods, device and the storage medium of Similar Text content |
CN110020002A (en) * | 2018-08-21 | 2019-07-16 | 平安普惠企业管理有限公司 | Querying method, device, equipment and the computer storage medium of event handling scheme |
CN109119085A (en) * | 2018-08-24 | 2019-01-01 | 深圳竹云科技有限公司 | A kind of relevant audio recognition method of asymmetric text based on wavelet analysis and super vector |
CN109033478A (en) * | 2018-09-12 | 2018-12-18 | 重庆工业职业技术学院 | A kind of text information law analytical method and system for search engine |
US20200151503A1 (en) * | 2018-11-08 | 2020-05-14 | Adobe Inc. | Training Text Recognition Systems |
CN109858018A (en) * | 2018-12-25 | 2019-06-07 | 中国科学院信息工程研究所 | A kind of entity recognition method and system towards threat information |
CN111460077A (en) * | 2019-01-22 | 2020-07-28 | 大连理工大学 | Cross-modal Hash retrieval method based on class semantic guidance |
CN109960732A (en) * | 2019-03-29 | 2019-07-02 | 广东石油化工学院 | A kind of discrete Hash cross-module state search method of depth and system based on robust supervision |
CN109992648A (en) * | 2019-04-10 | 2019-07-09 | 北京神州泰岳软件股份有限公司 | The word-based depth text matching technique and device for migrating study |
CN110019685A (en) * | 2019-04-10 | 2019-07-16 | 北京神州泰岳软件股份有限公司 | Depth text matching technique and device based on sequence study |
CN110166478A (en) * | 2019-05-30 | 2019-08-23 | 陕西交通电子工程科技有限公司 | Content of text safe transmission method, device, computer equipment and storage medium |
CN110321562A (en) * | 2019-06-28 | 2019-10-11 | 广州探迹科技有限公司 | A kind of short text matching process and device based on BERT |
CN110390023A (en) * | 2019-07-02 | 2019-10-29 | 安徽继远软件有限公司 | A kind of knowledge mapping construction method based on improvement BERT model |
CN110472230A (en) * | 2019-07-11 | 2019-11-19 | 平安科技(深圳)有限公司 | The recognition methods of Chinese text and device |
CN110610001A (en) * | 2019-08-12 | 2019-12-24 | 大箴(杭州)科技有限公司 | Short text integrity identification method and device, storage medium and computer equipment |
CN110717325A (en) * | 2019-09-04 | 2020-01-21 | 北京三快在线科技有限公司 | Text emotion analysis method and device, electronic equipment and storage medium |
CN110688861A (en) * | 2019-09-26 | 2020-01-14 | 沈阳航空航天大学 | Multi-feature fusion sentence-level translation quality estimation method |
CN112925888A (en) * | 2019-12-06 | 2021-06-08 | 上海大岂网络科技有限公司 | Method and device for training question-answer response and small sample text matching model |
CN111078911A (en) * | 2019-12-13 | 2020-04-28 | 宁波大学 | Unsupervised hashing method based on an autoencoder |
CN113064959A (en) * | 2020-01-02 | 2021-07-02 | 南京邮电大学 | Cross-modal retrieval method based on deep self-supervised ranking hashing |
CN111209401A (en) * | 2020-01-03 | 2020-05-29 | 西安电子科技大学 | System and method for classifying and processing sentiment polarity of online public opinion text information |
CN111581956A (en) * | 2020-04-08 | 2020-08-25 | 国家计算机网络与信息安全管理中心 | Sensitive information identification method and system based on BERT model and K nearest neighbor |
CN111737706A (en) * | 2020-05-11 | 2020-10-02 | 华南理工大学 | Front-end portrait encryption and identification method with biological feature privacy protection function |
CN111460176A (en) * | 2020-05-11 | 2020-07-28 | 南京大学 | Multi-document machine reading comprehension method based on hash learning |
CN112199520A (en) * | 2020-09-19 | 2021-01-08 | 复旦大学 | Cross-modal Hash retrieval algorithm based on fine-grained similarity matrix |
CN112732748A (en) * | 2021-01-07 | 2021-04-30 | 西安理工大学 | Non-invasive household appliance load identification method based on adaptive feature selection |
CN112906716A (en) * | 2021-02-25 | 2021-06-04 | 北京理工大学 | Noisy SAR image target identification method based on wavelet de-noising threshold self-learning |
CN113076398A (en) * | 2021-03-30 | 2021-07-06 | 昆明理工大学 | Cross-language information retrieval method based on bilingual dictionary mapping guidance |
CN112989055A (en) * | 2021-04-29 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Text recognition method and device, computer equipment and storage medium |
CN113239181A (en) * | 2021-05-14 | 2021-08-10 | 廖伟智 | Scientific and technological literature citation recommendation method based on deep learning |
Non-Patent Citations (1)
Title |
---|
Liu Bangguo et al.: "An Efficient Multi-Pattern Matching Algorithm for PDF Text Content Review", Application Research of Computers (《计算机应用研究》) * |
Also Published As
Publication number | Publication date |
---|---|
CN113935329B (en) | 2022-12-13 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110298037B (en) | Convolutional neural network matching text recognition method based on enhanced attention mechanism | |
CN109271522B (en) | Comment emotion classification method and system based on deep hybrid model transfer learning | |
CN106484674B (en) | Chinese electronic medical record concept extraction method based on deep learning | |
CN112508077B (en) | Social media emotion analysis method and system based on multi-modal feature fusion | |
CN109189925A (en) | Term vector model based on mutual information and based on the file classification method of CNN | |
CN110502753A (en) | A kind of deep learning sentiment analysis model and its analysis method based on semantically enhancement | |
CN110991190B (en) | Document theme enhancement system, text emotion prediction system and method | |
CN109992669B (en) | Keyword question-answering method based on language model and reinforcement learning | |
CN114564565A (en) | Deep semantic recognition model for public safety event analysis and construction method thereof | |
CN112749274B (en) | Chinese text classification method based on attention mechanism and interference word deletion | |
CN110807324A (en) | Video entity identification method based on IDCNN-crf and knowledge graph | |
CN111125333A (en) | Generation type knowledge question-answering method based on expression learning and multi-layer covering mechanism | |
Zhou et al. | Master: Multi-task pre-trained bottlenecked masked autoencoders are better dense retrievers | |
CN115687595A (en) | Comparison and interpretation generation method based on template prompt and oriented to common sense question answering | |
Antit et al. | TunRoBERTa: a Tunisian robustly optimized BERT approach model for sentiment analysis | |
Wang et al. | Non-uniform speaker disentanglement for depression detection from raw speech signals | |
Sun et al. | Multi-classification speech emotion recognition based on two-stage bottleneck features selection and MCJD algorithm | |
Kišš et al. | SoftCTC—semi-supervised learning for text recognition using soft pseudo-labels | |
Dwojak et al. | From dataset recycling to multi-property extraction and beyond | |
CN113935329B (en) | Asymmetric text matching method based on adaptive feature recognition and denoising | |
CN115952360A (en) | Domain-adaptive cross-domain recommendation method and system based on user and article commonality modeling | |
CN115964475A (en) | Dialogue abstract generation method for medical inquiry | |
CN115588486A (en) | Traditional Chinese medicine diagnosis generating device based on Transformer and application thereof | |
CN114943216A (en) | Case microblog attribute-level viewpoint mining method based on graph attention network | |
CN114757183A (en) | Cross-domain emotion classification method based on contrast alignment network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||