CN113656547B - Text matching method, device, equipment and storage medium - Google Patents

Text matching method, device, equipment and storage medium Download PDF

Info

Publication number
CN113656547B
CN113656547B (application CN202110942420.1A)
Authority
CN
China
Prior art keywords: sentence, text, information, search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110942420.1A
Other languages
Chinese (zh)
Other versions
CN113656547A (en)
Inventor
沈越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110942420.1A priority Critical patent/CN113656547B/en
Publication of CN113656547A publication Critical patent/CN113656547A/en
Application granted granted Critical
Publication of CN113656547B publication Critical patent/CN113656547B/en

Classifications

    • G06F16/3344 — Query execution using natural language analysis (G06F: electric digital data processing; G06F16/00: information retrieval; database and file system structures therefor)
    • G06F16/36 — Creation of semantic tools, e.g. ontology or thesauri
    • G06N3/045 — Combinations of networks (G06N3/02: neural networks)
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to artificial intelligence and provides a text matching method, device, equipment and storage medium. When a text matching request is received, the method acquires a search sentence according to the request and obtains the length requirement of a sentence dimension-reduction model; the search sentence is encoded according to the length requirement to obtain a sentence code; the sentence code is analyzed by the sentence dimension-reduction model to obtain sentence information, which is normalized to obtain sentence features; candidate texts and their candidate information are acquired according to the request, and the candidate information is filtered to obtain candidate features; the text similarity between the search sentence and each candidate text is calculated from the sentence features and the candidate features, and the candidate text with the greatest text similarity is determined as the target text. The invention can improve both text matching efficiency and matching accuracy. The invention further relates to blockchain technology: the target text can be stored in a blockchain.

Description

Text matching method, device, equipment and storage medium
Technical Field
The present invention relates to the field of artificial intelligence technologies, and in particular, to a text matching method, apparatus, device, and storage medium.
Background
Text matching refers to retrieving, from a knowledge base, text whose meaning is similar to that of a search sentence; such matching can improve a user's reading efficiency. In current implementations, the search sentence and every candidate text are jointly processed by a BERT model to select the best-matching text. However, the many repeated processing steps and the large number of trained BERT parameters make matching inefficient.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a text matching method, apparatus, device, and storage medium that can improve matching efficiency and matching accuracy.
In one aspect, the present invention provides a text matching method, the text matching method including:
when a text matching request is received, acquiring a search sentence according to the text matching request;
acquiring a pre-trained sentence dimension-reduction model, and acquiring the length requirement of the sentence dimension-reduction model;
encoding the search sentence according to the length requirement to obtain a sentence code;
analyzing the sentence code based on the sentence dimension-reduction model to obtain sentence information;
normalizing the sentence information to obtain sentence features;
acquiring a plurality of candidate texts and the candidate information corresponding to each candidate text according to the text matching request;
filtering the candidate information to obtain candidate features;
calculating the text similarity between the search sentence and each candidate text according to the sentence features and the candidate features; and
determining the candidate text with the greatest text similarity as the target text.
According to a preferred embodiment of the present invention, acquiring a search sentence according to the text matching request includes:
parsing the message of the text matching request to obtain the data information carried by the message;
extracting a sentence path and a sentence identifier from the data information, and counting the total number of query objects among the sentence path and the sentence identifier;
acquiring a query template according to that total number;
writing the sentence path and the sentence identifier into the query template to obtain a query statement; and
running the query statement to obtain the search sentence.
According to a preferred embodiment of the present invention, encoding the search sentence according to the length requirement to obtain a sentence code includes:
splitting the search sentence to obtain a plurality of search characters and the split sequence number of each search character;
acquiring the character vector of each search character from a character mapping table;
concatenating the character vectors according to the split sequence numbers to obtain an initial code;
determining the sentence type of the search sentence according to the sentence identifier;
concatenating a preset identifier, the type identifier of the sentence type, and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code; and
if the code length is greater than the length requirement, processing the intermediate code according to the length requirement to obtain the sentence code; or
if the code length is smaller than the length requirement, padding the intermediate code, using the length difference between the code length and the length requirement as the number of padding bits, to obtain the sentence code; or
if the code length is equal to the length requirement, determining the intermediate code as the sentence code.
According to a preferred embodiment of the present invention, before acquiring the pre-trained sentence dimension-reduction model, the method further includes:
acquiring a learner, and acquiring the initial requirement of the learner;
acquiring a training sample, the training sample comprising a sample sentence and a similar text;
extracting the semantic code of the similar text;
encoding the sample sentence according to the initial requirement to obtain a sample code;
performing dimension reduction on the sample code based on the learner to obtain a predictive code; and
adjusting the initial requirement and the network parameters of the learner according to the coding distance between the predictive code and the semantic code until the coding distance no longer decreases, thereby obtaining the sentence dimension-reduction model.
According to a preferred embodiment of the present invention, the sentence dimension-reduction model includes a convolution layer, a pooling layer and a fully connected layer, and analyzing the sentence code based on the sentence dimension-reduction model includes:
performing feature extraction on the sentence code based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution features based on a pooling function in the pooling layer to obtain a pooling result;
acquiring the weight matrix and the bias value of the fully connected layer; and
calculating the product of the pooling result and the weight matrix, and calculating the sum of that product and the bias value, to obtain the sentence information.
According to a preferred embodiment of the present invention, filtering the candidate information to obtain the candidate features includes:
acquiring a preset list, the preset list comprising the initial characterizations of preset stop words and preset characters;
traversing the candidate information based on the initial characterizations; and
deleting, from the candidate information, the information identical to the initial characterizations, to obtain the candidate features.
According to a preferred embodiment of the present invention, calculating the text similarity between the search sentence and each candidate text according to the sentence features and the candidate features includes:
for each candidate text, extracting first character features from the sentence features, and extracting second character features from the candidate features;
calculating the product of each first character feature and each second character feature to obtain character similarities;
selecting, for each first character feature, the character similarity with the maximum value as its target similarity; and
summing the target similarities corresponding to all first character features in the sentence features to obtain the text similarity.
In another aspect, the present invention further provides a text matching device, the device comprising:
an acquisition unit, configured to acquire a search sentence according to a text matching request when the text matching request is received;
the acquisition unit being further configured to acquire a pre-trained sentence dimension-reduction model and the length requirement of the sentence dimension-reduction model;
an encoding unit, configured to encode the search sentence according to the length requirement to obtain a sentence code;
an analysis unit, configured to analyze the sentence code based on the sentence dimension-reduction model to obtain sentence information;
a processing unit, configured to normalize the sentence information to obtain sentence features;
the acquisition unit being further configured to acquire a plurality of candidate texts and the candidate information corresponding to each candidate text according to the text matching request;
a filtering unit, configured to filter the candidate information to obtain candidate features;
a computing unit, configured to calculate the text similarity between the search sentence and each candidate text according to the sentence features and the candidate features; and
a determining unit, configured to determine the candidate text with the greatest text similarity as the target text.
In another aspect, the present invention further proposes an electronic device, including:
a memory storing computer readable instructions; and
a processor executing the computer readable instructions stored in the memory to implement the text matching method.
In another aspect, the present invention further proposes a computer readable storage medium having computer readable instructions stored therein, the computer readable instructions being executed by a processor in an electronic device to implement the text matching method.
According to the technical scheme above, normalizing the sentence information places the sentence features and the candidate features on the same scale when the text similarity is calculated, improving the accuracy of the similarity computation; it also removes the need to analyze the vector norms of the sentence features and candidate features during the similarity calculation, improving its efficiency. In addition, instead of directly generating global feature vectors for the search sentence and the candidate texts, the invention calculates the text similarity from their low-level sequences of feature codes, so the relationship between the search sentence and each candidate text is analyzed at a fine granularity, which helps improve the matching accuracy of the target text. Moreover, the candidate information corresponding to each candidate text is acquired directly according to the text matching request, with no further analysis of the candidate text required, which improves the matching efficiency of the target text.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the text matching method of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the text matching device of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device implementing a preferred embodiment of the text matching method of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
As shown in fig. 1, a flow chart of a preferred embodiment of the text matching method of the present invention is shown. The order of the steps in the flowchart may be changed and some steps may be omitted according to various needs.
The text matching method can acquire and process related data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results.
Artificial intelligence infrastructure technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly include computer vision, robotics, biometric recognition, speech processing, natural language processing, and machine learning/deep learning.
The text matching method is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored computer readable instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (ASIC), field-programmable gate arrays (FPGA), digital signal processors (DSP), embedded devices, and the like.
The electronic device may be any electronic product capable of human-computer interaction with a user, such as a personal computer, tablet computer, smart phone, personal digital assistant (PDA), game console, interactive Internet Protocol television (IPTV), or smart wearable device.
The electronic device may comprise a network device and/or a user device. The network device includes, but is not limited to, a single network electronic device, a group of multiple network electronic devices, or a cloud composed of a large number of hosts or network electronic devices based on cloud computing.
The network on which the electronic device is located includes, but is not limited to, the Internet, wide area networks, metropolitan area networks, local area networks, virtual private networks (VPN), and the like.
S10: when a text matching request is received, acquire a search sentence according to the text matching request.
In at least one embodiment of the present invention, the text matching request carries data information such as a sentence path and a sentence identifier, and may be triggered by any user.
The search sentence is a sentence requiring semantic text matching. For example, the search sentence may be: text about weather comments.
In at least one embodiment of the present invention, the electronic device acquiring the search sentence according to the text matching request includes:
parsing the message of the text matching request to obtain the data information carried by the message;
extracting a sentence path and a sentence identifier from the data information, and counting the total number of query objects among the sentence path and the sentence identifier;
acquiring a query template according to that total number;
writing the sentence path and the sentence identifier into the query template to obtain a query statement; and
running the query statement to obtain the search sentence.
The sentence path is the path under which search sentences are stored; multiple sentences requiring text matching are stored under it.
The sentence identifier is an identifier that uniquely identifies the search sentence.
The total number of object slots in the query template equals the total number of query objects.
Acquiring the appropriate query template by that total number means the template need not be corrected when the query statement is generated, improving the generation efficiency of the query statement; obtaining the search sentence through the query statement then avoids locating and traversing the sentence paths one by one, improving the acquisition efficiency of the search sentence.
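The template-selection step above can be sketched as follows. This is a hypothetical illustration, not the patent's implementation: the template strings, table and field names, and `build_query` helper are all assumptions; the one fixed point taken from the text is that the template is chosen by the total number of query objects, so it never needs correcting at generation time.

```python
# One template per total number of query objects, so the chosen template
# never needs to be corrected (trimmed or extended) when the query
# statement is generated. Template contents are illustrative assumptions.
QUERY_TEMPLATES = {
    1: "SELECT sentence FROM sentences WHERE path = '{0}'",
    2: "SELECT sentence FROM sentences WHERE path = '{0}' AND id = '{1}'",
}

def build_query(sentence_path, sentence_id):
    """Write the sentence path and identifier into the matching template."""
    fields = [f for f in (sentence_path, sentence_id) if f is not None]
    template = QUERY_TEMPLATES[len(fields)]   # select by total query count
    return template.format(*fields)

query = build_query("/corpus/questions", "q-0042")
```

Running the resulting query statement against the sentence store would then return the search sentence directly, with no per-path traversal.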
S11: acquire a pre-trained sentence dimension-reduction model, and acquire the length requirement of the sentence dimension-reduction model.
In at least one embodiment of the present invention, the sentence dimension-reduction model is a model that performs dimension reduction on characterization information.
The length requirement is the length of characterization information accepted by the sentence dimension-reduction model. For example, the length requirement may be 128 bits.
In at least one embodiment of the present invention, before acquiring the pre-trained sentence dimension-reduction model, the method further comprises:
acquiring a learner, and acquiring the initial requirement of the learner;
acquiring a training sample, the training sample comprising a sample sentence and a similar text;
extracting the semantic code of the similar text;
encoding the sample sentence according to the initial requirement to obtain a sample code;
performing dimension reduction on the sample code based on the learner to obtain a predictive code; and
adjusting the initial requirement and the network parameters of the learner according to the coding distance between the predictive code and the semantic code until the coding distance no longer decreases, thereby obtaining the sentence dimension-reduction model.
The initial requirement is the preset maximum length of the characterization information input into the learner.
The coding distance is the difference between the predictive code and the semantic code.
Adjusting the initial requirement ensures that the sentence code fed into the sentence dimension-reduction model contains comprehensive characterization information, avoiding loss of information from the search sentence; adjusting the network parameters improves the accuracy of the model's dimension reduction of the sentence code.
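The training loop above can be sketched in miniature. This is a sketch under stated assumptions: the learner is reduced to a single linear projection, the coding distance is taken as mean squared error, and the semantic codes of the similar texts are random stand-ins; the patent's actual learner, distance, and initial-requirement adjustment are not specified here.

```python
import numpy as np

rng = np.random.default_rng(0)
in_dim, out_dim, n = 32, 8, 64

sample_codes = rng.normal(size=(n, in_dim))        # encoded sample sentences
semantic_codes = rng.normal(size=(n, out_dim))     # semantic codes of similar texts

W = rng.normal(scale=0.1, size=(in_dim, out_dim))  # the learner's network parameters
lr, prev_dist = 0.01, float("inf")

for _ in range(100_000):
    pred_codes = sample_codes @ W                  # predictive codes (dim reduction)
    diff = pred_codes - semantic_codes
    dist = float(np.mean(diff ** 2))               # coding distance (MSE assumption)
    if dist >= prev_dist:                          # distance no longer decreases: stop
        break
    prev_dist = dist
    W -= lr * (sample_codes.T @ diff) / n          # adjust the network parameters
```

The stopping rule mirrors the patent's criterion: training ends once the coding distance between predictive and semantic codes stops falling.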
S12: encode the search sentence according to the length requirement to obtain a sentence code.
In at least one embodiment of the present invention, the sentence code is a vector representation of the search sentence, the length of the vector representation being the length requirement.
In at least one embodiment of the present invention, the electronic device encoding the search sentence according to the length requirement to obtain the sentence code includes:
splitting the search sentence to obtain a plurality of search characters and the split sequence number of each search character;
acquiring the character vector of each search character from a character mapping table;
concatenating the character vectors according to the split sequence numbers to obtain an initial code;
determining the sentence type of the search sentence according to the sentence identifier;
concatenating a preset identifier, the type identifier of the sentence type, and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code; and
if the code length is greater than the length requirement, processing the intermediate code according to the length requirement to obtain the sentence code; or
if the code length is smaller than the length requirement, padding the intermediate code, using the length difference between the code length and the length requirement as the number of padding bits, to obtain the sentence code; or
if the code length is equal to the length requirement, determining the intermediate code as the sentence code.
The plurality of search characters includes characters such as the punctuation in the search sentence.
The character mapping table stores a plurality of characters and the vector representation of each character.
The sentence type is the type of the search sentence; correspondingly, the type identifier is an identifier indicating that type. For example, if the sentence type is a question, the type identifier may be Q.
Marking the initial code with the type identifier facilitates the analysis of the sentence code, and checking the code length against the length requirement guarantees the length of the sentence code, improving its ability to characterize the search sentence.
Specifically, the electronic device concatenating the character vectors according to the split sequence numbers to obtain the initial code includes:
concatenating the character vectors in order of split sequence number, from smallest to largest, to obtain the initial code.
Specifically, the electronic device concatenating the preset identifier, the type identifier of the sentence type, and the initial code to obtain the intermediate code includes:
appending the type identifier to the end of the preset identifier to obtain splicing information; and
appending the initial code to the end of the splicing information to obtain the intermediate code.
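The encoding steps above can be sketched end to end. The character map contents, the identifier vectors, and the choice of truncation when the code is too long are illustrative assumptions; the patent only fixes the ordering (preset identifier, then type identifier, then initial code) and the pad-by-length-difference rule.

```python
# Toy character mapping table: each character maps to a 1-dimensional vector.
CHAR_MAP = {c: [float(i + 1)] for i, c in enumerate("abcdefgh ?")}
PRESET_ID, TYPE_Q = [100.0], [200.0]   # preset identifier; question-type identifier

def encode_sentence(sentence, length_requirement):
    chars = list(sentence)                          # split into search characters
    vectors = [CHAR_MAP[c] for c in chars]          # look up each character vector
    initial = [x for v in vectors for x in v]       # concatenate by split number
    intermediate = PRESET_ID + TYPE_Q + initial     # preset id + type id + initial
    if len(intermediate) > length_requirement:      # too long: trim (assumption)
        return intermediate[:length_requirement]
    # too short: pad by the length difference between code length and requirement
    return intermediate + [0.0] * (length_requirement - len(intermediate))

code = encode_sentence("bad cafe?", 16)
```

Whatever the sentence length, the output always matches the length requirement, which is what lets the dimension-reduction model accept it directly.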
S13: analyze the sentence code based on the sentence dimension-reduction model to obtain sentence information.
In at least one embodiment of the present invention, the sentence information is the information obtained after dimension reduction of the sentence code.
In at least one embodiment of the present invention, the sentence dimension-reduction model includes a convolution layer, a pooling layer, and a fully connected layer, and the electronic device analyzing the sentence code based on the sentence dimension-reduction model to obtain the sentence information includes:
performing feature extraction on the sentence code based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution features based on a pooling function in the pooling layer to obtain a pooling result;
acquiring the weight matrix and the bias value of the fully connected layer; and
calculating the product of the pooling result and the weight matrix, and calculating the sum of that product and the bias value, to obtain the sentence information.
The plurality of convolution kernels, the pooling function, the weight matrix, and the bias value are generated by training the learner.
The convolution layer further improves the ability of the convolution features to characterize the search sentence, and the pooling layer removes interference information from the convolution features, improving the accuracy of the subsequent text similarity.
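A minimal numpy sketch of this forward pass follows: several 1-D convolution kernels slide over the sentence code, max pooling screens each feature map, and the fully connected layer computes pooled·W + b. The kernel sizes and all parameter values here are random placeholders standing in for the trained ones; max pooling is one common choice of pooling function, assumed rather than taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(1)
sentence_code = rng.normal(size=64)                  # fixed-length sentence code

kernels = [rng.normal(size=k) for k in (3, 4, 5)]    # several convolution kernels
conv_features = [
    np.array([window @ k                             # 1-D convolution: dot each
              for window in np.lib.stride_tricks     #   sliding window with the
              .sliding_window_view(sentence_code, len(k))])  # kernel
    for k in kernels
]
pooled = np.array([f.max() for f in conv_features])  # max pooling screens each map

W = rng.normal(size=(3, 8))                          # fully connected weight matrix
b = rng.normal(size=8)                               # bias value
sentence_info = pooled @ W + b                       # product plus bias = sentence info
```

The output dimension (8 here) plays the role of the reduced representation that is then normalized in step S14.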
S14: normalize the sentence information to obtain sentence features.
In at least one embodiment of the present invention, the sentence features are the sentence information mapped into the interval [0, 1].
In at least one embodiment of the present invention, the electronic device normalizes the sentence information to ensure that the sentence features and the candidate features are on the same scale, thereby improving the accuracy of the text similarity.
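One way to realize this step is min-max scaling, which maps every component of the sentence information into [0, 1]. The patent does not fix the normalization formula, so this particular choice (and the constant-vector fallback) is an assumption.

```python
def normalize(info):
    """Min-max scale the sentence information into [0, 1]."""
    lo, hi = min(info), max(info)
    if hi == lo:                       # constant vector: map everything to 0
        return [0.0 for _ in info]
    return [(x - lo) / (hi - lo) for x in info]

features = normalize([2.0, 4.0, 10.0])
```

Because both the sentence features and the candidate features land in the same interval, the later dot products compare like with like, with no per-vector norm analysis needed.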
S15: acquire a plurality of candidate texts and the candidate information corresponding to each candidate text according to the text matching request.
In at least one embodiment of the present invention, the candidate texts are the texts to be matched against the search sentence.
The candidate information is the characterization information obtained after the candidate text undergoes encoding, dimension reduction, and normalization.
In at least one embodiment of the present invention, the electronic device acquiring the plurality of candidate texts and the candidate information corresponding to each candidate text according to the text matching request includes:
extracting a text path from the data information;
determining all texts under the text path as the plurality of candidate texts, and acquiring the text identifier of each candidate text from the text path; and
acquiring the candidate information from the vector list corresponding to the text path based on each text identifier.
S16: filter the candidate information to obtain the candidate features.
In at least one embodiment of the present invention, the candidate features are the candidate information with the preset stop words and preset characters removed.
In at least one embodiment of the present invention, the electronic device filtering the candidate information to obtain the candidate features includes:
acquiring a preset list, the preset list comprising the initial characterizations of preset stop words and preset characters;
traversing the candidate information based on the initial characterizations; and
deleting, from the candidate information, the information identical to the initial characterizations, to obtain the candidate features.
In this embodiment, the candidate information is filtered after the candidate text has been encoded, rather than before encoding. This avoids removing preset stop words and preset characters that carry encoding significance, improving the characterization accuracy of the candidate information; at the same time, filtering makes the candidate features more compact, improving the efficiency of the text similarity calculation.
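The filtering step can be sketched as follows. For readability the entries here are tokens rather than encoded characterizations, and the preset list is illustrative; the mechanism — traverse the candidate information and delete every entry identical to an initial characterization from the preset list — is the one described above.

```python
# Preset list: initial characterizations of preset stop words and characters.
PRESET_LIST = {"the", "a", ",", "."}

def filter_candidate(candidate_info):
    """Traverse the candidate information; drop entries matching the preset list."""
    return [entry for entry in candidate_info if entry not in PRESET_LIST]

candidate_features = filter_candidate(["the", "weather", "is", "sunny", "."])
```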
S17, calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected.
In at least one embodiment of the present invention, the text similarity refers to a similarity between the search term and each of the candidate texts.
In at least one embodiment of the present invention, the computing, by the electronic device, the text similarity between the search sentence and each candidate text according to the sentence feature and the candidate feature includes:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the maximum value from the character similarity as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
The text similarity is calculated from the relation between the first character features in the sentence features and the second character features in the candidate features. Because both belong to low-level feature coding sequences, the similarity between the search sentence and each candidate text can be determined at a fine granularity, which improves the accuracy of the text similarity.
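The computation above (per-character products, maximum per first character feature, then a sum) resembles late-interaction "MaxSim" scoring. A minimal sketch, assuming the token vectors are already normalized (so a plain dot product serves as the character similarity) and represented as lists of floats; all names are illustrative:

```python
def text_similarity(sentence_features, candidate_features):
    """Sum over query tokens of the maximum dot product against document tokens.

    sentence_features: list of first character features (query token vectors).
    candidate_features: list of second character features (candidate-text token vectors).
    """
    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    total = 0.0
    for first in sentence_features:
        # character similarity of this first character feature against
        # every second character feature
        sims = [dot(first, second) for second in candidate_features]
        total += max(sims)  # target similarity: the maximum-valued one
    return total            # sum over all first character features

def best_match(sentence_features, all_candidates):
    """Return the index of the candidate text with the greatest text similarity."""
    scores = [text_similarity(sentence_features, c) for c in all_candidates]
    return max(range(len(scores)), key=scores.__getitem__)
```

`best_match` corresponds to the subsequent step of selecting the candidate text with the maximum text similarity as the target text.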
And S18, determining the text to be selected with the maximum text similarity as a target text.
In at least one embodiment of the present invention, the target text refers to the candidate text that is most similar to the search term.
It is emphasized that to further ensure the privacy and security of the target text, the target text may also be stored in a blockchain node.
In at least one embodiment of the invention, the method further comprises:
acquiring a request number of the text matching request;
packaging the request number and the target text to obtain a feedback result;
and sending the feedback result to a trigger terminal of the text matching request.
Through this embodiment, the feedback result can be sent to the trigger terminal in a timely manner, which improves timeliness.
According to the technical scheme above, normalizing the sentence information ensures that the sentence features and the candidate features are on the same numerical scale when the text similarity is calculated, which improves the calculation accuracy of the text similarity; it also means that the vector norms of the sentence features and the candidate features need not be analyzed during the subsequent similarity calculation, which improves the calculation efficiency. In addition, the invention does not directly generate global feature vectors for the search sentence and the candidate texts; instead, it calculates the text similarity from their low-level feature coding sequences, so the relation between the search sentence and each candidate text can be analyzed at a fine granularity, which helps improve the matching accuracy of the target text. Furthermore, the candidate information corresponding to each candidate text is obtained directly according to the text matching request, so the candidate texts require no further analysis, which improves the matching efficiency of the target text.
Fig. 2 is a functional block diagram of a preferred embodiment of the text matching device of the present invention. The text matching device 11 includes an acquisition unit 110, an encoding unit 111, an analysis unit 112, a processing unit 113, a filtering unit 114, a calculation unit 115, a determination unit 116, an extraction unit 117, a dimension reduction unit 118, an adjustment unit 119, a packaging unit 120, and a transmission unit 121. A module/unit referred to herein is a series of computer-readable instructions stored in the memory 12 that can be retrieved by the processor 13 to perform a fixed function. In the present embodiment, the functions of the respective modules/units are described in detail in the following embodiments.
When receiving the text matching request, the acquisition unit 110 acquires a search sentence according to the text matching request.
In at least one embodiment of the present invention, the text matching request carries data information such as a sentence path and a sentence identifier. The text matching request may be triggered by any user.
The search sentence refers to a sentence needing text semantic matching. For example, the search term may be: text about weather comments.
In at least one embodiment of the present invention, the obtaining unit 110 obtains a search sentence according to the text matching request, including:
analyzing the text matching request message to obtain data information carried by the message;
extracting a statement path and a statement mark from the data information, and calculating the query total quantity of the statement path and the statement mark;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
The sentence path is a path storing the search sentences, and a plurality of sentences needing text matching are stored in the sentence path.
The sentence identification refers to an identification capable of uniquely identifying the search sentence.
The number of objects to be written into the query template is the same as the total query quantity.
Obtaining a suitable query template according to the total query quantity means that the query template need not be corrected when the query statement is generated, which improves the generation efficiency of the query statement; further, obtaining the search sentence by running the query statement avoids locating and traversing sentences one by one in the sentence path, which improves the acquisition efficiency of the search sentence.
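The query-construction steps can be illustrated as follows. The templates, field names, and the keying of templates by query quantity are assumptions made for illustration only; the patent does not specify a query language.

```python
# Hypothetical query templates keyed by the total number of query objects.
QUERY_TEMPLATES = {
    1: "SELECT sentence FROM sentences WHERE path = '{path}'",
    2: "SELECT sentence FROM sentences WHERE path = '{path}' AND id = '{sid}'",
}

def build_query(data_info):
    """Extract the sentence path and sentence identifier from the data
    information, select a template by the total query quantity, and write
    the values into the template to obtain the query statement."""
    path = data_info["sentence_path"]
    sid = data_info["sentence_id"]
    # total query quantity: how many query objects are actually present
    total = len([v for v in (path, sid) if v is not None])
    template = QUERY_TEMPLATES[total]
    return template.format(path=path, sid=sid)
```

In a real system the values should be bound as query parameters (e.g. via a DB-API cursor) rather than interpolated with `str.format`, to avoid injection; string formatting is used here only to mirror the "write into the template" wording.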
The obtaining unit 110 obtains a pre-trained sentence dimension reduction model, and obtains a length requirement of the sentence dimension reduction model.
In at least one embodiment of the present invention, the statement dimension-reducing model refers to a model for dimension-reducing processing of characterization information.
The length requirement refers to the required length of the characterization information input into the sentence dimension reduction model. For example, the length requirement may be 128 bits.
In at least one embodiment of the present invention, before acquiring the pre-trained sentence dimension reduction model, the acquiring unit 110 acquires a learner and acquires an initial requirement of the learner;
the obtaining unit 110 obtains a training sample, where the training sample includes a sample sentence and a similar text;
the extraction unit 117 extracts semantic codes of the similar text;
the encoding unit 111 encodes the sample sentence according to the initial requirement to obtain a sample code;
the dimension reduction unit 118 performs dimension reduction processing on the sample codes based on the learner to obtain predictive codes;
the adjusting unit 119 adjusts the initial requirement and the network parameters of the learner according to the coding distance between the predictive code and the semantic code until the coding distance no longer decreases, thereby obtaining the sentence dimension reduction model.
Wherein the initial requirement refers to the maximum length of the characterization information input into the learner, and the initial requirement is preset.
The coding distance refers to the difference between the predictive coding and the semantic coding.
Adjusting the initial requirement ensures that the sentence codes input into the sentence dimension reduction model contain comprehensive characterization information, which avoids information loss in the search sentence; adjusting the network parameters improves the dimension-reduction accuracy of the model on the sentence codes.
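The training procedure above can be sketched with a toy single-layer learner whose weights are adjusted by gradient descent until the coding distance stops decreasing. The linear architecture, learning rate, and stopping details are assumptions; the patent leaves the learner unspecified.

```python
import math
import random

def euclid(a, b):
    """Coding distance: Euclidean distance between predictive and semantic codes."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def train_reduction(sample_code, semantic_code, lr=0.05, max_steps=500):
    """Toy training loop: a single linear layer maps the sample code to a
    predictive code, and its weights are adjusted until the coding distance
    to the semantic code no longer decreases."""
    n_in, n_out = len(sample_code), len(semantic_code)
    random.seed(0)
    W = [[random.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]

    def forward():
        return [sum(W[i][j] * sample_code[j] for j in range(n_in))
                for i in range(n_out)]

    prev = float("inf")
    for _ in range(max_steps):
        pred = forward()
        dist = euclid(pred, semantic_code)
        if dist >= prev:  # coding distance no longer decreases: stop
            break
        prev = dist
        # gradient step on the squared coding distance w.r.t. each weight
        for i in range(n_out):
            err = pred[i] - semantic_code[i]
            for j in range(n_in):
                W[i][j] -= lr * err * sample_code[j]
    return W, prev
```

In the patent, the input length (the "initial requirement") is also adjusted alongside the network parameters; the sketch fixes it for brevity.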
The encoding unit 111 performs encoding processing on the search statement according to the length requirement, so as to obtain a statement code.
In at least one embodiment of the present invention, the statement encoding refers to a vector representation of the search statement, the length of the vector representation being the length requirement.
In at least one embodiment of the present invention, the encoding unit 111 performs encoding processing on the search statement according to the length requirement, and obtaining the statement code includes:
splitting the search statement to obtain a plurality of search characters and splitting serial numbers of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence identification;
splicing a preset identifier, the type identifier of the sentence type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is greater than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
if the coding length is smaller than the length requirement, padding the intermediate code with a number of filling bits equal to the length difference between the coding length and the length requirement to obtain the statement code; or
if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
Wherein the plurality of search characters includes characters such as punctuation in the search sentence.
The character map stores a plurality of characters and a vector representation of each character.
The sentence type refers to the grammatical category of the search sentence; correspondingly, the type identifier is an identifier that indicates the sentence type. For example, if the sentence type is a question, the type identifier may be Q.
Identifying the initial code with the type identifier facilitates the subsequent analysis of the statement code, and comparing the coding length with the length requirement guarantees the length of the statement code, which improves its ability to represent the search sentence.
Specifically, the encoding unit 111 concatenates the character vectors according to the splitting sequence number, and the obtaining the initial encoding includes:
and splicing the character vectors according to the sequence from the small split sequence number to the large split sequence number to obtain the initial code.
Specifically, the encoding unit 111 concatenates the preset identifier, the type identifier of the sentence type, and the initial code, and the obtaining the intermediate code includes:
splicing the type identifier at the tail end of the preset identifier to obtain splicing information;
and splicing the initial codes at the tail end of the splicing information to obtain the intermediate codes.
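The encoding steps above (splitting, character-vector lookup, splicing the identifiers before the initial code, then truncating or padding to the length requirement) can be sketched as follows. The token names `[P]`, `[PAD]`, and `[UNK]`, and the use of one token in place of each character vector, are illustrative assumptions.

```python
def encode_sentence(sentence, char_map, type_id, length_req,
                    preset_id="[P]", pad="[PAD]"):
    """Split the sentence, map characters to their vectors, splice in
    ascending split-sequence order, prepend the preset and type
    identifiers, then truncate or pad to the length requirement."""
    # splitting: enumerate() supplies the split sequence numbers
    chars = [(i, ch) for i, ch in enumerate(sentence)]
    chars.sort(key=lambda p: p[0])  # splice from small to large sequence number
    initial = [char_map.get(ch, "[UNK]") for _, ch in chars]
    # type identifier spliced at the tail of the preset identifier,
    # initial code spliced at the tail of that splicing information
    intermediate = [preset_id, type_id] + initial
    if len(intermediate) > length_req:   # longer: process down to the requirement
        return intermediate[:length_req]
    if len(intermediate) < length_req:   # shorter: pad by the length difference
        return intermediate + [pad] * (length_req - len(intermediate))
    return intermediate                  # equal: use the intermediate code as-is
```

Truncation is one plausible reading of "processing the intermediate code according to the length requirement" in the greater-than branch; the patent does not name the operation.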
The analysis unit 112 analyzes the sentence code based on the sentence dimension reduction model to obtain sentence information.
In at least one embodiment of the present invention, the statement information refers to information obtained after performing a dimension reduction process on the statement code.
In at least one embodiment of the present invention, the sentence dimension reduction model includes a convolution layer, a pooling layer, and a full-connection layer, and the analyzing unit 112 analyzes the sentence code based on the sentence dimension reduction model, and the obtaining sentence information includes:
performing feature extraction on the statement code based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution characteristics based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the full connection layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the statement information.
Wherein the plurality of convolution kernels, the pooling function, the weight matrix, and the bias values are generated from training the learner.
The convolution layer further improves the ability of the convolution features to characterize the search sentence, and the pooling layer removes interference information from the convolution features, which improves the accuracy of the subsequent text similarity.
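The forward pass through the convolution, pooling, and fully connected layers can be sketched in plain Python. Max pooling stands in for the unspecified pooling function, and the shapes and names are assumptions:

```python
def reduce_dimension(code, kernels, weight, bias):
    """Convolution layer -> pooling layer -> fully connected layer.

    code: the statement code as a 1-D list of floats.
    kernels: list of 1-D convolution kernels.
    weight: matrix sized len(kernels) x output_dim; bias: output_dim values.
    """
    # convolution layer: slide each kernel over the statement code
    conv = []
    for k in kernels:
        width = len(k)
        conv.append([sum(k[j] * code[i + j] for j in range(width))
                     for i in range(len(code) - width + 1)])
    # pooling layer: keep the strongest response of each kernel
    pooled = [max(row) for row in conv]
    # fully connected layer: product with the weight matrix plus the bias
    out_dim = len(bias)
    return [sum(pooled[i] * weight[i][o] for i in range(len(pooled))) + bias[o]
            for o in range(out_dim)]
```

The kernels, weight matrix, and bias values would come from training the learner, as stated above.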
The processing unit 113 performs normalization processing on the sentence information to obtain sentence characteristics.
In at least one embodiment of the present invention, the statement feature refers to the feature information obtained by scaling the statement information into the interval [0, 1].
In at least one embodiment of the present invention, the processing unit 113 performs normalization processing on the statement information so that the sentence features and the candidate features are on the same numerical scale, which improves the accuracy of the text similarity.
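The patent does not fix the normalization formula; min-max scaling is one common choice that maps the statement information into [0, 1], matching the stated range:

```python
def normalize(statement_info):
    """Min-max scale the statement information into [0, 1].

    This is an assumed formula; any normalization yielding the [0, 1]
    range described in the text would serve the same purpose.
    """
    lo, hi = min(statement_info), max(statement_info)
    if hi == lo:  # constant vector: map every component to 0
        return [0.0 for _ in statement_info]
    return [(x - lo) / (hi - lo) for x in statement_info]
```

Applying the same normalization to both the sentence information and the candidate information keeps the two feature sets on the same operation level, as the scheme requires.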
The obtaining unit 110 obtains a plurality of candidate texts and candidate information corresponding to each candidate text according to the text matching request.
In at least one embodiment of the present invention, the candidate text refers to text that needs to be matched with the search term.
The candidate information is the characterization information obtained after the encoding processing, the dimension reduction processing, and the normalization processing are performed on the candidate text.
In at least one embodiment of the present invention, the obtaining unit 110 obtains a plurality of candidate texts and candidate information corresponding to each candidate text according to the text matching request includes:
extracting a text path from the data information;
determining all texts in the text path as the multiple texts to be selected, and acquiring a text identifier of each text to be selected from the text path;
and acquiring the information to be selected from a vector list corresponding to the text path based on each text identifier.
The filtering unit 114 performs filtering processing on the information to be selected to obtain a feature to be selected.
In at least one embodiment of the present invention, the feature to be selected refers to the information to be selected from which the preset stop word and the preset symbol are removed.
In at least one embodiment of the present invention, the filtering unit 114 performs filtering processing on the candidate information, where obtaining the candidate feature includes:
acquiring a preset list, wherein the preset list comprises initial characterizations of the preset stop words and preset symbols;
traversing the information to be selected based on the initial characterization;
and deleting the information which is the same as the initial characterization from the information to be selected to obtain the feature to be selected.
In this embodiment, the filtering is performed on the candidate information after the candidate text has been encoded, rather than before encoding. This avoids removing preset stop words and preset symbols while they still carry encoding significance, which improves the representation accuracy of the candidate information; at the same time, filtering the candidate information makes the candidate features more compact, which improves the calculation efficiency of the text similarity.
The calculating unit 115 calculates the text similarity between the search sentence and each candidate text according to the sentence feature and the candidate feature.
In at least one embodiment of the present invention, the text similarity refers to a similarity between the search term and each of the candidate texts.
In at least one embodiment of the present invention, the calculating unit 115 calculates the text similarity between the search sentence and each candidate text according to the sentence feature and the candidate feature, including:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the maximum value from the character similarity as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
The text similarity is calculated from the relation between the first character features in the sentence features and the second character features in the candidate features. Because both belong to low-level feature coding sequences, the similarity between the search sentence and each candidate text can be determined at a fine granularity, which improves the accuracy of the text similarity.
The determining unit 116 determines the candidate text having the greatest text similarity as the target text.
In at least one embodiment of the present invention, the target text refers to the candidate text that is most similar to the search term.
It is emphasized that to further ensure the privacy and security of the target text, the target text may also be stored in a blockchain node.
In at least one embodiment of the present invention, the obtaining unit 110 obtains a request number of the text matching request;
the packaging unit 120 packages the request number and the target text to obtain a feedback result;
the sending unit 121 sends the feedback result to the trigger terminal of the text matching request.
Through this embodiment, the feedback result can be sent to the trigger terminal in a timely manner, which improves timeliness.
According to the technical scheme above, normalizing the sentence information ensures that the sentence features and the candidate features are on the same numerical scale when the text similarity is calculated, which improves the calculation accuracy of the text similarity; it also means that the vector norms of the sentence features and the candidate features need not be analyzed during the subsequent similarity calculation, which improves the calculation efficiency. In addition, the invention does not directly generate global feature vectors for the search sentence and the candidate texts; instead, it calculates the text similarity from their low-level feature coding sequences, so the relation between the search sentence and each candidate text can be analyzed at a fine granularity, which helps improve the matching accuracy of the target text. Furthermore, the candidate information corresponding to each candidate text is obtained directly according to the text matching request, so the candidate texts require no further analysis, which improves the matching efficiency of the target text.
Fig. 3 is a schematic structural diagram of an electronic device according to a preferred embodiment of the present invention for implementing a text matching method.
In one embodiment of the invention, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and computer readable instructions, such as a text matching program, stored in the memory 12 and executable on the processor 13.
It will be appreciated by those skilled in the art that the schematic diagram is merely an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; the device may include more or fewer components than illustrated, combine certain components, or use different components. For example, the electronic device 1 may further include input/output devices, network access devices, buses, and the like.
The processor 13 may be a central processing unit (Central Processing Unit, CPU), but may also be another general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general purpose processor may be a microprocessor or any conventional processor. The processor 13 is the operation core and control center of the electronic device 1; it connects the various parts of the entire electronic device 1 using various interfaces and lines, and executes the operating system of the electronic device 1 and the various installed applications, program codes, and the like.
Illustratively, the computer readable instructions may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to complete the present invention. The one or more modules/units may be a series of computer readable instructions capable of performing a specific function, the computer readable instructions describing a process of executing the computer readable instructions in the electronic device 1. For example, the computer-readable instructions may be divided into an acquisition unit 110, an encoding unit 111, an analysis unit 112, a processing unit 113, a filtering unit 114, a calculation unit 115, a determination unit 116, an extraction unit 117, a dimension reduction unit 118, an adjustment unit 119, a packaging unit 120, and a transmission unit 121.
The memory 12 may be used to store the computer readable instructions and/or modules, and the processor 13 may implement various functions of the electronic device 1 by running or executing the computer readable instructions and/or modules stored in the memory 12 and invoking data stored in the memory 12. The memory 12 may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function (such as a sound playing function or an image playing function), and the like; the storage data area may store data created according to the use of the electronic device, and the like. The memory 12 may include non-volatile and volatile memory, such as: a hard disk, memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash Card, at least one disk storage device, a flash memory device, or other storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a physical memory, such as a memory bank, a TF Card (Trans-flash Card), or the like.
The integrated modules/units of the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the present invention may also be implemented by implementing all or part of the processes in the methods of the embodiments described above, by instructing the associated hardware by means of computer readable instructions, which may be stored in a computer readable storage medium, the computer readable instructions, when executed by a processor, implementing the steps of the respective method embodiments described above.
Wherein the computer readable instructions comprise computer readable instruction code which may be in the form of source code, object code, an executable file, or some intermediate form. The computer readable medium may include: any entity or device capable of carrying the computer readable instruction code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a Read-Only Memory (ROM), or a Random Access Memory (RAM).
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked to one another by cryptographic means, where each data block contains a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
In connection with fig. 1, the memory 12 in the electronic device 1 stores computer readable instructions for implementing a text matching method, the processor 13 being executable to implement:
when a text matching request is received, acquiring a search sentence according to the text matching request;
acquiring a pre-trained sentence dimension reduction model, and acquiring the length requirement of the sentence dimension reduction model;
coding the search statement according to the length requirement to obtain a statement code;
analyzing the sentence codes based on the sentence dimension reduction model to obtain sentence information;
carrying out normalization processing on the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and the information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
In particular, the specific implementation method of the processor 13 on the computer readable instructions may refer to the description of the relevant steps in the corresponding embodiment of fig. 1, which is not repeated herein.
In the several embodiments provided in the present invention, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The computer readable storage medium has stored thereon computer readable instructions, wherein the computer readable instructions when executed by the processor 13 are configured to implement the steps of:
when a text matching request is received, acquiring a search sentence according to the text matching request;
acquiring a pre-trained sentence dimension reduction model, and acquiring the length requirement of the sentence dimension reduction model;
coding the search statement according to the length requirement to obtain a statement code;
analyzing the sentence codes based on the sentence dimension reduction model to obtain sentence information;
carrying out normalization processing on the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and the information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. The units or means may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote a name, but not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (10)

1. A text matching method, characterized in that the text matching method comprises:
when a text matching request is received, acquiring a search sentence according to the text matching request;
acquiring a pre-trained sentence dimension reduction model, and acquiring the length requirement of the sentence dimension reduction model;
coding the search statement according to the length requirement to obtain a statement code;
analyzing the sentence codes based on the sentence dimension reduction model to obtain sentence information;
carrying out normalization processing on the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and the information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
2. The text matching method of claim 1, wherein the obtaining a search term according to the text matching request comprises:
analyzing the text matching request message to obtain data information carried by the message;
extracting a statement path and a statement mark from the data information, and calculating the query total quantity of the statement path and the statement mark;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
3. The text matching method of claim 2, wherein the coding the search statement according to the length requirement comprises:
splitting the search statement to obtain a plurality of search characters and splitting serial numbers of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence identification;
splicing a preset identifier, the type identifier of the sentence type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is greater than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
if the coding length is smaller than the length requirement, padding the intermediate code with a number of filler bits equal to the length difference between the coding length and the length requirement to obtain the statement code; or
if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
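The three length branches of claim 3 reduce to a standard truncate/pad/pass-through rule over a token-id sequence, sketched here (the pad id and truncation-from-the-right choice are illustrative assumptions):

```python
def fit_to_length(intermediate_code, length_requirement, pad_id=0):
    """Claim 3's three branches: truncate when the coding length exceeds
    the length requirement, pad when it falls short, and pass the code
    through unchanged when they are equal."""
    n = len(intermediate_code)
    if n > length_requirement:
        return intermediate_code[:length_requirement]   # truncate
    if n < length_requirement:
        # pad with (length requirement - coding length) filler positions
        return intermediate_code + [pad_id] * (length_requirement - n)
    return intermediate_code
```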
4. The text matching method of claim 1, wherein prior to obtaining the pre-trained sentence dimension reduction model, the method further comprises:
acquiring a learner and acquiring an initial requirement of the learner;
obtaining a training sample, wherein the training sample comprises sample sentences and similar texts;
extracting semantic codes of the similar texts;
coding the sample statement according to the initial requirement to obtain a sample code;
performing dimension reduction processing on the sample codes based on the learner to obtain predictive codes;
and adjusting the initial requirement and the network parameters of the learner according to the coding distance between the predictive coding and the semantic coding until the coding distance is not reduced any more, so as to obtain the statement dimension reduction model.
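Claim 4's training loop, teaching a learner to reproduce a similar text's semantic coding and stopping once the coding distance no longer decreases, can be sketched with a minimal linear learner. The linear architecture, learning rate, and mean-squared distance are all assumptions for illustration:

```python
import numpy as np

def train_reducer(sample_codes, semantic_codes, dim_in, dim_out,
                  lr=0.1, max_steps=200, tol=1e-6):
    """Minimal sketch of claim 4: a linear learner W maps sample codes
    toward teacher semantic codes by gradient descent on the coding
    distance, stopping when the distance stops decreasing."""
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.1, size=(dim_in, dim_out))
    prev = np.inf
    for _ in range(max_steps):
        pred = sample_codes @ W                  # predictive coding
        err = pred - semantic_codes
        dist = float(np.mean(err ** 2))          # coding distance
        if prev - dist < tol:                    # distance no longer falls
            break
        prev = dist
        # adjust the learner's network parameters
        W -= lr * sample_codes.T @ err / len(sample_codes)
    return W, dist
```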
5. The text matching method of claim 1, wherein the sentence dimension reduction model comprises a convolution layer, a pooling layer and a full-connection layer, wherein analyzing the sentence code based on the sentence dimension reduction model to obtain sentence information comprises:
performing feature extraction on the statement code based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution characteristics based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the full connection layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the statement information.
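The convolution / pooling / fully connected pipeline of claim 5 can be illustrated over a 1-D statement code; max-pooling per kernel is one common choice of pooling function (an assumption, as the claim names only "a pooling function"):

```python
import numpy as np

def sentence_info(code, kernels, W, b):
    """Sketch of claim 5: slide each convolution kernel over the
    statement code, max-pool each kernel's responses, then apply the
    fully connected layer (weight matrix W, bias value b)."""
    feats = []
    for k in kernels:
        w = len(k)
        conv = [float(np.dot(code[i:i + w], k))
                for i in range(len(code) - w + 1)]   # convolution features
        feats.append(max(conv))                      # pooling result
    pooled = np.asarray(feats)
    return pooled @ W + b                            # product plus bias
```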
6. The text matching method of claim 1, wherein the filtering the candidate information to obtain the candidate feature comprises:
acquiring a preset list, wherein the preset list comprises initial representations of preset stop words and preset characters;
traversing the information to be selected based on the initial representations;
and deleting, from the information to be selected, the information identical to the initial representations to obtain the feature to be selected.
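Claim 6 is, in effect, a set-membership filter against a preset stop list; a minimal sketch (token strings here stand in for whatever representation the list stores):

```python
def filter_candidates(candidate_info, stoplist):
    """Claim 6 as a filter: traverse the candidate information and drop
    every item equal to a preset stop word/character representation."""
    stop = set(stoplist)
    return [tok for tok in candidate_info if tok not in stop]
```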
7. The text matching method of claim 1, wherein the calculating the text similarity between the search sentence and each of the texts to be selected according to the sentence features and the features to be selected comprises:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the maximum value from the character similarity as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
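Claim 7's scoring, the best-matching product per query character feature summed over the sentence, is a MaxSim-style late interaction; a sketch over feature matrices (rows are character features, an assumed layout):

```python
import numpy as np

def text_similarity(query_feats, cand_feats):
    """Claim 7's scoring: for each first character feature, take the
    maximum product against all second character features (target
    similarity), then sum the per-character maxima."""
    sims = query_feats @ cand_feats.T        # all pairwise products
    return float(sims.max(axis=1).sum())     # best match per query char
```

Applying this to every candidate and keeping the argmax yields the target text of claim 1.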
8. A text matching device, the text matching device comprising:
the acquisition unit is used for acquiring a search statement according to the text matching request when the text matching request is received;
the acquisition unit is also used for acquiring a pre-trained sentence dimension reduction model and acquiring the length requirement of the sentence dimension reduction model;
the coding unit is used for coding the search statement according to the length requirement to obtain a statement code;
the analysis unit is used for analyzing the sentence codes based on the sentence dimension reduction model to obtain sentence information;
the processing unit is used for carrying out normalization processing on the statement information to obtain statement characteristics;
the obtaining unit is further configured to obtain a plurality of candidate texts and candidate information corresponding to each candidate text according to the text matching request;
the filtering unit is used for filtering the information to be selected to obtain characteristics to be selected;
the computing unit is used for computing the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and the determining unit is used for determining the text to be selected with the maximum text similarity as the target text.
9. An electronic device, the electronic device comprising:
a memory storing computer readable instructions; and
a processor executing the computer readable instructions stored in the memory to implement the text matching method of any one of claims 1 to 7.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores computer readable instructions that are executed by a processor in an electronic device to implement the text matching method of any one of claims 1 to 7.
CN202110942420.1A 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium Active CN113656547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110942420.1A CN113656547B (en) 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113656547A CN113656547A (en) 2021-11-16
CN113656547B true CN113656547B (en) 2023-06-30

Family

ID=78479901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110942420.1A Active CN113656547B (en) 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113656547B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887192B (en) * 2021-12-06 2022-05-27 阿里巴巴达摩院(杭州)科技有限公司 Text matching method and device and storage medium
CN114519094A (en) * 2022-02-16 2022-05-20 平安普惠企业管理有限公司 Method and device for conversational recommendation based on random state and electronic equipment
CN116108163B (en) * 2023-04-04 2023-06-27 之江实验室 Text matching method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109766424A (en) * 2018-12-29 2019-05-17 安徽省泰岳祥升软件有限公司 It is a kind of to read the filter method and device for understanding model training data
CN111209395A (en) * 2019-12-27 2020-05-29 铜陵中科汇联科技有限公司 Short text similarity calculation system and training method thereof
CN111427995A (en) * 2020-02-26 2020-07-17 平安科技(深圳)有限公司 Semantic matching method and device based on internal countermeasure mechanism and storage medium
CN111563387A (en) * 2019-02-12 2020-08-21 阿里巴巴集团控股有限公司 Sentence similarity determining method and device and sentence translation method and device
CN112966073A (en) * 2021-04-07 2021-06-15 华南理工大学 Short text matching method based on semantics and shallow features
CN113239700A (en) * 2021-04-27 2021-08-10 哈尔滨理工大学 Text semantic matching device, system, method and storage medium for improving BERT

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170083623A1 (en) * 2015-09-21 2017-03-23 Qualcomm Incorporated Semantic multisensory embeddings for video search by text




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant