WO2020143314A1 - Search engine-based question answering method, device, storage medium, and computer equipment - Google Patents

Search engine-based question answering method, device, storage medium, and computer equipment

Info

Publication number
WO2020143314A1
WO2020143314A1 PCT/CN2019/118080 CN2019118080W WO2020143314A1 WO 2020143314 A1 WO2020143314 A1 WO 2020143314A1 CN 2019118080 W CN2019118080 W CN 2019118080W WO 2020143314 A1 WO2020143314 A1 WO 2020143314A1
Authority
WO
WIPO (PCT)
Prior art keywords
keyword
keywords
candidate answer
answer
question
Application number
PCT/CN2019/118080
Other languages
English (en)
French (fr)
Inventor
杨坤
许开河
王少军
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020143314A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/9032 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of big data technology, and in particular to a question and answer method, device, storage medium, and computer equipment based on a search engine.
  • Artificial intelligence chat robots can be used in education, entertainment, and other fields.
  • For example, parents can use an artificial intelligence chat robot to guide children in learning various kinds of knowledge.
  • A child chatting with the artificial intelligence chat robot might ask: "What are the stars in the solar system?"
  • The artificial intelligence chat robot then answers according to the content stored in its database.
  • However, because the content stored in the database is limited, the chat robot cannot respond when the answer to a question has not been stored in the database in advance, which results in poor response ability.
  • In view of this, embodiments of the present application provide a question and answer method, device, storage medium, and computer equipment based on a search engine, to solve the problem of the poor response ability of prior-art chat robots.
  • An embodiment of the present application provides a question and answer method based on a search engine.
  • The method includes: acquiring a target question input by a user; determining keywords of the target question; searching a search engine according to the keywords to obtain multiple search results; calculating the matching degree between each of the multiple search results and the keywords; using search results whose matching degree is greater than or equal to a preset value as candidate answers; determining whether the type of a candidate answer is a document type; if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining that the candidate answer is the answer to the target question.
  • An embodiment of the present application provides a question and answer device based on a search engine.
  • The device includes: an acquiring unit for acquiring a target question input by a user; a first determining unit for determining keywords of the target question; a search unit for searching a search engine according to the keywords to obtain multiple search results; a calculation unit for calculating the matching degree between each of the multiple search results and the keywords; a second determining unit for using search results whose matching degree is greater than or equal to a preset value as candidate answers; a first judgment unit for determining whether the type of a candidate answer is a document type; a parsing unit for parsing the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type; and a third determining unit for determining that the candidate answer is the answer to the target question if the type of the candidate answer is not the document type.
  • An embodiment of the present application provides a storage medium that includes a stored program, wherein, when the program is running, the device where the storage medium is located is controlled to execute the above search engine-based question and answer method.
  • An embodiment of the present application provides a computer device, including a memory and a processor, where the memory is used to store information including program instructions, the processor is used to control execution of the program instructions, and the program instructions, when loaded and executed by the processor, implement the steps of the above search engine-based question and answer method.
  • In the embodiments of the present application, multiple search results are retrieved from the search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to the preset value are used as candidate answers. If the type of a candidate answer is the document type, the candidate answer is parsed according to the preset algorithm to obtain the answer to the target question; if the type of the candidate answer is not the document type, the candidate answer is determined to be the answer to the target question. When the answer to a question is not stored in the database in advance, the chat robot searches for the answer through the search engine. This solves the prior-art problem that the chat robot cannot respond when the answer is not stored in the database in advance, and thereby improves the chat robot's response ability.
  • FIG. 1 is a flowchart of an optional search engine-based question answering method according to an embodiment of the present application
  • FIG. 2 is a schematic diagram of an optional search engine-based question answering device according to an embodiment of the present application
  • FIG. 3 is a schematic diagram of an optional computer device provided by an embodiment of the present application.
  • An embodiment of the present application provides a question and answer method based on a search engine. As shown in FIG. 1, the method includes:
  • Step S102: acquire a target question input by the user.
  • Step S104: determine the keywords of the target question.
  • Step S106: search the search engine according to the keywords to obtain multiple search results.
  • Step S108: calculate the matching degree between each of the multiple search results and the keywords.
  • The matching degree between a search result and the keywords is calculated as follows: extract from the search result a preset number of high-frequency words whose frequency of occurrence exceeds a preset frequency threshold, compare the extracted high-frequency words with the keywords, and determine the matching degree from the number of high-frequency words that coincide with the keywords. If the extracted high-frequency words and the keywords do not overlap at all, the matching degree between the search result and the keywords is low; if they overlap to a high degree, the matching degree is high. Note that before high-frequency words are extracted, the search result must be segmented into words, and words that have no practical meaning (stop words) must be removed.
  • The preset frequency threshold can be set according to actual needs.
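  • The matching-degree calculation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the stop-word list, the whitespace-style tokenisation, the function names, and the choice to normalise the overlap count by the number of keywords are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not enumerate one.
STOP_WORDS = frozenset({"the", "a", "an", "of", "is", "are", "in", "and", "to", "has"})


def high_frequency_words(text, max_words=5, freq_threshold=1):
    """Return up to max_words tokens whose count exceeds freq_threshold."""
    # Segment the text, then drop words with no practical meaning.
    tokens = [t for t in re.findall(r"\w+", text.lower()) if t not in STOP_WORDS]
    counts = Counter(tokens)
    frequent = [word for word, count in counts.most_common() if count > freq_threshold]
    return frequent[:max_words]


def matching_degree(search_result, keywords, max_words=5, freq_threshold=1):
    """Share of keywords that coincide with the result's high-frequency words.

    Dividing by the number of keywords is an assumption; the source only says
    the degree is determined from the number of coinciding words.
    """
    if not keywords:
        return 0.0
    frequent = set(high_frequency_words(search_result, max_words, freq_threshold))
    hits = sum(1 for keyword in keywords if keyword.lower() in frequent)
    return hits / len(keywords)
```

  • A result would then be kept as a candidate answer when `matching_degree(...)` is greater than or equal to the preset value.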
  • Step S110: use the search results whose matching degree is greater than or equal to a preset value as candidate answers.
  • Step S112: determine whether the type of a candidate answer is the document type.
  • The document type refers to a text type whose number of characters exceeds a preset character number threshold, such as papers, journal articles, and patents.
  • The preset character number threshold can be set according to actual needs.
  • Step S114: if the type of the candidate answer is the document type, parse the candidate answer according to a preset algorithm to obtain the answer to the target question.
  • Step S116: if the type of the candidate answer is not the document type, determine that the candidate answer is the answer to the target question.
  • In the embodiments of the present application, multiple search results are retrieved from the search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to the preset value are used as candidate answers. If the type of a candidate answer is the document type, the candidate answer is parsed according to the preset algorithm to obtain the answer to the target question; if the type of the candidate answer is not the document type, the candidate answer is determined to be the answer to the target question. When the answer to a question is not stored in the database in advance, the chat robot searches for the answer through the search engine. This solves the prior-art problem that the chat robot cannot respond when the answer is not stored in the database in advance, and thereby improves the chat robot's response ability.
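  • The control flow of steps S102 to S116 can be sketched as a single function. All names here are hypothetical, and the keyword extractor, search call, matcher, and document parser are passed in as plain callables because the source does not fix their implementations; the default thresholds are placeholders standing in for the "preset" values.

```python
def answer_question(question, extract_keywords, search, match, parse_document,
                    match_threshold=0.5, doc_char_threshold=500):
    """Return the answers found for a question, following steps S102 to S116."""
    keywords = extract_keywords(question)          # S104: determine keywords
    results = search(keywords)                     # S106: query the search engine
    answers = []
    for result in results:
        # S108 and S110: keep only results that match the keywords well enough.
        if match(result, keywords) < match_threshold:
            continue
        # S112: a candidate longer than the character threshold is a document.
        if len(result) > doc_char_threshold:
            answers.append(parse_document(result, keywords))   # S114
        else:
            answers.append(result)                             # S116
    return answers
```

  • Passing the components as arguments keeps the sketch independent of any particular search engine or parsing model.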
  • Optionally, determining the keywords of the target question includes: extracting keywords from the target question and using them as first keywords; obtaining the previous question that the user input before inputting the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords together as the keywords of the target question.
  • In a chat scenario, the current chat content is often related to the previous chat content, so before searching for the answer to the target question, the previous question entered before the target question needs to be consulted. For example, the first question is "Are there any remaining tickets for the high-speed rail from Shanghai to Beijing tomorrow?" and the second question is "How much is a ticket?".
  • The second question contains insufficient information and must be combined with the previous question to identify the specific question the user is asking: "How much is a high-speed rail second-class ticket from Shanghai to Beijing?".
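  • Combining the first keywords (from the target question) with the second keywords (from the previous question) can be sketched as below. The stop-word list, the regular-expression tokeniser, and the function names are assumptions for illustration; the source does not specify how keywords are extracted.

```python
import re

# Hypothetical stop-word list used only for this illustration.
DEFAULT_STOP_WORDS = frozenset({
    "how", "much", "is", "a", "an", "the", "are", "there", "any", "for", "from", "to",
})


def extract_keywords(question, stop_words=DEFAULT_STOP_WORDS):
    """Return the distinct non-stop-word tokens of a question, in order."""
    keywords = []
    for token in re.findall(r"\w+", question.lower()):
        if token not in stop_words and token not in keywords:
            keywords.append(token)
    return keywords


def merged_keywords(target_question, previous_question, stop_words=DEFAULT_STOP_WORDS):
    """First keywords followed by any second keywords not already present."""
    first = extract_keywords(target_question, stop_words)
    second = extract_keywords(previous_question, stop_words)
    return first + [keyword for keyword in second if keyword not in first]
```

  • With this sketch, a short follow-up question inherits the place names from the previous question, so the search query carries enough context.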
  • Optionally, determining the keywords of the target question includes: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's response to the first question; extracting keywords from that response and using them as third keywords; and using the first keywords and the third keywords together as the keywords of the target question.
  • For example, when the user asks "How is the weather tomorrow?", the question "Where do you want to check the weather?" is returned to the user, and the specific geographic location that the user enters in response, such as "Chengdu", is obtained.
  • When the user's question is incomplete, it is supplemented into a complete question by querying historical information or by asking the user a further question. This determines the user's question accurately, improves the accuracy of the search results, and improves the user's chat experience.
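  • The follow-up-question variant can be sketched as a lookup from a first keyword to the question returned to the user, followed by a merge with the third keywords taken from the reply. The `FOLLOW_UPS` table and both function names are hypothetical; the source does not say how the associated question is chosen.

```python
# Hypothetical mapping from a keyword to the clarifying question it triggers.
FOLLOW_UPS = {
    "weather": "Where do you want to check the weather?",
}


def follow_up_question(first_keywords):
    """Return the clarifying question for the first keyword that has one, else None."""
    for keyword in first_keywords:
        if keyword in FOLLOW_UPS:
            return FOLLOW_UPS[keyword]
    return None


def keywords_with_reply(first_keywords, third_keywords):
    """Combine the first keywords with the third keywords extracted from the reply."""
    combined = list(first_keywords)
    for keyword in third_keywords:
        if keyword not in combined:
            combined.append(keyword)
    return combined
```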
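  • Optionally, after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each search result and the keywords is calculated, the method further includes: determining whether advertising information exists in the multiple search results; and filtering out the search results in which advertising information exists.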
  • When a search engine is used, the returned results include not only the results being sought but also useless information such as advertisements and promotions. For example, when a query about what to watch out for with rheumatism is entered into Baidu search, the top-ranked results are hospital web page advertisements related to the treatment of rheumatism. Filtering out the advertising information gives a better user experience.
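  • The source does not say how advertising information is detected. As one possible sketch, assuming each search result is plain text and that advertisements can be recognised by a few marker words (the `AD_MARKERS` tuple is entirely hypothetical), the filtering step might look like this:

```python
# Hypothetical marker words; a real system would need a more reliable signal.
AD_MARKERS = ("advertisement", "sponsored", "promotion")


def contains_ad(result):
    """Return True if the result text contains any advertising marker."""
    text = result.lower()
    return any(marker in text for marker in AD_MARKERS)


def filter_ads(results):
    """Drop the search results in which advertising information is found."""
    return [result for result in results if not contains_ad(result)]
```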
  • Optionally, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question includes: segmenting the keywords and the candidate answer into words to obtain multiple word segments of the keywords and multiple word segments of the candidate answer; obtaining the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer; adding the word vectors of the keywords' word segments to obtain the initial vector representation of the keywords, and adding the word vectors of the candidate answer's word segments to obtain the initial vector representation of the candidate answer; inputting the initial vector representation of the keywords into a first deep learning neural network, which outputs the final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network, which outputs the final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers; computing the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple pieces of information in the candidate answer; weighting and combining the multiple pieces of information through the attention mechanism to obtain a first vector; inputting the first vector into a decoder, which outputs the corresponding text; and using the text output by the decoder as the answer to the target question.
  • The first deep learning neural network and the second deep learning neural network are each a neural network model that combines a CNN and an LSTM with an attention mechanism.
  • The initial vector representation is stored in the embedding matrix of the embedding layer before being input to the neural network.
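  • Forming an initial vector representation by adding the word vectors of the word segments can be sketched as below. The embedding table is a plain dictionary of made-up values, and skipping segments that have no vector is an assumption; the source does not say how unknown words are handled.

```python
def sum_word_vectors(segments, embeddings, dim):
    """Add the word vectors of all segments to get one initial vector."""
    total = [0.0] * dim
    for segment in segments:
        vector = embeddings.get(segment)
        if vector is None:
            # Assumption: segments without a known word vector are skipped.
            continue
        for i, value in enumerate(vector):
            total[i] += value
    return total
```

  • The same function would be applied once to the keywords' word segments and once to the candidate answer's word segments.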
  • The double-layer LSTM can fully mine the sequential features of the keywords and candidate answers, and the features extracted by the double-layer LSTM are merged with the features extracted by the attention mechanism to obtain rich semantic feature information about the keywords and candidate answers.
  • In the CNN, convolution kernels of different sizes extract features of the keywords and candidate answers at different granularities, and the feature information of different granularities is merged by concatenation to make the feature information more comprehensive. With this fused neural network, data of different dimensions can be input without its dimensions being changed.
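  • Extracting features at several granularities with kernels of different sizes and concatenating them can be illustrated without any deep learning library. In this sketch each kernel is a list of weight vectors (one per covered position), each kernel's responses are reduced by max pooling, and the pooled values are concatenated; the pooling choice, the function names, and the toy weights used with it are assumptions, not details from the source.

```python
def conv1d_max(sequence, kernel):
    """Slide a kernel over a sequence of vectors and return the largest response.

    Assumes the sequence is at least as long as the kernel.
    """
    size = len(kernel)
    responses = []
    for start in range(len(sequence) - size + 1):
        window = sequence[start:start + size]
        response = sum(
            weight * value
            for weights, vector in zip(kernel, window)
            for weight, value in zip(weights, vector)
        )
        responses.append(response)
    return max(responses)


def multi_granularity_features(sequence, kernels):
    """Concatenate the pooled responses of kernels with different sizes."""
    return [conv1d_max(sequence, kernel) for kernel in kernels]
```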
  • More layers do not always make a neural network better. If the input sequence is long, the number of layers needs to be increased; but if the input sequence is short and the number of layers is large, the learning effect of the neural network is reduced. The number of layers therefore needs to be set according to the actual situation.
  • The keyword sequence is short and the document sequence is long, so the two require different numbers of network layers.
  • The inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer gives a normalized probability for each of the multiple vector representations in the candidate answer; weighting the multiple word vector representations by these normalized probabilities yields the first vector.
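  • The inner-product attention step can be sketched as follows. Using a softmax to turn the inner products into normalized probabilities is an assumption (the source only says the probabilities are normalized), and the function names are illustrative.

```python
import math


def softmax(scores):
    """Normalize scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attend(keyword_vector, answer_vectors):
    """Weight the answer vectors by their inner product with the keyword vector.

    Returns the attention weights and the weighted sum (the "first vector").
    """
    scores = [
        sum(q * a for q, a in zip(keyword_vector, vector))
        for vector in answer_vectors
    ]
    weights = softmax(scores)
    dim = len(answer_vectors[0])
    first_vector = [
        sum(weight * vector[i] for weight, vector in zip(weights, answer_vectors))
        for i in range(dim)
    ]
    return weights, first_vector
```

  • Answer positions whose vectors align more closely with the keyword vector receive larger weights and so contribute more to the first vector.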
  • The decoder can be a unidirectional LSTM.
  • The preset algorithm used to extract the answer from a document-type candidate answer may be the R-NET algorithm.
  • The R-NET algorithm model works as follows: (1) representation learning produces a representation, that is, a vector in deep learning terms, for each word in the keywords of the target question and in the relevant document, mainly using a bidirectional recurrent neural network; (2) a gated convolutional network plus attention mechanism compares the vectors of the target question's keywords with the vectors of the relevant document to find the passages in the document that are closest to the target question; (3) a gated convolutional network plus attention mechanism compares those passages with one another to obtain a candidate answer; (4) for each word in the candidate answer, the model predicts the probability that it is the start of the answer and the probability that it is the end of the answer, and the system outputs the text with the highest probability as the answer.
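  • The final selection step, picking the text whose start and end positions have the highest probability, can be sketched independently of the network that produces the probabilities. Scoring a span by the product of its start and end probabilities and capping the span length are assumptions made for this illustration; the probability lists would come from the model.

```python
def best_span(start_probs, end_probs, max_len=10):
    """Return (start, end) indices maximizing start_probs[s] * end_probs[e], with e >= s."""
    best = (0, 0)
    best_score = -1.0
    for start, p_start in enumerate(start_probs):
        last = min(start + max_len, len(end_probs))
        for end in range(start, last):
            score = p_start * end_probs[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


def extract_answer(tokens, start_probs, end_probs, max_len=10):
    """Join the tokens of the highest-probability span into the answer text."""
    start, end = best_span(start_probs, end_probs, max_len)
    return " ".join(tokens[start:end + 1])
```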
  • R-NET uses an additional gate to filter out unimportant information before the word representations of the relevant document and the representation of the target question are fed into the RNN.
  • When the search engine cannot directly find the answer to the user's question, the answer is obtained from documents through the R-NET algorithm, which widens the range of answer retrieval while also improving the accuracy of the answer and the chat robot's ability to respond.
  • An embodiment of the present application provides a search engine-based question answering device, which is used to perform the search engine-based question answering method.
  • As shown in FIG. 2, the device includes: an acquiring unit 10, a first determining unit 20, a search unit 30, a calculation unit 40, a second determining unit 50, a first judgment unit 60, a parsing unit 70, and a third determining unit 80.
  • The acquiring unit 10 is used to acquire a target question input by a user.
  • The first determining unit 20 is used to determine the keywords of the target question.
  • The search unit 30 is used to search the search engine according to the keywords to obtain multiple search results.
  • The calculation unit 40 is used to calculate the matching degree between each of the multiple search results and the keywords.
  • The second determining unit 50 is used to use the search results whose matching degree is greater than or equal to a preset value as candidate answers.
  • The first judgment unit 60 is used to determine whether the type of a candidate answer is the document type.
  • The parsing unit 70 is used to parse the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type.
  • The third determining unit 80 is used to determine that the candidate answer is the answer to the target question if the type of the candidate answer is not the document type.
  • In the embodiments of the present application, multiple search results are retrieved from the search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to the preset value are used as candidate answers. If the type of a candidate answer is the document type, the candidate answer is parsed according to the preset algorithm to obtain the answer to the target question; if the type of the candidate answer is not the document type, the candidate answer is determined to be the answer to the target question. When the answer to a question is not stored in the database in advance, the chat robot searches for the answer through the search engine. This solves the prior-art problem that the chat robot cannot respond when the answer is not stored in the database in advance, and thereby improves the chat robot's response ability.
  • the first determination unit 20 includes: a first extraction module, a first acquisition module, a second extraction module, and a first determination module.
  • the first extraction module is used to extract keywords from the target question, and use the extracted keywords as the first keywords.
  • the first obtaining module is used to obtain the last question input before the user inputs the target question.
  • the second extraction module is used to extract keywords from the previous question input before the user inputs the target question, and use the extracted keywords as the second keywords.
  • the first determining module is used to use the first keyword and the second keyword as keywords of the target question.
  • the first determination unit 20 includes: a third extraction module, a return module, a second acquisition module, a fourth extraction module, and a second determination module.
  • the third extraction module is used to extract keywords from the target question, and use the extracted keywords as the first keywords.
  • the return module is used to return the first question associated with the first keyword to the user.
  • the second obtaining module is used to obtain the user's response to the first question.
  • the fourth extraction module is used to extract keywords from the user's response to the first question, and use the extracted keywords as the third keyword.
  • the second determination module is configured to use the first keyword and the third keyword as keywords of the target question.
  • The device further includes: a second judgment unit and a filtering unit.
  • The second judgment unit is used to determine whether advertising information exists in the multiple search results, after the search unit 30 retrieves the multiple search results from the search engine according to the keywords and before the calculation unit 40 calculates the matching degree between each search result and the keywords.
  • The filtering unit is used to filter out the search results in which advertising information exists.
  • the parsing unit 70 includes: a word segmentation module, a third acquisition module, a third determination module, an input module, a fourth determination module, an output module, and a fifth determination module.
  • the word segmentation module is used to segment the keywords and candidate answers to obtain multiple word segments of the keyword and multiple word segments of the candidate answers.
  • the third obtaining module is used to obtain word vectors corresponding to multiple word segments of keywords and word vectors corresponding to multiple word segments of candidate answers.
  • The third determining module is used to add the word vectors corresponding to the word segments of the keywords to obtain the initial vector representation of the keywords, and to add the word vectors corresponding to the word segments of the candidate answer to obtain the initial vector representation of the candidate answer.
  • The input module is used to input the initial vector representation of the keywords into the first deep learning neural network, which outputs the final feature vector representation of the keywords, and to input the initial vector representation of the candidate answer into the second deep learning neural network, which outputs the final feature vector representation of the candidate answer. The first deep learning neural network and the second deep learning neural network have different numbers of layers.
  • The fourth determining module is used to compute the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple pieces of information in the candidate answer, and to weight and combine the multiple pieces of information through the attention mechanism to obtain the first vector.
  • the output module is used to input the first vector into the decoder, and the decoder outputs the corresponding text.
  • the fifth determination module is used to take the text output by the decoder as the answer to the target question.
  • The preset algorithm used to extract the answer from a document-type candidate answer may be the R-NET algorithm.
  • An embodiment of the present application provides a storage medium that includes a stored program, wherein, when the program is running, the device where the storage medium is located is controlled to perform the following steps: acquiring a target question input by a user; determining keywords of the target question; searching the search engine according to the keywords to obtain multiple search results; calculating the matching degree between each of the multiple search results and the keywords; using the search results whose matching degree is greater than or equal to the preset value as candidate answers; determining whether the type of a candidate answer is the document type; if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining that the candidate answer is the answer to the target question.
  • When the program is running, the device where the storage medium is located is also controlled to perform the following steps: extracting keywords from the target question and using them as first keywords; obtaining the previous question that the user input before inputting the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords together as the keywords of the target question.
  • When the program is running, the device where the storage medium is located is also controlled to perform the following steps: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's response to the first question; extracting keywords from that response and using them as third keywords; and using the first keywords and the third keywords together as the keywords of the target question.
  • When the program is running, the device where the storage medium is located is also controlled to perform the following steps: after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each search result and the keywords is calculated, determining whether advertising information exists in the multiple search results; and filtering out the search results in which advertising information exists.
  • When the program is running, the device where the storage medium is located is also controlled to perform the following steps: segmenting the keywords and the candidate answer into words to obtain multiple word segments of the keywords and multiple word segments of the candidate answer; obtaining the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer; adding the word vectors of the keywords' word segments to obtain the initial vector representation of the keywords, and adding the word vectors of the candidate answer's word segments to obtain the initial vector representation of the candidate answer; inputting the initial vector representation of the keywords into the first deep learning neural network, which outputs the final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into the second deep learning neural network, which outputs the final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers; computing the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple pieces of information in the candidate answer; weighting and combining the multiple pieces of information through the attention mechanism to obtain the first vector; inputting the first vector into the decoder, which outputs the corresponding text; and using the text output by the decoder as the answer to the target question.
  • When the program is running, the device where the storage medium is located is also controlled to perform the following step: if the type of the candidate answer is the document type, parsing the candidate answer according to the R-NET algorithm to obtain the answer to the target question.
  • an embodiment of the present application provides a computer device including a memory and a processor.
  • the memory is used to store information including program instructions.
  • the processor is used to control the execution of the program instructions.
  • The program instructions are loaded and executed by the processor to implement the following steps: acquiring a target question input by a user; determining keywords of the target question; searching the search engine according to the keywords to obtain multiple search results; calculating the matching degree between each of the multiple search results and the keywords; using the search results whose matching degree is greater than or equal to the preset value as candidate answers; determining whether the type of a candidate answer is the document type; if the type of the candidate answer is the document type, parsing the candidate answer according to the preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining that the candidate answer is the answer to the target question.
  • The following steps are also implemented: extracting keywords from the target question and using them as first keywords; obtaining the previous question that the user input before inputting the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords together as the keywords of the target question.
  • The following steps are also implemented: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's response to the first question; extracting keywords from that response and using them as third keywords; and using the first keywords and the third keywords together as the keywords of the target question.
  • The following steps are also implemented: after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each search result and the keywords is calculated, determining whether advertising information exists in the multiple search results; and filtering out the search results in which advertising information exists.
  • word segmentation of keywords and candidate answers to obtain multiple word segments of the keyword and multiple word segments of the candidate answer; acquiring multiple word segment correspondences of the keyword The word vector corresponding to the word segment of the candidate answer and the word vector corresponding to the word segmentation of the candidate answer; adding the word vectors corresponding to the word segmentation of the keyword to obtain the initial vector representation of the keyword, and comparing the word vectors corresponding to the word segmentation of the candidate answer Add the initial vector representation of the candidate answer; input the initial vector representation of the keyword to the first deep learning neural network for processing, the first deep learning neural network outputs the final feature vector representation of the keyword, and the initial vector of the candidate answer represents the input The second deep learning neural network performs processing.
  • the second deep learning neural network outputs the final feature vector representation of the candidate answer.
  • the first deep learning neural network and the second deep learning neural network have different layers; the final feature vector representation of the keyword and the candidate
  • the final feature vector of the answer represents the inner product to obtain multiple pieces of information in the candidate answer.
  • the multiple pieces of information are weighted and combined to obtain the first vector; the first vector is input to the decoder, and the decoder output corresponds to Text; use the text output by the decoder as the answer to the target question.
  • the following steps are also implemented: If the type of the candidate answer is a document type, the candidate answer is parsed according to the R-NET algorithm to obtain the answer to the target question.
  • FIG. 3 is a schematic diagram of a computer device provided by an embodiment of the present application.
  • the computer device 50 of this embodiment includes a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and executable on the processor 51. When the computer program 53 is executed by the processor 51, the search-engine-based question answering method of the embodiments is implemented; to avoid repetition, the details are not repeated here.
  • alternatively, when the computer program is executed by the processor 51, the functions of the models/units in the search-engine-based question answering device of the embodiments are implemented. To avoid repetition, they are not described here one by one.
  • the computer device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer, and a cloud server.
  • the computer equipment may include, but is not limited to, the processor 51 and the memory 52.
  • FIG. 3 is only an example of the computer device 50 and does not limit it; the computer device may include more or fewer components than shown, combine some components, or use different components.
  • computer equipment may also include input and output devices, network access devices, buses, and so on.
  • the processor 51 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • the general-purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
  • the memory 52 may be an internal storage unit of the computer device 50, such as a hard disk or a memory of the computer device 50.
  • the memory 52 may also be an external storage device of the computer device 50, for example a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 50.
  • the memory 52 may also include both the internal storage unit of the computer device 50 and the external storage device.
  • the memory 52 is used to store computer programs and other programs and data required by computer devices.
  • the memory 52 may also be used to temporarily store data that has been or will be output.
  • the disclosed system, device, and method may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware, or in the form of hardware plus software functional units.
  • the above integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium.
  • the above software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to perform some of the steps of the methods described in the embodiments of the present application.
  • the aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

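A search-engine-based question answering method, apparatus, storage medium, and computer device. The method includes: obtaining a target question input by a user (S102); determining keywords of the target question (S104); retrieving multiple search results from a search engine according to the keywords (S106); calculating the matching degree between each of the search results and the keywords (S108); taking search results whose matching degree is greater than or equal to a preset value as candidate answers (S110); determining whether the type of a candidate answer is the document type (S112); if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question (S114); if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question (S116). The method addresses the poor answering capability of chatbots.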

Description

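A search-engine-based question answering method, apparatus, storage medium, and computer device
This application claims priority to Chinese patent application No. 201910018881.2, entitled "A search-engine-based question answering method and apparatus", filed with the Chinese Patent Office on January 9, 2019, the entire contents of which are incorporated herein by reference.
[Technical Field]
This application relates to the field of big data technology, and in particular to a search-engine-based question answering method, apparatus, storage medium, and computer device.
[Background]
With the development of technology, artificial intelligence chatbots have emerged. They can be applied in fields such as education and entertainment. For example, parents can use an AI chatbot to help their children learn, and a child might ask the chatbot: "Which stars are in the solar system?" The chatbot answers based on the content stored in its database.
However, because the content stored in a chatbot's database is currently limited, the chatbot cannot respond when the answer to a question has not been stored in the database in advance, which results in poor answering capability.
[Summary of the Application]
In view of this, embodiments of this application provide a search-engine-based question answering method, apparatus, storage medium, and computer device, to solve the prior-art problem of poor chatbot answering capability.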
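In one aspect, an embodiment of this application provides a search-engine-based question answering method, the method including: obtaining a target question input by a user; determining keywords of the target question; retrieving multiple search results from a search engine according to the keywords; calculating the matching degree between each of the multiple search results and the keywords; taking search results whose matching degree is greater than or equal to a preset value as candidate answers; determining whether the type of a candidate answer is the document type; if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.
In one aspect, an embodiment of this application provides a search-engine-based question answering apparatus, the apparatus including: an obtaining unit, configured to obtain a target question input by a user; a first determining unit, configured to determine keywords of the target question; a search unit, configured to retrieve multiple search results from a search engine according to the keywords; a calculation unit, configured to calculate the matching degree between each of the multiple search results and the keywords; a second determining unit, configured to take search results whose matching degree is greater than or equal to a preset value as candidate answers; a first judging unit, configured to determine whether the type of a candidate answer is the document type; a parsing unit, configured to parse the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type; and a third determining unit, configured to determine the candidate answer to be the answer to the target question if the type of the candidate answer is not the document type.
In one aspect, an embodiment of this application provides a storage medium that includes a stored program, wherein, when the program runs, the device where the storage medium is located is controlled to perform the above search-engine-based question answering method.
In one aspect, an embodiment of this application provides a computer device including a memory and a processor, the memory being used to store information including program instructions and the processor being used to control execution of the program instructions; when the program instructions are loaded and executed by the processor, the steps of the above search-engine-based question answering method are implemented.
In the embodiments of this application, multiple search results are retrieved from a search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to a preset value are taken as candidate answers. If a candidate answer is of the document type, it is parsed according to a preset algorithm to obtain the answer to the target question; otherwise, the candidate answer itself is determined to be the answer. When the answer to a question has not been stored in the database in advance, the chatbot finds it through the search engine. This solves the prior-art problem that a chatbot cannot respond, and therefore answers poorly, when the answer is not already in its database, and improves the chatbot's answering capability.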
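[Brief Description of the Drawings]
To explain the technical solutions of the embodiments of this application more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application; those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an optional search-engine-based question answering method according to an embodiment of this application;
FIG. 2 is a schematic diagram of an optional search-engine-based question answering apparatus according to an embodiment of this application;
FIG. 3 is a schematic diagram of an optional computer device provided by an embodiment of this application.
[Detailed Description]
For a better understanding of the technical solutions of this application, the embodiments of this application are described in detail below with reference to the drawings.
It should be clear that the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments in this application without creative effort fall within the scope of protection of this application.
The terms used in the embodiments of this application are only for the purpose of describing particular embodiments and are not intended to limit this application. The singular forms "a", "said", and "the" used in the embodiments of this application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise.
It should be understood that the term "and/or" used herein merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.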
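An embodiment of this application provides a search-engine-based question answering method. As shown in FIG. 1, the method includes:
Step S102: obtain the target question input by the user.
Step S104: determine the keywords of the target question.
Step S106: retrieve multiple search results from a search engine according to the keywords.
Step S108: calculate the matching degree between each of the multiple search results and the keywords.
The matching degree between a search result and the keywords is calculated as follows: extract from the search result a preset number of high-frequency words whose frequency of occurrence exceeds a preset frequency threshold, compare the extracted high-frequency words with the keywords, and determine the matching degree from the number of extracted high-frequency words that coincide with the keywords. If none of the extracted high-frequency words coincides with a keyword, the matching degree of the search result is low; if the overlap is high, the matching degree is high. Note that before high-frequency words are extracted, the search result must first be segmented into words, and words with no real meaning, such as "的" and "得", must be removed. The preset frequency threshold can be set according to actual needs.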
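The matching-degree calculation described above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the stop-word set, the function names, the default parameter values, and the choice to normalize the overlap count by the number of keywords are all assumptions; the source only says that the degree is determined from the number of overlapping words. The input is assumed to be already segmented into words.

```python
from collections import Counter

# Hypothetical stop-word list; the source names only "的" and "得" as examples.
STOP_WORDS = {"的", "得", "the", "a", "of"}

def high_frequency_words(tokens, max_words, min_count):
    """Return up to max_words tokens whose count exceeds min_count, stop words removed."""
    counts = Counter(t for t in tokens if t not in STOP_WORDS)
    return [w for w, c in counts.most_common(max_words) if c > min_count]

def matching_degree(result_tokens, keywords, max_words=5, min_count=1):
    """Fraction of distinct keywords found among the result's high-frequency words.

    Normalizing by the keyword count is an assumption made for this sketch.
    """
    unique_keywords = set(keywords)
    if not unique_keywords:
        return 0.0
    frequent = set(high_frequency_words(result_tokens, max_words, min_count))
    return len(frequent & unique_keywords) / len(unique_keywords)
```

A result would then be kept as a candidate answer when `matching_degree(...)` is greater than or equal to the preset value.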
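Step S110: take search results whose matching degree is greater than or equal to the preset value as candidate answers.
Step S112: determine whether the type of a candidate answer is the document type.
In the embodiments of this application, the document type refers to a text type whose character count exceeds a preset character-count threshold, such as papers, journal articles, and patents. The preset character-count threshold can be set according to actual needs.
Step S114: if the type of the candidate answer is the document type, parse the candidate answer according to a preset algorithm to obtain the answer to the target question.
Step S116: if the type of the candidate answer is not the document type, determine the candidate answer to be the answer to the target question.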
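Steps S110 to S116 amount to a threshold filter followed by a length-based branch. The sketch below assumes the search results have already been scored in step S108; the function name, its parameters, and the `parse_document` callback (standing in for the preset parsing algorithm) are hypothetical names introduced for illustration.

```python
def select_answers(results, scores, preset_value, char_threshold, parse_document):
    """Apply steps S110-S116 to search results that were already scored in S108."""
    answers = []
    for text, score in zip(results, scores):
        if score < preset_value:
            continue  # S110: only results with score >= preset value are candidates
        if len(text) > char_threshold:
            # S112/S114: a "document type" candidate is longer than the threshold
            answers.append(parse_document(text))
        else:
            # S116: a short candidate is used as the answer directly
            answers.append(text)
    return answers
```

For instance, `select_answers(texts, scores, 0.5, 200, my_reader)` would pass only candidates longer than 200 characters to `my_reader`, where `my_reader` is whatever document-parsing function is available.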
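In the embodiments of this application, multiple search results are retrieved from a search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to a preset value are taken as candidate answers. If a candidate answer is of the document type, it is parsed according to a preset algorithm to obtain the answer to the target question; otherwise, the candidate answer itself is determined to be the answer. When the answer to a question has not been stored in the database in advance, the chatbot finds it through the search engine. This solves the prior-art problem that a chatbot cannot respond, and therefore answers poorly, when the answer is not already in its database, and improves the chatbot's answering capability.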
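Optionally, determining the keywords of the target question includes: extracting keywords from the target question and using them as first keywords; obtaining the previous question the user entered before the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords as the keywords of the target question.
When chatting with a chatbot, later content is generally related to earlier content, so before searching for results for the target question, the previous question entered before it needs to be consulted. For example, the first question is "Are there any second-class seats left on the high-speed train from Shanghai to Beijing tomorrow?" and the second question is "How much is a ticket?". The second question does not carry enough information on its own; it must be combined with the previous one to determine what the user actually wants to search for, namely "How much is a second-class ticket on the high-speed train from Shanghai to Beijing?".
Optionally, determining the keywords of the target question includes: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's reply to the first question; extracting keywords from that reply and using them as third keywords; and using the first keywords and the third keywords as the keywords of the target question.
For example, if the user enters "What will the weather be like tomorrow?", the corresponding question "Which place's weather do you want to look up?" is returned to the user. After the user supplies a specific location, such as "Chengdu", the target question can be determined to be "What will the weather be like in Chengdu tomorrow?".
If the user's question is incomplete, it is completed by consulting the history or by asking the user a follow-up question, so that the user's question is determined accurately. This improves the accuracy of the search results and the user's chat experience.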
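Both variants above end by combining the first keywords with a second set of keywords, taken either from the previous question or from the user's reply to a follow-up question. A minimal sketch of that combining step is shown below; the function name and the decision to drop duplicates while preserving order are assumptions, and keyword extraction itself is left out.

```python
def merge_keywords(first_keywords, context_keywords):
    """Combine target-question keywords with context keywords, keeping order, no repeats."""
    merged = []
    for word in list(first_keywords) + list(context_keywords):
        if word not in merged:
            merged.append(word)
    return merged
```

With the train-ticket example, `merge_keywords(["ticket", "price"], ["Shanghai", "Beijing", "high-speed rail", "second class"])` yields a keyword list that carries the route and seat class into the second search.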
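Optionally, after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each of the multiple search results and the keywords is calculated, the method further includes: determining whether any of the multiple search results contain advertising information; and filtering out the search results that contain advertising information.
When a question is entered into a search engine, the output includes not only the desired results but also useless information such as advertisements and promotions. For example, entering "precautions for rheumatism" into Baidu Search returns, at the top of the results, advertisements for hospital web pages related to rheumatism treatment. Filtering out the advertising information gives a better user experience.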
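Optionally, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question, if the type of the candidate answer is the document type, includes: performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer; obtaining the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer; adding the word vectors of the keywords' word segments to obtain an initial vector representation of the keywords, and adding the word vectors of the candidate answer's word segments to obtain an initial vector representation of the candidate answer; inputting the initial vector representation of the keywords into a first deep learning neural network, which outputs a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network, which outputs a final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers; taking the inner product of the two final feature vector representations to obtain multiple information fragments in the candidate answer, and weighting and combining the information fragments through an attention mechanism to obtain a first vector; inputting the first vector into a decoder, which outputs the corresponding text; and taking the text output by the decoder as the answer to the target question.
The first and second deep learning neural networks are neural network models that combine a CNN, an LSTM, and an attention mechanism. Before being input to the network, the initial vector representations are stored in the embedding matrix of the embedding layer. A two-layer LSTM can fully mine the sequential features of the keywords and the candidate answer, and fusing what the two-layer LSTM extracts with the features extracted by the attention mechanism yields rich semantic feature information about the keywords and the candidate answer. CNN convolution kernels of different sizes extract features of the keywords and the candidate answer at different granularities, and these features are fused by concatenation to make the feature information more comprehensive. With this fusion of networks, input data of different dimensions keeps its dimensions, so data of various dimensions can be fused and concatenated, which effectively avoids the information loss incurred when unifying data dimensions. More layers are not always better: a long input sequence calls for more layers, but when the input sequence is short and the network has many layers, learning performance degrades, so the number of layers must be set according to the actual situation. The keyword sequence is short and the document sequence is long, so they require different numbers of layers.
Taking the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer yields normalized probabilities for the multiple vector representations in the candidate answer. Weighting and combining those word vector representations by the normalized probabilities gives the first vector. The decoder that decodes the first vector may be a unidirectional LSTM.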
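The inner-product and weighted-combination step can be illustrated with a small dot-product attention sketch. The function names are hypothetical, and the use of softmax to turn the inner products into normalized probabilities is an assumption; the source says only that the probabilities are normalized. The neural networks and the decoder are not modeled here.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, candidates):
    """Weight candidate vectors by the softmax of their inner product with the query.

    query      -- final feature vector of the keywords
    candidates -- one feature vector per information fragment of the candidate answer
    Returns (weights, combined), where combined plays the role of the "first vector".
    """
    scores = [sum(q * c for q, c in zip(query, vec)) for vec in candidates]
    weights = softmax(scores)
    dim = len(candidates[0])
    combined = [sum(w * vec[i] for w, vec in zip(weights, candidates)) for i in range(dim)]
    return weights, combined
```

Fragments whose vectors align with the keyword vector receive larger weights, so they dominate the vector that is handed to the decoder.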
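Optionally, the preset algorithm used to obtain the answer from a document-type candidate answer may be the R-NET algorithm.
Specifically, the R-NET algorithm model includes: using representation learning to produce a representation (a vector, in deep learning terms) for the keywords of the target question and for every word in the relevant document, mainly with a bidirectional recurrent neural network; using a gated convolutional network plus an attention mechanism to compare the vectors of the target question's keywords with the vectors of the relevant document, and finding the passages in the document that are close to the target question; using a gated convolutional network plus an attention mechanism to compare those passages in the global context, obtaining candidate answers; and predicting, for each word in the candidate answers, which word starts the answer and which word ends it. The system then selects the text span with the highest probability and outputs it as the answer.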
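The final prediction step, choosing where the answer starts and ends, can be sketched as a search over per-word start and end probabilities. The function name, the `max_len` parameter, and scoring a span by the product of its start and end probabilities are assumptions for illustration; the source states only that the most probable span is output.

```python
def best_span(start_probs, end_probs, max_len=None):
    """Return (start, end) maximizing start_probs[s] * end_probs[e] with s <= e.

    max_len optionally caps the span length in words (an assumption of this sketch).
    """
    best, best_score = (0, 0), -1.0
    for s, p_start in enumerate(start_probs):
        last = len(end_probs) if max_len is None else min(len(end_probs), s + max_len)
        for e in range(s, last):
            score = p_start * end_probs[e]
            if score > best_score:
                best, best_score = (s, e), score
    return best
```

The answer text is then the words from index `start` through index `end` of the document, inclusive.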
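When the gated convolutional network plus attention mechanism compares the vectors of the target question with the vectors of the relevant document, an attention distribution over the target question is computed for each word in the document and used to pool the question representation; the word's representation and the pooled question representation are then fed to an RNN encoder to obtain the word's representation. The difference is that, before the document word representation and the question representation are fed to the RNN, R-NET applies an additional gate to filter out unimportant information.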
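The gating step can be written out as follows. This follows the gated attention-based recurrent network as commonly presented for R-NET; the notation is not taken from this application and should be read as an assumption: $u_t^P$ is the representation of document word $t$, $u_j^Q$ the representation of question word $j$, $v_t^P$ the question-aware representation of document word $t$, and $W_u^Q$, $W_u^P$, $W_v^P$, $W_g$, $\mathrm{v}$ are learned parameters.

```latex
\begin{aligned}
s_j^t &= \mathrm{v}^{\top} \tanh\!\left(W_u^Q u_j^Q + W_u^P u_t^P + W_v^P v_{t-1}^P\right) \\
a_i^t &= \frac{\exp(s_i^t)}{\sum_{j} \exp(s_j^t)} \\
c_t &= \sum_{i} a_i^t \, u_i^Q \\
g_t &= \operatorname{sigmoid}\!\left(W_g \left[u_t^P, c_t\right]\right) \\
\left[u_t^P, c_t\right]^{*} &= g_t \odot \left[u_t^P, c_t\right] \\
v_t^P &= \operatorname{RNN}\!\left(v_{t-1}^P, \left[u_t^P, c_t\right]^{*}\right)
\end{aligned}
```

Here $a^t$ is the attention distribution over the question for document word $t$, $c_t$ is the pooled question representation, and the gate $g_t$ scales each component of the concatenated input before it reaches the RNN, which is how unimportant information is suppressed.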
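When the answer to the user's question cannot be found directly with the search engine, the R-NET algorithm extracts it from documents, which widens the scope of answer retrieval while also improving answer accuracy, and so improves the chatbot's answering capability.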
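An embodiment of this application provides a search-engine-based question answering apparatus for performing the above search-engine-based question answering method. As shown in FIG. 2, the apparatus includes: an obtaining unit 10, a first determining unit 20, a search unit 30, a calculation unit 40, a second determining unit 50, a first judging unit 60, a parsing unit 70, and a third determining unit 80.
The obtaining unit 10 is configured to obtain the target question input by the user.
The first determining unit 20 is configured to determine the keywords of the target question.
The search unit 30 is configured to retrieve multiple search results from a search engine according to the keywords.
The calculation unit 40 is configured to calculate the matching degree between each of the multiple search results and the keywords.
The second determining unit 50 is configured to take search results whose matching degree is greater than or equal to the preset value as candidate answers.
The first judging unit 60 is configured to determine whether the type of a candidate answer is the document type.
The parsing unit 70 is configured to parse the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type.
The third determining unit 80 is configured to determine the candidate answer to be the answer to the target question if the type of the candidate answer is not the document type.
In the embodiments of this application, multiple search results are retrieved from a search engine according to the keywords of the target question, and the search results whose matching degree with the keywords is greater than or equal to a preset value are taken as candidate answers. If a candidate answer is of the document type, it is parsed according to a preset algorithm to obtain the answer to the target question; otherwise, the candidate answer itself is determined to be the answer. When the answer to a question has not been stored in the database in advance, the chatbot finds it through the search engine. This solves the prior-art problem that a chatbot cannot respond, and therefore answers poorly, when the answer is not already in its database, and improves the chatbot's answering capability.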
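Optionally, the first determining unit 20 includes a first extraction module, a first obtaining module, a second extraction module, and a first determining module. The first extraction module is configured to extract keywords from the target question and use them as first keywords. The first obtaining module is configured to obtain the previous question the user entered before the target question. The second extraction module is configured to extract keywords from that previous question and use them as second keywords. The first determining module is configured to use the first keywords and the second keywords as the keywords of the target question.
Optionally, the first determining unit 20 includes a third extraction module, a returning module, a second obtaining module, a fourth extraction module, and a second determining module. The third extraction module is configured to extract keywords from the target question and use them as first keywords. The returning module is configured to return to the user a first question associated with the first keywords. The second obtaining module is configured to obtain the user's reply to the first question. The fourth extraction module is configured to extract keywords from the user's reply to the first question and use them as third keywords. The second determining module is configured to use the first keywords and the third keywords as the keywords of the target question.
Optionally, the apparatus further includes a second judging unit and a filtering unit. The second judging unit is configured to determine whether any of the multiple search results contain advertising information, after the search unit 30 retrieves the multiple search results from the search engine according to the keywords and before the calculation unit 40 calculates the matching degree between each search result and the keywords. The filtering unit is configured to filter out the search results that contain advertising information.
Optionally, the parsing unit 70 includes a word segmentation module, a third obtaining module, a third determining module, an input module, a fourth determining module, an output module, and a fifth determining module. The word segmentation module is configured to perform word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer. The third obtaining module is configured to obtain the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer. The third determining module is configured to add the word vectors of the keywords' word segments to obtain an initial vector representation of the keywords, and to add the word vectors of the candidate answer's word segments to obtain an initial vector representation of the candidate answer. The input module is configured to input the initial vector representation of the keywords into a first deep learning neural network, which outputs a final feature vector representation of the keywords, and to input the initial vector representation of the candidate answer into a second deep learning neural network, which outputs a final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers. The fourth determining module is configured to take the inner product of the two final feature vector representations to obtain multiple information fragments in the candidate answer, and to weight and combine the information fragments through an attention mechanism to obtain a first vector. The output module is configured to input the first vector into a decoder, which outputs the corresponding text. The fifth determining module is configured to take the text output by the decoder as the answer to the target question.
Optionally, the preset algorithm used to obtain the answer from a document-type candidate answer may be the R-NET algorithm.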
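In one aspect, an embodiment of this application provides a storage medium that includes a stored program, wherein, when the program runs, the device where the storage medium is located is controlled to perform the following steps: obtaining a target question input by a user; determining keywords of the target question; retrieving multiple search results from a search engine according to the keywords; calculating the matching degree between each of the multiple search results and the keywords; taking search results whose matching degree is greater than or equal to a preset value as candidate answers; determining whether the type of a candidate answer is the document type; if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.
Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: extracting keywords from the target question and using them as first keywords; obtaining the previous question the user entered before the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords as the keywords of the target question.
Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's reply to the first question; extracting keywords from that reply and using them as third keywords; and using the first keywords and the third keywords as the keywords of the target question.
Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each of the multiple search results and the keywords is calculated, determining whether any of the multiple search results contain advertising information; and filtering out the search results that contain advertising information.
Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following steps: performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer; obtaining the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer; adding the word vectors of the keywords' word segments to obtain an initial vector representation of the keywords, and adding the word vectors of the candidate answer's word segments to obtain an initial vector representation of the candidate answer; inputting the initial vector representation of the keywords into a first deep learning neural network, which outputs a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network, which outputs a final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers; taking the inner product of the two final feature vector representations to obtain multiple information fragments in the candidate answer, and weighting and combining the information fragments through an attention mechanism to obtain a first vector; inputting the first vector into a decoder, which outputs the corresponding text; and taking the text output by the decoder as the answer to the target question.
Optionally, when the program runs, the device where the storage medium is located is further controlled to perform the following step: if the type of the candidate answer is the document type, parsing the candidate answer according to the R-NET algorithm to obtain the answer to the target question.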
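In one aspect, an embodiment of this application provides a computer device including a memory and a processor, the memory being used to store information including program instructions and the processor being used to control execution of the program instructions. When the program instructions are loaded and executed by the processor, the following steps are implemented: obtaining a target question input by a user; determining keywords of the target question; retrieving multiple search results from a search engine according to the keywords; calculating the matching degree between each of the multiple search results and the keywords; taking search results whose matching degree is greater than or equal to a preset value as candidate answers; determining whether the type of a candidate answer is the document type; if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question; and if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: extracting keywords from the target question and using them as first keywords; obtaining the previous question the user entered before the target question; extracting keywords from that previous question and using them as second keywords; and using the first keywords and the second keywords as the keywords of the target question.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: extracting keywords from the target question and using them as first keywords; returning to the user a first question associated with the first keywords; obtaining the user's reply to the first question; extracting keywords from that reply and using them as third keywords; and using the first keywords and the third keywords as the keywords of the target question.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: after multiple search results are retrieved from the search engine according to the keywords, and before the matching degree between each of the multiple search results and the keywords is calculated, determining whether any of the multiple search results contain advertising information; and filtering out the search results that contain advertising information.
Optionally, when the program instructions are loaded and executed by the processor, the following steps are also implemented: performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer; obtaining the word vectors corresponding to the word segments of the keywords and the word vectors corresponding to the word segments of the candidate answer; adding the word vectors of the keywords' word segments to obtain an initial vector representation of the keywords, and adding the word vectors of the candidate answer's word segments to obtain an initial vector representation of the candidate answer; inputting the initial vector representation of the keywords into a first deep learning neural network, which outputs a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network, which outputs a final feature vector representation of the candidate answer, the first and second deep learning neural networks having different numbers of layers; taking the inner product of the two final feature vector representations to obtain multiple information fragments in the candidate answer, and weighting and combining the information fragments through an attention mechanism to obtain a first vector; inputting the first vector into a decoder, which outputs the corresponding text; and taking the text output by the decoder as the answer to the target question.
Optionally, when the program instructions are loaded and executed by the processor, the following step is also implemented: if the type of the candidate answer is the document type, parsing the candidate answer according to the R-NET algorithm to obtain the answer to the target question.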
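FIG. 3 is a schematic diagram of a computer device provided by an embodiment of this application. As shown in FIG. 3, the computer device 50 of this embodiment includes a processor 51, a memory 52, and a computer program 53 stored in the memory 52 and executable on the processor 51. When the computer program 53 is executed by the processor 51, the search-engine-based question answering method of the embodiments is implemented; to avoid repetition, the details are not repeated here. Alternatively, when the computer program is executed by the processor 51, the functions of the models/units in the search-engine-based question answering apparatus of the embodiments are implemented; to avoid repetition, they are not described here one by one.
The computer device 50 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device may include, but is not limited to, the processor 51 and the memory 52. Those skilled in the art will understand that FIG. 3 is only an example of the computer device 50 and does not limit it; the computer device may include more or fewer components than shown, combine some components, or use different components. For example, the computer device may also include input and output devices, network access devices, buses, and so on.
The processor 51 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 52 may be an internal storage unit of the computer device 50, such as a hard disk or memory of the computer device 50. The memory 52 may also be an external storage device of the computer device 50, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device 50. Further, the memory 52 may include both an internal storage unit of the computer device 50 and an external storage device. The memory 52 is used to store the computer program and other programs and data required by the computer device. The memory 52 may also be used to temporarily store data that has been output or is about to be output.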
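Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and units described above can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer apparatus (which may be a personal computer, a server, a network apparatus, etc.) or a processor to perform some of the steps of the methods described in the embodiments of this application. The aforementioned storage media include USB flash drives, removable hard disks, read-only memory (ROM), random access memory (RAM), magnetic disks, optical disks, and other media that can store program code.
The above are only preferred embodiments of this application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.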

Claims (20)

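  1. A search-engine-based question answering method, characterized in that the method includes:
    obtaining a target question input by a user;
    determining keywords of the target question;
    retrieving multiple search results from a search engine according to the keywords;
    calculating the matching degree between each of the multiple search results and the keywords;
    taking search results whose matching degree is greater than or equal to a preset value as candidate answers;
    determining whether the type of the candidate answer is the document type;
    if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question;
    if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.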
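  2. The method according to claim 1, characterized in that determining the keywords of the target question includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    obtaining the previous question the user entered before the target question;
    extracting keywords from the previous question the user entered before the target question and using the extracted keywords as second keywords;
    using the first keywords and the second keywords as the keywords of the target question.
  3. The method according to claim 1, characterized in that determining the keywords of the target question includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    returning to the user a first question associated with the first keywords;
    obtaining the user's reply to the first question;
    extracting keywords from the user's reply to the first question and using the extracted keywords as third keywords;
    using the first keywords and the third keywords as the keywords of the target question.
  4. The method according to claim 1, characterized in that, after the retrieving of multiple search results from the search engine according to the keywords, and before the calculating of the matching degree between each of the multiple search results and the keywords, the method further includes:
    determining whether any of the multiple search results contain advertising information;
    filtering out the search results that contain advertising information.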
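  5. The method according to any one of claims 1 to 4, characterized in that parsing the candidate answer according to a preset algorithm to obtain the answer to the target question, if the type of the candidate answer is the document type, includes:
    performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer;
    obtaining the word vectors corresponding to the multiple word segments of the keywords and the word vectors corresponding to the multiple word segments of the candidate answer;
    adding the word vectors corresponding to the multiple word segments of the keywords to obtain an initial vector representation of the keywords, and adding the word vectors corresponding to the multiple word segments of the candidate answer to obtain an initial vector representation of the candidate answer;
    inputting the initial vector representation of the keywords into a first deep learning neural network for processing, the first deep learning neural network outputting a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network for processing, the second deep learning neural network outputting a final feature vector representation of the candidate answer, the first deep learning neural network and the second deep learning neural network having different numbers of layers;
    taking the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple information fragments in the candidate answer, and weighting and combining the multiple information fragments through an attention mechanism to obtain a first vector;
    inputting the first vector into a decoder, the decoder outputting the corresponding text;
    taking the text output by the decoder as the answer to the target question.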
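  6. A search-engine-based question answering apparatus, characterized in that the apparatus includes:
    an obtaining unit, configured to obtain a target question input by a user;
    a first determining unit, configured to determine keywords of the target question;
    a search unit, configured to retrieve multiple search results from a search engine according to the keywords;
    a calculation unit, configured to calculate the matching degree between each of the multiple search results and the keywords;
    a second determining unit, configured to take search results whose matching degree is greater than or equal to a preset value as candidate answers;
    a first judging unit, configured to determine whether the type of the candidate answer is the document type;
    a parsing unit, configured to parse the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type;
    a third determining unit, configured to determine the candidate answer to be the answer to the target question if the type of the candidate answer is not the document type.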
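  7. The apparatus according to claim 6, characterized in that the first determining unit includes:
    a first extraction module, configured to extract keywords from the target question and use the extracted keywords as first keywords;
    a first obtaining module, configured to obtain the previous question the user entered before the target question;
    a second extraction module, configured to extract keywords from the previous question the user entered before the target question and use the extracted keywords as second keywords;
    a first determining module, configured to use the first keywords and the second keywords as the keywords of the target question.
  8. The apparatus according to claim 6, characterized in that the first determining unit includes:
    a third extraction module, configured to extract keywords from the target question and use the extracted keywords as first keywords;
    a returning module, configured to return to the user a first question associated with the first keywords;
    a second obtaining module, configured to obtain the user's reply to the first question;
    a fourth extraction module, configured to extract keywords from the user's reply to the first question and use the extracted keywords as third keywords;
    a second determining module, configured to use the first keywords and the third keywords as the keywords of the target question.
  9. The apparatus according to claim 6, characterized in that the apparatus further includes:
    a second judging unit, configured to determine whether any of the multiple search results contain advertising information;
    a filtering unit, configured to filter out the search results that contain advertising information.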
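  10. The apparatus according to any one of claims 6 to 9, characterized in that the parsing unit includes:
    a word segmentation module, configured to perform word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer;
    a third obtaining module, configured to obtain the word vectors corresponding to the multiple word segments of the keywords and the word vectors corresponding to the multiple word segments of the candidate answer;
    a third determining module, configured to add the word vectors corresponding to the multiple word segments of the keywords to obtain an initial vector representation of the keywords, and to add the word vectors corresponding to the multiple word segments of the candidate answer to obtain an initial vector representation of the candidate answer;
    an input module, configured to input the initial vector representation of the keywords into a first deep learning neural network for processing, the first deep learning neural network outputting a final feature vector representation of the keywords, and to input the initial vector representation of the candidate answer into a second deep learning neural network for processing, the second deep learning neural network outputting a final feature vector representation of the candidate answer, the first deep learning neural network and the second deep learning neural network having different numbers of layers;
    a fourth determining module, configured to take the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple information fragments in the candidate answer, and to weight and combine the multiple information fragments through an attention mechanism to obtain a first vector;
    an output module, configured to input the first vector into a decoder, the decoder outputting the corresponding text;
    a fifth determining module, configured to take the text output by the decoder as the answer to the target question.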
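  11. A storage medium, characterized in that the storage medium includes a stored program, wherein, when the program runs, the device where the storage medium is located is controlled to perform the following steps:
    obtaining a target question input by a user;
    determining keywords of the target question;
    retrieving multiple search results from a search engine according to the keywords;
    calculating the matching degree between each of the multiple search results and the keywords;
    taking search results whose matching degree is greater than or equal to a preset value as candidate answers;
    determining whether the type of the candidate answer is the document type;
    if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question;
    if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.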
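  12. The storage medium according to claim 11, characterized in that the step of determining the keywords of the target question, which the device where the storage medium is located is controlled to perform when the program runs, includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    obtaining the previous question the user entered before the target question;
    extracting keywords from the previous question the user entered before the target question and using the extracted keywords as second keywords;
    using the first keywords and the second keywords as the keywords of the target question.
  13. The storage medium according to claim 11, characterized in that the step of determining the keywords of the target question, which the device where the storage medium is located is controlled to perform when the program runs, includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    returning to the user a first question associated with the first keywords;
    obtaining the user's reply to the first question;
    extracting keywords from the user's reply to the first question and using the extracted keywords as third keywords;
    using the first keywords and the third keywords as the keywords of the target question.
  14. The storage medium according to claim 11, characterized in that, when the program runs, the device where the storage medium is located is controlled, after performing the retrieving of multiple search results from the search engine according to the keywords and before performing the calculating of the matching degree between each of the multiple search results and the keywords, to further perform the following steps:
    determining whether any of the multiple search results contain advertising information;
    filtering out the search results that contain advertising information.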
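  15. The storage medium according to any one of claims 11 to 14, characterized in that the step of parsing the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type, which the device where the storage medium is located is controlled to perform when the program runs, includes:
    performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer;
    obtaining the word vectors corresponding to the multiple word segments of the keywords and the word vectors corresponding to the multiple word segments of the candidate answer;
    adding the word vectors corresponding to the multiple word segments of the keywords to obtain an initial vector representation of the keywords, and adding the word vectors corresponding to the multiple word segments of the candidate answer to obtain an initial vector representation of the candidate answer;
    inputting the initial vector representation of the keywords into a first deep learning neural network for processing, the first deep learning neural network outputting a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network for processing, the second deep learning neural network outputting a final feature vector representation of the candidate answer, the first deep learning neural network and the second deep learning neural network having different numbers of layers;
    taking the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple information fragments in the candidate answer, and weighting and combining the multiple information fragments through an attention mechanism to obtain a first vector;
    inputting the first vector into a decoder, the decoder outputting the corresponding text;
    taking the text output by the decoder as the answer to the target question.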
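  16. A computer device, including a memory and a processor, the memory being used to store information including program instructions and the processor being used to control execution of the program instructions, characterized in that, when the program instructions are loaded and executed by the processor, the following steps are implemented:
    obtaining a target question input by a user;
    determining keywords of the target question;
    retrieving multiple search results from a search engine according to the keywords;
    calculating the matching degree between each of the multiple search results and the keywords;
    taking search results whose matching degree is greater than or equal to a preset value as candidate answers;
    determining whether the type of the candidate answer is the document type;
    if the type of the candidate answer is the document type, parsing the candidate answer according to a preset algorithm to obtain the answer to the target question;
    if the type of the candidate answer is not the document type, determining the candidate answer to be the answer to the target question.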
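  17. The computer device according to claim 16, characterized in that the step of determining the keywords of the target question, implemented when the program instructions are loaded and executed by the processor, includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    obtaining the previous question the user entered before the target question;
    extracting keywords from the previous question the user entered before the target question and using the extracted keywords as second keywords;
    using the first keywords and the second keywords as the keywords of the target question.
  18. The computer device according to claim 16, characterized in that the step of determining the keywords of the target question, implemented when the program instructions are loaded and executed by the processor, includes:
    extracting keywords from the target question and using the extracted keywords as first keywords;
    returning to the user a first question associated with the first keywords;
    obtaining the user's reply to the first question;
    extracting keywords from the user's reply to the first question and using the extracted keywords as third keywords;
    using the first keywords and the third keywords as the keywords of the target question.
  19. The computer device according to claim 16, characterized in that, when the program instructions are loaded and executed by the processor, after the retrieving of multiple search results from the search engine according to the keywords is implemented and before the calculating of the matching degree between each of the multiple search results and the keywords is implemented, the following steps are further implemented:
    determining whether any of the multiple search results contain advertising information;
    filtering out the search results that contain advertising information.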
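  20. The computer device according to any one of claims 16 to 19, characterized in that the step of parsing the candidate answer according to a preset algorithm to obtain the answer to the target question if the type of the candidate answer is the document type, implemented when the program instructions are loaded and executed by the processor, includes:
    performing word segmentation on the keywords and the candidate answer to obtain multiple word segments of the keywords and multiple word segments of the candidate answer;
    obtaining the word vectors corresponding to the multiple word segments of the keywords and the word vectors corresponding to the multiple word segments of the candidate answer;
    adding the word vectors corresponding to the multiple word segments of the keywords to obtain an initial vector representation of the keywords, and adding the word vectors corresponding to the multiple word segments of the candidate answer to obtain an initial vector representation of the candidate answer;
    inputting the initial vector representation of the keywords into a first deep learning neural network for processing, the first deep learning neural network outputting a final feature vector representation of the keywords, and inputting the initial vector representation of the candidate answer into a second deep learning neural network for processing, the second deep learning neural network outputting a final feature vector representation of the candidate answer, the first deep learning neural network and the second deep learning neural network having different numbers of layers;
    taking the inner product of the final feature vector representation of the keywords and the final feature vector representation of the candidate answer to obtain multiple information fragments in the candidate answer, and weighting and combining the multiple information fragments through an attention mechanism to obtain a first vector;
    inputting the first vector into a decoder, the decoder outputting the corresponding text;
    taking the text output by the decoder as the answer to the target question.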
PCT/CN2019/118080 2019-01-09 2019-11-13 一种基于搜索引擎的问答方法、装置、存储介质及计算机设备 WO2020143314A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910018881.2A CN109918560B (zh) 2019-01-09 2019-01-09 一种基于搜索引擎的问答方法和装置
CN201910018881.2 2019-01-09

Publications (1)

Publication Number Publication Date
WO2020143314A1 true WO2020143314A1 (zh) 2020-07-16

Family

ID=66960078

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/118080 WO2020143314A1 (zh) 2019-01-09 2019-11-13 一种基于搜索引擎的问答方法、装置、存储介质及计算机设备

Country Status (2)

Country Link
CN (1) CN109918560B (zh)
WO (1) WO2020143314A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163405A (zh) * 2020-09-08 2021-01-01 北京百度网讯科技有限公司 问题的生成方法和装置

Families Citing this family (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109918560B (zh) * 2019-01-09 2024-03-12 平安科技(深圳)有限公司 一种基于搜索引擎的问答方法和装置
CN110727764A (zh) * 2019-10-10 2020-01-24 珠海格力电器股份有限公司 一种话术生成方法、装置及话术生成设备
CN112749260A (zh) * 2019-10-31 2021-05-04 阿里巴巴集团控股有限公司 信息交互方法、装置、设备及介质
CN111143522B (zh) * 2019-11-29 2023-08-01 华东师范大学 一种端到端的任务型对话***的领域适应方法
CN110929015B (zh) * 2019-12-06 2024-04-02 北京金山数字娱乐科技有限公司 一种多文本分析方法及装置
CN111460095B (zh) * 2020-03-17 2023-06-27 北京百度网讯科技有限公司 问答处理方法、装置、电子设备及存储介质
CN111651567B (zh) * 2020-04-16 2023-09-22 北京奇艺世纪科技有限公司 一种业务问答数据处理方法及装置
CN111680264B (zh) * 2020-04-20 2023-12-22 重庆兆光科技股份有限公司 一种多文档阅读理解方法
CN111930894B (zh) * 2020-08-13 2022-10-28 腾讯科技(深圳)有限公司 长文本匹配方法及装置、存储介质、电子设备
CN112541069A (zh) * 2020-12-24 2021-03-23 山东山大鸥玛软件股份有限公司 一种结合关键词的文本匹配方法、***、终端及存储介质
CN112667809A (zh) * 2020-12-25 2021-04-16 平安科技(深圳)有限公司 一种文本处理方法、装置及电子设备、存储介质
CN113592523B (zh) * 2021-06-03 2024-03-26 山东大学 一种金融数据处理***及方法
CN113392308B (zh) * 2021-06-22 2024-06-25 抖音视界有限公司 内容搜索方法、装置、设备及介质
CN114461777B (zh) * 2022-02-14 2024-07-19 平安科技(深圳)有限公司 智能问答方法、装置、设备及存储介质
CN116910232B (zh) * 2023-09-13 2024-01-09 之江实验室 天文文献检索方法和天文文献搜索方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104699845A (zh) * 2015-03-31 2015-06-10 北京奇虎科技有限公司 基于提问类搜索词的搜索结果提供方法及装置
CN106294635A (zh) * 2016-08-02 2017-01-04 北京百度网讯科技有限公司 应用程序搜索方法、深度神经网络模型的训练方法及装置
CN108491433A (zh) * 2018-02-09 2018-09-04 平安科技(深圳)有限公司 聊天应答方法、电子装置及存储介质
CN109918560A (zh) * 2019-01-09 2019-06-21 平安科技(深圳)有限公司 一种基于搜索引擎的问答方法和装置

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103902652A (zh) * 2014-02-27 2014-07-02 深圳市智搜信息技术有限公司 自动问答***
CN106649786B (zh) * 2016-12-28 2020-04-07 北京百度网讯科技有限公司 基于深度问答的答案检索方法及装置
CN108536708A (zh) * 2017-03-03 2018-09-14 腾讯科技(深圳)有限公司 一种自动问答处理方法及自动问答***
CN107729468B (zh) * 2017-10-12 2019-12-17 华中科技大学 基于深度学习的答案抽取方法及***
CN108153876B (zh) * 2017-12-26 2021-07-23 爱因互动科技发展(北京)有限公司 智能问答方法及***
CN108415977B (zh) * 2018-02-09 2022-02-15 华南理工大学 一个基于深度神经网络及强化学习的生成式机器阅读理解方法
CN109086303B (zh) * 2018-06-21 2021-09-28 深圳壹账通智能科技有限公司 基于机器阅读理解的智能对话方法、装置、终端

Also Published As

Publication number Publication date
CN109918560B (zh) 2024-03-12
CN109918560A (zh) 2019-06-21

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19908818

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19908818

Country of ref document: EP

Kind code of ref document: A1