CN110851484A

CN110851484A - Method and device for obtaining multi-index question answers

Info

Publication number: CN110851484A
Application number: CN201911106796.8A
Authority: CN
Inventors: 孙子钧
Original assignee: Beijing Shannon Huiyu Technology Co Ltd
Current assignee: Beijing Shannon Huiyu Technology Co Ltd
Priority date: 2019-11-13
Filing date: 2019-11-13
Publication date: 2020-02-28

Abstract

The invention provides a method and a device for obtaining answers to multi-index questions. The method comprises: acquiring question information input by a user; performing word segmentation on the question information based on a multi-modal model, determining a word segmentation result, and extracting a plurality of indexes from the question information; establishing dependency relationships between words according to the word segmentation result, and converting the question information, according to the dependency relationships, into a machine-language query statement for each index; and determining the query result for each index and displaying the query results of all indexes in the same coordinate system. Because word segmentation draws on multiple modalities, semantic analysis and therefore the segmentation result are more accurate; because the dependency relationships between words describe the semantics of the sentence completely, the query statements are more accurate and query accuracy improves.

Description

Method and device for obtaining multi-index question answers

Technical Field

The invention relates to the technical field of natural language processing, in particular to a method and a device for acquiring answers to multi-index questions.

Background

At present, the mainstream approach to searching for answers to financial questions is database retrieval based on keyword matching. The database stores a massive number of fields and records; when a user asks a question, the system extracts keywords from the question with a traditional word segmentation algorithm and then queries the database with those keywords to find results.

This search technology has the following main problems and disadvantages:

Keyword matching returns a large number of documents that contain the keywords, and the user must read and filter them manually to find the answer, which is inefficient. Keyword matching also cannot understand the question, so it is difficult to present an accurate answer: the returned results are often merely related to the question and do not actually answer it. In addition, conventional search methods generally support searching for only one index, and perform poorly when several indexes need to be searched at the same time.

Disclosure of Invention

To solve the above problems, embodiments of the present invention provide a method and an apparatus for obtaining answers to multi-index questions.

In a first aspect, an embodiment of the present invention provides a method for obtaining answers to a multi-index question, including:

acquiring question information input by a user, wherein the question information comprises a plurality of indexes with the same attribute;

performing word segmentation on the question information based on a multi-modal model, determining a word segmentation result, and extracting a plurality of indexes from the question information, wherein the multi-modal model comprises at least two of a word model, a character model, a pinyin model, and a glyph model;

establishing dependency relationships between words according to the word segmentation result, and converting the question information, according to the dependency relationships, into a machine-language query statement for each index;

and querying a corresponding database according to the query statements, determining the query result for each index, and displaying the query results of all indexes in the same coordinate system.

In a second aspect, an embodiment of the present invention further provides an apparatus for obtaining answers to a multi-index question, including:

a question acquisition module, configured to acquire question information input by a user, wherein the question information comprises a plurality of indexes with the same attribute;

a word segmentation module, configured to perform word segmentation on the question information based on a multi-modal model, determine a word segmentation result, and extract a plurality of indexes from the question information, wherein the multi-modal model comprises at least two of a word model, a character model, a pinyin model, and a glyph model;

a processing module, configured to establish dependency relationships between words according to the word segmentation result and to convert the question information, according to the dependency relationships, into a machine-language query statement for each index;

and a query display module, configured to query a corresponding database according to the query statements, determine the query result for each index, and display the query results of all indexes in the same coordinate system.

In the solution provided by the first aspect of the embodiments of the present invention, the multi-modal model segments the question information input by the user, a plurality of indexes with the same attribute are extracted, and dependency relationships between words are established; the question information is then converted into machine-language query statements according to the dependency relationships, so that the query result of each index can be determined quickly and the results of the plurality of indexes displayed together. Because word segmentation is based on a multi-modal model, semantic analysis and therefore the segmentation result are more accurate. Because the dependency relationships between words describe the semantics of the sentence completely, the query statements are more accurate and so are the results they return. Extracting indexes with the same attribute and converting the question information into one query statement per index simplifies the original question and allows the result for each index to be queried accurately. Displaying the query results of all indexes together makes it easy for the user to compare them.

In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.

Drawings

To illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of a method for obtaining answers to a multi-index question according to an embodiment of the present invention;

Fig. 2 is a flowchart of a specific method for performing word segmentation based on a multi-modal model in the method for obtaining answers to multi-index questions according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a bidirectional long short-term memory (LSTM) recurrent neural network model according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of a Stack-LSTM-based syntactic dependency tree model according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of a display of query results according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of an apparatus for obtaining answers to a multi-index question according to an embodiment of the present invention.

Detailed Description

In the description of the present invention, it is to be understood that the terms "center", "longitudinal", "lateral", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", and the like, indicate orientations and positional relationships based on those shown in the drawings, and are used only for convenience of description and simplicity of description, and do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be considered as limiting the present invention.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless specifically defined otherwise.

In the present invention, unless otherwise expressly specified or limited, the terms "mounted," "connected," "secured," and the like are to be construed broadly and can, for example, be fixedly connected, detachably connected, or integrally connected; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.

An embodiment of the invention provides a method for obtaining answers to multi-index questions, which is used to query the answers to such questions. Referring to Fig. 1, the method includes:

Step 101: Acquire question information input by a user, where the question information includes a plurality of indexes with the same attribute.

In this embodiment, "question information" is whatever the user enters when making a query; it need not be phrased as a question. For example, a user who wants to look up China's GDP may enter just "China GDP" or may ask "What is China's GDP?". The question information contains one or more indexes, where an "index" is a content item contained in the question information; for example, the question information "GDP of China" contains the two indexes "China" and "GDP". Each index has attributes, which include semantic attributes (such as person name or place name) and grammatical attributes (such as subject or adjective); this embodiment does not limit them. If the question information contains a plurality of indexes and some or all of them share the same attribute, the question is a multi-index question, and the query procedure of this embodiment applies. For example, the question information "GDP of China, the United States, and Japan" contains the four indexes "China", "United States", "Japan", and "GDP"; because "China", "United States", and "Japan" are all country names and thus share the same attribute, the question is a multi-index question. Alternatively, the question information "per capita GDP and per capita steel consumption of China" contains the three indexes "China", "per capita GDP", and "per capita steel consumption"; because "per capita GDP" and "per capita steel consumption" are both terms qualified by "China", they can be regarded as two indexes with the same attribute.
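The multi-index test described above can be sketched as follows. This is a minimal illustration only: the function names and the attribute labels "country" and "metric" are assumptions made for the example, and in the described method the attributes would come from the segmentation model rather than being supplied by hand.

```python
from collections import defaultdict


def group_indexes_by_attribute(indexes):
    """Group (index, attribute) pairs by attribute, preserving input order."""
    groups = defaultdict(list)
    for name, attribute in indexes:
        groups[attribute].append(name)
    return dict(groups)


def is_multi_index_question(indexes):
    """A question is multi-index if at least two indexes share an attribute."""
    groups = group_indexes_by_attribute(indexes)
    return any(len(names) >= 2 for names in groups.values())


# Indexes of the example question "GDP of China, the United States, and Japan".
# The attribute labels are hypothetical.
example = [
    ("China", "country"),
    ("United States", "country"),
    ("Japan", "country"),
    ("GDP", "metric"),
]
groups = group_indexes_by_attribute(example)
print(groups)
print(is_multi_index_question(example))
```

A question such as "GDP of China", whose two indexes have different attributes, would not count as multi-index under this check.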

Step 102: Perform word segmentation on the question information based on a multi-modal model, determine a word segmentation result, and extract a plurality of indexes from the question information, where the multi-modal model includes at least two of a word model, a character model, a pinyin model, and a glyph model.

In this embodiment, the multi-modal model is built on words, characters, pinyin, and glyphs, so natural language is processed along several dimensions; this is more accurate than conventional processing based on characters alone. Specifically, a "word" is a segment determined by a conventional word segmentation model; a "character" is the basic unit of the language, for example a single Chinese character; "pinyin" is an attribute specific to Chinese characters, since the pronunciation of each character also carries some semantic information, as with polyphonic characters; and a "glyph" is the written form of a character in a pictographic script such as Chinese, where the shape of each character may itself carry meaning. For example, the text "我是一个中国人" ("I am a Chinese person") is split into "我/是/一个/中国人" (I / am / a / Chinese person) when processed by word; into "我/是/一/个/中/国/人" when processed by character; and into "wo/shi/yi/ge/zhong/guo/ren" when processed by pinyin. When processed by glyph, each character is mapped to an image of that character and then processed accordingly.
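As a rough illustration of how one sentence yields several input views, the sketch below derives the word, character, and pinyin views. It assumes the example sentence is 我是一个中国人 (inferred from the pinyin sequence wo/shi/yi/ge/zhong/guo/ren given in the text). The pinyin table and the vocabulary are hypothetical and cover only this sentence, and greedy maximum matching merely stands in for whatever word segmenter is actually used. The glyph view is omitted because it requires rendering character images.

```python
# Hypothetical pinyin lookup covering only the example sentence; a real
# system would use a full pronunciation dictionary.
PINYIN = {"我": "wo", "是": "shi", "一": "yi", "个": "ge",
          "中": "zhong", "国": "guo", "人": "ren"}


def character_view(text):
    """One unit per character."""
    return list(text)


def pinyin_view(text):
    """One pinyin syllable per character."""
    return [PINYIN[ch] for ch in text]


def word_view(text, vocabulary):
    """Greedy forward maximum matching, standing in for a word segmenter."""
    longest = max(len(word) for word in vocabulary)
    words, i = [], 0
    while i < len(text):
        for size in range(min(longest, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or candidate in vocabulary:
                words.append(candidate)
                i += size
                break
    return words


sentence = "我是一个中国人"
vocabulary = {"我", "是", "一个", "中国人"}
views = {
    "word": word_view(sentence, vocabulary),
    "character": character_view(sentence),
    "pinyin": pinyin_view(sentence),
}
for name, units in views.items():
    print(name, "/".join(units))
```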

Each model in the multi-modal model performs semantic analysis on its own kind of input; for example, the word model performs semantic analysis based on words. The most appropriate and accurate word segmentation result is finally determined from all of the models in the multi-modal model (such as the word model, character model, pinyin model, and glyph model).

In addition, the attribute of each segmented word can be determined from the segmentation process and its result, which in turn determines which segmented words share the same attribute; segmented words that share an attribute can be used as indexes with the same attribute. An index generally contains one or more segmented words. For example, the index "GDP" consists of the single word "GDP", whereas the index "per capita steel consumption" may consist of the three words "per capita", "steel", and "consumption". How many segmented words an index contains depends on the actual situation and the segmentation scheme; for example, "per capita steel consumption" may also be treated as a single word, in which case that one word corresponds to one index.

Step 103: Establish dependency relationships between words according to the word segmentation result, and convert the question information, according to the dependency relationships, into a machine-language query statement for each index.

In this embodiment, the dependency relationships between words may be established by a deep learning model. The dependency relationships reveal the syntactic structure of the question information (dependency parsing), so the question information can be parsed into a dependency syntax tree and the natural language translated into a query statement that a machine can understand. Because the question information contains a plurality of indexes with the same attribute, query statements can be generated per index, that is, one query statement for each index. For example, in this embodiment the question information "GDP of China, the United States, and Japan" is converted into three query statements: "GDP of China", "GDP of the United States", and "GDP of Japan".
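The per-index expansion can be sketched as below, under several assumptions: the dependency arcs are written by hand rather than produced by a parser, the labels "nmod" and "conj" are borrowed from Universal Dependencies purely for illustration, and the SQL table `indicators` with its columns is hypothetical. The patent does not specify the query language, so SQL is only one possible target.

```python
# Dependency arcs as (head, dependent, label). Hand-written here; in the
# described method they would come from the dependency parser.
ARCS = [
    ("GDP", "China", "nmod"),
    ("China", "United States", "conj"),
    ("China", "Japan", "conj"),
]


def coordinated_modifiers(arcs, head):
    """Collect the modifiers of `head` plus everything coordinated with them."""
    modifiers = [dep for h, dep, label in arcs if h == head and label == "nmod"]
    expanded = []
    for modifier in modifiers:
        expanded.append(modifier)
        expanded.extend(
            dep for h, dep, label in arcs if h == modifier and label == "conj"
        )
    return expanded


def build_queries(arcs, head):
    """Return one parameterised (sql, params) pair per index."""
    # Table and column names are hypothetical.
    sql = ("SELECT year, value FROM indicators "
           "WHERE region = ? AND metric = ? ORDER BY year")
    return [(sql, (region, head)) for region in coordinated_modifiers(arcs, head)]


queries = build_queries(ARCS, "GDP")
for sql, params in queries:
    print(params)
```

Each coordinated country name yields its own statement, mirroring the three queries "GDP of China", "GDP of the United States", and "GDP of Japan".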

Step 104: Query the corresponding database according to the query statements, determine the query result for each index, and display the query results of all indexes in the same coordinate system.

In this embodiment, a database is prepared in advance for user queries. For financial questions specifically, financial text can be collected in various ways (such as web crawling) and used to build a database of financial data. Once the query statements are determined, the database is queried, and the query result for each index is extracted and displayed for the user. The query results of all indexes with the same attribute are displayed in the same coordinate system, that is, a coordinate system with the same horizontal axis and the same vertical axis, so that the user can easily compare results across indexes.
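A minimal sketch of the query-and-align step follows, using Python's built-in `sqlite3` module with an in-memory database. The schema, the region names "A" and "B", and all values are placeholders invented for the example and are not real economic data. The sketch stops at producing series on a common horizontal axis, leaving the actual chart to a plotting library.

```python
import sqlite3


def fetch_series(connection, region, metric):
    """Run one per-index query and return its result as {year: value}."""
    rows = connection.execute(
        "SELECT year, value FROM indicators "
        "WHERE region = ? AND metric = ? ORDER BY year",
        (region, metric),
    ).fetchall()
    return dict(rows)


def align_on_shared_axis(series_by_index):
    """Put every series on one x-axis; missing points become None."""
    x_axis = sorted({x for series in series_by_index.values() for x in series})
    aligned = {
        name: [series.get(x) for x in x_axis]
        for name, series in series_by_index.items()
    }
    return x_axis, aligned


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE indicators (region TEXT, metric TEXT, year INTEGER, value REAL)"
)
# Placeholder rows only; they are not real data.
connection.executemany(
    "INSERT INTO indicators VALUES (?, ?, ?, ?)",
    [
        ("A", "GDP", 2001, 1.0), ("A", "GDP", 2002, 2.0),
        ("B", "GDP", 2002, 3.0), ("B", "GDP", 2003, 4.0),
    ],
)
series = {region: fetch_series(connection, region, "GDP") for region in ("A", "B")}
x_axis, aligned = align_on_shared_axis(series)
print(x_axis)
print(aligned)
```

Aligning all series to one list of x values is what allows several curves to be drawn against the same horizontal and vertical axes.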

With the method for obtaining answers to multi-index questions provided by this embodiment, the multi-modal model segments the question information input by the user, a plurality of indexes with the same attribute are extracted, and dependency relationships between words are established; the question information is then converted into machine-language query statements according to the dependency relationships, so that the query result of each index can be determined quickly and the results of the plurality of indexes displayed together. Because word segmentation is based on a multi-modal model, semantic analysis and therefore the segmentation result are more accurate. Because the dependency relationships between words describe the semantics of the sentence completely, the query statements are more accurate and so are the results they return. Extracting indexes with the same attribute and converting the question information into one query statement per index simplifies the original question and allows the result for each index to be queried accurately. Displaying the query results of all indexes together makes it easy for the user to compare them.

On the basis of the above embodiment, and referring to Fig. 2, "performing word segmentation on the question information based on the multi-modal model" in step 102 includes steps 1021 to 1025:

Step 1021: Determine an initial word segmentation result with a preset word segmentation model, and determine a first semantic of the question information with the word as the basic unit.

In this embodiment, the word segmentation model may be an existing one, such as an off-the-shelf Chinese word segmenter. From the initial segmentation result, a word model is built with words as the basic units; the word model may specifically be a long short-term memory (LSTM) neural network, and the semantics of the question information can be determined from it. For example, for the sentence "我是一个中国人" ("I am a Chinese person"), the input to the word model is "我/是/一个/中国人", and the corresponding semantics are output.

Step 1022: Determine all characters of the question information, and determine a second semantic of the question information with the character as the basic unit.

A conventional word segmentation model treats only words as linguistic units and ignores the character-level semantics of Chinese. In this embodiment, a character model is built with characters as the basic units; it may likewise be an LSTM network, and sentence-level semantics can be derived from it. For example, for the sentence "我是一个中国人" ("I am a Chinese person"), the input to the character model is "我/是/一/个/中/国/人".

Step 1023: Determine the pinyin of each character and a pinyin vector for each pinyin, determine the first character vector corresponding to the pinyin vectors through a convolutional neural network, and then determine a third semantic of the question information from the first character vectors with the character as the basic unit.

Chinese characters have a phonetic attribute: the pronunciation of each character carries some semantic information. In this embodiment, each Chinese character is therefore mapped to its pinyin, and each pinyin letter is represented by a vector; a convolutional neural network (CNN) then derives a character vector, namely the first character vector, from the pinyin vectors, and a further LSTM layer combines the character semantics into the semantics of the sentence. For example, for the sentence "我是一个中国人" ("I am a Chinese person"), the input is "wo/shi/yi/ge/zhong/guo/ren".
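To make the pinyin-to-character-vector step concrete, the sketch below applies a one-dimensional convolution with max-over-time pooling to the letter embeddings of each syllable. It is an untrained toy in pure Python: the embedding size, window width, and random initialisation are all assumptions, the source does not describe the CNN architecture, and the subsequent LSTM layer that combines character vectors into sentence semantics is left out.

```python
import random

random.seed(0)
DIM = 4      # size of each letter embedding and of the output character vector
WIDTH = 2    # convolution window, in letters

# Randomly initialised (untrained) letter embeddings and convolution filters.
LETTERS = "abcdefghijklmnopqrstuvwxyz"
EMBED = {c: [random.uniform(-1, 1) for _ in range(DIM)] for c in LETTERS}
FILTERS = [
    [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(WIDTH)]
    for _ in range(DIM)
]


def char_vector_from_pinyin(pinyin):
    """1-D convolution over letter embeddings, then max-over-time pooling."""
    letters = [EMBED[c] for c in pinyin]
    # Pad with zero vectors so that very short syllables still fill a window.
    while len(letters) < WIDTH:
        letters.append([0.0] * DIM)
    vector = []
    for filt in FILTERS:
        responses = [
            sum(filt[k][d] * letters[start + k][d]
                for k in range(WIDTH) for d in range(DIM))
            for start in range(len(letters) - WIDTH + 1)
        ]
        vector.append(max(responses))  # keep the strongest response per filter
    return vector


syllables = ["wo", "shi", "yi", "ge", "zhong", "guo", "ren"]
vectors = [char_vector_from_pinyin(s) for s in syllables]
print(len(vectors), len(vectors[0]))
```

Pooling over window positions gives every syllable a vector of the same length regardless of how many letters it has, which is what lets the character vectors be fed to a sequence model afterwards.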

Step 1024: Generate a glyph image for each character, convert the glyph image into a corresponding second character vector, and determine a fourth semantic of the question information from the second character vectors with the character as the basic unit.

Because Chinese characters are pictographic and the shape of each character carries rich semantics, this embodiment adds a glyph model: each Chinese character is treated as an image, and a convolutional neural network of the kind used in machine vision converts each glyph image into a vector. This captures the graphical meaning of the character; a further LSTM layer then combines the character semantics into the semantics of the sentence.

Step 1025: Determine the semantic information of the question information from the first, second, third, and fourth semantics taken together, perform word segmentation on the question information according to that semantic information, determine the final word segmentation result, and extract the plurality of indexes with the same attribute from the question information.

In this embodiment, the overall semantics are determined by the multi-modal Chinese natural language processing model built on words, characters, pinyin, and glyphs; an attention-based neural network then determines the importance, or weight, of the four models to produce the final result. Within the multi-modal model as a whole, the first to fourth semantics are intermediate results and need not be shown to the user: the model converts the question information into the corresponding words, characters, pinyin, and glyphs, takes these as input, and produces the final word segmentation result. In addition, the attribute of each segmented word can be determined from the segmentation process and its result, which in turn determines which segmented words share the same attribute; segmented words that share an attribute can be used as indexes with the same attribute.
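The attention-based weighting of the four semantics can be illustrated with a softmax-weighted sum. This is a sketch under assumptions: the semantic vectors and the scores below are toy values, whereas in a trained model the scores would be produced by learned parameters that the source does not describe.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into weights that are positive and sum to 1."""
    shift = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(semantics, scores):
    """Weight each model's semantic vector by its attention weight and sum."""
    weights = softmax(scores)
    dim = len(semantics[0])
    fused = [
        sum(w * vec[d] for w, vec in zip(weights, semantics))
        for d in range(dim)
    ]
    return weights, fused


# Toy vectors for the word, character, pinyin, and glyph semantics.
semantics = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
# Equal scores give each model the same weight; learned scores would differ.
weights, fused = fuse(semantics, [0.0, 0.0, 0.0, 0.0])
print(weights)
print(fused)
```

Raising one model's score relative to the others shifts the fused vector towards that model's semantics, which is how the attention layer expresses the relative importance of the four models.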

On the basis of the above embodiment, after "performing word segmentation on the question information based on the multi-modal model" in step 102, the method further includes:

performing semantic understanding on the question information according to the word segmentation result, determining from the result whether the question information needs to be rewritten, and correcting the question information when it does.

Some financial questions involve specialist terminology. Requiring the user to type a precise question by hand would demand a high level of expertise and would be slow and laborious, so this embodiment corrects the question information based on semantic understanding, making the subsequently generated query statement more accurate. For example, if the user enters a terse phrase such as "ten years, ten-fold stocks", the question may be rewritten as "stocks whose rights-adjusted price has increased ten-fold over the past ten years", or similar.

On the basis of the above embodiment, performing word segmentation on the question information based on the multi-modal model includes:

performing word segmentation on the question information based on the multi-modal model, and tagging the segmented words with parts of speech based on a preset bidirectional LSTM recurrent neural network model.

In this embodiment, the question information is segmented and tagged with parts of speech, which may be done with a preset bidirectional long short-term memory (LSTM) recurrent neural network model. A bidirectional language model is first trained on a Chinese corpus, and the trained language model is used to initialize the word vectors. After initialization, the bidirectional LSTM model produces a word vector for each word position, and these vectors are fed to a classification model that determines the part-of-speech tag of each word. For example, Fig. 3 shows the part-of-speech tagging process for the sentence "我是一个中国人" ("I am a Chinese person").

On the basis of the above embodiment, the dependency relationships between words in step 103 may be established with a shift-reduce algorithm based on a stack neural network (Stack-LSTM). At each step, the algorithm uses a classifier to decide whether the next action is a shift or a reduce. It uses two LSTM networks to model, respectively, the words that have already been built into the syntax tree (the stack LSTM) and the words that have not yet been (the queue LSTM). Fig. 4 shows the Stack-LSTM-based syntactic dependency tree model. When generating the query statements in step 103, a bidirectional language model may likewise be trained on a Chinese corpus in advance and used to initialize the word vectors; the bidirectional LSTM model then produces a word vector for each word position, and a classification model determines the part-of-speech tag of each word from those vectors. The dependency relationships between words are then established by the Stack-LSTM shift-reduce algorithm, and the question information is converted, according to the dependency relationships, into a machine-language query statement for each index.
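The shift-reduce mechanics can be sketched independently of the neural classifier. In the sketch below a scripted action sequence stands in for the classifier's decisions, and the reduce action is split into two variants (which element becomes the head), as in the commonly used arc-standard transition system. That split, the action names, and the example words are assumptions for illustration; the Stack-LSTM scoring described in the text is not implemented.

```python
def parse(words, actions):
    """Apply shift/reduce transitions; return (arcs, stack, buffer).

    Arcs are (head, dependent) pairs. `actions` stands in for the decisions
    that the classifier would make at each step.
    """
    stack, buffer, arcs = [], list(words), []
    for action in actions:
        if action == "SHIFT":
            # Move the next unprocessed word onto the stack.
            stack.append(buffer.pop(0))
        elif action == "REDUCE_LEFT":
            # The top of the stack becomes the head of the item below it.
            head = stack.pop()
            dependent = stack.pop()
            arcs.append((head, dependent))
            stack.append(head)
        elif action == "REDUCE_RIGHT":
            # The item below the top becomes the head of the top item.
            dependent = stack.pop()
            arcs.append((stack[-1], dependent))
        else:
            raise ValueError(f"unknown action: {action}")
    return arcs, stack, buffer


words = ["China", "per-capita", "GDP"]
actions = ["SHIFT", "SHIFT", "SHIFT", "REDUCE_LEFT", "REDUCE_LEFT"]
arcs, stack, buffer = parse(words, actions)
print(arcs)
print(stack, buffer)
```

Parsing finishes when the buffer is empty and a single root remains on the stack; here both "China" and "per-capita" end up attached to "GDP".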

On the basis of the above embodiment, "displaying the query results of all indexes in the same coordinate system" in step 104 includes:

determining the parameter to be displayed for each index in the question information, and determining a display mode for the parameter, where the display mode includes one or more of a line chart, a bar chart, a pie chart, and a table; and displaying the parameter corresponding to each index in that display mode.

In this embodiment, semantic analysis of the question information determines which parameter needs to be displayed. In general, the parameter to be displayed is the last index in the question information as phrased in Chinese, where the parameter follows its qualifiers; for example, for the question information "GDP of China, the United States, and Japan" the parameter to be displayed is "GDP". A parameter can be displayed in several display modes, and this embodiment may select one or more of them. Preferably, one display mode is selected per parameter, and the parameters of indexes with the same attribute are displayed in the same display mode so that the user can compare them easily. Taking the user query "GDP of China, the United States, and Japan" as an example, Fig. 5 shows one way of displaying the query results: curve 1 represents the GDP of the United States, curve 2 the GDP of China, and curve 3 the GDP of Japan.

Optionally, the rate of change of each index's parameter (such as a proportional growth rate) may also be determined and displayed alongside the parameter. The rate of change is typically displayed as a line chart.
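One simple reading of the rate of change is the relative change between consecutive periods, sketched below. The source does not define the formula, so this period-over-period definition is an assumption, and the input values are arbitrary.

```python
def change_rates(values):
    """Relative change from each period to the next.

    The first entry is None because it has no predecessor; an entry is also
    None when the previous value is zero and the rate is undefined.
    """
    if not values:
        return []
    rates = [None]
    for previous, current in zip(values, values[1:]):
        rates.append(None if previous == 0 else (current - previous) / previous)
    return rates


rates = change_rates([100.0, 110.0, 99.0])
print(rates)
```

The resulting list has the same length as the input series, so it can be plotted against the same horizontal axis as the parameter itself.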

With the method for obtaining answers to multi-index questions provided by this embodiment, the multi-modal model segments the question information input by the user, a plurality of indexes with the same attribute are extracted, and dependency relationships between words are established; the question information is then converted into machine-language query statements according to the dependency relationships, so that the query result of each index can be determined quickly and the results of the plurality of indexes displayed together. Because word segmentation is based on a multi-modal model, semantic analysis and therefore the segmentation result are more accurate. Because the dependency relationships between words describe the semantics of the sentence completely, the query statements are more accurate and so are the results they return. Extracting indexes with the same attribute and converting the question information into one query statement per index simplifies the original question and allows the result for each index to be queried accurately. Displaying the query results of all indexes together makes it easy for the user to compare them. Where necessary, the question is rewritten based on semantics so that a more accurate query statement is generated. The display mode is determined from the indexes and the rate of change of the data is displayed, making the query results easier for the user to review.

The method flow for obtaining answers to multi-index questions has been described in detail above. The method can also be implemented by a corresponding apparatus, whose structure and function are described in detail below.

As shown in Fig. 6, the apparatus for obtaining answers to multi-index questions provided in the embodiment of the present invention includes:

the question acquiring module 61 is used for acquiring question information input by a user, and the question information comprises a plurality of indexes with the same attribute;

the word segmentation module 62 is configured to perform word segmentation processing on the question information based on a multi-modal model, determine a word segmentation result, and extract the plurality of indexes in the question information, where the multi-modal model includes at least two of a word model, a character model, a pinyin model, and a font model;

the processing module 63 is configured to establish a dependency relationship between words according to the word segmentation result, and convert the question information into a query statement in machine-language form corresponding to each index according to the dependency relationship;

and the query display module 64 is configured to query the corresponding database according to the query statement, determine a query result corresponding to each index, and display the query results of all the indexes in the same coordinate system.
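The hand-off between the processing module and the query display module can be sketched as follows: one query statement is built per extracted index, and each statement is run against a database. This is only a sketch under stated assumptions. The table `financials(indicator, year, value)`, the SQL template, the function names `build_queries` and `run_queries`, and the sample figures are all hypothetical; the embodiment does not specify the query language or the schema.

```python
import sqlite3
from typing import Dict, List, Tuple

Query = Tuple[str, Tuple[object, ...]]
Rows = List[Tuple[int, float]]

# Hypothetical table layout: financials(indicator TEXT, year INTEGER, value REAL).
SQL_TEMPLATE = (
    "SELECT year, value FROM financials "
    "WHERE indicator = ? AND year BETWEEN ? AND ? ORDER BY year"
)


def build_queries(indexes: List[str], first_year: int, last_year: int) -> Dict[str, Query]:
    """Build one parameterized query statement per extracted index."""
    return {name: (SQL_TEMPLATE, (name, first_year, last_year)) for name in indexes}


def run_queries(conn: sqlite3.Connection, queries: Dict[str, Query]) -> Dict[str, Rows]:
    """Run each statement and keep the results keyed by index name."""
    return {name: conn.execute(sql, params).fetchall()
            for name, (sql, params) in queries.items()}


# Illustrative data only; the figures are made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE financials (indicator TEXT, year INTEGER, value REAL)")
conn.executemany(
    "INSERT INTO financials VALUES (?, ?, ?)",
    [("revenue", 2017, 100.0), ("revenue", 2018, 120.0),
     ("net_profit", 2017, 20.0), ("net_profit", 2018, 30.0)],
)
results = run_queries(conn, build_queries(["revenue", "net_profit"], 2017, 2018))
```

Keeping the results keyed by index name leaves each index's series separate, which is convenient when all of them are later drawn in one coordinate system.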

On the basis of the above embodiment, the word segmentation module 62 includes:

the word processing unit is used for determining an initial word segmentation result through a preset word segmentation model and determining a first semantic meaning of the question information by taking a word as a basic unit;

the character processing unit is used for determining all characters of the question information and determining a second semantic meaning of the question information by taking the characters as a basic unit;

the pinyin processing unit is used for determining pinyin corresponding to each character, determining a pinyin vector of each pinyin, determining a first character vector corresponding to the pinyin vector through a convolutional neural network, and further determining a third semantic meaning of the question information according to the first character vector by taking the character as a basic unit;

the font processing unit is used for generating a corresponding font picture for each character, converting the font picture into a corresponding second character vector, and further determining a fourth semantic meaning of the question information according to the second character vector by taking the character as a basic unit;

the multi-modal word segmentation unit is used for comprehensively determining semantic information of the question information according to the first semantic meaning, the second semantic meaning, the third semantic meaning and the fourth semantic meaning, performing word segmentation processing on the question information according to the semantic information, determining a final word segmentation result, and extracting a plurality of indexes with the same attribute in the question information.
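The embodiment does not state how the four semantics are combined. One simple possibility, shown here purely as an assumption, is to concatenate the per-character feature vectors produced by the word, character, pinyin, and font units, with the word-level vector repeated for every character of that word. The function name `fuse_modalities` and the toy feature values are hypothetical.

```python
from typing import List, Sequence

Vector = Sequence[float]


def fuse_modalities(word: Sequence[Vector], char: Sequence[Vector],
                    pinyin: Sequence[Vector], font: Sequence[Vector]) -> List[List[float]]:
    """Concatenate the four per-character feature vectors position by position."""
    if len({len(word), len(char), len(pinyin), len(font)}) != 1:
        raise ValueError("every modality must provide one vector per character")
    return [[*w, *c, *p, *f] for w, c, p, f in zip(word, char, pinyin, font)]


# Toy two-character input with made-up two-dimensional features.
# Both characters belong to one word, so the word vector is repeated.
fused = fuse_modalities(
    word=[[0.1, 0.2], [0.1, 0.2]],
    char=[[0.3, 0.4], [0.5, 0.6]],
    pinyin=[[0.7, 0.8], [0.9, 1.0]],
    font=[[1.1, 1.2], [1.3, 1.4]],
)
```

A downstream segmentation model would then consume one fused vector per character. Other fusion schemes, such as weighted sums or learned gating, would fit the same interface.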

On the basis of the above embodiment, the apparatus further comprises a rewriting module;

after the word segmentation module 62 performs word segmentation processing on the question information based on the multi-modal model, the rewriting module is configured to perform semantic understanding processing on the question information according to the word segmentation result, determine whether the question information needs to be rewritten according to the semantic understanding processing result, and correct the question information when rewriting is needed.

On the basis of the above embodiment, the word segmentation module 62 is configured to:

On the basis of the above embodiment, the query display module 64 is configured to:

determining parameters required to be displayed by each index in the problem information, and determining a display mode corresponding to the parameters, wherein the display mode comprises one or more of a curve graph, a bar graph, a pie graph and a table;

and displaying the parameters corresponding to each index in the display mode.
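The two display steps above can be sketched as a small selection rule followed by aligning all indexes on a shared horizontal axis. The names `DisplayMode`, `MODE_RULES`, `choose_display_mode`, and `build_chart_spec` are hypothetical, and the mapping from parameter kind to display mode is an assumption. Only the pairing of a change rate with a line (curve) graph follows the description.

```python
from enum import Enum
from typing import Dict, List, Optional, Tuple


class DisplayMode(Enum):
    CURVE = "curve graph"
    BAR = "bar graph"
    PIE = "pie graph"
    TABLE = "table"


# Hypothetical selection rules; only "change rate -> curve" echoes the description.
MODE_RULES = {
    "amount": DisplayMode.BAR,
    "change_rate": DisplayMode.CURVE,
    "share": DisplayMode.PIE,
}


def choose_display_mode(parameter_kind: str) -> DisplayMode:
    """Pick a display mode for a parameter, falling back to a table."""
    return MODE_RULES.get(parameter_kind, DisplayMode.TABLE)


def build_chart_spec(results: Dict[str, List[Tuple[int, float]]],
                     parameter_kind: str) -> dict:
    """Align every index on one shared x axis so all series share a coordinate system."""
    x_axis = sorted({x for rows in results.values() for x, _ in rows})
    series: Dict[str, List[Optional[float]]] = {
        name: [dict(rows).get(x) for x in x_axis] for name, rows in results.items()
    }
    return {"mode": choose_display_mode(parameter_kind), "x": x_axis, "series": series}


# Made-up query results for two indexes; net_profit has no 2017 value.
spec = build_chart_spec(
    {"revenue": [(2017, 100.0), (2018, 120.0)], "net_profit": [(2018, 30.0)]},
    parameter_kind="amount",
)
```

Missing points are kept as `None` so that every series has the same length as the shared axis, which lets a plotting layer draw all indexes in one coordinate system without shifting values.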

According to the device for obtaining answers to multi-index questions provided by the embodiment of the invention, word segmentation is performed on the question information input by the user using the multi-modal model, so that a plurality of indexes with the same attribute can be extracted and the dependency relationship between words can be established; the question information is then converted into query statements in machine-language form according to the dependency relationship, so that the query result of each index can be determined quickly using the query statements and the query results of the plurality of indexes can be displayed simultaneously. Because the device performs word segmentation based on the multi-modal model, semantic analysis is more accurate and so are the word segmentation results; the dependency relationship between words describes the semantic information of the sentence completely and comprehensively, which makes the query statements more accurate and improves query accuracy; extracting indexes with the same attribute and converting the question information into a query statement for each index simplifies the original question information and allows the query result of each index to be retrieved accurately; and displaying the query results of all indexes simultaneously makes it easy for the user to compare them. Where necessary, the question is rewritten based on its semantics to generate a more accurate query statement. The display mode is determined based on the index and the rate of change of the data is also displayed, which makes it easier for the user to review the query results.

The above description covers only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims

1. A method for obtaining answers to multi-index questions, comprising:

2. The method of claim 1, wherein performing word segmentation processing on the question information based on the multi-modal model comprises:

determining an initial word segmentation result through a preset word segmentation model, and determining a first semantic meaning of the question information by taking a word as a basic unit;

determining all characters of the question information, and determining a second semantic meaning of the question information by taking the characters as basic units;

determining pinyin corresponding to each character, determining a pinyin vector of each pinyin, determining a first character vector corresponding to the pinyin vector through a convolutional neural network, and further determining a third semantic meaning of the question information according to the first character vector by taking the character as a basic unit;

generating a corresponding font image for each character, converting the font image into a corresponding second character vector, and determining a fourth semantic meaning of the question information according to the second character vector by taking the character as a basic unit;

comprehensively determining semantic information of the question information according to the first semantic meaning, the second semantic meaning, the third semantic meaning and the fourth semantic meaning, performing word segmentation processing on the question information according to the semantic information, determining a final word segmentation result, and extracting a plurality of indexes with the same attribute in the question information.

3. The method of claim 1, further comprising, after performing word segmentation processing on the question information based on the multi-modal model:

performing semantic understanding processing on the question information according to the word segmentation result, judging whether the question information needs to be rewritten according to the semantic understanding processing result, and correcting the question information when rewriting is needed.

4. The method of claim 1, wherein performing word segmentation processing on the question information based on the multi-modal model comprises:

5. The method of claim 1, wherein displaying the query results of all the indexes in the same coordinate system comprises:

and displaying the parameters corresponding to each index in the display mode.

6. An apparatus for obtaining answers to a multi-index question, comprising:

7. The apparatus of claim 6, wherein the word segmentation module comprises:

the character processing unit is used for determining all characters of the question information and determining a second semantic meaning of the question information by taking the characters as a basic unit;

the pinyin processing unit is used for determining pinyin corresponding to each character, determining a pinyin vector of each pinyin, determining a first character vector corresponding to the pinyin vector through a convolutional neural network, and further determining a third semantic meaning of the question information according to the first character vector by taking the character as a basic unit;

and the multi-modal word segmentation unit is used for comprehensively determining semantic information of the question information according to the first semantic meaning, the second semantic meaning, the third semantic meaning and the fourth semantic meaning, performing word segmentation processing on the question information according to the semantic information, determining a final word segmentation result, and extracting a plurality of indexes with the same attribute in the question information.

8. The apparatus of claim 6, further comprising a rewrite module;

after the word segmentation module performs word segmentation processing on the question information based on the multi-modal model, the rewrite module is used for performing semantic understanding processing on the question information according to the word segmentation result, judging whether the question information needs to be rewritten according to the semantic understanding processing result, and correcting the question information when rewriting is needed.

9. The apparatus of claim 6, wherein the word segmentation module is configured to:

10. The apparatus of claim 6, wherein the query display module is configured to:

and displaying the parameters corresponding to each index in the display mode.