CN117093693A - Intelligent question-answering method based on NLP - Google Patents


Info

Publication number
CN117093693A
CN117093693A (application CN202311070132.7A; granted publication CN117093693B)
Authority
CN
China
Prior art keywords
entity
training
extraction model
data
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311070132.7A
Other languages
Chinese (zh)
Other versions
CN117093693B (en)
Inventor
韩三普
陈竑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Shenwei Zhixin Technology Co ltd
Original Assignee
Beijing Shenwei Zhixin Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Shenwei Zhixin Technology Co ltd filed Critical Beijing Shenwei Zhixin Technology Co ltd
Priority to CN202311070132.7A priority Critical patent/CN117093693B/en
Publication of CN117093693A publication Critical patent/CN117093693A/en
Application granted granted Critical
Publication of CN117093693B publication Critical patent/CN117093693B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G06F16/3329: Natural language query formulation or dialogue systems (querying of unstructured textual data)
    • G06F16/35: Clustering; Classification (of unstructured textual data)
    • G06F16/367: Ontology (creation of semantic tools, e.g. ontology or thesauri)
    • G06F40/295: Named entity recognition (natural language analysis; phrasal analysis)
    • G06N3/006: Artificial life based on simulated virtual individual or collective life forms, e.g. social simulations or particle swarm optimisation [PSO]
    • G06N3/0442: Recurrent networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G06N3/045: Combinations of networks
    • G06N3/086: Learning methods using evolutionary algorithms, e.g. genetic algorithms or genetic programming
    • G06N3/126: Evolutionary algorithms based on genetic models, e.g. genetic algorithms or genetic programming
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Physiology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Animal Behavior & Ethology (AREA)
  • Human Computer Interaction (AREA)
  • Genetics & Genomics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses an intelligent question-answering method based on NLP, which comprises the following steps: preprocessing acquired corpus data to obtain basic support data; constructing an entity extraction model and an entity relation extraction model by adopting NLP models, and training the entity extraction model and the entity relation extraction model; constructing a question-answer database from the basic support data by adopting the trained entity extraction model and entity relation extraction model; acquiring question sentence information, constructing a query sentence from the question sentence information, and matching target entity data and target entity relations in the question-answer database according to the query sentence; and generating answer information according to the target entity data and the target entity relations to complete the intelligent question answering. The method determines the relations among various entities by constructing a knowledge graph, and builds a query statement from the question sentence input by the user, so that the target entity data and target entity relations are obtained, intelligent question answering is realized, and working efficiency is effectively improved.

Description

Intelligent question-answering method based on NLP
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to an intelligent question-answering method based on NLP.
Background
At present, a large amount of customer service work exists in every industry. Customer service is not only one of the most numerous posts in the service industry, but also has higher staff turnover than other posts. Every time a customer service agent leaves, a new one must be recruited and trained for a certain period before going on duty, so the time cost is high. In addition, conventional customer service agents often communicate with customers by typing, which is inefficient.
Disclosure of Invention
Aiming at the above defects in the prior art, the intelligent question-answering method based on NLP provided by the invention solves the problems existing in the prior art.
In order to achieve the aim of the invention, the invention adopts the following technical scheme: an intelligent question-answering method based on NLP, comprising:
acquiring corpus data comprising entity data and entity relations, and preprocessing the corpus data to obtain basic support data;
constructing an entity extraction model and an entity relation extraction model by using an NLP model, and training the entity extraction model and the entity relation extraction model to obtain a trained entity extraction model and an entity relation extraction model;
According to the basic support data, a trained entity extraction model and an entity relation extraction model are adopted to construct a question-answer database;
acquiring question sentence information sent by a user, obtaining a text to be analyzed, constructing a query sentence by using the text to be analyzed, and matching target entity data and target entity relations in the question-answer database according to the query sentence;
and generating answer information corresponding to the question sentence information according to the target entity data and the target entity relationship, and completing the intelligent question answering.
In one possible implementation manner, obtaining corpus data including entity data and entity relationships, and preprocessing the corpus data to obtain basic support data, including:
acquiring a plurality of pieces of corpus data comprising entity data and entity relations through a web crawler tool, or acquiring the plurality of pieces of corpus data comprising the entity data and the entity relations through a man-machine interaction mode, or acquiring the plurality of pieces of corpus data comprising the entity data and the entity relations in a corpus database;
converting the acquired corpus data into txt format to obtain converted corpus data, and performing cleaning, segmentation and named-entity data enhancement on the converted corpus data to obtain basic support data.
In one possible implementation manner, the cleaning, segmentation and named-entity data enhancement processing is performed on the converted corpus data to obtain basic support data, which includes:
removing repeated content from the converted corpus data to obtain first corpus data;
calling a third-party intelligent error correction API to correct the first corpus data to obtain second corpus data;
removing redundant words in the second corpus data by adopting a regular expression to obtain third corpus data, wherein the redundant words are used for representing preset words to be removed;
performing word segmentation on the third corpus data by using the Jieba word segmentation tool, and removing stop words from the third corpus data to obtain fourth corpus data;
dividing the fourth corpus data according to a preset text length limiting threshold value to obtain at least one fifth corpus data;
performing paraphrasing replacement on text entity words in the fifth corpus data, or randomly exchanging the positions of two adjacent words, or performing back-translation enhancement on the fifth corpus data through third-party translation software to obtain enhanced corpus data;
and taking the fifth corpus data and the enhanced corpus data together as basic support data.
In one possible implementation manner, the entity extraction model and the entity relation extraction model are constructed by using an NLP model, and the entity extraction model and the entity relation extraction model are trained to obtain a trained entity extraction model and entity relation extraction model, which comprises the following steps:
constructing an entity extraction model by adopting the ALBERT-BiLSTM-CRF model in the NLP model, and constructing an entity relation extraction model by adopting the Multi-Att_BiGRU (multi-attention BiGRU) model in the NLP model;
training the entity extraction model and the entity relation extraction model by adopting training data to obtain the entity extraction model and the entity relation extraction model after training.
In one possible implementation, the entity extraction model includes a first input layer, an ALBERT layer, a BiLSTM layer, a CRF layer, and a first output layer;
the entity relation extraction model comprises a second input layer, a representation layer, a BiGRU layer, a character attention layer, a sentence attention layer and a second output layer;
training the entity extraction model and the entity relation extraction model by using training data, wherein the training comprises the following steps:
acquiring training corpus data, preprocessing the training corpus data, and obtaining preprocessed training corpus data;
marking the preprocessed training corpus data in a BIO marking mode to obtain the preprocessed training corpus data and the corresponding BIO labels;
training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO labels to obtain a trained entity extraction model;
performing relationship labeling on the preprocessed training corpus data to obtain the preprocessed training corpus data and the corresponding relationship labels;
and training the entity relation extraction model according to the preprocessed training corpus data and the corresponding relation labels to obtain a trained entity relation extraction model.
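The BIO marking mode referred to above tags each token as the Beginning of an entity, the Inside of an entity, or Outside any entity. A minimal sketch, with a hypothetical sentence and hypothetical entity spans (not from the patent):

```python
# Illustrative sketch of BIO labeling for a preprocessed training sentence.
# The sentence, entity spans and entity type "LOC" are hypothetical.

def bio_tags(tokens, entities):
    """Assign B-/I-/O tags to tokens given (start, end, type) entity spans."""
    tags = ["O"] * len(tokens)
    for start, end, etype in entities:
        tags[start] = f"B-{etype}"          # first token of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{etype}"          # remaining tokens of the entity
    return tags

tokens = ["Beijing", "is", "the", "capital", "of", "China"]
entities = [(0, 1, "LOC"), (5, 6, "LOC")]
print(bio_tags(tokens, entities))
# → ['B-LOC', 'O', 'O', 'O', 'O', 'B-LOC']
```

Each token/tag pair then serves as one expected-output element when training the entity extraction model.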
In one possible implementation manner, training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO labels to obtain a trained entity extraction model, including:
dividing the preprocessed training corpus data into a first training set and a first verification set;
training corpus data in the first training set is used as actual input of an entity extraction model, BIO labels of the training corpus data are used as expected output of the entity extraction model, and an improved MBO algorithm is adopted to train network parameters of the entity extraction model for a plurality of times;
taking the training corpus data in the first verification set as the actual input of the entity extraction model, and obtaining a first training result of the entity extraction model, wherein the first training result indicates that training is completed or not completed;
if the first training result is that training is completed, taking the network parameter at the moment as the final network parameter of the entity extraction model to obtain a training completed entity extraction model; if the first training result is that training is not completed, training is conducted again according to the first training set until the first training result is that training is completed, and an entity extraction model with the completed training is obtained.
In one possible implementation manner, training the entity relationship extraction model according to the preprocessed training corpus data and the corresponding relationship labels to obtain a trained entity relationship extraction model, including:
dividing the preprocessed training corpus data into a second training set and a second verification set;
training corpus data in the second training set is used as actual input of an entity relation extraction model, relation labels of the training corpus data are used as expected output of the entity relation extraction model, and an improved MBO algorithm is adopted to train network parameters of the entity relation extraction model for a plurality of times;
taking the training corpus data in the second verification set as the actual input of the entity relation extraction model, and obtaining a second training result of the entity relation extraction model, wherein the second training result indicates that training is completed or not completed;
if the second training result is that training is completed, taking the network parameters at this moment as the final network parameters of the entity relation extraction model to obtain a trained entity relation extraction model; if the second training result is that training is not completed, retraining according to the second training set until the second training result is that training is completed, to obtain a trained entity relation extraction model.
In one possible implementation, the improved MBO (monarch butterfly optimization) algorithm includes:
A1, randomly generating the network parameters of the model to be trained to obtain an initial individual;
A2, randomly generating N initial individuals to obtain an initial population;
A3, acquiring the fitness value of each individual in the initial population, sorting the individuals in the initial population by fitness value, taking the first NP1 individuals as a first sub-population and the remaining NP2 individuals as a second sub-population; wherein NP1 = ceil(p × NP), NP2 = NP − NP1, NP represents the total number of individuals in the initial population, p represents the migration rate with p = 5/12, and ceil represents the round-up function;
A4, setting a first counter t = 1, a second counter i = 1 and a third counter j = 1;
A5, at the t-th training, taking out the i-th individual x_i^t in the first sub-population, and setting the initial value of k to 1;
A6, obtaining the first update index parameter r1 = 1.2 × rand corresponding to the individual x_i^t, wherein rand represents a random number in (0, 1); judging whether r1 is smaller than or equal to the migration rate p; if so, performing the first update on the k-th value of x_i^t and entering step A7; otherwise, performing the second update on the k-th value of x_i^t and entering step A7;
the first update includes:
A611, updating the k-th value x_{i,k}^t of the individual x_i^t by combining the k-th value of the individual with the largest current fitness value with the k-th values of a first, a second and a third random individual in the first sub-population (the update formula is given as a figure); wherein x_{i,k}^t represents the k-th value of the individual x_i^t, k = 1, 2, …, K, and K represents the total number of network parameters of the model to be trained;
A612, judging whether the fitness of the individual containing the updated k-th value is greater than the fitness of the individual containing the original k-th value; if so, accepting the update of the k-th value, otherwise rejecting it;
the second update includes:
A621, updating the k-th value x_{i,k}^t of the individual x_i^t according to the k-th value of a fourth random individual in the first sub-population (the update formula is given as a figure), wherein the random indices r1, r2, q1 and q2 are different from each other;
A622, judging whether the fitness of the individual containing the updated k-th value is greater than the fitness of the individual containing the original k-th value; if so, accepting the update, otherwise rejecting it;
A7, judging whether k is equal to or greater than K; if so, the update of the individual x_i^t is completed and step A8 is entered; otherwise, the value of k is increased by one and step A6 is repeated;
A8, judging whether the second counter i is equal to or greater than NP1; if so, entering step A9; otherwise, increasing the count value of the second counter i by one and returning to step A5;
A9, at the t-th training, taking out the j-th individual x_j^t in the second sub-population, and setting the initial value of k to 1;
A10, obtaining the adjusting rate BAR corresponding to the individual x_j^t as:
BAR = λ + μt
wherein λ represents a first constant factor and μ represents a second constant factor, chosen so that BAR rises linearly from the lower limit BAR_min to the upper limit BAR_max over the maximum number of iterations t_max (for example, λ = BAR_min and μ = (BAR_max − BAR_min)/t_max);
A11, acquiring the second update index parameter r2 = rand; judging whether r2 is smaller than or equal to the migration rate p; if so, performing the third update on the k-th value of x_j^t and entering step A12; otherwise, performing the fourth update and entering step A12;
the third update includes:
A1111, updating the k-th value x_{j,k}^t of the individual x_j^t toward the k-th value of the individual with the largest fitness value in the t-th training (the update formula is given as a figure);
A1112, judging whether the fitness of the individual containing the updated k-th value is greater than the fitness of the individual containing the original k-th value; if so, accepting the update, otherwise rejecting it;
the fourth update includes:
A1121, updating the k-th value x_{j,k}^t of the individual x_j^t according to the k-th value of a random individual in the second sub-population (the update formula is given as a figure);
A1122, judging whether the fitness of the individual containing the updated k-th value is greater than the fitness of the individual containing the original k-th value; if so, accepting the update, otherwise rejecting it;
A12, judging whether the second update index parameter r2 is greater than the adjusting rate BAR; if so, further updating the k-th value of x_j^t by a random walk of the form x_{j,k} + α(dx_k − 0.5) and entering step A13; otherwise, directly entering step A13; wherein dx_k represents a random step size, α represents a weight factor determined by the maximum step size S_max and the iteration count (α = S_max/t² in the standard MBO algorithm);
A13, judging whether k is equal to or greater than K; if so, entering step A14; otherwise, increasing the value of k by one and returning to step A11;
A14, re-acquiring the fitness value of each individual in the first sub-population and the second sub-population, determining the L target individuals with the largest fitness values, and performing a mutation update on each target individual, in which the k-th value of the target individual at the t-th training is perturbed around the k-th value of the individual with the largest current fitness by a random number generated from the standard Cauchy distribution Cauchy(0, 1) (the mutation formula is given as a figure);
A15, judging whether the first counter t is equal to or greater than the maximum number of iterations t_max; if so, taking the individual with the largest fitness as the final network parameters of the model to be trained; otherwise, increasing the count value of the first counter t by one and returning to step A5.
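A simplified sketch of the monarch-butterfly-style training loop of steps A1 to A15. The patent's exact update formulas are given only as figures, so the migration and adjusting operators below follow the standard MBO algorithm with greedy acceptance; the toy fitness function, population size and all hyperparameters are illustrative stand-ins, not values from the patent.

```python
import math
import random

def mbo_optimize(fitness, dim, n_pop=12, t_max=50, s_max=1.0, seed=0):
    """Maximize `fitness` over `dim` parameters with a simplified MBO loop."""
    rng = random.Random(seed)
    p = 5.0 / 12.0                          # migration rate (step A3)
    np1 = math.ceil(p * n_pop)              # first sub-population size NP1
    pop = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_pop)]
    for t in range(1, t_max + 1):
        pop.sort(key=fitness, reverse=True)
        best = pop[0][:]                    # individual with largest fitness
        land1, land2 = pop[:np1], pop[np1:]
        alpha = s_max / t ** 2              # weight factor (step A12)
        bar = p                             # adjusting rate, constant here
        new_pop = []
        for x in land1:                     # migration operator (A5-A8)
            child = [land1[rng.randrange(len(land1))][k]
                     if rng.random() * 1.2 <= p
                     else land2[rng.randrange(len(land2))][k]
                     for k in range(dim)]
            new_pop.append(max(child, x, key=fitness))  # greedy accept (A612)
        for x in land2:                     # adjusting operator (A9-A13)
            child = []
            for k in range(dim):
                v = (best[k] if rng.random() <= p
                     else land2[rng.randrange(len(land2))][k])
                if rng.random() > bar:      # extra random walk (A12)
                    v += alpha * (rng.random() - 0.5)
                child.append(v)
            new_pop.append(max(child, x, key=fitness))
        pop = new_pop
    return max(pop, key=fitness)

# Toy use: maximizing -sum(x_k^2) drives the "network parameters" toward 0.
best = mbo_optimize(lambda x: -sum(v * v for v in x), dim=3)
print(sum(v * v for v in best))
```

Because each updated individual is accepted only when its fitness improves, the best fitness in the population is non-decreasing across iterations, matching the accept/reject logic of steps A612 and A622.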
In one possible implementation manner, according to the basic support data, a question-answer database is built by using a trained entity extraction model and an entity relation extraction model, including:
identifying entity data in the basic support data by adopting the trained entity extraction model, and using the entity data as graph nodes;
identifying entity relation data in the basic support data by adopting the trained entity relation extraction model, and using the entity relation data as connecting edges between the graph nodes to obtain knowledge graph data;
and taking the knowledge graph data as a question-answer database.
In one possible implementation manner, constructing a query sentence with text to be analyzed, and matching target entity data and target entity relations in the question-answer database according to the query sentence, including:
performing word segmentation, entity extraction and entity relation extraction operation on the text to be analyzed to obtain extraction data;
based on the extracted data, generating a query sentence, and matching target entity data and target entity relations in a question-answer database according to the query sentence.
According to the intelligent question-answering method based on NLP, the knowledge graph makes it possible to determine the various entities and the relations among them. When a user raises a question, a query statement is constructed from the question sentence input by the user, the entities and entity relations corresponding to the question are queried, and an answer corresponding to the question is generated, which improves customer service efficiency and reduces time cost.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
Fig. 1 is a flowchart of an intelligent question-answering method based on NLP according to an embodiment of the present application.
Specific embodiments of the present application have been shown by way of the above drawings and will be described in more detail below. The drawings and the written description are not intended to limit the scope of the inventive concepts in any way, but rather to illustrate the inventive concepts to those skilled in the art by reference to the specific embodiments.
Detailed Description
The following description of the embodiments of the present application is provided to help those skilled in the art understand the present application, but it should be understood that the present application is not limited to the scope of these specific embodiments; to those of ordinary skill in the art, all applications that make use of the inventive concept fall within the spirit and scope of the present application as defined by the appended claims.
Embodiments of the present application are described in detail below with reference to the accompanying drawings.
As shown in fig. 1, an intelligent question-answering method based on NLP includes:
s1, acquiring corpus data comprising entity data and entity relations, and preprocessing the corpus data to obtain basic support data.
By preprocessing the corpus data, the entity extraction and entity relation extraction can be conveniently carried out later, so that the final result is more accurate.
S2, constructing an entity extraction model and an entity relation extraction model by adopting NLP (Natural Language Processing) models, and training the entity extraction model and the entity relation extraction model to obtain a trained entity extraction model and a trained entity relation extraction model.
The main purposes of constructing the entity extraction model and the entity relation extraction model are as follows: on the one hand, entity extraction and entity relation extraction can be performed on the basic support data, so that a knowledge graph can be constructed to form a database; on the other hand, entity extraction and entity relation extraction can be performed on the question sentence information sent by the user, so that a query sentence can be formed, the data in the database can be queried according to the query sentence, and the queried data can be fed back to the user as an answer.
And S3, constructing a question-answer database by adopting a trained entity extraction model and an entity relation extraction model according to the basic support data.
The building of the question-answer database by using the trained entity extraction model and entity relation extraction model may include: and extracting the entity from the basic support data through the entity extraction model after training, extracting the entity relation from the basic support data through the entity relation extraction model after training, taking the extracted entity as a node, taking the extracted entity relation as a connecting edge between the nodes, and storing the data in a knowledge graph mode, thereby constructing a question-answer database.
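The construction above can be sketched as a minimal in-memory knowledge graph in which extracted entities become nodes and extracted relations become directed, labelled edges. The entity and relation names are illustrative; in the method they come from the trained extraction models, and a production system would typically store the graph in a graph database rather than a Python dictionary.

```python
from collections import defaultdict

# Minimal sketch of the question-answer database as a knowledge graph:
# nodes are extracted entities, edges are extracted entity relations.

class KnowledgeGraph:
    def __init__(self):
        self.edges = defaultdict(list)      # head entity -> [(relation, tail)]

    def add_triple(self, head, relation, tail):
        """Store one (node, connecting edge, node) triple."""
        self.edges[head].append((relation, tail))

    def query(self, head, relation):
        """Match target entity data for the given head entity and relation."""
        return [tail for rel, tail in self.edges[head] if rel == relation]

# Hypothetical customer-service triples, for illustration only.
kg = KnowledgeGraph()
kg.add_triple("product_A", "warranty_period", "12 months")
kg.add_triple("product_A", "made_in", "Beijing")
print(kg.query("product_A", "warranty_period"))   # → ['12 months']
```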
S4, acquiring question sentence information sent by a user, obtaining a text to be analyzed, constructing a query sentence by the text to be analyzed, and matching target entity data and target entity relations in the question-answer database according to the query sentence.
S5, generating answer information corresponding to the question sentence information according to the target entity data and the target entity relationship, and completing the intelligent question answering.
Optionally, after the target entity data and the target entity relationship are obtained, text information may be generated from the target entity data and the target entity relationship and sent to the user. The basic support data containing the target entity data and the target entity relationship may also be presented to the user.
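The query construction of step S4 can be sketched as follows. The patent does not name a storage engine, so a Neo4j-style Cypher query is assumed purely for illustration, and the entity and relation names are hypothetical.

```python
# Sketch of step S4: turning the entity and relation extracted from the
# user's question sentence into a graph query. Cypher syntax is assumed
# (the patent does not specify the query language).

def build_query(entity, relation):
    """Build a Cypher query matching (entity)-[relation]->(target)."""
    return (f"MATCH (h {{name: '{entity}'}})-[r:{relation}]->(t) "
            f"RETURN t.name")

# E.g. from the question "How long is product_A's warranty?" the models
# might extract ("product_A", "warranty_period"):
q = build_query("product_A", "warranty_period")
print(q)
```

A real implementation would also escape user input rather than interpolating it directly into the query string.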
In one possible implementation manner, obtaining corpus data including entity data and entity relationships, and preprocessing the corpus data to obtain basic support data, including:
the method comprises the steps of obtaining a plurality of pieces of corpus data comprising entity data and entity relations through a web crawler tool, or obtaining the plurality of pieces of corpus data comprising the entity data and the entity relations through a man-machine interaction mode, or obtaining the plurality of pieces of corpus data comprising the entity data and the entity relations in a corpus database.
Converting the acquired corpus data into txt format to obtain converted corpus data, and performing cleaning, segmentation and named-entity data enhancement on the converted corpus data to obtain basic support data.
Data crawled by a web crawler is generally in json format, so the crawled data needs to be converted into txt format so that all data can be processed uniformly.
In one possible implementation manner, the cleaning, segmentation and named-entity data enhancement processing is performed on the converted corpus data to obtain basic support data, which includes:
and removing repeated content from the converted corpus data to obtain first corpus data.
Because the data is crawled by web crawlers, a large amount of repeated content may exist; only one copy of each repeated item is kept, which reduces the amount of subsequent data processing.
And calling a third-party intelligent error correction API (Application Programming Interface) to correct errors in the first corpus data to obtain second corpus data.
And eliminating redundant words in the second corpus data by adopting a regular expression to obtain third corpus data, wherein the redundant words are used for representing preset words to be eliminated.
And performing word segmentation on the third corpus data by using a Jieba word segmentation tool, and removing stop words in the third corpus data to obtain fourth corpus data.
And dividing the fourth corpus data according to a preset text length limiting threshold value to obtain at least one fifth corpus data.
And performing synonym replacement on text entity words in the fifth corpus data, or randomly exchanging the positions of two adjacent words, or performing back-translation enhancement on the fifth corpus data through third-party translation software, to obtain enhanced corpus data.
And taking the fifth corpus data and the enhanced corpus data together as basic support data.
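The cleaning, segmentation and augmentation steps above can be sketched as a small pipeline. This is an illustrative stand-in: the stop-word set and redundant-word pattern are toy examples, whitespace tokenisation stands in for the Jieba tool, and the third-party error-correction API and back-translation steps are omitted:

```python
import random
import re

STOPWORDS = {"the", "a", "of"}            # toy stand-in for a real stop-word table
REDUNDANT = re.compile(r"\b(um|uh)\b")    # preset words to be eliminated (toy)

def clean_and_split(lines, max_len=40):
    seen, deduped = set(), []
    for ln in lines:                      # keep one copy of repeated content
        if ln not in seen:
            seen.add(ln)
            deduped.append(ln)
    stripped = [REDUNDANT.sub("", ln) for ln in deduped]   # regex removal
    pieces = []
    for ln in stripped:                   # whitespace split stands in for Jieba
        kept = [w for w in ln.split() if w not in STOPWORDS]
        text = " ".join(kept)
        # divide according to a preset text length limiting threshold
        pieces.extend(text[i:i + max_len] for i in range(0, len(text), max_len))
    return pieces

def swap_augment(tokens, rng):
    """Randomly exchange the positions of two adjacent words."""
    toks = list(tokens)
    if len(toks) > 1:
        i = rng.randrange(len(toks) - 1)
        toks[i], toks[i + 1] = toks[i + 1], toks[i]
    return toks

fifth = clean_and_split(["the cat", "the cat", "um dog"])
augmented = swap_augment(["cat", "dog"], random.Random(0))
```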
In one possible implementation manner, the entity extraction model and the entity relation extraction model are constructed by using an NLP model, and the entity extraction model and the entity relation extraction model are trained to obtain a trained entity extraction model and entity relation extraction model, which comprises the following steps:
And constructing an entity extraction model by adopting an ALBERT-BiLSTM-CRF model in the NLP model, and constructing an entity relation extraction model by adopting a Multi-Att_BiGRU model in the NLP model.
It should be noted that, besides the ALBERT-BiLSTM-CRF model and the Multi-Att_BiGRU model, other NLP models can also be used to extract entities and entity relationships.
Training the entity extraction model and the entity relation extraction model by adopting training data to obtain the entity extraction model and the entity relation extraction model after training.
Alternatively, the entity extraction model and the entity relationship extraction model may be trained using a stochastic gradient descent optimization algorithm, the AdaGrad (Adaptive Gradient) algorithm, the Adam (Adaptive Moment Estimation) algorithm, or another optimization algorithm.
In one possible implementation, the entity extraction model includes a first input layer, an ALBERT layer, a BiLSTM layer, a CRF layer, and a first output layer.
The model first obtains, through the ALBERT layer, word representation vectors that correspond to the words in each text sequence and fuse general-domain semantic information; these vectors are then input into the BiLSTM layer, which produces a semantic encoding of the sentence; the CRF layer then decodes this encoding and, according to the rationality relations among labels, outputs the optimal tag sequence for the text entity information.
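The data flow through the five layers can be made concrete by tracking tensor shapes stage by stage; the dimensions below (`albert_dim=128`, `lstm_hidden=64`, `n_tags=5`) are illustrative, not the patent's values:

```python
def entity_extractor_shapes(seq_len, albert_dim=128, lstm_hidden=64, n_tags=5):
    """Track the tensor shape at each stage of the ALBERT-BiLSTM-CRF stack."""
    return {
        "input": ("tokens", seq_len),          # first input layer: token ids
        "albert": (seq_len, albert_dim),       # ALBERT: contextual word vectors
        "bilstm": (seq_len, 2 * lstm_hidden),  # BiLSTM: forward+backward encoding
        "emissions": (seq_len, n_tags),        # per-token tag scores fed to the CRF
        "output": (seq_len,),                  # CRF: one decoded tag per token
    }

shapes = entity_extractor_shapes(10)
```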
The entity relation extraction model comprises a second input layer, a representation layer, a BiGRU layer, a character attention layer, a statement attention layer and an output layer.
The sentence characteristic representation network represents each sentence as a character vector and a position vector, and splices the two resulting vectors. The spliced vector is used as the input of the BiGRU layer for feature coding, which learns the feature representation of the sentence context. Before output, an attention mechanism is introduced to distribute attention over each entity in the sentence, so as to distinguish the roles of different entities in relationship classification; finally, the overall feature representation of the sentence is output.
In the sentence characteristic representation network, the input sentence needs to be represented in vector form to conform to the input format of the BiGRU layer, so the conversion is performed through the representation layer. The specific process is as follows: for any given sentence (a1, a2, …, an), where an denotes the individual characters in the sentence, a combined vector Vec(V_W, V_L) is formed from the character vector V_W and the position vector V_L. The character vectors adopt a Word2vec word vector model, which maps each character in the sentence into the corresponding vector space to obtain its vector representation. The closer a character's absolute distance to the target word, the more information it carries to help determine the relationship between the entities; a position vector is therefore introduced into the vector representation of the input sentence. The position vector represents the relative position of the current character with respect to the two entities in the extraction task. Taking the arrangement direction of the characters in the sentence as the positive direction, the relative position of each character with respect to the preceding entity of the entity pair is represented as positive, and its relative position with respect to the following entity is represented as negative.
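The position-vector construction can be sketched as follows. Offsets are measured along the reading direction, so a character between the two entities gets a positive offset to the preceding (head) entity and a negative offset to the following (tail) entity, matching the sign convention above; the entity indices here are illustrative:

```python
def position_features(n_chars, head_idx, tail_idx):
    """Relative position of each character w.r.t. the two entities of the pair.

    Returns one (offset_to_head, offset_to_tail) pair per character,
    with the character arrangement direction taken as positive.
    """
    return [(i - head_idx, i - tail_idx) for i in range(n_chars)]

# 5-character sentence; head entity at index 1, tail entity at index 3 (toy)
offsets = position_features(5, head_idx=1, tail_idx=3)
```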
For the final vectors (h1, h2, …, hn) output by the BiGRU network, where n is the sentence length and hn is the final vector representation of the nth character after the BiGRU layer, the attention weight of each hn is obtained through calculation and normalization, and the sentence vector representation produced by the character attention layer is finally obtained.
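The character attention computation (score each character, normalise with softmax, then weight and sum) can be sketched as follows; the scoring function used here, a plain component sum, is a stand-in for the learned attention score:

```python
import math

def char_attention(h):
    """h: list of per-character vectors (the BiGRU outputs h1..hn).

    Scores each character (toy score: sum of components), softmax-normalises
    the scores into attention weights, and returns the weighted sentence vector.
    """
    scores = [sum(v) for v in h]
    m = max(scores)                         # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    w = [e / z for e in exps]               # normalised attention weights
    dim = len(h[0])
    return [sum(w[i] * h[i][d] for i in range(len(h))) for d in range(dim)]

sent_vec = char_attention([[1.0, 0.0], [1.0, 0.0]])  # equal weights 0.5 / 0.5
```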
The basic support data contain a large number of sentences sharing the same entity pairs; these sentences are closely related, yet the relationship categories defined for a pair may be inconsistent across different sentences, which increases the difficulty of overall relation extraction. A sentence attention layer is therefore introduced, which focuses on learning the features of sentence collections containing the same entity pairs in the basic support data and allocates attention to each sentence in the collection.
Training the entity extraction model and the entity relation extraction model by using training data, wherein the training comprises the following steps:
and acquiring training corpus data, preprocessing the training corpus data, and obtaining preprocessed training corpus data.
And marking the preprocessed training corpus data by adopting a BIO marking mode to obtain the preprocessed training corpus data and the corresponding BIO marking.
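BIO labeling marks the beginning (B-) and inside (I-) of each entity span and tags all other tokens O; a minimal sketch, where the token-to-type lookup is a toy stand-in for real annotation:

```python
def bio_tags(tokens, entity_types):
    """entity_types: token -> entity type. Contiguous tokens of the same
    type get B- on the first token and I- on the rest; other tokens get O.
    (Adjacent distinct entities of the same type would merge in this toy.)"""
    tags, prev = [], None
    for tok in tokens:
        etype = entity_types.get(tok)
        if etype is None:
            tags.append("O")
            prev = None
        elif etype == prev:
            tags.append("I-" + etype)
        else:
            tags.append("B-" + etype)
            prev = etype
    return tags

tags = bio_tags(["New", "York", "is", "big"], {"New": "LOC", "York": "LOC"})
```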
And training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO labels to obtain a trained entity extraction model.
And performing relationship labeling on the preprocessed training corpus data to obtain the preprocessed training corpus data and the corresponding relationship labels.
And training the entity relation extraction model according to the preprocessed training corpus data and the corresponding relation labels to obtain a trained entity relation extraction model.
In one possible implementation manner, training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO labels to obtain a trained entity extraction model, including:
the preprocessed training corpus data is divided into a first training set and a first verification set.
And taking the training corpus data in the first training set as the actual input of the entity extraction model and the BIO labels of the training corpus data as the expected output of the entity extraction model, and training the network parameters of the entity extraction model a plurality of times by adopting an improved MBO algorithm.
And taking the training corpus data in the first verification set as the actual input of the entity extraction model, and obtaining a first training result of the entity extraction model, wherein the first training result comprises training completion or incompletion.
And if the first training result is that training is completed, taking the network parameter at the moment as the final network parameter of the entity extraction model to obtain the entity extraction model after training is completed. If the first training result is that training is not completed, training is conducted again according to the first training set until the first training result is that training is completed, and an entity extraction model with the completed training is obtained.
Optionally, when the entity extraction model is verified through the first verification set, an error function value of the entity extraction model is obtained, if the error function value is smaller than a set threshold, the first training result can be judged to be training completion, otherwise, the first training result can be judged to be training incompletion.
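The threshold check on the validation error can be folded into a simple training-loop sketch; the `train_step`/`val_error` callables and the `max_rounds` cap are assumptions for illustration, not from the patent:

```python
def train_until_converged(train_step, val_error, threshold, max_rounds=100):
    """Repeat training until the validation error function value drops below
    the set threshold (first training result = 'training completed')."""
    for rounds in range(1, max_rounds + 1):
        train_step()                     # one training pass
        if val_error() < threshold:      # error below set threshold => done
            return True, rounds
    return False, max_rounds             # never converged within the cap

# toy demo: the validation error halves on every round
state = {"err": 1.0}
done, rounds = train_until_converged(
    lambda: state.update(err=state["err"] * 0.5),
    lambda: state["err"],
    threshold=0.3,
)
```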
In one possible implementation manner, training the entity relationship extraction model according to the preprocessed training corpus data and the corresponding relationship labels to obtain a trained entity relationship extraction model, including:
dividing the preprocessed training corpus data into a second training set and a second verification set.
And taking the training corpus data in the second training set as the actual input of the entity relation extraction model and the relation labels of the training corpus data as the expected output of the entity relation extraction model, and training the network parameters of the entity relation extraction model a plurality of times by adopting an improved MBO algorithm.
And taking the training corpus data in the second verification set as the actual input of the entity relation extraction model, and obtaining a second training result of the entity relation extraction model, wherein the second training result comprises training completion or incompletion.
And if the second training result is that the training is completed, taking the network parameters at that moment as the final network parameters of the entity relation extraction model to obtain the entity relation extraction model after the training is completed. And if the second training result is that the training is not completed, retraining according to the second training set until the second training result is that the training is completed, and obtaining an entity relation extraction model after the training is completed.
Optionally, when the entity relation extraction model is verified through the second verification set, an error function value of the entity relation extraction model is obtained, if the error function value is smaller than a set threshold, the second training result can be judged to be training completion, and otherwise, the second training result can be judged to be training incompletion.
In one possible implementation, an improved MBO algorithm includes:
a1, randomly generating network parameters of a model to be trained to obtain an initial individual.
A2, randomly generating N initial individuals to obtain an initial population.
A3, acquiring the fitness value of each individual in the initial population, sorting the individuals in the initial population according to the fitness value, taking the first NP1 individuals as a first sub-population and the remaining NP2 individuals as a second sub-population. Where NP1 = ceil(p × NP), NP2 = NP − NP1, NP represents the total number of individuals in the initial population, p represents the mobility, p = 5/12, and ceil represents the round-up function.
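Step A3's split can be sketched directly; with p = 5/12 and NP = 12 individuals, NP1 = ceil(5) = 5 and NP2 = 7:

```python
import math

def split_population(sorted_pop, p=5 / 12):
    """NP1 = ceil(p * NP) top-ranked individuals form sub-population 1;
    the remaining NP2 = NP - NP1 form sub-population 2.
    `sorted_pop` is assumed already sorted by descending fitness."""
    np1 = math.ceil(p * len(sorted_pop))
    return sorted_pop[:np1], sorted_pop[np1:]

sub1, sub2 = split_population(list(range(12)))  # 12 individuals, best first
```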
Optionally, the labeling accuracy or the relation extraction accuracy of all samples can be used as the fitness value, and when the fitness value is greater than a certain threshold, the model training can be considered to be completed. It should be noted that, an error function (such as a mean square error of all samples) of the network model may also be obtained, and a negative value of the error function is taken as the fitness value.
A4, a first counter t=1, a second counter i=1, and a third counter j=1 are set.
A5, during the t-th training iteration, taking out the i-th individual x(t,i) in the first sub-population, and letting k have an initial value of 1.
A6, obtaining the first update index parameter r1 = rand × 1.2 corresponding to the individual x(t,i), where rand represents a random number in (0, 1).
Judging whether the first update index parameter r1 is smaller than or equal to the mobility p; if so, performing the first update on the kth value of x(t,i) and entering step A7; otherwise, performing the second update on the kth value of x(t,i) and entering step A7.
The first update includes:
A611, updating the kth value of the individual x(t,i) by combining the current best individual with three random individuals of the first sub-population to obtain a candidate value x'(t,i,k). Here x'(t,i,k) represents the updated kth value of individual x(t,i), k = 1, 2, …, K, K represents the total number of network parameters of the model to be trained, xbest(k) represents the kth value in the individual with the largest current fitness value, and xq1(k), xq2(k) and xq3(k) represent the kth values of the first, second and third random individuals in the first sub-population, respectively.
A612, judging whether the fitness of the individual containing x'(t,i,k) is greater than the fitness of the individual containing x(t,i,k); if so, accepting the update of the kth value to x'(t,i,k); otherwise, rejecting the update.
The second update includes:
A621, updating the kth value of the individual x(t,i) using the kth value xq4(k) of a fourth random individual in the first sub-population to obtain a candidate value x'(t,i,k), where the random indices q1, q2, q3 and q4 are different from each other.
A622, judging whether the fitness of the individual containing x'(t,i,k) is greater than the fitness of the individual containing x(t,i,k); if so, accepting the update of the kth value to x'(t,i,k); otherwise, rejecting the update.
A7, judging whether the value of k is equal to or greater than K; if so, the update of the individual x(t,i) is completed and step A8 is entered; otherwise, the value of k is increased by one and the process returns to step A6.
A8, judging whether the second counter i is equal to or larger than NP1, if so, entering a step A9, otherwise, adding one to the count value of the second counter i, and returning to the step A5.
A9, during the t-th training iteration, taking out the j-th individual x(t,j) in the second sub-population, and letting k have an initial value of 1.
A10, obtaining the adjustment rate BAR corresponding to the individual x(t,j) as:
BAR=λ+μt
wherein λ represents a first constant factor and μ represents a second constant factor, chosen so that BAR increases linearly from the lower limit BAR_min of the adjustment rate to its upper limit BAR_max as t runs to the maximum number of iterations t_max, i.e. λ = BAR_min and μ = (BAR_max − BAR_min)/t_max.
A11, acquiring a second update index parameter r2 = rand, and judging whether the second update index parameter r2 is smaller than or equal to the mobility p; if so, performing a third update on the kth value of x(t,j) and entering A12; otherwise, performing a fourth update on the kth value of x(t,j) and entering A12.
The third update includes:
A1111, updating the kth value of the individual x(t,j) toward the best individual to obtain a candidate value x'(t,j,k), where x'(t,j,k) represents the updated kth value and xbest(t,k) represents the kth value in the individual with the largest fitness value during the t-th training iteration.
A1112, judging whether the fitness of the individual containing x'(t,j,k) is greater than the fitness of the individual containing x(t,j,k); if so, accepting the update of the kth value to x'(t,j,k); otherwise, rejecting the update.
The fourth update includes:
A1121, updating the kth value of the individual x(t,j) using the kth value xr(k) of a random individual in the second sub-population to obtain a candidate value x'(t,j,k).
A1122, judging whether the fitness of the individual containing x'(t,j,k) is greater than the fitness of the individual containing x(t,j,k); if so, accepting the update of the kth value to x'(t,j,k); otherwise, rejecting the update.
A12, judging whether the second update index parameter r2 is larger than the adjustment rate BAR; if so, updating the kth value of the individual x(t,j) to x'(t,j,k) + α × (dx_k − 0.5); otherwise, directly entering step A13. Here dx_k represents a random step size, α represents a weight factor with α = S_max / t², and S_max represents the maximum step size.
A13, judging whether the value of k is equal to or greater than K; if so, entering step A14; otherwise, increasing the value of k by one and returning to step A11.
A14, re-acquiring the fitness value of each individual in the first sub-population and the second sub-population, determining the L target individuals with the largest fitness values, and performing mutation update on the target individuals, where x(t,l,k) represents the kth value of the l-th target individual during the t-th training iteration, x'(t,l,k) represents its value after the mutation update, xbest(k) represents the kth value in the individual with the largest current fitness, and Cauchy(0, 1) represents a random number generated by the standard Cauchy distribution.
A15, judging whether the first counter t is equal to or greater than the maximum iteration number t_max; if so, taking the individual with the largest fitness as the final network parameters of the model to be trained; otherwise, increasing the count value of the first counter t by one and returning to step A5.
The training algorithm provided by this embodiment has strong global search capability and can avoid falling into a local optimal solution, and performing mutation updates on selected individuals helps to find the optimal solution faster, thereby realizing accurate training of the network model.
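The overall A1–A15 loop can be sketched as below. This is a schematic stand-in, not the patent's algorithm: the exact first/second/third/fourth update formulas are replaced by simple illustrative moves (mixing the best individual and random donors), while keeping the structural elements the text does specify — the 5/12 mobility split, the rand × 1.2 migration test, a linearly growing adjustment rate, a shrinking random step, and greedy acceptance of candidate values:

```python
import math
import random

def improved_mbo(fitness, dim, np_total=12, iters=20, p=5 / 12, seed=0):
    """Schematic monarch-butterfly-style optimizer (maximises `fitness`).

    The per-coordinate update rules are illustrative stand-ins; only the
    loop structure (A1-A15) follows the description in the text.
    """
    rng = random.Random(seed)
    # A1/A2: random initial population
    pop = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(np_total)]
    np1 = math.ceil(p * np_total)                     # A3: sub-population split
    for t in range(1, iters + 1):
        pop.sort(key=fitness, reverse=True)
        best = pop[0][:]
        sub1, sub2 = pop[:np1], pop[np1:]
        for ind in sub1:                              # A5-A8: migration operator
            for k in range(dim):
                cand = ind[:]
                donor = rng.choice(sub1)
                # r1 = rand * 1.2 compared with mobility p
                cand[k] = donor[k] if rng.random() * 1.2 <= p else best[k]
                if fitness(cand) >= fitness(ind):     # greedy acceptance (A612/A622)
                    ind[k] = cand[k]
        bar = p + (1 - p) * t / iters                 # adjustment rate BAR = λ + μt
        for ind in sub2:                              # A9-A13: adjusting operator
            for k in range(dim):
                cand = ind[:]
                cand[k] = best[k] if rng.random() <= p else rng.choice(sub2)[k]
                if rng.random() > bar:                # extra random step, shrinking with t
                    cand[k] += (rng.random() - 0.5) / (t * t)
                if fitness(cand) >= fitness(ind):
                    ind[k] = cand[k]
        pop = sub1 + sub2
    return max(pop, key=fitness)                      # A15: best individual

sphere = lambda x: -sum(v * v for v in x)             # maximise => minimise sum of squares
best = improved_mbo(sphere, dim=2)
```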
In one possible implementation manner, according to the basic support data, a question-answer database is built by using a trained entity extraction model and an entity relation extraction model, including:
and identifying entity data in the basic support data by adopting the entity extraction model after training, and taking the entity data as graph nodes.
And identifying entity relation data in the basic support data by using the entity relation extraction model after training, and taking the entity relation data as connecting edges between graph nodes to obtain knowledge graph data. The knowledge graph data are taken as the question-answer database.
Alternatively, the Neo4j graph database may be selected to store the knowledge-graph data, thereby obtaining the question-answer database.
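Building the graph from extracted triples can be sketched with an in-memory adjacency map standing in for the Neo4j store; the triples shown are hypothetical:

```python
def build_kg(triples):
    """Entities become graph nodes; each (head, relation, tail) triple
    becomes a directed connecting edge between graph nodes.
    A plain dict stands in here for the Neo4j graph database."""
    kg = {}
    for head, rel, tail in triples:
        kg.setdefault(head, []).append((rel, tail))
        kg.setdefault(tail, [])          # tail is a node even with no out-edges
    return kg

kg = build_kg([("aspirin", "treats", "headache"), ("aspirin", "class", "NSAID")])
```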
In one possible implementation manner, constructing a query sentence with text to be analyzed, and matching target entity data and target entity relations in the question-answer database according to the query sentence, including:
And performing word segmentation, entity extraction and entity relation extraction operation on the text to be analyzed to obtain extraction data.
Based on the extracted data, generating a query sentence, and matching target entity data and target entity relations in a question-answer database according to the query sentence.
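Matching a target entity and target relation against such a store then reduces to an edge lookup; the `aspirin` example data are hypothetical:

```python
def match_answer(kg, entity, relation):
    """Match the target entity and target relation in the question-answer
    database: return every tail entity linked to `entity` by `relation`."""
    return [tail for rel, tail in kg.get(entity, []) if rel == relation]

# hypothetical store built from extracted triples
kg = {"aspirin": [("treats", "headache"), ("class", "NSAID")], "headache": []}
answers = match_answer(kg, "aspirin", "treats")
```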
According to the NLP-based intelligent question-answering method provided by the embodiments of the present invention, constructing the knowledge graph makes it possible to determine various entities and the relations among them. When a user raises a question, a query statement is constructed from the inputted question, the entities corresponding to the question and the relations among them are queried, and the answer corresponding to the question is generated, which can improve customer service efficiency and reduce time cost expenditure.
It should be noted that any method using the inventive concept should be within the scope of the present invention. Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains.
It is to be understood that the invention is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. An intelligent question-answering method based on NLP, which is characterized by comprising the following steps:
acquiring corpus data comprising entity data and entity relations, and preprocessing the corpus data to obtain basic support data;
constructing an entity extraction model and an entity relation extraction model by using an NLP model, and training the entity extraction model and the entity relation extraction model to obtain a trained entity extraction model and an entity relation extraction model;
according to the basic support data, a trained entity extraction model and an entity relation extraction model are adopted to construct a question-answer database;
acquiring question sentence information sent by a user, obtaining a text to be analyzed, constructing a query sentence by using the text to be analyzed, and matching target entity data and target entity relations in the question-answer database according to the query sentence;
and generating answer information corresponding to the question sentence information according to the target entity data and the target entity relationship, and completing the intelligent question answering.
2. The NLP-based intelligent question-answering method according to claim 1, wherein obtaining corpus data including entity data and entity relationships, and preprocessing the corpus data to obtain basic support data, comprises:
acquiring a plurality of pieces of corpus data comprising entity data and entity relations through a web crawler tool, or acquiring the plurality of pieces of corpus data comprising the entity data and the entity relations through a man-machine interaction mode, or acquiring the plurality of pieces of corpus data comprising the entity data and the entity relations in a corpus database;
converting the acquired corpus data into txt format to obtain converted corpus data, and cleaning, dividing and naming entity data enhancing the converted corpus data to obtain basic support data.
3. The NLP-based intelligent question-answering method according to claim 2, wherein performing cleaning, segmentation and named entity data enhancement on the converted corpus data to obtain basic support data comprises:
removing repeated content from the converted corpus data to obtain first corpus data;
calling a third-party intelligent error correction API to correct the first corpus data to obtain second corpus data;
Removing redundant words in the second corpus data by adopting a regular expression to obtain third corpus data, wherein the redundant words are used for representing preset words to be removed;
the method comprises the steps of performing word segmentation on third corpus data by using a Jieba word segmentation tool, and removing stop words in the third corpus data to obtain fourth corpus data;
dividing the fourth corpus data according to a preset text length limiting threshold value to obtain at least one fifth corpus data;
performing paraphrasing replacement on text entity words in the fifth corpus data, or randomly exchanging the positions of two adjacent words, or performing back-translation enhancement on the fifth corpus data through third-party translation software to obtain enhanced corpus data;
and taking the fifth corpus data and the enhanced corpus data together as basic support data.
4. The NLP-based intelligent question-answering method according to claim 2, wherein constructing the entity extraction model and the entity relation extraction model by using the NLP model, and training the entity extraction model and the entity relation extraction model to obtain a trained entity extraction model and entity relation extraction model, comprises:
constructing an entity extraction model by adopting an ALBERT-BiLSTM-CRF model in the NLP model, and constructing an entity relation extraction model by adopting a Multi-Att_BiGRU model in the NLP model;
Training the entity extraction model and the entity relation extraction model by adopting training data to obtain the entity extraction model and the entity relation extraction model after training.
5. The NLP-based intelligent question-answering method according to claim 4, wherein the entity extraction model comprises a first input layer, an ALBERT layer, a BiLSTM layer, a CRF layer, and a first output layer;
the entity relation extraction model comprises a second input layer, a representation layer, a BIGRU layer, a character attention layer, a statement attention layer and an output layer;
training the entity extraction model and the entity relation extraction model by using training data, wherein the training comprises the following steps:
acquiring training corpus data, preprocessing the training corpus data, and obtaining preprocessed training corpus data;
marking the preprocessed training corpus data by adopting a BIO marking mode to obtain the preprocessed training corpus data and corresponding BIO marking;
training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO labels to obtain a trained entity extraction model;
performing relationship labeling by the preprocessed training corpus data to obtain the preprocessed training corpus data and corresponding relationship labeling;
And training the entity relation extraction model according to the preprocessed training corpus data and the corresponding relation labels to obtain a trained entity relation extraction model.
6. The NLP-based intelligent question-answering method of claim 5, wherein training the entity extraction model according to the preprocessed training corpus data and the corresponding BIO-label to obtain a trained entity extraction model comprises:
dividing the preprocessed training corpus data into a first training set and a first verification set;
training corpus data in the first training set is used as actual input of an entity extraction model, BIO labels of the training corpus data are used as expected output of the entity extraction model, and an improved MBO algorithm is adopted to train network parameters of the entity extraction model for a plurality of times;
taking the training corpus data in the first verification set as the actual input of the entity extraction model, and obtaining a first training result of the entity extraction model, wherein the first training result comprises training completion or incompletion;
if the first training result is that training is completed, taking the network parameter at the moment as the final network parameter of the entity extraction model to obtain a training completed entity extraction model; if the first training result is that training is not completed, training is conducted again according to the first training set until the first training result is that training is completed, and an entity extraction model with the completed training is obtained.
7. The NLP-based intelligent question-answering method of claim 5, wherein training the entity relationship extraction model according to the preprocessed training corpus data and the corresponding relationship labels to obtain a trained entity relationship extraction model comprises:
dividing the preprocessed training corpus data into a second training set and a second verification set;
training corpus data in the second training set is used as actual input of an entity relation extraction model, relation labels of the training corpus data are used as expected output of the entity relation extraction model, and an improved MBO algorithm is adopted to train network parameters of the entity relation extraction model for a plurality of times;
taking the training corpus data in the second verification set as the actual input of the entity relation extraction model, and obtaining a second training result of the entity relation extraction model, wherein the second training result comprises training completion or incompletion;
if the second training result is that training is completed, taking the network parameter at the moment as the final network parameter of the entity relation extraction model to obtain a training completed entity relation extraction model; and if the second training result is that the training is not completed, retraining according to the second training set until the second training result is that the training is completed, and obtaining an entity relation extraction model after the training is completed.
8. The NLP-based intelligent question-answering method according to claim 7, wherein the modified MBO algorithm comprises:
a1, randomly generating network parameters of a model to be trained to obtain an initial individual;
a2, randomly generating N initial individuals to obtain an initial population;
a3, acquiring a fitness value of each individual in the initial population, sorting the individuals in the initial population according to the fitness value, taking the first NP1 individuals as a first sub-population, and taking the remaining NP2 individuals as a second sub-population; wherein NP1 = ceil(p × NP), NP2 = NP − NP1, NP represents the total number of individuals in the initial population, p represents mobility, p = 5/12, and ceil represents a round-up function;
a4, setting a first counter t=1, a second counter i=1 and a third counter j=1;
a5, during the t-th training iteration, taking out the i-th individual x(t,i) in the first sub-population, and letting k have an initial value of 1;
a6, obtaining a first update index parameter r1 = rand × 1.2 corresponding to the individual x(t,i), rand representing a random number in (0, 1);
judging whether the first update index parameter r1 is smaller than or equal to the mobility p; if so, performing a first update on the kth value of x(t,i) and entering step A7; otherwise, performing a second update on the kth value of x(t,i) and entering step A7;
The first update includes:
a611, updating the kth value of the individual x(t,i) by combining the current best individual with three random individuals of the first sub-population to obtain a candidate value x'(t,i,k); wherein x'(t,i,k) represents the updated kth value of individual x(t,i), k = 1, 2, …, K, K represents the total number of network parameters of the model to be trained, xbest(k) represents the kth value in the individual with the largest current fitness value, and xq1(k), xq2(k) and xq3(k) represent the kth values of the first, second and third random individuals in the first sub-population, respectively;
a612, judging whether the fitness of the individual containing x'(t,i,k) is greater than the fitness of the individual containing x(t,i,k); if so, accepting the update of the kth value to x'(t,i,k); otherwise, rejecting the update;
the second update includes:
a621, updating the kth value of the individual x(t,i) using the kth value xq4(k) of a fourth random individual in the first sub-population to obtain a candidate value x'(t,i,k), wherein the random indices q1, q2, q3 and q4 are different from each other;
a622, judging whether the fitness of the individual containing x'(t,i,k) is greater than the fitness of the individual containing x(t,i,k); if so, accepting the update of the kth value to x'(t,i,k); otherwise, rejecting the update;
a7, judging whether the value of k is equal to or greater than K; if so, the update of the individual x(t,i) is completed and step A8 is performed; otherwise, the value of k is increased by one and the process returns to step A6;
a8, judging whether the second counter i is equal to or larger than NP1, if so, entering a step A9, otherwise, adding one to the count value of the second counter i, and returning to the step A5;
a9, during the t-th training iteration, taking out the j-th individual x(t,j) in the second sub-population, and letting k have an initial value of 1;
a10, obtaining the adjustment rate BAR corresponding to the individual x(t,j) as:
BAR=λ+μt
wherein λ represents a first constant factor and μ represents a second constant factor, chosen so that BAR increases linearly from the lower limit BAR_min of the adjustment rate to its upper limit BAR_max as t runs to the maximum number of iterations t_max, i.e. λ = BAR_min and μ = (BAR_max − BAR_min)/t_max;
a11, acquiring a second update index parameter r2 = rand, and judging whether the second update index parameter r2 is smaller than or equal to the mobility p; if so, performing a third update on the kth value of x(t,j) and entering A12; otherwise, performing a fourth update on the kth value of x(t,j) and entering A12;
the third update includes:
a1111, updating the kth value of the individual x(t,j) toward the best individual to obtain a candidate value x'(t,j,k), wherein x'(t,j,k) represents the updated kth value and xbest(t,k) represents the kth value in the individual with the largest fitness value during the t-th training iteration;
a1112, judging whether the fitness of the individual containing x'(t,j,k) is greater than the fitness of the individual containing x(t,j,k); if so, accepting the update of the kth value to x'(t,j,k); otherwise, rejecting the update;
the fourth update includes:
A1121, updating the kth value of the individual x_j^t as:
wherein x_{r,k}^t represents the kth value of a random individual in the second sub-population;
A1122, judging whether the fitness of the individual containing the updated kth value x'_{j,k} is greater than the fitness of the individual containing x_{j,k}^t; if so, accepting the update of the kth value, otherwise rejecting it;
A12, judging whether the second update index parameter r2 is larger than the adjustment rate BAR; if so, updating the kth value of the individual x_j^t by a random step and entering step A13; otherwise, directly entering step A13; wherein dx_k represents a random step size, a represents a weight factor, and S_max represents the maximum step size;
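Step A12's random-step perturbation can be illustrated as below; the exact expression for dx_k is omitted in the source, so dx_k = a · U(−1, 1) · S_max is an assumed form built only from the named quantities a and S_max:

```python
import random

def random_step(x_k, a, s_max, rng):
    # Bounded random perturbation of the k-th value (step A12).
    # The form of dx_k is an assumption; the patent's exact expression
    # is omitted in the source. The step magnitude never exceeds a * S_max.
    dx_k = a * rng.uniform(-1.0, 1.0) * s_max
    return x_k + dx_k

stepped = random_step(0.5, 0.3, 1.0, random.Random(42))  # within 0.5 +/- 0.3
```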
A13, judging whether the value of k is equal to or greater than K; if so, entering step A14; otherwise, adding one to the count value of k and returning to step A11;
A14, re-acquiring the fitness value of each individual in the first sub-population and the second sub-population, determining the L target individuals with the largest fitness values, and carrying out mutation update on the target individuals, wherein the mutation update is as follows:
wherein x_{l,k}^t represents the kth value of the target individual at the tth training, x'_{l,k} represents the updated kth value, x_{best,k} represents the kth value in the individual with the greatest current fitness, and Cauchy(0,1) represents a random number generated by the standard Cauchy distribution;
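The Cauchy mutation of step A14 can be sketched as follows; the standard Cauchy(0, 1) draw uses the inverse CDF, while the way the draw is combined with the current-best value is an assumption, as the source omits the mutation formula:

```python
import math
import random

def cauchy_mutate(target_k, best_k, rng):
    # Standard Cauchy(0, 1) draw via the inverse CDF: tan(pi * (U - 0.5)).
    # Heavy tails let the mutation occasionally make very large jumps,
    # which helps escape local optima. Combining the draw with the
    # current-best value this way is an assumed form, not the patent's.
    c = math.tan(math.pi * (rng.random() - 0.5))
    return target_k + c * (best_k - target_k)

mutated = cauchy_mutate(0.4, 0.9, random.Random(7))
```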
A15, judging whether the first counter t is equal to or greater than the maximum iteration number t_max; if so, taking the individual with the largest fitness as the final network parameters of the model to be trained; otherwise, adding one to the count value of the first counter t and returning to step A5.
9. The NLP-based intelligent question-answering method according to claim 1, wherein constructing a question-answering database using a trained entity extraction model and entity relationship extraction model according to the basic support data comprises:
identifying entity data in the basic support data by adopting the trained entity extraction model, and taking the entity data as graph nodes;
identifying entity relationship data in the basic support data by adopting the trained entity relationship extraction model, and obtaining knowledge graph data by taking the entity relationship data as connecting edges between the graph nodes;
and taking the knowledge graph data as a question-answer database.
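The graph-construction steps of claim 9 reduce to: entities become nodes, entity relations become (head, relation, tail) edges. A minimal sketch, with toy lambdas standing in for the trained extraction models (all names here are illustrative, not from the patent):

```python
def build_qa_database(support_texts, extract_entities, extract_relations):
    # Entities identified by the entity extraction model become graph
    # nodes; entity relations become connecting edges (head, relation, tail).
    graph = {"nodes": set(), "edges": set()}
    for text in support_texts:
        graph["nodes"].update(extract_entities(text))
        graph["edges"].update(extract_relations(text))
    return graph

# Toy stand-ins for the trained extraction models (illustrative only).
extract_entities = lambda t: [w for w in t.split() if w.istitle()]
extract_relations = lambda t: [("Beijing", "capital_of", "China")] if "Beijing" in t else []

kg = build_qa_database(["Beijing is the capital of China"], extract_entities, extract_relations)
```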
10. The NLP-based intelligent question-answering method according to claim 9, wherein constructing a query sentence with text to be analyzed, and matching target entity data and target entity relationships in the question-answering database according to the query sentence, comprises:
performing word segmentation, entity extraction and entity relationship extraction operations on the text to be analyzed to obtain extracted data;
generating a query sentence based on the extracted data, and matching target entity data and target entity relationships in the question-answer database according to the query sentence.
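Claim 10's matching step can be illustrated as a lookup of the entities and relations extracted from the question against the knowledge-graph triples; the triple layout and function names here are illustrative, not from the patent:

```python
def match_query(question_entities, question_relations, edges):
    # Match target entity data and target entity relations extracted from
    # the text to be analysed against knowledge-graph (head, relation, tail)
    # triples; sorted() makes the hit order deterministic.
    hits = []
    for head, rel, tail in sorted(edges):
        if head in question_entities and rel in question_relations:
            hits.append((head, rel, tail))
    return hits

edges = {("Beijing", "capital_of", "China"), ("Shanghai", "located_in", "China")}
result = match_query({"Beijing"}, {"capital_of"}, edges)
```

In practice the query sentence would be rendered in a graph query language (e.g. a Cypher-style pattern) rather than matched in Python, but the lookup semantics are the same.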
CN202311070132.7A 2023-08-23 2023-08-23 Intelligent question-answering method based on NLP Active CN117093693B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311070132.7A CN117093693B (en) 2023-08-23 2023-08-23 Intelligent question-answering method based on NLP

Publications (2)

Publication Number Publication Date
CN117093693A true CN117093693A (en) 2023-11-21
CN117093693B CN117093693B (en) 2024-05-07

Family

ID=88781692

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109492077A (en) * 2018-09-29 2019-03-19 北明智通(北京)科技有限公司 The petrochemical field answering method and system of knowledge based map
CN113010660A (en) * 2021-04-22 2021-06-22 国网信息通信产业集团有限公司 Intelligent question and answer method and device based on knowledge graph
CN113312501A (en) * 2021-06-29 2021-08-27 中新国际联合研究院 Construction method and device of safety knowledge self-service query system based on knowledge graph
CN115544232A (en) * 2022-10-11 2022-12-30 重庆长安新能源汽车科技有限公司 Vehicle-mounted intelligent question answering and information recommending method and device
CN116010581A (en) * 2023-02-08 2023-04-25 金现代信息产业股份有限公司 Knowledge graph question-answering method and system based on power grid hidden trouble shooting scene
CN116089581A (en) * 2022-12-30 2023-05-09 天津光电通信技术有限公司 Intelligent question-answering method based on knowledge graph

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant