CN110704623A - Method, device, system and storage medium for improving entity identification rate based on Rasa_Nlu framework - Google Patents
- Publication number
- CN110704623A
- Authority
- CN
- China
- Prior art keywords
- rasa
- intention
- entity identification
- nlu
- rate based
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Computational Linguistics (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to the field of data processing, and in particular to a method, a device, a system and a storage medium for improving the entity recognition rate based on the Rasa_Nlu framework. The method comprises the following steps: first, receiving speech input and segmenting it into words with jieba; then, obtaining and preprocessing the corpus; next, training a MITIE model using wordrep, a tool included in MITIE, to obtain a data set; and finally, constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition to obtain the intention of the user. By applying current natural language processing technology from the field of artificial intelligence, the method can accurately analyze the intention of the user. Applied to the computer scenario, it improves the entity recognition rate, addresses the low entity recognition rate of existing methods, and makes such systems more convenient for users.
Description
Technical Field
The invention relates to the field of data processing, and in particular to a method, a device, a system and a storage medium for improving the entity recognition rate based on the Rasa_Nlu framework.
Background
Natural language processing (NLP) is divided into three stages, and most of the difficulties arise in natural language understanding (NLU), where the main problems are ambiguity and unknown language phenomena. On the one hand, ambiguity is pervasive in natural language: it occurs at the lexical, syntactic, semantic and pragmatic levels and in every kind of linguistic unit, and it is a fundamental obstacle to practical applications. On the other hand, any specific system will always encounter unexpected input such as unknown vocabulary and unknown structures. Every language changes with the development of society: new words (especially new names of people, places and organizations, and specialized terms), new word senses, new usages (new parts of speech) and even new sentence structures keep appearing. Rare and unusual words and constructions are especially common in spoken conversation and in online text (microblogs, blogs, etc.).
At present, many natural language understanding methods on the market have a particularly low entity recognition rate. A method for improving the entity recognition rate based on the Rasa_Nlu framework in the computer scenario has therefore been developed.
Disclosure of Invention
In view of the above problems, an object of the present invention is to provide a method for improving the entity recognition rate based on the Rasa_Nlu framework in the computer scenario, so as to solve the low entity recognition rate of existing methods. To this end, the present invention provides a method for improving the entity recognition rate based on the Rasa_Nlu framework, comprising the following steps:
step S1: receiving speech input and segmenting it into words with jieba;
step S2: obtaining and preprocessing a corpus;
step S3: training a MITIE model, that is, running wordrep, a tool included in MITIE, to obtain a data set;
step S4: constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition;
step S5: acquiring the intention of the user.
Preferably, in step S4, intent recognition classifies the input at the sentence level to determine the intention, and entity recognition finds the key entities in the user's question at the word level and fills the entity slots.
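The split between a sentence-level intent and word-level entity slots can be illustrated with a single labelled training example. The sketch below assumes the legacy Rasa NLU JSON training format (`rasa_nlu_data` / `common_examples`, with character offsets for each entity); the patent does not show its corpus, so the sentence, the intent name `restart_device`, the entity type `device` and the helper `make_example` are all hypothetical.

```python
import json


def make_example(text, intent, slots):
    """Build one training example; `slots` maps entity type -> surface string.

    Hypothetical helper: the intent label applies to the whole sentence,
    while each entity is located by character offsets inside it.
    """
    entities = []
    for entity_type, value in slots.items():
        start = text.find(value)
        if start < 0:
            raise ValueError(f"{value!r} not found in {text!r}")
        entities.append({
            "start": start,
            "end": start + len(value),  # end offset is exclusive
            "value": value,
            "entity": entity_type,
        })
    return {"text": text, "intent": intent, "entities": entities}


# Illustrative sentence ("help me restart the computer"); not from the patent.
example = make_example(
    "帮我重启电脑",
    intent="restart_device",
    slots={"device": "电脑"},
)
training_data = {"rasa_nlu_data": {"common_examples": [example]}}
print(json.dumps(training_data, ensure_ascii=False, indent=2))
```

Here the whole sentence carries one intent label, while the `device` slot is filled from the span `text[4:6]`. Computing offsets with `str.find` only handles the first occurrence of a value, which is enough for a sketch but not for a real annotation tool.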
To achieve the above object, the present invention further provides an apparatus for improving the entity recognition rate based on the Rasa_Nlu framework, comprising:
an information input module for inputting speech;
an information acquisition and preprocessing module for acquiring the speech information and preprocessing it;
a MITIE model training module for training a model to obtain a data set;
a module for constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition; and
an acquisition module for acquiring the intention of the user.
To achieve the above object, the present invention further provides a system for improving the entity recognition rate based on the Rasa_Nlu framework, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program, which when executed by a processor, performs the steps of the above method.
The beneficial effects of the invention are as follows: by applying current natural language processing technology from the field of artificial intelligence, the intention of the user can be accurately analyzed, and the method based on the Rasa_Nlu framework in the computer scenario improves the entity recognition rate and makes such systems more convenient for users.
Drawings
Fig. 1 is an overall flowchart of the method for improving the entity recognition rate based on the Rasa_Nlu framework in Embodiment 1 of the present invention.
Fig. 2 is a block diagram of the apparatus for improving the entity recognition rate based on the Rasa_Nlu framework in Embodiment 2 of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings of the present invention, and it is obvious that the described embodiments are only some embodiments of the present invention, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Fig. 1 is the overall flowchart of Embodiment 1, the method for improving the entity recognition rate based on the Rasa_Nlu framework. As shown in Fig. 1, the method includes the following steps:
Step S1: receive speech input and segment it into words with jieba.
Step S2: obtain and preprocess the corpus.
Step S3: train a MITIE model, that is, run wordrep, a tool included in MITIE, to obtain a data set.
Step S4: construct a Rasa_Nlu corpus and a model for intent recognition and entity recognition.
In this step, intent recognition classifies the input at the sentence level to determine the intention, and entity recognition finds the key entities in the user's question at the word level and fills the entity slots.
Step S5: acquire the intention of the user.
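Steps S1 to S3 imply that raw Chinese text has to be split into words before it is handed to MITIE's wordrep tool. The sketch below shows one way the corpus could be prepared, on the assumption that the MITIE tooling treats whitespace as a token boundary. To stay runnable without third-party packages it does not call jieba; it substitutes a much simpler greedy forward maximum-matching segmenter over a tiny hypothetical vocabulary. The functions `segment` and `to_wordrep_line`, the vocabulary and the sentences are illustrative only.

```python
def segment(sentence, vocabulary):
    """Greedy forward maximum matching (a simple stand-in for jieba).

    At each position, take the longest vocabulary word that matches;
    fall back to a single character when nothing matches.
    """
    max_len = max(len(word) for word in vocabulary)
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in vocabulary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def to_wordrep_line(sentence, vocabulary):
    """Join tokens with spaces so that word boundaries are explicit."""
    return " ".join(segment(sentence, vocabulary))


# Hypothetical vocabulary and corpus for the computer scenario.
VOCAB = {"帮", "我", "重启", "电脑", "打开", "浏览器"}
corpus = ["帮我重启电脑", "打开浏览器"]
lines = [to_wordrep_line(s, VOCAB) for s in corpus]
for line in lines:
    print(line)
```

In a real pipeline the segmented lines would be written to text files in a folder that is then passed to wordrep. A dictionary matcher like this one cannot resolve ambiguous segmentations, which is one reason a statistical segmenter such as jieba is used in the method.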
Example 2
Fig. 2 is a block diagram of Embodiment 2, the apparatus for improving the entity recognition rate based on the Rasa_Nlu framework. As shown in Fig. 2, this embodiment provides an apparatus for improving the entity recognition rate based on the Rasa_Nlu framework, comprising:
an information input module for inputting speech;
an information acquisition and preprocessing module for acquiring the speech information and preprocessing it;
a MITIE model training module for training a model to obtain a data set;
a module for constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition; and
an acquisition module for acquiring the intention of the user.
Example 3
This embodiment provides a system for improving the entity recognition rate based on the Rasa_Nlu framework, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
Example 4
The present embodiment provides a computer-readable storage medium, on which a computer program is stored, which program, when being executed by a processor, carries out the steps of the above-mentioned method.
In summary, the method, apparatus, system and storage medium for improving the entity recognition rate based on the Rasa_Nlu framework disclosed in the embodiments of the present invention apply current natural language processing technology from the field of artificial intelligence to accurately analyze the intention of the user, and the method based on the Rasa_Nlu framework in the computer scenario improves the entity recognition rate and makes such systems more convenient for users.
The above describes only specific embodiments of the present invention, but the scope of protection is not limited thereto. Any change or modification that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention falls within its scope of protection. The scope of protection of the present invention shall therefore be determined by the claims.
Claims (5)
1. A method for improving entity recognition rate based on a Rasa_Nlu framework, characterized by comprising the following steps:
step S1: receiving speech input and segmenting it into words with jieba;
step S2: obtaining and preprocessing a corpus;
step S3: training a MITIE model, that is, running wordrep, a tool included in MITIE, to obtain a data set;
step S4: constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition;
step S5: acquiring the intention of the user.
2. The method for improving entity recognition rate based on the Rasa_Nlu framework of claim 1, wherein: in step S4, the intent recognition classifies the input at the sentence level to determine the intention; the entity recognition finds the key entities in the user's question at the word level and fills the entity slots.
3. An apparatus for improving entity recognition rate based on a Rasa_Nlu framework, characterized by comprising:
an information input module for inputting speech;
an information acquisition and preprocessing module for acquiring the speech information and preprocessing it;
a MITIE model training module for training a model to obtain a data set;
a module for constructing a Rasa_Nlu corpus and a model for intent recognition and entity recognition; and
an acquisition module for acquiring the intention of the user.
4. A system for improving entity recognition rate based on a Rasa_Nlu framework, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein: the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 2.
5. A computer-readable storage medium having stored thereon a computer program, characterized in that: the program when executed by a processor implements the steps of the method of any of claims 1 to 2.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910923027.0A CN110704623A (en) | 2019-09-27 | 2019-09-27 | Method, device, system and storage medium for improving entity identification rate based on Rasa_Nlu framework |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910923027.0A CN110704623A (en) | 2019-09-27 | 2019-09-27 | Method, device, system and storage medium for improving entity identification rate based on Rasa_Nlu framework |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110704623A true CN110704623A (en) | 2020-01-17 |
Family
ID=69198239
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910923027.0A Pending CN110704623A (en) | 2019-09-27 | 2019-09-27 | Method, device, system and storage medium for improving entity identification rate based on Rasa_Nlu framework |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110704623A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114564916A (en) * | 2022-03-03 | 2022-05-31 | 山东新一代信息产业技术研究院有限公司 | Method, device and medium for simplifying corpus addition and corpus tagging |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108427722A (en) * | 2018-02-09 | 2018-08-21 | 卫盈联信息技术(深圳)有限公司 | intelligent interactive method, electronic device and storage medium |
CN109146610A (en) * | 2018-07-16 | 2019-01-04 | 众安在线财产保险股份有限公司 | It is a kind of intelligently to insure recommended method, device and intelligence insurance robot device |
CN109522393A (en) * | 2018-10-11 | 2019-03-26 | 平安科技(深圳)有限公司 | Intelligent answer method, apparatus, computer equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
Wang Yajun (王雅君): "RASA-based intelligent voice dialogue ***", China Master's Theses Full-text Database, Information Science and Technology * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI636452B (en) | Method and system of voice recognition | |
CN111090736B (en) | Question-answering model training method, question-answering method, device and computer storage medium | |
CN111708869B (en) | Processing method and device for man-machine conversation | |
CN112507706B (en) | Training method and device for knowledge pre-training model and electronic equipment | |
CN111160041B (en) | Semantic understanding method and device, electronic equipment and storage medium | |
CN110019304B (en) | Method for expanding question-answering knowledge base, storage medium and terminal | |
CN115309877A (en) | Dialog generation method, dialog model training method and device | |
CN112560510A (en) | Translation model training method, device, equipment and storage medium | |
CN113779062A (en) | SQL statement generation method and device, storage medium and electronic equipment | |
US20230094730A1 (en) | Model training method and method for human-machine interaction | |
CN110019305B (en) | Knowledge base expansion method, storage medium and terminal | |
CN117271736A (en) | Question-answer pair generation method and system, electronic equipment and storage medium | |
CN113569559B (en) | Short text entity emotion analysis method, system, electronic equipment and storage medium | |
CN113553411B (en) | Query statement generation method and device, electronic equipment and storage medium | |
CN112349294B (en) | Voice processing method and device, computer readable medium and electronic equipment | |
CN109934347B (en) | Device for expanding question-answer knowledge base | |
CN110704623A (en) | Method, device, system and storage medium for improving entity identification rate based on Rasa_Nlu framework | |
CN109002498B (en) | Man-machine conversation method, device, equipment and storage medium | |
CN111680146A (en) | Method and device for determining new words, electronic equipment and readable storage medium | |
US20230317058A1 (en) | Spoken language processing method and apparatus, and storage medium | |
CN108920560B (en) | Generation method, training method, device, computer readable medium and electronic equipment | |
CN111046674A (en) | Semantic understanding method and device, electronic equipment and storage medium | |
CN116186219A (en) | Man-machine dialogue interaction method, system and storage medium | |
CN114490969B (en) | Question and answer method and device based on table and electronic equipment | |
CN115620726A (en) | Voice text generation method, and training method and device of voice text generation model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20200117 |