CN114049889A - Intelligent conversation feedback system based on interaction scene - Google Patents
- Publication number
- CN114049889A CN114049889A CN202111288634.8A CN202111288634A CN114049889A CN 114049889 A CN114049889 A CN 114049889A CN 202111288634 A CN202111288634 A CN 202111288634A CN 114049889 A CN114049889 A CN 114049889A
- Authority
- CN
- China
- Prior art keywords
- module
- language
- feedback
- speech
- verification
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/005—Language recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to the technical field of intelligent interaction, and in particular to an intelligent conversation feedback system based on interaction scenes. The system comprises a conversation receiving module and a language interpretation module; a verification module for voice detection is arranged downstream of the language interpretation module, and a feedback module for voice data recognition is arranged downstream of the verification module. The language interpretation module judges the language type, ensuring that the content retrieved by the machine matches the type of data to be compared and improving judgment accuracy.
Description
Technical Field
The invention relates to the technical field of intelligent interaction, and in particular to an intelligent conversation feedback system based on interaction scenes.
Background
Intelligent speech technology enables man-machine spoken communication and comprises speech recognition (ASR) and speech synthesis (TTS). Research on intelligent speech began with speech recognition, which dates back to the 1950s. With the development of information technology, intelligent speech has become one of the most convenient and effective means for people to acquire and exchange information.
In the prior art, intelligent voice is a common interaction mode, but it generally suffers from a low recognition rate and easily repeated semantic recognition.
Chinese patent CN201610042063.2, entitled "Intelligent voice dialog interaction method and apparatus", discloses a method comprising the following steps: acquiring the voice content of a user's voice request and the keywords in that content; matching the keywords against a voice database to obtain the corresponding semantics; judging whether the semantics are complete or incomplete; if complete, querying the dialogue database for the service item corresponding to the complete semantics; and executing self-service or manual service according to the service item. If the semantics are incomplete, the apparatus acquires the user's answer voice and repeats keyword acquisition until the semantics corresponding to the keywords are complete. That method shortens the time customers spend locating service items, guides users to quickly finish self-service or manual service, and improves the user experience through humanized interaction design. However, although it improves the handling of incomplete semantics, it still does not solve the problems of a low recognition rate and repeated semantics.
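The prior-art loop summarized above — collect keywords turn by turn, check whether they form complete semantics, and only then look up the service item — can be sketched roughly as follows. The function name, the keyword-database layout, and the example service are illustrative assumptions, not taken from either patent.

```python
def resolve_semantics(turns, services):
    """Accumulate keywords over user turns until complete semantics are reached.

    services maps a frozenset of required keywords to a service item; the
    semantics are "complete" once every required keyword has been heard.
    """
    collected = set()
    all_keywords = set().union(*services)
    for utterance in turns:
        # Extract only the words that appear in the keyword database.
        collected |= {w for w in utterance.lower().split() if w in all_keywords}
        for required, service in services.items():
            if required <= collected:  # complete semantics reached
                return service, collected
        # Incomplete semantics: the apparatus would re-prompt the user here
        # and read the next answer voice on the following loop iteration.
    return None, collected

# Hypothetical database: "check" + "balance" together map to one service item.
services = {frozenset({"check", "balance"}): "account-balance"}
service, kws = resolve_semantics(["check my", "balance please"], services)
```

With the first turn alone the semantics are incomplete; the second turn supplies the missing keyword and the service item is returned.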
Disclosure of Invention
The invention aims to provide an intelligent conversation feedback system based on an interaction scene that solves the prior-art problems of a low voice recognition rate and easily repeated semantic recognition.
The invention is realized by the following technical scheme: the system comprises a conversation receiving module and a language interpretation module; a verification module for voice detection is arranged downstream of the language interpretation module, and a feedback module for voice data recognition is arranged downstream of the verification module.
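The module chain just described (conversation receiving, then language interpretation, then verification) can be sketched in Python as below. The class names, the single-character language-detection heuristic, and the database layout are all illustrative assumptions; the patent does not specify how the language type is detected.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Utterance:
    text: str                       # transcript of the captured speech
    language: Optional[str] = None  # filled in by the interpretation module
    verified: bool = False          # filled in by the verification module

class LanguageInterpretationModule:
    """Pre-judges the language type before any semantic matching."""
    def interpret(self, utt):
        # Toy heuristic: any CJK character marks the utterance as Chinese.
        # A real system would use an acoustic language-identification model.
        is_chinese = any("\u4e00" <= ch <= "\u9fff" for ch in utt.text)
        utt.language = "zh" if is_chinese else "en"
        return utt

class VerificationModule:
    """Checks the utterance against the database for the detected language."""
    def __init__(self, database):
        self.database = database
    def verify(self, utt):
        utt.verified = utt.text in self.database.get(utt.language, set())
        return utt

# Conversation receiving -> interpretation -> verification, as in the claims.
db = {"en": {"turn on the light"}, "zh": {"开灯"}}
stages = [LanguageInterpretationModule().interpret, VerificationModule(db).verify]

utt = Utterance("turn on the light")
for stage in stages:
    utt = stage(utt)
```

Because the language type is decided first, the verification step only consults the matching per-language database, mirroring the patent's point about unifying the retrieved content with the data type to be matched.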
It should be noted that the technical scheme of the present application provides the language interpretation module and the verification module. It differs from the prior art in that the language interpretation module can pre-judge the language type; the prior art performs only complete semantic recognition and lacks direct language-type recognition, which can cause inconvenience when interacting with people from different countries. The verification module, together with its coincidence analysis module, ensures that semantic judgment and coincidence-degree judgment are performed only after the language type has been determined.
A corrector and a coincidence analysis module are arranged in the verification module. The corrector proofreads the speech read by the language interpretation module according to the language type, and the coincidence analysis module performs an initial comparison of the speech content against the database. If the speech content cannot be matched, it is transmitted to the feedback module, which raises a feedback question and guides the interlocutor through a replay verification of the issued instruction.
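One plausible realization of the coincidence-degree judgment is a weighted combination of per-feature match scores; the detailed description later names sound frequency, tail sound, and pronunciation as the judged features. The feature names, weights, and threshold below are illustrative assumptions, not values from the patent.

```python
def coincidence_degree(features, weights):
    """Weighted mean of per-feature match scores, each in [0, 1]."""
    total = sum(weights.values())
    return sum(features[k] * weights[k] for k in weights) / total

# Hypothetical scores for one utterance compared against a database entry.
features = {"frequency": 0.9, "tail_sound": 0.8, "pronunciation": 0.95}
weights  = {"frequency": 1.0, "tail_sound": 1.0, "pronunciation": 2.0}

score = coincidence_degree(features, weights)

# Assumed cut-off: below it, the content cannot be matched and the
# feedback module takes over with an interactive question.
THRESHOLD = 0.85
needs_feedback = score < THRESHOLD
```

Weighting pronunciation more heavily than the other features is purely a design assumption here; the patent only states that all judgments must meet the requirements before proofreading proceeds.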
It should be noted that the feedback module restates the speech recognized by the machine, that is, it repeats the interlocutor's speech verbatim for secondary confirmation by the interlocutor. In actual production use, the restated content can thus be matched to the language type of the speaker, improving the accuracy and efficiency of the re-check.
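The secondary confirmation — restating the recognized speech in the speaker's own language and asking for confirmation — might look like this small sketch. The template strings and language codes are assumptions; the patent does not prescribe a prompt format.

```python
def feedback_prompt(recognized_text, language):
    """Restate the recognized speech and ask the interlocutor to confirm."""
    templates = {
        "en": 'Did you say: "{t}"? Please confirm.',
        "zh": '您是说："{t}"？请确认。',
    }
    # Fall back to English for language types without a template.
    return templates.get(language, templates["en"]).format(t=recognized_text)

prompt = feedback_prompt("turn on the light", "en")
```

The interlocutor's yes/no answer then decides whether the command proceeds or re-enters recognition, per the operation process described later.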
A language generation module is arranged downstream of the feedback module and converts the language produced by machine recognition and judgment into natural language.
It should be noted that the language generation module helps the machine accurately generate the right language type, turning machine output into a language that human beings can understand; the language types include but are not limited to Chinese, English, and the like.
A judgment module is arranged downstream of the language generation module and performs an accuracy discrimination analysis on the generated language. If the result meets the accuracy requirement, the next step of the procedure proceeds; if not, the generated language re-enters the verification module for verification processing.
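The judgment module's loop — accept the generated language if it passes the accuracy check, otherwise send it back through verification and generate again — can be sketched as follows. The function names, the retry limit, the threshold, and the simulated accuracy scores are all assumptions for illustration.

```python
def judge_and_retry(generate, verify, max_attempts=3, threshold=0.9):
    """Generate a response; below-threshold accuracy triggers re-verification."""
    for attempt in range(max_attempts):
        text, accuracy = generate()
        if accuracy >= threshold:
            return text, attempt  # passes the accuracy discrimination analysis
        verify()  # re-enter the verification module, as the patent describes
    raise RuntimeError("no sufficiently accurate response produced")

# Simulated generator whose accuracy improves after one re-verification pass.
scores = iter([0.6, 0.95])
def generate():
    return "ok", next(scores)

def verify():
    pass  # placeholder for the verification-processing step

text, tries = judge_and_retry(generate, verify)
```

Bounding the retries is a defensive assumption; the patent itself only specifies the re-entry into verification, not a limit.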
It should be noted that the judgment module ensures the accuracy of the generated language and provides a final check before output; the applicant has found in actual operation that this judgment module efficiently safeguards the success rate.
A voice synthesis module is arranged downstream of the judgment module and synthesizes and outputs the voice.
It should be noted that the speech synthesis module outputs the machine's response as human speech or text for interaction; the specific output type is matched to the type the interlocutor used for input.
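Matching the output modality to the interlocutor's input type might be sketched as below; the modality names and the TTS placeholder are assumptions, since the patent only states that text input yields text output and speech input yields speech output.

```python
def synthesize(response_text, input_modality):
    """Return the response in the same modality the interlocutor used."""
    if input_modality == "text":
        return {"type": "text", "payload": response_text}
    # For speech input, a real system would invoke a TTS engine here;
    # the tag below merely stands in for synthesized audio.
    return {"type": "speech", "payload": f"<tts:{response_text}>"}

out = synthesize("The light is on.", "text")
```

A written request thus comes back as synthesized text, and a spoken one as synthesized voice, closing the interaction loop.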
Compared with the prior art, the invention has the following advantages and beneficial effects:
1. The language interpretation module judges the language type, ensuring that the content retrieved by the machine matches the type of data to be compared and improving judgment accuracy;
2. The coincidence analysis module efficiently ensures the coincidence and completeness of the semantics and, beyond the effect achieved by the prior art, also performs automatic semantic recognition and judgment, including recognition of timing differences and accents;
3. The judgment module safeguards the final output semantic or voice command, so the interlocutor's requirement can be met and resolved interactively more accurately and promptly, improving the effectiveness and efficiency of feedback.
Drawings
FIG. 1 is a schematic flow diagram of the present invention.
Detailed Description
Referring to fig. 1, the present embodiment provides an intelligent conversation feedback system based on an interaction scene, mainly intended to solve the prior-art problems of a low speech recognition rate and easily repeated semantic recognition; the system has reached the actual experimental stage.
The specific embodiment of the invention is as follows. The system comprises a conversation receiving module and a language interpretation module; a verification module for voice detection is arranged downstream of the language interpretation module, and a feedback module for voice data recognition is arranged downstream of the verification module. A corrector and a coincidence analysis module are arranged in the verification module: the corrector proofreads the speech read by the language interpretation module according to the language type, and the coincidence analysis module performs an initial comparison of the speech content against the database; if the speech content cannot be matched, it is transmitted to the feedback module for a feedback question, guiding the interlocutor through a replay verification of the issued instruction. A language generation module is arranged downstream of the feedback module to convert the language produced by machine recognition and judgment into natural language. A judgment module is arranged downstream of the language generation module to perform an accuracy discrimination analysis on the generated language; if the result meets the requirement, the next step proceeds, and if not, the language re-enters the verification module for verification processing. A speech synthesis module is arranged downstream of the judgment module to synthesize and output speech.
The specific operation process is as follows. The system is first installed in the feedback system where interaction is required, after which it operates on its own. The steps and programs in the present application that relate to recognition and voice-command transmission are the same as in the prior art.
The interlocutor who needs a feedback operation performs voice-control interaction in front of the device equipped with the system. After the interlocutor speaks the relevant voice command, the system recognizes it and the language interpretation module takes effect: the system identifies the language type on its own, whether the interlocutor's language is Chinese, English, Italian, Russian, or any other language, without limitation. Once the corresponding language type is recognized, the corresponding language-control flow is opened so that the subsequent language coincidence-analysis library can be called conveniently.
The command then enters the coincidence analysis module, where the sound frequency, tail sound, and pronunciation coincidence degree of the voice command are judged. When all judgments meet the requirements, that is, once a matching command is found, an initial proofreading is performed by the corrector and feedback is given. If the requirements are not met, a secondary output feedback is performed immediately: the machine repeats the speech closest to what the interlocutor said, the interlocutor answers whether it is correct, and the machine executes the corresponding program according to that answer.
After correct feedback, the machine converts the result into natural language; the conversion can take dialect, semantics, and the like into account to raise the conversion success rate. The judgment module is then executed, after which voice synthesis conversion produces text or voice. The specific form is determined by the form the interlocutor used: if the interlocutor wrote, the output is synthesized as text; if the interlocutor spoke, it is synthesized as voice. The operation of the whole system then ends.
Compared with the prior art, the recognition-error and semantic-repetition rate of the conversation feedback system of the present application can be reduced to 2%-5%, whereas the prior art only reaches 10%-20%. Meanwhile, in 100 experiments the applicant found a first-pass success rate as high as 95%; that is, the feedback interaction effect is better than that of the prior art, a clear technical improvement.
The above description covers only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention is intended to be included within its scope.
Claims (5)
1. An intelligent conversation feedback system based on an interaction scene, comprising a conversation receiving module, characterized by further comprising a language interpretation module, wherein a verification module for voice detection is arranged downstream of the language interpretation module, and a feedback module for voice data recognition is arranged downstream of the verification module.
2. The intelligent conversation feedback system based on an interaction scene according to claim 1, wherein a corrector and a coincidence analysis module are arranged in the verification module; the corrector proofreads the speech interpreted by the language interpretation module according to the language type, and the coincidence analysis module performs an initial comparison of the speech content against the database; if the speech content cannot be matched, it is transmitted to the feedback module for a feedback question, guiding the interlocutor through a replay verification of the issued instruction.
3. The intelligent conversation feedback system based on an interaction scene according to claim 1, wherein a language generation module is arranged downstream of the feedback module, and the language generation module is configured to convert the language produced by machine recognition and judgment into natural language.
4. The intelligent conversation feedback system based on an interaction scene according to claim 3, wherein a judgment module is arranged downstream of the language generation module; the judgment module performs an accuracy discrimination analysis on the generated language; if the result meets the requirement, the next step of the procedure proceeds, and if not, the generated language re-enters the verification module for verification processing.
5. The intelligent conversation feedback system based on an interaction scene according to claim 4, wherein a speech synthesis module is arranged downstream of the judgment module, and the speech synthesis module is configured to synthesize and output speech.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111288634.8A CN114049889A (en) | 2021-11-02 | 2021-11-02 | Intelligent conversation feedback system based on interaction scene |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114049889A true CN114049889A (en) | 2022-02-15 |
Family
ID=80206709
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111288634.8A Pending CN114049889A (en) | 2021-11-02 | 2021-11-02 | Intelligent conversation feedback system based on interaction scene |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114049889A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106571140A (en) * | 2016-11-14 | 2017-04-19 | Tcl集团股份有限公司 | Electrical appliance intelligent control method based on voice meaning and electrical appliance intelligent control system thereof |
CN108073976A (en) * | 2016-11-18 | 2018-05-25 | 科沃斯商用机器人有限公司 | Man-machine interactive system and its man-machine interaction method |
CN111554281A (en) * | 2020-03-12 | 2020-08-18 | 厦门中云创电子科技有限公司 | Vehicle-mounted man-machine interaction method for automatically identifying languages, vehicle-mounted terminal and storage medium |
CN113571055A (en) * | 2020-04-29 | 2021-10-29 | 顾家家居股份有限公司 | Intelligent voice sofa control system |
Non-Patent Citations (1)
Title |
---|
HAN Bing, et al.: "Digital Audio and Video Processing" (《数字音视频处理》), 31 October 2018 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111128126B (en) | Multi-language intelligent voice conversation method and system | |
US20200226327A1 (en) | System and method for direct speech translation system | |
US8498857B2 (en) | System and method for rapid prototyping of existing speech recognition solutions in different languages | |
US7412387B2 (en) | Automatic improvement of spoken language | |
CN110689877A (en) | Voice end point detection method and device | |
EP0767950B1 (en) | Method and device for adapting a speech recognition equipment for dialectal variations in a language | |
JP2011504624A (en) | Automatic simultaneous interpretation system | |
CN104882141A (en) | Serial port voice control projection system based on time delay neural network and hidden Markov model | |
CN113205811A (en) | Conversation processing method and device and electronic equipment | |
CN111968622A (en) | Attention mechanism-based voice recognition method, system and device | |
CN112133292A (en) | End-to-end automatic voice recognition method for civil aviation land-air communication field | |
CN112667787A (en) | Intelligent response method, system and storage medium based on phonetics label | |
CN112420053A (en) | Intelligent interactive man-machine conversation system | |
CN112863485A (en) | Accent voice recognition method, apparatus, device and storage medium | |
US20040143436A1 (en) | Apparatus and method of processing natural language speech data | |
CN114049889A (en) | Intelligent conversation feedback system based on interaction scene | |
KR101233655B1 (en) | Apparatus and method of interpreting an international conference based speech recognition | |
Huang et al. | Unit selection synthesis based data augmentation for fixed phrase speaker verification | |
CN113160821A (en) | Control method and device based on voice recognition | |
CN110534084B (en) | Intelligent voice control method and system based on FreeWITCH | |
JP3039399B2 (en) | Non-native speech recognition device | |
Kuzdeuov et al. | Speech command recognition: Text-to-speech and speech corpus scraping are all you need | |
CN110085212A (en) | A kind of audio recognition method for CNC program controller | |
CN113035247B (en) | Audio text alignment method and device, electronic equipment and storage medium | |
Hatazaki et al. | Speech dialogue system based on simultaneous understanding |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||