CN113611305A - Voice control method, system, device and medium in autonomous learning home scene - Google Patents

Voice control method, system, device and medium in autonomous learning home scene

Info

Publication number
CN113611305A
Authority
CN
China
Prior art keywords
user
voice
control logic
intention
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111037587.XA
Other languages
Chinese (zh)
Inventor
张泽宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Shanghai Intelligent Technology Co Ltd
Original Assignee
Unisound Shanghai Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Shanghai Intelligent Technology Co Ltd
Priority to CN202111037587.XA
Publication of CN113611305A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3343 Query execution using phonetics
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/30 Semantic analysis
    • G06F 40/35 Discourse or dialogue representation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice control method, system, device and medium for an autonomous-learning home scenario. The method comprises: receiving a user voice command sent by a sound pickup device together with information about the space in which the device is located; recognizing the voice command and querying a corpus for user-configured intent control logic, and executing that logic directly if it exists; otherwise, querying a pre-stored intent library for intent control logic that matches both the recognition result and the pickup device's spatial information. If such logic is found, it is executed directly and the user is asked whether the execution was correct; if the user's feedback confirms correct execution, the logic is recorded in the corpus as user-configured intent control logic matching that recognition result. The invention solves the problem that, in a smart-home scenario, a machine cannot accurately judge the user's intention when the voice command is ambiguous.

Description

Voice control method, system, device and medium in autonomous learning home scene
Technical Field
The invention relates to the technical field of smart homes, and in particular to a voice control method, system, device and medium for an autonomous-learning home scenario.
Background
After a smart-home voice control system obtains a user's voice command, the speech is first converted into text by ASR; the text is then segmented into words to extract the [space] information and [device] information contained in the command.
If the user's voice command contains no [space] information, the common industry practice is either to control all devices of that type in the user's home by default, or to ask the user, through a second clarification round, which device in which [space] should be controlled. Yet when a user says "turn on the light" in different spaces, the real intent is not the same.
For example, when a user says "turn on the light" (with no explicit spatial information) just after entering the home, the real intention is to turn on the lights in the hallway and the living room; when the same user enters the master bedroom, or is lying in bed, and says it, the real intention is to turn on the bedroom light only.
Current NLU and NLP technologies cannot accurately distinguish the different intentions behind the same ambiguous command uttered in different scenes.
Disclosure of Invention
In view of the above problems, the present invention provides a voice control method, system, device and computer storage medium for an autonomous-learning home scenario, which solve the problem that a machine cannot accurately determine the user's intention when the user's voice command is ambiguous in a smart-home scenario.
To achieve this technical effect, the invention adopts the following technical scheme:
In one aspect, the invention provides a voice control method for an autonomous-learning home scenario, the method comprising:
the first step: receiving a user voice command sent by a sound pickup device, together with information about the space in which the device is located;
the second step: performing voice recognition on the voice command and querying a pre-stored corpus for user-configured intent control logic matching the recognition result; if such logic exists, executing it directly and ending the process, the intent control logic comprising the target device type, the space and the target action matching the recognition result; if not, executing the third step;
the third step: querying a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information; if such logic exists, executing it directly, asking the user whether the execution was correct, and entering the fourth step; if not, executing the fifth step;
the fourth step: obtaining the user's feedback on whether the intent control logic was executed correctly; if it was, recording the logic in the corpus as user-configured intent control logic matching that user's recognition result; if not, executing the fifth step;
the fifth step: reminding the user that the voice command cannot be executed.
Preferably, the speech recognition processing includes at least one of the following modes: voiceprint recognition, ASR recognition + word segmentation processing, ASR recognition + NLU understanding.
Preferably, the result of the speech recognition process includes at least one of the following corpora: user voiceprint information, type or name of the target device, space in which the target device is located, and actions required to be performed by the target device.
Preferably, the second step further comprises: judging in advance, based on the voice recognition result, whether the voice command meets the requirements for querying the corpus; if so, performing the subsequent query steps, and if not, executing the fifth step directly.
In another aspect, the invention provides a voice control system for an autonomous-learning home scenario, the system comprising:
an acquisition module, configured to receive a user voice command sent by a sound pickup device together with information about the space in which the device is located;
a recognition module, configured to perform voice recognition on the voice command;
a first matching module, configured to query a pre-stored corpus for user-configured intent control logic matching the recognition result;
a first execution module, configured to execute the user-configured intent control logic found by the first matching module;
a second matching module, configured to query a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information;
a second execution module, configured to execute the system-default intent control logic found by the second matching module;
a confirmation module, configured to confirm with the user, after the second execution module has executed the system-default intent control logic, whether the execution was correct, and to record correctly executed intent control logic in the corpus as user-configured intent control logic matching that user's recognition result;
and a reminding module, configured to remind the user that the voice command cannot be executed.
Preferably, the recognition module comprises at least one of the following: a voiceprint recognition module, an ASR recognition plus word segmentation module, and an ASR recognition plus NLU understanding module.
Preferably, the result of the speech processing by the recognition module includes at least one of the following corpora: user voiceprint information, type or name of the target device, space in which the target device is located, and actions required to be performed by the target device.
Preferably, the system further includes a pre-judging module, configured to judge in advance, from the recognition result produced by the recognition module, whether the voice command meets the requirements for querying the corpus, outputting the result to the first matching module if it does and to the reminding module if it does not.
In still another aspect, the present invention provides a voice control device in an autonomous learning home scenario, including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the voice control method as described above when executing the program.
In another aspect, the present invention also provides a computer storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the voice control method as described above.
Compared with the prior art, the invention has the following beneficial effects:
Starting from real-life scenes and conversational habits, the machine is helped, through user feedback, to better understand the real intention behind ambiguous commands uttered in different spaces. Meanwhile, considering each person's differences in language habits and spatial layout, a set of NLU-learned logic is stored from the historical data of each user's conversations with the machine, achieving a personalized effect for every user.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a flowchart of a voice control method for smart home based on user habits according to an embodiment of the present invention.
Fig. 2 is a block diagram of a structure of a voice control system for smart homes based on user habits according to an embodiment of the present invention.
Detailed Description
In order to make the objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in detail below. Several embodiments of the invention are presented in the drawings. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete.
The terms in the examples of the present invention are explained as follows:
ASR: automatic Speech Recognition (Automatic Speech Recognition technology) is a technology for converting human Speech into text.
NLP: natural Language Processing is a technology for communicating with a computer using Natural Language. The research uses the electronic computer to simulate the human language communication process, so that the computer can understand and use the natural language of human society, such as Chinese and English, to realize the natural language communication between human and machine, to replace part of mental labor, including the processing of information inquiry, question answering, document extraction, compilation and all the information about natural language.
The invention provides a voice control method for an autonomous-learning home scenario, applied to a smart-home ecosystem. The ecosystem comprises a smart home application (APP), a voice acquisition device and a number of smart-home devices; the voice acquisition device communicates with the smart home application wirelessly or by wire, as do the smart-home devices. When a user wants to control a smart-home device by voice, the user speaks; the voice acquisition device picks up the user's speech and forwards it to the smart home application.
The embodiments of the invention provide a voice control method and system for an autonomous-learning home scenario, the system being used to implement the method. It can be understood that the system is the smart home application (APP) described above, and that the voice acquisition device in these embodiments is a sound pickup device (a smart speaker, a voice assistant, or the like).
Referring to fig. 1, an embodiment of the present invention provides a voice control method in an autonomous learning home scenario, including the following steps:
the first step is as follows: receiving a user voice command sent by pickup equipment and the information of the space where the pickup equipment is located;
it should be noted that the sound pickup apparatus may be an intelligent sound box, a voice assistant, and the like, and after the sound pickup apparatus logs in the system, information corresponding to the sound pickup apparatus, including ID information, location space information, and the like, is automatically stored on the system, so that when the system acquires a voice command of the sound pickup apparatus, the system can automatically recognize the location space information of the sound pickup apparatus, and in a broad sense, the system acquires the voice command and the location space information of the sound pickup apparatus at the same time. The user voice command, for example, the user says "light on", "air conditioner on", "television on", and among them "light", "air conditioner", "television" should be a smart home device with smart control capability.
The second step: performing voice recognition on the voice command and querying a pre-stored corpus for user-configured intent control logic matching the recognition result. If such logic exists, it is executed directly and the process ends; the user-configured intent control logic comprises the target device type, the space and the target action matching the recognition result. If not, the third step is executed.
The target device type is the type of the target smart-home device, such as an air conditioner, a lamp or a television; the target action is the action the voice command asks the target device to perform, such as turning on or off (a lamp) or raising or lowering (the air-conditioner temperature).
Specifically, in this step, the speech recognition process may include at least one of the following ways: voiceprint recognition, ASR recognition + word segmentation processing, ASR recognition + NLU understanding.
To determine the user's identity, the smart-home device the user wants to control, and the action to be performed on that device, voiceprint recognition may be applied to the user's voice command; like a fingerprint, voiceprint information identifies a person. The user's voiceprint information in the voice command can be recognized according to a preset voice recognition algorithm.
The system recognizes the user's voiceprint and queries the pre-stored corpus for user-configured intent control logic matching that voiceprint. If the user has fully entered into the corpus the target device type, space and target action associated with the voiceprint, this information can be queried directly by voiceprint, and the user-configured intent control logic is completed by making the device of that type in that space perform the target action.
If any one of the target device type, space or target action associated with the voiceprint is missing from the corpus, the system can instead apply ASR recognition plus simple word segmentation, or ASR recognition plus NLU understanding, to extract the target device type, space and target action from the voice command, which likewise allows the user-configured intent control logic to be executed.
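The corpus lookup in this step can be pictured as a keyed query. The following minimal sketch assumes a corpus keyed by voiceprint ID and normalized utterance; IntentControlLogic, USER_CORPUS and query_user_corpus are hypothetical names, not the patent's API:

    from dataclasses import dataclass
    from typing import Dict, Optional, Tuple

    @dataclass
    class IntentControlLogic:
        device_type: str    # e.g. "light"
        space: str          # e.g. "living_room"
        target_action: str  # e.g. "turn_on"

    # Hypothetical user-configured corpus: (voiceprint id, utterance) -> logic.
    USER_CORPUS: Dict[Tuple[str, str], IntentControlLogic] = {
        ("user-01", "turn on the light"):
            IntentControlLogic("light", "living_room", "turn_on"),
    }

    def query_user_corpus(voiceprint: str,
                          utterance: str) -> Optional[IntentControlLogic]:
        """Second step: return the user's own intent control logic for this
        utterance if one has been configured, otherwise None."""
        return USER_CORPUS.get((voiceprint, utterance))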
In some cases the user's voice command lacks the spatial information of the target device: the user only says "turn on the light", and it is unclear whether this means "turn on the living-room light" or "turn on the bedroom light"; the user's habits give no answer either, i.e. the corpus contains no user-configured spatial information for the device matching this command. The method then proceeds to the third step.
Further, the method may also comprise: judging in advance, based on the voice recognition result, whether the voice command meets the requirements for querying the corpus; if so, performing the subsequent query steps, and if not, executing the fifth step directly. In this case an error is reported immediately for a voice command the corpus does not support, such as one issued by a user without authority, reminding the user to issue a correct voice command; the error can be reported through the pickup device.
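A pre-judgment of this kind might look like the sketch below, where the authorized-voiceprint set and the function name pre_judge are assumptions made for illustration:

    from typing import Optional

    AUTHORIZED_VOICEPRINTS = {"user-01"}  # hypothetical authorization list

    def pre_judge(voiceprint: str,
                  device_type: Optional[str],
                  action: Optional[str]) -> bool:
        """Reject commands that cannot meaningfully be queried in the corpus:
        unauthorized speakers, or no recognizable device type or action."""
        if voiceprint not in AUTHORIZED_VOICEPRINTS:
            return False  # e.g. a speaker without authority: report an error
        return device_type is not None and action is not None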
The third step: querying a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information; if such logic exists, executing it directly, asking the user whether the execution was correct, and entering the fourth step; if not, executing the fifth step.
the contents of the intent library can be found in table 1 below:
corpus Room where pickup equipment is located Intention (device executing action)
Turning on lamp Parlor Hallway lamp and living room lamp
Turning on lamp Principal and subordinate bed Bedroom lamp
When the user's voice command lacks the spatial information of the target device and contains only the device type and target action, for example when the user says "turn on the light" and the living room and the master bedroom of the same smart-home ecosystem both contain lights, it is hard to judge whether the real intention is "turn on the living-room light" or "turn on the bedroom light". The space the user is in can then be inferred from the spatial information of the pickup device, since a pickup device is generally configured to work only within one space, such as the living room or the master bedroom. When the voice command comes from the living-room pickup device, the user can be judged to be in the living room, and combining this with the device type and target action in the command yields the system-default intent control logic: when the pickup device's space is the living room, turn on the living-room lights (including the hallway lamp and the living-room lamp); when it is the master bedroom, turn on the bedroom lamp (see the sketch below).
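Encoded as data, Table 1 and the lookup it supports might be sketched as follows; the key structure and the names (INTENT_LIBRARY, query_intent_library) are assumptions made for illustration:

    from typing import Dict, List, Optional, Tuple

    # Hypothetical encoding of Table 1: (utterance, room of the pickup device)
    # mapped to the devices on which the action is executed.
    INTENT_LIBRARY: Dict[Tuple[str, str], List[str]] = {
        ("turn on the light", "living_room"): ["hallway_lamp", "living_room_lamp"],
        ("turn on the light", "master_bedroom"): ["bedroom_lamp"],
    }

    def query_intent_library(utterance: str, room: str) -> Optional[List[str]]:
        """Third step: resolve a command that lacks spatial information by
        using the room in which the pickup device is installed."""
        return INTENT_LIBRARY.get((utterance, room))

    # query_intent_library("turn on the light", "living_room")
    # -> ["hallway_lamp", "living_room_lamp"]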
In this way, the problem that a machine cannot accurately judge the user's intention when the voice command is ambiguous in a smart-home scenario can be solved.
Further, after the system-default intent control logic has been executed, the system may ask the user, through the pickup device: "Was that executed correctly?"
The fourth step: obtaining the user's feedback on whether the intent control logic was executed correctly; if it was, recording the logic in the corpus as user-configured intent control logic matching that user's recognition result; if not, executing the fifth step.
In this step, if the user answers that the execution was correct, the voice command and the system-default intent control logic are associated with each other and issued, under the user's account, as user-configured intent control logic stored in the corresponding corpus; the next time the same voice command arrives, that logic is executed directly and the user is not asked again whether the execution was correct. If the user answers that the execution was incorrect, the system-default intent control logic is not executed the next time.
the fifth step: reminding the user that the voice instruction can not be executed.
This step can be implemented by a sound pickup device, and reminds the user to set the type, space and target action of the target device which the voice command wants to control in the system customization (this step is not necessary and can be omitted).
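Putting the five steps together, an end-to-end dispatcher could be sketched as below; the callbacks (execute, ask_user, remind) and the lookup tables are illustrative assumptions rather than the patent's interfaces:

    from typing import Callable, Dict, Tuple

    def handle_command(voiceprint: str, utterance: str, room: str,
                       user_corpus: Dict[Tuple[str, str], object],
                       intent_library: Dict[Tuple[str, str], object],
                       execute: Callable[[object], None],
                       ask_user: Callable[[str], bool],
                       remind: Callable[[str], None]) -> None:
        # Second step: user-configured logic is executed directly, ending the flow.
        logic = user_corpus.get((voiceprint, utterance))
        if logic is not None:
            execute(logic)
            return
        # Third step: fall back to the system default keyed by the device's room.
        logic = intent_library.get((utterance, room))
        if logic is None:
            remind("The voice command cannot be executed.")  # fifth step
            return
        execute(logic)
        # Fourth step: confirm with the user and learn from the answer.
        if ask_user("Was that executed correctly?"):
            user_corpus[(voiceprint, utterance)] = logic
        else:
            remind("The voice command cannot be executed.")  # fifth step

The precedence mirrors the method: user-configured logic always wins, a system default is confirmed before it is learned, and everything else falls through to the reminder.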
Referring to fig. 2, an embodiment of the present invention provides a voice control system for an autonomous-learning home scenario, the system comprising:
an acquisition module 11, configured to receive a user voice command sent by a sound pickup device together with information about the space in which the device is located;
a recognition module 12, configured to perform voice recognition on the voice command;
a first matching module 13, configured to query a pre-stored corpus for user-configured intent control logic matching the recognition result;
a first execution module 14, configured to execute the user-configured intent control logic found by the first matching module 13;
a second matching module 15, configured to query a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information;
a second execution module 16, configured to execute the system-default intent control logic found by the second matching module;
a confirmation module 17, configured to confirm with the user, after the second execution module has executed the system-default intent control logic, whether the execution was correct, and to record correctly executed intent control logic in the corpus as user-configured intent control logic matching that user's recognition result;
and a reminding module 18, configured to remind the user that the voice command cannot be executed.
The recognition module 12 may specifically include at least one of the following: a voiceprint recognition module, an ASR recognition plus word segmentation module, and an ASR recognition plus NLU understanding module.
The result of the voice processing by the recognition module 12 may include at least one of the following corpora: the user's voiceprint information, the type or name of the target device, the space in which the target device is located, and the action the target device is required to perform.
The system may further include a pre-judging module 19, configured to judge in advance whether the voice command meets the requirements for querying the corpus; if it does, the result is output to the first matching module 13, and if it does not, it is output to the reminding module 18, which reports an error directly.
In addition, an embodiment of the present invention further provides a voice control device in an autonomous learning home scenario, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the steps of the methods of the foregoing embodiments when executing the program.
Furthermore, embodiments of the present invention also provide a computer storage medium having a computer program stored thereon, which when executed by a processor, implements the steps of the methods of the embodiments described above.
The invention starts from real-life scenes and conversational habits and, through user feedback, helps the machine better understand the real intention behind an ambiguous voice command uttered in different spaces. Meanwhile, considering each person's differences in language habits and spatial layout, a set of NLU-learned logic is stored from the historical data of each user's conversations with the machine, achieving a personalized effect for every user.
The above-described embodiments express only several embodiments of the present invention, and their description is relatively specific and detailed, but should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the protection scope of the invention. Therefore, the protection scope of this patent shall be subject to the appended claims. In addition, the parts not related to the invention are the same as, or can be realized by, the prior art.

Claims (10)

1. A voice control method for an autonomous-learning home scenario, characterized by comprising the following steps:
the first step: receiving a user voice command sent by a sound pickup device, together with information about the space in which the device is located;
the second step: performing voice recognition on the voice command and querying a pre-stored corpus for user-configured intent control logic matching the recognition result; if such logic exists, executing it directly and ending the process, the intent control logic comprising the target device type, the space and the target action matching the recognition result; if not, executing the third step;
the third step: querying a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information; if such logic exists, executing it directly, asking the user whether the execution was correct, and entering the fourth step; if not, executing the fifth step;
the fourth step: obtaining the user's feedback on whether the intent control logic was executed correctly; if it was, recording the logic in the corpus as user-configured intent control logic matching that user's recognition result; if not, executing the fifth step;
the fifth step: reminding the user that the voice command cannot be executed.
2. The voice control method for an autonomous-learning home scenario according to claim 1, wherein the voice recognition processing comprises at least one of the following modes: voiceprint recognition, ASR recognition + word segmentation, and ASR recognition + NLU understanding.
3. The voice control method for an autonomous-learning home scenario according to claim 1, wherein the result of the voice recognition processing comprises at least one of the following corpora: the user's voiceprint information, the type or name of the target device, the space in which the target device is located, and the action the target device is required to perform.
4. The voice control method for an autonomous-learning home scenario according to claim 1, wherein the second step further comprises: judging in advance, based on the voice recognition result, whether the voice command meets the requirements for querying the corpus; if so, performing the subsequent query steps, and if not, executing the fifth step directly.
5. A voice control system for an autonomous-learning home scenario, characterized by comprising:
an acquisition module, configured to receive a user voice command sent by a sound pickup device together with information about the space in which the device is located;
a recognition module, configured to perform voice recognition on the voice command;
a first matching module, configured to query a pre-stored corpus for user-configured intent control logic matching the recognition result;
a first execution module, configured to execute the user-configured intent control logic found by the first matching module;
a second matching module, configured to query a pre-stored intent library for system-default intent control logic matching both the recognition result and the pickup device's spatial information;
a second execution module, configured to execute the system-default intent control logic found by the second matching module;
a confirmation module, configured to confirm with the user, after the second execution module has executed the system-default intent control logic, whether the execution was correct, and to record correctly executed intent control logic in the corpus as user-configured intent control logic matching that user's recognition result;
and a reminding module, configured to remind the user that the voice command cannot be executed.
6. The voice control system for an autonomous-learning home scenario according to claim 5, wherein the recognition module comprises at least one of the following: a voiceprint recognition module, an ASR recognition plus word segmentation module, and an ASR recognition plus NLU understanding module.
7. The voice control system for an autonomous-learning home scenario according to claim 5, wherein the result of the voice processing by the recognition module comprises at least one of the following corpora: the user's voiceprint information, the type or name of the target device, the space in which the target device is located, and the action the target device is required to perform.
8. The voice control system for an autonomous-learning home scenario according to claim 5, further comprising a pre-judging module, configured to judge in advance, from the recognition result produced by the recognition module, whether the voice command meets the requirements for querying the corpus, and to output the result to the first matching module if it does and to the reminding module if it does not.
9. A voice control device for an autonomous-learning home scenario, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the voice control method according to any one of claims 1 to 4 when executing the program.
10. A computer storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the voice control method according to any one of claims 1 to 4.
CN202111037587.XA · Priority date 2021-09-06 · Filing date 2021-09-06 · Voice control method, system, device and medium in autonomous learning home scene · Pending · Published as CN113611305A

Priority Applications (1)

Application Number: CN202111037587.XA
Priority Date / Filing Date: 2021-09-06
Title: Voice control method, system, device and medium in autonomous learning home scene

Applications Claiming Priority (1)

Application Number: CN202111037587.XA
Priority Date / Filing Date: 2021-09-06
Title: Voice control method, system, device and medium in autonomous learning home scene

Publications (1)

Publication Number: CN113611305A
Publication Date: 2021-11-05

Family

ID=78310118

Family Applications (1)

Application Number: CN202111037587.XA
Title: Voice control method, system, device and medium in autonomous learning home scene
Status: Pending

Country Status (1)

Country: CN
Publication: CN113611305A

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220028373A1 (en) * 2020-07-24 2022-01-27 Comcast Cable Communications, Llc Systems and methods for training voice query models
CN115223556A (en) * 2022-06-15 2022-10-21 中国第一汽车股份有限公司 Self-feedback type vehicle voice control method and system
CN115346530A (en) * 2022-10-19 2022-11-15 亿咖通(北京)科技有限公司 Voice control method, device, equipment, medium, system and vehicle
CN117008493A (en) * 2023-09-26 2023-11-07 广州科宗智能科技有限公司 Gateway-free household control and regulation system based on intelligent sound control

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778946A (en) * 2014-01-10 2015-07-15 中国电信股份有限公司 Voice control method and system
CN104795065A (en) * 2015-04-30 2015-07-22 北京车音网科技有限公司 Method for increasing speech recognition rate and electronic device
CN107220292A (en) * 2017-04-25 2017-09-29 上海庆科信息技术有限公司 Intelligent dialogue device, reaction type intelligent sound control system and method
CN107705788A (en) * 2017-09-29 2018-02-16 上海与德通讯技术有限公司 The method of calibration and intelligent terminal of a kind of phonetic order
CN107919121A (en) * 2017-11-24 2018-04-17 江西科技师范大学 Control method, device, storage medium and the computer equipment of smart home device
CN110767225A (en) * 2019-10-24 2020-02-07 北京声智科技有限公司 Voice interaction method, device and system
CN111554286A (en) * 2020-04-26 2020-08-18 云知声智能科技股份有限公司 Method and equipment for controlling unmanned aerial vehicle based on voice
CN112053683A (en) * 2019-06-06 2020-12-08 阿里巴巴集团控股有限公司 Voice instruction processing method, device and control system
CN112201233A (en) * 2020-09-01 2021-01-08 沈澈 Voice control method, system and device of intelligent household equipment and computer storage medium
CN112201257A (en) * 2020-09-29 2021-01-08 北京百度网讯科技有限公司 Information recommendation method and device based on voiceprint recognition, electronic equipment and storage medium

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104778946A (en) * 2014-01-10 2015-07-15 中国电信股份有限公司 Voice control method and system
CN104795065A (en) * 2015-04-30 2015-07-22 北京车音网科技有限公司 Method for increasing speech recognition rate and electronic device
CN107220292A (en) * 2017-04-25 2017-09-29 上海庆科信息技术有限公司 Intelligent dialogue device, reaction type intelligent sound control system and method
CN107705788A (en) * 2017-09-29 2018-02-16 上海与德通讯技术有限公司 The method of calibration and intelligent terminal of a kind of phonetic order
CN107919121A (en) * 2017-11-24 2018-04-17 江西科技师范大学 Control method, device, storage medium and the computer equipment of smart home device
CN112053683A (en) * 2019-06-06 2020-12-08 阿里巴巴集团控股有限公司 Voice instruction processing method, device and control system
CN110767225A (en) * 2019-10-24 2020-02-07 北京声智科技有限公司 Voice interaction method, device and system
CN111554286A (en) * 2020-04-26 2020-08-18 云知声智能科技股份有限公司 Method and equipment for controlling unmanned aerial vehicle based on voice
CN112201233A (en) * 2020-09-01 2021-01-08 沈澈 Voice control method, system and device of intelligent household equipment and computer storage medium
CN112201257A (en) * 2020-09-29 2021-01-08 北京百度网讯科技有限公司 Information recommendation method and device based on voiceprint recognition, electronic equipment and storage medium

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20220028373A1 (en) * 2020-07-24 2022-01-27 Comcast Cable Communications, Llc Systems and methods for training voice query models
CN115223556A (en) * 2022-06-15 2022-10-21 中国第一汽车股份有限公司 Self-feedback type vehicle voice control method and system
CN115223556B (en) * 2022-06-15 2024-05-14 中国第一汽车股份有限公司 Self-feedback type vehicle voice control method and system
CN115346530A (en) * 2022-10-19 2022-11-15 亿咖通(北京)科技有限公司 Voice control method, device, equipment, medium, system and vehicle
CN117008493A (en) * 2023-09-26 2023-11-07 广州科宗智能科技有限公司 Gateway-free household control and regulation system based on intelligent sound control

Similar Documents

Publication Publication Date Title
CN113611305A (en) Voice control method, system, device and medium in autonomous learning home scene
CN107919121B (en) Control method and device of intelligent household equipment, storage medium and computer equipment
CN107908116B (en) Voice control method, intelligent home system, storage medium and computer equipment
CN108039988B (en) Equipment control processing method and device
CN107729433B (en) Audio processing method and device
CN109559742B (en) Voice control method, system, storage medium and computer equipment
CN108932947B (en) Voice control method and household appliance
CN113611306A (en) Intelligent household voice control method and system based on user habits and storage medium
CN111447124B (en) Intelligent household control method and intelligent control equipment based on biological feature recognition
CN112201233A (en) Voice control method, system and device of intelligent household equipment and computer storage medium
CN111508491A (en) Intelligent voice interaction equipment based on deep learning
CN110767225A (en) Voice interaction method, device and system
CN111933135A (en) Terminal control method and device, intelligent terminal and computer readable storage medium
CN111308904A (en) Intelligent home control method, main control device, sub-control device and storage medium
CN114020909A (en) Scene-based smart home control method, device, equipment and storage medium
CN116110112B (en) Self-adaptive adjustment method and device of intelligent switch based on face recognition
CN114639379A (en) Interaction method and device of intelligent electric appliance, computer equipment and medium
CN110719512A (en) Intelligent remote controller control method and device, intelligent remote controller and storage medium
CN116415590A (en) Intention recognition method and device based on multi-round query
CN110970019A (en) Control method and device of intelligent home system
CN109100942A (en) A kind of smart home control device and control method
CN114627859A (en) Method and system for recognizing electronic photo frame in offline semantic manner
CN116105307A (en) Air conditioner control method, device, electronic equipment and storage medium
CN113038256A (en) Audio output method of electronic equipment, smart television and readable storage medium
CN113962213A (en) Multi-turn dialog generation method, terminal and computer readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination