CN112562670A - Intelligent voice recognition method, intelligent voice recognition device and intelligent equipment - Google Patents
- Publication number
- CN112562670A (application CN202011411651.1A)
- Authority
- CN
- China
- Prior art keywords
- corpus
- recognition
- keyword
- intelligent
- voice information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue

- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis

- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The embodiments of the present application provide an intelligent voice recognition method, an intelligent voice recognition device and an intelligent device, relating to the technical field of smart homes. They not only spare the user button operations, but also allow the intelligent device to be controlled even when the user does not say the corpus mastered by the intelligent device word for word. The intelligent voice recognition method includes the following steps: receiving voice information; matching the voice information with a corpus, which contains at least one recognition corpus, determining a keyword corresponding to at least one recognition corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information; matching the target keyword with a business logic library, which contains at least one skill sentence, and determining the skill sentence matched with the target keyword; and determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
Description
Technical Field
The present application relates to the technical field of smart homes, and in particular to an intelligent voice recognition method, an intelligent voice recognition device and an intelligent device.
Background
With the continuous development of intelligent devices, their functions have become numerous and varied, and a user must perform many cumbersome manual button operations just to wake up a desired function or obtain useful information, which degrades the user experience.
Disclosure of Invention
The present application provides an intelligent voice recognition method, an intelligent voice recognition device and an intelligent device, for example to solve the above problems.
In a first aspect, a method for intelligent speech recognition is provided, the method comprising: receiving voice information; matching the voice information with a corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus; matching the target keywords with a business logic library, and determining skill sentences matched with the target keywords, wherein the business logic library comprises at least one skill sentence; and determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
In a second aspect, an intelligent voice recognition method is provided, which includes: the server receives voice information; the server matches the voice information with a corpus, which contains at least one recognition corpus, determines a keyword corresponding to at least one recognition corpus, and takes the keyword as a target keyword in the voice information; the server matches the target keyword with a business logic library, which contains at least one skill sentence, to determine the skill sentence matched with the target keyword; the server determines a control command corresponding to the voice information according to the skill sentence and sends the control command to the intelligent device; and the intelligent device receives the control command and controls the smart home device to execute the control command.
In a third aspect, an intelligent speech recognition apparatus is provided, which includes: the device comprises a receiving module and a processing module. The receiving module is used for receiving voice information sent by the intelligent equipment; the processing module is used for matching the voice information with the corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus; the processing module is also used for matching the target keywords with the business logic library and determining the skill sentences matched with the target keywords, and the business logic library comprises at least one skill sentence; and the processing module is also used for determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
In a fourth aspect, a smart device is provided, comprising: one or more processors; a memory; and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of the first aspect or the second aspect.
In a fifth aspect, a computer-readable storage medium is provided, in which a program code is stored, and the program code can be called by a processor to execute the method according to the first aspect or the second aspect.
In the intelligent voice recognition method, intelligent voice recognition device and intelligent device provided by the embodiments of the present application, after receiving the voice information the server can split it into at least one keyword, match the keywords with the recognition corpora in the corpus to determine the target keywords among them, match the target keywords with the business logic library to determine the corresponding skill sentence, and then determine a control command according to that skill sentence, so that a precise control command is derived from generalized voice information. Compared with the prior art, this not only spares the user button operations, but also allows the intelligent device to be controlled even when the user does not say the corpus mastered by the intelligent device word for word.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings required by the embodiments are briefly described below. It should be understood that the following drawings illustrate only some embodiments of the present application and therefore should not be considered as limiting its scope; for those skilled in the art, other related drawings can be obtained from them without inventive effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 2 is a flowchart of a speech intelligent recognition method according to an embodiment of the present application;
FIG. 3 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 4 is a flowchart of a speech intelligent recognition method provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 6 is a schematic diagram of an application environment provided by an embodiment of the present application;
FIG. 7 is an interaction timing diagram of a speech intelligent recognition method according to an embodiment of the present application;
FIG. 8 is a block diagram of an intelligent voice recognition apparatus according to an embodiment of the present application;
FIG. 9 is a block diagram of an intelligent voice recognition apparatus according to an embodiment of the present application;
FIG. 10 is a block diagram of an intelligent device provided by an embodiment of the present application;
FIG. 11 is a block diagram of a storage unit storing an application program that implements the intelligent voice recognition method according to an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are some, but not all, embodiments of the present application. The components of the embodiments, as generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. It should be noted that the features of the embodiments may be combined with each other as long as there is no conflict.
To address the problems described in the Background, the related art applies a voice recognition function to intelligent devices, shortening the distance between the user and the intelligent device and allowing the user to control it more directly and conveniently.
At present, manufacturers often train intelligent devices with deep learning algorithms so that the devices master fixed corpora, and the user must speak a control command strictly according to the corpus mastered by the device. However, the corpora of some controlled devices are too long and not colloquial enough, so the user finds it difficult to hit them exactly, and thus to control the device, which harms the user experience.
After research, the inventors propose the following scheme, which not only spares the user button operations but also allows the intelligent device to be controlled even when the user does not say the corpus mastered by the device word for word.
The smart device may be a smart phone, a tablet computer, an electronic book, a smart control panel, etc. The intelligent household equipment can comprise an intelligent television, an intelligent curtain, an intelligent sound box, an intelligent refrigerator, an intelligent electric cooker and the like.
As shown in fig. 1, the server 20 may receive the voice information through the smart device 10, process the voice information, determine a control command corresponding to the voice information, and control the smart device 10 to perform a corresponding operation according to the control command. The server 20 may be a conventional server or a cloud server, among others.
In the embodiment of the present application, a corpus may be prestored in the smart device 10 or the server 20, where the corpus includes at least one recognition corpus. After receiving the voice message, the server 20 may directly call the pre-stored corpus to determine a keyword corresponding to the identified corpus in the voice message, where the keyword is used as a target keyword.
The corpus may be an intelligent home corpus, and the recognition corpus in the intelligent home corpus may include at least one of a name of the intelligent home device, a function of the intelligent home device, and a functional state of the intelligent home device.
Taking a smart television as the smart home device and watching television as the user's current requirement, for example, the name of the smart home device may be "smart television", the function may be "switch (on/off)", and the functional state may be "on". Of course, the recognition corpora in the smart home corpus may also include other functions and functional states of the smart television, as well as the names, functions, and functional states of other smart home devices.
In the embodiment of the present application, a business logic library may be prestored in the smart device 10 or the server 20, and the business logic library includes at least one skill sentence. After the server 20 determines the target keyword corresponding to the voice information through the corpus, it may directly call a pre-stored service logic library to determine the skill sentence corresponding to the target keyword.
The business logic library may be a smart home business logic library, and the skill sentences in it may include all sentences that can be formed from the names, functions, and functional states of the smart home devices.
Taking the smart home device as a smart television, and taking an operation corresponding to a skill sentence as "turning on the smart television" as an example, the skill sentence of the smart home service logic library may include a skill sentence consisting of "smart television", "switch", "turn on", and "turn on", for example, the skill sentence may be "turn on the switch of the smart television", "please turn on the switch of the smart television", "turn on the switch of the smart television", and the like.
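The combinatorial construction of such a library can be sketched as follows. This is a minimal illustration only; the device names, functions, states, and sentence templates are assumptions made up for the example, not the patent's actual data.

```python
# Illustrative sketch: enumerate candidate skill sentences from device
# names, functions, and functional states. All vocabulary and templates
# here are hypothetical.
from itertools import product

DEVICE_NAMES = ["smart TV", "smart speaker"]
FUNCTIONS = ["switch"]
STATES = ["turn on", "turn off"]

# Each template combines a functional state, a function, and a device name
# into one candidate skill sentence.
TEMPLATES = [
    "{state} the {function} of the {name}",
    "please {state} the {function} of the {name}",
    "{state} the {name}",
]

def build_skill_sentences() -> set:
    sentences = set()
    for name, function, state in product(DEVICE_NAMES, FUNCTIONS, STATES):
        for template in TEMPLATES:
            sentences.add(template.format(name=name, function=function, state=state))
    return sentences
```

With two device names, one function, and two states, the three templates yield twelve distinct sentences, including "turn on the switch of the smart TV" from the example in the text.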
Embodiments of the present application will be described in detail below with reference to the accompanying drawings.
As shown in fig. 2, an embodiment of the present application provides an intelligent speech recognition method, which is applicable to a server 20, and this embodiment describes a server-side process flow, where the method may include:
and S110, receiving voice information.
As shown in fig. 3, when the smart device 10 is in the offline state and the awake state, the user's spoken words may be transmitted to the server 20 through the smart device 10.
For example, the smart device 10 may be a smart control panel, and before the server 20 receives the voice message, the smart control panel may be in an offline state, and the user may speak "small housekeeper" and send the voice message to the server 20 through the smart control panel.
Before the server 20 receives the voice message, the intelligent control panel may also be in an awake state, and the user may say "turn on the speaker", and send the voice message to the server 20 through the intelligent control panel.
S120, matching the voice information with a corpus to determine a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
After the server 20 receives the voice information, it may parse the voice information to convert it into a language a computer can recognize. After recognizing the parsed voice information, the server 20 can split the complete sentence spoken by the user into at least one keyword and match each keyword with the recognition corpora in the corpus; any keyword corresponding to a recognition corpus can be used as a target keyword in the voice information.
For example, if the voice information received by the server 20 is "open the speaker", it may be split into "open", "the" and "speaker", and each of these is matched with the recognition corpora in the corpus. If the recognition corpora in the corpus include "open" and "smart speaker", the keyword "open" matches the recognition corpus "open", the keyword "speaker" matches the recognition corpus "smart speaker", and "speaker" and "open" in the voice information can be used as the target keywords.
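The splitting and matching in step S120 might be sketched as follows, assuming a trivial whitespace tokenizer and a toy recognition corpus; a real deployment would use proper word segmentation and a much larger corpus.

```python
# Hypothetical sketch of step S120: split recognized text into tokens and
# keep those that appear in the recognition corpus. The corpus contents
# and the whitespace tokenizer are assumptions for illustration.
RECOGNITION_CORPUS = {
    "open": "open",              # keyword -> canonical recognition corpus entry
    "speaker": "smart speaker",
    "tv": "smart tv",
}

def extract_target_keywords(text):
    tokens = text.lower().split()  # naive split; real systems segment words properly
    return [t for t in tokens if t in RECOGNITION_CORPUS]

print(extract_target_keywords("please open the speaker"))  # ['open', 'speaker']
```

Non-keywords such as "please" and "the" simply fall out of the result, mirroring the example in the text.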
In some embodiments, when a complete sentence spoken by the user is split into multiple keywords, all of the keywords may match the corpus and serve as target keywords; alternatively, only some of them may match the corpus, and only that matching subset is used as the target keywords.
In some embodiments, the words spoken by the user may include both keywords and non-keywords, with non-keywords before and/or after the keywords. When matching the voice information with the corpus, the server 20 may determine which words are keywords and which are non-keywords according to the recognition corpora in the corpus; that is, the server 20 determines the target keywords from the recognition corpora while allowing the voice information to contain at least one non-keyword.
Taking the smart home corpus as an example, a non-keyword may be located before or after the name of the smart home device, before or after the function of the smart home device, or before or after the functional state of the smart home device.
A non-keyword may be, for example, a subject, a verb, or an auxiliary word, which is not specifically limited in the embodiments of the present application. The subject may be, for example, "I"; the verb may be, for example, "help"; and the auxiliary word may be, for example, a modal particle appended to the end of the sentence.
For example, if the user's requirement is "turn on the smart TV", the user may say "please help me turn on the switch of the smart TV". The target keywords may be "turn on", "smart TV" and "switch"; before "turn on", the voice information also includes the non-keyword phrase "please help me"; between "smart TV" and "switch", it also includes the non-keyword "of"; and after "switch", it may also include a sentence-final modal particle.
In some embodiments, the number of words in the non-keywords before or after a keyword may be kept below a preset word count, to prevent overly long voice information from degrading the voice recognition effect. The preset word count is not limited; for example, it may be 3 to 7, and optionally it is 5. In some embodiments, the manner of parsing the voice information is likewise not limited, as long as the server 20 can recognize the parsed voice information.
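The preset word-count limit on non-keywords could be enforced with a check like the following sketch; the keyword set is hypothetical, and the limit of 5 matches the optional value mentioned above.

```python
# Sketch of the non-keyword allowance: a hypothetical check that any run
# of non-keywords before, between, or after keywords stays within a
# preset length. The keyword set is invented for this example.
PRESET_MAX_NON_KEYWORDS = 5
KEYWORDS = {"speaker", "open", "tv", "switch"}

def within_non_keyword_limit(tokens):
    run = 0  # length of the current run of consecutive non-keywords
    for token in tokens:
        if token in KEYWORDS:
            run = 0  # a keyword resets the run
        else:
            run += 1
            if run > PRESET_MAX_NON_KEYWORDS:
                return False
    return True
```

For instance, "please help me open the speaker" passes (the longest non-keyword run is three words), while a sentence with seven filler words before the first keyword would be rejected.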
For example, the server may recognize the voice information using Natural Language Understanding (NLU) technology; alternatively, a Dynamic Time Warping (DTW) algorithm may be used to identify the text corresponding to the voice information from its acoustic feature vectors. Of course, other techniques may also be used to analyze the voice information, which is not specifically limited in the embodiments of the present application.
If a technology such as DTW is used to convert the voice information into text in order to analyze it, then the step of matching the voice information with the corpus, determining a keyword corresponding to at least one recognition corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information (the corpus containing at least one recognition corpus) may include:
and S121, converting the voice information into text information.
The voice information may be converted into text information using DTW technology or the like.
And S122, matching the text information with a corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
S130, matching the target keywords with a business logic library, and determining the skill sentence matched with the target keywords, wherein the business logic library comprises at least one skill sentence.
After determining the target keyword according to step S120, the target keyword may be matched with the business logic library to determine a skill sentence corresponding to the target keyword.
For example, the target keywords are "speaker" and "open", and the skill statements in the business logic library include "open smart speaker", so that the skill statements matching "speaker" and "open" can be determined as "open smart speaker".
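Step S130 can be sketched as a containment test that ignores the order in which the target keywords were spoken; the library contents and the synonym map below are illustrative assumptions, not the patent's actual data.

```python
# Hypothetical sketch of step S130: pick the skill sentence whose words
# cover all target keywords, regardless of keyword order.
SKILL_SENTENCES = [
    "open smart speaker",
    "close smart speaker",
    "open smart tv",
]

# Map spoken keywords onto the wording used inside skill sentences.
SYNONYMS = {"speaker": "smart speaker"}

def match_skill(target_keywords):
    normalized = [SYNONYMS.get(k, k) for k in target_keywords]
    for sentence in SKILL_SENTENCES:
        if all(term in sentence for term in normalized):
            return sentence
    return None
```

Both `match_skill(["speaker", "open"])` and `match_skill(["open", "speaker"])` resolve to "open smart speaker", which illustrates the order-insensitivity discussed below.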
In some embodiments, in the case that the voice message has a plurality of target keywords, the sequence of the target keywords in the voice message may be the same as or different from the sequence of the target keywords in the skill sentence.
For example, the user says "open sound box", where "sound box" and "open" are the target keywords, and in the voice message, the order of "sound box" and "open" is: firstly, turning on the loudspeaker box; and the skill sentence matched with the target keyword is 'turn on smart speaker', and in the skill sentence, the sequence of 'smart speaker' and 'turn on' is: firstly, the intelligent sound box is turned on and then turned on.
Or, the user says "please open the sound box", wherein "sound box" and "open" are the target keywords, and in the voice message, the order of "sound box" and "open" is: firstly, turning on and then turning on the loudspeaker box; and the skill sentence matched with the target keyword is "turn on smart speaker", and in the skill sentence, the sequence of "smart speaker" and "turn on" is also: firstly, the intelligent sound box is turned on and then turned on. And S140, determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
After determining the skill sentence according to step S130, the control command corresponding to the voice information may be determined according to the skill sentence corresponding to the target keyword.
In some embodiments, when determining the control command according to the skill sentence matched with the target keyword, the skill business logic may be triggered according to the skill sentence matched with the target keyword; and then, determining a control command corresponding to the voice information according to the skill business logic.
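One way to picture this dispatch, from skill sentence to skill business logic to control command, is the table below; the command schema and the logic functions are invented for the example and are not the patent's actual interfaces.

```python
# Hypothetical sketch of step S140: each skill sentence triggers a piece
# of skill business logic, which produces a control command. The command
# schema here is an assumption for illustration.
def speaker_power_logic(state):
    # Skill business logic for the speaker's power switch.
    return {"device": "smart_speaker", "function": "switch", "state": state}

SKILL_LOGIC = {
    "open smart speaker": lambda: speaker_power_logic("on"),
    "close smart speaker": lambda: speaker_power_logic("off"),
}

def control_command_for(skill_sentence):
    logic = SKILL_LOGIC.get(skill_sentence)
    return logic() if logic else None

print(control_command_for("open smart speaker"))
# {'device': 'smart_speaker', 'function': 'switch', 'state': 'on'}
```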
The embodiments of the present application provide an intelligent voice recognition method. After receiving the voice information, the server 20 can split it into at least one keyword, match the keywords with the recognition corpora in the corpus to determine the target keywords among them, match the target keywords with the business logic library to determine the corresponding skill sentence, and then determine a control command according to that skill sentence, so that a precise control command is derived from generalized voice information. Compared with the prior art, this not only spares the user button operations, but also allows the intelligent device 10 to be controlled even when the user does not say the corpus mastered by the intelligent device 10 word for word.
As shown in fig. 4, an embodiment of the present application provides an intelligent speech recognition method, which is applicable to a server 20, and this embodiment describes a server-side process flow, where the method may include:
and S110, receiving voice information.
S120, matching the voice information with a corpus to determine a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
S130, matching the target keywords with a business logic library, and determining the skill sentence matched with the target keywords, wherein the business logic library comprises at least one skill sentence.
And S140, determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
S150, the control command is sent to the intelligent equipment, so that the intelligent equipment controls the intelligent household equipment to execute the control command.
As shown in fig. 1, after determining the control command corresponding to the voice information, the server 20 may send the control command to the smart device 10 via communication modes such as Wireless Fidelity (WiFi), Bluetooth, ZigBee, or a hotspot, so that the smart device controls the smart home device to execute the control command. Of course, the server 20 may also send the control command to the smart device 10 through other communication modes, which is not limited in this embodiment.
As shown in fig. 5, taking the control command as "turn on the smart speaker" as an example, if the server 20 sends the control command to the smart control panel, the smart control panel further controls the smart speaker to turn from the offline state to the awake state after receiving the control command. That is, the server 20 sends the control command to the smart device 10, and the smart device 10 further controls the smart home device to execute the control command.
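Step S150 might look like the following sketch on the server side; the JSON payload format and the pluggable transport are assumptions, since the patent names only the link types (WiFi, Bluetooth, ZigBee, hotspot) and not a wire protocol.

```python
# Hypothetical sketch of step S150: serialize the control command and
# hand it to a transport. The JSON schema and the transport callable are
# assumptions, not a real device protocol.
import json

def send_control_command(command, transport):
    payload = json.dumps(command).encode("utf-8")
    transport(payload)  # e.g. a WiFi socket send in a real deployment

# Usage with a stub transport that just records what would be sent:
sent = []
send_control_command(
    {"device": "smart_speaker", "function": "switch", "state": "on"},
    sent.append,
)
```

On the receiving side, the smart control panel would decode the payload and forward the command to the target smart home device over its own link.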
In other embodiments, if the smart device 10 is itself a smart home device, the smart device 10 may directly execute the corresponding operation after receiving the control command.
Taking the control command as "turn on the smart speaker" as an example, as shown in fig. 6, if the server 20 sends the control command to the smart speaker, the smart speaker is turned from the offline state to the awake state.
The intelligent control panel and the smart speaker may interact with each other through communication modes such as WiFi, Bluetooth, ZigBee, and hotspots.
The embodiments of the present application provide an intelligent voice recognition method. After receiving the voice information, the server 20 can split it into at least one keyword, match the keywords with the recognition corpora in the corpus to determine the target keywords among them, match the target keywords with the business logic library to determine the corresponding skill sentence, determine a control command according to that skill sentence, and send the control command to the intelligent device 10, so that a precise control command is derived from generalized voice information and the intelligent device 10 controls the smart home device to execute the corresponding operation. Compared with the prior art, this not only spares the user button operations, but also allows the intelligent device 10 to be controlled even when the user does not say the corpus mastered by the intelligent device 10 word for word.
As shown in fig. 7, another embodiment of the present application provides an intelligent voice recognition method, which is applicable to the interaction between the intelligent device 10 and the server 20. This embodiment describes the interaction flow between the intelligent device 10 and the server 20, and the method may include:
s210, the server receives the voice information.
As shown in fig. 3, when the smart device 10 is in the offline state and the awake state, the user's spoken words may be transmitted to the server 20 through the smart device 10.
S220, the server matches the voice information with the corpus to determine a keyword corresponding to at least one recognition corpus in the corpus, the keyword is used as a target keyword in the voice information, and the corpus comprises at least one recognition corpus.
After the server 20 receives the voice information, it may parse the voice information to convert it into a computer-recognizable language. After recognizing the parsed voice information, the server 20 may split the complete sentence spoken by the user into at least one keyword and match the at least one keyword with the recognition corpora in the corpus, wherein a keyword among the at least one keyword that corresponds to a recognition corpus may be used as a target keyword in the voice information.
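As a minimal sketch of the splitting-and-matching step of S220 (the function names, tokenization strategy, and sample corpus below are illustrative assumptions, not the patent's actual implementation):

```python
# Hypothetical sketch of step S220: split a recognized utterance into
# keywords and match them against recognition corpora in a corpus.
# All names and data here are illustrative, not from the patent.

def split_keywords(utterance: str) -> list[str]:
    """Split a recognized sentence into candidate keywords.
    Whitespace splitting is used here for brevity; a real system would
    use a tokenizer appropriate to the input language."""
    return utterance.lower().split()

def match_corpus(keywords: list[str], corpus: set[str]) -> list[str]:
    """Return the keywords that correspond to a recognition corpus entry;
    these become the target keywords of the voice information."""
    return [kw for kw in keywords if kw in corpus]

# Sample smart-home corpus (assumed for illustration).
corpus = {"speaker", "light", "on", "off"}
targets = match_corpus(split_keywords("Turn on the speaker"), corpus)
print(targets)  # ['on', 'speaker']
```

A real corpus would likely also normalize synonyms and handle multi-word entries, but the contract is the same: keywords that hit a recognition corpus become target keywords.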
S230, the server matches the target keyword with the business logic library to determine the skill sentence matched with the target keyword, and the business logic library comprises at least one skill sentence.
After determining the target keyword according to step S220, the target keyword may be matched with the business logic library to determine a skill sentence corresponding to the target keyword.
S240, the server determines a control command corresponding to the voice information according to the skill sentence, and sends the control command to the intelligent equipment.
After determining the skill sentence according to step S230, the control command corresponding to the voice information may be determined according to the skill sentence corresponding to the target keyword, and the control command may then be sent to the intelligent device 10 through communication modes such as WiFi, Bluetooth, Zigbee and hotspot.
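Steps S230 and S240 can be sketched together as a two-stage lookup (the table contents and command schema below are illustrative assumptions; the patent does not specify concrete data structures):

```python
# Hypothetical sketch of steps S230-S240: map target keywords to a skill
# sentence via a business logic library, then derive a control command.
# All table contents and the command schema are illustrative assumptions.
from typing import Optional

# Business logic library: target-keyword sets -> skill sentences.
BUSINESS_LOGIC = {
    frozenset({"on", "speaker"}): "turn on the smart speaker",
    frozenset({"off", "speaker"}): "turn off the smart speaker",
}

# Skill sentences -> control commands (hypothetical schema).
SKILL_TO_COMMAND = {
    "turn on the smart speaker": {"device": "smart_speaker", "action": "wake"},
    "turn off the smart speaker": {"device": "smart_speaker", "action": "sleep"},
}

def to_control_command(target_keywords: list[str]) -> Optional[dict]:
    """Match target keywords with the business logic library to find the
    skill sentence, then look up the corresponding control command."""
    skill = BUSINESS_LOGIC.get(frozenset(target_keywords))
    return SKILL_TO_COMMAND.get(skill) if skill else None

cmd = to_control_command(["on", "speaker"])
print(cmd)  # {'device': 'smart_speaker', 'action': 'wake'}
```

Keying the business logic library on keyword *sets* (rather than exact sentences) is one way the generalized utterance "turn on the speaker" and the literal corpus phrase can resolve to the same skill sentence.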
In addition, the other explanations of steps S210 to S240 are the same as the explanations of steps S110 to S150 in the foregoing embodiment, and are not repeated herein.
S250, the intelligent equipment receives the control command and controls the intelligent household equipment to execute the control command.
After receiving the control command, the smart device 10 may control the smart home device to execute a corresponding operation according to the control command.
For example, if the control command is "turn on the speaker", the intelligent control panel that receives this command may control the intelligent speaker to change from the offline state to the awake state.
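On the device side, receiving a command and dispatching it to the target smart home device might look like the following sketch (the command schema and device class are assumptions for illustration, not the patent's implementation):

```python
# Hypothetical sketch of step S250: the intelligent device receives a
# control command and drives the matching smart home device.
# The command schema and device class are illustrative assumptions.

class SmartSpeaker:
    """Toy model of a smart speaker with an offline/awake state."""
    def __init__(self) -> None:
        self.state = "offline"

    def wake(self) -> None:
        self.state = "awake"

def execute(command: dict, devices: dict) -> None:
    """Route a control command to the named device and invoke the action."""
    device = devices[command["device"]]
    getattr(device, command["action"])()  # e.g. calls speaker.wake()

speaker = SmartSpeaker()
execute({"device": "smart_speaker", "action": "wake"},
        {"smart_speaker": speaker})
print(speaker.state)  # awake
```

The intelligent device 10 here acts purely as a router: it does not re-interpret the voice information, it only maps the already-resolved command onto a registered home device.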
The embodiment of the application provides an intelligent voice recognition method. After receiving voice information, the server 20 may split the voice information into at least one keyword, match the at least one keyword with the recognition corpora in the corpus to determine, among the at least one keyword, a target keyword corresponding to a recognition corpus, then match the target keyword with the business logic library to determine the skill sentence corresponding to the target keyword, determine a control command according to that skill sentence, and send the control command to the intelligent device 10. In this way, an accurate control command is determined from generalized voice information, and the intelligent device 10 controls the intelligent home device to execute the corresponding operation. Compared with the prior art, the method not only spares the user key-press operations, but also allows the intelligent device 10 to be controlled even when the user does not speak exactly the corpus known to the intelligent device 10.
As shown in fig. 8, a block diagram of an intelligent speech recognition apparatus 100 according to another embodiment of the present application is shown; the intelligent speech recognition apparatus 100 may include a receiving module 101 and a processing module 102.
The receiving module 101 is configured to receive voice information sent by the smart device 10.
The processing module 102 is configured to match the speech information with a corpus, determine a keyword corresponding to at least one recognition corpus in the corpus, and use the keyword corresponding to the recognition corpus as a target keyword in the speech information, where the corpus includes the at least one recognition corpus.
The processing module 102 is further configured to match the target keyword with a business logic library, and determine a skill statement matched with the target keyword, where the business logic library includes at least one skill statement.
The processing module 102 is further configured to determine a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
On this basis, the processing module 102 is further configured to parse the voice information to convert the voice information into a computer-recognizable language; match the parsed voice information with the corpus, determine a keyword corresponding to at least one recognition corpus in the corpus, and take the keyword corresponding to the recognition corpus as the target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
The processing module 102 is further configured to convert the voice information into text information; matching the text information with a corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
The corpus can be an intelligent home device corpus, and the target keywords include at least one of names of the intelligent home devices, functions of the intelligent home devices, and functional states of the intelligent home devices.
On this basis, as shown in fig. 9, the intelligent speech recognition apparatus 100 may further include a sending module 103. The sending module 103 is configured to send the control command to the intelligent device 10 after the control command corresponding to the voice information is determined according to the skill sentence matched with the target keyword, so that the intelligent device 10 controls the intelligent home device to execute the control command.
The explanation and the advantageous effects of the intelligent speech recognition apparatus 100 provided in the embodiment of the present application are the same as those of the foregoing embodiments, and are not repeated herein.
As shown in fig. 10, a block diagram of a smart device 10 according to another embodiment of the present application is shown, where the smart device 10 includes: one or more processors 11; a memory 12; and one or more applications 13, wherein the one or more applications 13 are stored in the memory and configured to be executed by the one or more processors 11, the one or more applications 13 configured to perform the methods of the foregoing embodiments.
The memory 12 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 12 may be used to store instructions, programs, code sets or instruction sets. The memory 12 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described herein, and the like. The stored data area may store data created by the smart device 10 during use (e.g., phone book, audio-video data, chat log data), etc.
Fig. 11 is a block diagram illustrating a computer-readable storage medium 200 according to another embodiment of the present application. The computer-readable storage medium 200 has stored therein a program code that can be called by a processor to execute the method described in the above-described method embodiments.
The computer-readable storage medium 200 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 200 includes a non-transitory computer-readable storage medium.
The computer-readable storage medium 200 has storage space for an application 13 that performs any of the method steps described above. The application 13 can be read from or written into one or more computer program products, and may, for example, be compressed in a suitable form.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
Claims (10)
1. An intelligent speech recognition method, comprising:
receiving voice information;
matching the voice information with a corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus;
matching the target keywords with a business logic library, and determining skill sentences matched with the target keywords, wherein the business logic library comprises at least one skill sentence;
and determining a control command corresponding to the voice information according to the skill sentence matched with the target keyword.
2. The method according to claim 1, wherein the step of matching the speech information with a corpus, determining a target keyword corresponding to at least one recognition corpus in the corpus, and using the keyword corresponding to the recognition corpus as the target keyword in the speech information, wherein the corpus includes at least one recognition corpus specifically comprises:
analyzing the voice information to convert the voice information into a language which can be recognized by a computer;
matching the analyzed voice information with the corpus to determine a target keyword corresponding to at least one recognition corpus of the corpus, and taking the keyword corresponding to the recognition corpus as the target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
3. The method according to claim 1, wherein the step of matching the speech information with the corpus to determine a keyword corresponding to at least one recognition corpus in the corpus, and using the keyword corresponding to the recognition corpus as a target keyword in the speech information, wherein the corpus includes at least one recognition corpus specifically includes:
converting the voice information into text information;
matching the text information with the corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus.
4. The method according to any one of claims 1-3, wherein after determining the control command corresponding to the voice message according to the skill sentence matching the target keyword, the method further comprises:
and sending the control command to intelligent equipment so that the intelligent equipment controls the intelligent household equipment to execute the control command.
5. The method according to any one of claims 1-3, wherein the corpus is a corpus of smart home devices.
6. The method according to claim 5, wherein the target keyword comprises at least one of a name of the smart home device, a function of the smart home device, and a functional state of the smart home device.
7. An intelligent speech recognition method, comprising:
the server receives voice information;
the server matches the voice information with a corpus, determines a keyword corresponding to at least one recognition corpus in the corpus, and takes the keyword as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus;
the server matches the target keyword with a business logic library and determines a skill sentence matched with the target keyword, wherein the business logic library comprises at least one skill sentence;
the server determines a control command corresponding to the voice information according to the skill sentence and sends the control command to the intelligent equipment;
and the intelligent equipment receives the control command and controls the intelligent household equipment to execute the control command.
8. An intelligent speech recognition device, comprising:
the receiving module is used for receiving voice information sent by the intelligent equipment;
the processing module is used for matching the voice information with a corpus, determining a keyword corresponding to at least one recognition corpus in the corpus, and taking the keyword corresponding to the recognition corpus as a target keyword in the voice information, wherein the corpus comprises at least one recognition corpus;
the processing module is further configured to match the target keyword with a business logic library, and determine a skill statement matched with the target keyword, where the business logic library includes at least one skill statement;
the processing module is further configured to determine a control command corresponding to the voice message according to the skill sentence matched with the target keyword.
9. An intelligent device, comprising:
one or more processors;
a memory; and
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-6 or claim 7.
10. A computer-readable storage medium having program code stored therein, the program code being callable by a processor to perform the method according to any one of claims 1 to 6 or claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011411651.1A CN112562670A (en) | 2020-12-03 | 2020-12-03 | Intelligent voice recognition method, intelligent voice recognition device and intelligent equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011411651.1A CN112562670A (en) | 2020-12-03 | 2020-12-03 | Intelligent voice recognition method, intelligent voice recognition device and intelligent equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112562670A true CN112562670A (en) | 2021-03-26 |
Family
ID=75048768
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011411651.1A Pending CN112562670A (en) | 2020-12-03 | 2020-12-03 | Intelligent voice recognition method, intelligent voice recognition device and intelligent equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112562670A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113611306A (en) * | 2021-09-07 | 2021-11-05 | 云知声(上海)智能科技有限公司 | Intelligent household voice control method and system based on user habits and storage medium |
CN114049877A (en) * | 2021-11-04 | 2022-02-15 | 北京奇天大胜网络科技有限公司 | Voice digital human-television information interaction method and system based on Internet of things |
CN114911381A (en) * | 2022-04-15 | 2022-08-16 | 青岛海尔科技有限公司 | Interactive feedback method and device, storage medium and electronic device |
CN115421396A (en) * | 2022-09-29 | 2022-12-02 | 深圳康佳电子科技有限公司 | Intelligent household equipment control method and device and electronic equipment |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108305626A (en) * | 2018-01-31 | 2018-07-20 | 百度在线网络技术(北京)有限公司 | The sound control method and device of application program |
CN110286601A (en) * | 2019-07-01 | 2019-09-27 | 珠海格力电器股份有限公司 | Method and device for controlling intelligent household equipment, control equipment and storage medium |
CN110942773A (en) * | 2019-12-10 | 2020-03-31 | 上海雷盎云智能技术有限公司 | Method and device for controlling intelligent household equipment through voice |
WO2020135067A1 (en) * | 2018-12-24 | 2020-07-02 | 同方威视技术股份有限公司 | Voice interaction method and device, robot, and computer readable storage medium |
CN111599362A (en) * | 2020-05-20 | 2020-08-28 | 湖南华诺科技有限公司 | System and method for self-defining intelligent sound box skill and storage medium |
- 2020-12-03: Application CN202011411651.1A filed in CN; published as CN112562670A/en, status: active, Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108305626A (en) * | 2018-01-31 | 2018-07-20 | 百度在线网络技术(北京)有限公司 | The sound control method and device of application program |
WO2020135067A1 (en) * | 2018-12-24 | 2020-07-02 | 同方威视技术股份有限公司 | Voice interaction method and device, robot, and computer readable storage medium |
CN110286601A (en) * | 2019-07-01 | 2019-09-27 | 珠海格力电器股份有限公司 | Method and device for controlling intelligent household equipment, control equipment and storage medium |
CN110942773A (en) * | 2019-12-10 | 2020-03-31 | 上海雷盎云智能技术有限公司 | Method and device for controlling intelligent household equipment through voice |
CN111599362A (en) * | 2020-05-20 | 2020-08-28 | 湖南华诺科技有限公司 | System and method for self-defining intelligent sound box skill and storage medium |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113611306A (en) * | 2021-09-07 | 2021-11-05 | 云知声(上海)智能科技有限公司 | Intelligent household voice control method and system based on user habits and storage medium |
CN114049877A (en) * | 2021-11-04 | 2022-02-15 | 北京奇天大胜网络科技有限公司 | Voice digital human-television information interaction method and system based on Internet of things |
CN114911381A (en) * | 2022-04-15 | 2022-08-16 | 青岛海尔科技有限公司 | Interactive feedback method and device, storage medium and electronic device |
CN115421396A (en) * | 2022-09-29 | 2022-12-02 | 深圳康佳电子科技有限公司 | Intelligent household equipment control method and device and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11302302B2 (en) | Method, apparatus, device and storage medium for switching voice role | |
US10803869B2 (en) | Voice enablement and disablement of speech processing functionality | |
CN112562670A (en) | Intelligent voice recognition method, intelligent voice recognition device and intelligent equipment | |
CN112201246B (en) | Intelligent control method and device based on voice, electronic equipment and storage medium | |
CN105592343B (en) | Display device and method for question and answer | |
US20160293168A1 (en) | Method of setting personal wake-up word by text for voice control | |
JP5119055B2 (en) | Multilingual voice recognition apparatus, system, voice switching method and program | |
KR102411619B1 (en) | Electronic apparatus and the controlling method thereof | |
JP2017107078A (en) | Voice interactive method, voice interactive device, and voice interactive program | |
JP2014191030A (en) | Voice recognition terminal and voice recognition method using computer terminal | |
CN108882101B (en) | Playing control method, device, equipment and storage medium of intelligent sound box | |
CN107844470B (en) | Voice data processing method and equipment thereof | |
TW201440482A (en) | Voice answering method and mobile terminal apparatus | |
US10540973B2 (en) | Electronic device for performing operation corresponding to voice input | |
CN112420044A (en) | Voice recognition method, voice recognition device and electronic equipment | |
CN113674742B (en) | Man-machine interaction method, device, equipment and storage medium | |
CN112767916A (en) | Voice interaction method, device, equipment, medium and product of intelligent voice equipment | |
CN113611316A (en) | Man-machine interaction method, device, equipment and storage medium | |
KR20200057501A (en) | ELECTRONIC APPARATUS AND WiFi CONNECTING METHOD THEREOF | |
CN113643684A (en) | Speech synthesis method, speech synthesis device, electronic equipment and storage medium | |
CN110473524B (en) | Method and device for constructing voice recognition system | |
CN112787899B (en) | Equipment voice interaction method, computer readable storage medium and refrigerator | |
CN112002325B (en) | Multi-language voice interaction method and device | |
CN114822598A (en) | Server and speech emotion recognition method | |
CN114299941A (en) | Voice interaction method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||