CN109347708B

CN109347708B - Voice recognition method and device, household appliance, cloud server and medium

Info

Publication number: CN109347708B
Application number: CN201811194962.XA
Authority: CN
Inventors: 李保水; 廖湖锋; 王慧君; 陶梦春; 毛跃辉; 郑文成
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2018-10-15
Filing date: 2018-10-15
Publication date: 2020-08-04
Anticipated expiration: 2038-10-15
Also published as: CN109347708A

Abstract

The invention discloses a voice recognition method and device, a household appliance, a cloud server and a medium, which address the problem in the prior art that a household appliance performs semantic analysis whenever it collects any voice information. The method comprises the following steps: a cloud server receives voice information collected and sent by a household appliance; the cloud server judges, according to a pre-trained voice matching model, whether the voice information belongs to a voice category containing user voice information; and if so, the cloud server sends the household appliance an instruction for processing according to the voice information, so that the household appliance performs the corresponding operation according to the voice information.

Description

Voice recognition method and device, household appliance, cloud server and medium

Technical Field

The invention mainly relates to the technical field of smart home, in particular to a voice recognition method, a voice recognition device, household electrical appliances, a cloud server and a medium.

Background

At present, more and more voice recognition products are available, and as the technology improves and adoption grows, users are gradually accepting this mode of interaction. With the continuous improvement of voice interaction technology and artificial intelligence, application scenarios are expanding rapidly beyond voice assistants, smart speakers and the like. In use, a voice recognition product collects sound from the surrounding environment, performs semantic analysis on it, and executes the user's voice instruction. Noise poses a challenge to speech recognition: under noise, voice control is prone to misrecognition or failure to recognize. In addition, the surrounding environment may contain sound that is not user speech, such as noise from other electrical appliances or periodic white noise. When the voice recognition product collects such sound and performs semantic analysis on it, the workload increases to no purpose, which affects the usability and service life of the product.

Therefore, how to make the voice recognition product perform semantic parsing and execute the user voice instruction operation only when the user voice is collected is an urgent problem to be solved.

Disclosure of Invention

The embodiment of the invention provides a voice recognition method and device, household electrical appliance equipment, a cloud server and a medium, which are used for solving the problem that in the prior art, the household electrical appliance equipment executes semantic analysis operation as long as voice information is acquired.

The embodiment of the invention provides a voice recognition method, which is applied to a cloud server and comprises the following steps:

the cloud server receives voice information collected and sent by a household appliance;

judging whether the voice information is of a voice category containing the voice information of the user according to a pre-trained voice matching model;

if yes, sending an instruction for processing according to the voice information to the household appliance, and enabling the household appliance to perform corresponding operation according to the voice information.

Further, before the sending the instruction for processing according to the voice information to the home appliance device, the method further includes:

analyzing the user voice information contained in the voice information;

judging whether the user voice information is a first control instruction for controlling the household appliance;

the sending the command processed according to the voice information to the household appliance comprises:

and if the user voice information is a first control instruction for controlling the household appliance, sending the first control instruction to the household appliance.

Further, if the voice information is not of a voice category containing user voice information, or if there is no first control instruction matching the user voice information, the method further comprises:

determining that the processing of the voice information is finished; or

sending, to the household appliance, a third control instruction for prohibiting the voice information from being analyzed.

Further, the sending the instruction processed according to the voice information to the home appliance device includes:

and sending a second control instruction for analyzing the voice information to the household appliance equipment, so that the household appliance equipment analyzes the voice information.

Further, the training process of the voice matching model comprises the following steps:

acquiring sample voice information, wherein the sample voice information carries labeling information of voice categories to which the sample voice information belongs, wherein the labeling information corresponding to different voice categories is different, and the voice categories to which the sample voice information belongs comprise voice categories containing user voice information;

inputting each sample voice information into a voice matching model;

and training the voice matching model according to the labeling information of the voice category to which the voice information of each sample belongs and the output of the voice matching model.

The embodiment of the invention provides a voice recognition method, which is applied to household appliances and comprises the following steps:

the household appliance equipment sends the collected voice information to the cloud server;

receiving an instruction which is sent by a cloud server and processed according to the voice information;

and executing corresponding operation on the voice information according to the instruction.

Further, the performing the corresponding operation on the voice information according to the instruction includes:

and if the command is a first control command, executing a corresponding function according to the first control command, wherein the first control command is sent when the cloud server analyzes the user voice information contained in the voice information and judges that the user voice information is the first control command for controlling the household appliance.

Further, the performing corresponding operations on the voice information according to the instructions further includes:

if the instruction is a second control instruction, analyzing the user voice information contained in the voice information, wherein the second control instruction is an instruction for analyzing the voice information;

judging whether the user voice information is a target control instruction for controlling the household appliance;

and if so, executing the corresponding function according to the target control instruction.

Further, the method further comprises:

and receiving a third control instruction sent by the cloud server, and not analyzing the voice information, wherein the third control instruction is sent when the cloud server prohibits the household appliance from analyzing the voice information.
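The appliance-side handling of the three kinds of control instruction described above can be sketched as a simple dispatch. This is a minimal illustration, not the patented implementation: the names `Kind`, `Instruction`, `handle_instruction`, and the `execute` and `parse_locally` callbacks are all hypothetical, and the source does not specify any message format.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Optional


class Kind(Enum):
    FIRST = 1   # cloud already parsed the speech; the instruction carries the command
    SECOND = 2  # cloud asks the appliance to parse the voice information itself
    THIRD = 3   # cloud prohibits parsing of this voice information


@dataclass
class Instruction:
    kind: Kind
    voice_id: str                  # hypothetical identifier of the voice information
    command: Optional[str] = None  # only meaningful for FIRST


def handle_instruction(
    instr: Instruction,
    execute: Callable[[str], str],
    parse_locally: Callable[[str], Optional[str]],
) -> str:
    """Return a short description of what the appliance did."""
    if instr.kind is Kind.FIRST:
        if instr.command is None:
            raise ValueError("a first control instruction must carry a command")
        return execute(instr.command)
    if instr.kind is Kind.SECOND:
        # Local parsing yields a target control instruction, or None if there is none.
        target = parse_locally(instr.voice_id)
        return execute(target) if target is not None else "ignored"
    # THIRD: the voice information is not analyzed at all.
    return "ignored"
```

Passing the two callbacks in keeps the dispatch independent of how a particular appliance executes a function or parses speech.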

The embodiment of the invention provides a voice recognition device, which is applied to a cloud server and comprises the following components:

the first receiving module is used for receiving the voice information which is sent by the household appliance and collected by the household appliance;

the first judging module is used for judging whether the voice information is of a voice category containing the voice information of the user according to the pre-trained voice matching model, and if so, triggering the first sending module;

and the first sending module is used for sending an instruction for processing according to the voice information to the household appliance so that the household appliance performs corresponding operation according to the voice information.

Further, the apparatus further comprises:

the analysis module is used for analyzing the user voice information contained in the voice information; judging whether the user voice information is a first control instruction for controlling the household appliance;

the first sending module is specifically configured to send the first control instruction to the home appliance if the user voice information is the first control instruction for controlling the home appliance.

Further, the apparatus further comprises:

the determining module is used for determining that the voice information processing is finished; or sending a third control instruction for prohibiting the voice information from being analyzed to the household appliance.

Further, the first sending module is specifically further configured to send a second control instruction for analyzing the voice information to the home appliance device, so that the home appliance device analyzes the voice information.

Further, the apparatus further comprises:

the model training module is used for acquiring sample voice information, wherein the sample voice information carries labeling information of voice categories to which the sample voice information belongs, the labeling information corresponding to different voice categories is different, and the voice categories to which the sample voice information belongs comprise voice categories containing user voice information; inputting each sample voice information into a voice matching model; and training the voice matching model according to the labeling information of the voice category to which the voice information of each sample belongs and the output of the voice matching model.

The embodiment of the invention provides a voice recognition device, which is applied to household appliances and comprises:

the second sending module is used for sending the collected voice information to the cloud server;

the second receiving module is used for receiving an instruction which is sent by the cloud server and processed according to the voice information;

and the execution module is used for executing corresponding operation on the voice information according to the instruction.

Further, the execution module is specifically configured to execute a corresponding function according to the first control instruction if the instruction is the first control instruction, where the first control instruction is sent by the cloud server when the user voice information included in the voice information is analyzed and the user voice information is judged to be the first control instruction for controlling the home appliance device.

Further, the execution module is specifically configured to: if the instruction is a second control instruction, analyze the user voice information contained in the voice information, where the second control instruction is an instruction for analyzing the voice information; judge whether the user voice information is a target control instruction for controlling the household appliance; and if so, execute the corresponding function according to the target control instruction.

Further, the apparatus further comprises:

and the third receiving module is used for receiving a third control instruction sent by the cloud server and not analyzing the voice information, wherein the third control instruction is sent when the cloud server prohibits the household appliance from analyzing the voice information.

The embodiment of the invention provides a cloud server, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory has stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of any of the above methods applied to a cloud server.

An embodiment of the present invention provides a computer-readable storage medium storing a computer program executable by a cloud server, which, when run on the cloud server, causes the cloud server to perform the steps of any one of the above methods applied to the cloud server.

The embodiment of the invention provides a household appliance, which comprises a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory has stored therein a computer program which, when executed by the processor, causes the processor to perform the steps of any of the above methods applied to a home appliance.

An embodiment of the present invention provides a computer-readable storage medium storing a computer program executable by a household appliance, which, when run on the household appliance, causes the household appliance to perform the steps of any one of the above methods applied to the household appliance.

The embodiment of the invention provides a voice recognition method and device, a household appliance, a cloud server and a medium. The method comprises the following steps: a cloud server receives voice information collected and sent by a household appliance; the cloud server judges, according to a pre-trained voice matching model, whether the voice information belongs to a voice category containing user voice information; and if so, the cloud server sends the household appliance an instruction for processing according to the voice information, so that the household appliance performs the corresponding operation according to the voice information. Because the cloud server judges whether the collected voice information belongs to a voice category containing user voice information, and sends an instruction to the household appliance only if it does, the problem in the prior art that the household appliance performs semantic analysis whenever it collects any voice information is solved.

Drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic process diagram of a speech recognition method according to embodiment 1 of the present invention;

Fig. 2 is a diagram illustrating an exemplary speech matching model according to an embodiment of the present invention;

Fig. 3 is a schematic process diagram of a speech recognition method according to embodiment 6 of the present invention;

Fig. 4 is a schematic process diagram of a speech recognition method according to embodiment 8 of the present invention;

Fig. 5 is a schematic diagram of a process for constructing a speech matching model according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a speech recognition apparatus according to embodiment 9 of the present invention;

Fig. 7 is a schematic structural diagram of a speech recognition apparatus according to embodiment 10 of the present invention;

Fig. 8 is a schematic structural diagram of a cloud server according to an embodiment of the present invention;

Fig. 9 is a schematic structural diagram of a home appliance according to an embodiment of the present invention.

Detailed Description

The present invention will be described in further detail with reference to the attached drawings, and it should be understood that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Example 1:

Fig. 1 is a schematic process diagram of a speech recognition method according to an embodiment of the present invention, where the process includes the following steps:

S101: The cloud server receives the voice information collected and sent by the household appliance.

The voice recognition method provided by the embodiment of the invention is applied to the cloud server.

In the prior art, a household appliance with a voice recognition function performs semantic analysis whenever it collects voice information. The surrounding environment, however, is likely to contain sound that is not user speech, such as noise from other appliances or periodic white noise. If the household appliance also processes such sound, its workload undoubtedly increases, its service life is shortened, misrecognition becomes more likely, and the user experience suffers.

In order to solve these problems in the prior art, in the scheme provided by the embodiment of the invention, the household appliance sends the collected voice information to the cloud server. The cloud server judges whether the voice information collected by the household appliance contains user voice information, and sends an instruction to the household appliance according to the judgment result, so that the household appliance completes the operation according to the instruction.

Specifically, the household appliance is provided with a voice recognition module, which can recognize and collect voice information. The household appliance is also provided with a communication module, for example a WIFI wireless communication module, so that it can connect to the cloud server and send the collected voice information to the cloud server.

S102: Judge, according to a pre-trained voice matching model, whether the voice information belongs to a voice category containing user voice information.

In order to effectively reduce the workload of the household appliance and improve its working efficiency, the cloud server stores a voice matching model. The voice matching model can identify the category of the voice information, specifically whether the voice information belongs to the voice category containing user voice information. It can also identify other voice categories, such as appliance noise and periodic white noise.

When it recognizes the voice category, the voice matching model outputs identification information of the category to which the voice information belongs. For example, the identification information of the voice category containing user voice information is 01, that of the voice category of appliance sound is 02, and that of the voice category of periodic white noise is 03.

After receiving the voice information sent by the household appliance, the cloud server inputs the voice information into the voice matching model and checks whether the identification information output by the model includes the identification information of the voice category containing user voice information. Specifically, if the output of the voice matching model includes the identification information 01, the voice information is considered to belong to the voice category containing user voice information, and subsequent processing is performed.
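The check described above reduces to a membership test on the model output. The following sketch assumes the model returns a collection of category identifiers as strings; the constant and function names are hypothetical, and only the identifier values 01, 02 and 03 come from the text.

```python
from typing import Iterable

# Identifier values taken from the example in the description.
USER_VOICE = "01"
APPLIANCE_SOUND = "02"
PERIODIC_WHITE_NOISE = "03"


def contains_user_voice(model_output: Iterable[str]) -> bool:
    """True if the model output includes the user-voice category identifier."""
    return USER_VOICE in set(model_output)
```

Treating the output as a set of identifiers, rather than a single label, matches the wording that the output "includes" the identifier 01, which suggests one clip may carry more than one category.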

S103: if yes, sending an instruction for processing according to the voice information to the household appliance, and enabling the household appliance to perform corresponding operation according to the voice information.

If the cloud server judges that the voice information is the voice category containing the voice information of the user, the fact that the household appliance needs to perform corresponding operation according to the voice information is indicated. At this time, an instruction for processing according to the voice information needs to be sent to the household appliance.

Specifically, the instruction may carry identification information of the voice information, so that the household appliance knows which voice information is to be processed. The identification information may be the time at which the voice information was collected, or any other information that uniquely identifies the voice information, such as a number assigned to it.
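As an illustration of the two kinds of identification information mentioned above, the sketch below derives an identifier either from the collection time or from a running number. The identifier formats, the function names and the `ProcessInstruction` type are assumptions made for this example; the source does not prescribe any of them.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timezone

_sequence = itertools.count(1)  # hypothetical per-process running number


def id_from_collection_time(collected_at: datetime) -> str:
    """Identifier derived from a timezone-aware collection time, rendered in UTC."""
    if collected_at.tzinfo is None:
        raise ValueError("collected_at must be timezone-aware")
    return collected_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S.%fZ")


def id_from_number() -> str:
    """Identifier derived from a number assigned to the voice information."""
    return f"voice-{next(_sequence):06d}"


@dataclass(frozen=True)
class ProcessInstruction:
    """Instruction that tells the appliance which voice information to process."""
    voice_id: str
```

A time-based identifier is unique only if no two clips share the same timestamp on one appliance, so the numbered variant may be the safer choice when clips are collected in quick succession.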

Because the household appliance is provided with the communication module, it can receive the instruction sent by the cloud server for processing according to the voice information, and perform the corresponding processing according to the instruction.

In the embodiment of the invention, whether the voice information acquired by the household appliance is the voice category containing the voice information of the user is judged through the cloud server, if so, an instruction is sent to the household appliance, so that the household appliance performs corresponding operation according to the voice information, and the problem that in the prior art, the household appliance executes semantic analysis operation as long as the voice information is acquired is solved.

Example 2:

In order to further reduce the workload of the household appliance in processing voice information, on the basis of the above embodiment, in an embodiment of the present invention, before the instruction for processing according to the voice information is sent to the household appliance, the method further includes:

analyzing the user voice information contained in the voice information;

The cloud server stores a large amount of voice data and has a strong semantic analysis capability. When the cloud server judges that the received voice information belongs to a voice category containing user voice information, it analyzes the voice information using its semantic analysis function to determine the user voice information contained in it. The cloud server then recognizes the semantics of the user voice information and judges whether they correspond to a first control instruction for controlling the household appliance, specifically, whether the semantics contain a target control instruction defined for the household appliance. If so, the cloud server sends the first control instruction to the household appliance.

Specifically, the process of performing semantic parsing is the prior art, and is not described in detail in the embodiment of the present invention.

Alternatively, the cloud server stores target voice instructions with which the household appliance can be controlled by voice. The cloud server matches the user voice information against the target voice instructions and, if a match is found, sends the matched target voice instruction to the household appliance. To distinguish it from the other instructions described below, this target voice instruction is referred to as the first control instruction; that is, the first control instruction is sent to the household appliance.
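The matching alternative can be sketched as a table lookup. The example assumes the user voice information has already been converted to text by some speech-to-text step that is outside this sketch; the table contents, the command codes and the function names are hypothetical.

```python
from typing import Dict, Optional

# Hypothetical table of target voice instructions stored on the cloud server,
# mapping a normalized phrase to the first control instruction to send.
TARGET_VOICE_INSTRUCTIONS: Dict[str, str] = {
    "turn on the air conditioner": "POWER_ON",
    "turn off the air conditioner": "POWER_OFF",
    "set temperature to 26 degrees": "SET_TEMP_26",
}


def normalize(text: str) -> str:
    """Lowercase, collapse whitespace, and drop surrounding punctuation."""
    return " ".join(text.lower().split()).strip(".,!? ")


def match_first_control_instruction(user_text: str) -> Optional[str]:
    """Return the matched first control instruction, or None if nothing matches."""
    return TARGET_VOICE_INSTRUCTIONS.get(normalize(user_text))
```

Exact matching after normalization is the simplest possible strategy; a real system would more likely use the semantic analysis described above, which the source treats as prior art.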

In the embodiment of the invention, the user voice information is analyzed at the cloud server, whether the user voice information is the first control instruction for controlling the household appliance is judged, and if the user voice information is the first control instruction for controlling the household appliance, the first control instruction is sent to the household appliance, so that the workload of the household appliance is further reduced, the phenomenon that an error instruction is sent to the household appliance is avoided, and the user experience effect is improved.

Example 3:

In order that the household appliance does not process voice categories that do not contain user voice information, on the basis of the foregoing embodiments, in an embodiment of the present invention, if the voice information is not of a voice category containing user voice information, or if there is no first control instruction matching the user voice information, the method further includes:

determining that the processing of the voice information is finished; or

sending, to the household appliance, a third control instruction for prohibiting the voice information from being analyzed.

If the cloud server judges that the received voice information does not belong to a voice category containing user voice information, or finds after semantic analysis that no first control instruction matches the user voice information, the voice information is considered not to need processing. In this case the processing can be regarded as finished, and no message is sent to the household appliance. Because the household appliance receives no message, it performs no further operation on the voice information.

Alternatively, if the cloud server judges that the received voice information does not belong to a voice category containing user voice information, or finds after semantic analysis that no first control instruction matches the user voice information, it may send a third control instruction to the household appliance. The third control instruction prohibits the household appliance from performing semantic analysis on the collected voice information, and carries the identification information of the voice information, so that the household appliance does not analyze the voice information corresponding to that identification information. On receiving the third control instruction, the household appliance performs no subsequent processing on the voice information.
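The two alternatives above, finishing silently or sending a third control instruction, can be sketched as one decision function on the cloud server. The dictionary message shape, the `send_prohibit` switch and the function name are assumptions for illustration only.

```python
from typing import Dict, Iterable, Optional

USER_VOICE = "01"  # identifier of the voice category containing user voice information


def decide_reply(
    categories: Iterable[str],
    matched_instruction: Optional[str],
    voice_id: str,
    send_prohibit: bool,
) -> Optional[Dict[str, str]]:
    """Return the message to send to the appliance, or None to send nothing."""
    if USER_VOICE in set(categories) and matched_instruction is not None:
        return {"type": "first", "voice_id": voice_id, "command": matched_instruction}
    if send_prohibit:
        # Explicitly prohibit analysis of the identified voice information.
        return {"type": "third", "voice_id": voice_id}
    # Processing is finished; the appliance receives nothing and does nothing further.
    return None
```

Sending nothing saves a network message, while the explicit third control instruction lets the appliance discard the buffered voice information immediately instead of waiting.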

In the embodiment of the invention, the effect that the household appliance does not execute the semantic analysis function on the non-user voice is achieved by sending a third control instruction for prohibiting the household appliance from executing the semantic analysis function or determining that the voice information processing is finished.

Example 4:

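In order to ensure that the household appliance analyzes and processes only user voice information, on the basis of the foregoing embodiments, in an embodiment of the present invention, the sending, to the household appliance, of the instruction for processing according to the voice information includes:

sending, to the household appliance, a second control instruction for analyzing the voice information, so that the household appliance analyzes the voice information.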

Because relatively little of the collected voice information contains user voice information, when the cloud server judges that the voice information belongs to a voice category containing user voice information, it may, in order to reduce the load on the server, send the household appliance a second control instruction for analyzing the voice information. The second control instruction carries the identification information of the voice information, or may carry the voice information itself, so that the household appliance analyzes the voice information corresponding to the identification information.

The analyzing of the voice information by the household appliance equipment comprises the following steps: the household appliance analyzes the voice information by utilizing a semantic analysis function configured by the household appliance, determines the user voice information contained in the voice information, identifies the semantics of the user voice information, judges whether a target control instruction for controlling the household appliance is contained in the semantics of the user voice information, and if so, executes corresponding operation according to the target control instruction to complete the voice control which is required by the user.

Alternatively, the household appliance itself stores target voice instructions with which it can be controlled by voice. It matches the user voice information against the target voice instructions and, if a match is found, performs the corresponding operation according to the matched target voice instruction.
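How the appliance might act on a second control instruction is sketched below. The instruction either carries the voice information itself or only its identifier, in which case the appliance looks the clip up in a local buffer. The buffer, the `transcribe` callback and all names are hypothetical; the source does not describe how the appliance stores collected voice information.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class SecondControlInstruction:
    voice_id: str
    audio: Optional[bytes] = None  # set only if the cloud sends the voice information back


def handle_second_instruction(
    instr: SecondControlInstruction,
    buffered: Dict[str, bytes],
    transcribe: Callable[[bytes], str],
    target_commands: Dict[str, Callable[[], str]],
) -> Optional[str]:
    """Parse the identified clip locally and run the matching target command, if any."""
    if instr.audio is not None:
        audio = instr.audio
    else:
        # Remove the clip from the buffer so it is processed at most once.
        audio = buffered.pop(instr.voice_id, None)
    if audio is None:
        return None  # unknown identifier: nothing to analyze
    text = transcribe(audio).strip().lower()
    action = target_commands.get(text)
    return action() if action is not None else None
```

For example, with `transcribe = lambda b: b.decode()` and `target_commands = {"power on": lambda: "on"}`, an instruction for a buffered clip containing `b"Power On"` returns `"on"`.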

In the embodiment of the invention, the aim of enabling the household appliance to process the user voice information only is achieved by sending the instruction which allows the household appliance to execute semantic analysis processing on the collected user voice information to the household appliance.

Example 5:

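In order to determine whether the collected voice information belongs to a voice category containing user voice information, on the basis of the above embodiments, in an embodiment of the present invention, the training process of the voice matching model includes:

acquiring sample voice information, wherein each piece of sample voice information carries labeling information of the voice category to which it belongs, the labeling information differs between voice categories, and the voice categories include the voice category containing user voice information;

inputting each piece of sample voice information into the voice matching model;

and training the voice matching model according to the labeling information of the voice category to which each piece of sample voice information belongs and the output of the voice matching model.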

To train the voice matching model, a large amount of sample voice information can be collected by household appliances. The voice category of each piece of sample voice information is labeled manually. The voice categories include the voice category containing user voice information, and may also include voice categories of non-user voice information, such as the voice category of appliance sound and the voice category of periodic noise.

Specifically, each sample voice information contains identification information of a voice category to which it belongs, for example, identification information of a voice category of user voice information is 01, identification information of a voice category of electrical appliance type sound is 02, identification information of a voice category of periodic noise is 03, and the like.

The sample voice information carrying the identification information is input into the voice matching model in order to train it; the model may be a convolutional neural network model. Each piece of sample voice information carrying identification information is input into the convolutional neural network model, and the voice matching model is trained according to the identification information of the voice category to which each piece of sample voice information belongs and the output of the voice matching model.
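The supervised training loop described above (labeled samples in, category identifier out) can be illustrated with a deliberately simple stand-in. The sketch below is not a convolutional neural network: it uses a nearest-centroid classifier over small, made-up feature vectors purely to show how labeled samples determine the model and how the model then outputs a category identifier. The feature vectors, `train` and `predict` are all hypothetical.

```python
from collections import defaultdict
from math import dist
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]  # (feature vector, category identifier)


def train(samples: List[Sample]) -> Dict[str, List[float]]:
    """Compute one centroid per labeled voice category."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = defaultdict(int)
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in vec] for label, vec in sums.items()}


def predict(model: Dict[str, List[float]], features: Sequence[float]) -> str:
    """Return the identifier of the category whose centroid is closest."""
    return min(model, key=lambda label: dist(model[label], features))
```

In practice the feature vectors would be replaced by audio features such as spectrograms, and the centroid model by the neural network the description mentions.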

The voice matching model shown in Fig. 2 can recognize the voice category of voice information and output the corresponding identification information, for example for the voice category of user voice information, the voice category of appliance sound, and the voice category of periodic noise. Specifically, the collected voice information is input into the voice matching model, which outputs the identification information of the voice category to which the voice information belongs. If the output includes the identification information of the voice category of user voice information, the voice information belongs to the voice category containing user voice information.

In the embodiment of the invention, the voice matching model is obtained by training a large amount of sample voice information, and the collected voice information can be identified through the voice matching model.

Example 6:

Fig. 3 is a schematic process diagram of a speech recognition method according to an embodiment of the present invention, where the process includes the following steps:

S301: The household appliance sends the collected voice information to the cloud server.

The voice recognition method provided by the embodiment of the invention is applied to household appliances with the voice recognition function.

At present, more and more household appliances have a voice recognition function. Existing household appliances execute semantic analysis whenever voice information is collected, yet part of the collected voice information contains no user voice information. Operating on non-user voice information is meaningless, reduces the working efficiency of the household appliance, and wastes its processing resources.

To solve this problem, in the embodiment of the invention, after collecting the voice information, the household appliance sends it to the cloud server, and the cloud server judges whether the voice information contains user voice information and whether a semantic analysis operation needs to be executed.

Specifically, a voice recognition module is provided in the household appliance, and this module can recognize and collect voice information. In addition, in order to connect to the cloud server, the household appliance is also provided with a communication module, for example a WIFI wireless communication module, through which the household appliance sends the collected voice information to the cloud server.

S302: Receive an instruction, sent by the cloud server, for processing according to the voice information.

The household appliance sends the collected voice information to the cloud server, and the cloud server judges whether the voice information is of a voice category containing user voice information. If so, the cloud server sends the household appliance an instruction for processing according to the voice information. The household appliance receives the instruction through its communication module and executes the corresponding operation on the voice information according to the instruction.

The instruction carries identification information of the voice information, so that after receiving the instruction the household appliance can determine which voice information needs to be processed. For example, the identification information may be the time at which the voice information was collected. After receiving the instruction, the household appliance reads the identification information contained in it, that is, the collection time of the voice information, looks up the voice information corresponding to that time, and processes it.
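Looking up a clip by its collection time can be sketched as a small buffer on the appliance. This is only an illustration: the class `ClipBuffer`, the timestamp format and the instruction payload layout are assumptions, not details given in the description.

```python
from typing import Dict, Optional


class ClipBuffer:
    """Holds captured clips until the cloud server says what to do with them."""

    def __init__(self) -> None:
        self._clips: Dict[str, bytes] = {}

    def store(self, captured_at: str, audio: bytes) -> None:
        """Keep a clip under its collection time, which serves as its identification."""
        self._clips[captured_at] = audio

    def take(self, captured_at: str) -> Optional[bytes]:
        """Remove and return the clip for this collection time, or None if unknown."""
        return self._clips.pop(captured_at, None)


buffer = ClipBuffer()
buffer.store("2018-10-15T08:30:00", b"\x01\x02")
instruction = {"clip_id": "2018-10-15T08:30:00"}  # hypothetical instruction payload
clip = buffer.take(instruction["clip_id"])
print(clip is not None)
```

Removing the clip on lookup keeps the buffer from growing; an actual appliance would also need a policy for clips that never receive an instruction.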

S303: Execute the corresponding operation on the voice information according to the instruction.

After the household appliance receives the instruction sent by the cloud server, because the instruction carries the identification information of the voice information, the household appliance can execute the corresponding operation on the voice information identified by that identification information according to the instruction, and thereby determine whether to trigger a corresponding action.

In the embodiment of the invention, the household appliance sends the collected voice information to the cloud server, and the cloud server judges whether the voice information collected by the household appliance is of a voice category containing user voice information. If so, the cloud server sends an instruction to the household appliance, and the household appliance receives the instruction and executes the corresponding operation on the voice information according to the instruction. This solves the problem in the prior art that the household appliance executes the parsing operation whenever voice information is collected.

Example 7:

In order to ensure that the household appliance only processes the voice information of the user, on the basis of the foregoing embodiments, in an embodiment of the present invention, performing the corresponding operation on the voice information according to the instruction includes:

The cloud server stores a large amount of voice data and has a strong semantic analysis capability. Therefore, in order to further reduce the voice processing workload of the household appliance, when the cloud server judges that the received voice information is of a voice category containing user voice information, the cloud server parses the voice information using its own semantic analysis function and determines the user voice information contained in it. The cloud server then recognizes the semantics of the user voice information and judges whether the semantics correspond to a first control instruction for controlling the household appliance. Specifically, it judges whether the semantics contain a target control instruction set for the household appliance, and if so, sends the first control instruction to the household appliance.

The first control instruction is sent by the cloud server when it judges that the user voice information is a first control instruction for controlling the household appliance. Therefore, if the household appliance receives the first control instruction, it can execute the corresponding operation according to the first control instruction and thereby carry out the user's voice command.

If the household appliance receives a second control instruction sent by the cloud server, where the second control instruction is an instruction that allows the household appliance to execute the semantic analysis function and carries identification information of the voice information to be processed, the household appliance determines the corresponding voice information according to the identification information carried in the second control instruction and parses that voice information. For example, the identification information may be the time at which the voice information was collected. After receiving the second control instruction, the household appliance reads the identification information contained in it, that is, the collection time of the voice information, looks up the voice information corresponding to that time, and processes it.

However, the user voice information may be ordinary speech that contains no control command for the household appliance. Therefore, after performing semantic analysis on the user voice information, the household appliance judges whether its semantics correspond to a target control instruction for controlling the household appliance, and if so, executes the corresponding operation according to the target control instruction. Alternatively, the household appliance stores target voice commands capable of voice-controlling the household appliance, matches the user voice information against these target voice commands, and, if the matching succeeds, executes the corresponding operation according to the matched target voice command. If no target control instruction is contained or the matching fails, the user voice information is considered irrelevant and no operation is executed.
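Matching recognized speech against stored target voice commands can be sketched as a phrase lookup. The command phrases, the operation names and the substring matching rule below are assumptions chosen for illustration; the description does not specify how matching is performed.

```python
from typing import Optional

# Hypothetical stored target voice commands and the operations they map to.
TARGET_COMMANDS = {
    "turn on": "POWER_ON",
    "turn off": "POWER_OFF",
    "raise temperature": "TEMP_UP",
}


def match_command(recognized_text: str) -> Optional[str]:
    """Return the operation for the first stored command found in the text, else None."""
    text = recognized_text.lower()
    for phrase, operation in TARGET_COMMANDS.items():
        if phrase in text:
            return operation
    return None  # ordinary speech: the appliance performs no operation


hit = match_command("Please turn on the air conditioner")
miss = match_command("What a nice day")
print(hit, miss)
```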

In the embodiment of the invention, the corresponding operation is executed according to the instruction sent by the cloud server, which avoids processing all voice information and the resulting increase in workload and degradation of user experience.

Example 8:

In order to ensure that the semantic parsing function is not executed on non-user speech, on the basis of the foregoing embodiments, in an embodiment of the present invention, the method further includes:

receiving a third control instruction sent by the cloud server, and not parsing the voice information, where the third control instruction is an instruction for prohibiting parsing of the voice information.

The third control instruction is sent by the cloud server when it judges that the voice information does not contain user voice information, or when it judges that no first control instruction matching the user voice information exists. The third control instruction carries the identification information of the voice information. Therefore, if the household appliance receives the third control instruction, it does not execute the semantic analysis operation on the voice information corresponding to that identification information.
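The appliance-side handling of the first, second and third control instructions can be summarized as one dispatch function. This is a sketch under stated assumptions: the dictionary layout of the instruction, the `type` values and the callback parameters are hypothetical and are not defined in the description.

```python
def handle_instruction(instruction, clips, parse_locally, execute):
    """Dispatch on the instruction type sent by the cloud server.

    instruction: dict with "type" in {"first", "second", "third"} (assumed layout)
    clips: dict mapping clip identification to buffered audio
    parse_locally: callable(audio) -> command or None
    execute: callable(command) -> None
    """
    kind = instruction["type"]
    if kind == "first":
        # The cloud server already parsed the speech; run the command directly.
        execute(instruction["command"])
        return "executed"
    clip = clips.pop(instruction.get("clip_id"), None)
    if kind == "second" and clip is not None:
        # The appliance is allowed to parse this clip itself.
        command = parse_locally(clip)
        if command is not None:
            execute(command)
            return "executed"
        return "ignored"
    # Third instruction (or unknown clip): parsing is prohibited, drop the clip.
    return "discarded"


done = []
clips = {"t1": b"a", "t2": b"b"}
r1 = handle_instruction({"type": "third", "clip_id": "t1"}, clips, lambda a: None, done.append)
r2 = handle_instruction({"type": "second", "clip_id": "t2"}, clips, lambda a: "POWER_ON", done.append)
print(r1, r2, done)
```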

The following describes the process of the speech recognition method in a specific embodiment, and the detailed process is shown in fig. 4 and includes the following steps:

S401: The household appliance is powered on and its voice recognition module is woken up.

S402: Ambient sound and user speech are collected.

S403: The household appliance uploads the collected voice information to the cloud server.

S404: The cloud server performs intelligent analysis according to the voice matching model.

The process of constructing the speech matching model is described below in conjunction with FIG. 5.

The household appliance collects a large amount of sample voice information through its voice recognition module. The sample voice information may contain user voice, electrical appliance sound, periodic noise and the like. The voice category of each piece of sample voice information is identified manually, and different identification information is labeled for each voice category.

Each piece of sample voice information is input into the voice matching model, which may be a convolutional neural network model, and the voice matching model outputs, through its own analysis and classification, the identification information of the voice category to which each piece of sample voice information belongs.

The optimal model parameters of the voice matching model are determined according to the output results of the voice matching model, which completes the training and establishes an optimal voice matching model.

S405: Judge whether the voice information is of a voice category containing user voice information.

If not, the cloud server determines that processing of the voice information is finished, or sends a third control instruction to the household appliance, where the third control instruction is an instruction for prohibiting the household appliance from parsing the voice information. Accordingly, the household appliance does not process the voice information when it receives no instruction or receives the third control instruction.

If so, an instruction for processing according to the voice information is sent to the household appliance, so that the household appliance performs the corresponding operation according to the voice information. If the cloud server parses the user voice information and the user voice information is a first control instruction for controlling the household appliance, the cloud server sends the first control instruction to the household appliance. The household appliance receives the first control instruction and executes the corresponding operation according to it.

Of course, in order to reduce the workload of the cloud server, the parsing of the user voice information may instead be completed by the household appliance. In this case, the cloud server sends a second control instruction to the household appliance, where the second control instruction is an instruction that allows the household appliance to perform the parsing operation and carries the identification information of the voice information. The household appliance receives the second control instruction, determines the corresponding voice information according to the identification information carried in it, and performs semantic analysis on that voice information. If the semantics of the voice information correspond to a target control instruction for controlling the household appliance, the household appliance executes the corresponding operation according to the target control instruction.
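The cloud-side branch at S405 can be sketched as a single decision function. The function name `decide`, the instruction dictionaries, the `parse` callback and the `cloud_parses` switch are illustrative assumptions. The sketch always sends a third control instruction in the negative case, although the description also allows the server simply to end processing without sending anything.

```python
USER_VOICE_ID = "01"  # code taken from the example in the description


def decide(clip_id, category_ids, parse=None, cloud_parses=True):
    """Return the instruction to send to the appliance for one clip.

    category_ids: codes output by the voice matching model for this clip
    parse: callable(clip_id) -> control command or None (hypothetical parser)
    cloud_parses: False delegates semantic parsing to the appliance
    """
    if USER_VOICE_ID not in category_ids:
        # No user speech: prohibit parsing on the appliance.
        return {"type": "third", "clip_id": clip_id}
    if not cloud_parses:
        # Let the appliance parse the clip identified by clip_id.
        return {"type": "second", "clip_id": clip_id}
    command = parse(clip_id) if parse else None
    if command is None:
        # User speech, but no matching control instruction.
        return {"type": "third", "clip_id": clip_id}
    return {"type": "first", "command": command}


print(decide("t1", ["02"]))
print(decide("t2", ["01", "02"], cloud_parses=False))
print(decide("t3", ["01"], parse=lambda cid: "POWER_ON"))
```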

The specific implementation process of each step is described in detail in the above embodiments, and is not described herein again.

Example 9:

based on the same technical concept, the embodiment of the invention provides a voice recognition device which is applied to a cloud server. As shown in fig. 6, the apparatus provided in the embodiment of the present invention includes:

the first receiving module 601 is configured to receive voice information, which is sent by a home appliance and acquired by the home appliance;

a first judging module 602, configured to judge, according to a pre-trained voice matching model, whether the voice information is a voice category that includes user voice information, and if so, trigger a first sending module;

a first sending module 603, configured to send, if yes, an instruction for processing according to the voice information to the home appliance device, so that the home appliance device performs a corresponding operation according to the voice information.

Further, the apparatus further comprises: an analyzing module 604, configured to analyze user voice information included in the voice information; judging whether the user voice information is a first control instruction for controlling the household appliance; the first sending module 603 is specifically configured to send the first control instruction to the home appliance if the user voice information is the first control instruction for controlling the home appliance.

Further, the apparatus further comprises: a determining module 605, configured to determine that the processing of the voice information is finished, or to send to the household appliance a third control instruction for prohibiting parsing of the voice information.

Further, the first sending module 603 is specifically configured to send a second control instruction for analyzing the voice information to the home appliance device, so that the home appliance device analyzes the voice information.

Further, the apparatus further comprises: a model training module 606, configured to obtain sample voice information, where the sample voice information carries labeled information of a voice category to which the sample voice information belongs, where labeled information corresponding to different voice categories is different, and the voice category to which the sample voice information belongs includes a voice category including user voice information; inputting each sample voice information into a voice matching model; and training the voice matching model according to the labeling information of the voice category to which the voice information of each sample belongs and the output of the voice matching model.

Example 10:

based on the same technical concept, the embodiment of the invention provides a voice recognition device which is applied to household appliances. As shown in fig. 7, the apparatus provided in the embodiment of the present invention includes:

a second sending module 701, configured to send the collected voice information to a cloud server;

a second receiving module 702, configured to receive an instruction, sent by the cloud server, for processing according to the voice information;

the executing module 703 is configured to execute a corresponding operation on the voice message according to the instruction.

Further, the executing module 703 is specifically configured to execute a corresponding function according to the first control instruction if the instruction is the first control instruction, where the first control instruction is sent by the cloud server when the user voice information included in the voice information is analyzed and it is determined that the user voice information is the first control instruction for controlling the home appliance device.

Further, the executing module 703 is specifically configured to: if the instruction is a second control instruction, parse the user voice information included in the voice information, where the second control instruction is an instruction for parsing the voice information; judge whether the user voice information is a target control instruction for controlling the household appliance; and if so, execute the corresponding function according to the target control instruction.

Further, the apparatus further comprises: a third receiving module 704, configured to receive a third control instruction sent by the cloud server, and not parse the voice information, where the third control instruction is to prohibit parsing of the voice information.

Example 11:

on the basis of the foregoing embodiments, an embodiment of the present invention further provides a cloud server 800, as shown in fig. 8, including: the system comprises a processor 801, a communication interface 802, a memory 803 and a communication bus 804, wherein the processor 801, the communication interface 802 and the memory 803 complete mutual communication through the communication bus 804;

the memory 803 has stored therein a computer program which, when executed by the processor 801, causes the processor 801 to perform the steps of:

analyzing the user voice information contained in the voice information;

determining that the processing of the voice information is finished; or

inputting each sample voice information into a voice matching model;

The communication bus mentioned in the above server may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface 802 is used for communication between the cloud server and other devices.

The Memory may include a Random Access Memory (RAM) or a Non-Volatile Memory (NVM), such as at least one disk Memory. Alternatively, the memory may be at least one memory device located remotely from the processor.

The processor may be a general-purpose processor, including a central processing unit, a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an application specific integrated circuit, a field programmable gate array or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or the like.

Example 12:

On the basis of the foregoing embodiments, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program executable by a cloud server is stored, and when the program runs on the cloud server, the cloud server is caused to execute the following steps:

analyzing the user voice information contained in the voice information;

determining that the processing of the voice information is finished; or

inputting each sample voice information into a voice matching model;

The computer readable storage medium may be any available medium or data storage device that can be accessed by a processor in a server, including but not limited to magnetic storage such as floppy disks, hard disks, magnetic tape, magneto-optical disks (MO), etc., optical storage such as CDs, DVDs, BDs, HVDs, etc., and semiconductor storage such as ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs), etc.

Example 13:

on the basis of the foregoing embodiments, an embodiment of the present invention further provides a home appliance 900, as shown in fig. 9, including: the system comprises a processor 901, a communication interface 902, a memory 903 and a communication bus 904, wherein the processor 901, the communication interface 902 and the memory 903 are communicated with each other through the communication bus 904;

the memory 903 has stored therein a computer program which, when executed by the processor 901, causes the processor 901 to perform the steps of:

Further, the steps also include:

receiving a third control instruction sent by the cloud server, and not parsing the voice information, where the third control instruction is an instruction for prohibiting parsing of the voice information.

The communication bus mentioned in the above terminal may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.

The communication interface 902 is used for communication between the above-described terminal and other devices.

Example 14:

On the basis of the foregoing embodiments, an embodiment of the present invention further provides a computer-readable storage medium, in which a computer program executable by a household appliance is stored, and when the program runs on the household appliance, the household appliance is caused to execute the following steps:

Further, characterized in that the method further comprises:

The computer readable storage medium may be any available medium or data storage device that can be accessed by a processor in an appliance, including but not limited to magnetic memory such as floppy disks, hard disks, magnetic tape, magneto-optical disks (MO), etc., optical memory such as CDs, DVDs, BDs, HVDs, etc., and semiconductor memory such as ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs), etc.

In summary, the present invention provides a voice recognition method and device, a household appliance, a cloud server, and a medium, so as to solve the problem in the prior art that the household appliance performs semantic parsing on all collected voice information. The method comprises the following steps: a cloud server receives voice information that was collected and sent by a household appliance; the cloud server judges, according to a pre-trained voice matching model, whether the voice information is of a voice category containing user voice information; and if so, the cloud server sends the household appliance an instruction for processing according to the voice information, so that the household appliance performs the corresponding operation according to the voice information.

For the system/apparatus embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.

It is to be noted that, in this document, relational terms such as first and second, and the like are used solely to distinguish one entity or operation from another entity or operation without necessarily requiring or implying any actual such relationship or order between such entities or operations.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A voice recognition method is applied to a cloud server, and comprises the following steps:

if yes, sending an instruction for processing according to the voice information to the household appliance, and enabling the household appliance to perform corresponding operation according to the voice information;

the training process of the voice matching model comprises the following steps:

inputting each sample voice information into a voice matching model;

training the voice matching model according to the labeling information of the voice category to which the voice information of each sample belongs and the output of the voice matching model;

if the voice information is not in a voice category containing the voice information of the user, the method further comprises:

and determining that the voice information processing is finished.

2. The method of claim 1, wherein prior to sending the instructions to the home device for processing in accordance with the voice information, the method further comprises:

analyzing the user voice information contained in the voice information;

3. The method of claim 2, wherein if there is no first control instruction matching the user voice information, the method further comprises:

determining that the processing of the voice information is finished; or

4. The method of claim 1, wherein sending the instructions to the home device for processing in accordance with the voice information comprises:

5. A speech recognition device applied to a cloud server, the device comprising:

the first sending module is used for sending an instruction for processing according to the voice information to the household appliance equipment so that the household appliance equipment performs corresponding operation according to the voice information;

the model training module is used for acquiring sample voice information, wherein the sample voice information carries labeling information of voice categories to which the sample voice information belongs, the labeling information corresponding to different voice categories is different, and the voice categories to which the sample voice information belongs comprise voice categories containing user voice information; inputting each sample voice information into a voice matching model; training the voice matching model according to the labeling information of the voice category to which the voice information of each sample belongs and the output of the voice matching model;

and the determining module is used for determining that the voice information processing is finished.

6. The apparatus of claim 5, wherein the apparatus further comprises:

7. The apparatus of claim 5 or 6, wherein the determining module is further configured to send a third control instruction to the home device to prohibit parsing the voice message.

8. The apparatus of claim 6, wherein the first sending module is further configured to send a second control instruction to the home device to parse the voice message, so that the home device parses the voice message.

9. A cloud server, characterized by comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus;

the memory has stored therein a computer program which, when executed by the processor, causes the processor to carry out the steps of the method of any one of claims 1-4.

10. A computer-readable storage medium storing a computer program executable by a cloud server, the program, when executed on the cloud server, causing the cloud server to perform the steps of the method of any one of claims 1 to 4.