CN114172757A - Server, intelligent home system and multi-device voice awakening method - Google Patents


Info

Publication number: CN114172757A
Authority: CN (China)
Prior art keywords: equipment, voice, intelligent, server, instruction
Legal status: Pending
Application number: CN202111521226.2A
Other languages: Chinese (zh)
Inventor: 张路伟
Current Assignee: Hisense Visual Technology Co Ltd
Original Assignee: Hisense Visual Technology Co Ltd
Application filed by Hisense Visual Technology Co Ltd
Priority to CN202111521226.2A
Publication of CN114172757A
Priority to PCT/CN2022/100547 (WO2022268136A1)
Priority to CN202280038248.XA (CN117882130A)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]
    • H04L12/2803 Home automation networks
    • H04L12/2816 Controlling appliance services of a home automation network by calling their functionalities
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Automation & Control Theory (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Selective Calling Equipment (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application provides a server, a smart device, and a multi-device voice wake-up method. After a user inputs a voice control instruction, the server parses service requirement information from the instruction, screens out the target device whose current device state can fulfill that service requirement, and sends a response instruction to the target device so that the smart device acting as the target device gives a voice response. Based on the screening result, the server also sends a mute instruction to every other device in the current smart home system, so that the smart devices that are not the target device do not respond to the voice control function. Because the voice control instruction is preprocessed at the server, smart devices of all types can give the correct wake-up response quickly and efficiently within the set time, which alleviates the abnormal-response problem of conventional voice wake-up methods.

Description

Server, intelligent home system and multi-device voice awakening method
Technical Field
The present application relates to the technical field of smart homes, and in particular to a server, a smart home system, and a multi-device voice wake-up method.
Background
Intelligent voice control is a novel interaction mode: semantic recognition is performed on the voice information input by a user, and a device is then controlled according to the recognition result. To realize this interaction, an intelligent voice system can be built into a smart device. The intelligent voice system consists of a hardware part and a software part. The hardware part mainly comprises a microphone, a loudspeaker, and a controller, and is responsible for receiving, feeding back, and processing voice information; the software part mainly comprises a voice conversion module, a natural language processing module, and a control module, and converts the input sound signal into text and forms specific control instructions.
When a user uses an intelligent voice system in a smart home system that contains many devices, several devices may be woken up simultaneously or by mistake, so that voice playback and the interaction flow in that scene become chaotic and the user experience suffers. To mitigate this multi-device wake-up problem, a user can define and switch between different wake-up policies through an application on a terminal device according to personal usage habits.
However, this wake-up approach not only requires the user to switch policies manually on the terminal device; whichever policy is selected, the device to be woken up is determined through mutual communication among the candidate devices. When the number of candidate devices is large, the information exchange among all devices cannot be completed in a short time, which lowers the execution rate of voice interaction instructions and easily causes abnormal device responses.
Disclosure of Invention
The present application provides a server, a smart home system, and a multi-device voice wake-up method, aiming to solve the abnormal-response problem of conventional voice wake-up methods.
In a first aspect, the present application provides a server comprising a storage module, a communication module, and a control module. The storage module is configured to store the device states reported by smart devices; the communication module is configured to establish communication connections with the smart devices to obtain their device states; the control module is configured to perform the following steps:
acquiring a voice control instruction input by a user through a smart device;
parsing, in response to the voice control instruction, the service requirement information in the instruction;
screening a target device according to the service requirement information, the target device being a smart device whose device state can fulfill the service requirement;
sending a response instruction to the target device, and sending a mute instruction to the other smart devices in the current smart home system.
In a second aspect, the present application further provides a smart device, including an audio input device, an audio output device, a communicator, and a controller. The audio input device is configured to detect voice audio data input by a user; the audio output device is configured to play voice responses; the communicator is configured to establish a communication connection with a server and transmit the device state to it; the controller is configured to perform the following steps:
acquiring voice audio data input by the user for voice control;
generating a voice control instruction from the voice audio data;
sending the voice control instruction to the server, so that the server parses the service requirement information in the instruction and screens a target device accordingly, the target device being a smart device whose device state can fulfill the service requirement;
receiving a response instruction or a mute instruction from the server;
executing the response instruction or the mute instruction.
In a third aspect, the present application further provides a multi-device voice wake-up method applied to a smart home system, where the smart home system includes a server and a plurality of smart devices that establish communication connections with the server. The method comprises the following steps:
a smart device acquires voice audio data input by a user, generates a voice control instruction from it, and sends the instruction together with its device state to the server;
the server parses the service requirement information in the voice control instruction and screens a target device accordingly, the target device being a smart device whose device state can fulfill the service requirement;
the server sends a response instruction to the smart device acting as the target device and a mute instruction to the other smart devices in the current smart home system;
the smart device acting as the target device runs the response instruction and responds to the voice control function;
the other smart devices in the current smart home system run the mute instruction and do not respond to the voice control function.
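The overall flow summarized above can be illustrated with a minimal sketch. The names (SmartDevice, parse_requirement, wake_up) and capability labels are illustrative assumptions, not part of the disclosure; the point is only the division of labor: the server parses the requirement, screens devices by their reported state, and answers the target with "respond" and every other device with "mute".

```python
from dataclasses import dataclass


@dataclass
class SmartDevice:
    name: str
    capabilities: set   # e.g. {"music_play", "light_poweron"} -- illustrative labels
    state: str          # e.g. "standby", "playing", "offline"


def parse_requirement(voice_command: str) -> str:
    """Toy stand-in for the server-side semantic parsing step."""
    text = voice_command.lower()
    if "music" in text:
        return "music_play"
    if "light" in text or "lamp" in text:
        return "light_poweron"
    if "movie" in text:
        return "movie_play"
    return "unknown"


def wake_up(devices: list, voice_command: str) -> dict:
    """Return per-device instructions: 'respond' for the target(s), 'mute' for the rest."""
    requirement = parse_requirement(voice_command)
    targets = [d for d in devices
               if requirement in d.capabilities and d.state != "offline"]
    return {d.name: ("respond" if d in targets else "mute") for d in devices}


if __name__ == "__main__":
    home = [
        SmartDevice("tv", {"movie_play"}, "standby"),
        SmartDevice("speaker", {"music_play"}, "standby"),
        SmartDevice("lamp", {"light_poweron"}, "standby"),
    ]
    print(wake_up(home, "Hi, small x, turn on the bedroom lamp"))
    # -> {'tv': 'mute', 'speaker': 'mute', 'lamp': 'respond'}
```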
According to the technical solution above, after a user inputs a voice control instruction, the server, the smart device, and the multi-device voice wake-up method parse the service requirement information from the instruction, screen out the target device whose current device state can fulfill the requirement, and send a response instruction to it so that the smart device acting as the target device gives a voice response. Based on the screening result, the server also sends a mute instruction to every other device in the current smart home system, so that the smart devices that are not the target device do not respond to the voice control function. Because the voice control instruction is preprocessed at the server, smart devices of all types can give the correct wake-up response quickly and efficiently within the set time, which alleviates the abnormal-response problem of conventional voice wake-up methods.
Drawings
To explain the technical solution of the present application more clearly, the drawings used in the embodiments are briefly described below; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a usage scenario of an intelligent home system in an embodiment of the present application;
fig. 2 is a hardware configuration diagram of an intelligent device in an embodiment of the present application;
FIG. 3 is a schematic diagram illustrating a voice interaction process in an embodiment of the present application;
FIG. 4 is a schematic diagram illustrating response voice interaction effects of a plurality of smart devices according to an embodiment of the present application;
fig. 5 is a schematic flowchart of a multi-device voice wake-up method in an embodiment of the present application;
FIG. 6 is a schematic flow chart of screening a target device in an embodiment of the present application;
fig. 7 is a schematic flowchart of determining a target device according to the number of devices in the embodiment of the present application;
FIG. 8 is a schematic flow chart of marking a master device in an embodiment of the present application;
FIG. 9 is a flow chart illustrating updating the device status in an embodiment of the present application;
FIG. 10 is a flow chart of a server-side timing sequence of a multi-device voice wake-up method in an embodiment of the present application;
fig. 11 is a flowchart of a timing procedure of an intelligent device side of a multi-device voice wake-up method in an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described below do not represent all embodiments consistent with the present application; they are merely examples of systems and methods consistent with certain aspects of the application, as recited in the claims.
In the embodiments of the present application, the smart home system is a network system that provides a unified control service over a specific local network, and it may include a plurality of smart devices 200 that establish communication connections with one another. The smart devices 200 may access the same local area network, or may directly form a peer-to-peer network through a unified communication protocol. For example, several smart devices 200 may communicate by connecting to the same wireless LAN, or one smart device 200 may establish connections with several others through Bluetooth, infrared, a cellular network, power-line carrier communication, and the like.
The smart device 200 is a device with a communication function that can receive, send, and execute control commands and implement specific functions. It includes, but is not limited to, smart display devices, smart terminals, smart home appliances, smart gateways, smart lighting devices, smart audio devices, game devices, and the like. The smart devices 200 that make up a smart home system may be of the same type or of different types. For example, as shown in fig. 1, the same smart home system may include a smart television, a smart speaker, a smart refrigerator, several smart light fixtures, and so on, distributed in different locations to meet the usage requirements at those locations.
It should be noted that the smart home system described in the present application does not limit the scope of the solution to be protected. In practice, the server, smart device, and multi-device voice wake-up method provided by the present application are not limited to the smart home field and are also applicable to other systems that support intelligent voice control, such as smart office systems, smart service systems, smart management systems, and industrial production systems.
The smart device 200 has a hardware configuration that matches its actual function. As shown in fig. 2, taking a display apparatus as an example, a smart device 200 with a display function may include at least one of a tuner-demodulator 210, a communicator 220, a detector 230, an external device interface 240, a controller 250, a display 260, an audio output interface 270, a memory, a power supply, and a user interface.
In some embodiments, the controller 250 includes a central processor, a video processor, an audio processor, a graphic processor, a RAM, a ROM, first to nth interfaces for input/output.
In some embodiments, the display 260 includes a display screen component for presenting pictures and a driving component for driving image display; it receives image signals output by the controller and displays video content, image content, menu manipulation interfaces, and the user manipulation UI.
In some embodiments, the display 260 may be at least one of a liquid crystal display, an OLED display, and a projection display, and may also be a projection device and a projection screen.
In some embodiments, the tuner demodulator 210 receives broadcast television signals via wired or wireless reception and demodulates audio/video signals, as well as EPG data signals, from a plurality of wireless or wired broadcast television signals.
In some embodiments, the external device interface 240 may include, but is not limited to, the following: high Definition Multimedia Interface (HDMI), analog or data high definition component input interface (component), composite video input interface (CVBS), USB input interface (USB), RGB port, and the like. The interface may be a composite input/output interface formed by the plurality of interfaces.
In some embodiments, the controller 250 controls the operation of the smart device and responds to user actions through various software control programs stored in memory. The controller 250 controls the overall operation of the smart device 200. For example: in response to receiving a user command for selecting a UI object to be displayed on the display 260, the controller 250 may perform an operation related to the object selected by the user command.
In some embodiments, a user may enter user commands on a Graphical User Interface (GUI) displayed on display 260, and the user input interface receives the user input commands through the Graphical User Interface (GUI). Alternatively, the user may input the user command by inputting a specific sound or gesture, and the user input interface receives the user input command by recognizing the sound or gesture through the sensor.
In some embodiments, the smart device 200 is also in data communication with the server 400. The smart device 200 may be communicatively connected through a local area network (LAN), a wireless local area network (WLAN), or other networks. The server 400 may provide various content and interactions to the smart device 200. The server 400 may be one cluster or multiple clusters, and may include one or more types of server groups.
In some embodiments, the smart device 200 may have an intelligent voice system built in to support intelligent voice control by the user. Intelligent voice control refers to an interactive process in which the user operates the smart device 200 by inputting voice audio data. To implement it, the smart device 200 may include an audio input device and an audio output device. The audio input device collects the voice audio data input by the user and may be a microphone built into or externally connected to the smart device 200. The audio output device produces sound to play the voice response. For example, as shown in FIG. 3, when the user speaks a wake-up word such as "Hi, small x" into the audio input device, the smart device 200 may play an "I am here" voice response through the audio output device to guide the user through the subsequent voice input.
In some embodiments, the intelligent voice system built into the smart device 200 also supports a one-sentence mode, i.e. a "one-shot" mode, in which the user can complete a control function with fewer voice inputs. In the conventional mode, if the user wants the smart device 200 to play a movie, the user first says "Hi, small x"; after the smart device 200 replies "I am here", the user says "I want to watch a movie", and the smart device 200 replies "I have found the following movies for you". In the "one-shot" mode, the user can directly say "Hi, small x, I want to watch a movie", and the smart device 200 replies "I have found the following movies for you" immediately after receiving the instruction, which reduces the number of voice interactions and improves the interaction efficiency.
For a plurality of smart devices 200 in the same smart home system, the user can control device linkage through intelligent voice. For example, the user may input the voice command "turn on the bedroom lamp" through the smart speaker; in response, the smart speaker generates a light-on control command and sends it to the lamp named "bedroom" in the smart home system so that the bedroom lamp is turned on. At the same time, the smart speaker also responds to the user's voice input by playing feedback such as "The bedroom lamp has been turned on for you".
When linkage control is performed among several smart devices 200, the control instruction can be transmitted to the controlled device directly by the smart device 200 that received the user's voice audio data, or forwarded through a relay device such as a router. In some embodiments, the control instruction may also be transmitted to the controlled device through the server 400. For example, when a user controls a smart device 200 in the smart home system through a smart terminal 300 outside the local area network of the home, the smart terminal 300 may first send the control instruction to the server 400, which then forwards it to the smart device 200.
To control the smart devices 200 in the smart home system, the server 400 may issue a control instruction and related data to any single smart device 200. For example, for a display device, the user may request online playback of a media asset through an interactive operation, and the server 400 feeds the media asset data back to the display device according to the playback request. For linkage control of several smart devices 200, the server 400 may issue control commands and related data to the smart home system in a unified manner. For example, when the user turns on the bedroom lamp through the smart speaker, the speaker sends the control instruction input by the user to the server 400, and the server 400 sends feedback data to the smart home system, so that the system sends the turn-on instruction to the bedroom lamp and feeds back a control response to the smart speaker.
Some smart devices 200 in the smart home system may have a complete intelligent voice system built in. Such devices can act as main control devices that independently receive, process, and respond to voice, and can send the control instruction corresponding to the voice audio to other smart devices 200. For example, display devices, smart speakers, and smart refrigerators may carry a complete intelligent voice system to receive the voice audio input by the user. Other smart devices 200 in the system, such as lamps and small household appliances, may not have a complete intelligent voice system and act only as controlled devices that receive control instructions from a main control device to start, stop, or change operating parameters.
As more smart devices support a complete intelligent voice system, the same smart home system may contain several of them. For example, a smart television, a smart speaker, and a smart refrigerator installed in the same room may all carry a complete intelligent voice system and can each respond to a voice command input by the user. However, different smart devices 200 respond to voice commands in different ways and support different types of commands. For example, as shown in fig. 4, for the voice instruction "I want to watch a movie", the smart television can respond by displaying a list of movies and playing the feedback "I have found the following movies for you", whereas the smart speaker and the smart refrigerator cannot fulfill the request and can only feed back something like "I did not understand what you said".
It can be seen that, because the current smart home system contains multiple smart devices 200 that support voice control, the same voice instruction may wake several of them simultaneously or by mistake, causing scene confusion and seriously affecting the user experience.
To alleviate the scene confusion, in some implementations the user can define the responding device through the application in the smart terminal 300 according to usage habits and freely switch between different wake-up policies. For example, the user may manually set the smart speaker as the main responding device; voice instructions are then answered by the smart speaker, which sends control commands to the other smart devices 200 to realize intelligent voice control over the whole smart home system.
However, controlling the wake-up policy through user definition requires repeated manual switching and is not intelligent enough. Moreover, whichever policy is selected, the current multi-device wake-up procedure decides which device is woken up through communication among the devices to be woken. On the one hand, when there are many such devices, each wake-up requires pairwise information exchange among the smart devices 200, so it cannot be guaranteed that the exchange completes within the specified time, and the smart devices 200 respond abnormally. On the other hand, the wake-up delays of different types of smart devices 200 differ, i.e. the time from wake-up to response varies, so it cannot be guaranteed that all device types fall within the information-exchange window when the wake-up decision is made; a smart device 200 that wakes up late may not receive the wake-up word during the exchange, miss the exchange window, and fail to respond to the voice, causing abnormal voice control.
To alleviate the problem of abnormal voice control, some embodiments of the present application provide a multi-device voice wake-up method that may be applied to a smart home system. The smart home system includes a server 400 and a plurality of smart devices 200. The server 400 comprises at least a storage module 410, a communication module 420, and a control module 430. The storage module 410 is configured to store the device states reported by the smart devices. The communication module 420 is configured to establish communication connections with the smart devices 200 to obtain the reported device states and to issue control commands and related data to them. The control module 430 is configured to execute the server-side steps of the multi-device voice wake-up method and to issue response or mute instructions to different smart devices 200.
Similarly, to implement the multi-device voice wake-up method, each smart device 200 in the smart home system comprises at least an audio input device, an audio output device, a communicator 220, and a controller 250. The audio input device is configured to detect the voice audio data input by the user. The audio output device is configured to play the voice response. The communicator 220 is configured to establish a communication connection with the server 400 so as to report the device state and receive the response or mute instruction issued by the server 400. The controller 250 is configured to execute the device-side steps of the multi-device voice wake-up method and to complete the response of the intelligent voice control process.
As shown in fig. 5 and fig. 6, the multi-device voice wake-up method includes the following steps:
the smart device 200 acquires voice audio data input by a user. When a user is in an intelligent home system environment, voice input can be performed in real time, and a voice sound signal input by the user can be converted into an electric signal by an audio input device built in the intelligent device 200, and voice audio data can be obtained through a series of signal processing methods such as noise reduction, amplification, coding and conversion. When performing voice interaction, a user may input voice audio data in a variety of ways. That is, in some embodiments, the user may input voice audio data through an audio input device built into the smart device 200. For example, a user may input speech "hi!through a microphone device built into smart device 200! Small x, i want to watch a movie ", the microphone may convert the voice sound signal into an electrical signal and pass it to the controller 250 for subsequent processing.
To trigger the smart device 200 for smart voice control, in some embodiments, the user may also carry a specific wake-up word in the input voice audio data. The wake-up word is a piece of speech containing specific content, such as "hi! Small x, small x, black ground, black, and the like! Xxx ", and the like. For the process of inputting voice and audio data by the user, especially for the process of inputting voice and audio data by the far-field microphone built in the smart device 200, the smart device 200 may determine whether the voice input by the user contains a wake-up word, and perform subsequent processing after detecting the wake-up word, so as to alleviate the false triggering of the smart voice control process.
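A minimal wake-word gate can be sketched as follows. The wake words and function name are assumptions for illustration (the translated example wake word "small x" is used as a placeholder); a real far-field implementation would typically detect the wake word on the audio signal itself rather than on a transcript.

```python
WAKE_WORDS = ("hi, small x", "hey, small x")   # placeholder wake words, assumed for illustration


def contains_wake_word(transcript: str) -> bool:
    """Return True if the (already transcribed) input starts with a known wake word."""
    text = transcript.lower().strip()
    return any(text.startswith(w) for w in WAKE_WORDS)


# Only input that carries a wake word is processed further, which reduces
# false triggering of the voice-control pipeline.
assert contains_wake_word("Hi, small x, I want to watch a movie")
assert not contains_wake_word("I want to watch a movie")
```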
According to the propagation characteristics of sound, the devices of the smart home system that are close to the user receive the user's voice with little attenuation over a short distance, so after the user speaks, the smart devices 200 near the user detect the voice audio data first. However, because the content of the voice differs from case to case, the smart device 200 that should respond is not fixed: it may be a device close to the user or one farther away. For example, when the user says "Hi, small x, I want to watch a movie" in the bedroom, the smart speaker in the bedroom detects the voice audio data first, but it has no video playback function, whereas the smart television in the living room does.
Therefore, in order to respond to the current user voice, after acquiring the voice audio data the smart device 200 may generate a voice control instruction from it. The voice control instruction is a control command with a specific format, containing contents such as the control action and the code of the controlled object. After receiving the voice audio data, the smart device 200 may convert it to text through the speech processing module of the intelligent voice system, i.e. convert the waveform data into text data through acoustic feature extraction.
After the conversion to text, the smart device 200 may use a word segmentation tool to turn the unstructured text into structured text. That is, the smart device 200 removes text without practical meaning, such as modal particles and auxiliary words, through lexicon matching, keeps the keywords, and separates them by word sense to obtain the structured text.
After obtaining the structured text, the smart device 200 may feed it into a word processing model, an artificial intelligence model based on machine learning that computes, for the input text, the classification probability of each candidate semantic. By using the standard control commands as classification labels, the model outputs the classification probability of the text for each standard control command; the command with the highest probability is the control command corresponding to the voice audio data.
The word processing model can be obtained by repeatedly training an initial model with sample data under set input and output rules. The sample data is labeled text. During training, the sample data is used as input and the classification probability as output; the output is compared with the label to obtain the training error, which is then back-propagated, i.e. the model parameters are adjusted according to the error. By repeatedly feeding in a large amount of sample data, a word processing model that outputs accurate recognition results is obtained.
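As a rough illustration of the classification step only, the toy scorer below ranks a few assumed standard control commands by keyword overlap with the structured text; it is a stand-in for the trained word processing model described above, not the patent's actual model.

```python
# Toy stand-in for the trained word processing model: score each assumed
# "standard control command" by keyword overlap with the structured text and
# pick the command with the highest score. A real system would use a learned
# classifier rather than this heuristic.
STANDARD_COMMANDS = {
    "movie_play":    {"watch", "movie", "film"},
    "music_play":    {"play", "music", "song"},
    "light_poweron": {"turn", "on", "lamp", "light"},
}


def classify(keywords: set) -> tuple:
    """Return (best_command, score) for a set of keywords from the structured text."""
    scores = {cmd: len(keywords & words) / len(words)
              for cmd, words in STANDARD_COMMANDS.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


print(classify({"i", "want", "watch", "movie"}))   # -> ('movie_play', 0.666...)
```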
After the model computation, the smart device 200 has converted the voice audio data input by the user into a voice control instruction. The controlled device or the server 400 can then process the instruction directly upon receipt, for example by executing a control action according to it or extracting service requirement information from it.
Of course, in some embodiments the smart device 200 may send the voice audio data itself as the voice control instruction. A smart device 200 with limited processing capability, or without a complete intelligent voice system, may simply forward the audio data and let the server 400 or another smart device 200 perform the language processing, which relieves the computational load on the current smart device 200.
After generating the voice control instruction, the smart device 200 sends it to the server 400 to trigger the server-side control of the wake-up process for the smart devices 200. It should be noted that, because the smart home system may contain several smart devices 200 with built-in intelligent voice systems, all of them may detect the voice audio data when the user speaks. To avoid repeated data transmission, after receiving one voice control instruction the server 400 may suspend the generation and reporting of the voice control instruction on the other smart devices 200.
For example, after the smart television sends the voice control instruction to the server 400, the server 400 may send a pause instruction to the smart speaker and the smart refrigerator in the same smart home system; upon receiving it, both stop generating and sending their own voice control instructions. Because a smart device 200 with higher data processing capability usually finishes the audio computation sooner, it completes the voice control instruction before the other devices. Stopping the generation and reporting on the other smart devices 200 after receiving the first instruction therefore shortens the overall instruction generation time and improves the voice response speed.
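The deduplication idea can be sketched as below. The class name, the home/device identifiers, and the time-window heuristic are assumptions added for illustration; the patent only states that the server suspends instruction generation and reporting on the other devices after receiving the first instruction.

```python
import time


class WakeCoordinator:
    """Suppress duplicate reports of the same utterance within a short window (assumed heuristic)."""

    def __init__(self, dedup_window_s: float = 2.0):
        self.dedup_window_s = dedup_window_s
        self._last_seen = {}   # home_id -> timestamp of the last accepted instruction

    def on_voice_instruction(self, home_id: str, device_id: str, devices: list) -> list:
        """Return the devices that should be told to pause generation/reporting."""
        now = time.monotonic()
        is_first = now - self._last_seen.get(home_id, float("-inf")) > self.dedup_window_s
        self._last_seen[home_id] = now
        if is_first:
            return [d for d in devices if d != device_id]   # pause everyone else
        return []                                           # duplicate report, ignore


coord = WakeCoordinator()
print(coord.on_voice_instruction("home-1", "tv", ["tv", "speaker", "fridge"]))
# -> ['speaker', 'fridge']
```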
After receiving the voice control instruction, the server 400 may parse the service requirement information in it. Different voice control instructions input by the user carry different control content and therefore different service requirements. For example, when the user says "Hi, small x, I want to listen to music", the voice control instruction generated after processing by the smart device 200 contains the service requirement "play music" (music_play); when the user says "Hi, small x, turn on the bedroom lamp", the generated instruction contains the service requirement "turn on the light" (light_poweron).
Obviously, when the voice control instruction already contains the service requirement information, the server 400 can extract it directly from the instruction. When the voice control instruction is the raw voice audio data uploaded by the smart device 200, the server 400 performs the same recognition processing on it as the smart device 200 does in the embodiments above, using its own speech-to-text tool, text structuring tool, word processing model, and so on to recognize the service requirement information.
To make it easier for the server 400 to parse the service requirement information from the voice control instruction, in some embodiments a dedicated service requirement recognition model may be provided, or the output classes of the word processing model may be defined as service requirements, so that the model computes the classification probability of the user's voice audio data for each service requirement.
It should be noted that the voice content input by the user may contain several user intentions, so several service requirements may be parsed from the corresponding voice control instruction. For example, when the user says "Hi, small x, turn on the hall lamp and play a movie", the two service requirements of turning on the lamp and playing a movie are parsed from the instruction. In addition, the smart home system can offer richer voice interaction functions by presetting richer instruction sets, and the service requirements contained in a command can be determined from the configured instruction set. For example, when the user says "Hi, small x, start cinema mode", the smart home system may determine from the cinema-mode instruction set that the control content includes playing a movie and turning off the lights at the same time, so as to simulate the atmosphere of a cinema; the server 400 therefore parses the two service requirements "turn off the lights" and "play a movie" from the voice control instruction.
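A minimal sketch of this parsing step is shown below, assuming a keyword map and a preset scene instruction set; the mappings and requirement labels are illustrative stand-ins for the service requirement recognition described above.

```python
# Assumed keyword map and scene instruction set; the labels are illustrative.
SCENE_SETS = {
    "cinema mode": ["movie_play", "light_poweroff"],
}
KEYWORD_MAP = {
    "movie": "movie_play",
    "music": "music_play",
    "lamp":  "light_poweron",
}


def parse_service_requirements(text: str) -> list:
    """Return the list of service requirements parsed from one voice control instruction."""
    text = text.lower()
    for scene, requirements in SCENE_SETS.items():
        if scene in text:               # a preset instruction set expands a scene command
            return list(requirements)
    return [req for word, req in KEYWORD_MAP.items() if word in text]


print(parse_service_requirements("Hi, small x, start cinema mode"))
# -> ['movie_play', 'light_poweroff']
print(parse_service_requirements("Hi, small x, turn on the hall lamp and play a movie"))
# -> ['movie_play', 'light_poweron']
```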
Different service requirements correspond to different control operations of the smart device 200, and a smart device 200 that is to respond to the voice control instruction must be in a suitable device state. For example, a lamp supports on/off, brightness adjustment, and similar control only while it is in the standby state; when the user cuts its power through the wall switch so that it is offline, such control is no longer possible.
Therefore, the smart device 200 may report its device state to the server 400 through a predetermined reporting policy. In some embodiments, the smart device 200 reports its current device state to the server 400 at a fixed interval according to a data update frequency, and the server 400 updates the stored device state according to the reports.
For example, the server 400 may send a heartbeat instruction to the smart device 200, and the smart device 200 feeds back its current device state after receiving it, so that the server 400 can update the stored state. If the smart device 200 does not reply to a heartbeat instruction within the preset period, the server 400 may update the corresponding device state to offline.
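The heartbeat bookkeeping can be sketched as follows; the class, the 30-second period, and the state strings are assumptions for illustration, not values given in the patent.

```python
import time


class DeviceStateStore:
    """Server-side bookkeeping of reported device states (assumed names and period)."""

    def __init__(self, heartbeat_period_s: float = 30.0):
        self.heartbeat_period_s = heartbeat_period_s
        self._states = {}   # device_id -> (state, time of last heartbeat reply)

    def on_heartbeat_reply(self, device_id: str, state: str) -> None:
        self._states[device_id] = (state, time.monotonic())

    def get_state(self, device_id: str) -> str:
        state, last = self._states.get(device_id, ("offline", float("-inf")))
        if time.monotonic() - last > self.heartbeat_period_s:
            return "offline"    # no reply within the period -> treat as offline
        return state


store = DeviceStateStore()
store.on_heartbeat_reply("bedroom-lamp", "standby")
print(store.get_state("bedroom-lamp"))    # -> 'standby'
print(store.get_state("garage-light"))    # never replied -> 'offline'
```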
To make the device states used in the voice interaction more up to date, in some embodiments the reporting of device states may also be triggered by a voice control instruction. That is, the server 400 obtains the voice audio data corresponding to the voice control instruction and recognizes the wake-up word in it. If the voice audio data contains the wake-up word, the server 400 determines the smart home system in which the smart device 200 is located and sends a state acquisition request to that system; all smart devices 200 in the smart home system report their device states after receiving the state acquisition instruction.
For example, when the smart device 200 reports voice audio data, the server 400 may recognize the wake-up word "Hi, small x" in it. Having recognized the wake-up word, the server 400 may determine, from the identification information of the smart device 200, the smart home system currently used by the user, say the "xx home system", whose living room contains a smart television, speaker A, and speaker B, whose bedroom contains a lamp and speaker C, and whose kitchen contains a smart refrigerator. The server then sends a state acquisition request to that system so that the television, speaker A, speaker B, the lamp, speaker C, and the smart refrigerator each report their current device state.
After obtaining the service requirement information and the device states reported by the smart devices 200, the server 400 may screen the target device according to both. The target device is the smart device whose device state can fulfill the service requirement information.
Whether a smart device 200 can meet a service requirement depends on specific preconditions such as the device type and the device state, so the server 400 may screen the smart devices 200 in the current smart home system in several stages according to these preconditions. For example, when the user says "Hi, small x, turn on the lamp", the corresponding service requirement is "turn on the lamp", and the preconditions for fulfilling it are that the device type is a lamp and the device state is standby. The server 400 may therefore first select all smart devices 200 in the current smart home system whose device type is lamp, then select among them the lamps whose device state is standby, and use them as the target devices.
After the target device is screened out, the server 400 sends a response instruction to the smart device 200 acting as the target device, which responds to the voice control function by running that instruction. At the same time, the server 400 sends a mute instruction to the other smart devices 200 in the current smart home system, so that they do not respond to the voice control function.
For example, when the user says "Hi, small x, turn on the light", the smart devices 200 in the home that support voice interaction report the received voice instruction and their device states to the server 400, i.e. the voice control instruction "turn on the lamp" and the device state (standby). After receiving the voice control instruction, the server 400 may determine that the current device state of one desk lamp matches the object category corresponding to the service requirement in the user's instruction. The server 400 therefore issues a response instruction to wake that lamp and, at the same time, issues mute instructions to the other devices, so that the lamp that meets the service requirement and device state is turned on while the smart devices 200 that do not meet them remain silent.
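Only the dispatch step is sketched below; the payload fields and identifiers are assumptions, not the patent's actual protocol. The screened target receives a "respond" instruction and every other device in the same home system receives a "mute" instruction.

```python
import json


def build_instructions(all_devices: list, targets: set, request_id: str) -> dict:
    """Build a 'respond' payload for each target and a 'mute' payload for every other device."""
    payloads = {}
    for device_id in all_devices:
        action = "respond" if device_id in targets else "mute"
        payloads[device_id] = json.dumps({"request_id": request_id, "action": action})
    return payloads


for device, payload in build_instructions(["desk-lamp", "tv", "speaker"],
                                          {"desk-lamp"}, "req-42").items():
    print(device, payload)
# desk-lamp {"request_id": "req-42", "action": "respond"}
# tv {"request_id": "req-42", "action": "mute"}
# speaker {"request_id": "req-42", "action": "mute"}
```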
According to the technical solution above, the multi-device voice wake-up method provided in this embodiment uses the service requirement information contained in the voice control instruction together with the device states reported by the smart devices 200 to screen out, from the current smart home system, the target device that can respond to the instruction. It then sends a response instruction to the target device and mute instructions to the other devices, so that after the intelligent voice system receives a voice instruction, each smart device 200 exchanges information only with the server 400, which automatically determines the target device. This reduces the data exchange among the smart devices 200 and alleviates the low execution rate caused by frequent communication among multiple devices.
When inputting voice, the user may explicitly name the execution device in the voice control instruction. For example, if the voice content is "turn on the television", the execution device is clearly the television; because the execution device is explicit, the server 400 can deliver the voice control instruction to the television directly, without parsing the service requirement information to determine it. Therefore, in some embodiments, after receiving the voice control instruction reported by the smart device 200, the server 400 may also detect the execution device in the instruction. If no execution device is named, the target device is screened by parsing the service requirement information and matching it against the device states, as described in the embodiments above.
If the voice control instruction does specify the execution device, i.e. it contains the identification information of the execution device, a control command and feedback voice information may be generated from the instruction. The control command is the device-facing command corresponding to the voice control instruction; for example, for the voice input "turn on the TV", the generated control command is "TV_poweron". The feedback voice information is the audio played to inform the user of the execution result; for example, after turning on the television, the intelligent voice system plays the feedback "The television has been turned on for you".
The control command and the feedback voice information may each be sent to a specific smart device 200: the control command is executed to provide the corresponding service, and the feedback voice is played to inform the user of the execution result. Both may act on the execution device; for example, when the user says "turn on the television", the television powers on in response and plays the feedback "The television has been turned on for you" through its own intelligent voice system and loudspeaker.
However, the execution device may be far from the user; if the feedback voice is played through the execution device, the user may not hear it because of the distance and thus may not learn the control result of the voice interaction. Moreover, when several voice-controlled devices exist in the home, the user often does not care which device is woken up to give the feedback. Therefore, in some embodiments, the server 400 may send the control command and the feedback voice information to different smart devices 200: the control command is sent to the execution device according to its identification information, and the feedback voice information is sent to the smart device that captured the voice control instruction.
For example, when the user says "turn on the television" in the bedroom, the smart air conditioner with an intelligent voice system in the bedroom detects the voice audio data first and sends the generated voice control instruction to the server 400. The server 400 determines from the instruction that the television is the execution device, generates the control command "TV_poweron" and the feedback voice "The television has been turned on for you", sends the control command to the television in the living room to turn it on, and sends the feedback voice to the smart air conditioner so that it is played in the bedroom.
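The split routing can be sketched in a few lines; all field names and device identifiers here are assumptions used only to show that the control command and the feedback voice go to different devices.

```python
def route(instruction: dict) -> list:
    """Return (device_id, message) pairs: the control command goes to the execution
    device, the feedback voice goes to the device that captured the user's speech."""
    return [
        (instruction["execution_device"], instruction["control_command"]),
        (instruction["source_device"], instruction["feedback_voice"]),
    ]


messages = route({
    "execution_device": "living-room-tv",
    "source_device": "bedroom-air-conditioner",
    "control_command": "TV_poweron",
    "feedback_voice": "The television has been turned on for you",
})
print(messages)
# [('living-room-tv', 'TV_poweron'),
#  ('bedroom-air-conditioner', 'The television has been turned on for you')]
```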
It can be seen that, in the above embodiment, when the voice control instruction names an explicit execution device, the execution device and the smart device 200 that captured the instruction each respond to their part of it, so that the service requirement is met and the user receives clear feedback.
Because the smart home system may contain several smart devices 200, and different smart devices 200 may support the same service requirement while being in the same device state, the screening described above may yield several target devices. If the server 400 then simply sends a response instruction to each smart device 200 acting as a target device, several devices respond to one voice control instruction at the same time and the scene confusion remains.
For this, the server 400 may refine the screening by adding conditions so as to reduce the number of smart devices 200 that qualify as target devices. That is, in some embodiments the service requirement information may further include a service type and a service state. When screening the target device according to the service requirement information, the server 400 extracts the service type and the service state, matches the candidate devices whose device type meets the service type, and then traverses the device states of the candidates to select the target devices whose device state matches the service state.
For example, when the user says "Hi, small x, turn off the music", the smart devices 200 in the home report the received voice control instruction together with their current device type and device state to the cloud server 400, e.g. device type (music) and device state (playing). After receiving the reports, the server 400 screens the device types and device states in the current smart home system against the service type and service state required by the instruction, and determines that one speaker whose device type and state are "music, playing" matches the object category of the user's instruction. The server 400 therefore issues a response instruction to that speaker and mute instructions to the other devices, so that the speaker executes the music-off operation.
In some embodiments, the service requirement information further includes a service execution location, and the server 400 may additionally filter the smart devices 200 by it to determine the target device. That is, when screening the target device according to the service requirement information, the server 400 extracts the service execution location and obtains the device location of each candidate device in the current smart home system. If the device location of a candidate matches the service execution location, the server proceeds to traverse its device state to select the target device whose device state matches the service state; if the locations do not match, the candidate is marked as not being the target device, i.e. it is removed from the candidate device list.
For example, when the user says "Hi, small x, let the bedroom speaker play music", the smart devices 200 in the home report the received user instruction together with their current device type and device state to the cloud server, e.g. device type (none) and device state (standby). After receiving the reports, the server 400 parses the service execution location "bedroom" from the voice control instruction and screens the smart devices 200 in the current smart home system by location to find those within the bedroom. Having determined that one speaker in the bedroom matches the required device type and device state, the server 400 issues a response instruction to that speaker and mute instructions to all other devices in the current smart home system, both inside and outside the bedroom.
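The multi-round screening by service type, location, and state can be sketched as a chain of filters; the Candidate fields and example values are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    device_id: str
    service_type: str   # e.g. "music"
    state: str          # e.g. "playing", "standby"
    location: str       # e.g. "bedroom"


def screen(candidates, service_type, required_state, location=None):
    """Filter by service type, then by execution location (if given), then by device state."""
    result = [c for c in candidates if c.service_type == service_type]
    if location is not None:
        result = [c for c in result if c.location == location]
    return [c for c in result if c.state == required_state]


devices = [
    Candidate("speaker-A", "music", "standby", "living-room"),
    Candidate("speaker-B", "music", "standby", "bedroom"),
    Candidate("fridge",    "food",  "standby", "kitchen"),
]
print([c.device_id for c in screen(devices, "music", "standby", location="bedroom")])
# -> ['speaker-B']
```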
As can be seen from the above, the multi-device voice wake-up method provided in these embodiments can screen the smart devices 200 in the smart home system in multiple rounds based on service requirement information such as the service type, the service state and the service execution location, narrowing the selection down to a small number of target devices, reducing the amount of communication between the smart devices 200, and improving the efficiency of the intelligent voice control process.
Through the screening process provided in the above embodiments, the server 400 can screen out, from the plurality of smart devices 200, the target devices capable of responding to the control instruction. Although the screening greatly reduces the number of smart devices 200 selected as target devices, in some cases several smart devices 200 can still satisfy the service requirement, whereas the user's voice control generally requires only one or a few specific target devices to respond.
Therefore, as shown in fig. 7, to determine the device that ultimately executes the response, in some embodiments the server 400 may further select the final executing device from the screened smart devices 200 that meet the service requirement. That is, when screening target devices according to the service requirement information, the server 400 may obtain the number of smart devices whose device states can realize the service requirement information. If this number equals 1, i.e. only one smart device 200 in the current smart home system can meet the service requirement, the server 400 may directly mark that smart device as the target device.
If the number of such smart devices is greater than or equal to 2, the server 400 searches for a master device. The master device is one of the plurality of smart devices capable of realizing the service requirement information, and it may carry out further interaction with the user to determine the target device that ultimately responds to the voice control instruction.
That is, in some embodiments, after finding the master device the server 400 may send a query instruction to it, causing the master device to play a query voice, where the query instruction is a multi-round, wake-free voice interaction instruction. The server then receives a confirmation voice instruction input by the user through the master device, extracts target device identification information from the confirmation voice instruction, and uses that identification information to screen the target device from the plurality of smart devices capable of realizing the service requirement information.
For example, suppose the user is in an environment where two smart devices 200, speaker A and speaker B, are both playing music. When the user speaks "Hi, Xiao X, turn off the music", speaker A and speaker B each report their current device type (music) and device state (playing) to the cloud server 400 after receiving the voice command. After receiving this content, the server 400 screens the smart devices 200 that satisfy the service requirement parsed from the voice control instruction. Having determined that the device type and device state of both speakers match the required service type and service state, the server designates speaker A as the master device and issues a multi-round wake-free query instruction to it, for example "There are two devices playing music; which one should be turned off?" The server then receives the confirmation voice instruction fed back by the user; when the user replies "Turn off the music on speaker A", speaker A is determined to be the target device that will ultimately execute the voice control response. At this point, the server 400 sends a response instruction to speaker A and mute instructions to the other smart devices 200, including speaker B.
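For illustration only, the sketch below strings these steps together: counting the qualifying devices, designating a master device, issuing a query, and parsing the user's confirmation to pick the final target. It is a schematic under assumed interfaces; send_query, listen_for_confirmation and the simple substring match on device names are placeholders, not the application's actual protocol.

```python
def resolve_target(matching, send_query, listen_for_confirmation):
    """matching: list of device dicts able to realize the service requirement.
    send_query / listen_for_confirmation: assumed I/O callbacks to the master device."""
    if len(matching) == 1:
        return matching[0]                      # single match: directly the target device
    master = matching[0]                        # e.g. the device nearest the sound source
    # Multi-round, wake-free query played by the master device.
    send_query(master, "There are two devices playing music; which one should be turned off?")
    reply = listen_for_confirmation(master)     # e.g. "turn off the music on speaker A"
    for device in matching:
        if device["name"].lower() in reply.lower():
            return device                       # device named in the confirmation becomes the target
    return master                               # fallback: keep the master device as the target

devices = [{"id": "A", "name": "speaker A"}, {"id": "B", "name": "speaker B"}]
target = resolve_target(
    devices,
    send_query=lambda dev, text: None,                           # stand-in: would play the query voice
    listen_for_confirmation=lambda dev: "turn off the music on speaker A",
)
```

The point of the design is that only the master device talks to the user; the other qualifying devices stay silent until the server has resolved the final target.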
To find the master device among the plurality of smart devices 200 capable of realizing the service requirement information, as shown in fig. 8, in some embodiments the master device may be the smart device 200 closest to the sound source of the voice control instruction. When searching for the master device, the server 400 may obtain the voice audio data that each of these smart devices detected for the voice control instruction, extract an acoustic energy value from each device's audio data, compare the acoustic energy values, and mark the smart device 200 with the highest acoustic energy value as the master device.
Since the reverberation time T60 of a given scene is fixed, that is, the time required for the sound energy to decay by 60 dB is the same at any position, and T60 can be estimated from the direct-to-reverberant energy ratio at a given position, the direct-to-reverberant energy ratio of the sound source can be computed for every smart device 200 in the environment from the beamformed spectrogram and the time difference of arrival, and the direct sound energy can be derived from it. Ranking the direct sound energy received by each device then identifies the smart device 200 closest to the sound source, which is judged to be the master device.
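A highly simplified Python sketch of the "loudest device wins" rule is given below. It only compares per-device energy values computed from raw samples; the estimation of the direct-to-reverberant ratio from the beamformed spectrogram and arrival-time differences described above is not reproduced here, and the audio format is an assumption.

```python
def acoustic_energy(samples):
    """Mean-square energy of the audio samples reported by one device."""
    return sum(s * s for s in samples) / max(len(samples), 1)

def pick_master_by_energy(audio_by_device):
    """audio_by_device: {device_id: [float samples]} detected for the same utterance.
    The device with the highest energy is assumed to be nearest the sound source."""
    return max(audio_by_device, key=lambda dev: acoustic_energy(audio_by_device[dev]))

# Example: speaker A heard the utterance louder than speaker B.
master = pick_master_by_energy({"speaker-A": [0.4, 0.5, 0.3], "speaker-B": [0.1, 0.2, 0.1]})
```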
The master device may also be determined in ways other than comparing acoustic energy. That is, in some embodiments, the distance between the sound source and the devices may be measured by the smart devices 200 themselves: a smart device 200 may capture images of the current environment with multiple cameras, construct a three-dimensional spatial model from the multi-angle images, and extract the human figure from the model by image recognition, thereby locating the user, i.e. the sound source, within the model. After locating the sound source, the smart device 200 determines the distance between the sound source and each smart device 200 according to the layout of the current smart home model and sends the computed distances to the server 400, so that the server 400 can designate the smart device 200 closest to the sound source as the master device.
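When the devices themselves estimate distances from a camera-based three-dimensional model, the server-side decision reduces to picking the smallest reported distance, roughly as sketched below; the reporting format is an assumption for the example.

```python
def pick_master_by_distance(distance_reports):
    """distance_reports: {device_id: distance_in_meters} computed by each device
    from its three-dimensional space model of the room."""
    return min(distance_reports, key=distance_reports.get)

# Each device has located the user (the sound source) and measured its own distance.
master = pick_master_by_distance({"speaker-A": 1.2, "speaker-B": 3.8, "tv": 4.5})
```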
It can be seen that, in the above embodiments, when several smart devices 200 can realize the service requirement, the server 400 can, through further interaction between the master device and the user, select the target device that ultimately executes the voice control response. The devices therefore do not need to communicate frequently with one another before the voice control process, which speeds up the response of the voice interaction.
With the multi-device voice wake-up method provided in the above embodiments, the server 400 determines the target device and, by issuing a response instruction, causes the target device to respond interactively to the user's voice input. Because the interactive response may control the target device to perform specific actions that change its device state, after sending the response instruction the server 400 may further obtain the device state of the target device after the response instruction has been executed, so as to keep the stored device state up to date.
That is, as shown in fig. 9, after sending the response instruction to the target device, the server 400 may receive execution result data reported by the target device, where the execution result data includes the new device state after the response instruction has been run. The server extracts the new device state from the execution result and uses it to update the device state stored in the storage module.
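As a minimal illustration, the following sketch keeps a server-side state table in sync with the execution result reported by the target device; the report fields ("device_id", "new_state") are assumptions made for the example.

```python
class DeviceStateStore:
    """In-memory stand-in for the server's storage module."""
    def __init__(self):
        self.states = {}                      # device_id -> last known device state

    def update_from_result(self, result):
        """result: execution result data reported by the target device,
        containing the new device state after the response instruction ran."""
        self.states[result["device_id"]] = result["new_state"]

store = DeviceStateStore()
store.update_from_result({"device_id": "speaker-A", "new_state": "stopped"})
```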
With the device state updating method provided in the above embodiment, the device states stored in the server 400 are kept consistent with the actual device states of the smart devices 200 in the smart home system, so that in subsequent voice interactions the server 400 can screen the smart devices 200 based on up-to-date device states and determine the target device more accurately.
Based on the above multi-device voice wake-up method, as shown in fig. 10, some embodiments of the present application further provide a server 400 that includes a storage module 410, a communication module 420 and a control module 430, where the control module 430 is configured to perform the following program steps (an illustrative sketch of this flow is given after the list):
acquiring a voice control instruction input by a user through a smart device;
responding to the voice control instruction and parsing the service requirement information contained in it;
screening target devices according to the service requirement information, where a target device is a smart device whose device state can realize the service requirement information;
and sending a response instruction to the target devices, and sending a mute instruction to the other smart devices in the current smart home system.
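Purely as an illustration of how these four steps might be chained on the server side, the sketch below wires together the screening idea from the earlier examples; parse_service_requirement, send_response and send_mute are assumed hooks, not the application's actual interfaces.

```python
def handle_voice_control(instruction_text, reported_devices,
                         parse_service_requirement, send_response, send_mute):
    """reported_devices: device dicts with 'id', 'type', 'state' fields.
    The voice control instruction has already been acquired and is passed in as text."""
    # Parse the service requirement information from the voice control instruction.
    service_type, service_state = parse_service_requirement(instruction_text)
    # Screen target devices whose device state can realize the requirement.
    targets = [d for d in reported_devices
               if d["type"] == service_type and d["state"] == service_state]
    target_ids = {d["id"] for d in targets}
    # Response instruction to the targets, mute instruction to every other device.
    for device in reported_devices:
        if device["id"] in target_ids:
            send_response(device["id"])
        else:
            send_mute(device["id"])
    return targets
```

In this arrangement the mute branch is what keeps non-target devices from answering, which is the behaviour the method relies on.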
In cooperation with the server 400, as shown in fig. 11, some embodiments of the present application further provide a smart device 200 that includes an audio input device, an audio output device, a communicator 220 and a controller 250, where the controller 250 is configured to perform the following program steps (a matching device-side sketch follows the list):
acquiring voice audio data input by a user for voice control;
generating a voice control instruction according to the voice audio data;
sending the voice control instruction to the server, so that the server parses the service requirement information in it and screens target devices accordingly, where a target device is a smart device whose device state can realize the service requirement information;
receiving a response instruction or a mute instruction sent by the server;
and executing the response instruction or the mute instruction.
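A matching device-side sketch is given below; capture_audio, upload and the reply format are assumed stand-ins for the audio input device, the communicator and the server protocol.

```python
def run_voice_client(capture_audio, upload, play_response, stay_silent):
    """One illustrative pass of the controller's program steps."""
    # Acquire the voice audio data input by the user.
    audio = capture_audio()
    # Generate a voice control instruction and send it, with the device state, to the server.
    reply = upload({"audio": audio, "device_state": "playing"})
    # Execute whichever instruction the server returned.
    if reply.get("instruction") == "response":
        play_response(reply.get("text", ""))
    else:
        stay_silent()                 # mute instruction: do not respond to the voice control
```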
In summary, the server 400 and the smart devices 200 provided in the above embodiments can form a smart home system that implements the multi-device voice wake-up method described above. After the user inputs a voice control instruction, the server 400 parses the service requirement information from the instruction, screens out the target devices whose current device states can realize that requirement, and sends a response instruction to the target devices so that the smart devices serving as target devices give a voice response. At the same time, according to the screening result, the server 400 sends a mute instruction to the other devices in the current smart home system, so that the smart devices 200 that are not target devices do not respond to the voice control function. Because the server 400 pre-processes the voice control instruction, all types of smart devices 200 can make a correct wake-up response quickly and efficiently within the specified time, which solves the abnormal-response problem of conventional voice wake-up methods.
The embodiments provided in this application are only a few examples of its general inventive concept and do not limit its scope of protection. For a person skilled in the art, any other embodiment derived from the solutions of this application without inventive effort also falls within its scope of protection.

Claims (10)

1. A server, comprising:
a storage module configured to store device states reported by smart devices;
a communication module configured to establish a communication connection with the smart devices so as to obtain their device states;
a control module configured to:
acquiring a voice control instruction input by a user through a smart device;
responding to the voice control instruction and parsing service requirement information contained in the voice control instruction;
screening a target device according to the service requirement information, wherein the target device is a smart device whose device state can realize the service requirement information;
and sending a response instruction to the target device, and sending a mute instruction to other smart devices in the current smart home system except the target device.
2. The server of claim 1, wherein the control module is further configured to:
acquiring voice audio data corresponding to the voice control instruction, and recognizing a wake-up word in the voice audio data;
if the voice audio data contains the wake-up word, locating the smart home system in which the smart device is located;
and sending a state acquisition request to the smart home system, so that all smart devices in the smart home system report their device states after receiving the state acquisition request.
3. The server according to claim 1, wherein the service requirement information comprises a service type and a service state, and the control module is further configured to:
in the step of screening the target device according to the service requirement information, extracting the service type and the service state from the service requirement information;
matching candidate devices that satisfy the service type, wherein the candidate devices have device types meeting the requirement of the service type;
and traversing the device states of the candidate devices to screen out the target device whose device state conforms to the service state.
4. The server according to claim 3, wherein the service requirement information further comprises a service execution location, and the control module is further configured to:
in the step of screening the target device according to the service requirement information, extracting the service execution location from the service requirement information;
acquiring the device positions of all candidate devices in the current smart home system;
if the device position of a candidate device coincides with the service execution location, executing the step of traversing the device states of the candidate devices;
and if the device position of a candidate device does not coincide with the service execution location, marking the candidate device as not being the target device.
5. The server of claim 1, wherein the control module is further configured to:
in the step of screening the target device according to the service requirement information, acquiring the number of smart devices whose device states can realize the service requirement information;
if the number of such smart devices is greater than or equal to 2, searching for a master device, so as to use the master device to interact with the user to determine the target device, wherein the master device is one of the plurality of smart devices capable of realizing the service requirement information;
and if the number of such smart devices is equal to 1, marking the smart device capable of realizing the service requirement information as the target device.
6. The server of claim 5, wherein the control module is further configured to:
after the step of searching for the master device, sending a query instruction to the master device to cause the master device to play a query voice, wherein the query instruction is a multi-round wake-free voice interaction instruction;
receiving a confirmation voice instruction input by the user through the master device;
extracting target device identification information from the confirmation voice instruction;
and screening the target device from the plurality of smart devices capable of realizing the service requirement information according to the target device identification information.
7. The server of claim 1, wherein the control module is further configured to:
after the step of acquiring the voice control instruction input by the user through the smart device, parsing execution device identification information from the voice control instruction;
if the voice control instruction contains the execution device identification information, generating a control command and feedback voice information according to the voice control instruction;
and sending the control command to the execution device according to the execution device identification information, and sending the feedback voice information to the smart device that input the voice control instruction.
8. The server of claim 1, wherein the control module is further configured to:
after the step of sending the response instruction to the target device, receiving execution result data reported by the target device, wherein the execution result data comprises the new device state after the response instruction has been run;
extracting the new device state from the execution result;
and updating the device state stored in the storage module with the new device state.
9. A smart device, comprising:
an audio input device configured to detect voice audio data input by a user;
an audio output device configured to play a voice response;
a communicator configured to establish a communication connection with a server so as to transmit the device state to the server;
a controller configured to:
acquiring voice audio data which is input by a user and used for executing voice control;
generating a voice control instruction according to the voice audio data;
sending the voice control instruction to the server, so that the server parses service requirement information in the voice control instruction and screens a target device according to the service requirement information, wherein the target device is a smart device whose device state can realize the service requirement information;
receiving a response instruction or a mute instruction sent by the server;
and executing the response instruction or the mute instruction.
10. A multi-device voice wake-up method, applied to a smart home system, wherein the smart home system comprises a server and a plurality of smart devices, and the smart devices are communicatively connected to the server; the multi-device voice wake-up method comprises the following steps:
a smart device acquires voice audio data input by a user, generates a voice control instruction according to the voice audio data, and sends the voice control instruction and its device state to the server;
the server parses service requirement information from the voice control instruction and screens a target device according to the service requirement information, wherein the target device is a smart device whose device state can realize the service requirement information;
the server sends a response instruction to the smart device serving as the target device and sends a mute instruction to other smart devices in the current smart home system except the target device;
the smart device serving as the target device runs the response instruction to respond to the voice control function;
and the other smart devices in the current smart home system except the target device run the mute instruction and do not respond to the voice control function.
CN202111521226.2A 2021-06-22 2021-12-13 Server, intelligent home system and multi-device voice awakening method Pending CN114172757A (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN202111521226.2A CN114172757A (en) 2021-12-13 2021-12-13 Server, intelligent home system and multi-device voice awakening method
PCT/CN2022/100547 WO2022268136A1 (en) 2021-06-22 2022-06-22 Terminal device and server for voice control
CN202280038248.XA CN117882130A (en) 2021-06-22 2022-06-22 Terminal equipment and server for voice control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111521226.2A CN114172757A (en) 2021-12-13 2021-12-13 Server, intelligent home system and multi-device voice awakening method

Publications (1)

Publication Number Publication Date
CN114172757A true CN114172757A (en) 2022-03-11

Family

ID=80486145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111521226.2A Pending CN114172757A (en) 2021-06-22 2021-12-13 Server, intelligent home system and multi-device voice awakening method

Country Status (1)

Country Link
CN (1) CN114172757A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109450747A (en) * 2018-10-23 2019-03-08 珠海格力电器股份有限公司 A kind of method, apparatus and computer storage medium waking up smart home device
CN111880645A (en) * 2019-05-02 2020-11-03 三星电子株式会社 Server for determining and controlling target device based on voice input of user and operating method thereof
CN113470634A (en) * 2020-04-28 2021-10-01 海信集团有限公司 Control method of voice interaction equipment, server and voice interaction equipment

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114697151A (en) * 2022-03-15 2022-07-01 杭州控客信息技术有限公司 Intelligent home system with non-voice awakening function and non-voice awakening method thereof
CN114697151B (en) * 2022-03-15 2024-06-07 杭州控客信息技术有限公司 Intelligent home system with non-voice awakening function and voice equipment awakening method
CN115273850A (en) * 2022-09-28 2022-11-01 科大讯飞股份有限公司 Autonomous mobile equipment voice control method and system
CN115665894A (en) * 2022-10-20 2023-01-31 四川启睿克科技有限公司 Whole-house distributed voice gateway system and voice control method
CN116052666A (en) * 2023-02-21 2023-05-02 之江实验室 Voice message processing method, device, system, electronic device and storage medium
CN116580711A (en) * 2023-07-11 2023-08-11 北京探境科技有限公司 Audio control method and device, storage medium and electronic equipment
CN116582382A (en) * 2023-07-11 2023-08-11 北京探境科技有限公司 Intelligent device control method and device, storage medium and electronic device
CN116580711B (en) * 2023-07-11 2023-09-29 北京探境科技有限公司 Audio control method and device, storage medium and electronic equipment
CN116582382B (en) * 2023-07-11 2023-09-29 北京探境科技有限公司 Intelligent device control method and device, storage medium and electronic device

Similar Documents

Publication Publication Date Title
CN114172757A (en) Server, intelligent home system and multi-device voice awakening method
CN111989741B (en) Speech-based user interface with dynamically switchable endpoints
US20220286317A1 (en) Apparatus, system and method for directing voice input in a controlling device
CN106297781B (en) Control method and controller
KR102025566B1 (en) Home appliance and voice recognition server system using artificial intelligence and method for controlling thereof
CN106154860B (en) A kind of intelligent switch and the smart home system using the intelligent switch
CN114067798A (en) Server, intelligent equipment and intelligent voice control method
CN105847099B (en) Internet of things implementation system and method based on artificial intelligence
US20140100854A1 (en) Smart switch with voice operated function and smart control system using the same
CN109473095A (en) A kind of intelligent home control system and control method
CN109616111B (en) Scene interaction control method based on voice recognition
CN109377992A (en) Total space interactive voice Internet of Things network control system and method based on wireless communication
US20200213653A1 (en) Automatic input selection
WO2017141530A1 (en) Information processing device, information processing method and program
CN112838967B (en) Main control equipment, intelligent home and control device, control system and control method thereof
CN112837526A (en) Universal integrated remote control method, control device and universal integrated remote control device
CN112331195B (en) Voice interaction method, device and system
CN113674738A (en) Whole-house distributed voice system and method
JP7374099B2 (en) Apparatus, system and method for instructing voice input in a control device
WO2022268136A1 (en) Terminal device and server for voice control
CN111833585A (en) Method, device and equipment for intelligent equipment to learn remote control function and storage medium
CN109658924B (en) Session message processing method and device and intelligent equipment
CN115834271A (en) Server, intelligent device and intelligent device control method
CN114999496A (en) Audio transmission method, control equipment and terminal equipment
CN110879695B (en) Audio playing control method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20220311