CN114979543B - Smart home control method and device - Google Patents


Info

Publication number
CN114979543B
CN114979543B (application number CN202110204624.5A)
Authority
CN
China
Prior art keywords
user
control
terminal
intelligent home
call request
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110204624.5A
Other languages
Chinese (zh)
Other versions
CN114979543A (en)
Inventor
黑昱冬
周靖
田天
李雪丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN202110204624.5A
Publication of CN114979543A
Application granted
Publication of CN114979543B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00Television systems
    • H04N7/14Systems for two-way working
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification techniques
    • G10L17/06Decision making techniques; Pattern matching strategies
    • G10L17/14Use of phonemic categorisation or speech recognition prior to speaker recognition or verification
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/02Total factory control, e.g. smart factories, flexible manufacturing systems [FMS] or integrated manufacturing systems [IMS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application provides a smart home control method and device. The method includes the following steps: a control end answers a call request from a terminal, where the call request is sent by the terminal's user dialing a preset number and is used to request remote control of the smart home; in response to the call request, the control end invokes a pre-configured virtual digital person to conduct a video call with the terminal's user; and, based on the video call, the control end obtains the user's control instruction for the smart home. The call request may be carried on resources of an LTE or 5G communication system. Because the method relies on the terminal's built-in dialing flow (for example, under the LTE or 5G standard), the user can control the smart home without downloading a dedicated application, reducing the memory footprint on the terminal; and because the call is carried on LTE or 5G resources, the transmission of identity authentication information, instructions, and the like is safer and more reliable.

Description

Smart home control method and device
Technical Field
The application relates to the field of mobile communication technology, and in particular to a smart home control method and device.
Background
A smart home connects various in-home devices (such as lighting, curtains, air conditioners, security systems, digital cinema systems, audio/video servers, and networked home appliances) through Internet-of-Things technology and provides a variety of control functions and means. Currently, the main control means are remote network control, local control, timed control, and one-touch scene control.
However, existing remote network control requires downloading and installing dedicated software provided by the smart home manufacturer, such as an application (APP), to perform the related operations. As the number of smart home device types grows, the amount of corresponding dedicated software grows with it, which can occupy considerable storage space on the user's terminal.
Disclosure of Invention
The application provides a smart home control method and device that can complete remote control of a smart home without downloading an APP, thereby reducing the memory footprint on the user's terminal.
In a first aspect, the present application provides a smart home control method, including: receiving a call request from a terminal, where the call request is sent by the terminal's user dialing a preset number and is used to request remote control of a smart home; in response to the call request, invoking a pre-configured virtual digital person to conduct a video call with the terminal's user; and, based on the video call, obtaining the user's control instruction for the smart home.
Based on this scheme, the control end responds to a call request sent through the terminal's built-in dialing flow (for example, under the LTE or 5G standard) when the user dials the preset number, and invokes a pre-configured virtual digital person to conduct a video call with the user, thereby obtaining the user's control instruction for the smart home. The user can thus control the smart home without downloading a dedicated application on the terminal, reducing the memory footprint on the user's terminal. And because no complex APP operation interface is involved, operation is more convenient for the user.
Optionally, before obtaining the user's control instruction for the smart home based on the video call, the method further includes: prompting the user, through the virtual digital person, to input an identity identifier; determining, based on the identity identifier, whether the user is a registered user; if the user is a registered user, authenticating the user based on the user's voiceprint and face; or, if the user is an unregistered user, prompting the user to perform voiceprint registration and face registration.
Optionally, the identity identifier is the user's mobile phone number.
Optionally, authenticating the user includes: collecting the user's voiceprint and face based on the video call; and verifying the user's identity by recognizing the collected voiceprint and face against the user's pre-stored voiceprint and face.
Optionally, the method further includes: asking, through the virtual digital person, for the user's home address and the product models of the smart home devices; collecting the user's voiceprint and face based on the video call; and storing the user's home address, the smart home product models, and the user's voiceprint and face in a local database.
Optionally, obtaining the user's control instruction for the smart home based on the video call includes: obtaining a voice instruction from the user based on the video call; and performing speech recognition on the voice instruction to determine the smart home device that the user wishes to control and the control operation to perform on it.
Optionally, the method further includes: executing the control instruction; and, after the control instruction completes, feeding back its completion status to the user through the virtual digital person.
Optionally, the call request and the video call are carried on resources of a long term evolution (LTE) communication system or resources of a fifth generation (5G) communication system.
In a second aspect, the present application provides an intelligent home control device, including a unit for implementing the intelligent home control method described in the first aspect.
In a third aspect, the present application provides an intelligent home control device, including a processor, where the processor is configured to execute the intelligent home control method described in the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, comprising instructions which, when run on a computer, cause the computer to implement the method of the first aspect or any possible implementation of the first aspect.
In a fifth aspect, a computer program product is provided, comprising a computer program (which may also be referred to as code or instructions) which, when executed, causes a computer to perform the method of the first aspect or any possible implementation of the first aspect.
It should be understood that the second to fifth aspects of the application correspond to the technical solution of the first aspect; the benefits obtained by each aspect and its corresponding possible embodiments are similar and are not repeated here.
Drawings
Fig. 1 is a network architecture applicable to the smart home control method provided by the application;
Fig. 2 is a schematic flowchart of a smart home control method according to an embodiment of the application;
Fig. 3 is a schematic block diagram of a smart home control system according to an embodiment of the application;
Fig. 4 is a schematic block diagram of a smart home control device according to an embodiment of the application;
Fig. 5 is another schematic block diagram of a smart home control device according to an embodiment of the application.
Detailed Description
The technical scheme of the application will be described below with reference to the accompanying drawings.
A smart home takes the residence as its platform and integrates the facilities related to home life using integrated wiring, network communication, security protection, automatic control, and audio/video technologies. It builds an efficient management system for residential facilities and household schedules, improves the safety, convenience, comfort, and elegance of home life, and realizes an environmentally friendly, energy-saving living environment. Compared with an ordinary home, a smart home not only retains the traditional residential functions but also provides building, network communication, information appliance, and equipment automation capabilities, offers all-round information interaction, and can even reduce spending on various energy costs.
The technical scheme of the embodiment of the application can be used for controlling various intelligent household devices. The intelligent home may include, for example, but not limited to, intelligent light fixtures, intelligent washing machines, intelligent refrigerators, intelligent electric cookers, intelligent door locks, intelligent curtains, central dust collection systems, intelligent background music systems, intelligent video systems, digital cinema systems, intelligent security alarms, intelligent monitoring systems, and the like.
Control of smart home devices may include, for example, but is not limited to, control of the various smart home devices listed above. For brevity, this is not a list.
To facilitate understanding of the embodiments of the application, a network architecture applicable to the smart home control method provided by the application is first described in detail with reference to Fig. 1. As shown in Fig. 1, the network architecture 100 may include a terminal 110, a control end 120, and smart home devices 131, 132, and 133. The terminal 110 and the control end 120 may communicate through the global system for mobile communications (GSM), and the control end 120 may be configured to control the smart home devices 131 to 133.
It should be understood that, in the embodiments of the application, the terminal 110 may be, for example, a mobile phone or a wearable device, and the control end 120 may be, for example, a smart home control device with an audio/video call function based on the terminal's built-in standard (such as the LTE standard or the 5G standard), or a control device with such a call function integrated in a smart home device. The embodiments of the application are not limited in this respect. The control end 120 may also be connected to multiple terminals 110; the embodiments of the application do not limit their number.
The following describes in detail a smart home control method provided by an embodiment of the present application with reference to fig. 2. As shown in fig. 2, the method 200 may include steps 201 through 214.
In step 201, the control end receives a call request from the terminal.
Optionally, the call request is sent by the terminal's user dialing a preset number and is used to request remote control of the smart home. The preset number may be configured on the control end and used to reach the control end. The control end may be, for example, a smart home control device with an audio/video call function based on the terminal's built-in standard (such as the LTE standard or the 5G standard), or a control device with such a call function integrated in a smart home device. The embodiments of the application are not limited in this respect.
Alternatively, the user of the terminal may be a family member or other user desiring to control the smart home.
It should be understood that in the embodiments of the application, "the terminal", "the user of the terminal", and "the user" are used interchangeably; all three express the same meaning.
Alternatively, the call request may be placed from a fourth generation (4G) terminal or a 5G terminal. A 4G terminal is a terminal supporting 4G, i.e., supporting LTE; a 5G terminal is likewise a terminal supporting 5G. The call request may be carried on resources of the LTE communication system or on resources of the 5G communication system. For example, the call request may be sent via the voice over LTE (VoLTE) protocol.
It should be noted that VoLTE is a high-speed wireless communication standard for mobile phones and data terminals; compared with a traditional circuit-switched voice network, VoLTE carries voice services over the LTE data bearer network. It should be understood that VoLTE is only an example, and the application does not exclude the possibility that future communication systems, such as 5G, define other protocols for carrying voice traffic.
It should be understood that the foregoing merely illustrates one possible way for the terminal to send the call request; the embodiments of the application do not limit the network or the handset used for the call request.
In step 202, in response to the call request, the control end invokes a pre-configured virtual digital person to make a video call with the user of the terminal.
A virtual digital person can be understood as a figure custom-generated for user needs based on artificial intelligence (AI) technology. Based on AI, a virtual digital person can turn static image and text content into dynamic video narrated by a "real person". In the embodiments of the application, the virtual digital person can perform semantic understanding of received speech, generate a corresponding reply based on that understanding, and output the reply as audio and video so as to hold a dialogue with the user. In other words, the virtual digital person can conduct a video call with the user. It should be understood that "virtual digital person" is only one possible name and should not be construed as limiting the application in any way; for example, a virtual digital person may also be called a virtual manager, a simulated digital person, and so on. The application is not limited in this respect.
In the embodiments of the application, the virtual digital person may be pre-configured on the control end and, when invoked, can conduct a video call with the terminal's user.
In step 203, based on the video call, the control end obtains the control instruction of the user to the smart home.
It should be understood that the control instructions are for controlling the smart home.
It should also be understood that, by receiving a call request from the terminal, invoking a pre-configured virtual digital person in response to the request to conduct a video call with the terminal's user, and obtaining the user's control instruction for the smart home based on that video call, the control end allows the user to control the smart home without downloading an APP on the terminal. This reduces the memory footprint on the user's terminal and makes operation more convenient.
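As a rough illustration of steps 201 to 203, the control-end flow can be sketched as follows. All names here (`ControlEnd`, `receive_call`, the preset number "10010", and so on) are hypothetical illustrations, not an interface defined by the patent.

```python
class ControlEnd:
    """Minimal model of the control end: answer a call request, run the
    virtual-digital-person video call, and return the user's instruction."""

    def __init__(self, preset_number):
        # The preset number is configured on the control end (step 201).
        self.preset_number = preset_number

    def receive_call(self, dialed_number):
        # Step 201: the call request is accepted only if the user dialed
        # the preset number.
        return dialed_number == self.preset_number

    def video_call(self, user_utterance):
        # Step 202: stand-in for invoking the pre-configured virtual
        # digital person; it prompts the user and captures the reply.
        return {"prompt": "How can I help with your smart home?",
                "utterance": user_utterance}

    def get_instruction(self, call):
        # Step 203: derive the control instruction from the video call.
        return call["utterance"]
```

For example, a control end configured with the (hypothetical) preset number "10010" accepts a call dialed to that number, after which the captured utterance becomes the control instruction.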
Optionally, steps 204 to 212 are included before step 203.
In step 204, the user is prompted by the virtual digital person to enter an identification.
Alternatively, the identification may be, for example, a cell phone number. The mobile phone number may be, for example, the user's own number or another mobile phone number entered by the user. The application is not limited in this regard.
Optionally, the control end's local database pre-stores identity identifiers. A pre-stored identity identifier can be used to determine whether the user exists, i.e., to verify whether the user is a registered user.
Optionally, the control end local database pre-stores the user voiceprint and the face corresponding to the identity. The pre-stored user voiceprints and faces can be used to authenticate the user.
In step 205, it is determined whether the user is a registered user based on the identity.
The identity identifier entered by the user is looked up in the control end's local database to determine whether the user is a registered user. If the user is registered, steps 206 to 207 are performed; if not, steps 208 to 211 are performed for the unregistered user.
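A minimal sketch of this lookup, assuming the local database maps mobile phone numbers to the registered user's stored records (the layout and sample values below are invented for illustration):

```python
# Hypothetical local-database layout: identity identifier (phone number)
# mapped to the user's stored record (step 211 fills these fields in).
local_db = {
    "13800000000": {
        "voiceprint": "voiceprint-password-blob",
        "face": "face-identification-vector",
        "home_address": "example address",
        "product_models": ["smart-lock-A1"],
    },
}

def is_registered(identity, db=local_db):
    """Step 205: the user is registered iff the identity is in the database."""
    return identity in db
```

A registered identity proceeds to verification (steps 206 to 207); an unknown one triggers the registration flow (steps 208 to 211).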
In step 206, based on the video call, voiceprints and faces of the user are collected.
Illustratively, the process of collecting the user's voiceprint is as follows:
First, pulse code modulation (PCM) audio data of the user is acquired based on the video call. For example, the PCM audio data is framed at 25 milliseconds per frame, with 400 samples per frame (corresponding to a 16 kHz sampling rate).
The PCM audio data then undergoes front-end processing (active speech detection, speech enhancement, and so on), in which, for example, the energy value and zero-crossing rate of each frame are used to filter out environmental noise and silence that may be present, yielding, say, 320 frames of valid PCM audio data.
It should be appreciated that, since each frame of PCM audio data is 25 milliseconds, 320 valid frames correspond to 8 seconds of PCM audio data. The 8-second case is used as the running example below.
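The framing and filtering just described can be sketched in a few lines of pure Python; the exact energy and zero-crossing thresholds are not given in the text, so the values below are assumptions:

```python
FRAME_SIZE = 400  # 25 ms frames of 400 samples (16 kHz sampling)

def frame_energy(frame):
    # Mean squared amplitude of the frame.
    return sum(s * s for s in frame) / len(frame)

def zero_crossing_rate(frame):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / len(frame)

def filter_valid_frames(samples, energy_thr=1e4, zcr_max=0.5):
    """Drop silence and noise-like frames; keep the 'valid' speech frames."""
    frames = [samples[i:i + FRAME_SIZE]
              for i in range(0, len(samples) - FRAME_SIZE + 1, FRAME_SIZE)]
    return [f for f in frames
            if frame_energy(f) > energy_thr and zero_crossing_rate(f) < zcr_max]
```

A silent frame fails the energy test and is dropped, while a loud low-zero-crossing frame is kept as valid speech.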
Next, zero-order, first-order, and second-order statistics in the sense of the forward-backward (Baum-Welch) algorithm are extracted from the valid PCM audio data.
For example, the valid PCM audio data may be represented as a dataset X = {x_1, x_2, ..., x_S}. For the s-th segment x_s in the dataset, the extracted acoustic feature sequence may be denoted o_s, with the t-th frame feature denoted o_{s,t}. The zero-order statistic N_{c,s}, first-order statistic F_{c,s}, and second-order statistic S_{c,s} of this segment on the c-th Gaussian mixture component can then be computed from the universal background model (UBM) as:
N_{c,s} = Σ_t γ_{c,s,t}
F_{c,s} = Σ_t γ_{c,s,t} (o_{s,t} − μ_c)
S_{c,s} = diag{ Σ_t γ_{c,s,t} (o_{s,t} − μ_c)(o_{s,t} − μ_c)ᵀ }
where c indexes the Gaussian mixture components of the universal background model, diag{·} denotes taking the matrix diagonal, μ_c is the mean of the c-th Gaussian mixture component, and γ_{c,s,t} is the posterior probability of the t-th frame feature of the s-th segment on the c-th Gaussian mixture component.
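In code, the three statistics for one component c reduce to weighted sums over the frames. This pure-Python sketch centers the features on μ_c and keeps only the diagonal of the second-order term, matching the diag{·} operation; a real system would use vectorized code.

```python
def baum_welch_stats(features, posteriors, mu_c):
    """features: list of T frame feature vectors o_{s,t}
    posteriors: list of T posteriors gamma_{c,s,t} for component c
    mu_c: mean vector of component c
    Returns (N, F, S): the zero-order scalar, first-order vector, and the
    diagonal of the second-order matrix, with features centered on mu_c."""
    dim = len(mu_c)
    N = sum(posteriors)          # zero-order: total posterior mass
    F = [0.0] * dim              # first-order: weighted sum of features
    S = [0.0] * dim              # second-order: diagonal of outer products
    for o, g in zip(features, posteriors):
        for d in range(dim):
            c = o[d] - mu_c[d]   # centered feature dimension
            F[d] += g * c
            S[d] += g * c * c
    return N, F, S
```

For two frames with posteriors 0.5 and 1.0 the statistics are just the corresponding weighted sums, which makes the definitions easy to check by hand.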
Finally, the result obtained from the 8 seconds of PCM audio data is stored in the local database as the voiceprint password.
Alternatively, the resulting 8 seconds of PCM audio data may be stored in a voiceprint library, which may be included in the local database.
Illustratively, the process of collecting the face of the user is as follows:
first, video data of a user is acquired based on a video call.
Then, using an H.264 decoder, the frames in the video data are separated; one frame is taken and sent to the face recognition engine.
Finally, if the face recognition engine returns a valid result and a face identification vector, a mapping table between the calling user and the face identification vector is created and stored in the local database.
Alternatively, the mapping table may be stored in a face library, and the local database may include the face library.
However, if the face recognition engine returns an invalid result, the system waits one second and takes another frame. If recognition fails n1 times, registration failure is returned, where n1 is a predefined value, for example satisfying 0 < n1 < 10.
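The take-a-frame / wait-one-second / retry loop can be sketched as below; `get_frame` and `recognize` are hypothetical stand-ins for the video decoder and the face recognition engine, which the text does not specify at code level:

```python
import time

def register_face(get_frame, recognize, n1=5, delay=1.0):
    """Return the face identification vector, or None after n1 failures.

    get_frame: callable returning one decoded video frame
    recognize: callable returning an identification vector, or None on
               an invalid result
    """
    for _ in range(n1):
        result = recognize(get_frame())
        if result is not None:      # engine returned a valid vector
            return result
        time.sleep(delay)           # delay one second, then take a new frame
    return None                     # n1 failures: registration failed
```

The same retry shape applies to the n2-bounded loop used later for face verification.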
In step 207, the user is authenticated based on the collected voiceprints and faces, and pre-stored user voiceprints and faces.
Before identity verification, a welcome message is played to inform the user that identity verification will follow, and the user is advised to adjust the phone so that their head is at the designated position on the screen and to read aloud the words prompted on the screen.
Optionally, the authentication includes voiceprint authentication and face authentication.
The following exemplarily illustrates a specific implementation procedure of voiceprint verification:
First, PCM audio data of a user is acquired based on a video call.
Then, front-end processing of the PCM audio data (including active speech detection and speech enhancement) yields 4 seconds of valid PCM audio data. Specifically, the energy value and zero-crossing rate of each frame are computed, and environmental noise, silence, and the like that may be present in the PCM audio data are filtered out, leaving 160 frames of valid data.
Next, the voiceprint library is accessed with the user's identity identifier to retrieve the user's voiceprint password.
A one-to-one voiceprint comparison is then performed using probabilistic linear discriminant analysis (PLDA) to obtain a similarity score. The speaker information and channel information contained in an i-vector lie in two linear subspaces. Assuming speaker s has R_s training utterances, the set of all i-vectors extracted by i-vector factor analysis can be expressed as {w_{s,r}: r = 1, 2, ..., R_s}, and the r-th i-vector of speaker s can be modeled with PLDA as:
w_{s,r} = μ + Ψβ_s + Γα_{s,r} + ε.
Here μ is the mean of the whole training data; Ψ is the identity space, containing bases that can represent various identities; β_s is a speaker's identity (the speaker's position in the identity space); Γ is the error space, containing bases that can represent different variations of the same identity; α_{s,r} is the position in that space; and the residual noise term ε, which accounts for what remains unexplained, is zero-mean Gaussian. The first two terms on the right-hand side of the equation depend only on the speaker, not on any particular utterance of the speaker; they are called the signal part and describe between-speaker differences. The last two terms describe differences between different utterances of the same speaker and are called the noise part. These two latent variables describe the data structure of an utterance.
After the i-vectors are modeled via subspace PLDA and the model parameters are estimated, speakers can be classified based on the model; under the subspace PLDA modeling approach, the test task is scored with likelihood-ratio scores.
Finally, the score is compared with a specified threshold; if it exceeds the threshold, voiceprint verification is considered passed.
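The patent scores the one-to-one comparison with PLDA likelihood ratios. As a deliberately simpler illustrative stand-in, the sketch below scores two fixed-length voiceprint vectors by cosine similarity and applies the same exceed-the-threshold decision rule; the threshold value is an assumption.

```python
import math

def cosine_score(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def voiceprint_passes(enrolled, probe, threshold=0.8):
    """Verification passes if the score exceeds the specified threshold."""
    return cosine_score(enrolled, probe) > threshold
```

Identical vectors score 1.0 and pass; orthogonal vectors score 0.0 and fail, mirroring the pass/fail decision the text describes.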
The following exemplarily shows a specific implementation procedure of face verification:
First, the face library is accessed with the user's identity identifier to obtain the face identification vector.
Then, based on the video call, the video data is read.
Next, an H.264 decoder separates the frames in the video data; one frame is taken and sent, together with the face identification vector, to the face recognition engine with a request for 1-to-1 head-portrait recognition.
Finally, if the engine returns success, recognition stops and video identity verification is reported as passed. If it returns failure, the frame data is re-read and recognition repeated, up to n2 times, where n2 is a predefined value, for example satisfying 0 < n2 < 10.
In step 208, the user is prompted to perform voiceprint registration and face registration.
Optionally, the terminal prompts the user that the currently entered mobile phone number is unregistered and prompts the user to register the entered number.
In the case where the user determines to perform voiceprint registration and face registration, step 210 is performed.
In step 209, the virtual digital person asks for the user's home address and the product models of the smart home devices.
In step 210, based on the video call, voiceprints and faces of the user are collected.
It should be understood that, based on the video call, the process of collecting the voiceprints and faces of the user is shown in step 206, and will not be described again.
In step 211, the home address of the user, the product model of the smart home, the collected voiceprint and the face are stored in a local database.
It should be understood that the voiceprints and the faces stored in the local database are the prestored voiceprints and faces, and can be used for performing identity verification on the user.
Optionally, the identity of the user is saved. The stored identity is the pre-stored identity, and can be used for determining whether the user is a registered user.
In case the authentication passes or the voiceprint registration and face registration succeed, the above step 203 is performed.
Optionally, after step 203, steps 212 to 214 may be performed.
In step 212, speech recognition is performed on the control instruction to determine the smart home device that the user wishes to control and the control operation to perform on it.
The speech recognition process for control instructions is as follows:
First, the user's audio data is read from a message queue, packaged according to the Media Resource Control Protocol version 2 (MRCPv2), and forwarded to the speech recognition engine.
Then, after the transcription result is obtained from the speech recognition engine, pinyin error correction and hotword error correction are applied to complete recognition of the user's intent, and the voice instruction with the closest similarity is selected.
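The "closest similarity" selection can be illustrated with `difflib` from the Python standard library as a stand-in for the patent's matcher; the command list and cutoff below are invented for the example:

```python
import difflib

# Hypothetical set of supported voice instructions.
COMMANDS = [
    "turn on the living room light",
    "turn off the living room light",
    "open the curtains",
    "close the curtains",
]

def closest_command(transcript, commands=COMMANDS, cutoff=0.6):
    """Return the supported command most similar to the transcript,
    or None if nothing clears the similarity cutoff."""
    matches = difflib.get_close_matches(transcript, commands, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A slightly garbled transcript still maps to the intended command, while an unrelated utterance maps to nothing, which is the behavior the error-correction and similarity step aims for.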
It should be appreciated that the accuracy of the control instructions may be improved through the voice recognition process, thereby improving the accuracy and efficiency of voice control to some extent.
In step 213, the control instruction described above is executed.
In step 214, the completion of the control instruction is fed back to the user by the virtual digital person.
Based on this scheme, the control end responds to a call request sent through the terminal's built-in dialing flow (for example, under the LTE or 5G standard) when the user dials the preset number, invokes a pre-configured virtual digital person to conduct a video call with the user, and thereby obtains the user's control instruction for the smart home. The user can therefore control the smart home without downloading an APP on the terminal, reducing the memory footprint on the user's terminal. Because no complex APP operation interface is involved, operation is also more convenient. In addition, since LTE or 5G communication resources carry the video call, the user's identity information, instructions, and the like can be carried on those resources; compared with the public Internet, this avoids privacy leakage and is safer and more reliable.
Fig. 3 is a schematic block diagram of an intelligent home control system according to an embodiment of the present application. As shown in fig. 3, the smart home control system may include a terminal, a control end, a local database, a user voiceprint acquisition module, a user voiceprint comparison module, a user head portrait acquisition module, a user head portrait comparison module, a voice instruction recognition module, and a smart home interface gateway module.
Optionally, the local database described above may include, for example, a voiceprint library and an avatar library.
It should be appreciated that the method 200 described in fig. 2 may be applied in the system shown in fig. 3, and that the modules in fig. 3 may be used to accomplish the method described in fig. 2.
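To make the interplay of the Fig. 3 modules concrete, the following is a hypothetical sketch of the control-end flow: compare the caller's voiceprint and head portrait against the local libraries, and forward the recognized instruction to the smart home interface gateway only when both match. All class and method names here are assumptions for illustration; the patent does not prescribe this code.

```python
# Sketch of the control-end flow over the Fig. 3 modules:
# voiceprint comparison + head-portrait comparison, then gateway forwarding.
class ControlEnd:
    def __init__(self, voiceprint_db, avatar_db):
        self.voiceprint_db = voiceprint_db  # local voiceprint library
        self.avatar_db = avatar_db          # local head-portrait (avatar) library
        self.gateway_log = []               # stand-in for the interface gateway

    def authenticate(self, phone, voiceprint, face):
        # Registered user: both biometrics must match the stored records.
        return (self.voiceprint_db.get(phone) == voiceprint
                and self.avatar_db.get(phone) == face)

    def handle_call(self, phone, voiceprint, face, utterance):
        if not self.authenticate(phone, voiceprint, face):
            return "registration required"
        self.gateway_log.append(utterance)  # forward the instruction to the gateway
        return "instruction forwarded"

ctl = ControlEnd({"13800000000": "vp1"}, {"13800000000": "face1"})
print(ctl.handle_call("13800000000", "vp1", "face1", "turn on the light"))
```

The unregistered-user branch ("registration required") corresponds to the registration prompt in method 200; a production system would compare biometric embeddings rather than exact values.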
Fig. 4 is a schematic block diagram of an intelligent home control device according to an embodiment of the present application. As shown in fig. 4, the apparatus 400 may include: a processing module 410 and a communication module 420.
The processing module 410 may be configured to invoke a pre-configured virtual digital person to conduct a video call with a user of the terminal in response to the call request.
The communication module 420 may be configured to answer a call request from a terminal, where the call request is a request sent by a user of the terminal by dialing a preset number, and the call request is used to request remote control of a smart home; and the control instruction of the user to the intelligent home can be acquired based on the video call.
Optionally, the communication module 420 may be further configured to prompt the user for an identification through the virtual digital person.
Optionally, the processing module 410 may be further configured to determine, based on the identity, whether the user is a registered user; and, in the case that the user is a registered user, to authenticate the user based on the voiceprint and the face of the user.
Optionally, the communication module 420 may be further configured to prompt the user to perform voiceprint registration and face registration when the user is an unregistered user.
Optionally, the identity is a mobile phone number of the user.
Optionally, the communication module 420 may be further configured to collect voiceprints and faces of the user based on the video call.
Optionally, the processing module 410 may be further configured to perform authentication on the user according to the recognition of the voiceprint and the recognition of the face, and the prestored voiceprint and face of the user.
Optionally, the communication module 420 may be further configured to query, through the virtual digital person, a home address of the user and a product model of the smart home; and collecting voiceprints and faces of the user based on the video call.
Optionally, the processing module 410 may be further configured to store the home address of the user, the product model of the smart home, the voiceprint of the user, and the face in a local database.
Optionally, the communication module 420 may be further configured to obtain a voice instruction of the user based on the video call.
Optionally, the processing module 410 may be further configured to perform voice recognition on the voice command to determine the smart home that the user wishes to control and the control operation of the smart home.
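Determining the target appliance and the control operation from a recognized utterance can be illustrated with a simple keyword-mapping sketch. The device and operation tables are assumptions for illustration only; real systems would use the intent-recognition pipeline described earlier.

```python
# Sketch: map a recognized utterance to a (device, operation) pair.
DEVICES = {"light": "living_room_light", "air conditioner": "bedroom_ac"}
OPERATIONS = {"turn on": "ON", "turn off": "OFF"}

def parse_instruction(utterance: str):
    # Find the first device keyword and the first operation keyword present.
    utterance = utterance.lower()
    device = next((d for k, d in DEVICES.items() if k in utterance), None)
    op = next((o for k, o in OPERATIONS.items() if k in utterance), None)
    return device, op

print(parse_instruction("please turn off the air conditioner"))  # ('bedroom_ac', 'OFF')
```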
Optionally, the processing module 410 may be further configured to execute the control instructions.
Optionally, the communication module 420 may be further configured to feed back, to the user, the completion of the control instruction through the virtual digital person after the control instruction is completed.
Optionally, the call request and the video call are carried on resources of an LTE communication system or resources of a 5G communication system.
Fig. 5 is another schematic block diagram of an intelligent home control device according to an embodiment of the present application. The apparatus can be used to implement the functions of the processing module and the communication module in the above method. The apparatus may be a chip system. In the embodiments of the present application, the chip system may consist of a chip, or may include a chip and other discrete devices.
As shown in fig. 5, the apparatus 500 may include at least one processor 510. Illustratively, the processor 510 is operable to invoke a pre-configured virtual digital person to conduct a video call with a user of the terminal in response to the call request. Reference is made specifically to the detailed description in the method examples, and details are not described here.
The apparatus 500 may also include at least one memory 520 for storing program instructions and/or data. The memory 520 is coupled to the processor 510. The coupling in the embodiments of the present application is an indirect coupling or communication connection between devices, units, or modules, which may be electrical, mechanical, or in other forms, and is used for information exchange between the devices, units, or modules. The processor 510 may operate in cooperation with the memory 520 and may execute the program instructions stored in the memory 520. At least one of the at least one memory may be integrated in the processor.
The apparatus 500 may also include a communication interface 530 for communicating with other devices over a transmission medium, so that the apparatus 500 can communicate with other devices. The communication interface 530 may be, for example, a transceiver, an interface, a bus, a circuit, or any device capable of implementing transmit and receive functions. The processor 510 may use the communication interface 530 to transmit and receive data and/or information, and may thereby implement the smart home control method described in the embodiment corresponding to Fig. 2.
The specific connection medium between the processor 510, the memory 520, and the communication interface 530 is not limited in the embodiments of the present application. In Fig. 5, the processor 510, the memory 520, and the communication interface 530 are connected via a bus 540, which is shown by a bold line; this is merely illustrative and not limiting. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in Fig. 5, but this does not mean that there is only one bus or one type of bus.
The present application also provides a computer program product comprising: a computer program (which may also be referred to as code, or instructions) which, when executed, causes an electronic device to perform the method of the embodiment shown in fig. 2.
The present application also provides a computer-readable storage medium storing a computer program (which may also be referred to as code, or instructions). The computer program, when executed, causes the electronic device to perform the method in the embodiment shown in fig. 2.
It should be appreciated that the processor in the embodiments of the present application may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in the form of software. The processor may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, and may implement or perform the methods, steps, and logical blocks disclosed in the embodiments of the present application. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware.
It should also be appreciated that the memory in the embodiments of the present application may be a volatile memory or a nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable programmable ROM (EPROM), an electrically erasable programmable ROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which serves as an external cache. By way of example, and not limitation, many forms of RAM are available, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous dynamic RAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), and direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
The terms "unit," "module," and the like as used in this specification may be used to refer to a computer-related entity, either hardware, firmware, a combination of hardware and software, or software in execution.
Those of ordinary skill in the art will appreciate that the various illustrative logical blocks and steps described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints of the technical solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.

In the several embodiments provided by the present application, it should be understood that the disclosed apparatus, device, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division of the units is merely a logical functional division, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
In the above-described embodiments, the functions of the respective functional units may be implemented in whole or in part by software, hardware, firmware, or any combination thereof. When implemented in software, they may be implemented in whole or in part in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or other programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid-state drive (SSD)), etc.
The functions, if implemented in the form of software functional units and sold or used as an independent product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (9)

1. A smart home control method, characterized by being applied to a control end and comprising the following steps:
Receiving a call request from a terminal, wherein the call request is a request sent by a user of the terminal by dialing a preset number, and the call request is used for requesting remote control of an intelligent home;
responding to the call request, and calling a pre-configured virtual digital person to perform video call with a user of the terminal;
Based on the video call, acquiring a control instruction of the user to the intelligent home;
Before the control instruction of the user to the smart home is acquired based on the video call, the method further comprises: prompting the user to input an identity through the virtual digital person;
determining whether the user is a registered user based on the identity;
under the condition that the user is a registered user, carrying out identity authentication on the user based on the voiceprint and the face of the user; or
Prompting the user to perform voiceprint registration and face registration under the condition that the user is an unregistered user;
the identity is the mobile phone number of the user.
2. The method of claim 1, wherein said authenticating the user comprises:
Based on the video call, collecting voiceprints and faces of the user;
and carrying out identity verification on the user according to the recognition of the voiceprint, the recognition of the human face and the prestored voiceprint and human face of the user.
3. The method of claim 1, wherein the method further comprises:
inquiring a home address of the user and a product model of the intelligent home through the virtual digital person;
Based on the video call, collecting voiceprints and faces of the user;
and storing the home address of the user, the product model of the intelligent home, the voiceprint of the user and the face of the user into a local database.
4. The method of claim 1, wherein the method further comprises:
Executing the control instruction;
and after the control instruction is finished, feeding back the finishing condition of the control instruction to the user through the virtual digital person.
5. The method of any of claims 1 to 4, wherein the call request and the video call are carried on resources of a long term evolution, LTE, communication system or resources of a fifth generation, 5G, communication system.
6. A smart home control device, characterized by comprising means for performing the method according to any one of claims 1 to 5.
7. A smart home control device comprising a processor for performing the method of any one of claims 1 to 5.
8. A computer readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1 to 5.
9. A computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 5.
CN202110204624.5A 2021-02-24 2021-02-24 Smart home control method and device Active CN114979543B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110204624.5A CN114979543B (en) 2021-02-24 2021-02-24 Smart home control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110204624.5A CN114979543B (en) 2021-02-24 2021-02-24 Smart home control method and device

Publications (2)

Publication Number Publication Date
CN114979543A CN114979543A (en) 2022-08-30
CN114979543B true CN114979543B (en) 2024-07-02

Family

ID=82972710

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110204624.5A Active CN114979543B (en) 2021-02-24 2021-02-24 Smart home control method and device

Country Status (1)

Country Link
CN (1) CN114979543B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105978775A (en) * 2016-07-29 2016-09-28 镇江惠通电子有限公司 Speech control system and speech control method
CN109412910A (en) * 2018-11-20 2019-03-01 三星电子(中国)研发中心 The method and apparatus for controlling smart home device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106097495A (en) * 2016-06-03 2016-11-09 赵树龙 A kind of intelligent voice control vocal print face authentication door access control system and method
CN106792238A (en) * 2016-11-17 2017-05-31 Tcl集团股份有限公司 A kind of interactive approach and interaction systems based on live telecast shopping platform
CN106790054A (en) * 2016-12-20 2017-05-31 四川长虹电器股份有限公司 Interactive authentication system and method based on recognition of face and Application on Voiceprint Recognition
CN106782559A (en) * 2016-12-31 2017-05-31 广东博意建筑设计院有限公司 Smart home house keeper central control system and its control method with telecommunication control
CN107092196A (en) * 2017-06-26 2017-08-25 广东美的制冷设备有限公司 The control method and relevant device of intelligent home device
CN110248018A (en) * 2018-03-09 2019-09-17 深圳市云昱科技有限公司 The call method and Related product of intelligent secretary
CN110647636B (en) * 2019-09-05 2021-03-19 深圳追一科技有限公司 Interaction method, interaction device, terminal equipment and storage medium
CN112291331A (en) * 2020-10-25 2021-01-29 湖南云脸智联科技有限公司 Cloud access control implementation method based on AI intelligent sound box application
CN112291497B (en) * 2020-10-28 2023-04-07 上海赛连信息科技有限公司 Intelligent video customer service access method and device

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105978775A (en) * 2016-07-29 2016-09-28 镇江惠通电子有限公司 Speech control system and speech control method
CN109412910A (en) * 2018-11-20 2019-03-01 三星电子(中国)研发中心 The method and apparatus for controlling smart home device

Also Published As

Publication number Publication date
CN114979543A (en) 2022-08-30

Similar Documents

Publication Publication Date Title
WO2019051214A1 (en) Administration of privileges by speech for voice assistant system
US11557301B2 (en) Hotword-based speaker recognition
US12010108B2 (en) Techniques to provide sensitive information over a voice connection
JP6738867B2 (en) Speaker authentication method and voice recognition system
US20080195395A1 (en) System and method for telephonic voice and speech authentication
US11763817B2 (en) Methods, systems, and media for connecting an IoT device to a call
CN108597526A (en) A kind of permission confirmation method, device, storage medium and intelligent sound box
CN110473555B (en) Interaction method and device based on distributed voice equipment
US11102021B2 (en) Responsive communication system
US20150056952A1 (en) Method and apparatus for determining intent of an end-user in a communication session
US20230017401A1 (en) Speech Activity Detection Using Dual Sensory Based Learning
CN110858841B (en) Electronic device and method for registering new user through authentication of registered user
CN111343022A (en) Method and system for realizing network configuration processing of intelligent equipment by directly interacting with user
CN114979543B (en) Smart home control method and device
CN107172620A (en) A kind of wireless local area network (WLAN) verification method and apparatus
EP1164576B1 (en) Speaker authentication method and system from speech models
CN114745213B (en) Conference record generation method and device, electronic equipment and storage medium
CN112417923A (en) System, method and apparatus for controlling smart devices
CN112492599B (en) Terminal control method, system, electronic device and storage medium
US20210375267A1 (en) Method and system for smart interaction in a multi voice capable device environment
CN111988426A (en) Communication method and device based on voiceprint recognition, intelligent terminal and storage medium
CN112449059A (en) Voice interaction device, method and system for realizing call based on voice interaction device
KR102495028B1 (en) Sound Device with Function of Whistle Sound Recognition
KR20190053633A (en) Stationary Sound Device for Circumstantial Judgement
Blue et al. Lux: Enabling ephemeral authorization for display-limited IoT devices

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant