CN113488054B - Voice forwarding method, server and intelligent voice equipment - Google Patents

Info

Publication number
CN113488054B
Authority
CN
China
Prior art keywords
voice
message
information
intelligent
address
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010350327.7A
Other languages
Chinese (zh)
Other versions
CN113488054A (en
Inventor
陈维强
王彦芳
高雪松
王月岭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Co Ltd
Original Assignee
Hisense Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Co Ltd filed Critical Hisense Co Ltd
Priority to CN202010350327.7A priority Critical patent/CN113488054B/en
Publication of CN113488054A publication Critical patent/CN113488054A/en
Application granted granted Critical
Publication of CN113488054B publication Critical patent/CN113488054B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00 Speech synthesis; Text to speech systems
    • G10L13/02 Methods for producing synthetic speech; Speech synthesisers
    • G10L13/04 Details of speech synthesis systems, e.g. synthesiser structure or memory management
    • G10L13/047 Architecture of speech synthesisers
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice forwarding method, a server and an intelligent voice device. The method comprises: receiving a voice notification message collected by a first intelligent voice device and determining the address of the first intelligent voice device; determining the identity information of the notifier, first notification information and the identity information of the notified person according to the voice notification message; determining first forwarding content; locating the notified person according to the identity information of the notified person and, once the position of the notified person is located, determining the address of a second intelligent voice device; and determining a first voice message and uploading it to a message bus for forwarding. Because the notified person is located and the address of the second intelligent voice device is obtained from that position, the voice message can be forwarded through the message bus. The second intelligent voice device detects the voice message on the bus and plays it to the notified person, so that the notified person hears the message quickly and message forwarding efficiency is improved.

Description

Voice forwarding method, server and intelligent voice equipment
Technical Field
The invention relates to the technical field of smart home, in particular to a voice forwarding method, a server and intelligent voice equipment.
Background
The smart speaker has become an indispensable small household appliance in modern smart living and brings convenience to people's lives. However, most current smart speaker applications focus on functions that a single speaker can complete on its own, such as playing songs, playing news or controlling the smart home, or on communication between mobile phones and speakers, which is realized once the smart speaker is authorized to access the phone's address book. Collaborative interaction between the speakers within a household has received little study.
Disclosure of Invention
The embodiment of the invention provides a voice forwarding method, a server and intelligent voice equipment, which are used for realizing quick message transmission among family members and improving message forwarding efficiency.
In a first aspect, an embodiment of the present invention provides a voice forwarding method, including:
receiving a voice notification message collected by a first intelligent voice device, the first intelligent voice device being the intelligent voice device woken up by the notifier;
determining the address of the first intelligent voice device, and determining the identity information of the notifier, the first notification information and the identity information of the notified person according to the voice notification message;
determining first forwarding content according to the identity information of the notifier and the first notification information; locating the notified person according to the identity information of the notified person and, once the position of the notified person is located, determining the address of a second intelligent voice device according to that position;
and determining a first voice message according to the address of the first intelligent voice device, the first forwarding content and the address of the second intelligent voice device, and uploading the first voice message to a message bus for forwarding the first voice message.
In this technical scheme, the position of the notified person is quickly determined by locating the person according to the identity information of the notified person, and the address of the second intelligent voice device is then obtained from that position. The first voice message is forwarded through the message bus; the second intelligent voice device detects the first voice message on the bus and plays it to the notified person, so that the notified person hears the message quickly and message forwarding efficiency is improved.
In some embodiments, the determining the identity information of the notifier, the first notification information, and the identity information of the notified person according to the voice notification message includes:
performing voiceprint recognition on the voice notification message to determine the identity information of the notifier;
and performing semantic analysis on the voice notification message to extract the identity information of the notified person and the first notification information from the voice notification message.
According to the technical scheme, the identity information of the notifier, the notification information and the identity information of the notified person can be obtained quickly by identifying the voice notification message, so that the processing efficiency of the voice notification message is improved.
In some embodiments, the determining the address of the second intelligent voice device according to the location of the notified person includes:
determining the address of the intelligent voice equipment nearest to the notified person according to the position of the notified person;
and determining the address of the intelligent voice device nearest to the notified person as the address of the second intelligent voice device.
According to the technical scheme, the address of the intelligent voice device closest to the position of the notified person can be rapidly determined as the address of the second intelligent voice device according to the position of the notified person, so that the destination address of message forwarding can be obtained.
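The nearest-device selection described above can be sketched as follows. This is only an illustrative sketch: the patent does not specify how positions are represented, so the 2D coordinates, the example IP addresses and the function name `nearest_device_address` are all assumptions.

```python
import math


def nearest_device_address(person_position, device_positions):
    """Return the address of the device closest to person_position.

    device_positions maps a device address to an (x, y) coordinate
    (an assumed representation; the patent only speaks of "position").
    Returns None when no device is known.
    """
    if not device_positions:
        return None
    return min(
        device_positions,
        key=lambda address: math.dist(person_position, device_positions[address]),
    )


# Hypothetical layout: two devices in different rooms.
devices = {
    "192.168.1.11": (0.0, 0.0),  # e.g. a speaker in room 1
    "192.168.1.12": (5.0, 0.0),  # e.g. a speaker in room 2
}
second_device_address = nearest_device_address((4.0, 1.0), devices)
```

With an RFID-based positioning system, the "position" may simply be a room identifier, in which case a lookup by room would replace the distance computation.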
In some embodiments, after the uploading to the message bus for forwarding the first voice message, the method further comprises:
acquiring voice response information sent by the second intelligent voice device within a preset time;
identifying the voice response information to obtain second notification information;
determining second forwarding content according to the identity information of the notified person and the second notification information;
and determining a second voice message according to the address of the first intelligent voice device, the second forwarding content and the address of the second intelligent voice device, and uploading the second voice message to a message bus for forwarding the second voice message.
In this technical scheme, the voice response information sent by the second intelligent voice device within the preset time is identified to generate the second voice message, which is then forwarded to the first intelligent voice device where the notifier is located, so that a reply can be returned quickly and the efficiency of information interaction is improved.
In certain embodiments, the method further comprises:
if the position of the notified person cannot be located,
determining the first forwarding content as message information (a left message), and storing the identity information of the notifier, the identity information of the notified person, the message information, the current time and a message-read flag bit in a database of a memory.
In this technical scheme, when the position of the notified person cannot be located, the message information is stored together with the address of the intelligent voice device on which the message was left, so that a point-to-point left message is realized and the information is not lost.
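A minimal sketch of the storage step, assuming an SQLite database; the patent only says "a database of a memory", so the table name, column names and helper functions below are hypothetical. The stored fields follow the list above: notifier identity, notified-person identity, message information, current time and a read flag.

```python
import sqlite3
import time


def open_message_store(path=":memory:"):
    """Open (or create) the left-message table. The schema is an assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               notifier TEXT NOT NULL,
               notified TEXT NOT NULL,
               content TEXT NOT NULL,
               created_at REAL NOT NULL,
               is_read INTEGER NOT NULL DEFAULT 0
           )"""
    )
    return conn


def leave_message(conn, notifier, notified, content, now=None):
    """Store one unread message and return its row id."""
    created_at = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        cursor = conn.execute(
            "INSERT INTO messages (notifier, notified, content, created_at, is_read) "
            "VALUES (?, ?, ?, ?, 0)",
            (notifier, notified, content, created_at),
        )
    return cursor.lastrowid


store = open_message_store()
message_id = leave_message(
    store, "Mom", "Xiaoming", "Mom tells you: go to bed early.", now=1000.0
)
```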
In certain embodiments, the method further comprises:
when the distance between the position of the notified person and the intelligent voice device on which the message was left is smaller than a preset distance, retrieving, from the database of the memory, the message information for the notified person within a set time range and the identity information of the notified person, and waking up that intelligent voice device;
performing voice synthesis according to the message information of the notified person and the identity information of the notified person to obtain message playing information;
and sending the message playing information to the intelligent voice device on which the message was left for voice playback, and setting the message-read flag bit to read.
In this technical scheme, when the notified person approaches the intelligent voice device on which the message was left, that device is quickly woken up and plays the left message, so that the message is played automatically as the notified person comes near.
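The proximity-triggered playback can be sketched with an in-memory list standing in for the database. The dictionary keys, the wording of the playback text and the function name `collect_due_messages` are assumptions made for illustration; speech synthesis itself is left out.

```python
def collect_due_messages(messages, notified, distance, max_distance, now, window_seconds):
    """Return playback texts for unread messages left for `notified`.

    Messages are returned only when the person is closer than max_distance
    and the message is no older than window_seconds. Returned messages are
    marked as read in place.
    """
    if distance >= max_distance:
        return []
    play_texts = []
    for message in messages:
        in_window = 0 <= now - message["time"] <= window_seconds
        if message["notified"] == notified and not message["read"] and in_window:
            play_texts.append(f'{notified}, you have a message: {message["content"]}')
            message["read"] = True
    return play_texts


inbox = [
    {"notified": "Xiaoming", "content": "Mom tells you: go to bed early.",
     "time": 100.0, "read": False},
    {"notified": "Xiaoming", "content": "An older message.",
     "time": 1.0, "read": False},
]
# Too far away: nothing is played and nothing is marked as read.
far = collect_due_messages(inbox, "Xiaoming", distance=6.0, max_distance=2.0,
                           now=160.0, window_seconds=120.0)
# Close enough: only the message inside the time window is played.
near = collect_due_messages(inbox, "Xiaoming", distance=1.0, max_distance=2.0,
                            now=160.0, window_seconds=120.0)
```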
In certain embodiments, the method further comprises:
acquiring a message query request of a user sent by a third intelligent voice device;
identifying identity information of the user;
inquiring a message record of the user from a database according to the identity information of the user;
according to the identity information and the message information of the notifier in the message record, performing voice synthesis to obtain message playing voice;
and sending the message playing voice to the third intelligent voice equipment for voice playing.
In a second aspect, an embodiment of the present invention provides a server configured to perform the above-described voice forwarding method.
In a third aspect, an embodiment of the present invention provides an intelligent voice device, including:
the microphone array is used for collecting voice information of a user;
a speaker for playing voice information;
an RFID (Radio Frequency Identification) tag for determining an address of the intelligent voice device;
a processor configured to:
when the destination address in the first voice message uploaded by the server is confirmed to be the address of the intelligent voice device, acquiring first forwarding content in the first voice message;
and performing voice synthesis on the first forwarding content to obtain first forwarding voice playing information, and playing the first forwarding voice playing information through the loudspeaker.
In certain embodiments, the processor is further configured to:
collecting voice response information of the notified person within preset time;
and sending the voice response information to a server.
In a fourth aspect, an embodiment of the present invention provides an intelligent voice device, including:
the microphone array is used for collecting voice information of a user;
a speaker for playing voice information;
an RFID tag for determining an address of the intelligent voice device;
a processor configured to:
when the destination address in the second voice message uploaded by the server is confirmed to be the address of the intelligent voice device, acquiring second forwarding content in the second voice message;
and performing voice synthesis on the second forwarding content to obtain second forwarding voice playing information, and playing the second forwarding voice playing information through the loudspeaker.
In a fifth aspect, an embodiment of the present invention provides an intelligent voice device, including:
the microphone array is used for collecting voice information of a user;
a speaker for playing voice information;
an RFID tag for determining an address of the intelligent voice device;
a processor configured to:
obtaining message playing information;
and playing the message playing information through the loudspeaker.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of a system architecture according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a server according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of an intelligent voice device according to an embodiment of the present invention;
fig. 4 is a flow chart of a method for forwarding voice and leaving a message according to an embodiment of the present invention;
fig. 5 is a schematic flow chart of uploading a voice message according to an embodiment of the present invention;
fig. 6 is a schematic flow chart of message monitoring according to an embodiment of the present invention;
fig. 7 is a schematic flow chart of a voice message according to an embodiment of the present invention;
fig. 8 is a schematic flow chart of a message playing process according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail below with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first", "second" and the like are used below for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include one or more such features, and in the description of the embodiments of the present application, unless otherwise indicated, "a plurality" means two or more.
Fig. 1 illustrates the structure of a voice forwarding and message-leaving system to which embodiments of the present invention are applicable. The system may include a server 100, a plurality of intelligent voice devices 200, and a plurality of positioning devices 300.
As shown in fig. 1, the server 100 may be connected to a plurality of intelligent voice devices 200 through a network, and to a plurality of positioning devices 300 through a network. The server 100 may also be communicatively connected to the user's mobile terminal 400 via a mobile communication network. In some application scenarios, multiple intelligent speech devices 200 and multiple positioning devices 300 may be connected to the server 100 through a gateway.
A plurality of positioning devices 300 may be provided at a plurality of preset locations (individual rooms, walkways, doorways) within a preset space (a home or an office). The positioning devices 300 may be used to determine the address of each intelligent voice device 200 and the location of each user (family member) within the preset space. In some examples, a positioning device 300 may consist of an RFID reader and an RFID antenna, with RFID tags provided on the intelligent voice devices 200 and on the users; the positioning device 300 locates the intelligent voice devices 200 and the users by recognizing the RFID tags and transmits the positioning information to the server 100 through the network. In some examples, positioning within the preset space may be implemented with existing indoor positioning technology, which is not described in detail here.
The server 100 is connected to the intelligent voice devices 200 through a network (e.g., a local area network or the Internet of Things); the intelligent voice devices may include smart speakers, voice control panels, home smart sensors and the like.
The intelligent voice devices 200 may be of one type or of several (two or more) types, and different intelligent voice devices 200 may be placed in different areas. Fig. 1 shows only one example, in which intelligent voice device a in room 1 is a smart speaker, intelligent voice device b in room 2 is a smart speaker, intelligent voice device c in the kitchen is the voice assistant of a refrigerator, intelligent voice device d in room 3 is an intelligent voice panel, and intelligent voice device e in the vehicle is an on-board voice center. The location of an intelligent voice device 200 may be fixed (e.g., the voice assistant of a refrigerator in the kitchen) or may change (e.g., an on-board voice center moves with the vehicle). If the location of an intelligent voice device 200 changes, the positioning device 300 identifies the device's new address and sends it to the server 100, which keeps the location information of the intelligent voice device 200 up to date.
The intelligent voice device 200 can be controlled by the server 100 to play voice. In some examples, the smart voice device 200 may play the forwarded voice play information and the message play information.
Further, the intelligent speech device 200 is also capable of enabling intelligent speech interactions with a user. The intelligent voice device 200 in the embodiment of the present application may adopt a distributed architecture, that is, a plurality of intelligent voice devices 200 may be connected with the server 100, and send a voice request input by a user to the server 100 for voice processing and responding to the request of the user. Some intelligent speech devices 200 also have a display screen that can graphically display information.
The server 100 processes and forwards voice information. After receiving voice information sent by an intelligent voice device 200, the server 100 can perform voiceprint recognition on the voiceprint of the voice information and semantic recognition on its content. Voiceprint recognition mainly identifies the identity information of the user; semantic recognition generally converts the voice information into text, performs semantic analysis on the text, and then extracts the corresponding text information according to a preset format.
The server 100 may be an independently deployed server, a distributed server, or a server cluster.
Based on the above architecture, in a practical application scenario, the intelligent voice devices 200 at different positions in a household (residence) can be connected to the server 100. When one of the intelligent voice devices 200 is woken up, the server 100 receives the voice information sent by that device and identifies it to obtain the forwarding content. The server 100 then determines, from the acquired position information of the household member, the address of the intelligent voice device 200 closest to that member, and forwards the message quickly based on the source address, the forwarding content and the destination address. Once the intelligent voice device 200 closest to the member detects the message, it plays the message promptly so that the member hears it in time.
Taking the system architecture shown in fig. 1 as an example, in the embodiment of the present invention, the system may be first built, which specifically includes the following configuration operations:
(1) The intelligent voice devices 200 are registered.
One or more intelligent voice devices 200 within a geographic range are connected to the gateway and registered in the server 100 to form a list of intelligent voice devices associated with the home. A geographic range may be a residence, a residential complex, a production facility, a corporate office, etc.
The list of intelligent voice devices may include information about each intelligent voice device 200, for example: the ID of the intelligent voice device 200, the address (e.g., IP address, MAC address) of the intelligent voice device 200, the location area where the intelligent voice device 200 is located, the type of the intelligent voice device 200 (e.g., smart speaker, on-board voice center), etc.
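The device list can be pictured as a small registry keyed by device ID. The sketch below is one possible shape under stated assumptions: the class names, field names, example addresses and area names are hypothetical, and only the four kinds of information listed above (ID, address, location area, type) come from the description.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class VoiceDevice:
    device_id: str
    address: str      # e.g. an IP or MAC address
    area: str         # location area, e.g. "room 1"
    device_type: str  # e.g. "smart speaker"


class DeviceRegistry:
    """In-memory stand-in for the server-side list of intelligent voice devices."""

    def __init__(self):
        self._devices = {}

    def register(self, device):
        self._devices[device.device_id] = device

    def update_location(self, device_id, address, area):
        # Called when a positioning device reports that a device has moved.
        self._devices[device_id] = replace(
            self._devices[device_id], address=address, area=area
        )

    def in_area(self, area):
        return [d for d in self._devices.values() if d.area == area]


registry = DeviceRegistry()
registry.register(VoiceDevice("a", "192.168.1.11", "room 1", "smart speaker"))
registry.register(VoiceDevice("e", "192.168.1.15", "garage", "on-board voice center"))
registry.update_location("e", "192.168.1.15", "driveway")
```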
(2) The positioning devices 300 are registered.
The positioning devices 300 within the above geographic range are registered in the server 100. Of course, positioning devices 300 within other geographic ranges may also be registered in the server 100, to form a list of positioning devices associated with the home.
Fig. 2 schematically illustrates a structure of a server 100 according to an embodiment of the present invention.
As shown in fig. 2, the server 100 may include a communication module 101, a memory 102, and a processor 103. The server 100 may further include various management modules (not shown in the figures), which may include one or any combination of the following:
a device access management module, used for device access management, such as registering and associating the positioning devices and the intelligent voice devices;
a device position management module, used for updating, maintaining and managing the addresses and positions of the intelligent voice devices;
and a user position management module, used for updating, maintaining and managing the user positions.
The communication module 101 is configured to form a network with a plurality of intelligent voice devices and to receive a voice notification message collected by a first intelligent voice device. The first intelligent voice device is the intelligent voice device woken up by the notifier.
And a memory 102 for storing message information.
A processor 103, connected to the communication module 101 and the memory 102, configured to:
determine the address of the first intelligent voice device, and determine the identity information of the notifier, the first notification information and the identity information of the notified person according to the voice notification message; determine the first forwarding content according to the identity information of the notifier and the first notification information; locate the notified person according to the identity information of the notified person and, once the position of the notified person is located, determine the address of the second intelligent voice device according to that position; and determine a first voice message according to the address of the first intelligent voice device, the first forwarding content and the address of the second intelligent voice device, and upload the first voice message to a message bus for forwarding.
Specifically, the server 100 can implement the voice forwarding function when the notified person can be located according to the identity information of the notified person.
First, when the notifier wakes up the first intelligent voice device with a wake-up word, the server 100 can quickly obtain the address of the first intelligent voice device. The first intelligent voice device collects the notifier's voice notification message and transmits it to the server 100. After the communication module 101 in the server 100 receives the voice notification message from the first intelligent voice device, the processor 103 may be configured to: perform voiceprint recognition on the voice notification message to determine the identity information of the notifier, and perform semantic analysis on the voice notification message to extract the identity information of the notified person and the first notification information. For example, the voiceprint recognition service identifies the notifier as "Mom", and the voice notification spoken by the notifier to the awakened intelligent voice device is "Tell Xiaoming not to stay up late and to go to bed early". The speech recognition service converts the voice into text, and semantic analysis and text classification then extract the identity information of the notified person, "Xiaoming", and the notification information, "don't stay up late, go to bed early". After the identity information of the notifier and the first notification information are obtained, the first forwarding content may be generated in a preset format, for example: "Mom tells you: don't stay up late, go to bed early". The voiceprint recognition and semantic analysis technologies may be general-purpose technologies and are not described here.
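As a small illustration of the preset-format step, the sketch below builds the forwarding content from the results of voiceprint recognition and semantic analysis, using the example of a mother telling Xiaoming to go to bed early. The recognition services themselves are not modelled; the `ParsedNotification` type and the exact template string are assumptions.

```python
from dataclasses import dataclass


@dataclass
class ParsedNotification:
    notifier: str      # from voiceprint recognition
    notified: str      # from semantic analysis
    notification: str  # from semantic analysis


def build_forwarding_content(parsed):
    """Fill an assumed preset template of the form '<notifier> tells you: <text>'."""
    return f"{parsed.notifier} tells you: {parsed.notification}"


parsed = ParsedNotification(
    notifier="Mom",
    notified="Xiaoming",
    notification="don't stay up late, go to bed early",
)
first_forwarding_content = build_forwarding_content(parsed)
```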
Once the notified person has been located according to the identity information of the notified person, the processor 103 may be configured to: determine the address of the intelligent voice device nearest to the notified person according to the position of the notified person, and take that address as the address of the second intelligent voice device. The address of the second intelligent voice device is the destination address of the forwarded message, for example the IP address of the second intelligent voice device.
After the address (IP address) of the first intelligent voice device is obtained, it is used as the source address of the forwarded message, and the source address, the first forwarding content and the destination address are encapsulated to obtain the first voice message. The message may be encapsulated in a general-purpose format, for example as a JSON string. Finally, the first voice message is uploaded to the message bus for forwarding, so that when the second intelligent voice device detects that the destination address is its own address, it synthesizes the first voice message into speech and plays it. The message is thereby forwarded quickly and forwarding efficiency is improved.
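The encapsulation step can be sketched as follows. The text only states that the source address, forwarding content and destination address are packaged, for example as a JSON string; the key names `source`, `destination` and `content` and the example addresses are assumptions.

```python
import json


def build_voice_message(source_address, forwarding_content, destination_address):
    """Package source address, forwarding content and destination as a JSON string."""
    return json.dumps(
        {
            "source": source_address,
            "destination": destination_address,
            "content": forwarding_content,
        },
        ensure_ascii=False,  # keep non-ASCII names and text readable
    )


first_voice_message = build_voice_message(
    "192.168.1.11",
    "Mom tells you: don't stay up late, go to bed early",
    "192.168.1.12",
)
```

The resulting string would then be published to whatever message bus the deployment uses; the bus itself is outside the scope of this sketch.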
Fig. 3 exemplarily illustrates a structure of a smart voice device 200, and as shown in fig. 3, the smart voice device 200 may include: microphone array 201, speaker 202, RFID tag 203, communication module 204, and processor 205.
Wherein, the microphone array 201 is used for collecting the voice information of the user;
a speaker 202 for playing voice information;
an RFID tag 203 for determining a location of the smart voice device 200;
a communication module 204 for transceiving messages;
a processor 205 is connected to the microphone array 201, the speaker 202 and the communication module 204.
When the intelligent speech device 200 shown in fig. 3 is the second intelligent speech device 200 described above, the processor 205 of the second intelligent speech device 200 is configured to:
when the destination address in the first voice message uploaded by the server is confirmed to be the address of the second intelligent voice device, acquire the first forwarding content in the first voice message; then perform voice synthesis on the first forwarding content to obtain first forwarding voice playing information, and play it through the loudspeaker.
That is, when the destination address of a message on the message bus is confirmed to match the address of the intelligent voice device, the device pulls the message to obtain the forwarding content, the destination address and the source address. It then calls a local voice synthesis service to synthesize the text of the forwarding content into speech, i.e., the forwarding voice playing information, for playback.
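The device-side filtering can be sketched as a handler that every device runs for each bus message. The JSON key names, the example addresses and the injected `synthesize` and `play` callables are assumptions; a real device would plug in its own speech synthesis service and loudspeaker driver.

```python
import json


def handle_bus_message(raw_message, own_address, synthesize, play):
    """Play the forwarding content if the message is addressed to this device.

    Returns True when the message was handled, False when it was ignored.
    """
    message = json.loads(raw_message)
    if message.get("destination") != own_address:
        return False  # addressed to another device
    play(synthesize(message["content"]))
    return True


def fake_synthesize(text):
    # Stand-in for a local text-to-speech service.
    return f"<audio:{text}>"


played = []  # stand-in for the loudspeaker
raw = json.dumps({
    "source": "192.168.1.11",
    "destination": "192.168.1.12",
    "content": "Mom tells you: go to bed early.",
})
handled_by_b = handle_bus_message(raw, "192.168.1.12", fake_synthesize, played.append)
handled_by_c = handle_bus_message(raw, "192.168.1.13", fake_synthesize, played.append)
```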
After playback is completed, the second intelligent voice device may wait to pick up sound. When the notified person answers within the preset waiting time, the processor 205 is configured to: collect the voice response information of the notified person within the preset time, and send the voice response information to the server.
The preset time may be set empirically, for example 10 s or 20 s.
In some embodiments, after the second intelligent voice device plays the first voice message, voice response information of the notified person may also be collected within a preset time and sent to the server, where the processor 103 of the server is configured to:
obtain the voice response information sent by the second intelligent voice device within the preset time, and recognize the voice response information to obtain second notification information; determine second forwarding content according to the identity information of the notified person and the second notification information; determine a second voice message according to the address of the first intelligent voice device, the second forwarding content and the address of the second intelligent voice device; and upload the second voice message to the message bus for forwarding.
For example, if the voice response is "I know, I'll go to sleep right away", the voice response is recognized as text to give the second forwarding content. The address of the second intelligent voice device is used as the source address and the address of the first intelligent voice device is used as the destination address to generate the second voice message, which is uploaded to the message bus for forwarding.
After the first intelligent voice device receives the second voice message, based on the structure shown in fig. 3, the processor 205 of the first intelligent voice device is configured to: and when the destination address in the second voice message uploaded by the server is confirmed to be the address of the first intelligent voice device, acquiring second forwarding content in the second voice message. And performing voice synthesis on the second forwarding content to obtain second forwarding voice playing information, and playing the second forwarding voice playing information through the loudspeaker 202.
Alternatively, when indoor positioning cannot obtain the location of the notified person (for example, Xiaoming's RFID tag cannot be found within the controllable range), the server may provide a voice message-leaving function. Based on the structure shown in fig. 2, the processor 103 of the server is configured to: if the position of the notified person cannot be located, determine the first forwarding content as message information, and store the identity information of the notifier, the identity information of the notified person, the message information, the current time and the message read flag bit in the database of the memory 102.
At this time, the identity information of the notifier, the identity information of the notified person, the message information, the address of the intelligent voice device for the message, the current time and the read flag bit of the message can be stored in the message table of the relational database, and the read flag bit of the message is used for identifying whether the message information is played or not.
In some embodiments, the processor 103 is further configured to:
when the distance between the position of the notified person and the message-leaving intelligent voice device is smaller than the preset distance, determine, from the database of the memory 102, the message information of the notified person within a set time range and the identity information of the notified person. Wake up the message-leaving intelligent voice device, and perform voice synthesis according to the message information of the notified person and the identity information of the notified person to obtain message playing information. Finally, send the message playing information to the message-leaving intelligent voice device for voice playing, and set the message read flag bit to read. The preset distance and the set time range may be set empirically. For example, the preset distance may be set to 1 m or 2 m; the set time range may be any value chosen by the user, such as within 1 h or within 2 h, and may also be set to none, which indicates that all unread message information of the user is played. The message-leaving intelligent voice device may also be set empirically, and its address can be obtained once it has been set. For example, it may be set to the intelligent voice device at the doorway or the intelligent voice device in the notified person's bedroom. Setting the message read flag bit to read ensures that each message is automatically played only once, although the user can still listen to it multiple times by querying. For example, a user may query all messages about himself within a certain time range through the intelligent voice device.
When the smart speech device 200 shown in fig. 3 is a message-leaving smart speech device, the processor 205 of the message-leaving smart speech device may be configured to: obtaining message playing information sent by a server; and playing the message playing information through a loudspeaker.
For example, the first forwarding content mentioned above, "Mom tells you: don't stay up late, go to sleep early", is the message information. It is synthesized into voice, and the message-leaving intelligent voice device plays the voice through the loudspeaker.
Optionally, the user may also query all the message records about himself at any intelligent voice device, at which time the processor 103 of the server may be further configured to: and acquiring a message inquiry request of the user sent by the third intelligent voice equipment, and then identifying the identity information of the user. Inquiring the message record of the user from the database according to the identity information of the user, and carrying out voice synthesis according to the identity information and the message information of the notifier in the message record to obtain the message playing voice. And sending the message playing voice to third intelligent voice equipment for voice playing. The specific flow is similar to the process that the intelligent voice equipment actively plays the message to the user, and is not described in detail.
The following describes the flow of voice forwarding and message leaving in the embodiment of the present invention in detail with reference to the accompanying drawings.
Fig. 4 is a schematic diagram schematically illustrating a voice forwarding and message leaving process according to an embodiment of the present invention, where, as shown in fig. 4, the process may include the following steps:
S401, receiving voice notification information acquired by the first intelligent voice equipment.
The first intelligent voice device is the intelligent voice device woken up by the notifier. The notifier may wake up the first intelligent voice device by speaking a wake word, for example "XXX speaker".
S402, determining the address of the first intelligent voice device, and determining the identity information of the notifier, the first notification information and the identity information of the notified person according to the voice notification message.
When the notifier wakes up the first intelligent voice device, the server can determine the address of the woken-up first intelligent voice device, and this address is the source address. When the first intelligent voice device collects the voice notification information of the notifier and the server receives it, the server can perform voiceprint recognition on the voice notification information to determine the identity information of the notifier, and perform semantic analysis on the voice notification information to extract the identity information of the notified person and the first notification information.
For example, the identity information of the notifier recognized by invoking the voiceprint recognition service is "Mom". The voice notification sent by the notifier to the woken-up intelligent voice device is "Tell Xiaoming: don't stay up late, go to sleep early". By calling the voice recognition service to convert the voice into text and then performing semantic analysis and text classification, the following can be extracted: "identity information of the notified person: Xiaoming; notification information: 'don't stay up late, go to sleep early'".
S403, determining a first forwarding content according to the identity information of the notifier and the first notification information; and positioning the personnel according to the identity information of the notified person.
After the identity information of the notifier and the first notification information are obtained, the first forwarding content may be generated. The first forwarding content may be generated according to a preset format or a preset template; for example, the forwarding content may be "Mom tells you: don't stay up late, go to sleep early". Meanwhile, person positioning can be performed according to the identity information of the notified person through an indoor positioning technology.
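As a rough illustration of generating forwarding content from a preset template, the following Python sketch fills a fixed pattern with the notifier identity and the notification text. The template string and function name are assumptions for illustration; the description does not prescribe a particular template.

```python
# Hypothetical preset template; the description only says "a preset format or template".
FORWARD_TEMPLATE = "{notifier} tells you: {notification}"


def make_forwarding_content(notifier: str, notification: str) -> str:
    """Combine the notifier's identity with the extracted notification text."""
    return FORWARD_TEMPLATE.format(
        notifier=notifier,
        notification=notification.strip(),  # drop stray whitespace from recognition
    )


content = make_forwarding_content("Mom", "don't stay up late, go to sleep early")
```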
S404, judging whether the position of the notified person is located, if so, turning to S405, otherwise turning to S406.
S405, determining the address of the second intelligent voice equipment according to the position of the notified person.
When the location of the notified person can be located, indicating that the notified person is within a controllable range, the server may determine the address of the smart voice device nearest to the notified person according to the location of the notified person, and then determine the address of the smart voice device nearest to the notified person as the address of the second smart voice device. The address of the second intelligent voice device is the destination address of message forwarding, such as the IP address of the second intelligent voice device.
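One possible way to pick the nearest device is a plain Euclidean distance comparison, sketched below in Python. The 2-D coordinate representation, the device table and the function name are assumptions; the description does not specify how positions are represented or compared.

```python
import math
from typing import Dict, Optional, Tuple

Position = Tuple[float, float]  # assumed 2-D indoor coordinates in metres


def nearest_device_address(
    person_pos: Optional[Position], devices: Dict[str, Position]
) -> Optional[str]:
    """Return the address of the device closest to the person.

    Returns None when the person could not be located or no device is known,
    which corresponds to falling back to the message-leaving branch.
    """
    if person_pos is None or not devices:
        return None
    return min(devices, key=lambda addr: math.dist(person_pos, devices[addr]))


# Hypothetical device table: IP address -> installed position
devices = {"192.168.1.20": (0.0, 0.0), "192.168.1.30": (5.0, 5.0)}
destination = nearest_device_address((1.0, 1.0), devices)
```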
S406, determining the first forwarding content as message information; and storing the identity information of the notifier, the identity information of the notified person, the message information, the address of the intelligent voice device for leaving the message, the current time and the read flag bit of the message in a database.
The server can store the identity information of the notifier, the identity information of the notified person, the message information, the address of the intelligent voice device for the message, the current time and the read flag bit of the message in a message table of a relational database, and the read flag bit of the message is used for identifying whether the message information is played or not.
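The message table could look roughly like the following, shown here with Python's built-in `sqlite3` module. The table name, column names and the choice of SQLite are assumptions for illustration; the description only lists the stored fields and says a relational database is used.

```python
import sqlite3
from datetime import datetime
from typing import Optional

# Hypothetical schema covering the fields listed in the description.
SCHEMA = """
CREATE TABLE IF NOT EXISTS messages (
    id             INTEGER PRIMARY KEY AUTOINCREMENT,
    notifier       TEXT NOT NULL,               -- identity of the notifier
    notified       TEXT NOT NULL,               -- identity of the notified person
    content        TEXT NOT NULL,               -- message information
    device_address TEXT NOT NULL,               -- message-leaving device address
    created_at     TEXT NOT NULL,               -- current time, ISO 8601
    is_read        INTEGER NOT NULL DEFAULT 0   -- message read flag bit
)
"""


def store_message(conn: sqlite3.Connection, notifier: str, notified: str,
                  content: str, device_address: str,
                  now: Optional[datetime] = None) -> int:
    """Insert one unread message and return its row id."""
    now = now or datetime.now()
    cur = conn.execute(
        "INSERT INTO messages (notifier, notified, content, device_address, created_at) "
        "VALUES (?, ?, ?, ?, ?)",
        (notifier, notified, content, device_address, now.isoformat()),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
row_id = store_message(conn, "Mom", "Xiaoming",
                       "Mom tells you: don't stay up late, go to sleep early",
                       "192.168.1.40")
```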
Further, when the server determines that the distance between the position of the notified person and the intelligent voice device for leaving messages is smaller than the preset distance, the server determines the information of leaving messages of the notified person and the identity information of the notified person within the set time range from a database of a memory. After waking up the intelligent voice device, carrying out voice synthesis according to the message information of the notified person and the identity information of the notified person to obtain message play information; and sending the message playing information to the intelligent voice equipment for voice playing, and setting the read flag bit of the message as read. The way of waking up the message-leaving intelligent voice device can be by software or by a wake-up instruction. After the read flag bit of the message is set to be read, each message can be ensured to be automatically played only once, but the user can read for a plurality of times in a query mode. For example, a user may query all messages about himself over a certain time frame through the intelligent voice device.
The message-leaving intelligent voice device receives the message playing information sent by the server and plays it through the loudspeaker. For example, the message information "Mom tells you: don't stay up late, go to sleep early" is synthesized into voice, and the message-leaving intelligent voice device plays the voice through the loudspeaker.
S407, determining a first voice message according to the address of the first intelligent voice device, the first forwarding content and the address of the second intelligent voice device, and uploading the first voice message to a message bus for forwarding the first voice message.
The address of the first intelligent voice device is the source address and the address of the second intelligent voice device is the destination address; the source address, the first forwarding content and the destination address can be encapsulated to obtain the first voice message, for example as a JSON string. Finally, the first voice message is uploaded to the message bus for forwarding, so that when the second intelligent voice device detects that the destination address is its own address, it performs voice synthesis on the first voice message and plays it. This achieves fast message forwarding and improves forwarding efficiency.
And when the second intelligent voice equipment confirms that the destination address in the first voice message uploaded by the server is the address of the intelligent voice equipment, acquiring first forwarding content in the first voice message. And then, performing voice synthesis on the first forwarding content to obtain first forwarding voice playing information, and playing the first forwarding voice playing information.
That is, when the second intelligent voice device detects that the destination address of a message on the message bus matches its own address, it pulls the message to obtain the forwarding content, the destination address and the source address. It then calls a local voice synthesis service to synthesize the text of the forwarding content into voice, i.e. the forwarding voice playing information, for playing.
After the playing is completed, the second intelligent voice equipment can wait for pickup, and when the notified person has a response in a preset waiting time, the voice response information of the notified person in the preset time is collected, and the voice response information is sent to the server.
After receiving the voice response information of the notified person collected by the second intelligent voice device, the server can recognize the voice response information to obtain second notification information, determine second forwarding content according to the identity information of the notified person and the second notification information, determine a second voice message according to the address of the first intelligent voice device, the second forwarding content and the address of the second intelligent voice device, and upload the second voice message to the message bus for forwarding.
For example, if the voice response is "I know, I'll go to sleep right away", the voice response is recognized as text to give the second forwarding content. The address of the second intelligent voice device is used as the source address and the address of the first intelligent voice device is used as the destination address to generate the second voice message, which is uploaded to the message bus for forwarding.
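The address swap for the reply can be sketched in Python as follows. The JSON key names and the function name are hypothetical; only the swap itself (original destination becomes the source, original source becomes the destination) comes from the description.

```python
import json


def build_reply_message(first_message_json: str, reply_text: str) -> str:
    """Build the second voice message by swapping the addresses of the first one."""
    first = json.loads(first_message_json)
    reply = {
        # The replying device was the destination of the first message.
        "source_address": first["destination_address"],
        # The reply goes back to the device that originated the first message.
        "destination_address": first["source_address"],
        "forwarding_content": reply_text,
    }
    return json.dumps(reply, ensure_ascii=False)


first_json = json.dumps({
    "source_address": "yyyy",
    "destination_address": "xxxx",
    "forwarding_content": "Mom tells you: don't stay up late, go to sleep early",
})
second_json = build_reply_message(first_json, "I know, I'll go to sleep right away")
```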
And after the first intelligent voice equipment receives the second voice message, acquiring second forwarding content in the second voice message. And performing voice synthesis on the second forwarding content to obtain second forwarding voice playing information, and playing the second forwarding voice playing information through a loudspeaker.
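The branch at S404 in the flow above (forward when the notified person is located, otherwise leave a message) can be summarized in a short Python sketch. The callables `locate` and `nearest_device`, the return convention and the sample addresses are assumptions standing in for the positioning and device lookup services, which the description does not detail.

```python
from typing import Callable, Optional, Tuple

Position = Tuple[float, float]


def route_notification(
    notified: str,
    locate: Callable[[str], Optional[Position]],
    nearest_device: Callable[[Position], str],
    message_device_addr: str,
) -> Tuple[str, str]:
    """Decide between real-time forwarding and message leaving.

    Returns ("forward", destination address) when the person is located (S405, S407),
    or ("store", message-leaving device address) when not (S406).
    """
    position = locate(notified)  # S403: person positioning
    if position is None:         # S404: position could not be located
        return ("store", message_device_addr)
    return ("forward", nearest_device(position))


# Hypothetical stand-ins for the positioning and lookup services
known_positions = {"Xiaoming": (1.0, 1.0)}
decision = route_notification(
    "Xiaoming", known_positions.get, lambda pos: "192.168.1.20", "192.168.1.40"
)
```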
In order to better explain the embodiments of the present invention, the process of voice forwarding and message leaving will be described in a specific scenario.
Taking an intelligent sound box as an example, the voice forwarding process can include two processes of voice message uploading and message monitoring:
The flow of voice message uploading is shown in fig. 5:
S501, waking up the intelligent sound box.
When a notifier needs to forward a message to a notified person, the notifier needs to wake up the smart speaker by voice, usually by speaking a fixed wake-up word.
S502, the IP address of the awakened intelligent sound box is acquired.
When the intelligent sound box is awakened, the server can acquire the IP address of the awakened intelligent sound box at the same time.
S503, identifying the identity information of the notifier.
A voiceprint recognition service can be invoked to recognize the identity information of the notifier.
S504, a voice forwarding instruction is issued.
After the intelligent sound box is awakened, the notifier issues a voice forwarding instruction, such as "Tell Xiaoming: don't stay up late, go to sleep early".
S505, voice recognition.
The server may invoke a speech recognition service to recognize speech as text information.
S506, semantic analysis.
Semantic analysis is carried out on the text information to extract the semantic information: "notified person: Xiaoming; notification information: 'don't stay up late, go to sleep early'".
S507, personnel positioning.
The notified person obtained in S506 is "Xiaoming". Person positioning is used to find Xiaoming's current position, and the intelligent sound box nearest to that position is determined to obtain its IP address, which is the destination address for forwarding.
S508, generating forwarding content.
Using the notifier identity information obtained in S503, for example "Mom", together with the notification information, the forwarding content is generated as "Mom tells you: don't stay up late, go to sleep early".
S509, uploading a message bus.
The forwarding content, the source address and the destination address are packaged into a JSON string and uploaded to the message bus.
Fig. 6 shows the flow of message monitoring, which specifically comprises the following steps:
S601, monitoring the destination address of the message.
The intelligent sound box can monitor the destination address of each message in the message bus through the thread.
S602, judging whether the destination address of the message is equal to the address of the intelligent sound box, if so, switching to S603, otherwise, switching to S601.
The intelligent sound box judges whether the destination address of the message is consistent with its own address.
S603, pulling the message.
And pulling the message to obtain forwarding content, a destination address and a source address in the message.
S604, speech synthesis.
Invoking a speech synthesis service to synthesize the forwarding content into forwarding speech.
S605, broadcasting.
The intelligent sound box plays the forwarding voice.
S606, starting wakeup.
After the playing is completed, the intelligent sound box is put into the awake state by software.
S607, pick-up.
Waiting for pick-up.
S608, judging whether there is a user response within 5 s, if yes, turning to S609, otherwise turning to S611.
S609, voice recognition.
Voice recognition is carried out on the response information to obtain text information, which serves as the forwarding content.
S610, uploading the message to the message bus.
The original destination address is used as the source address and the original source address is used as the destination address; together with the forwarding content, they are packaged into a JSON string and uploaded to the message bus.
S611, the pick-up is stopped.
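The monitoring steps S601 to S611 can be approximated by the Python sketch below. The class name, the JSON key names and the two callables (one standing in for speech synthesis plus playback, one for pickup plus speech recognition) are assumptions for illustration; real devices would call their own services here. Packaging the reply for upload (S610) is deliberately left out of this sketch.

```python
import json
from typing import Callable, Optional


class SpeakerListener:
    """Filters bus messages by destination address and plays matching ones."""

    def __init__(self, address: str,
                 synthesize_and_play: Callable[[str], None],
                 pick_up_reply: Callable[[float], Optional[str]],
                 wait_seconds: float = 5.0) -> None:
        self.address = address
        self.synthesize_and_play = synthesize_and_play  # stand-in for S604-S605
        self.pick_up_reply = pick_up_reply              # stand-in for S606-S609
        self.wait_seconds = wait_seconds                # 5 s in the example scenario

    def handle(self, raw_message: str) -> Optional[str]:
        """Process one bus message; return the recognized reply text, if any."""
        message = json.loads(raw_message)
        if message["destination_address"] != self.address:  # S602: not for this device
            return None
        self.synthesize_and_play(message["forwarding_content"])  # S603-S605
        # Returns None when nobody answers in time, i.e. pickup stops (S611).
        return self.pick_up_reply(self.wait_seconds)


played = []
listener = SpeakerListener("xxxx", played.append, lambda timeout: "I know")
reply = listener.handle(json.dumps({
    "source_address": "yyyy",
    "destination_address": "xxxx",
    "forwarding_content": "Mom tells you: don't stay up late, go to sleep early",
}))
```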
In this scenario, the message that has been uploaded to the message bus is {"destination address": "xxxx", "source address": "yyyy", "forwarding information": "Mom tells you: don't stay up late, go to sleep early"}. It is detected by the intelligent sound box in the room where Xiaoming is located, whose IP address is "xxxx". The forwarding content "Mom tells you: don't stay up late, go to sleep early" is synthesized into voice and played, and the sound box waits for pickup after playing. If Xiaoming responds within 5 seconds, the response information is converted by voice recognition into the text information to be forwarded; the "destination address" is set to the IP address of the original intelligent sound box in Mom's room, and the "source address" is set to the IP address of the intelligent sound box in the room where Xiaoming currently is. The source address, the destination address and the forwarding content are packaged into a JSON string and uploaded to the bus. Similarly, the intelligent sound box in Mom's room also monitors the message bus in real time and plays the forwarding content whose destination address matches its own.
Further, the voice message function may include a message-leaving process and a message-playing process:
Fig. 7 shows the message-leaving process:
S701, waking up the intelligent sound box.
When a notifier needs to forward a message to a notified person, the notifier needs to wake up the smart speaker by voice, usually by speaking a fixed wake-up word.
S702, identifying the identity information of the notifier.
A voiceprint recognition service can be invoked to recognize the identity information of the notifier.
S703, issuing a voice forwarding instruction.
After the intelligent sound box is awakened, the notifier issues a voice forwarding instruction, such as "Tell Xiaoming: don't stay up late, go to sleep early".
S704, voice recognition.
The server may invoke a speech recognition service to recognize speech as text information.
S705, semantic analysis.
Semantic analysis is carried out on the text information to extract the semantic information: "notified person: Xiaoming; notification information: 'don't stay up late, go to sleep early'".
S706, personnel positioning.
The identity of the notified person obtained in S705 is "Xiaoming", and person positioning is used to query Xiaoming's current location.
S707, the location of the notified person is not found.
Xiaoming's location is not found (Xiaoming's RFID tag is not found within the controllable range).
S708, determining the destination address as the IP address of the message-leaving sound box.
The destination address is set to the IP address of the message-leaving sound box.
S709, storing the message database.
The identity information of the notifier, the identity information of the notified person, the message information, the destination IP address, the current time and the message read flag bit are stored in a message table of a relational database, where the message read flag bit is used for identifying whether the message has been played.
Fig. 8 shows the message-playing process:
S801, determining that the user is close to the message-leaving sound box.
And when the relative distance between the user and the message sound box is smaller than the set value, determining that the user is close to the message sound box.
S802, inquiring whether a message exists in the database for the user, if so, turning to S803, otherwise turning to S801.
The database is queried for message records corresponding to the user within a set time range. The distance threshold and the set time range can be set by the user on an intelligent terminal through the intelligent sound box App.
S803, extracting the message information.
When a message record related to the user is found in the database, the message information and the person who left the message are extracted to form the "message play information": "XXX left a message: XXXXX".
S804, waking up the message sound box.
The message sound box is awakened through software, the voice synthesis service is invoked, and message playing information is synthesized into message voice.
S805, playing the message.
And playing the message voice and simultaneously setting the read mark of the message recorded in the database as read, so as to ensure that each message is only played once automatically, and a user can read for many times in a query mode.
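The message-playing steps S801 to S805 could be sketched as below, again using Python's built-in `sqlite3`. The table layout, the 2 m threshold default, the play template and the `play` callable (standing in for waking the speaker, speech synthesis and playback) are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta
from typing import Callable, Optional


def play_pending(conn: sqlite3.Connection, user: str, distance_m: float,
                 now: datetime, play: Callable[[str], None],
                 threshold_m: float = 2.0,
                 window: Optional[timedelta] = None) -> int:
    """Play unread messages for a nearby user and mark them as read.

    window=None means "no time range", i.e. all unread messages are played.
    Returns the number of messages played.
    """
    if distance_m >= threshold_m:  # S801: user is not close enough
        return 0
    sql = "SELECT id, notifier, content FROM messages WHERE notified = ? AND is_read = 0"
    params = [user]
    if window is not None:  # S802: restrict to the set time range
        sql += " AND created_at >= ?"
        params.append((now - window).isoformat())
    rows = conn.execute(sql + " ORDER BY created_at", params).fetchall()
    for msg_id, notifier, content in rows:
        play(f"{notifier} left a message: {content}")  # S803-S805
        conn.execute("UPDATE messages SET is_read = 1 WHERE id = ?", (msg_id,))
    conn.commit()
    return len(rows)


# Hypothetical demo data: one recent and one older message for Xiaoming
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, notifier TEXT, notified TEXT, "
             "content TEXT, created_at TEXT, is_read INTEGER DEFAULT 0)")
now = datetime(2020, 4, 28, 12, 0, 0)
for sender, text, age in [("Mom", "go to sleep early", timedelta(minutes=30)),
                          ("Dad", "call me back", timedelta(hours=3))]:
    conn.execute("INSERT INTO messages (notifier, notified, content, created_at) "
                 "VALUES (?, 'Xiaoming', ?, ?)", (sender, text, (now - age).isoformat()))
played = []
count = play_pending(conn, "Xiaoming", 1.0, now, played.append, window=timedelta(hours=1))
```

Marking each row as read inside the same pass is what keeps automatic playback to once per message, while a separate query that ignores the flag would still let the user replay messages on request.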
Based on the same technical concept, the embodiment of the invention further provides a computing device, including:
a memory for storing program instructions;
and the processor is used for calling the program instructions stored in the memory and executing the voice forwarding and message leaving methods according to the obtained program.
Based on the same technical concept, the embodiment of the invention also provides a computer readable storage medium, which stores computer executable instructions for causing a computer to execute the above-mentioned voice forwarding and message leaving method.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.

Claims (9)

1. A method of voice forwarding, the method comprising:
receiving a voice notification message acquired by first intelligent voice equipment; the first intelligent voice device is the intelligent voice device woken up by the notifier;
determining the address of the first intelligent voice device, and determining the identity information of the notifier, the first notification information and the identity information of the notified person according to the voice notification message;
Determining first forwarding content according to the identity information of the notifier and the first notification information; performing personnel positioning according to the identity information of the notified person, positioning to the position of the notified person, and determining the address of the second intelligent voice device according to the position of the notified person;
and determining a first voice message according to the address of the first intelligent voice device, the first forwarding content and the address of the second intelligent voice device, and uploading the first voice message to a message bus for forwarding the first voice message.
2. The method of claim 1, wherein the determining the identity information of the notifier, the first notification information, and the identity information of the notified person according to the voice notification message comprises:
voiceprint recognition is carried out on the voice notification message, and identity information of the notifier is determined;
and carrying out semantic analysis on the voice notification message, and extracting identity information and first notification information of the notified person in the voice notification message.
3. The method of claim 1, wherein determining the address of the second intelligent voice device based on the location of the notified person comprises:
Determining the address of the intelligent voice equipment nearest to the notified person according to the position of the notified person;
and determining the address of the intelligent voice device nearest to the notified person as the address of the second intelligent voice device.
4. The method of claim 3, further comprising, after said uploading to a message bus for first voice message forwarding:
acquiring voice response information sent by the second intelligent voice equipment within preset time;
identifying the voice response information to obtain second notification information;
determining second forwarding content according to the identity information of the notified person and the second notification information;
and determining a second voice message according to the address of the first intelligent voice device, the second forwarding content and the address of the second intelligent voice device, and uploading the second voice message to a message bus for forwarding the second voice message.
5. The method of any one of claims 1 to 4, further comprising:
if the position of the notified person cannot be located;
determining the first forwarding content as message information; and storing the identity information of the notifier, the identity information of the notified person, the message information, the current time and the message read flag bit in a database of a memory.
6. The method of claim 5, wherein the method further comprises:
when the distance between the position of the notified person and the intelligent voice equipment for leaving messages is smaller than the preset distance, determining the information of the messages of the notified person and the identity information of the notified person within a set time range from a database of the memory; waking up the intelligent voice device;
according to the message information of the notified person and the identity information of the notified person, performing voice synthesis to obtain message play information;
and sending the message playing information to the intelligent message voice equipment for voice playing, and setting the read flag bit of the message as read.
7. The method of claim 6, wherein the method further comprises:
acquiring a message inquiry request of a user sent by a third intelligent voice device;
identifying identity information of the user;
querying a message record of the user from the database according to the identity information of the user;
performing voice synthesis according to the identity information of the notifier and the message information in the message record to obtain message playing voice;
and sending the message playing voice to the third intelligent voice device for voice playing.
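The on-demand query in claim 7 amounts to a lookup keyed on the identified user. In the sketch below the user identity is assumed to have been resolved already (for example by a separate recognition step), records are plain dictionaries, and the returned strings stand in for synthesized speech. The field names and the `build_playback_texts` helper are hypothetical.

```python
def build_playback_texts(records, user):
    """Return one playback sentence per message left for `user`, oldest first."""
    own = [r for r in records if r["notified"] == user]
    own.sort(key=lambda r: r["created_at"])
    # Each sentence combines the notifier's identity with the message information.
    return [f"Message from {r['notifier']}: {r['content']}" for r in own]


records = [
    {"notifier": "Mom", "notified": "Tom", "content": "Call me", "created_at": 2.0},
    {"notifier": "Dad", "notified": "Ann", "content": "Hi", "created_at": 1.0},
    {"notifier": "Dad", "notified": "Tom", "content": "Keys are on the table", "created_at": 1.0},
]
texts = build_playback_texts(records, "Tom")
```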
8. A server, characterized in that it is configured to perform the method of any of claims 1-7.
9. An intelligent voice device, comprising:
a microphone array for collecting voice information of a user;
a speaker for playing voice information;
an RFID tag for determining a location of the intelligent voice device;
a processor configured to:
determining a source address for message forwarding and a destination address for message forwarding; the source address of the message forwarding is the address of the intelligent voice device, and the destination address of the message forwarding is the address of the intelligent voice device nearest to the notified person;
when a destination address of message forwarding in a first voice message uploaded by a server is confirmed to be a source address of message forwarding, acquiring first forwarding content in the first voice message; wherein the first voice message is determined by a source address to which the message is forwarded, the first forwarding content, and a destination address to which the message is forwarded;
and performing voice synthesis on the first forwarding content to obtain first forwarding voice playing information, and playing the first forwarding voice playing information through the speaker.
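One reading of the processor logic in claim 9 is that each device compares the destination address of a message on the bus with its own address and plays the content only on a match. The sketch below follows that reading. The `VoiceMessage` and `SpeakerDevice` types are hypothetical, and appending to a list stands in for speech synthesis and loudspeaker output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceMessage:
    source: str       # source address of message forwarding
    destination: str  # destination address of message forwarding
    content: str      # first forwarding content


class SpeakerDevice:
    def __init__(self, address: str):
        self.address = address
        self.played = []  # texts handed to the (stubbed) speech synthesizer

    def on_bus_message(self, msg: VoiceMessage) -> bool:
        """Play the message if it is addressed to this device; report whether it was."""
        if msg.destination != self.address:
            return False  # message is meant for another device; ignore it
        self.played.append(msg.content)  # stand-in for synthesis + playback
        return True


bedroom = SpeakerDevice("dev/bedroom")
handled = bedroom.on_bus_message(
    VoiceMessage("dev/kitchen", "dev/bedroom", "Dinner is ready")
)
```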
CN202010350327.7A 2020-04-28 2020-04-28 Voice forwarding method, server and intelligent voice equipment Active CN113488054B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010350327.7A CN113488054B (en) 2020-04-28 2020-04-28 Voice forwarding method, server and intelligent voice equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010350327.7A CN113488054B (en) 2020-04-28 2020-04-28 Voice forwarding method, server and intelligent voice equipment

Publications (2)

Publication Number Publication Date
CN113488054A (en) 2021-10-08
CN113488054B (en) 2024-03-08

Family

ID=77932523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010350327.7A Active CN113488054B (en) 2020-04-28 2020-04-28 Voice forwarding method, server and intelligent voice equipment

Country Status (1)

Country Link
CN (1) CN113488054B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002049390A (en) * 2000-08-04 2002-02-15 Asahi Kasei Corp Voice recognition method, server and voice recognition system
CN101924843A (en) * 2009-06-12 2010-12-22 Avaya Inc. Utilize the caller identification of sound message system
CN104506703A (en) * 2014-12-24 2015-04-08 Xiaomi Inc. Voice message leaving method, voice message leaving device, voice message playing method and voice message playing device
WO2018127008A1 (en) * 2017-01-03 2018-07-12 ZTE Corporation Method and apparatus for acquiring voice message

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10390160B2 (en) * 2017-06-12 2019-08-20 Tyco Fire & Security Gmbh System and method for testing emergency address systems using voice recognition

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Intelligent doorbell *** based on voice control; 赵人熳; 崔巍; 王奕璇; Journal of Yan'an University (Natural Science Edition); 2019-12-20 (No. 04); full text *

Also Published As

Publication number Publication date
CN113488054A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
US10462003B2 (en) Intelligent agent features for wearable personal communication nodes
WO2018188587A1 (en) Voice response method and device, and smart device
CN106782540B (en) Voice equipment and voice interaction system comprising same
CN103024188B (en) A kind of based reminding method of unprocessed information and alarm set
CN103190139B (en) For providing the system and method for conferencing information
EP2182707A1 (en) Ambient sound detection and recognition method
CN109450747B (en) Method and device for awakening smart home equipment and computer storage medium
CN108320745A (en) Control the method and device of display
CN106960667B (en) Position reminding method, device and system
CN113506568B (en) Central control and intelligent equipment control method
JP2017192091A (en) IOT system with voice control function and information processing method thereof
CN111432087A (en) Alarm method and related equipment
CN113470634A (en) Control method of voice interaction equipment, server and voice interaction equipment
WO2016198132A1 (en) Communication system, audio server, and method for operating a communication system
CN109412821A (en) Message treatment method and device and electronic equipment
CN113488054B (en) Voice forwarding method, server and intelligent voice equipment
CN112634922A (en) Voice signal processing method, apparatus and computer readable storage medium
WO2018023514A1 (en) Home background music control system
CN102917345A (en) Method and device for reporting state information of user
CN113470635B (en) Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium
CN112820273B (en) Wake-up judging method and device, storage medium and electronic equipment
CN113746871A (en) Property repair reporting system, method, device and equipment
CN114495921A (en) Voice processing method and device and computer storage medium
CN202887734U (en) Field recorder for trade negotiations
CN114373459A (en) Smart home awakening method and smart home platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant