CN109346083A

CN109346083A - Intelligent voice interaction method and apparatus, related devices, and storage medium

Info

Publication number: CN109346083A
Application number: CN201811449019.9A
Authority: CN
Inventors: 王海燕; 陈君宇; 杨鹏
Original assignee: Beijing Orion Star Technology Co Ltd
Current assignee: Beijing Orion Star Technology Co Ltd
Priority date: 2018-11-28
Filing date: 2018-11-28
Publication date: 2019-02-15

Abstract

The present application provides an intelligent voice interaction method and apparatus, related devices, and a storage medium. The method is applied to a smart device server and includes: receiving a switching request for a target persona among at least two personas; determining, based on the switching request, the unique identification information of the target persona; determining, based on the unique identification information, the target speech-synthesis voice corresponding to the target persona; and sending a switching command for the target persona. Different personas can thus be switched on demand, and output such as the persona's exclusive voice follows the persona that was switched to, meeting users' individual needs.

Description

Intelligent voice interaction method and apparatus, related devices, and storage medium

Technical field

This application relates to the technical field of voice interaction for smart devices, and in particular to an intelligent voice interaction method and apparatus, related devices, and a storage medium, where the related devices include a server, a smart device, and a terminal.

Background

At present, when an intelligent voice product interacts with a user, a product the user has selected offers only one voice and one persona, which is rather monotonous. In practice, most users want to vary the voice style and would very much like to be able to switch between different voices.

Summary of the invention

In view of this, embodiments of this specification provide an intelligent voice interaction method and apparatus, related devices, and a storage medium, to address technical deficiencies in the prior art.

In a first aspect, an embodiment of this specification discloses an intelligent voice interaction method, applied to a smart device server, comprising:

Receiving a switching request for a target persona among at least two personas;

Determining, based on the switching request, the unique identification information of the target persona;

Determining, based on the unique identification information, the target speech-synthesis voice corresponding to the target persona;

Sending a switching command for the target persona.

Optionally, receiving the switching request for a target persona among at least two personas comprises:

Receiving, from the smart device end, the switching request for a target persona among at least two personas; or

Receiving, from the application end, the switching request for a target persona among at least two personas.

Receiving a touch-operation-based switching request for a target persona among at least two personas, the touch-operation-based switching request carrying the unique identification information of the target persona; or

Receiving a voice-based switching request for a target persona among at least two personas.

Optionally, receiving the voice-based switching request for a target persona among at least two personas comprises:

Receiving a switching request, based on a voice switching instruction, for a target persona among at least two personas; or

Receiving a voiceprint-based switching request for a target persona among at least two personas.

Optionally, before receiving the voiceprint-based switching request for a target persona among at least two personas, the method further comprises:

Establishing a correspondence between the target persona and a target voiceprint.

Optionally, after receiving the switching request, based on a voice switching instruction, for a target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

Converting the voice switching instruction into a text switching instruction;

Determining the unique identification information of the target persona based on the text switching instruction.

Optionally, after receiving the voiceprint-based switching request for a target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

Identifying the voiceprint to determine the target voiceprint;

Determining the unique identification information of the target persona based on the correspondence between the target persona and the target voiceprint.

Optionally, before sending the switching command for the target persona, the method further comprises:

Determining, based on the unique identification information, the exclusive corpus corresponding to the target persona.

Optionally, the method further comprises:

Receiving voice information collected by the smart device end;

If the voice information is voice information associated with the target persona, retrieving the exclusive corpus based on the voice information, forming a feedback voice based on the retrieval result, and sending the feedback voice to the smart device end.

Optionally, the method further comprises:

Receiving voice information collected by the smart device end;

If the voice information is general voice information, retrieving a preset general corpus based on the voice information, forming a feedback voice based on the retrieval result, and sending the feedback voice to the smart device end.

Optionally, retrieving the exclusive corpus based on the voice information, forming a feedback voice based on the retrieval result, and sending the feedback voice to the smart device end comprises:

Converting the voice information into corresponding text information;

Parsing the text information to obtain intent text information;

Retrieving the exclusive corpus based on the intent text information to obtain feedback text information;

Converting the feedback text information into a feedback voice in the target speech-synthesis voice corresponding to the target persona, and sending the feedback voice to the smart device end.

Optionally, retrieving the preset general corpus based on the voice information, forming a feedback voice based on the retrieval result, and sending the feedback voice to the smart device end comprises:

Converting the voice information into corresponding text information;

Parsing the text information to obtain intent text information;

Retrieving the general corpus based on the intent text information to obtain feedback text information;

Optionally, the switching request carries the unique identification information and/or the switching voice corresponding to the target persona.

In a second aspect, an embodiment of this specification discloses an intelligent voice interaction method, applied to a smart device end, comprising:

Receiving a switching command for a target persona sent by the smart device server;

Notifying of the switch to the target persona based on the switching command.

Optionally, before receiving the switching command for the target persona sent by the smart device server, the method further comprises:

Sending a switching request for the target persona to the smart device server.

Optionally, sending the switching request for the target persona to the smart device server comprises:

Sending a touch-operation-based switching request for the target persona to the smart device server; or

Sending a voice-based switching request for the target persona to the smart device server.

Optionally, sending the voice-based switching request for the target persona to the smart device server comprises:

Sending a switching request, based on a voice switching instruction, for the target persona to the smart device server; or

Sending a voiceprint-based switching request for the target persona to the smart device server.

Optionally, before sending the voiceprint-based switching request for the target persona to the smart device server, the method further comprises:

Sending a binding request for the target persona to the smart device server, requesting that a correspondence between the target persona and a target voiceprint be established.

Optionally, notifying of the switch to the target persona based on the switching command comprises:

After the switching command is received, obtaining a specified display effect and displaying it.

Optionally, the switching command carries the unique identification information of the target persona;

Notifying of the switch to the target persona based on the switching command comprises:

Determining, based on the unique identification information of the target persona carried by the switching command, a preset display effect corresponding to the target persona, and displaying it; and/or

Determining, based on the unique identification information of the target persona carried by the switching command, a preset switching voice corresponding to the target persona, and playing it.

Optionally, the switching command carries the switching voice corresponding to the target persona;

Playing the switching voice corresponding to the target persona that is carried by the switching command.
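The device-side notification options above can be sketched as follows. This is a minimal illustration only: the command fields `persona_id` and `switch_voice`, the lookup tables, and the rule that a carried switching voice takes precedence over a preset one are all assumptions made for the example, not details given in this specification.

```python
# Hypothetical preset resources stored on the smart device end, keyed by persona ID.
PRESET_EFFECTS = {"persona-child": "balloons"}
PRESET_SWITCH_VOICES = {"persona-child": "child_hello.wav"}


def notify_switch(command: dict) -> list:
    """Return the notification actions for a received switching command."""
    actions = []
    persona_id = command.get("persona_id")      # assumed field name
    carried_voice = command.get("switch_voice")  # assumed field name

    if carried_voice is not None:
        # The command itself carries the switching voice: play it directly.
        actions.append(("play", carried_voice))
    elif persona_id in PRESET_SWITCH_VOICES:
        # Otherwise look up the preset switching voice by persona ID.
        actions.append(("play", PRESET_SWITCH_VOICES[persona_id]))

    if persona_id in PRESET_EFFECTS:
        # Show the preset display effect for this persona, if one exists.
        actions.append(("show", PRESET_EFFECTS[persona_id]))
    return actions
```

Returning a list of actions rather than playing audio directly keeps the sketch free of any audio or display library; a real device would map each action to its own playback and rendering code.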

Optionally, the method further comprises:

If a wake-up voice containing the wake word corresponding to the target persona is collected, determining the wake-up response voice corresponding to the target persona and playing it.
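A small sketch of the persona-specific wake-up response follows. It assumes the wake-up voice has already been transcribed to text by some recognizer; the table contents and the function name `wake_response` are hypothetical.

```python
from typing import Optional

# Hypothetical table: persona ID -> (wake word, wake-up response text).
WAKE_TABLE = {
    "persona-child": ("little star", "I'm here, let's play!"),
    "persona-general": ("assistant", "How can I help?"),
}


def wake_response(active_persona_id: str, transcript: str) -> Optional[str]:
    """Return the active persona's wake-up response if its wake word was spoken."""
    entry = WAKE_TABLE.get(active_persona_id)
    if entry is None:
        return None
    wake_word, response = entry
    # Case-insensitive substring match stands in for real wake-word detection.
    return response if wake_word in transcript.lower() else None
```

Only the wake word of the currently active persona triggers a response here, so the wake word of another persona is ignored until that persona is switched to.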

Optionally, the method further comprises:

Collecting voice information after the smart device end is woken up, and sending it to the smart device server;

Receiving and outputting the feedback voice that the smart device server produces based on the voice information, where the feedback voice is the feedback voice corresponding to the target persona.

In a third aspect, an embodiment of this specification discloses an intelligent voice interaction method, applied to an application end, comprising:

Determining the display interface of the application end based on the switching command.

Determining the display interface of the application end based on the switching command comprises:

Determining, based on the unique identification information of the target persona carried by the switching command, the application display interface corresponding to the target persona.

In a fourth aspect, an embodiment of this specification discloses an intelligent voice interaction apparatus, applied to a smart device server, comprising:

A server-side switching request receiving module, configured to receive a switching request for a target persona among at least two personas;

A unique identification information determining module, configured to determine, based on the switching request, the unique identification information of the target persona;

A target voice determining module, configured to determine, based on the unique identification information, the target speech-synthesis voice corresponding to the target persona;

A switching command sending module, configured to send a switching command for the target persona.

Optionally, the server-side switching request receiving module is specifically configured to:

Optionally, the server-side switching request receiving module is specifically configured to:

Optionally, the apparatus further comprises:

A correspondence establishing module, configured to establish a correspondence between the target persona and a target voiceprint.

Optionally, the unique identification information determining module is specifically configured to:

Convert the voice switching instruction into a text switching instruction;

Identify the voiceprint to determine the target voiceprint;

Optionally, the apparatus further comprises:

An exclusive corpus determining module, configured to determine, based on the unique identification information, the exclusive corpus corresponding to the target persona.

Optionally, the apparatus further comprises:

An exclusive corpus voice information receiving module, configured to receive voice information collected by the smart device end;

An exclusive corpus retrieval module, configured to, if the voice information is voice information associated with the target persona, retrieve the exclusive corpus based on the voice information, form a feedback voice based on the retrieval result, and send the feedback voice to the smart device end.

Optionally, the apparatus further comprises:

A general corpus voice information receiving module, configured to receive voice information collected by the smart device end;

A general corpus retrieval module, configured to, if the voice information is general voice information, retrieve a preset general corpus based on the voice information, form a feedback voice based on the retrieval result, and send the feedback voice to the smart device end.

Optionally, the exclusive corpus retrieval module is specifically configured to:

Convert the voice information into corresponding text information;

Parse the text information to obtain intent text information;

Optionally, the general corpus retrieval module is specifically configured to:

Convert the voice information into corresponding text information;

Parse the text information to obtain intent text information;

In a fifth aspect, an embodiment of this specification discloses an intelligent voice interaction apparatus, applied to a smart device end, comprising:

A device-end switching command receiving module, configured to receive a switching command for a target persona sent by the smart device server;

A device-end notification module, configured to notify of the switch to the target persona based on the switching command.

Optionally, the apparatus further comprises:

A device-end sending module, configured to send a switching request for the target persona to the smart device server.

Optionally, the device-end sending module is specifically configured to:

Optionally, the apparatus further comprises:

A correspondence binding module, configured to send a binding request for the target persona to the smart device server, requesting that a correspondence between the target persona and a target voiceprint be established.

Optionally, the device-end notification module is specifically configured to:

The device-end notification module is specifically configured to:

Optionally, the apparatus further comprises:

A playback module, configured to, if a wake-up voice containing the wake word corresponding to the target persona is collected, determine the wake-up response voice corresponding to the target persona and play it.

Optionally, the apparatus further comprises:

A smart device collection module, configured to collect voice information after the smart device end is woken up and send it to the smart device server;

A feedback voice output module, configured to receive and output the feedback voice that the smart device server produces based on the voice information, where the feedback voice is the feedback voice corresponding to the target persona.

In a sixth aspect, an embodiment of this specification discloses an intelligent voice interaction apparatus, applied to an application end, comprising:

An application-end switching command receiving module, configured to receive a switching command for a target persona sent by the smart device server;

A display interface determining module, configured to determine the display interface of the application end based on the switching command.

Optionally, the apparatus further comprises:

An application-end sending module, configured to send a switching request for the target persona to the smart device server.

Optionally, the application-end sending module is specifically configured to:

The display interface determining module is specifically configured to:

In a seventh aspect, an embodiment of this specification discloses a server, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the intelligent voice interaction method described above.

In an eighth aspect, an embodiment of this specification discloses a smart device, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the intelligent voice interaction method described above.

In a ninth aspect, an embodiment of this specification discloses a terminal, comprising a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the intelligent voice interaction method described above.

In a tenth aspect, an embodiment of this specification discloses a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the intelligent voice interaction method described above.

The intelligent voice interaction method and apparatus, related devices, and storage medium provided by the present application allow different personas to be switched on demand, and output such as the persona's exclusive voice follows the persona that was switched to, meeting users' individual needs.

Brief description of the drawings

Fig. 1 is a flow chart of an intelligent voice interaction method provided by one or more embodiments of this specification;

Fig. 2 is a flow chart of an intelligent voice interaction method provided by one or more embodiments of this specification;

Fig. 3 is a flow chart of an intelligent voice interaction method provided by one or more embodiments of this specification;

Fig. 4 is a flow chart of an intelligent voice interaction method provided by one or more embodiments of this specification;

Fig. 5 is a flow chart of an intelligent voice interaction method provided by one or more embodiments of this specification;

Fig. 6 is a structural schematic diagram of an intelligent voice interaction apparatus provided by one or more embodiments of this specification;

Fig. 7 is a structural schematic diagram of an intelligent voice interaction apparatus provided by one or more embodiments of this specification;

Fig. 8 is a structural schematic diagram of an intelligent voice interaction apparatus provided by one or more embodiments of this specification;

Fig. 9 is a schematic diagram of a smart device provided by one or more embodiments of this specification.

Detailed description of the embodiments

Many specific details are set out in the following description so that the application can be fully understood. The application can, however, be implemented in many ways other than those described here, and those skilled in the art can make similar generalizations without departing from the substance of the application; the application is therefore not limited by the specific implementations disclosed below.

The terms used in one or more embodiments of this specification are for the purpose of describing particular embodiments only and are not intended to limit the one or more embodiments of this specification. The singular forms "a", "said", and "the" used in one or more embodiments of this specification and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of this specification refers to and includes any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second, and similarly a second may also be referred to as a first. Depending on the context, the word "if" as used herein may be interpreted as "at the time of", "when", or "in response to determining".

First, the terms involved in one or more embodiments of the invention are explained.

Persona ("people set"): a character setting. In this specification, a character setting includes: appearance, i.e. the display interface of the application end and the display interface, switching notification, etc. of the smart device end; name, i.e. the custom wake word of the character setting; voice, i.e. the synthesized speech voice of the character setting; thought, i.e. the text corpus of the character setting; and personality, i.e. the exclusive skills of the character setting. That is, different personas may correspond to different display interfaces, switching notifications, wake words, speech-synthesis voices, text corpora, and skills. Different synthesized voices correspond to different acoustic characteristics such as timbre.
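The attributes listed in this definition can be pictured as a single record per persona. The sketch below is one possible representation, not a structure prescribed by this specification; every field name and sample value is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class Persona:
    persona_id: str                 # unique identification information
    wake_word: str                  # "name": the custom wake word
    tts_voice: str                  # "voice": the speech-synthesis voice
    display_theme: str              # "appearance": display interface on app and device
    switch_notice: str              # notification shown or played on switching
    corpus: Dict[str, str] = field(default_factory=dict)  # "thought": text corpus
    skills: Tuple[str, ...] = ()    # "personality": exclusive skills


# Hypothetical registry keyed by the unique identifier.
PERSONAS = {
    p.persona_id: p
    for p in (
        Persona("persona-child", "little star", "child_voice", "cartoon",
                "Switched to the child persona",
                {"tell_story": "Once upon a time there was a little rabbit."},
                ("nursery_rhymes", "riddles")),
        Persona("persona-general", "assistant", "neutral_voice", "plain",
                "Switched to the general persona"),
    )
}
```

Keying the registry by `persona_id` mirrors the role the unique identification information plays in the method: once the ID is known, the voice, corpus, and display settings can all be looked up from it.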

In one or more embodiments of this specification, the intelligent voice interaction method and apparatus can be applied to intelligent terminals such as smart speakers, intelligent vehicle-mounted terminals, or intelligent robots; this application places no limitation on this.

This application provides an intelligent voice interaction method and apparatus, related devices, and a storage medium, which are described in detail one by one in the following embodiments.

Referring to Fig. 1, Fig. 1 discloses an intelligent voice interaction method according to an embodiment of this specification. The method is applied to a smart device server and includes steps 102 to 108.

Step 102: Receive a switching request for a target persona among at least two personas.

In one or more embodiments of this specification, the smart device server has two or more personas, for example a child persona, a celebrity persona, and an ordinary persona. The smart device may include, but is not limited to, a smart speaker.

Taking the child persona as the target persona, receiving a switching request for a target persona among at least two personas means receiving a switching request for the child persona among at least two personas, in order to switch to the child persona.

In one embodiment of this specification, receiving a switching request for a target persona among at least two personas comprises:

Receiving, from the smart device end or the application end, a touch-operation-based switching request for a target persona among at least two personas, the touch-operation-based switching request carrying the unique identification information of the target persona; or

Receiving, from the smart device end or the application end, a voice-based switching request for a target persona among at least two personas.

The touch operation may include, but is not limited to, a touch operation or a click operation at the smart device end or the application end, for example tapping an option to switch to the child persona mode on the screen of a smart speaker or in a mobile phone application.

The voice-based switching request for the target persona may include a switching request, sent by the smart device end or the application end, for a target persona among at least two personas, for example the voice switching request "please switch to the child persona" received from a smart speaker or a mobile phone application.

In one embodiment of this specification, receiving a voice-based switching request for a target persona among at least two personas comprises:

Receiving, from the smart device end or the application end, a switching request, based on a voice switching instruction, for a target persona among at least two personas; or

Receiving, from the smart device end, a voiceprint-based switching request for a target persona among at least two personas.

The voice switching instruction may include, but is not limited to, a spoken phrase such as "please switch to a certain persona", so that the smart device server can determine which persona to switch to.

In addition, before receiving the voiceprint-based switching request for a target persona among at least two personas, the method further comprises:

Establishing a correspondence between the target persona and a target voiceprint.

That is, when the smart device server receives from the smart device end a voiceprint-based switching request for a target persona among at least two personas, it can determine, according to the correspondence between the target persona and the target voiceprint, the target persona corresponding to the switching request from the smart device end.

Step 104: Determine, based on the switching request, the unique identification information of the target persona.

In one embodiment of this specification, after receiving a switching request, based on a voice switching instruction, for a target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

Converting the voice switching instruction into a text switching instruction;

Determining the unique identification information of the target persona based on the text switching instruction.

The unique identification information includes, but is not limited to, the ID of the target persona.

Converting the voice switching instruction into a text switching instruction means parsing the voice switching instruction into text. For example, the voice switching instruction "please switch to the child persona" is parsed into the text switching instruction "please switch to the child persona", and the unique identification information of the target persona, the child persona, is then determined from the text switching instruction.
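As a rough illustration of this step, the sketch below maps a text switching instruction to a persona ID. The specification does not say how the text is matched, so the regular expression, the English phrasing, the ID values, and the stand-in `speech_to_text` function (which simply decodes bytes instead of running a recognizer) are all assumptions.

```python
import re
from typing import Optional

# Hypothetical mapping from a spoken persona name to its unique ID.
PERSONA_IDS = {
    "child": "persona-001",
    "celebrity": "persona-002",
    "ordinary": "persona-003",
}

# Matches phrases such as "please switch to the child persona".
_SWITCH_PATTERN = re.compile(r"switch to (?:the )?(\w+) persona", re.IGNORECASE)


def speech_to_text(audio: bytes) -> str:
    """Stand-in for a real speech recognizer: the 'audio' is UTF-8 text."""
    return audio.decode("utf-8")


def resolve_persona_id(text_instruction: str) -> Optional[str]:
    """Return the persona ID named in the instruction, or None if none matches."""
    match = _SWITCH_PATTERN.search(text_instruction)
    if match is None:
        return None
    return PERSONA_IDS.get(match.group(1).lower())
```

A production system would more likely use a trained intent and slot model than a fixed pattern; the pattern is used here only because it keeps the example self-contained.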

In one embodiment of this specification, after receiving a voiceprint-based switching request for a target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

Identifying the voiceprint to determine the target voiceprint;

Determining the unique identification information of the target persona based on the correspondence between the target persona and the target voiceprint.

When a user switches personas for the first time, the smart device server can bind the voiceprint of the voice sender to a persona; each voice sender can bind one persona by voiceprint. After the smart device server identifies the voiceprint of the voice sender, it determines that the switching request requires switching to the target persona bound to that voiceprint, and then determines the unique identification information of the target persona.
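The binding between voiceprints and personas can be sketched as a simple lookup table. The example assumes a speaker-recognition component has already reduced the audio to a stable `voiceprint_id`; that component, the class name, and the choice to let a later binding replace an earlier one are assumptions of this sketch.

```python
from typing import Dict, Optional


class VoiceprintRegistry:
    """Maps a recognized speaker (voiceprint ID) to the persona bound to it."""

    def __init__(self) -> None:
        self._bindings: Dict[str, str] = {}

    def bind(self, voiceprint_id: str, persona_id: str) -> None:
        # One persona per speaker: a later binding replaces the earlier one
        # (an assumption; the text only says each sender can bind one persona).
        self._bindings[voiceprint_id] = persona_id

    def resolve(self, voiceprint_id: str) -> Optional[str]:
        # Returns None when the speaker has never bound a persona.
        return self._bindings.get(voiceprint_id)
```

With such a table, a voiceprint-based switching request needs no spoken persona name: recognizing who is speaking is enough to select the persona.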

In one embodiment of this specification, a touch-operation-based switching request for a target persona among at least two personas is received, and the touch-operation-based switching request carries the unique identification information of the target persona; the unique identification information of the target persona can therefore be obtained directly from the switching request.

Step 106: Determine, based on the unique identification information, the target speech-synthesis voice corresponding to the target persona.

In actual use, if the unique identification information is the ID of the child persona, determining the target speech-synthesis voice corresponding to the target persona based on the unique identification information means determining, from the ID of the child persona, the child's voice that corresponds to the child persona.

In one embodiment of this specification, before sending the switching command for the target persona, the method further comprises:

Determining, based on the unique identification information, the exclusive corpus corresponding to the target persona.

The exclusive corpus stores the exclusive text corpus of the corresponding target persona. For example, if the target persona is the child persona, the corresponding exclusive corpus is determined based on the ID of the child persona.

Once the smart device server has switched the target speech-synthesis voice and the exclusive corpus, it has completed the switch to the target persona.

Step 108: Send the switching command for the target persona.

The switching command is the response, sent to the smart device end and the application end after the smart device server has completed the switch to the target persona, indicating that the switch to the target persona is complete.
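Steps 102 to 108 can be summarized in a short server-side sketch. It covers only the case where the request already carries the persona ID (as a touch-based request does); voice-based and voiceprint-based requests would first resolve the ID as described for step 104. The table contents, class names, and the shape of the returned command are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical server-side table: persona ID -> voice and exclusive corpus names.
PERSONA_TABLE = {
    "persona-child": {"tts_voice": "child_voice", "corpus": "child_corpus"},
    "persona-celebrity": {"tts_voice": "celebrity_voice", "corpus": "celebrity_corpus"},
}


@dataclass
class SwitchRequest:
    source: str      # "device" or "app"
    persona_id: str  # unique identifier, as carried by a touch-based request


class PersonaServer:
    def __init__(self) -> None:
        self.active_voice = None
        self.active_corpus = None

    def handle_switch(self, request: SwitchRequest) -> dict:
        # Step 104: determine the unique identification information.
        persona_id = request.persona_id
        if persona_id not in PERSONA_TABLE:
            raise KeyError(f"unknown persona: {persona_id}")
        # Step 106: determine the speech-synthesis voice (and exclusive corpus).
        entry = PERSONA_TABLE[persona_id]
        self.active_voice = entry["tts_voice"]
        self.active_corpus = entry["corpus"]
        # Step 108: build the switching command sent back to device and app.
        return {"type": "persona_switched", "persona_id": persona_id}
```

Validating the ID before touching any state means a request for an unknown persona leaves the currently active voice and corpus unchanged.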

In one embodiment of this specification, the method further comprises:

Receiving voice information collected by the smart device end;

Performing speech recognition on the voice information to obtain text information;

Performing semantic parsing on the text information to obtain intent text information;

Determining feedback text information from the exclusive corpus or the general corpus based on the intent text information;

Performing speech synthesis based on the feedback text information to form a feedback voice in the target speech-synthesis voice;

Sending the feedback voice to the smart device end.

Specifically, if the voice information is voice information associated with the target persona, the exclusive corpus is retrieved based on the voice information, a feedback voice is formed based on the retrieval result, and the feedback voice is sent to the smart device end. If the target persona is the child persona, the exclusive corpus corresponding to the child persona may include, but is not limited to, children's classical poems, nursery rhymes, children's riddles, idioms, early-education English, early-education stories, arithmetic, or various children's questions and answers.

The voice information includes, but is not limited to, question queries, chat messages, and the like.

In actual use, taking the child persona as the target persona, if the voice information is "please tell me an early-education story", the voice information can be determined to be voice information associated with the target persona, the child persona.

Referring to Fig. 2, retrieving the dedicated corpus based on the voice information, forming feedback voice based on the retrieval result, and sending the feedback voice to the smart device end includes steps 202 to 208.

Step 202: Convert the voice information into corresponding text information.

Step 204: Parse the text information to obtain intent text information.

Step 206: Retrieve the dedicated corpus based on the intent text information to obtain feedback text information.

Step 208: Convert the feedback text information into feedback voice in the speech synthesis target voice corresponding to the target persona, and send the feedback voice to the smart device end.
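
Steps 202 to 208 can be sketched as a small pipeline. The following Python is a minimal illustration under stated assumptions, not the implementation of this application: the `asr`, `parse_intent`, and `synthesize` callables are hypothetical stand-ins for real speech recognition, semantic parsing, and speech synthesis components, and the dedicated corpus is modeled as a plain dictionary keyed by intent.

```python
from typing import Callable, Dict, Tuple


def answer_from_dedicated_corpus(
    audio: bytes,
    asr: Callable[[bytes], str],              # hypothetical speech recognizer
    parse_intent: Callable[[str], str],       # hypothetical semantic parser
    dedicated_corpus: Dict[str, str],         # intent -> feedback text (assumed shape)
    synthesize: Callable[[str, str], bytes],  # hypothetical TTS: (text, voice) -> audio
    target_voice: str,
) -> Tuple[str, bytes]:
    text = asr(audio)                          # step 202: voice -> text
    intent = parse_intent(text)                # step 204: text -> intent text
    feedback_text = dedicated_corpus[intent]   # step 206: retrieve the dedicated corpus
    feedback_voice = synthesize(feedback_text, target_voice)  # step 208: text -> voice
    return feedback_text, feedback_voice


# Toy stand-ins so the sketch runs end to end.
def fake_asr(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_parse(text: str) -> str:
    return "early_education_story" if "story" in text else "unknown"


def fake_tts(text: str, voice: str) -> bytes:
    return f"[{voice}] {text}".encode("utf-8")


child_corpus = {"early_education_story": "Once upon a time..."}
text, voice = answer_from_dedicated_corpus(
    b"Please tell me an early-education story",
    fake_asr, fake_parse, child_corpus, fake_tts, "child_voice",
)
```

Passing the components in as callables keeps the sketch independent of any particular recognition or synthesis engine, which the source does not name.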

Further, if the voice information is not associated with the target persona but is general voice information, a preset general corpus is retrieved based on the voice information, feedback voice is formed based on the retrieval result, and the feedback voice is sent to the smart device end.

The general corpus stores general text corpora, which include, but are not limited to, answers concerning weather, alarm clocks, or FM (Frequency Modulation) radio.

In addition, each piece of unique identification information corresponds to a dedicated corpus and the general corpus.

In practice, if the voice information is "What is the weather like today?", it can be determined that the voice information is not associated with the target persona.

Retrieving the preset general corpus based on the voice information, forming feedback voice based on the retrieval result, and sending the feedback voice to the smart device end includes:

Converting the voice information into corresponding text information.

Parsing the text information to obtain intent text information.

Retrieving the general corpus based on the intent text information to obtain feedback text information.
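
The choice between the dedicated corpus and the general corpus can be illustrated as follows. This is a sketch only: the text does not say how the server decides whether voice information is associated with the target persona, so the code assumes, purely for illustration, that an intent is persona-associated when it appears in that persona's dedicated corpus. The corpus contents and the `select_feedback_text` name are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Hypothetical corpora, each keyed by intent text.
DEDICATED: Dict[str, Dict[str, str]] = {
    "child": {"early_education_story": "Once upon a time..."},
}
GENERAL: Dict[str, str] = {"weather": "It is sunny today."}


def select_feedback_text(persona_id: str, intent: str) -> Tuple[str, Optional[str]]:
    """Return (corpus_name, feedback_text) for an intent.

    Assumption: membership in the dedicated corpus marks the intent as
    persona-associated; everything else is treated as general.
    """
    dedicated = DEDICATED.get(persona_id, {})
    if intent in dedicated:
        return "dedicated", dedicated[intent]
    if intent in GENERAL:
        return "general", GENERAL[intent]
    return "none", None
```

For the child persona, `select_feedback_text("child", "weather")` falls through to the general corpus, matching the weather example in the text.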

In one embodiment of this specification, the switching request carries the unique identification information and/or the switching voice corresponding to the target persona.

That is, the switching request carries the unique identification information corresponding to the target persona and/or the switching voice corresponding to the target persona, so that the target persona to switch to can be determined.

In one or more embodiments of this specification, a switching request for a target persona among at least two personas is received; the unique identification information of the target persona is determined based on the switching request; the speech synthesis target voice corresponding to the target persona is determined based on the unique identification information; and a switching instruction for the target persona is sent. Different personas can be switched on demand, and the voice and other features exclusive to the target persona can be output according to the persona switched to, meeting the personalized needs of the user.
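
The server-side flow summarized above can be sketched as follows. This is an illustrative sketch rather than the implementation of this application: the request and the instruction are modeled as dictionaries, and the field names (`persona_id`, `switch_phrase`, `target_voice`), the persona table, and the phrase table are all assumptions introduced here.

```python
from typing import Dict

# Hypothetical tables: persona ID -> TTS target voice, switching phrase -> persona ID.
TARGET_VOICES: Dict[str, str] = {"child": "voice_child", "star": "voice_star"}
SWITCH_PHRASES: Dict[str, str] = {"switch to xiaobao": "child"}


def handle_switch_request(request: Dict[str, str]) -> Dict[str, str]:
    """Resolve the target persona and build a switching instruction."""
    # The request may carry the unique ID and/or a switching voice
    # (modeled here as already-transcribed text).
    persona_id = request.get("persona_id")
    if persona_id is None:
        phrase = request.get("switch_phrase", "").strip().lower()
        persona_id = SWITCH_PHRASES.get(phrase)
    if persona_id not in TARGET_VOICES:
        raise ValueError("unknown target persona")
    return {
        "type": "switch",
        "persona_id": persona_id,
        "target_voice": TARGET_VOICES[persona_id],
    }
```

Resolving everything to a single persona ID first means the voice lookup is the same regardless of how the switch was requested.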

Referring to Fig. 3, one embodiment of this specification discloses an intelligent voice interaction method applied to a smart device end, including steps 302 to 304.

Step 302: Receive the switching instruction for the target persona sent by the smart device server.

The switching instruction is the response sent by the smart device server indicating that switching to the target persona is complete.

In one embodiment of this specification, before receiving the switching instruction for the target persona sent by the smart device server, the method further includes:

In one case, sending the switching request for the target persona to the smart device server includes:

In another case, sending the voice-based switching request for the target persona to the smart device server includes:

In yet another case, before sending the voiceprint-based switching request for the target persona to the smart device server, the method further includes:

For the touch-based switching request, the voice-based switching request, the switching request based on a voice switching instruction, and the voiceprint-based switching request, refer to the embodiments above; details are not repeated here.

Step 304: Notify of the switch to the target persona based on the switching instruction.

In one embodiment of this specification, notifying of the switch to the target persona based on the switching instruction includes:

That is, after receiving the switching instruction, the smart device end presents specified presentation effects such as sound effects and light effects, for example a specified "beep-beep-beep" sound effect and a light effect in which a yellow light flashes three times.

In one embodiment of this specification, the switching instruction carries the unique identification information of the target persona;

For example, a presentation effect and a switching voice are configured in advance for the unique identification information of the target persona; for instance, the ID of the child persona is configured with a light effect in which a green light flashes three times and the switching voice "I have switched to Xiaobao".

If the unique identification information of the target persona carried by the switching instruction is the child persona ID, then, based on that unique identification information, the preset presentation effect corresponding to the target persona is determined and presented, and the preset switching voice corresponding to the target persona is determined and broadcast. That is, the smart device end presents the light effect of a green light flashing three times and broadcasts the switching voice "I have switched to Xiaobao".

In one embodiment of this specification, the switching instruction carries the switching voice corresponding to the target persona;

If the switching voice corresponding to the target persona carried by the switching instruction is "Target persona switching complete", the broadcast is made based on that switching voice; that is, the smart device end broadcasts the switching voice "Target persona switching complete".

In practice, the switching voice can be set according to the actual needs of the target persona; this application places no limitation on this.
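
The device-side notification described in these embodiments can be sketched as a lookup. The following is a minimal sketch, assuming the switching instruction is a dictionary with optional `persona_id` and `switch_voice` fields; those field names, the `plan_notification` function, and the fallback light effect are hypothetical. The child persona entry mirrors the example in the text.

```python
from typing import Dict, List, Tuple

Action = Tuple[str, str]  # (kind, detail), e.g. ("light", "green x3")

# Preconfigured per-persona effects (hypothetical table; the child entry
# mirrors the example in the text).
PRESETS: Dict[str, Dict[str, str]] = {
    "child": {"light": "green x3", "voice": "I have switched to Xiaobao"},
}
FALLBACK_LIGHT = "yellow x3"  # assumed default when no preset exists


def plan_notification(instruction: Dict[str, str]) -> List[Action]:
    """Turn a switching instruction into an ordered list of actions."""
    preset = PRESETS.get(instruction.get("persona_id", ""), {})
    actions: List[Action] = [("light", preset.get("light", FALLBACK_LIGHT))]
    # A switching voice carried by the instruction takes priority over the
    # preset one (an assumption; the source does not rank the two).
    voice = instruction.get("switch_voice") or preset.get("voice")
    if voice:
        actions.append(("speak", voice))
    return actions
```

Returning a list of actions instead of driving hardware directly keeps the sketch testable without a real light or loudspeaker.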

In one embodiment of this specification, the method further includes:

Different wake words can be set in advance for different personas; for example, when the target persona is a child persona, the wake-up voice "Xiaobao Xiaobao" can be set for the child persona.

The wake-up response voice may include a reply voice corresponding to the wake-up voice. For example, if the target persona is a child persona and the wake-up voice set for the child persona is "Xiaobao Xiaobao", the wake-up reply voice may be "Xiaobao is here".
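
The pairing of a wake word with a wake-up reply for each persona can be sketched as a small table lookup. This is an illustration under assumptions: the heard audio is taken to be already transcribed to text, matching is a simple case-insensitive substring test, and the `WAKE_TABLE` and `wake_reply` names are hypothetical.

```python
from typing import Dict, Optional, Tuple

# persona ID -> (wake word, wake-up reply); hypothetical table using the
# child persona example from the text.
WAKE_TABLE: Dict[str, Tuple[str, str]] = {
    "child": ("xiaobao xiaobao", "Xiaobao is here"),
}


def wake_reply(active_persona: str, heard_text: str) -> Optional[str]:
    """Return the reply if the heard text contains the active persona's wake word."""
    entry = WAKE_TABLE.get(active_persona)
    if entry is None:
        return None
    wake_word, reply = entry
    return reply if wake_word in heard_text.lower() else None
```

Only the active persona's wake word is checked, so a wake word belonging to another persona does not wake the device in this sketch.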

In one embodiment of this specification, voice information is collected after the smart device end is woken up and is sent to the smart device server;

After the smart device end is woken up, it can collect voice information and send it to the smart device server. For an explanation of the voice information, refer to the embodiments above; details are not repeated here.

After the feedback voice based on the voice information is received from the smart device server, the feedback voice is output.

For example, if the collected voice information is "What time is it now?" and the target persona is "female star Lin **", then after receiving from the smart device server the feedback voice "13:00 in the afternoon" rendered in the voice of female star Lin **, the smart device end outputs that feedback voice.

In one or more embodiments of this specification, the method includes receiving the switching instruction for the target persona sent by the smart device server, and notifying of the switch to the target persona based on the switching instruction. When the target persona is switched, different transition effects and switching voices can be presented according to the target persona, and feedback voice with an exclusive timbre can be received and output according to the persona switched to, meeting the personalized needs of the user.

Referring to Fig. 4, one embodiment of this specification provides an intelligent voice interaction method applied to an application end, including steps 402 to 404.

Step 402: Receive the switching instruction for the target persona sent by the smart device server.

In one embodiment of this specification, before receiving the switching instruction for the target persona sent by the smart device server, the method further includes:

Sending the switching request for the target persona to the smart device server includes:

For a description of the touch-based switching request and the voice-based switching request, refer to the embodiments above; details are not repeated here.

Step 404: Determine the display interface of the application end based on the switching instruction.

In one embodiment of this specification, the switching instruction carries the unique identification information of the target persona;

The display interface of the application corresponding to the target persona is determined based on the unique identification information of the target persona carried by the switching instruction. That is, the interface display elements, the interface function layout, the avatar of the target persona, and the like are determined based on that unique identification information.

For example, if the target persona is a child persona, determining the display interface of the application based on the unique identification information carried by the switching instruction may mean that, based on the ID of the child persona, the application end displays cartoon-style interface elements, a more playful interface layout, a baby avatar, and the like.
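
Determining the display interface from the persona ID can be sketched as a theme lookup. This is a minimal sketch under assumptions: the `DisplayInterface` fields, the theme table, and the fallback theme are hypothetical, and the child entry mirrors the cartoon-style example in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class DisplayInterface:
    elements: str  # style of the interface display elements
    layout: str    # interface function layout
    avatar: str    # avatar shown for the persona


# Hypothetical theme table keyed by persona ID.
THEMES: Dict[str, DisplayInterface] = {
    "child": DisplayInterface("cartoon", "playful", "baby"),
}
DEFAULT_THEME = DisplayInterface("standard", "standard", "default")  # assumed fallback


def interface_for(instruction: Dict[str, str]) -> DisplayInterface:
    """Pick the display interface from the persona ID carried by the instruction."""
    return THEMES.get(instruction.get("persona_id", ""), DEFAULT_THEME)
```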

In one or more embodiments of this specification, the method includes receiving the switching instruction for the target persona sent by the smart device server, and determining the display interface of the application end based on the switching instruction. The switch to the target persona can be performed directly at the application end, and the application end can change to the corresponding display interface according to the switch, which strengthens the sense of occasion of a persona switch and greatly improves the user's personalized experience.

Referring to Fig. 5, the intelligent voice interaction method is explained in detail using an application end, a smart speaker end, an interface calling end, and a smart speaker server as an example; see steps 502 to 518.

Step 502: The application end or the smart speaker end sends a switching request for the target persona to the smart speaker server by a touch operation or by voice.

Step 504: The smart speaker server determines the unique identification information of the target persona based on the switching request, and then determines the corresponding dedicated corpus and the speech synthesis target voice corresponding to the target persona according to that unique identification information.

Step 506: The smart speaker server sends the switching instruction for the target persona to the smart speaker end and the application end.

It should be noted that the switching instructions for the target persona that the smart speaker server sends to the smart speaker end and to the application end may have the same data format or different data formats.
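
The note on data formats can be illustrated with two payload builders, one per recipient. This is a sketch only: JSON is an assumption (the text does not name a wire format), and every field name below is hypothetical.

```python
import json


def speaker_instruction(persona_id: str, switch_voice: str) -> str:
    """Payload for the smart speaker end: which persona, and what to announce."""
    return json.dumps(
        {"cmd": "switch", "persona_id": persona_id, "switch_voice": switch_voice},
        sort_keys=True,
    )


def app_instruction(persona_id: str) -> str:
    """Payload for the application end: which persona's interface to show."""
    return json.dumps(
        {"event": "persona_switched", "persona_id": persona_id},
        sort_keys=True,
    )


speaker_msg = speaker_instruction("child", "I have switched to Xiaobao")
app_msg = app_instruction("child")
```

Both payloads carry the same persona ID, so the two ends stay consistent even though each receives only the fields it needs.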

Step 508: The smart speaker end receives the switching instruction for the target persona sent by the smart speaker server, notifies of the switch to the target persona based on the switching instruction, presents the sound effect and light effect corresponding to the target persona, and broadcasts the switching voice corresponding to the target persona.

Step 510: The application end receives the switching instruction for the target persona sent by the smart speaker server and determines the display interface of the application end based on the switching instruction.

Steps 508 and 510 may be executed in parallel; this application places no limitation on this.

Step 512: The smart speaker end collects a wake-up voice containing the wake word corresponding to the target persona, determines the wake-up response voice corresponding to the target persona, and broadcasts it.

Step 514: After being woken up, the smart speaker end collects voice information and sends it to the smart speaker server.

Step 516: The smart speaker server receives the voice information collected by the smart speaker end and forms feedback voice based on the voice information.

In one embodiment of this specification, the voice information is converted into corresponding text information through ASR (Automatic Speech Recognition);

The text information is parsed using NLP (Natural Language Processing) to obtain intent text information;

Based on the intent text information, feedback text information is obtained from the dedicated corpus or the general corpus;

The feedback text information is converted into feedback voice in the speech synthesis target voice corresponding to the target persona, and the feedback voice is sent to the smart speaker end.

Step 518: The smart speaker end receives and outputs the feedback voice, based on the voice information, from the smart speaker server, where the feedback voice is the feedback voice corresponding to the target persona.

In one or more embodiments of this specification, each persona in the method has its own independent personality, supports a personalized corpus, and can have customized wake words and replies. Each persona has its own voice characteristics, so the user can choose according to the voice style he or she likes, and personalized speech synthesis can be performed according to the target persona. Each persona also has its own exclusive skills, which can meet the personalized needs of the user.

In one or more embodiments of this specification, when a new user uses the smart speaker for the first time and configures its multi-persona function for the first time, the smart speaker is first set up on the network so that it connects to the mobile app end. The user then enters the persona selection page at the mobile app end. The persona selection page presents each persona's image, personality description, attributes, functions, and the like, and on this page the user can customize, for each persona, the presentation effect and switching voice at the smart speaker end and the display interface at the mobile app end. The user may be guided to select a persona by means of chat bubbles, and a guidance voice may be played on the corresponding persona selection page. After the guide pages are completed and the user taps again to enter the mobile app, the default persona can initiate interaction: the mobile app end can show the display interface corresponding to the default persona, and the smart speaker end can present the presentation effect corresponding to the default persona and broadcast its switching voice, for example "Hi, hello master, I am ***. From now on I will keep you company. I understand ***".

When the multi-persona function is used for the first time, an existing user who is already registered at the mobile app end connected to the smart speaker can proceed directly to learning about the persona settings. After the function goes live, when the user first opens the mobile app end, a notice explaining that the persona function is now available pops up and guides the user to use it. The user can first add the persona switching function in the "Me" menu bar of the mobile app end's display interface and associate the persona switching function with the product serial number (SN code) of the smart speaker, so that the persona switching function takes effect on the associated smart speaker. The user then enters the persona selection page at the mobile app end. The persona selection page presents each persona's image, personality description, attributes, functions, and the like, and on this page the user can customize, for each persona, the presentation effect and switching voice at the smart speaker end and the display interface at the mobile app end. The user may be guided to select a persona by means of chat bubbles, and a guidance voice may be played on the corresponding persona selection page. After the guide pages are completed and the user taps again to enter the mobile app, the default persona can initiate interaction: the mobile app end can show the display interface corresponding to the default persona, and the smart speaker end can present the presentation effect corresponding to the default persona and broadcast its switching voice, for example "Hi, hello master, I am ***. From now on I will keep you company. I understand ***".
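
Associating the persona switching function with a speaker's SN code, so that it takes effect only on the associated speaker, can be sketched as follows. This is an illustration under assumptions: the `PersonaSwitchRegistry` class, its method names, and the in-memory storage are hypothetical and are not described in the text.

```python
from typing import Dict, Optional, Set


class PersonaSwitchRegistry:
    """Tracks which speaker serial numbers have persona switching enabled."""

    def __init__(self) -> None:
        self._enabled: Set[str] = set()   # SN codes associated with the function
        self._active: Dict[str, str] = {}  # SN code -> active persona ID

    def associate(self, sn: str) -> None:
        """Associate the persona switching function with a speaker's SN code."""
        self._enabled.add(sn)

    def switch(self, sn: str, persona_id: str) -> bool:
        """Switch persona on a speaker; refused if the SN is not associated."""
        if sn not in self._enabled:
            return False
        self._active[sn] = persona_id
        return True

    def active_persona(self, sn: str) -> Optional[str]:
        return self._active.get(sn)


registry = PersonaSwitchRegistry()
registry.associate("SN-0001")  # placeholder serial number
```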

Referring to Fig. 6, an embodiment of this specification discloses an intelligent voice interaction apparatus applied to a smart device server, including:

A server-side switching request receiving module 602, configured to receive a switching request for a target persona among at least two personas;

A unique identification information determining module 604, configured to determine the unique identification information of the target persona based on the switching request;

A target voice determining module 606, configured to determine, based on the unique identification information, the speech synthesis target voice corresponding to the target persona;

A switching instruction sending module 608, configured to send a switching instruction for the target persona.

Optionally, the server-side switching request receiving module 602 is specifically configured to:

Optionally, the server-side switching request receiving module 602 is specifically configured to:

Optionally, the apparatus further includes:

Optionally, the unique identification information determining module 604 is specifically configured to:

Convert the voice switching instruction into a text switching instruction;

Recognize the voiceprint to determine a target voiceprint;

Optionally, the apparatus further includes:

Optionally, the apparatus further includes:

Optionally, the dedicated corpus retrieval module is specifically configured to:

Convert the voice information into corresponding text information;

Parse the text information to obtain intent text information;

Convert the voice information into corresponding text information;

Parse the text information to obtain intent text information;

For the specific implementation of the intelligent voice interaction apparatus shown in Fig. 6, refer to the foregoing embodiment of the intelligent voice interaction method for the smart device server; details are not repeated here.

Referring to Fig. 7, an embodiment of this specification discloses an intelligent voice interaction apparatus applied to a smart device end, including:

A device-end switching instruction receiving module 702, configured to receive the switching instruction for the target persona sent by the smart device server;

A device-end notification module 704, configured to notify of the switch to the target persona based on the switching instruction.

Optionally, the apparatus further includes:

Optionally, the device-end sending module is specifically configured to:

Optionally, the apparatus further includes:

Optionally, the device-end notification module 704 is specifically configured to:

The device-end notification module 704 is specifically configured to:

Optionally, the apparatus further includes:

For the specific implementation of the intelligent voice interaction apparatus shown in Fig. 7, refer to the foregoing embodiment of the intelligent voice interaction method for the smart device end; details are not repeated here.

Referring to Fig. 8, an embodiment of this specification discloses an intelligent voice interaction apparatus applied to an application end, including:

An application-end switching instruction receiving module 802, configured to receive the switching instruction for the target persona sent by the smart device server;

A display interface determining module 804, configured to determine the display interface of the application end based on the switching instruction.

Optionally, the apparatus further includes:

Optionally, the application-end sending module is specifically configured to:

The display interface determining module 804 is specifically configured to:

For the specific implementation of the intelligent voice interaction apparatus shown in Fig. 8, refer to the foregoing embodiment of the intelligent voice interaction method for the application end; details are not repeated here.

An embodiment of this specification discloses a server, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the foregoing intelligent voice interaction method for the smart device server.

An embodiment of this specification discloses a smart device, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the foregoing intelligent voice interaction method for the smart device end.

The smart device may be, but is not limited to, a smart speaker.

Referring to Fig. 9, Fig. 9 is a structural block diagram of a smart device 900 according to one embodiment of this specification. The components of the smart device 900 include, but are not limited to, a memory 910 and a processor 920. The processor 920 is connected to the memory 910 through a bus 930.

The smart device 900 further includes an access device 940, which enables the smart device 900 to communicate via one or more networks. Examples of these networks include the public switched telephone network (PSTN), a local area network (LAN), a wide area network (WAN), a personal area network (PAN), or a combination of communication networks such as the Internet. The access device 940 may include one or more of any type of wired or wireless network interface (for example, a network interface card (NIC)), such as an IEEE 802.11 wireless local area network (WLAN) wireless interface, a Worldwide Interoperability for Microwave Access (WiMAX) interface, an Ethernet interface, a universal serial bus (USB) interface, a cellular network interface, a Bluetooth interface, a near-field communication (NFC) interface, and so on.

In one embodiment of this specification, the above components of the smart device 900 and other components not shown in Fig. 9 may also be connected to each other, for example through a bus. It should be understood that the structural block diagram of the smart device shown in Fig. 9 is for illustrative purposes only and does not limit the scope of this specification. Those skilled in the art may add or replace other components as needed.

An embodiment of this specification discloses a terminal, including a memory, a processor, and computer instructions stored in the memory and executable on the processor, where the processor, when executing the instructions, implements the steps of the intelligent voice interaction method for the application end.

The terminal may include, but is not limited to, portable mobile terminals on which the application can be installed, such as mobile phones and tablet computers.

The above is an exemplary scheme of the related devices of this embodiment. The related devices include a server, a smart device, and a terminal. It should be noted that the technical solution of the related devices and the technical solution of the intelligent voice interaction method described above belong to the same concept; for details not described in the technical solution of the related devices, refer to the description of the technical solution of the intelligent voice interaction method above.

One embodiment of this application further provides a computer-readable storage medium storing computer instructions which, when executed by a processor, implement the steps of the intelligent voice interaction method described above.

The above is an exemplary scheme of a computer-readable storage medium of this embodiment. It should be noted that the technical solution of the storage medium and the technical solution of the intelligent voice interaction method described above belong to the same concept; for details not described in the technical solution of the storage medium, refer to the description of the technical solution of the intelligent voice interaction method above.

Specific embodiments of this specification have been described above. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or sequential order, to achieve the desired results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The computer instructions include computer program code, which may be in source code form, object code form, an executable file, certain intermediate forms, or the like. The computer-readable medium may include: any entity or apparatus capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately added to or reduced according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that this application is not limited by the described order of actions, because according to this application certain steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and the actions and modules involved are not necessarily all required by this application.

In the above embodiments, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, refer to the related descriptions of other embodiments.

The preferred embodiments of this application disclosed above are intended only to help explain this application. The optional embodiments do not describe all details exhaustively, nor do they limit the invention to the specific embodiments described. Obviously, many modifications and variations can be made in light of the content of this specification. This specification selects and specifically describes these embodiments in order to better explain the principles and practical applications of this application, so that those skilled in the art can better understand and use this application. This application is limited only by the claims and their full scope and equivalents.

Claims

1. An intelligent voice interaction method, characterized in that it is applied to a smart device server and comprises:

sending a switching instruction for the target persona.

2. The method according to claim 1, characterized in that receiving the switching request for the target persona among at least two personas comprises:

3. The method according to claim 1 or 2, characterized in that receiving the switching request for the target persona among at least two personas comprises:

receiving a touch-based switching request for the target persona among at least two personas, the touch-based switching request carrying the unique identification information of the target persona; or

4. The method according to claim 3, characterized in that receiving a voice-based switching request for the target persona among at least two personas comprises:

5. The method according to claim 4, characterized in that, before receiving a voiceprint-based switching request for the target persona among at least two personas, the method further comprises:

6. The method according to claim 4, characterized in that, after receiving the switching request based on a voice switching instruction for the target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

converting the voice switching instruction into a text switching instruction;

7. The method according to claim 5, characterized in that, after receiving the voiceprint-based switching request for the target persona among at least two personas, determining the unique identification information of the target persona based on the switching request comprises:

recognizing the voiceprint to determine a target voiceprint;

8. The method according to claim 1, characterized in that, before sending the switching instruction for the target persona, the method further comprises:

9. The method according to claim 8, characterized in that the method further comprises:

receiving voice information collected by the smart device end;

if the voice information is associated with the target persona, retrieving the dedicated corpus based on the voice information, forming feedback voice based on the retrieval result, and sending the feedback voice to the smart device end.

10. The method according to claim 8, characterized in that the method further comprises:

receiving voice information collected by the smart device end;

if the voice information is general voice information, retrieving a preset general corpus based on the voice information, forming feedback voice based on the retrieval result, and sending the feedback voice to the smart device end.