CN110473541A

CN110473541A - Artificial-intelligence-based Alexa voice control method and system for audio equipment

Info

Publication number: CN110473541A
Application number: CN201910831030.XA
Authority: CN
Inventors: 庄少宏; 曾庆法; 李叶永; 史作超; 葛丰达; 张勃然; 姜秉周; 何伟新; 莫砺; 高正彬; 劳凯邦; 李蕤秀; 潘伟兴; 吴亚新; 刘湘华; 刘学满; 张泽远
Original assignee: GUANGZHOU PANYU JUDA CAR AUDIO EQUIPMENT CO Ltd
Current assignee: GUANGZHOU PANYU JUDA CAR AUDIO EQUIPMENT CO Ltd
Priority date: 2019-09-02
Filing date: 2019-09-02
Publication date: 2019-11-19

Abstract

The invention discloses an artificial-intelligence-based Alexa voice control method and system for audio equipment. The method comprises: receiving control voice information sent by a user; judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice; if so, parsing the control voice information based on semantic recognition to obtain the user's control instruction; and the audio device responding to the control instruction and executing the action corresponding to it. In embodiments of the invention, artificial intelligence is introduced into audio equipment control, improving the user's experience of controlling the audio equipment.

Description

An artificial-intelligence-based Alexa voice control method and system for audio equipment

Technical field

The present invention relates to the technical field of audio equipment control, and in particular to an artificial-intelligence-based Alexa voice control method and system for audio equipment.

Background art

Artificial intelligence (AI) has, since its inception, matured steadily in both theory and technology, and its fields of application keep expanding. The technological products that artificial intelligence brings in the future can be expected to be "containers" of human wisdom. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is not human intelligence, but it can think like a human and may even surpass human intelligence.

With the development of technology, the control of audio devices has evolved from physical buttons to infrared remote control and, more recently, to network control. However, existing control methods for audio devices can no longer meet the diverse needs of today's young users.

Summary of the invention

The object of the invention is to overcome the deficiencies of the prior art. The invention provides an artificial-intelligence-based Alexa voice control method and system for audio equipment that meets users' needs for controlling audio devices and improves the user experience.

To solve the above technical problem, an embodiment of the invention provides an artificial-intelligence-based Alexa voice control method for audio equipment, the method comprising:

Receiving control voice information sent by a user;

Judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

If so, parsing the control voice information based on semantic recognition to obtain the user's control instruction;

The audio device responding to the control instruction and executing the action corresponding to the control instruction.
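As a non-limiting illustration, the four steps above may be sketched as a single function that handles one utterance. The function name `handle_utterance` and the three injected callables are assumptions of this sketch and are not named in the embodiments; the toy stand-ins below merely make the sketch runnable and do not perform real voiceprint or semantic recognition.

```python
from typing import Callable, Optional


def handle_utterance(
    audio: bytes,
    is_authorized: Callable[[bytes], bool],
    parse_instruction: Callable[[bytes], str],
    execute: Callable[[str], str],
) -> Optional[str]:
    """Run one utterance through the steps; return None if the speaker is rejected."""
    # Voiceprint gate: an unauthorized speaker never reaches semantic parsing.
    if not is_authorized(audio):
        return None
    # Semantic recognition turns the utterance into a control instruction.
    instruction = parse_instruction(audio)
    # The device performs the action that corresponds to the instruction.
    return execute(instruction)


# Hypothetical stand-ins: a real system would plug in actual recognizers here.
def owner_only(audio: bytes) -> bool:
    return audio.startswith(b"owner:")


def to_text(audio: bytes) -> str:
    return audio.split(b":", 1)[1].decode()


def run(instruction: str) -> str:
    return f"executed {instruction}"


print(handle_utterance(b"owner:play jazz", owner_only, to_text, run))
print(handle_utterance(b"guest:play jazz", owner_only, to_text, run))
```

Passing the recognizers in as callables keeps the control flow independent of whichever voiceprint and semantic recognition techniques are actually chosen.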

Optionally, before the audio device responds to the control instruction, the method further comprises:

Judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.

Optionally, when the control instruction is judged to be a query-and-play control instruction, the method comprises:

Searching the audio database of the audio device according to the query-and-play control instruction to obtain the audio data to be played;

The audio device decoding and playing the audio data to be played;

Wherein searching the audio database of the audio device according to the query-and-play control instruction comprises:

Searching the local audio database to obtain local audio data to be played;

Judging whether the audio device is connected to the internet;

If connected, searching the network audio database to obtain network audio data to be played;

Merging the local audio data to be played with the network audio data to be played to obtain the audio data to be played.

Optionally, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice;

Selection voice information issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result;

The audio device decodes and plays the audio data to be played according to the user's selection result.

Optionally, when the control instruction is judged to be a playback function switching instruction, the method comprises:

The audio device responding to the playback function switching instruction and switching its state to the functional state corresponding to the playback function switching instruction.

In addition, an embodiment of the invention further provides an artificial-intelligence-based Alexa voice control system for audio equipment, the system comprising:

Voice receiving module: for receiving the control voice information sent by the user;

Voiceprint recognition module: for judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

Semantic parsing module: for parsing the control voice information based on semantic recognition to obtain the user's control instruction;

Execution module: for the audio device to respond to the control instruction and execute the action corresponding to the control instruction.

Optionally, before the execution module, the system further comprises:

Instruction judgment module: for judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.

Optionally, the execution module is further configured such that:

When the control instruction is judged to be a query-and-play control instruction:

The audio device decodes and plays the audio data to be played;

It is judged whether the audio device is connected to the internet;

Optionally, the execution module is further configured such that:

When more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice;

Optionally, the execution module is further configured such that:

When the control instruction is judged to be a playback function switching instruction:

In specific implementations of the invention, voiceprint recognition is performed on the user's voice information so that the user can control the audio device securely and the device cannot be controlled arbitrarily by outside voice information. Adding artificial intelligence to the audio device gives the user intelligent control of the device and improves the user experience.

Brief description of the drawings

To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flow diagram of the artificial-intelligence-based Alexa voice control method for audio equipment in an embodiment of the invention;

Fig. 2 is a schematic structural diagram of the artificial-intelligence-based Alexa voice control system for audio equipment in an embodiment of the invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in those embodiments. Evidently, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.

Fig. 1 is a flow diagram of the artificial-intelligence-based Alexa voice control method for audio equipment in an embodiment of the invention.

As shown in Fig. 1, an artificial-intelligence-based Alexa voice control method for audio equipment comprises:

S11: receiving the control voice information sent by the user;

In specific implementations of the invention, the control voice information sent by the user is received through a voice capture device.

S12: judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

In specific implementations of the invention, voiceprint recognition (VPR), also called speaker recognition, falls into two classes: speaker identification and speaker verification. The former determines which of several people spoke a given segment of speech, which is a one-out-of-many selection problem. The latter confirms whether a given segment of speech was spoken by a specified person, which is a one-to-one decision problem.
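The difference between the two classes may be illustrated by the following minimal sketch. It assumes that each utterance has already been reduced to a fixed-length feature vector and that similarity is measured by cosine similarity; the vectors, the speaker names, and the 0.8 threshold are hypothetical values chosen for illustration and are not taken from the embodiments.

```python
import math
from typing import Dict, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(sample: Sequence[float], enrolled: Dict[str, Sequence[float]]) -> str:
    """Speaker identification: choose the closest of several enrolled speakers."""
    return max(enrolled, key=lambda name: cosine(sample, enrolled[name]))


def verify(sample: Sequence[float], claimed: Sequence[float], threshold: float = 0.8) -> bool:
    """Speaker verification: accept or reject one claimed identity."""
    return cosine(sample, claimed) >= threshold


# Hypothetical enrolled voiceprints and one incoming sample.
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.9, 0.2]}
sample = [0.8, 0.2, 0.1]

print(identify(sample, enrolled))
print(verify(sample, enrolled["alice"]), verify(sample, enrolled["bob"]))
```

For the authorization check of step S12, the verification form may be the more natural fit when a single authorized user is enrolled, while identification followed by a threshold could serve when several users are authorized.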

Specifically, feature extraction for voiceprint recognition extracts and selects acoustic or linguistic features that are highly discriminative and stable for a speaker's voiceprint. These include: (1) acoustic features related to the anatomy of the human vocal mechanism (such as spectrum, cepstrum, formants, pitch, and reflection coefficients), as well as nasality, breathiness, hoarseness, and laughter; (2) semantics, rhetoric, pronunciation, and speech habits shaped by socioeconomic status, education level, birthplace, and the like; (3) personal characteristics, or features influenced by one's parents, such as prosody, rhythm, speaking rate, intonation, and volume. From the standpoint of what can be modeled mathematically, the features currently usable by automatic voiceprint recognition models include: (1) acoustic features (cepstrum); (2) lexical features (speaker-dependent word n-grams and phoneme n-grams); (3) prosodic features (pitch and energy "contours" described with n-grams); (4) language, dialect, and accent information; (5) channel information (which channel is used); and so on.

Specific recognition methods include: (1) Template matching: dynamic time warping (DTW) is used to align the training and test feature sequences; this is mainly used for fixed-phrase applications (usually text-dependent tasks). (2) Nearest neighbor: all feature vectors are retained during training; at recognition time, the K nearest training vectors are found for each vector and the decision is made from them; model storage and similarity computation are usually both very large. (3) Neural networks: these take many forms, such as multilayer perceptrons and radial basis function (RBF) networks; they can be trained explicitly to distinguish a speaker from background speakers, but the training cost is high and the models generalize poorly. (4) Hidden Markov models (HMM): typically a single-state HMM or a Gaussian mixture model (GMM) is used; this is a popular method with relatively good results. (5) VQ clustering (such as LBG): results are relatively good and the algorithmic complexity is low; combined with the HMM method it can give even better results. (6) Polynomial classifiers: these offer higher accuracy, but model storage and computation are both relatively large.
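As an illustration of method (1), the following is a minimal sketch of the classic dynamic time warping distance between two feature sequences. For brevity it assumes one scalar feature per frame and an absolute-difference local cost; a practical system would compare multi-dimensional feature vectors (for example cepstral vectors) per frame, and the function name `dtw_distance` is an assumption of this sketch.

```python
from typing import Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Cumulative cost of the best monotonic alignment between a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i frames of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: stretch a, stretch b, or advance both.
            cost[i][j] = local + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# The same phrase spoken more slowly aligns at zero cost; a different one does not.
template = [1.0, 2.0, 3.0]
print(dtw_distance(template, [1.0, 2.0, 2.0, 3.0]))
print(dtw_distance(template, [3.0, 2.0, 1.0]))
```

Because the alignment may stretch either sequence, a template and a test utterance of different durations can still be compared, which is why the method suits fixed-phrase tasks.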

Based on the above voiceprint recognition methods, it is determined whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice. If so, the method proceeds to the next step; if not, it returns to S11.

S13: if so, parsing the control voice information based on semantic recognition to obtain the user's control instruction;

In specific implementations of the invention, once voiceprint recognition has determined that the control voice information was issued by an authorized user, the control voice information is recognized with a semantic recognition algorithm. After the semantics of the control voice information are obtained, the user's control instruction is generated from those semantics.
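The embodiments do not specify a particular semantic recognition algorithm. Purely as an illustration, the sketch below assumes that the speech has already been transcribed to text and uses simple keyword rules to produce a control instruction of one of the two kinds the method distinguishes (query-and-play, or playback function switching). The phrases "play" and "switch to", the target names, and the `ControlInstruction` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlInstruction:
    kind: str      # "query_play" or "function_switch"
    argument: str  # search terms, or the name of the target function


# Hypothetical vocabulary; the embodiments do not list concrete phrases.
SWITCH_TARGETS = ("bluetooth", "radio", "aux")


def parse_transcript(text: str) -> Optional[ControlInstruction]:
    """Map a transcript to a control instruction, or None if nothing matches."""
    words = text.lower().split()
    if len(words) > 1 and words[0] == "play":
        # Everything after "play" is treated as the search query.
        return ControlInstruction("query_play", " ".join(words[1:]))
    if len(words) == 3 and words[:2] == ["switch", "to"] and words[2] in SWITCH_TARGETS:
        return ControlInstruction("function_switch", words[2])
    return None


print(parse_transcript("Play Moonlight Sonata"))
print(parse_transcript("switch to radio"))
```

A rule table of this kind is only one possible realization; a statistical or neural intent classifier could fill the same role without changing the surrounding steps.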

S14: the audio device responds to the control instruction and executes the action corresponding to the control instruction.

In specific implementations of the invention, before the audio device responds to the control instruction, the method further comprises: judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.

Further, when the control instruction is judged to be a query-and-play control instruction, the method comprises: searching the audio database of the audio device according to the query-and-play control instruction to obtain the audio data to be played; and the audio device decoding and playing the audio data to be played. Searching the audio database of the audio device according to the query-and-play control instruction comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local audio data to be played with the network audio data to be played to obtain the audio data to be played.
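The local-then-network search described above may be sketched as follows. The sketch assumes that each database can be reduced to a list of track titles matched by substring, and that merging drops network results already found locally; both the matching rule and the de-duplication are assumptions of this sketch, as are the names `search_audio`, `network_search`, and `is_online`.

```python
from typing import Callable, Iterable, List


def search_audio(
    query: str,
    local_db: Iterable[str],
    network_search: Callable[[str], Iterable[str]],
    is_online: Callable[[], bool],
) -> List[str]:
    """Search locally, add network results when connected, and merge the two."""
    needle = query.lower()
    # The local audio database is always searched first.
    results = [title for title in local_db if needle in title.lower()]
    # The network audio database is consulted only when the device is connected.
    if is_online():
        for title in network_search(query):
            if title not in results:  # assumption: duplicates are merged away
                results.append(title)
    return results


# Hypothetical data standing in for the two databases.
local_tracks = ["Blue Moon", "Moon River", "Summertime"]


def fake_network_search(query: str) -> List[str]:
    return ["Moon River", "Moonlight Sonata"]


print(search_audio("moon", local_tracks, fake_network_search, lambda: True))
print(search_audio("moon", local_tracks, fake_network_search, lambda: False))
```

Searching the local database unconditionally means that a query can still be answered when the device is offline.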

Further, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice; selection voice information issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.
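The selection dialogue may be sketched as follows. The callable `ask` is a hypothetical stand-in for speaking the feedback to the user, capturing the reply, and transcribing it; the wording of the prompt and the rule that a reply is either an item number or an exact title are likewise assumptions of this sketch.

```python
from typing import Callable, List, Optional


def build_feedback(candidates: List[str]) -> str:
    """Compose the spoken prompt that lists every candidate by number."""
    options = "; ".join(f"{i}: {title}" for i, title in enumerate(candidates, 1))
    return f"I found {len(candidates)} matches. {options}. Which one should I play?"


def choose_track(candidates: List[str], ask: Callable[[str], str]) -> Optional[str]:
    """Return the track to play, asking the user only when several were found."""
    if not candidates:
        return None
    if len(candidates) == 1:
        return candidates[0]  # a single match needs no dialogue
    reply = ask(build_feedback(candidates)).strip().lower()
    # Accept either the spoken item number or the exact title.
    if reply.isdecimal() and 1 <= int(reply) <= len(candidates):
        return candidates[int(reply) - 1]
    for title in candidates:
        if title.lower() == reply:
            return title
    return None  # the reply could not be matched to any candidate


found = ["Blue Moon", "Moon River"]
print(build_feedback(found))
print(choose_track(found, lambda prompt: "2"))
```

Returning `None` for an unmatched reply leaves it to the caller to decide whether to repeat the prompt or abandon the request.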

Further, when the control instruction is judged to be a playback function switching instruction, the method comprises: the audio device responding to the playback function switching instruction and switching its state to the functional state corresponding to the playback function switching instruction.
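The embodiments do not enumerate the functional states of the device. As an illustration only, the sketch below assumes three hypothetical states (Bluetooth, radio, and auxiliary input) and shows the device switching to the state named by the instruction while leaving its state unchanged when the target is unknown.

```python
from enum import Enum


class Function(Enum):
    # Hypothetical functional states; the embodiments do not list them.
    BLUETOOTH = "bluetooth"
    RADIO = "radio"
    AUX = "aux"


class AudioDevice:
    def __init__(self, function: Function = Function.BLUETOOTH) -> None:
        self.function = function

    def switch_function(self, target: str) -> bool:
        """Switch to the named functional state; return False if it is unknown."""
        try:
            self.function = Function(target.lower())
        except ValueError:
            # Unknown target: keep the current state untouched.
            return False
        return True


device = AudioDevice()
print(device.switch_function("Radio"), device.function)
print(device.switch_function("cassette"), device.function)
```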

Specifically, after the audio device receives the control instruction, it first judges whether the control instruction is a query-and-play control instruction or a playback function switching instruction. If it is a query-and-play control instruction, the audio database of the audio device is searched according to that instruction to obtain the audio data to be played, and the audio device decodes and plays the audio data to be played. Searching the audio database of the audio device according to the query-and-play control instruction comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local audio data to be played with the network audio data to be played to obtain the audio data to be played. When more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice; selection voice information issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.

If the control instruction is judged to be a playback function switching instruction, the audio device responds to the playback function switching instruction and switches its state to the functional state corresponding to that instruction.

As shown in Fig. 2, an artificial-intelligence-based Alexa voice control system for audio equipment comprises:

Voice receiving module 11: for receiving the control voice information sent by the user;

Voiceprint recognition module 12: for judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

Semantic parsing module 13: for parsing the control voice information based on semantic recognition to obtain the user's control instruction;

Execution module 14: for the audio device to respond to the control instruction and execute the action corresponding to the control instruction.

In specific implementations of the invention, the system further comprises, before the execution module 14, an instruction judgment module for judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.

Further, the execution module is also configured such that, when the control instruction is judged to be a query-and-play control instruction: the audio database of the audio device is searched according to the query-and-play control instruction to obtain the audio data to be played; and the audio device decodes and plays the audio data to be played. Searching the audio database of the audio device according to the query-and-play control instruction comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local audio data to be played with the network audio data to be played to obtain the audio data to be played.

Further, the execution module is also configured such that, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice; selection voice information issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.

Further, the execution module is also configured such that, when the control instruction is judged to be a playback function switching instruction, the audio device responds to the playback function switching instruction and switches its state to the functional state corresponding to the playback function switching instruction.

A person of ordinary skill in the art will appreciate that all or some of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, and the like.

The artificial-intelligence-based Alexa voice control method and system for audio equipment provided by the embodiments of the invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core ideas. At the same time, a person of ordinary skill in the art may, following the ideas of the invention, make changes to the specific implementations and the scope of application. In summary, the contents of this specification should not be construed as limiting the invention.

Claims

1. An artificial-intelligence-based Alexa voice control method for audio equipment, characterized in that the method comprises:

Receiving control voice information sent by a user;

Judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

2. The Alexa voice control method for audio equipment according to claim 1, characterized in that, before the audio device responds to the control instruction, the method further comprises:

3. The Alexa voice control method for audio equipment according to claim 2, characterized in that, when the control instruction is judged to be a query-and-play control instruction, the method comprises:

Searching the audio database of the audio device according to the query-and-play control instruction to obtain the audio data to be played;

The audio device decoding and playing the audio data to be played;

Judging whether the audio device is connected to the internet;

4. The Alexa voice control method for audio equipment according to claim 3, characterized in that, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice;

Selection voice information issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result;

5. The Alexa voice control method for audio equipment according to claim 2, characterized in that, when the control instruction is judged to be a playback function switching instruction, the method comprises:

The audio device responding to the playback function switching instruction and switching its state to the functional state corresponding to the playback function switching instruction.

6. An artificial-intelligence-based Alexa voice control system for audio equipment, characterized in that the system comprises:

Voiceprint recognition module: for judging, based on voiceprint recognition, whether the user who sent the control voice information is an authorized user permitted to control the audio device by voice;

Execution module: for the audio device to respond to the control instruction and execute the action corresponding to the control instruction.

7. The Alexa voice control system for audio equipment according to claim 6, characterized in that, before the execution module, the system further comprises:

Instruction judgment module: for judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.

8. The Alexa voice control system for audio equipment according to claim 7, characterized in that the execution module is further configured such that:

The audio device decodes and plays the audio data to be played;

It is judged whether the audio device is connected to the internet;

9. The Alexa voice control system for audio equipment according to claim 8, characterized in that the execution module is further configured such that:

When more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played returned by the query is reported to the user by voice;

10. The Alexa voice control system for audio equipment according to claim 7, characterized in that the execution module is further configured such that: