CN110473541A - An artificial-intelligence-based Alexa voice control method and system for audio equipment

An artificial-intelligence-based Alexa voice control method and system for audio equipment

Info

Publication number
CN110473541A
Authority
CN
China
Prior art keywords
played
audio data
stereo set
control
control instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910831030.XA
Other languages
Chinese (zh)
Inventor
庄少宏
曾庆法
李叶永
史作超
葛丰达
张勃然
姜秉周
何伟新
莫砺
高正彬
劳凯邦
李蕤秀
潘伟兴
吴亚新
刘湘华
刘学满
张泽远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
GUANGZHOU PANYU JUDA CAR AUDIO EQUIPMENT CO Ltd
Original Assignee
GUANGZHOU PANYU JUDA CAR AUDIO EQUIPMENT CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by GUANGZHOU PANYU JUDA CAR AUDIO EQUIPMENT CO Ltd
Priority to CN201910831030.XA
Publication of CN110473541A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/18 Speech classification or search using natural language modelling
    • G10L15/1822 Parsing for meaning understanding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses an artificial-intelligence-based Alexa voice control method and system for audio equipment. The method comprises: receiving a control voice message sent by a user; judging, based on voiceprint recognition, whether the user who sent the control voice message is an authorized user for voice control of the audio device; if so, parsing the control voice message based on semantic recognition to obtain the user's control instruction; and the audio device responding to the control instruction by executing the action corresponding to it. In embodiments of the invention, artificial intelligence is introduced into audio equipment control, improving the user's experience of controlling the audio equipment.

Description

An artificial-intelligence-based Alexa voice control method and system for audio equipment
Technical field
The present invention relates to the technical field of audio equipment control, and in particular to an artificial-intelligence-based Alexa voice control method and system for audio equipment.
Background art
Since its birth, artificial intelligence (AI) has grown increasingly mature in theory and technology, and its fields of application continue to expand. It is conceivable that the technological products brought about by artificial intelligence in the future will be "containers" of human wisdom. Artificial intelligence can simulate the information processes of human consciousness and thinking. It is not human intelligence, but it can think like a human and may even exceed human intelligence.
With the development of technology, the control of audio devices has evolved from button control to infrared control, and now even to network control. However, current control methods for audio devices can no longer meet the diversified demands of today's young people.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art. The invention provides an artificial-intelligence-based Alexa voice control method and system for audio equipment that meets users' needs for controlling audio devices and improves the user experience.
To solve the above technical problem, an embodiment of the invention provides an artificial-intelligence-based Alexa voice control method for audio equipment, the method comprising:
Receiving a control voice message sent by a user;
Judging, based on voiceprint recognition, whether the user who sent the control voice message is an authorized user for voice control of the audio device;
If so, parsing the control voice message based on semantic recognition to obtain the user's control instruction;
The audio device responding to the control instruction and executing the action corresponding to the control instruction.
Optionally, before the audio device responds to the control instruction, the method further comprises:
Judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
Optionally, when the control instruction is judged to be a query-and-play control instruction, the method comprises:
Searching the audio database of the audio device according to the query-and-play control instruction to obtain the audio data to be played;
The audio device decoding and playing the audio data to be played;
Wherein searching the audio database of the audio device according to the query-and-play control instruction comprises:
Searching the local audio database to obtain local audio data to be played;
Judging whether the audio device is connected to the internet;
If connected, searching the network audio database to obtain network audio data to be played;
Merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
Optionally, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played that was found is reported to the user by voice;
A selection voice message issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result;
The audio device decodes and plays the audio data to be played according to the user's selection result.
Optionally, when the control instruction is judged to be a playback function switching instruction, the method comprises:
The audio device responding to the playback function switching instruction and switching the state of the audio device to the functional state corresponding to the playback function switching instruction.
In addition, an embodiment of the invention also provides an artificial-intelligence-based Alexa voice control system for audio equipment, the system comprising:
Voice receiving module: for receiving a control voice message sent by a user;
Voiceprint recognition module: for judging, based on voiceprint recognition, whether the user who sent the control voice message is an authorized user for voice control of the audio device;
Semantic parsing module: for parsing the control voice message based on semantic recognition to obtain the user's control instruction;
Execution module: for the audio device to respond to the control instruction and execute the action corresponding to the control instruction.
Optionally, before the execution module, the system further comprises:
Instruction judgment module: for judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
Optionally, the execution module is further configured to:
When the control instruction is judged to be a query-and-play control instruction:
Search the audio database of the audio device according to the query-and-play control instruction to obtain the audio data to be played;
Have the audio device decode and play the audio data to be played;
Wherein searching the audio database of the audio device according to the query-and-play control instruction comprises:
Searching the local audio database to obtain local audio data to be played;
Judging whether the audio device is connected to the internet;
If connected, searching the network audio database to obtain network audio data to be played;
Merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
Optionally, the execution module is further configured to:
When more than one item of audio data to be played is found, generate voice feedback information and report the audio data to be played that was found to the user by voice;
Receive a selection voice message issued by the user in response to the voice feedback information, and parse the selection voice message to obtain the user's selection result;
Have the audio device decode and play the audio data to be played according to the user's selection result.
Optionally, the execution module is further configured to:
When the control instruction is judged to be a playback function switching instruction:
Have the audio device respond to the playback function switching instruction and switch the state of the audio device to the functional state corresponding to the playback function switching instruction.
In a specific implementation of the invention, voiceprint recognition is performed on the user's voice message, ensuring that the user can control the audio device securely and that the audio device cannot be controlled arbitrarily by outside voice messages. Adding artificial intelligence to the audio device provides the user with intelligent control of the device and improves the user experience.
Brief description of the drawings
To explain the technical solutions in the embodiments of the invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the artificial-intelligence-based Alexa voice control method for audio equipment in an embodiment of the invention;
Fig. 2 is a structural diagram of the artificial-intelligence-based Alexa voice control system for audio equipment in an embodiment of the invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are obviously only some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the invention without creative effort fall within the protection scope of the invention.
Fig. 1 is a flow diagram of the artificial-intelligence-based Alexa voice control method for audio equipment in an embodiment of the invention.
As shown in Fig. 1, an artificial-intelligence-based Alexa voice control method for audio equipment comprises:
S11: receiving a control voice message sent by a user;
In a specific implementation of the invention, the control voice message sent by the user is received through a voice capture device.
S12: judging, based on voiceprint recognition, whether the user who sent the control voice message is an authorized user for voice control of the audio device;
In a specific implementation of the invention, voiceprint recognition (Voiceprint Recognition, VPR), also called speaker recognition (Speaker Recognition), falls into two classes: speaker identification (Speaker Identification) and speaker verification (Speaker Verification). The former judges which of several people spoke a given segment of speech and is a "one-of-many" problem; the latter confirms whether a given segment of speech was spoken by a specified person and is a "one-to-one" discrimination problem.
Specifically, feature extraction for voiceprint recognition extracts and selects acoustic or linguistic features of the speaker's voiceprint that are highly separable and highly stable. These include: (1) acoustic features related to the anatomy of the human vocal mechanism (such as spectrum, cepstrum, formants, pitch, and reflection coefficients), as well as nasal sounds, breathy voice, hoarse voice, and laughter; (2) semantics, rhetoric, pronunciation, and speech habits influenced by socioeconomic status, education level, birthplace, and the like; (3) features such as rhythm, cadence, speed, intonation, and volume that are personal or influenced by the speaker's parents. From the standpoint of what can be modeled mathematically, the features currently usable in automatic voiceprint recognition models include: (1) acoustic features (cepstrum); (2) lexical features (speaker-dependent word n-grams and phoneme n-grams); (3) prosodic features (pitch and energy "gestures" described using n-grams); (4) language, dialect, and accent information; (5) channel information (which channel is used); and so on.
Specific recognition methods include: (1) template matching: dynamic time warping (DTW) is used to align the training and test feature sequences; it is mainly used for fixed-phrase applications (usually text-dependent tasks); (2) nearest neighbor: all feature vectors are retained during training, and during recognition the K nearest training vectors are found for each vector and recognition is performed accordingly; the model storage and similarity computation are usually both very large; (3) neural networks: there are many forms, such as multilayer perceptrons and radial basis function (RBF) networks, which can be explicitly trained to distinguish a speaker from background speakers; the training load is very large and the models generalize poorly; (4) hidden Markov models (HMM): usually a single-state HMM or a Gaussian mixture model (GMM) is used; this is a popular method and performs relatively well; (5) VQ clustering (such as LBG): it performs relatively well and its algorithmic complexity is not high, and combining it with the HMM method can give even better results; (6) polynomial classifiers: they achieve higher accuracy, but both model storage and computation are relatively large; and so on.
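The template matching method in item (1) can be illustrated with a minimal dynamic time warping sketch. The patent gives no implementation; the function names `dtw_distance` and `closest_template`, the use of one-dimensional feature sequences, and the absolute-difference local cost are all assumptions made for illustration.

```python
from math import inf


def dtw_distance(a, b):
    """Accumulated DTW cost between two 1-D feature sequences (illustrative)."""
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local distance
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def closest_template(sample, templates):
    """Name of the enrolled template with the smallest DTW cost to the sample."""
    return min(templates, key=lambda name: dtw_distance(sample, templates[name]))
```

Because the warping path may repeat frames, a test sequence spoken more slowly than its template can still align at zero cost, which is why the method suits fixed-phrase tasks.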
Based on the above voiceprint recognition methods, it is identified whether the user who sent the control voice message is an authorized user for voice control of the audio device; if so, the method proceeds to the next step; if not, it returns to S11.
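The authorization check in S12 amounts to a speaker verification decision. The sketch below assumes that each voice message has already been reduced to a fixed-length voiceprint vector and compares it with enrolled vectors by cosine similarity; the vector representation, the function names, and the threshold of 0.8 are hypothetical and are not taken from the patent.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = sqrt(sum(x * x for x in u)) * sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def is_authorized(voiceprint, enrolled, threshold=0.8):
    """True if the voiceprint is close enough to any enrolled user's vector.

    `enrolled` maps a user name to that user's reference vector (assumed format).
    """
    return any(
        cosine_similarity(voiceprint, reference) >= threshold
        for reference in enrolled.values()
    )
```

A caller would proceed to semantic parsing only when `is_authorized` returns `True`, and otherwise go back to waiting for the next voice message, mirroring the return to S11.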
S13: if so, parsing the control voice message based on semantic recognition to obtain the user's control instruction;
In a specific implementation of the invention, once voiceprint recognition has determined that the control voice message was issued by an authorized user, the control voice message is recognized using a semantic recognition algorithm; after the semantics of the control voice message are obtained, the user's control instruction is generated from those semantics.
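As a rough illustration of turning recognized semantics into a control instruction, the sketch below assumes the voice message has already been transcribed to text and uses simple prefix rules. The `ControlInstruction` type, the trigger phrases "play" and "switch to", and the list of function modes are hypothetical; the patent does not specify the semantic recognition algorithm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlInstruction:
    kind: str      # "query_play" or "switch_function"
    argument: str  # search query, or name of the target function mode


# Hypothetical playback functions an audio device might offer.
FUNCTION_MODES = ("bluetooth", "radio", "aux", "usb")


def parse_instruction(transcript: str) -> Optional[ControlInstruction]:
    """Map a transcript to a control instruction, or None if nothing matches."""
    text = transcript.lower().strip()
    if text.startswith("switch to "):
        target = text[len("switch to "):].strip()
        if target in FUNCTION_MODES:
            return ControlInstruction("switch_function", target)
    if text.startswith("play "):
        return ControlInstruction("query_play", text[len("play "):].strip())
    return None
```

The `kind` field carries the distinction between a query-and-play control instruction and a playback function switching instruction, so the later judgment step can branch on it.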
S14: the audio device responds to the control instruction and executes the action corresponding to the control instruction.
In a specific implementation of the invention, before the audio device responds to the control instruction, the method further comprises: judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
Further, when the control instruction is judged to be a query-and-play control instruction: the audio database of the audio device is searched according to the query-and-play control instruction to obtain the audio data to be played, and the audio device decodes and plays the audio data to be played. Searching the audio database of the audio device according to the query-and-play control instruction comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
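The local-then-network search and merge can be sketched as follows. The databases are modeled as plain lists of track titles and matching is a case-insensitive substring test; both are assumptions, as is the choice to drop network results that duplicate local ones, which the patent does not address.

```python
def find_tracks(query, local_db, network_db=None, online=False):
    """Search the local database, then the network database if online, and merge.

    Local matches come first; network matches already found locally are skipped
    (an assumed merge policy).
    """
    needle = query.lower()
    results = [title for title in local_db if needle in title.lower()]
    if online and network_db is not None:
        for title in network_db:
            if needle in title.lower() and title not in results:
                results.append(title)
    return results
```

When the device is offline the function simply returns the local matches, so playback can still proceed without a network connection.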
Further, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played that was found is reported to the user by voice; a selection voice message issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.
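The feedback-and-selection exchange might look like the sketch below, again assuming transcribed text rather than raw audio. The prompt wording, the ordinal vocabulary, and the function names are hypothetical.

```python
def feedback_prompt(candidates):
    """Build the text that would be spoken back to the user (assumed wording)."""
    options = "; ".join(
        f"{number}. {title}" for number, title in enumerate(candidates, start=1)
    )
    return f"I found {len(candidates)} matches: {options}. Which one should I play?"


def resolve_selection(reply, candidates):
    """Pick a candidate from the user's reply by ordinal word or by title."""
    text = reply.lower()
    words = text.split()
    ordinals = {"first": 0, "second": 1, "third": 2}
    for word, index in ordinals.items():
        if word in words and index < len(candidates):
            return candidates[index]
    for title in candidates:
        if title.lower() in text:
            return title
    return None  # no usable selection; the caller may ask again
```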
Further, when the control instruction is judged to be a playback function switching instruction: the audio device responds to the playback function switching instruction and switches the state of the audio device to the functional state corresponding to the playback function switching instruction.
Specifically, after the audio device receives the control instruction, it first judges whether the control instruction is a query-and-play control instruction or a playback function switching instruction. If it is a query-and-play control instruction, the audio database of the audio device is searched according to the instruction to obtain the audio data to be played, and the audio device decodes and plays that audio data. The search comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local and network audio data to be played to obtain the audio data to be played. When more than one item of audio data to be played is found, voice feedback information is generated and the audio data that was found is reported to the user by voice; a selection voice message issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.
If the control instruction is judged to be a playback function switching instruction, the audio device responds to the playback function switching instruction and switches the state of the audio device to the functional state corresponding to the playback function switching instruction.
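Function switching is essentially a state change on the device. The minimal sketch below models it with a hypothetical `AudioDevice` class; the mode names and the decision to reject unknown targets without changing state are assumptions, not details from the patent.

```python
class AudioDevice:
    """Toy model of an audio device with a single playback function state."""

    MODES = ("bluetooth", "radio", "aux", "usb")  # hypothetical functions

    def __init__(self, mode="bluetooth"):
        self.mode = mode

    def switch_function(self, target):
        """Switch to the requested function; return False if it is unknown."""
        if target not in self.MODES:
            return False  # leave the current state unchanged
        self.mode = target
        return True
```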
In a specific implementation of the invention, voiceprint recognition is performed on the user's voice message, ensuring that the user can control the audio device securely and that the audio device cannot be controlled arbitrarily by outside voice messages. Adding artificial intelligence to the audio device provides the user with intelligent control of the device and improves the user experience.
Fig. 2 is a structural diagram of the artificial-intelligence-based Alexa voice control system for audio equipment in an embodiment of the invention.
As shown in Fig. 2, an artificial-intelligence-based Alexa voice control system for audio equipment comprises:
Voice receiving module 11: for receiving a control voice message sent by a user;
In a specific implementation of the invention, the control voice message sent by the user is received through a voice capture device.
Voiceprint recognition module 12: for judging, based on voiceprint recognition, whether the user who sent the control voice message is an authorized user for voice control of the audio device;
In a specific implementation of the invention, voiceprint recognition (Voiceprint Recognition, VPR), also called speaker recognition (Speaker Recognition), falls into two classes: speaker identification (Speaker Identification) and speaker verification (Speaker Verification). The former judges which of several people spoke a given segment of speech and is a "one-of-many" problem; the latter confirms whether a given segment of speech was spoken by a specified person and is a "one-to-one" discrimination problem.
Specifically, feature extraction for voiceprint recognition extracts and selects acoustic or linguistic features of the speaker's voiceprint that are highly separable and highly stable. These include: (1) acoustic features related to the anatomy of the human vocal mechanism (such as spectrum, cepstrum, formants, pitch, and reflection coefficients), as well as nasal sounds, breathy voice, hoarse voice, and laughter; (2) semantics, rhetoric, pronunciation, and speech habits influenced by socioeconomic status, education level, birthplace, and the like; (3) features such as rhythm, cadence, speed, intonation, and volume that are personal or influenced by the speaker's parents. From the standpoint of what can be modeled mathematically, the features currently usable in automatic voiceprint recognition models include: (1) acoustic features (cepstrum); (2) lexical features (speaker-dependent word n-grams and phoneme n-grams); (3) prosodic features (pitch and energy "gestures" described using n-grams); (4) language, dialect, and accent information; (5) channel information (which channel is used); and so on.
Specific recognition methods include: (1) template matching: dynamic time warping (DTW) is used to align the training and test feature sequences; it is mainly used for fixed-phrase applications (usually text-dependent tasks); (2) nearest neighbor: all feature vectors are retained during training, and during recognition the K nearest training vectors are found for each vector and recognition is performed accordingly; the model storage and similarity computation are usually both very large; (3) neural networks: there are many forms, such as multilayer perceptrons and radial basis function (RBF) networks, which can be explicitly trained to distinguish a speaker from background speakers; the training load is very large and the models generalize poorly; (4) hidden Markov models (HMM): usually a single-state HMM or a Gaussian mixture model (GMM) is used; this is a popular method and performs relatively well; (5) VQ clustering (such as LBG): it performs relatively well and its algorithmic complexity is not high, and combining it with the HMM method can give even better results; (6) polynomial classifiers: they achieve higher accuracy, but both model storage and computation are relatively large; and so on.
Based on the above voiceprint recognition methods, it is identified whether the user who sent the control voice message is an authorized user for voice control of the audio device; if so, the process proceeds to the next step; if not, it returns to S11.
Semantic parsing module 13: for parsing the control voice message based on semantic recognition to obtain the user's control instruction;
In a specific implementation of the invention, once voiceprint recognition has determined that the control voice message was issued by an authorized user, the control voice message is recognized using a semantic recognition algorithm; after the semantics of the control voice message are obtained, the user's control instruction is generated from those semantics.
Execution module 14: for the audio device to respond to the control instruction and execute the action corresponding to the control instruction.
In a specific implementation of the invention, before the execution module 14, the system further comprises an instruction judgment module: for judging whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
Further, the execution module is also configured so that, when the control instruction is judged to be a query-and-play control instruction: the audio database of the audio device is searched according to the query-and-play control instruction to obtain the audio data to be played, and the audio device decodes and plays the audio data to be played. Searching the audio database of the audio device according to the query-and-play control instruction comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
Further, the execution module is also configured so that, when more than one item of audio data to be played is found, voice feedback information is generated and the audio data to be played that was found is reported to the user by voice; a selection voice message issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.
Further, the execution module is also configured so that, when the control instruction is judged to be a playback function switching instruction, the audio device responds to the playback function switching instruction and switches the state of the audio device to the functional state corresponding to the playback function switching instruction.
Specifically, after the audio device receives the control instruction, it first judges whether the control instruction is a query-and-play control instruction or a playback function switching instruction. If it is a query-and-play control instruction, the audio database of the audio device is searched according to the instruction to obtain the audio data to be played, and the audio device decodes and plays that audio data. The search comprises: searching the local audio database to obtain local audio data to be played; judging whether the audio device is connected to the internet; if connected, searching the network audio database to obtain network audio data to be played; and merging the local and network audio data to be played to obtain the audio data to be played. When more than one item of audio data to be played is found, voice feedback information is generated and the audio data that was found is reported to the user by voice; a selection voice message issued by the user in response to the voice feedback information is received and parsed to obtain the user's selection result; and the audio device decodes and plays the audio data to be played according to the user's selection result.
If the control instruction is judged to be a playback function switching instruction, the audio device responds to the playback function switching instruction and switches the state of the audio device to the functional state corresponding to the playback function switching instruction.
In a specific implementation of the invention, voiceprint recognition is performed on the user's voice message, ensuring that the user can control the audio device securely and that the audio device cannot be controlled arbitrarily by outside voice messages. Adding artificial intelligence to the audio device provides the user with intelligent control of the device and improves the user experience.
Those of ordinary skill in the art will appreciate that all or part of the steps in the methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may include read-only memory (ROM, Read-Only Memory), random access memory (RAM, Random Access Memory), a magnetic disk, an optical disc, and the like.
The artificial-intelligence-based Alexa voice control method and system for audio equipment provided by the embodiments of the invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. For those of ordinary skill in the art, changes may be made to the specific embodiments and the scope of application in accordance with the idea of the invention. In summary, the contents of this specification should not be construed as limiting the invention.

Claims (10)

1. An artificial-intelligence-based Alexa voice control method for an audio device, characterized in that the method comprises:
receiving control voice information sent by a user;
determining, based on voiceprint recognition, whether the user who sent the control voice information is a user authorized to control the audio device by voice;
if so, parsing the control voice information based on semantic recognition to obtain the user's control instruction;
the audio device responding to the control instruction and performing the action corresponding to the control instruction.
2. The Alexa voice control method for an audio device according to claim 1, characterized in that, before the audio device responds to the control instruction, the method further comprises:
determining whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
3. The Alexa voice control method for an audio device according to claim 2, characterized in that, when the control instruction is determined to be a query-and-play control instruction, the method comprises:
searching the audio database of the audio device according to the query-and-play control instruction to obtain audio data to be played;
the audio device decoding and playing the audio data to be played;
wherein searching the audio database of the audio device according to the query-and-play control instruction comprises:
searching a local audio database to obtain local audio data to be played;
determining whether the audio device is connected to the Internet;
if it is connected, searching a network audio database to obtain network audio data to be played;
merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
4. The Alexa voice control method for an audio device according to claim 3, characterized in that, when there is more than one item of audio data to be played, voice feedback information is generated and the audio data to be played that was found by the query is reported to the user by voice;
selection voice information given by the user in response to the voice feedback is received and parsed to obtain a user selection result;
the audio device decodes and plays the audio data to be played according to the user selection result.
5. The Alexa voice control method for an audio device according to claim 2, characterized in that, when the control instruction is determined to be a playback function switching instruction, the method comprises:
the audio device responding to the playback function switching instruction and switching the state of the audio device to the functional state corresponding to the playback function switching instruction.
6. An artificial-intelligence-based Alexa voice control system for an audio device, characterized in that the system comprises:
a voice receiving module, configured to receive control voice information sent by a user;
a voiceprint recognition module, configured to determine, based on voiceprint recognition, whether the user who sent the control voice information is a user authorized to control the audio device by voice;
a semantic parsing module, configured to parse the control voice information based on semantic recognition to obtain the user's control instruction;
an execution module, configured to cause the audio device to respond to the control instruction and perform the action corresponding to the control instruction.
7. The Alexa voice control system for an audio device according to claim 6, characterized in that the system further comprises, before the execution module:
an instruction determination module, configured to determine whether the control instruction is a query-and-play control instruction or a playback function switching instruction.
8. The Alexa voice control system for an audio device according to claim 7, characterized in that the execution module is further configured so that:
when the control instruction is determined to be a query-and-play control instruction:
the audio database of the audio device is searched according to the query-and-play control instruction to obtain audio data to be played;
the audio device decodes and plays the audio data to be played;
wherein searching the audio database of the audio device according to the query-and-play control instruction comprises:
searching a local audio database to obtain local audio data to be played;
determining whether the audio device is connected to the Internet;
if it is connected, searching a network audio database to obtain network audio data to be played;
merging the local audio data to be played and the network audio data to be played to obtain the audio data to be played.
9. The Alexa voice control system for an audio device according to claim 8, characterized in that the execution module is further configured so that:
when there is more than one item of audio data to be played, voice feedback information is generated and the audio data to be played that was found by the query is reported to the user by voice;
selection voice information given by the user in response to the voice feedback is received and parsed to obtain a user selection result;
the audio device decodes and plays the audio data to be played according to the user selection result.
10. The Alexa voice control system for an audio device according to claim 7, characterized in that the execution module is further configured so that:
when the control instruction is determined to be a playback function switching instruction:
the audio device responds to the playback function switching instruction and switches the state of the audio device to the functional state corresponding to the playback function switching instruction.
CN201910831030.XA 2019-09-02 2019-09-02 A kind of sound equipment Alexa sound control method and system based on artificial intelligence Pending CN110473541A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910831030.XA CN110473541A (en) 2019-09-02 2019-09-02 A kind of sound equipment Alexa sound control method and system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910831030.XA CN110473541A (en) 2019-09-02 2019-09-02 A kind of sound equipment Alexa sound control method and system based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN110473541A true CN110473541A (en) 2019-11-19

Family

ID=68514860

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910831030.XA Pending CN110473541A (en) 2019-09-02 2019-09-02 A kind of sound equipment Alexa sound control method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN110473541A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111600782A (en) * 2020-04-28 2020-08-28 百度在线网络技术(北京)有限公司 Control method and device of intelligent voice equipment, electronic equipment and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886161A (en) * 2015-12-16 2017-06-23 美的集团股份有限公司 The control method of audio amplifier, system and audio amplifier
CN107247768A (en) * 2017-06-05 2017-10-13 北京智能管家科技有限公司 Method for ordering song by voice, device, terminal and storage medium
CN108735205A (en) * 2018-04-17 2018-11-02 上海康斐信息技术有限公司 A kind of control method and intelligent sound box of intelligent sound box
CN108769790A (en) * 2018-06-29 2018-11-06 百度在线网络技术(北京)有限公司 Method and apparatus for control terminal
CN108831489A (en) * 2018-06-21 2018-11-16 四川斐讯信息技术有限公司 A kind of speaker control method and system
CN108877790A (en) * 2018-05-21 2018-11-23 江西午诺科技有限公司 Speaker control method, device, readable storage medium storing program for executing and mobile terminal
CN109410951A (en) * 2018-11-21 2019-03-01 广州番禺巨大汽车音响设备有限公司 Audio controlling method, system and stereo set based on Alexa voice control
CN109412910A (en) * 2018-11-20 2019-03-01 三星电子(中国)研发中心 The method and apparatus for controlling smart home device
CN109737521A (en) * 2018-03-07 2019-05-10 北京三五二环保科技有限公司 Air purifier with voice control function

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886161A (en) * 2015-12-16 2017-06-23 美的集团股份有限公司 The control method of audio amplifier, system and audio amplifier
CN107247768A (en) * 2017-06-05 2017-10-13 北京智能管家科技有限公司 Method for ordering song by voice, device, terminal and storage medium
CN109737521A (en) * 2018-03-07 2019-05-10 北京三五二环保科技有限公司 Air purifier with voice control function
CN108735205A (en) * 2018-04-17 2018-11-02 上海康斐信息技术有限公司 A kind of control method and intelligent sound box of intelligent sound box
CN108877790A (en) * 2018-05-21 2018-11-23 江西午诺科技有限公司 Speaker control method, device, readable storage medium storing program for executing and mobile terminal
CN108831489A (en) * 2018-06-21 2018-11-16 四川斐讯信息技术有限公司 A kind of speaker control method and system
CN108769790A (en) * 2018-06-29 2018-11-06 百度在线网络技术(北京)有限公司 Method and apparatus for control terminal
CN109412910A (en) * 2018-11-20 2019-03-01 三星电子(中国)研发中心 The method and apparatus for controlling smart home device
CN109410951A (en) * 2018-11-21 2019-03-01 广州番禺巨大汽车音响设备有限公司 Audio controlling method, system and stereo set based on Alexa voice control

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111600782A (en) * 2020-04-28 2020-08-28 百度在线网络技术(北京)有限公司 Control method and device of intelligent voice equipment, electronic equipment and storage medium
CN111600782B (en) * 2020-04-28 2021-05-18 百度在线网络技术(北京)有限公司 Control method and device of intelligent voice equipment, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US11908468B2 (en) Dialog management for multiple users
Yamagishi et al. Robust speaker-adaptive HMM-based text-to-speech synthesis
US9484030B1 (en) Audio triggered commands
CN107767861B (en) Voice awakening method and system and intelligent terminal
US11538478B2 (en) Multiple virtual assistants
US11830485B2 (en) Multiple speech processing system with synthesized speech styles
WO2017071182A1 (en) Voice wakeup method, apparatus and system
Shriberg Higher-level features in speaker recognition
Prahallad et al. Sub-phonetic modeling for capturing pronunciation variations for conversational speech synthesis
US11887580B2 (en) Dynamic system response configuration
US11302329B1 (en) Acoustic event detection
US11715472B2 (en) Speech-processing system
CN106653002A (en) Literal live broadcasting method and platform
Savchenko et al. Towards the creation of reliable voice control system based on a fuzzy approach
US20240071385A1 (en) Speech-processing system
US11783824B1 (en) Cross-assistant command processing
Ons et al. Fast vocabulary acquisition in an NMF-based self-learning vocal user interface
BenZeghiba et al. User-customized password speaker verification using multiple reference and background models
CN110473541A (en) A kind of sound equipment Alexa sound control method and system based on artificial intelligence
WO2023154427A1 (en) Voice adaptation using synthetic speech processing
WO2023107244A1 (en) Multiple wakeword detection
US11887583B1 (en) Updating models with trained model update objects
US11763809B1 (en) Access to multiple virtual assistants
US11735178B1 (en) Speech-processing system
US8024191B2 (en) System and method of word lattice augmentation using a pre/post vocalic consonant distinction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191119