WO2019242312A1 - Wake word training method and device for a home appliance, and home appliance - Google Patents

Wake word training method and device for a home appliance, and home appliance

Info

Publication number
WO2019242312A1
Authority
WO
WIPO (PCT)
Prior art keywords
word
wake
training
feature information
home appliance
Application number
PCT/CN2019/074317
Inventor
孙裕文
谭博钊
Original Assignee
广东美的厨房电器制造有限公司
Priority claimed from CN201810628693.7A (CN109036393A)
Priority claimed from CN201810885079.9A (CN109166571B)
Application filed by 广东美的厨房电器制造有限公司
Publication of WO2019242312A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/28 Data switching networks characterised by path configuration, e.g. LAN [Local Area Networks] or WAN [Wide Area Networks]

Definitions

  • The present disclosure relates to the technical field of household appliances, and in particular to a wake word training method and device for a home appliance, and to a home appliance.
  • Voice recognition technology currently falls into two main categories.
  • One is cloud-based semantic recognition.
  • Voice signals are transmitted over the network to a server for semantic analysis and understanding, and the results are returned over the network.
  • Typical examples include Apple's Siri (voice assistant), Amazon's Echo speaker, and Microsoft Xiaobing.
  • This approach requires a network connection, so its usage scenarios are limited.
  • The other is local entry recognition, which does not require a network. An embedded high-performance processor handles voice control command words in real time. However, it can only recognize preset voice control command entries, and it must recognize a complete command entry before responding. It cannot achieve free semantic understanding, and the user experience is poor.
  • The present disclosure provides a wake word training method and device for a home appliance, and a home appliance, so that users can define their own wake words to meet their personalized needs.
  • An embodiment of the first aspect of the present disclosure provides a wakeup word training method for a home appliance, including:
  • the speech data sample of the wake-up word is saved in a custom wake-up word library.
  • the method further includes:
  • the method further includes:
  • After collecting the voice data samples of the wake word, the voice data samples are denoised.
  • the method further includes:
  • Before collecting voice data samples of the wake word, control the home appliance to enter a custom wake word mode.
  • the method further includes:
  • detecting and determining that the voice information is a custom wake-up word based on the custom wake-up thesaurus includes:
  • the method further includes:
  • In the wake word training method for a home appliance according to the embodiment of the present disclosure, voice data samples of a wake word are collected, feature information of the voice data samples is extracted, and the feature information is normalized. When the normalized feature information is detected and determined to satisfy a preset condition, the voice data samples of the wake word are saved to a custom wake word library, thereby realizing a user-defined wake word and meeting the user's personalized needs.
  • An embodiment of the second aspect of the present disclosure provides another wake word training method for a home appliance, including:
  • N is a positive integer.
  • the method further includes:
  • After the Nth training of the wake word succeeds, it is determined that the wake word is effective, and the effective wake word is saved locally.
  • the method further includes:
  • training the awake word, and detecting and determining that the training awake word is successful includes:
  • the next wake-word acquisition and training is performed until the N-th training wake-word is successful, including:
  • In the wake word training method for a home appliance according to the embodiment of the present disclosure, the home appliance is controlled to enter a custom wake word mode, the input wake word is collected, and the wake word is trained. When the training is detected and determined to be successful, the next collection and training are performed, until the Nth training of the wake word succeeds. The user can thus customize the wake word to meet personalized needs, and the trained wake word is highly accurate.
  • An embodiment of the third aspect of the present disclosure provides a wake word training apparatus for a home appliance, including:
  • a first acquisition module configured to collect speech data samples of wake words
  • An extraction module configured to extract feature information of the voice data samples
  • the first saving module is configured to save the speech data samples of the wake-up word to a custom wake-up word bank.
  • the first acquisition module is further configured to:
  • the apparatus further includes:
  • the preprocessing module is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.
  • the apparatus further includes:
  • the first control module is configured to control the home appliance to enter a custom wake-up word mode before collecting voice data samples of the wake-up word.
  • the apparatus further includes:
  • a first receiving module configured to receive input voice information
  • a recognition module configured to detect and determine that the voice information is a custom wake-up word based on the custom wake-up thesaurus;
  • the first wake-up module is configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
  • the identification module is configured to:
  • the apparatus further includes:
  • a prompting module is configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.
  • The wake word training device for a home appliance according to the embodiment of the present disclosure collects voice data samples of a wake word, extracts feature information of the voice data samples, and normalizes the feature information. When the normalized feature information is detected and determined to satisfy a preset condition, the voice data samples of the wake word are saved to a custom wake word library, thereby realizing a user-defined wake word and meeting the user's personalized needs.
  • An embodiment of the fourth aspect of the present disclosure provides another wake word training device for a home appliance, including:
  • a second control module for controlling a home appliance to enter a custom wake-up word mode
  • a second acquisition module configured to collect an input wake-up word
  • the apparatus further includes:
  • the second saving module is configured to determine that the wakeup word is effective after the Nth training wakeup word is successful, and save the valid wakeup word locally.
  • the apparatus further includes:
  • a second receiving module configured to receive an input effective wake-up word after determining that the wake-up word is valid
  • the second wake-up module is configured to wake up the home appliance according to the valid wake-up word.
  • the training module is configured to:
  • the training module is further configured to:
  • The wake word training device for a home appliance according to the embodiment of the present disclosure controls the home appliance to enter a custom wake word mode, collects the input wake word, and trains the wake word. When the training is detected and determined to be successful, the next collection and training are performed, until the Nth training of the wake word succeeds. The user can thus customize the wake word to meet personalized needs, and the trained wake word is highly accurate.
  • An embodiment of the fifth aspect of the present disclosure provides a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake word training method for a home appliance as described in the embodiment of the first aspect.
  • An embodiment of the sixth aspect of the present disclosure provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, where the processor is configured to execute the wake word training method for a home appliance according to the embodiment of the first aspect, or the wake word training method for a home appliance according to the embodiment of the second aspect.
  • FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to a first embodiment of the present disclosure
  • FIG. 2 is a schematic flowchart of training the same wake word multiple times according to an embodiment of the present disclosure
  • FIG. 3 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 2 of the present disclosure
  • FIG. 4 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 3 of the present disclosure
  • FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure
  • FIG. 6 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 5 of the present disclosure
  • FIG. 7 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 6 of the present disclosure.
  • FIG. 8 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 7 of the present disclosure.
  • FIG. 9 is a schematic flowchart of wake word training according to a specific example of the present disclosure.
  • FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 8 of the present disclosure.
  • FIG. 11 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 9 of the present disclosure.
  • FIG. 12 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 10 of the present disclosure.
  • FIG. 13 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 11 of the present disclosure
  • FIG. 14 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 12 of the present disclosure.
  • FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
  • FIG. 16 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 14 of the present disclosure.
  • FIG. 17 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 15 of the present disclosure.
  • speech recognition technology mainly includes cloud semantic recognition and local entry recognition.
  • Cloud semantic recognition must rely on the network to be used, and the use scenarios are limited.
  • Local entry recognition can only recognize pre-set voice control command entries, and cannot achieve free semantic understanding.
  • the present disclosure proposes a wake-up word training method for home appliances, which can customize local wake-up words to meet personalized needs, does not need to rely on the network, has fast response speed, and is not limited by scenarios.
  • FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to a first embodiment of the present disclosure.
  • a wake-up word training method for a home appliance includes:
  • a custom wake-up word mode is set for the home appliance device, so that the user can train a custom wake-up word that meets his own needs.
  • the user may first control the home appliance to enter the user-defined wake-up word mode.
  • the way of entering may be to trigger a physical button or issue a voice command.
  • The user can be prompted to speak the wake word that they want to set.
  • The home appliance can collect voice data samples of the wake word in a preset audio format through a voice input device such as a microphone. For example, the sound signal is collected with a sampling frequency of 16 kHz and a sample size of 16 bits. If the user does not say the wake word within 5 seconds, the user may be prompted to re-enter it.
  • Feature information can be extracted using MFCC (Mel Frequency Cepstral Coefficient) or other feature extraction algorithms.
  • speech can be divided into low frequency, intermediate frequency and high frequency. Therefore, when extracting features, feature information can be extracted separately in the low frequency range, the intermediate frequency range, and the high frequency range.
  • the weight information corresponding to the feature information of the low frequency range, the feature information of the intermediate frequency range and the feature information of the high frequency range are different.
  • the feature information may include features such as sound intensity, male voice, or female voice.
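  • As a rough illustration of extracting feature information separately in the low, intermediate, and high frequency ranges with different weights, the following sketch computes per-band spectral energy for one frame. The band edges (1000 Hz and 4000 Hz), the weights, and the function names are assumptions made for this example and are not taken from the disclosure; a naive DFT stands in for whatever feature extraction (such as MFCC) a real device would use.

```python
import cmath
import math


def band_energies(frame, sample_rate, edges=(1000.0, 4000.0)):
    """Return (low, mid, high) spectral energy of one frame using a naive DFT.

    `edges` are hypothetical band boundaries in Hz, chosen only for illustration.
    """
    n = len(frame)
    energies = [0.0, 0.0, 0.0]
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))
        if freq < edges[0]:
            band = 0
        elif freq < edges[1]:
            band = 1
        else:
            band = 2
        energies[band] += abs(coeff) ** 2
    return tuple(energies)


def weighted_features(frame, sample_rate, weights=(0.5, 0.3, 0.2)):
    """Apply a different (hypothetical) weight to each band's energy."""
    return tuple(w * e for w, e in zip(weights, band_energies(frame, sample_rate)))
```

  • With a 16 kHz sampling frequency, a 500 Hz tone lands almost entirely in the low band and a 6000 Hz tone in the high band, so the weighted triple gives a compact per-band description of the frame.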
  • S103: Normalize the feature information and determine whether the normalized feature information meets a preset condition. If yes, perform step S104; if not, perform step S105.
  • Normalization processes the feature information (by a certain algorithm) so that it is limited to a certain range, which allows the normalized feature information to be compared with the preset conditions. For example: whether the length feature is too short or too long compared with the preset length range, or whether the strength feature is too large or too small compared with the preset strength range, and so on.
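  • A minimal sketch of this normalize-then-compare step is shown below. The feature names, full-scale values, and preset ranges are hypothetical placeholders; the disclosure does not specify the normalization algorithm or the actual ranges.

```python
# Hypothetical full-scale values used to bring each feature into a common range.
FEATURE_SCALES = {"length_s": 5.0, "intensity": 32768.0}

# Hypothetical preset conditions, expressed in normalized units.
PRESET_RANGES = {"length_s": (0.1, 0.6), "intensity": (0.05, 0.9)}


def normalize(features):
    """Divide each raw feature by its full-scale value."""
    return {name: value / FEATURE_SCALES[name] for name, value in features.items()}


def check_preset_conditions(normalized):
    """Return a list of (feature, problem) pairs; an empty list means the sample passes."""
    problems = []
    for name, value in normalized.items():
        low, high = PRESET_RANGES[name]
        if value < low:
            problems.append((name, "too small"))
        elif value > high:
            problems.append((name, "too large"))
    return problems
```

  • An empty result corresponds to the branch in which the sample is saved; a non-empty result corresponds to the branch in which the sample is collected again.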
  • the training of the wake-up word is successful, that is, the voice data sample of the wake-up word is saved in the custom wake-up word library.
  • the wake-up word training method for home appliances may further include:
  • the home appliance may remind the user to re-enter the wake-up word, and thereby re-collect voice data samples of the wake-up word.
  • Training of the same wake word can be performed multiple times.
  • For example, the same wake word spoken by the user is collected three times; feature information is extracted and normalized each time, and the normalized feature information is then filtered to find the feature information that meets the conditions (that is, the successfully trained wake word).
  • The trained wake words are stored in a local custom wake word library.
  • In the wake word training method for a home appliance according to the embodiment of the present disclosure, voice data samples of a wake word are collected, feature information of the voice data samples is extracted, and the feature information is normalized. When the normalized feature information is detected and determined to satisfy the preset condition, the voice data samples of the wake word are saved to the custom wake word library, thereby realizing a user-defined wake word and meeting the user's personalized needs.
  • the wake-up word training method for a home appliance may further include:
  • the collected voice data samples need to be denoised first to avoid noise effects and improve accuracy.
  • the wake-up word training method for a home appliance may further include:
  • S401 Receive input voice information.
  • voice information input by a user may be received.
  • S402: Identify whether the voice information is a custom wake word based on the custom wake word library. If yes, perform step S403; if no, perform step S404.
  • whether the voice information is a custom wake-up word can be identified based on the custom wake-up thesaurus.
  • Specifically, the feature information of the voice information can be extracted and normalized, and a dynamic time warping (DTW) algorithm can then be used to compare it with the feature information of every wake word in the custom wake word library. For example, the similarity between the feature information B of the voice information and the feature information of each of the wake words A1, A2, and A3 in the custom wake word library is calculated.
  • the comparison result with the highest similarity is obtained. If the comparison result with the highest similarity satisfies the set value, it is determined that the voice information is a custom wakeup word; if the comparison result with the highest similarity does not satisfy the set value, it is determined that the voice information is not a custom wakeup word.
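  • The comparison described above can be sketched as follows, assuming that the dynamic-programming time-alignment algorithm referred to in the text is classic dynamic time warping (DTW). The mapping from DTW distance to a similarity score, the set value of 0.5, and the function names are assumptions for illustration only.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of equal-dimension feature vectors."""
    inf = float("inf")
    cost = [[inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def match_wake_word(query, library, min_similarity=0.5):
    """Return the name of the best-matching wake word, or None if below the set value.

    Similarity is taken as 1 / (1 + DTW distance), a hypothetical choice.
    """
    best_name, best_similarity = None, 0.0
    for name, template in library.items():
        similarity = 1.0 / (1.0 + dtw_distance(query, template))
        if similarity > best_similarity:
            best_name, best_similarity = name, similarity
    return best_name if best_similarity >= min_similarity else None
```

  • Because DTW aligns sequences of different lengths, a wake word spoken more slowly than the stored template can still match it, which is the usual reason for choosing this kind of comparison for short spoken commands.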
  • S403 Generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.
  • a wake-up instruction is generated, and the home appliance is woken up according to the wake-up instruction.
  • If the voice information is not a custom wake word, the user is prompted to re-enter the voice information, thereby improving the success rate of waking up the home appliance.
  • the local customized wake-up word dictionary is used to identify whether the voice information input by the user is a customized wake-up word. Compared with traditional network recognition, the response speed is faster, and it is not limited by the network, and the usage scenarios are more abundant.
  • FIG. 5 is a flowchart of a wake-up word training method for home appliances provided in Embodiment 4 of the present disclosure.
  • a wake-up word training method for a home appliance includes:
  • a custom wake-up word mode is set for the home appliance device, so that the user can train a custom wake-up word that meets his own needs.
  • the user before training the user-defined wake-up word, the user may first control the home appliance to enter the user-defined wake-up word mode.
  • the way of entering may be to trigger a physical button or issue a voice command.
  • S502 Collect the input wake-up word.
  • The user can be prompted to speak the wake word that they want to set.
  • the user speaks the wake word.
  • the home appliance can collect voice data samples of the wake word in a preset audio format through a voice input device such as a microphone.
  • For example, the sound signal is collected with a sampling frequency of 16 kHz and a sample size of 16 bits. If the user does not say the wake word within 5 seconds, the user may be prompted to re-enter it.
  • S503: Train the wake word and determine whether the training is successful. If yes, go to step S504; if no, go to step S502.
  • The feature information of the wake word may be extracted first and then compared with a preset standard to determine whether it meets the standard.
  • a suitable maximum time length can be set for the wake-up word.
  • The quality and consistency of the training corpus need to be strictly ensured. Therefore, throughout the training process, whether the wake word meets the preset standard is judged in terms of the volume of the voice, the length of the voice, the similarity of the voice, the complexity of the voice, and the environmental noise.
  • The collected wake word is a time-domain signal, which can be converted into a frequency-domain signal (that is, feature information is extracted) and then compared and analyzed.
  • Judgment of voice volume: First, four thresholds are predefined according to experimental results: the maximum volume vh, the minimum volume vl, the maximum allowed count of values above the maximum volume vhm, and the maximum allowed count of values below the minimum volume vlm. Then the number of values in the training corpus above the maximum volume, vhr, and the number below the minimum volume, vlr, are counted. If vhr > vhm, the sound is too loud; if vlr > vlm, the sound is too quiet. If vhr ≤ vhm and vlr ≤ vlm, the voice volume meets the standard.
  • Judgment of speech length: This is divided into two parts, an over-long judgment and a too-short judgment. Both judgments are based on the fixed length of the training corpus, combined with the signal-to-noise ratio of the front-end speech and the back-end speech. If the power of the back-end speech does not decrease relative to the power of the front-end speech, the speech is too long; if the power of the back-end speech decreases relative to the power of the front-end speech, the speech is too short.
  • Judgment of speech similarity: A similarity threshold is predefined, and the cosine distance is then used to judge the similarity between different utterances. If the similarity is greater than the threshold, the utterances are similar; otherwise, they are dissimilar.
  • Judgment of speech complexity: The peak characteristics of the training corpus are used. If the number of peaks is greater than a predefined threshold, the training corpus is qualified; otherwise, it is unqualified.
  • Judgment of environmental noise: A noise threshold is set according to environmental characteristics, and the training corpus is analyzed. If the noise of the training corpus is lower than the threshold, the environment is suitable; otherwise, the noise is too large.
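  • Three of the judgments above (volume, similarity, and complexity) can be sketched as follows. The symbols vh, vl, vhm, and vlm follow the text; applying the counts to per-frame volume values, the peak definition (a strict local maximum), and all function names are assumptions made for this example.

```python
import math


def judge_volume(frame_volumes, vh, vl, vhm, vlm):
    """Count values above vh (vhr) and below vl (vlr) and compare with vhm and vlm."""
    vhr = sum(1 for v in frame_volumes if v > vh)
    vlr = sum(1 for v in frame_volumes if v < vl)
    if vhr > vhm:
        return "too loud"
    if vlr > vlm:
        return "too quiet"
    return "ok"


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def count_peaks(values):
    """Count strict local maxima, a simple stand-in for the peak characteristic."""
    return sum(1 for i in range(1, len(values) - 1)
               if values[i - 1] < values[i] > values[i + 1])


def is_complex_enough(envelope, min_peaks):
    """The corpus qualifies when the number of peaks exceeds the threshold."""
    return count_peaks(envelope) > min_peaks
```

  • Each helper returns a result that maps directly onto one of the user-facing outcomes, so a training routine can report which specific check failed rather than a generic failure.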
  • N is a positive integer.
  • For example, if the first training of the wake word is successful, the second training can be performed. If the first training is unsuccessful, the first training is performed again. In addition, if the number of consecutive unsuccessful trainings reaches 3, a prompt message can be generated.
  • The message content can be "Wake word training failed, please enter other wake words for training", so as to remind the user to switch to a wake word that is easier to train successfully.
  • S602: When the feature information of the wake word input the Mth time is detected and determined to meet a preset standard, calculate the similarity between the feature information of the wake word input the Mth time and the feature information of the wake words input the previous M-1 times.
  • S603: When the similarity between the feature information of the wake word input the Mth time and the feature information of the wake words input the previous M-1 times is detected and determined to be greater than a preset similarity, determine that the training of the wake word is successful.
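  • The multi-round procedure can be sketched as a loop that accepts the Mth input only if it meets the preset standard and is similar enough to every previously accepted input, and that stops after N successful rounds. The cosine-based similarity, the callback `meets_standard`, and the choice to reset the failure counter after issuing a prompt are assumptions for illustration; the prompt text is the one quoted in the description.

```python
import math

FAILURE_PROMPT = "Wake word training failed, please enter other wake words for training"


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def train_wake_word(attempts, n_rounds, meets_standard, min_similarity, max_failures=3):
    """Consume feature vectors until n_rounds trainings succeed.

    Returns (accepted_features, prompts). Training is complete when
    len(accepted_features) == n_rounds.
    """
    accepted, prompts, failures = [], [], 0
    for features in attempts:
        ok = meets_standard(features) and all(
            cosine(features, previous) > min_similarity for previous in accepted
        )
        if ok:
            accepted.append(features)
            failures = 0
            if len(accepted) == n_rounds:
                break
        else:
            failures += 1
            if failures == max_failures:
                prompts.append(FAILURE_PROMPT)
                failures = 0
    return accepted, prompts
```

  • Comparing each new input against all earlier accepted inputs, rather than only the most recent one, keeps the stored samples mutually consistent at the cost of a few extra comparisons per round.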
  • the process of training the wake word each time may specifically adopt a method of recording sound multiple times.
  • For example, the sound signals input by the user three times can be collected, the feature information of the three wake word recordings can be extracted, and their average can be used as the feature information of the wake word for the first training, thereby improving the success rate of wake word training.
  • In the wake word training method for a home appliance according to the embodiment of the present disclosure, the home appliance is controlled to enter a custom wake word mode, the input wake word is collected, and the wake word is trained. When the training is detected and determined to be successful, the next training is performed, until the Nth training of the wake word succeeds. This realizes user-defined wake words, meets the personalized needs of users, and yields trained wake words with high accuracy.
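  • Averaging the feature information of several recordings can be sketched as an element-wise mean. The sketch assumes each recording has already been reduced to a fixed-length feature vector, which the disclosure does not state explicitly; the function name is hypothetical.

```python
def average_features(recordings):
    """Element-wise mean of equal-length feature vectors from several recordings."""
    if not recordings:
        raise ValueError("at least one recording is required")
    length = len(recordings[0])
    if any(len(r) != length for r in recordings):
        raise ValueError("all feature vectors must have the same length")
    return [sum(column) / len(recordings) for column in zip(*recordings)]
```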
  • the wake-up word training method for a home appliance may further include:
  • the effective wake-up words are saved in the local custom wake-up thesaurus.
  • the wake-up word training method for a home appliance may further include:
  • The feature information of the input effective wake word can be extracted and then compared with the feature information stored in the custom wake word library. If the similarity between the two is higher than a preset value, a wake-up instruction may be generated, and the home appliance may be woken up according to the wake-up instruction. Otherwise, waking up the home appliance fails.
  • the speech recognition device is installed in the cooking equipment so that the cooking equipment has a speech recognition function.
  • The factory setting of the cooking device is that the command word for starting custom wake word training is "change a name".
  • The voice module of the voice recognition device is activated.
  • The user says "change a name", and the cooking device enters the custom wake word training mode.
  • The cooking device can play "Please say a new wake-up word after a beep".
  • The user speaks a new wake word according to the voice prompt.
  • The cooking device receives the new wake word and determines whether it is successfully trained. If the training is successful, the cooking device may give the voice prompt "Training is successful, please say the wake word again"; if the training is not successful, the cooking device may give the voice prompt "Sound **, please say the wake word again".
  • ** can be "too small", "too big", "too long", "too short", "too simple", "inconsistent with the last training result", and so on.
  • The above training steps are repeated, and when the third training is successful, the cooking device may give the voice prompt "Training is completed, and the new wake-up word has taken effect", thereby ending the training.
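  • The prompt selection in this example can be sketched as a small function. The prompt strings are the ones quoted above; the function name, its parameters, and the constant for three required rounds are hypothetical and only illustrate the flow.

```python
REQUIRED_ROUNDS = 3  # the example ends after the third successful training


def training_prompt(successes, failure_reason=None):
    """Pick the voice prompt to play after one training attempt.

    `successes` is the number of successful trainings so far, including the
    current one; `failure_reason` is text such as "too short" when the
    current attempt failed, or None when it succeeded.
    """
    if failure_reason is not None:
        return f"Sound {failure_reason}, please say the wake word again"
    if successes >= REQUIRED_ROUNDS:
        return "Training is completed, and the new wake-up word has taken effect"
    return "Training is successful, please say the wake word again"
```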
  • the training wake-up word process can be shown in FIG. 9.
  • the present disclosure also proposes a wake word training apparatus for a home appliance.
  • FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance provided in Embodiment 8 of the present disclosure.
  • The wake word training device for a home appliance may include a first collection module 110, an extraction module 120, a judging module 130, and a first saving module 140.
  • The first collection module 110 is configured to collect voice data samples of wake words.
  • The first collection module 110 is further configured to re-collect voice data samples of the wake word when the normalized feature information is detected and determined not to satisfy the preset condition.
  • the extraction module 120 is configured to extract feature information of a voice data sample.
  • the judging module 130 is configured to normalize the feature information, and detect and determine that the normalized feature information meets a preset condition.
  • the first saving module 140 is configured to save a voice data sample of the wake-up word into a custom wake-up word bank.
  • the wake-up word training apparatus for a home appliance may further include a pre-processing module 150.
  • the preprocessing module 150 is configured to perform denoising processing on the voice data samples after collecting the voice data samples of the wake word.
  • the wake-up word training apparatus for a home appliance may further include a first control module 160.
  • The first control module 160 is configured to control the home appliance to enter a custom wake word mode before collecting voice data samples of the wake word.
  • the wake-up word training device for a home appliance may further include a first receiving module 210, a recognition module 220, and a first wake-up module 230.
  • the first receiving module 210 is configured to receive input voice information.
  • the recognition module 220 is configured to detect and determine that the voice information is a custom wake-up word based on the custom wake-up thesaurus.
  • The recognition module 220 is configured to: extract feature information of the voice information; normalize the feature information of the voice information, and use a dynamic time warping (DTW) algorithm to compare it with the feature information of the wake words in the custom wake word library; obtain the comparison result with the highest similarity; and, when the comparison result with the highest similarity is detected and determined to satisfy the set value, determine that the voice information is a custom wake word.
  • the first wake-up module 230 is configured to detect and determine that the voice information is a custom wake-up word, generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.
  • the wake-up word training apparatus for a home appliance may further include a prompting module 240.
  • the prompting module 240 is configured to detect and determine that the voice information is not a custom wake-up word, and prompt to re-enter the voice information.
  • The wake word training apparatus for a home appliance according to the embodiment of the present disclosure collects voice data samples of a wake word, extracts feature information of the voice data samples, and normalizes the feature information. When the normalized feature information is detected and determined to satisfy the preset condition, the voice data samples of the wake word are saved to the custom wake word library, thereby realizing a user-defined wake word and meeting the user's personalized needs.
  • The present disclosure also proposes another wake word training apparatus for a home appliance.
  • FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
  • The wake word training device for a home appliance may include a second control module 310, a second collection module 320, and a training module 330.
  • the second control module 310 is configured to control a home appliance to enter a custom wake-up word mode.
  • the second collection module 320 is configured to collect an input wake-up word.
  • the training module 330 is configured to train the wake-up words, detect and determine that the training wake-up words are successful, the collection module 320 performs the next collection, and the training module 330 performs the next training until the N-th training wake-up word succeeds.
  • the training module 330 is specifically configured to: extract feature information of the awake word; detect and determine that the feature information of the awake word meets a preset standard, and determine that training of the awake word is successful.
  • the training module 330 is further configured to: extract feature information of the wake-up word inputted at the Mth time; detect and determine that the feature information of the wake-up word inputted at the Mth time conforms to a preset standard, and convert the Mth time
  • the feature information of the awake words input is calculated similarly to the feature information of the awake words inputted before M-1 times; the feature information of the awake word inputted the Mth times and the wakeup word of the first M-1 times are detected and determined.
  • the similarity of the feature information is greater than the preset similarity, and it is determined that the training awakening word is successful.
  • the wake-up word training apparatus for a home appliance may further include a second saving module 340.
  • the second saving module 340 is configured to determine that the wakeup word is effective after the Nth training wakeup word is successful, and save the valid wakeup word locally.
  • the wake-up word training device for a home appliance may further include a second receiving module 350 and a second wake-up module 360.
  • the second receiving module 350 is configured to receive an inputted effective wake-up word after determining that the wake-up word is valid.
  • the second wake-up module 360 is configured to wake up the home appliance according to the effective wake-up word.
  • the wake word training device for a home appliance in the embodiment of the present disclosure controls a home appliance to enter a custom wake word mode, collects the entered wake word, and trains the wake word to detect and determine that the wake word is successfully trained for the next training. , Until the N-th training awakening word succeeds, so as to realize user-defined awakening words, meet the personalized needs of users, and the training awakening words have high accuracy.
  • the present disclosure also proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance as proposed by the foregoing embodiments of the present disclosure.
  • the present disclosure also provides a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor.
  • the processor is configured to execute the wake-up word training method for a home appliance as proposed in the foregoing embodiments of the present disclosure.
  • the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, the features defined as "first" and "second" may explicitly or implicitly include at least one of the features. In the description of the present disclosure, the meaning of "plurality" is at least two, for example, two, three, etc., unless expressly and specifically defined otherwise.
  • any process or method description in a flowchart or otherwise described herein can be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process, and the scope of the preferred embodiments of the present disclosure includes additional implementations in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
  • the logic and/or steps represented in a flowchart or otherwise described herein, for example a sequenced list of executable instructions that can be considered to implement a logical function, can be embodied in any computer-readable medium for use by an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them), or for use in combination with such an instruction execution system, apparatus, or device.
  • a "computer-readable medium" may be any device that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • more specific examples (a non-exhaustive list) of computer-readable media include the following: electrical connections (electronic devices) with one or more wirings, portable computer disk cartridges (magnetic devices), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), fiber optic devices, and portable compact disc read-only memory (CDROM).
  • the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it where necessary, and then stored in computer memory.
  • portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof.
  • multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system.
  • for example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gate circuits for implementing logic functions on data signals, application-specific integrated circuits (ASICs) having suitable combinational logic gate circuits, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
  • a person of ordinary skill in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware.
  • the program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or software functional modules. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
  • the aforementioned storage medium may be a read-only memory, a magnetic disk, or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

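A wake-up word training method and apparatus for a home appliance, and a home appliance. The method includes: collecting a voice data sample of a wake-up word (S101); extracting feature information of the voice data sample (S102); normalizing the feature information, and detecting and determining that the normalized feature information satisfies a preset condition; and saving the voice data sample of the wake-up word into a custom wake-up thesaurus (S104). By collecting a voice data sample of the wake-up word, extracting its feature information, normalizing the feature information and, upon detecting and determining that the normalized feature information satisfies the preset condition, saving the voice data sample of the wake-up word into the custom wake-up thesaurus, the method enables user-defined wake-up words and meets the personalized needs of users.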

Description

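Wake-Up Word Training Method and Apparatus for Home Appliance, and Home Appliance
Cross-Reference to Related Applications
The present disclosure claims priority to Chinese Patent Application No. 201810628693.7, filed on June 19, 2018 by Guangdong Midea Kitchen Appliances Manufacturing Co., Ltd. (广东美的厨房电器制造有限公司) and entitled "Wake-Up Word Training Method and Apparatus for Home Appliance, and Home Appliance", and to Chinese Patent Application No. 201810885079.9, filed on August 6, 2018 under the same title.
Technical Field
The present disclosure relates to the technical field of household appliances, and in particular to a wake-up word training method and apparatus for a home appliance, and a home appliance.
Background
With the continuous progress of technology, products built on speech recognition are applied in an ever wider range of fields, including in-vehicle systems, robots, home services, banking services, medical services, industrial control, and so on. At present, speech recognition technology falls mainly into two categories. The first is cloud-based semantic recognition, in which the voice signal is transmitted over a network to a server for semantic analysis and understanding, and the result is then returned over the network. Typical examples include Apple's Siri (voice assistant), Amazon's Echo speaker, and Microsoft XiaoIce. This approach, however, works only with a network connection, which limits its usage scenarios. The second is local entry recognition, which needs no network: a high-performance processor embedded in the device processes voice control command words in real time. It can, however, recognize only preset voice control command entries and responds only after a complete entry has been recognized, so it cannot achieve free-form semantic understanding, and the user experience is poor.
Summary
The present disclosure proposes a wake-up word training method and apparatus for a home appliance, and a home appliance, so as to enable user-defined wake-up words and meet the personalized needs of users.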
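An embodiment of a first aspect of the present disclosure proposes a wake-up word training method for a home appliance, including:
collecting a voice data sample of a wake-up word;
extracting feature information of the voice data sample;
normalizing the feature information, and detecting and determining that the normalized feature information satisfies a preset condition; and
saving the voice data sample of the wake-up word into a custom wake-up thesaurus.
As a first possible implementation of the embodiment of the first aspect, the method further includes:
upon detecting and determining that the normalized feature information does not satisfy the preset condition, re-collecting the voice data sample of the wake-up word.
As a second possible implementation of the embodiment of the first aspect, the method further includes:
after collecting the voice data sample of the wake-up word, denoising the voice data sample.
As a third possible implementation of the embodiment of the first aspect, the method further includes:
before collecting the voice data sample of the wake-up word, controlling the home appliance to enter a custom wake-up word mode.
As a fourth possible implementation of the embodiment of the first aspect, the method further includes:
receiving input voice information;
detecting and determining, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word; and
generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.
As a fifth possible implementation of the embodiment of the first aspect, detecting and determining, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word includes:
extracting feature information of the voice information;
normalizing the feature information of the voice information and, using a dynamic time planning algorithm, comparing the feature information of the voice information with the feature information of the wake-up words in the custom wake-up thesaurus;
obtaining the comparison result with the highest similarity; and
upon detecting and determining that the comparison result with the highest similarity satisfies a set value, determining that the voice information is a custom wake-up word.
As a sixth possible implementation of the embodiment of the first aspect, the method further includes:
upon detecting and determining that the voice information is not a custom wake-up word, prompting the user to re-enter the voice information.
The wake-up word training method for a home appliance according to the embodiments of the present disclosure collects a voice data sample of the wake-up word, extracts feature information of the voice data sample, normalizes the feature information and, upon detecting and determining that the normalized feature information satisfies the preset condition, saves the voice data sample of the wake-up word into the custom wake-up thesaurus, thereby enabling user-defined wake-up words and meeting the personalized needs of users.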
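An embodiment of a second aspect of the present disclosure proposes another wake-up word training method for a home appliance, including:
controlling the home appliance to enter a custom wake-up word mode;
collecting an input wake-up word;
training the wake-up word, and detecting and determining that the training of the wake-up word succeeded; and
performing the next wake-up word collection and training, until the N-th training of the wake-up word succeeds, N being a positive integer.
As a first possible implementation of the embodiment of the second aspect, the method further includes:
after the N-th training of the wake-up word succeeds, determining that the wake-up word takes effect, and saving the effective wake-up word locally.
As a second possible implementation of the embodiment of the second aspect, the method further includes:
after determining that the wake-up word has taken effect, receiving an input effective wake-up word; and
waking up the home appliance according to the effective wake-up word.
As a third possible implementation of the embodiment of the second aspect, training the wake-up word, and detecting and determining that the training of the wake-up word succeeded, includes:
extracting feature information of the wake-up word; and
upon detecting and determining that the feature information of the wake-up word meets a preset standard, determining that the training of the wake-up word succeeded.
As a fourth possible implementation of the embodiment of the second aspect, performing the next wake-up word collection and training, until the N-th training of the wake-up word succeeds, includes:
extracting feature information of the wake-up word input the M-th time;
upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculating the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, determining that the training of the wake-up word succeeded.
The wake-up word training method for a home appliance according to the embodiments of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word and, upon detecting and determining that the training succeeded, performs the next training until the N-th training of the wake-up word succeeds, thereby enabling user-defined wake-up words, meeting the personalized needs of users, and yielding trained wake-up words of high accuracy.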
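An embodiment of a third aspect of the present disclosure proposes a wake-up word training apparatus for a home appliance, including:
a first collection module, configured to collect a voice data sample of a wake-up word;
an extraction module, configured to extract feature information of the voice data sample;
a judgment module, configured to normalize the feature information, and to detect and determine that the normalized feature information satisfies a preset condition; and
a first saving module, configured to save the voice data sample of the wake-up word into a custom wake-up thesaurus.
As a first possible implementation of the embodiment of the third aspect, the first collection module is further configured to:
upon detecting and determining that the normalized feature information does not satisfy the preset condition, re-collect the voice data sample of the wake-up word.
As a second possible implementation of the embodiment of the third aspect, the apparatus further includes:
a preprocessing module, configured to denoise the voice data sample after the voice data sample of the wake-up word is collected.
As a third possible implementation of the embodiment of the third aspect, the apparatus further includes:
a first control module, configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
As a fourth possible implementation of the embodiment of the third aspect, the apparatus further includes:
a first receiving module, configured to receive input voice information;
a recognition module, configured to detect and determine, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word; and
a first wake-up module, configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
As a fifth possible implementation of the embodiment of the third aspect, the recognition module is configured to:
extract feature information of the voice information;
normalize the feature information of the voice information and, using a dynamic time planning algorithm, compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up thesaurus;
obtain the comparison result with the highest similarity; and
upon detecting and determining that the comparison result with the highest similarity satisfies a set value, determine that the voice information is a custom wake-up word.
As a sixth possible implementation of the embodiment of the third aspect, the apparatus further includes:
a prompting module, configured to, upon detecting and determining that the voice information is not a custom wake-up word, prompt the user to re-enter the voice information.
The wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure collects a voice data sample of the wake-up word, extracts feature information of the voice data sample, normalizes the feature information and, upon detecting and determining that the normalized feature information satisfies the preset condition, saves the voice data sample of the wake-up word into the custom wake-up thesaurus, thereby enabling user-defined wake-up words and meeting the personalized needs of users.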
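An embodiment of a fourth aspect of the present disclosure proposes another wake-up word training apparatus for a home appliance, including:
a second control module, configured to control the home appliance to enter a custom wake-up word mode;
a second collection module, configured to collect an input wake-up word; and
a training module, configured to train the wake-up word and, upon detecting and determining that the training of the wake-up word succeeded, have the collection module perform the next collection and the training module perform the next training, until the N-th training of the wake-up word succeeds, N being a positive integer.
As a first possible implementation of the embodiment of the fourth aspect, the apparatus further includes:
a second saving module, configured to, after the N-th training of the wake-up word succeeds, determine that the wake-up word takes effect and save the effective wake-up word locally.
As a second possible implementation of the embodiment of the fourth aspect, the apparatus further includes:
a second receiving module, configured to receive an input effective wake-up word after it is determined that the wake-up word has taken effect; and
a second wake-up module, configured to wake up the home appliance according to the effective wake-up word.
As a third possible implementation of the embodiment of the fourth aspect, the training module is configured to:
extract feature information of the wake-up word; and
upon detecting and determining that the feature information of the wake-up word meets a preset standard, determine that the training of the wake-up word succeeded.
As a fourth possible implementation of the embodiment of the fourth aspect, the training module is further configured to:
extract feature information of the wake-up word input the M-th time;
upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, determine that the training of the wake-up word succeeded.
The wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word and, upon detecting and determining that the training succeeded, performs the next training until the N-th training of the wake-up word succeeds, thereby enabling user-defined wake-up words, meeting the personalized needs of users, and yielding trained wake-up words of high accuracy.
An embodiment of a fifth aspect of the present disclosure proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance according to the embodiment of the first aspect, or implements the wake-up word training method for a home appliance according to the embodiment of the second aspect.
An embodiment of a sixth aspect of the present disclosure proposes a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor being configured to execute the wake-up word training method for a home appliance according to the embodiment of the first aspect, or to execute the wake-up word training method for a home appliance according to the embodiment of the second aspect.
Additional aspects and advantages of the present disclosure will be given in part in the following description, and in part will become apparent from the following description or be learned through practice of the present disclosure.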
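Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present disclosure more clearly, the drawings needed in the embodiments are briefly introduced below. The drawings described below obviously show some embodiments of the present disclosure, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 1 of the present disclosure;
FIG. 2 is a schematic flowchart of training the same wake-up word multiple times according to an embodiment of the present disclosure;
FIG. 3 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 2 of the present disclosure;
FIG. 4 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 3 of the present disclosure;
FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure;
FIG. 6 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 5 of the present disclosure;
FIG. 7 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 6 of the present disclosure;
FIG. 8 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 7 of the present disclosure;
FIG. 9 is a schematic flowchart of wake-up word training in a specific example of the present disclosure;
FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 8 of the present disclosure;
FIG. 11 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 9 of the present disclosure;
FIG. 12 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 10 of the present disclosure;
FIG. 13 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 11 of the present disclosure;
FIG. 14 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 12 of the present disclosure;
FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure;
FIG. 16 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 14 of the present disclosure;
FIG. 17 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 15 of the present disclosure.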
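Detailed Description
Embodiments of the present disclosure are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote the same or similar elements, or elements having the same or similar functions, throughout. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present disclosure and should not be construed as limiting it.
At present, speech recognition technology mainly takes two forms: cloud-based semantic recognition and local entry recognition. Cloud-based semantic recognition works only with a network connection, which limits its usage scenarios. Local entry recognition can recognize only preset voice control command entries and cannot achieve free-form semantic understanding. The present disclosure therefore proposes a wake-up word training method for a home appliance that allows a local wake-up word to be customized to meet personalized needs, requires no network, responds quickly, and is not restricted by usage scenario.
The wake-up word training method and apparatus for a home appliance, and the home appliance, according to embodiments of the present disclosure are described below with reference to the drawings.
FIG. 1 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 1 of the present disclosure.
As shown in FIG. 1, the wake-up word training method for a home appliance includes:
S101: collect a voice data sample of a wake-up word.
In the field of intelligent voice interaction, a user can wake up a device in a sleep state with a wake-up word. That wake-up word is usually predefined by the manufacturer and cannot be changed, so it cannot meet users' personalized needs. This embodiment therefore provides the home appliance with a custom wake-up word mode, in which the user can train a custom wake-up word that suits the user's own needs.
In an embodiment of the present disclosure, before training a custom wake-up word, the user may first control the home appliance to enter the custom wake-up word mode, for example by pressing a physical button or issuing a voice command. After the home appliance enters the custom wake-up word mode, the user may be prompted for the wake-up word to be set. The user speaks the wake-up word within a predetermined period, for example 5 seconds. The home appliance may then collect the voice data sample of the wake-up word in a preset audio format through a voice input device such as a microphone, for example in a format with a sampling frequency of 16 kHz and a transmission rate of 16 bit. If the user does not speak the wake-up word within 5 seconds, the user may be prompted to input it again.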
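The capture format in the example above can be made concrete with a little arithmetic. The following is a minimal sketch, assuming that the "16 bit" figure denotes the sample width, that the recording is mono, and that a simple amplitude threshold (`silence_level`, a hypothetical parameter) is enough to tell whether the user spoke at all within the window; none of these details is specified in the text.

```python
SAMPLE_RATE_HZ = 16_000   # sampling frequency given in the text
SAMPLE_WIDTH_BYTES = 2    # assumption: "16 bit" is the width of one sample
WINDOW_SECONDS = 5        # prompt window given in the text


def window_size_bytes(seconds=WINDOW_SECONDS, channels=1):
    """Return the raw PCM size of one capture window (mono by assumption)."""
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * channels * seconds


def has_speech(samples, silence_level=500):
    """Return True if any sample magnitude exceeds the hypothetical silence level."""
    return any(abs(s) > silence_level for s in samples)
```

Under these assumptions a full 5-second window occupies 160,000 bytes, and a window for which `has_speech` returns False would lead to the re-input prompt.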
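S102: extract feature information of the voice data sample.
The feature information may be extracted using MFCC (Mel-frequency cepstral coefficients) or another feature extraction algorithm. Speech can be divided by frequency into low, medium, and high frequencies. When extracting features, feature information may therefore be extracted separately in the low-frequency range, the medium-frequency range, and the high-frequency range, with different weights assigned to the feature information of each range. The feature information may include characteristics such as sound intensity and whether the voice is male or female.
S103: normalize the feature information and judge whether the normalized feature information satisfies a preset condition; if so, perform step S104; if not, perform step S105.
Normalization processes the feature information (through a certain algorithm) so that it is limited to a certain range, which allows the normalized feature information to be compared against the preset condition. For example: whether the length feature is too short or too long compared with a preset length range, or whether the intensity feature is too large or too small compared with a preset intensity range, and so on.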
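The normalize-then-check step can be sketched as follows. This is only an illustration: the text does not name the normalization algorithm, so min-max scaling is assumed here, and the range values in `meets_preset_condition` are hypothetical placeholders rather than values from the disclosure.

```python
def min_max_normalize(values):
    """Scale values into [0, 1]; a constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def meets_preset_condition(duration_s, peak_level,
                           duration_range=(0.5, 3.0),
                           level_range=(0.1, 0.9)):
    """Check the length and intensity features against hypothetical preset ranges."""
    length_ok = duration_range[0] <= duration_s <= duration_range[1]
    level_ok = level_range[0] <= peak_level <= level_range[1]
    return length_ok and level_ok
```

A sample for which the check returns True would proceed to S104; otherwise the flow would go to S105.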
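S104: save the voice data sample of the wake-up word into the custom wake-up thesaurus.
In an embodiment of the present disclosure, if the normalized feature information satisfies the preset condition, the wake-up word has been trained successfully, and the voice data sample of the wake-up word is saved into the custom wake-up thesaurus.
In addition, the wake-up word training method for a home appliance may further include:
S105: re-collect the voice data sample of the wake-up word.
In an embodiment of the present disclosure, if the normalized feature information does not satisfy the preset condition, the home appliance may prompt the user to input the wake-up word again, so that the voice data sample of the wake-up word is re-collected.
Of course, to improve accuracy, the same wake-up word may be trained multiple times. As shown in FIG. 2, the same wake-up word spoken by the user is collected three times, feature information is extracted from each collection and normalized, and the normalized feature information is then filtered to select the feature information that satisfies the condition (the successfully trained wake-up words). Finally, the trained wake-up word is stored in the local custom wake-up thesaurus.
The wake-up word training method for a home appliance according to the embodiments of the present disclosure collects a voice data sample of the wake-up word, extracts feature information of the voice data sample, normalizes the feature information and, upon detecting and determining that the normalized feature information satisfies the preset condition, saves the voice data sample of the wake-up word into the custom wake-up thesaurus, thereby enabling user-defined wake-up words and meeting the personalized needs of users.
In another embodiment of the present disclosure, as shown in FIG. 3, the wake-up word training method for a home appliance may further include:
S106: after collecting the voice data sample of the wake-up word, denoise the voice data sample.
Because of environmental noise and other interfering sounds, the collected voice data sample needs to be denoised first when the wake-up word spoken by the user is collected, so as to avoid the influence of noise and improve accuracy.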
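In yet another embodiment of the present disclosure, as shown in FIG. 4, the wake-up word training method for a home appliance may further include:
S401: receive input voice information.
Once the wake-up word has been customized successfully, the custom wake-up word can be used to wake up the home appliance.
In an embodiment of the present disclosure, voice information input by the user may be received.
S402: recognize, based on the custom wake-up thesaurus, whether the voice information is a custom wake-up word; if so, perform step S403; if not, perform step S404.
After the voice information is received, whether it is a custom wake-up word can be recognized based on the custom wake-up thesaurus.
Specifically, feature information of the voice information may be extracted and normalized, and a dynamic time planning algorithm may then be used to compare the feature information of the voice information with the feature information of all the wake-up words in the custom wake-up thesaurus. For example, the similarity between the feature information B of the voice information and the feature information of each of the wake-up words A1, A2, and A3 in the custom wake-up thesaurus is calculated.
The comparison result with the highest similarity is then obtained. If the comparison result with the highest similarity satisfies the set value, the voice information is determined to be a custom wake-up word; if it does not satisfy the set value, the voice information is determined not to be a custom wake-up word.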
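The comparison against every stored wake-up word can be illustrated with a short sketch. It assumes that the "dynamic time planning algorithm" refers to dynamic time warping (DTW), that each utterance is a sequence of equal-dimension feature vectors, and that the "set value" is an upper bound on the DTW distance (a smaller distance meaning a higher similarity). The names `templates` and `max_distance` are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic DTW distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # skip a frame of a
                                 cost[i][j - 1],      # skip a frame of b
                                 cost[i - 1][j - 1])  # match both frames
    return cost[n][m]


def best_match(query, templates, max_distance):
    """Return (name, distance) of the closest template, or None if it is too far."""
    name, dist = min(
        ((key, dtw_distance(query, tpl)) for key, tpl in templates.items()),
        key=lambda pair: pair[1],
    )
    return (name, dist) if dist <= max_distance else None
```

With templates named A1, A2, and A3 as in the example, a non-None result would correspond to step S403 and a None result to step S404.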
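S403: generate a wake-up instruction, and wake up the home appliance according to the wake-up instruction.
In an embodiment of the present disclosure, if the voice information is a custom wake-up word, a wake-up instruction is generated and the home appliance is woken up according to the wake-up instruction.
S404: prompt the user to re-enter the voice information.
In an embodiment of the present disclosure, if the voice information is not a custom wake-up word, the user is prompted to re-enter the voice information, which improves the success rate of waking up the home appliance.
This embodiment uses the local custom wake-up thesaurus to recognize whether the voice information input by the user is a custom wake-up word. Compared with conventional networked recognition, it responds faster, is not limited by the network, and supports a wider range of usage scenarios.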
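The present disclosure also proposes another wake-up word training method for a home appliance. FIG. 5 is a flowchart of a wake-up word training method for a home appliance according to Embodiment 4 of the present disclosure.
As shown in FIG. 5, the wake-up word training method for a home appliance includes:
S501: control the home appliance to enter a custom wake-up word mode.
In the field of intelligent voice interaction, a user can wake up a device in a sleep state with a wake-up word. That wake-up word is usually predefined by the manufacturer and cannot be changed, so it cannot meet users' personalized needs. This embodiment therefore provides the home appliance with a custom wake-up word mode, in which the user can train a custom wake-up word that suits the user's own needs.
In an embodiment of the present disclosure, before training a custom wake-up word, the user may first control the home appliance to enter the custom wake-up word mode, for example by pressing a physical button or issuing a voice command.
S502: collect an input wake-up word.
After the home appliance enters the custom wake-up word mode, the user may be prompted for the wake-up word to be set. The user speaks the wake-up word within a predetermined period, for example 5 seconds. The home appliance may then collect the voice data sample of the wake-up word in a preset audio format through a voice input device such as a microphone, for example in a format with a sampling frequency of 16 kHz and a transmission rate of 16 bit. If the user does not speak the wake-up word within 5 seconds, the user may be prompted to input it again.
S503: train the wake-up word and judge whether the training succeeded; if so, perform step S504; if not, perform step S502.
In an embodiment of the present disclosure, feature information of the wake-up word may first be extracted and then compared against a preset standard to judge whether the feature information meets the preset standard.
For example, a suitable maximum duration may be set for the wake-up word in light of users' habits.
During training, the quality and consistency of the training corpus (the wake-up word) must be strictly guaranteed. Throughout the training process, whether the wake-up word meets the preset standard therefore needs to be judged in terms of loudness, speech length, speech similarity, speech complexity, environmental noise, and so on.
The collected wake-up word is a time-domain signal, which may be converted into a frequency-domain signal (extracting the feature information) and then compared and analyzed.
Judging loudness: four predefined thresholds are first set according to experimental results, namely the maximum volume vh, the minimum volume vl, the maximum allowed count of values above the maximum volume vhm, and the maximum allowed count of values below the minimum volume vlm. The number of values in the training corpus above the maximum volume, vhr, and the number below the minimum volume, vlr, are then counted. If vhr > vhm, the sound is too loud; if vlr > vlm, the sound is too quiet. If vhr < vhm and vlr < vlm, the loudness meets the standard.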
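The loudness rule can be written out directly from the four thresholds. In this sketch the counted values are assumed to be per-frame amplitude magnitudes, and the boundary cases vhr = vhm and vlr = vlm, which the text leaves open, are treated as acceptable; both points are assumptions.

```python
def check_loudness(levels, vh, vl, vhm, vlm):
    """Classify loudness using the thresholds vh, vl, vhm, vlm from the text."""
    vhr = sum(1 for x in levels if abs(x) > vh)  # count above the maximum volume
    vlr = sum(1 for x in levels if abs(x) < vl)  # count below the minimum volume
    if vhr > vhm:
        return "too loud"
    if vlr > vlm:
        return "too quiet"
    return "ok"  # assumption: equality with vhm or vlm is also accepted
```

The returned label could feed the spoken feedback given to the user when a training round is rejected.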
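Judging speech length: this has two parts, an over-length check and an under-length check. Both rely on the fixed length of the training corpus and on the signal-to-noise ratio of the front-end and back-end speech. If the power of the back-end speech has not weakened relative to the power of the front-end speech, the speech is too long; if the power of the back-end speech weakens prematurely relative to the power of the front-end speech, the speech is too short.
Judging speech similarity: a similarity threshold is predefined according to experimental results, and the cosine distance is used to judge the similarity between different utterances. If the similarity is greater than the threshold, the utterances are similar; otherwise they are not.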
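As a minimal sketch of the similarity judgment, the following computes cosine similarity between two equal-length feature vectors and compares it with a threshold. The default of 0.85 is borrowed from the 85% example given elsewhere in this description and is otherwise an assumption; the text says only that the threshold is set from experimental results.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def is_similar(u, v, threshold=0.85):
    """Two utterances count as similar when the similarity exceeds the threshold."""
    return cosine_similarity(u, v) > threshold
```

Because cosine similarity ignores overall scale, a louder repetition of the same word with proportionally larger features would still score close to 1.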
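Judging speech complexity: using the peak characteristics of the training corpus, the training corpus is qualified if its number of peaks is greater than a predefined threshold, and unqualified otherwise.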
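One way to picture the peak-count test is the sketch below. It assumes a peak is a strict local maximum of a sequence of values (for example a smoothed energy envelope) that also exceeds a hypothetical `min_height`; the text does not define how peaks are detected.

```python
def count_peaks(values, min_height=0.0):
    """Count strict local maxima that rise above min_height."""
    return sum(
        1
        for i in range(1, len(values) - 1)
        if values[i] > values[i - 1]
        and values[i] > values[i + 1]
        and values[i] > min_height
    )


def is_complex_enough(values, min_peaks):
    """The corpus is qualified when the peak count exceeds the predefined threshold."""
    return count_peaks(values) > min_peaks
```

A very short or monotonous wake-up word yields few peaks and would be rejected as too simple.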
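Judging environmental noise: a noise threshold is set according to the characteristics of the environment, and the training corpus is analyzed. If the noise in the training corpus is below the threshold, the environment is suitable; otherwise, the noise is too high.
S504: perform the next wake-up word collection and training, until the N-th training of the wake-up word succeeds.
Here, N is a positive integer.
That is, if the first training of the wake-up word succeeds, the second training can proceed. If the first training does not succeed, the first training is repeated. In addition, if training fails 3 times in a row, a prompt may be generated, with content such as "Wake-up word training failed, please input another wake-up word for training", to remind the user to switch to a wake-up word that is easier to train successfully.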
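The control flow of repeated rounds can be sketched as a small state machine. The sketch assumes N = 3 required successes (the value used in the cooking-device example in this description) and takes the outcome of each round as a boolean; the return labels are hypothetical.

```python
def run_training(round_results, required_successes=3, max_consecutive_failures=3):
    """Walk through round outcomes (True = round succeeded) and report the final state."""
    successes = 0
    consecutive_failures = 0
    for ok in round_results:
        if ok:
            successes += 1
            consecutive_failures = 0
            if successes == required_successes:
                return "effective"  # the N-th training succeeded
        else:
            consecutive_failures += 1  # the same round is repeated
            if consecutive_failures == max_consecutive_failures:
                return "suggest another wake-up word"
    return "incomplete"
```

A failed round does not discard earlier successes in this sketch; it only repeats the current round, which matches the description of redoing the first training when it fails.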
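In an embodiment of the present disclosure, the M-th training, as shown in FIG. 6, may specifically include the following steps:
S601: extract feature information of the wake-up word input the M-th time.
S602: upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times.
S603: upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than the preset similarity, determine that the training of the wake-up word succeeded.
Suppose M = 5. In the fifth training, the similarity between the feature information of the wake-up word input the fifth time and that of each of the wake-up words input the first, second, third, and fourth times must be calculated. All 4 similarities must be greater than the preset similarity, for example 85%, before the fifth training of the wake-up word is determined to have succeeded.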
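The "all previous inputs" rule for the M-th round reduces to a single `all(...)` check. In this sketch the similarity measure is passed in as a function, since the text fixes only the threshold logic; `similarity` and the 0.85 default (taken from the 85% example) are illustrative assumptions.

```python
def mth_round_succeeds(new_features, previous_features, similarity,
                       preset_similarity=0.85):
    """True when the new input is similar enough to every one of the earlier inputs."""
    return all(
        similarity(new_features, old) > preset_similarity
        for old in previous_features
    )
```

For M = 1 there are no earlier inputs, so the check passes trivially and success depends only on the preset-standard test of S602.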
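It should be understood that each training of the wake-up word may itself use multiple recordings. For example, in the first training, the sound signal input by the user may be collected three times, the feature information of these three wake-up words extracted, and their average taken as the feature information of the wake-up word for the first training, which improves the success rate of training the wake-up word.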
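Averaging the features of several recordings can be sketched as an element-wise mean. This assumes each recording has already been reduced to a feature vector of the same length, which the text does not state explicitly.

```python
def average_features(recordings):
    """Element-wise mean of equal-length feature vectors, one per recording."""
    count = len(recordings)
    return [sum(column) / count for column in zip(*recordings)]
```

The averaged vector would then stand in for the round's wake-up word in the preset-standard and similarity checks.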
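The wake-up word training method for a home appliance according to the embodiments of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word and, upon detecting and determining that the training succeeded, performs the next training until the N-th training of the wake-up word succeeds, thereby enabling user-defined wake-up words, meeting the personalized needs of users, and yielding trained wake-up words of high accuracy.
In another embodiment of the present disclosure, as shown in FIG. 7, the wake-up word training method for a home appliance may further include:
S505: after the N-th training of the wake-up word succeeds, determine that the wake-up word takes effect, and save the effective wake-up word locally.
The effective wake-up word is saved in the local custom wake-up thesaurus.
In yet another embodiment of the present disclosure, as shown in FIG. 8, the wake-up word training method for a home appliance may further include:
S506: after determining that the wake-up word has taken effect, receive an input effective wake-up word.
S507: wake up the home appliance according to the effective wake-up word.
Once the wake-up word has been customized successfully, the effective wake-up word can be used to wake up the home appliance.
Specifically, feature information of the input effective wake-up word may be extracted and compared with the feature information saved in the custom wake-up thesaurus. If the similarity between the two is higher than a preset value, a wake-up instruction may be generated and the home appliance woken up according to the wake-up instruction. Otherwise, waking up the home appliance fails.
A specific example is described below:
A speech recognition apparatus is installed in a cooking device so that the cooking device has a speech recognition function. The factory setting of the cooking device is that the command phrase for starting custom wake-up word training is "change a name" (换一个名字).
After the cooking device is powered on, the voice module of the speech recognition apparatus starts. When the user says "change a name", the cooking device enters the custom wake-up word training mode. The cooking device may then play "Please say the new wake-up word after the beep". Following this voice prompt, the user says the new wake-up word. The cooking device receives the new wake-up word and judges whether it has been trained successfully. If the training succeeds, the cooking device may give the voice prompt "Training succeeded, please say the wake-up word again"; if the training does not succeed, the cooking device may give the voice prompt "The sound is **, please say the wake-up word again", where ** may be "too quiet", "too loud", "too long", "too short", "too simple", "inconsistent with the previous training result", and so on. The above training steps are repeated, and when the third training succeeds, the cooking device may give the voice prompt "Training is complete, the new wake-up word has taken effect", which ends the training. This wake-up word training process may be as shown in FIG. 9. This method greatly improves the accuracy of the trained wake-up word, which in turn raises the recognition rate of the wake-up word and lowers the false recognition rate.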
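To implement the above embodiments, the present disclosure also proposes a wake-up word training apparatus for a home appliance.
FIG. 10 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 8 of the present disclosure.
As shown in FIG. 10, the wake-up word training apparatus for a home appliance may include a first collection module 110, an extraction module 120, a judgment module 130, and a first saving module 140.
The first collection module 110 is configured to collect a voice data sample of a wake-up word.
As a possible implementation, the first collection module 110 is further configured to, upon detecting and determining that the normalized feature information does not satisfy the preset condition, re-collect the voice data sample of the wake-up word.
The extraction module 120 is configured to extract feature information of the voice data sample.
The judgment module 130 is configured to normalize the feature information, and to detect and determine that the normalized feature information satisfies the preset condition.
The first saving module 140 is configured to save the voice data sample of the wake-up word into the custom wake-up thesaurus.
In another embodiment of the present disclosure, as shown in FIG. 11, the wake-up word training apparatus for a home appliance may further include a preprocessing module 150.
The preprocessing module 150 is configured to denoise the voice data sample after the voice data sample of the wake-up word is collected.
In yet another embodiment of the present disclosure, as shown in FIG. 12, the wake-up word training apparatus for a home appliance may further include a first control module 160.
The first control module 160 is configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
In still another embodiment of the present disclosure, as shown in FIG. 13, the wake-up word training apparatus for a home appliance may further include a first receiving module 210, a recognition module 220, and a first wake-up module 230.
The first receiving module 210 is configured to receive input voice information.
The recognition module 220 is configured to detect and determine, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word.
As a possible implementation, the recognition module 220 is configured to: extract feature information of the voice information; normalize the feature information of the voice information and, using a dynamic time planning algorithm, compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up thesaurus; obtain the comparison result with the highest similarity; and, upon detecting and determining that the comparison result with the highest similarity satisfies the set value, determine that the voice information is a custom wake-up word.
The first wake-up module 230 is configured to, upon detecting and determining that the voice information is a custom wake-up word, generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
In a specific embodiment of the present disclosure, as shown in FIG. 14, the wake-up word training apparatus for a home appliance may further include a prompting module 240.
The prompting module 240 is configured to, upon detecting and determining that the voice information is not a custom wake-up word, prompt the user to re-enter the voice information.
It should be noted that the foregoing explanation of the wake-up word training method for a home appliance also applies to the wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure; details not disclosed in the embodiments of the present disclosure are not repeated here.
The wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure collects a voice data sample of the wake-up word, extracts feature information of the voice data sample, normalizes the feature information and, upon detecting and determining that the normalized feature information satisfies the preset condition, saves the voice data sample of the wake-up word into the custom wake-up thesaurus, thereby enabling user-defined wake-up words and meeting the personalized needs of users.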
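To implement the above embodiments, the present disclosure also proposes a wake-up word training apparatus for a home appliance.
FIG. 15 is a structural block diagram of a wake-up word training apparatus for a home appliance according to Embodiment 13 of the present disclosure.
As shown in FIG. 15, the wake-up word training apparatus for a home appliance may include a second control module 310, a second collection module 320, and a training module 330.
The second control module 310 is configured to control the home appliance to enter a custom wake-up word mode.
The second collection module 320 is configured to collect an input wake-up word.
The training module 330 is configured to train the wake-up word and, upon detecting and determining that the training of the wake-up word succeeded, have the second collection module 320 perform the next collection and the training module 330 perform the next training, until the N-th training of the wake-up word succeeds.
As a possible implementation, the training module 330 is specifically configured to: extract feature information of the wake-up word; and, upon detecting and determining that the feature information of the wake-up word meets a preset standard, determine that the training of the wake-up word succeeded.
As a possible implementation, the training module 330 is further configured to: extract feature information of the wake-up word input the M-th time; upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and, upon detecting and determining that these similarities are all greater than the preset similarity, determine that the training of the wake-up word succeeded.
In another embodiment of the present disclosure, as shown in FIG. 16, the wake-up word training apparatus for a home appliance may further include a second saving module 340.
The second saving module 340 is configured to, after the N-th training of the wake-up word succeeds, determine that the wake-up word takes effect and save the effective wake-up word locally.
In yet another embodiment of the present disclosure, as shown in FIG. 17, the wake-up word training apparatus for a home appliance may further include a second receiving module 350 and a second wake-up module 360.
The second receiving module 350 is configured to receive an input effective wake-up word after it is determined that the wake-up word has taken effect.
The second wake-up module 360 is configured to wake up the home appliance according to the effective wake-up word.
It should be noted that the foregoing explanation of the wake-up word training method for a home appliance also applies to the wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure; details not disclosed in the embodiments of the present disclosure are not repeated here.
The wake-up word training apparatus for a home appliance according to the embodiments of the present disclosure controls the home appliance to enter a custom wake-up word mode, collects the input wake-up word, trains the wake-up word and, upon detecting and determining that the training succeeded, performs the next training until the N-th training of the wake-up word succeeds, thereby enabling user-defined wake-up words, meeting the personalized needs of users, and yielding trained wake-up words of high accuracy.
To implement the above embodiments, the present disclosure also proposes a non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance as proposed by the foregoing embodiments of the present disclosure.
To implement the above embodiments, the present disclosure also proposes a home appliance including a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor being configured to execute the wake-up word training method for a home appliance as proposed by the foregoing embodiments of the present disclosure.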
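In the description of this specification, descriptions referring to the terms "an embodiment", "some embodiments", "example", "specific example", or "some examples" mean that the specific features, structures, materials, or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present disclosure. In this specification, schematic uses of these terms do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not contradict one another, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.
In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Therefore, the features defined as "first" and "second" may explicitly or implicitly include at least one of the features. In the description of the present disclosure, "plurality" means at least two, for example two, three, etc., unless expressly and specifically defined otherwise.
Any process or method description in a flowchart or otherwise described herein can be understood as representing a module, fragment, or portion of code that includes one or more executable instructions for implementing steps of a custom logic function or process, and the scope of the preferred embodiments of the present disclosure includes additional implementations in which the functions may be performed out of the order shown or discussed, including in a substantially simultaneous manner or in the reverse order according to the functions involved, as should be understood by those skilled in the art to which the embodiments of the present disclosure belong.
The logic and/or steps represented in a flowchart or otherwise described herein, for example a sequenced list of executable instructions that can be considered to implement a logical function, can be embodied in any computer-readable medium for use by an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from the instruction execution system, apparatus, or device and execute them), or for use in combination with such an instruction execution system, apparatus, or device. For the purposes of this specification, a "computer-readable medium" may be any device that can contain, store, communicate, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of computer-readable media include the following: an electrical connection (electronic device) with one or more wirings, a portable computer disk cartridge (magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise suitably processing it where necessary, and then stored in computer memory.
It should be understood that portions of the present disclosure may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following techniques known in the art may be used: discrete logic circuits having logic gate circuits for implementing logic functions on data signals, application-specific integrated circuits having suitable combinational logic gate circuits, programmable gate arrays (PGAs), field-programmable gate arrays (FPGAs), etc.
A person of ordinary skill in the art can understand that all or part of the steps of the methods in the foregoing embodiments may be implemented by a program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing module, or each unit may exist separately physically, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disk, or the like. Although embodiments of the present disclosure have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present disclosure; a person of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present disclosure.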

Claims (26)

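  1. A wake-up word training method for a home appliance, characterized by comprising:
    collecting a voice data sample of a wake-up word;
    extracting feature information of the voice data sample;
    normalizing the feature information, and detecting and determining that the normalized feature information satisfies a preset condition; and
    saving the voice data sample of the wake-up word into a custom wake-up thesaurus.
  2. The method according to claim 1, characterized by further comprising:
    upon detecting and determining that the normalized feature information does not satisfy the preset condition, re-collecting the voice data sample of the wake-up word.
  3. The method according to claim 1 or 2, characterized by further comprising:
    after collecting the voice data sample of the wake-up word, denoising the voice data sample.
  4. The method according to any one of claims 1 to 3, characterized by further comprising, before collecting the voice data sample of the wake-up word:
    controlling the home appliance to enter a custom wake-up word mode.
  5. The method according to any one of claims 1 to 4, characterized by further comprising:
    receiving input voice information;
    detecting and determining, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word; and
    generating a wake-up instruction, and waking up the home appliance according to the wake-up instruction.
  6. The method according to claim 5, characterized in that detecting and determining, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word comprises:
    extracting feature information of the voice information;
    normalizing the feature information of the voice information and, using a dynamic time planning algorithm, comparing the feature information of the voice information with the feature information of the wake-up words in the custom wake-up thesaurus;
    obtaining the comparison result with the highest similarity; and
    upon detecting and determining that the comparison result with the highest similarity satisfies a set value, determining that the voice information is a custom wake-up word.
  7. The method according to claim 5 or 6, characterized by further comprising:
    upon detecting and determining that the voice information is not a custom wake-up word, prompting the user to re-enter the voice information.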
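  8. A wake-up word training method for a home appliance, characterized by comprising:
    controlling the home appliance to enter a custom wake-up word mode;
    collecting an input wake-up word;
    training the wake-up word, and detecting and determining that the training of the wake-up word succeeded; and
    performing the next wake-up word collection and training, until the N-th training of the wake-up word succeeds, N being a positive integer.
  9. The method according to claim 8, characterized by further comprising:
    after the N-th training of the wake-up word succeeds, determining that the wake-up word takes effect, and saving the effective wake-up word locally.
  10. The method according to claim 9, characterized by further comprising:
    after determining that the wake-up word has taken effect, receiving an input effective wake-up word; and
    waking up the home appliance according to the effective wake-up word.
  11. The method according to any one of claims 8 to 10, characterized in that training the wake-up word, and detecting and determining that the training of the wake-up word succeeded, comprises:
    extracting feature information of the wake-up word; and
    upon detecting and determining that the feature information of the wake-up word meets a preset standard, determining that the training of the wake-up word succeeded.
  12. The method according to any one of claims 8 to 11, characterized in that performing the next wake-up word collection and training, until the N-th training of the wake-up word succeeds, comprises:
    extracting feature information of the wake-up word input the M-th time;
    upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculating the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
    upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, determining that the training of the wake-up word succeeded.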
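  13. A wake-up word training apparatus for a home appliance, characterized by comprising:
    a first collection module, configured to collect a voice data sample of a wake-up word;
    an extraction module, configured to extract feature information of the voice data sample;
    a judgment module, configured to normalize the feature information, and to detect and determine that the normalized feature information satisfies a preset condition; and
    a first saving module, configured to save the voice data sample of the wake-up word into a custom wake-up thesaurus.
  14. The apparatus according to claim 13, characterized in that the first collection module is further configured to:
    upon detecting and determining that the normalized feature information does not satisfy the preset condition, re-collect the voice data sample of the wake-up word.
  15. The apparatus according to claim 13 or 14, characterized in that the apparatus further comprises:
    a preprocessing module, configured to denoise the voice data sample after the voice data sample of the wake-up word is collected.
  16. The apparatus according to any one of claims 13 to 15, characterized in that the apparatus further comprises:
    a first control module, configured to control the home appliance to enter a custom wake-up word mode before the voice data sample of the wake-up word is collected.
  17. The apparatus according to any one of claims 13 to 16, characterized in that the apparatus further comprises:
    a first receiving module, configured to receive input voice information;
    a recognition module, configured to detect and determine, based on the custom wake-up thesaurus, that the voice information is a custom wake-up word; and
    a first wake-up module, configured to generate a wake-up instruction and wake up the home appliance according to the wake-up instruction.
  18. The apparatus according to claim 17, characterized in that the recognition module is configured to:
    extract feature information of the voice information;
    normalize the feature information of the voice information and, using a dynamic time planning algorithm, compare the feature information of the voice information with the feature information of the wake-up words in the custom wake-up thesaurus;
    obtain the comparison result with the highest similarity; and
    upon detecting and determining that the comparison result with the highest similarity satisfies a set value, determine that the voice information is a custom wake-up word.
  19. The apparatus according to claim 17 or 18, characterized in that the apparatus further comprises:
    a prompting module, configured to, upon detecting and determining that the voice information is not a custom wake-up word, prompt the user to re-enter the voice information.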
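  20. A wake-up word training apparatus for a home appliance, characterized by comprising:
    a second control module, configured to control the home appliance to enter a custom wake-up word mode;
    a second collection module, configured to collect an input wake-up word; and
    a training module, configured to train the wake-up word and, upon detecting and determining that the training of the wake-up word succeeded, have the collection module perform the next collection and the training module perform the next training, until the N-th training of the wake-up word succeeds, N being a positive integer.
  21. The apparatus according to claim 20, characterized in that the apparatus further comprises:
    a second saving module, configured to, after the N-th training of the wake-up word succeeds, determine that the wake-up word takes effect and save the effective wake-up word locally.
  22. The apparatus according to claim 21, characterized in that the apparatus further comprises:
    a second receiving module, configured to receive an input effective wake-up word after it is determined that the wake-up word has taken effect; and
    a second wake-up module, configured to wake up the home appliance according to the effective wake-up word.
  23. The apparatus according to any one of claims 20 to 22, characterized in that the training module is configured to:
    extract feature information of the wake-up word; and
    upon detecting and determining that the feature information of the wake-up word meets a preset standard, determine that the training of the wake-up word succeeded.
  24. The apparatus according to any one of claims 20 to 23, characterized in that the training module is further configured to:
    extract feature information of the wake-up word input the M-th time;
    upon detecting and determining that the feature information of the wake-up word input the M-th time meets the preset standard, calculate the similarity between that feature information and the feature information of each of the wake-up words input the previous M-1 times; and
    upon detecting and determining that the similarities between the feature information of the wake-up word input the M-th time and the feature information of the wake-up words input the previous M-1 times are all greater than a preset similarity, determine that the training of the wake-up word succeeded.
  25. A non-transitory computer-readable storage medium having stored thereon a computer program that, when executed by a processor, implements the wake-up word training method for a home appliance according to any one of claims 1 to 7, or implements the wake-up word training method for a home appliance according to any one of claims 8 to 12.
  26. A home appliance, comprising a processor, a memory, and a computer program stored on the memory and executable on the processor, the processor being configured to execute the wake-up word training method for a home appliance according to any one of claims 1 to 7, or to execute the wake-up word training method for a home appliance according to any one of claims 8 to 12.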
PCT/CN2019/074317 2018-06-19 2019-02-01 Wake-up word training method and apparatus for home appliance, and home appliance WO2019242312A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN201810628693.7 2018-06-19
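CN201810628693.7A CN109036393A (zh) 2018-06-19 2018-06-19 Wake-up word training method and apparatus for home appliance, and home appliance
CN201810885079.9A CN109166571B (zh) 2018-08-06 2018-08-06 Wake-up word training method and apparatus for home appliance, and home appliance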
CN201810885079.9 2018-08-06

Publications (1)

Publication Number Publication Date
WO2019242312A1 true WO2019242312A1 (zh) 2019-12-26

Family

ID=68983484

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/074317 WO2019242312A1 (zh) 2018-06-19 2019-02-01 Wake-up word training method and apparatus for home appliance, and home appliance

Country Status (1)

Country Link
WO (1) WO2019242312A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104282307A (zh) * 2014-09-05 2015-01-14 中兴通讯股份有限公司 Method, apparatus, and terminal for waking up a voice control system
US20150154953A1 (en) * 2013-12-02 2015-06-04 Spansion Llc Generation of wake-up words
CN104795068A (zh) * 2015-04-28 2015-07-22 深圳市锐曼智能装备有限公司 Wake-up control method for a robot and control system thereof
US20160293168A1 (en) * 2015-03-30 2016-10-06 Opah Intelligence Ltd. Method of setting personal wake-up word by text for voice control
CN106161755A (zh) * 2015-04-20 2016-11-23 钰太芯微电子科技(上海)有限公司 Keyword voice wake-up system, wake-up method, and mobile terminal
CN107369439A (zh) * 2017-07-31 2017-11-21 北京捷通华声科技股份有限公司 Voice wake-up method and apparatus
CN109036393A (zh) * 2018-06-19 2018-12-18 广东美的厨房电器制造有限公司 Wake-up word training method and apparatus for home appliance, and home appliance
CN109166571A (zh) * 2018-08-06 2019-01-08 广东美的厨房电器制造有限公司 Wake-up word training method and apparatus for home appliance, and home appliance

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150154953A1 (en) * 2013-12-02 2015-06-04 Spansion Llc Generation of wake-up words
CN104282307A (zh) * 2014-09-05 2015-01-14 中兴通讯股份有限公司 Method, apparatus, and terminal for waking up a voice control system
US20160293168A1 (en) * 2015-03-30 2016-10-06 Opah Intelligence Ltd. Method of setting personal wake-up word by text for voice control
CN106161755A (zh) * 2015-04-20 2016-11-23 钰太芯微电子科技(上海)有限公司 Keyword voice wake-up system, wake-up method, and mobile terminal
CN104795068A (zh) * 2015-04-28 2015-07-22 深圳市锐曼智能装备有限公司 Wake-up control method for a robot and control system thereof
CN107369439A (zh) * 2017-07-31 2017-11-21 北京捷通华声科技股份有限公司 Voice wake-up method and apparatus
CN109036393A (zh) * 2018-06-19 2018-12-18 广东美的厨房电器制造有限公司 Wake-up word training method and apparatus for home appliance, and home appliance
CN109166571A (zh) * 2018-08-06 2019-01-08 广东美的厨房电器制造有限公司 Wake-up word training method and apparatus for home appliance, and home appliance

Similar Documents

Publication Publication Date Title
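WO2021093449A1 (zh) Artificial intelligence-based wake-up word detection method, apparatus, device, and medium
JP6453917B2 (ja) Voice wake-up method and apparatus
CN108320733B (zh) Voice data processing method and apparatus, storage medium, and electronic device
US10504511B2 (en) Customizable wake-up voice commands
CN105632486B (zh) Voice wake-up method and apparatus for smart hardware
CN106448663B (zh) Voice wake-up method and voice interaction apparatus
CN107481718B (zh) Speech recognition method and apparatus, storage medium, and electronic device
WO2017114201A1 (zh) Method and apparatus for executing a set operation
WO2017071182A1 (zh) Voice wake-up method, apparatus, and system
BR102018070673A2 (pt) Generating dialogue based on verification scores
JP2019533193A (ja) Voice control system, wake-up method and wake-up apparatus therefor, home appliance, and coprocessor
CN105529028A (zh) Speech parsing method and apparatus
CN109036393A (zh) Wake-up word training method and apparatus for home appliance, and home appliance
CN109166571B (zh) Wake-up word training method and apparatus for home appliance, and home appliance
CN110706707B (zh) Method, apparatus, device, and computer-readable storage medium for voice interaction
CN110808050B (zh) Speech recognition method and smart device
US20240013784A1 (en) Speaker recognition adaptation
CN112002349B (zh) Voice endpoint detection method and apparatus
US20200312305A1 (en) Performing speaker change detection and speaker recognition on a trigger phrase
CN111862943B (zh) Speech recognition method and apparatus, electronic device, and storage medium
US20190103110A1 (en) Information processing device, information processing method, and program
IT201900015506A1 (it) Method of processing an electrical signal transduced from a voice signal, and corresponding electronic device, connected network of electronic devices, and computer program product
CN115691478A (zh) Voice wake-up method and apparatus, human-computer interaction device, and storage medium
WO2019242312A1 (zh) Wake-up word training method and apparatus for home appliance, and home appliance
US20230113883A1 (en) Digital Signal Processor-Based Continued Conversation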

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19822929

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 12.05.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19822929

Country of ref document: EP

Kind code of ref document: A1