CN110444190A - Speech processing method, device, terminal device and storage medium - Google Patents
Speech processing method, device, terminal device and storage medium
- Publication number
- CN110444190A CN110444190A CN201910746794.9A CN201910746794A CN110444190A CN 110444190 A CN110444190 A CN 110444190A CN 201910746794 A CN201910746794 A CN 201910746794A CN 110444190 A CN110444190 A CN 110444190A
- Authority
- CN
- China
- Prior art keywords
- voice
- speech
- data
- voice information
- voice data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
- G10L13/08—Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a speech processing method, a device, a terminal device and a computer-readable storage medium. Voice information in the environment is acquired, and voice data is determined in a preset speech database according to the voice information; text information received by a preset interface is extracted, and target voice data is looked up in the voice data based on the text information; according to a speech synthesis instruction, the target voice data is synthesized into a voice sequence. The invention enables speech recognition and speech synthesis to be performed without being limited by factors such as scene and context, improves the efficiency of speech processing, and performs speech synthesis and output based on user customization and individual demand, improving the performance of speech processing.
Description
Technical field
The present invention relates to the field of speech analysis technology, and in particular to a speech processing method, device, terminal device and computer-readable storage medium.
Background technique
The development of computer technology and digital signal processing has facilitated the development and practical application of speech analysis technology. Waveform-concatenation speech synthesis based on unit selection, thanks to improvements in computing power and storage capacity, has been able to use larger-scale sound libraries and finer unit-selection strategies, significantly improving the sound quality, timbre and naturalness of synthesized speech. Another mainstream speech synthesis technique, parametric speech synthesis based on hidden Markov models (HMM), has also won high praise from many researchers for its better robustness and generalization.
In existing speech analysis, speech synthesis and speech recognition technologies, building the sound library of a traditional speech synthesis system relies mainly on manual operation: professional recording personnel must be arranged to manually annotate prosody and segments. The workload of construction is large and the production cycle is long, so the efficiency of speech processing is low. In addition, recording of the corpus can only be completed in a professional recording environment, so speech processing is severely limited by scene, context and other factors.
The above content is only intended to assist understanding of the technical solution of the present invention, and does not constitute an admission that it is prior art.
Summary of the invention
The main purpose of the present invention is to provide a speech processing method, a terminal device and a computer-readable storage medium, aiming to solve the technical problem that existing ways of processing speech are severely limited by factors such as scene and context and have low processing efficiency.
An embodiment of the present invention proposes a speech processing method, which includes:
acquiring voice information in the environment, and determining voice data in a preset speech database according to the voice information;
extracting text information received by a preset interface, and looking up target voice data in the voice data based on the text information;
synthesizing the target voice data into a voice sequence according to a speech synthesis instruction.
Optionally, before the step of acquiring the voice information in the environment, the method further includes:
performing noise reduction on the sound in the environment, including the voice information, according to the sound volume in the environment.
The step of acquiring the voice information in the environment includes:
extracting the voice information from the sound after noise reduction.
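The two claimed steps above (gate the ambient sound, then pull out the speech part) can be sketched as follows. This is a minimal illustration, not the patent's actual algorithm: the patent leaves the noise-reduction method unspecified, and the names `noise_gate` and `extract_speech` are assumptions.

```python
# Hypothetical sketch of the claimed pre-processing: a crude amplitude gate
# stands in for the unspecified noise-reduction step, after which the
# surviving (non-zero) samples are treated as the voice information.

def noise_gate(samples, threshold):
    """Zero out samples whose magnitude falls below the threshold."""
    return [s if abs(s) >= threshold else 0 for s in samples]

def extract_speech(samples):
    """Keep only the non-silent samples left after gating."""
    return [s for s in samples if s != 0]

ambient = [0.02, 0.5, -0.6, 0.01, 0.4, -0.03]   # toy microphone samples
gated = noise_gate(ambient, threshold=0.1)
speech = extract_speech(gated)
```

A real implementation would operate on audio frames with a spectral noise-reduction algorithm, but the control flow (reduce noise first, then extract) matches the claim.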
Optionally, the step of determining voice data in the preset speech database according to the voice information includes:
identifying the text content and sound quality information of the voice information;
detecting whether the preset speech database contains voice data corresponding to the text content;
if not, establishing a correspondence between the text content and the voice data in the current voice information, and storing the voice data in the current voice information into the preset speech database;
if so, determining voice data in the preset speech database based on the recognized sound quality information.
Optionally, the step of determining voice data in the preset speech database based on the recognized sound quality information includes:
detecting whether the sound quality of the voice data corresponding to the text content stored in the preset speech database is better than the recognized sound quality of the voice data in the current voice information;
if not, updating the voice data corresponding to the text content in the preset speech database to the voice data in the current voice information;
if so, refraining from updating the voice data corresponding to the text content to the voice data in the current voice information.
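The store-or-replace rule of the two optional claims above can be sketched as a small update routine. The dictionary database, the function name and the single numeric quality score are all assumptions for illustration; the patent describes sound quality as volume and tone information without fixing a comparison metric.

```python
# Illustrative sketch of the claimed database-update rule: store a new
# (text content -> voice data) pair, or replace the stored voice data only
# when the newly recognised sound quality is better than the stored one.

speech_db = {}  # text content -> (voice_data, quality_score)

def update_db(text, voice_data, quality):
    stored = speech_db.get(text)
    if stored is None:
        # No entry yet: establish the text -> voice correspondence.
        speech_db[text] = (voice_data, quality)
    elif stored[1] < quality:
        # Stored sample has worse quality: replace it with the current one.
        speech_db[text] = (voice_data, quality)
    # Otherwise the stored, better-quality sample is kept unchanged.

update_db("hello", b"clip-1", quality=0.6)
update_db("hello", b"clip-2", quality=0.9)   # better: replaces clip-1
update_db("hello", b"clip-3", quality=0.4)   # worse: discarded
```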
Optionally, after the step of determining voice data in the preset speech database according to the voice information, the method further includes:
establishing an association list between text contents and sentences.
The step of looking up target voice data in the voice data based on the text information includes:
performing word segmentation on the text information to obtain first text content of the text information, and matching a standard sentence in the association list;
looking up target voice data in the voice data stored in the preset speech database according to second text content in the standard sentence.
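The segment-match-lookup chain above can be sketched as below. Whitespace splitting stands in for the real (unspecified) word-segmentation algorithm, and the association list and database contents are invented toy data.

```python
# Minimal sketch of the claimed lookup: segment the input text into word
# units (first text content), match a standard sentence in the association
# list, then fetch voice data for the sentence's words (second text content).

speech_db = {"good": b"v-good", "morning": b"v-morning", "everyone": b"v-all"}
association_list = {("good", "morning"): ["good", "morning", "everyone"]}

def find_target_speech(text):
    words = tuple(text.split())                          # word segmentation
    sentence = association_list.get(words, list(words))  # standard sentence
    return [speech_db[w] for w in sentence if w in speech_db]

targets = find_target_speech("good morning")
```

Note how the association list can expand a short input into a fuller standard sentence before the database lookup, which appears to be the point of the claimed two-level matching.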
Optionally, after the step of acquiring the voice information in the environment, the method further includes:
performing voiceprint recognition on the voice data in the voice information to extract a voiceprint feature, and determining an output timbre based on the extracted voiceprint feature.
Optionally, the step of synthesizing the target voice data into a voice sequence according to the speech synthesis instruction includes:
detecting a sequence synthesis demand and a timbre synthesis demand carried in the speech synthesis instruction;
based on the sequence synthesis demand, combining the target voice data according to the character order of the first text content or of the second text content to form an initial voice sequence;
based on the timbre synthesis demand, adding the output timbre to the initial voice sequence to form a final voice sequence.
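The two-stage synthesis claim (order the clips, then apply the output timbre) can be sketched as follows, assuming clips are byte strings and the timbre is a simple label; real waveform synthesis and voice conversion are far more involved and are not specified by the patent.

```python
# Hedged sketch of the claimed synthesis step: order the target voice data
# according to the character order of the text (sequence synthesis demand),
# then tag the result with the output timbre (timbre synthesis demand).

def synthesize(target_clips, order, timbre):
    ordered = [target_clips[i] for i in order]   # initial voice sequence
    return {"timbre": timbre, "audio": b"".join(ordered)}  # final sequence

clips = [b"B", b"A"]
seq = synthesize(clips, order=[1, 0], timbre="female-voice")
```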
In addition, to achieve the above object, the present invention also provides a speech processing device, which includes:
an acquisition module for acquiring voice information in the environment and determining voice data in a preset speech database according to the voice information;
a lookup module for extracting text information received by a preset interface and looking up target voice data in the voice data based on the text information;
a synthesis module for synthesizing the target voice data into a voice sequence according to a speech synthesis instruction.
In addition, to achieve the above object, the present invention also provides a terminal device, which includes a memory, a processor, and a speech processing program stored on the memory and runnable on the processor; when executed by the processor, the speech processing program implements the steps of the speech processing method described above.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium on which a speech processing program is stored; when executed by a processor, the speech processing program implements the steps of the speech processing method described above.
In the speech processing method, device, terminal device and computer-readable storage medium proposed by the embodiments of the present invention, voice information in the environment is acquired and voice data is determined in a preset speech database according to the voice information; text information received by a preset interface is extracted, and target voice data is looked up in the voice data based on the text information; according to a speech synthesis instruction, the target voice data is synthesized into a voice sequence.
By acquiring voice information from the sound of any external environment and processing it accordingly, voice data is determined in a preset speech database (previously stored identical voice data is updated and replaced, and new voice data is stored together with its correspondence). Text information is extracted from the preset interface for receiving the text input by the user; after basic processing of the text information, the target voice data corresponding to the extracted text information is found in the preset speech database in which voice data is saved in advance; then, according to the user demand in the speech synthesis instruction triggered by the user, the found target voice data is combined to form a voice sequence that meets the user's demand. This enables speech recognition and speech synthesis to be performed without being limited by factors such as scene and context, improves the efficiency of speech processing, and performs speech synthesis based on user customization and individual demand, improving the performance of speech processing.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the terminal in the hardware running environment involved in the embodiments of the present invention;
Fig. 2 is a flow diagram of the first embodiment of the speech processing method of the present invention;
Fig. 3 is a flow diagram of the second embodiment of the speech processing method of the present invention;
Fig. 4 is a flow diagram of the third embodiment of the speech processing method of the present invention;
Fig. 5 is a module diagram of the speech processing device of the present invention.
The realization of the objects, the functional characteristics and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description of the embodiments
It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it.
The primary solution of the embodiments of the present invention is: acquiring voice information in the environment, and determining voice data in a preset speech database according to the voice information; extracting text information received by a preset interface, and looking up target voice data in the voice data based on the text information; and synthesizing the target voice data into a voice sequence according to a speech synthesis instruction.
In existing speech analysis, speech synthesis and speech recognition technologies, building the sound library of a traditional speech synthesis system relies mainly on manual operation: professional recording personnel must be arranged to manually annotate prosody and segments. The workload of construction is large and the production cycle is long, so the efficiency of speech processing is low. In addition, recording of the corpus can only be completed in a professional recording environment, so speech processing is severely limited by scene and context.
The present invention provides a solution that enables speech recognition and speech synthesis without being limited by factors such as scene and context, improves the efficiency of speech processing, and performs speech synthesis based on user customization and individual demand, improving the performance of speech processing.
As shown in Fig. 1, Fig. 1 is a schematic structural diagram of the terminal in the hardware running environment involved in the embodiments of the present invention.
The terminal device of the embodiments of the present invention can be a network terminal device such as a terminal server or a PC, or a portable terminal device such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a digital broadcast receiver, a wearable device (e.g. a smart bracelet or smartwatch), a navigation device or a portable computer, or a non-portable terminal device.
As shown in Fig. 1, the terminal device may include: a processor 1001 such as a CPU, a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002, where the communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface). The memory 1005 can be a high-speed RAM memory, or a stable non-volatile memory such as a magnetic disk memory; optionally, the memory 1005 can also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art will understand that the terminal structure shown in Fig. 1 does not constitute a limitation of the terminal device, which may include more or fewer components than illustrated, combine certain components, or arrange the components differently.
As shown in Fig. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module and a speech processing program.
In the terminal device shown in Fig. 1, the network interface 1004 is mainly used to connect to a background server and perform data communication with it; the user interface 1003 is mainly used to connect to a client (user terminal) and perform data communication with it; and the processor 1001 can be used to call the speech processing program stored in the memory 1005 and perform the following operations:
acquiring voice information in the environment, and determining voice data in a preset speech database according to the voice information;
extracting text information received by a preset interface, and looking up target voice data in the voice data based on the text information;
synthesizing the target voice data into a voice sequence according to a speech synthesis instruction.
Further, before the step of acquiring the voice information in the environment, the method further includes:
performing noise reduction on the sound in the environment, including the voice information, according to the sound volume in the environment.
The step of acquiring the voice information in the environment includes:
extracting the voice information from the sound after noise reduction.
Further, the step of determining voice data in the preset speech database according to the voice information includes:
identifying the text content and sound quality information of the voice information;
detecting whether the preset speech database contains voice data corresponding to the text content;
if not, establishing a correspondence between the text content and the voice data in the current voice information, and storing the voice data in the current voice information into the preset speech database;
if so, determining voice data in the preset speech database based on the recognized sound quality information.
Further, the step of determining voice data in the preset speech database based on the recognized sound quality information includes:
detecting whether the sound quality of the voice data corresponding to the text content stored in the preset speech database is better than the recognized sound quality of the voice data in the current voice information;
if not, updating the voice data corresponding to the text content in the preset speech database to the voice data in the current voice information;
if so, refraining from updating the voice data corresponding to the text content to the voice data in the current voice information.
Further, after the step of determining voice data in the preset speech database according to the voice information, the method further includes:
establishing an association list between text contents and sentences.
The step of looking up target voice data in the voice data based on the text information includes:
performing word segmentation on the text information to obtain first text content of the text information, and matching a standard sentence in the association list;
looking up target voice data in the voice data stored in the preset speech database according to second text content in the standard sentence.
Further, after the step of acquiring the voice information in the environment, the method further includes:
performing voiceprint recognition on the voice data in the voice information to extract a voiceprint feature, and determining an output timbre based on the extracted voiceprint feature.
Further, the step of synthesizing the target voice data into a voice sequence according to the speech synthesis instruction includes:
detecting a sequence synthesis demand and a timbre synthesis demand carried in the speech synthesis instruction;
based on the sequence synthesis demand, combining the target voice data according to the character order of the first text content or of the second text content to form an initial voice sequence;
based on the timbre synthesis demand, adding the output timbre to the initial voice sequence to form a final voice sequence.
Based on the above hardware structure, the embodiments of the speech processing method of the present invention are proposed.
Referring to Fig. 2, the first embodiment of the speech processing method of the present invention includes:
Step S10: acquiring voice information in the environment, and determining voice data in a preset speech database according to the voice information.
In this embodiment, the terminal device can be a smartphone, tablet computer or similar device. When the user needs to record speech, the terminal device can enable a voice conversion mode based on a voice conversion mode instruction triggered by the user; of course, the terminal device can also enable the voice conversion mode automatically in some scenarios, for example starting it automatically when the terminal device enters a recording state.
After enabling the voice conversion mode, the terminal device receives the sound of the external environment through a microphone installed on the terminal device, filters out the voice information in the sound received by the microphone based on speech recognition, and records the filtered voice information.
Specifically, for example, when the external environment is noisy, the sound received by the microphone on the terminal device contains, in addition to the voice information that currently needs to be filtered out (i.e. the voice of the user speaking), many unwanted noises, such as the interfering sounds of car horns, clamour or running machines. The terminal device screens the received environmental sound to obtain the voice information. After obtaining the voice information, the terminal device pre-processes it, for example with text recognition and sound quality recognition, and then, based on the pre-processed voice information, determines the voice data of the currently acquired voice information in the preset speech database established in advance for voice data.
Further, before step S10, the speech processing method of the present invention further includes:
Step A: performing noise reduction on the sound in the environment, including the voice information, according to the sound volume in the environment.
In this embodiment, to improve the efficiency with which the terminal device acquires and processes voice information, before screening the voice information from the external environmental sound received by the microphone, the terminal device first automatically detects whether the sound volume in the external environment exceeds a preset volume value, where the preset volume value can be flexibly set according to the user's needs. If it detects that the volume value in the current external environment exceeds the preset volume value, the terminal device determines that the sound currently received by the microphone requires noise-reduction screening to obtain the needed voice information, and immediately performs noise-reduction filtering on the sound currently received by the microphone, for example by using a noise reduction algorithm.
In another embodiment, if it detects that the volume value in the current external environment is below the preset volume value, the terminal device determines that the sound currently received by the microphone does not require noise-reduction screening, and can directly record the currently received sound as the needed voice information.
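The volume-threshold decision of this embodiment can be sketched as below. RMS amplitude is an assumed stand-in for the patent's unspecified "sound volume" measure, and the default threshold value is illustrative.

```python
# Sketch of the embodiment's volume check: run noise reduction only when the
# ambient volume exceeds a preset (user-adjustable) threshold; otherwise the
# sound can be recorded directly as the voice information.

import math

def ambient_volume(samples):
    """Root-mean-square amplitude as an assumed volume measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def needs_noise_reduction(samples, preset_volume=0.3):
    return ambient_volume(samples) > preset_volume

quiet = [0.05, -0.04, 0.06]   # below threshold: record directly
noisy = [0.8, -0.7, 0.9]      # above threshold: noise-reduce first
```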
Further, in step S10, acquiring the voice information in the environment includes:
Step S101: extracting the voice information from the sound after noise reduction.
After detecting that the volume value in the current external environment exceeds the preset volume value and performing noise-reduction filtering on the sound currently received by the microphone, for example with a noise reduction algorithm, the terminal device extracts relatively clear voice information from the noise-reduced sound based on existing speech recognition technology.
Step S20: extracting the text information received by the preset interface, and looking up target voice data in the voice data based on the text information.
In this embodiment, the preset interface is a data interface preset for obtaining the text sentence information input by the user.
After detecting that the preset data interface has received the text sentence information input by the user, the terminal device extracts that text sentence information and performs word segmentation on it to obtain each text content of the text sentence information input by the user. Based on each text content, the terminal device then looks up, in the preset speech database established in advance for voice data, the target voice data corresponding to each text content of the text sentence information currently input by the user.
Step S30: synthesizing the target voice data into a voice sequence according to the speech synthesis instruction.
The terminal device obtains the speech synthesis instruction triggered by the user and detects the synthesis demand carried in it, i.e. how the user wishes to synthesize speech from the currently input text sentence information, for example a demand to polish the input text sentence and/or to polish the output timbre of the synthesized voice. Based on the detected synthesis demand, the found target voice data corresponding to each text content of the user's input text sentence information is combined to form a voice sequence that can be output as voice.
In this embodiment, by acquiring voice information from the sound of any external environment and processing it accordingly, voice data is determined in a preset speech database; text information is extracted from the preset interface for receiving the text input by the user; after basic processing of the text information, the target voice data corresponding to the extracted text information is found in the preset speech database in which voice data is saved in advance; then, according to the user demand in the speech synthesis instruction triggered by the user, the found target voice data is combined to form a voice sequence that meets the user's demand. This enables speech recognition and speech synthesis to be performed without being limited by factors such as scene and context, improves the efficiency of speech processing, and performs speech synthesis based on user customization and individual demand, improving the performance of speech processing.
Further, on the basis of the above first embodiment, a second embodiment of the speech processing method of the present invention is proposed. Referring to Fig. 3, in the second embodiment, the step in step S10 of determining voice data in the preset speech database according to the voice information includes:
Step S102: identifying the text content and sound quality information of the voice information.
The terminal device performs text recognition and sound quality recognition on the voice information recorded after noise-reduction filtering of the external environmental sound received by the microphone, thereby identifying the text content and sound quality information of the current voice information, where the sound quality information obtained by the terminal device for the current voice information includes volume and tone.
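A toy version of the sound quality identification described in step S102 (deriving volume and tone from recorded samples) might look like the following. Peak amplitude and zero-crossing counting are assumed proxies for volume and tone; the patent does not specify how either quantity is measured.

```python
# Illustrative sketch: derive (volume, tone) as the "sound quality
# information" of a recorded snippet. Zero-crossing rate is a crude,
# assumed stand-in for a real pitch/tone estimator.

def sound_quality(samples, sample_rate):
    """Return (volume, tone_hz) for a list of audio samples."""
    volume = max(abs(s) for s in samples)
    # Count sign changes between adjacent samples.
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    tone_hz = crossings * sample_rate / (2.0 * len(samples))
    return volume, tone_hz

# Half a cycle of a crude square wave sampled 8 times at 8 Hz.
wave = [0.5, 0.5, 0.5, 0.5, -0.5, -0.5, -0.5, -0.5]
vol, tone = sound_quality(wave, sample_rate=8)
```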
Whether step S103 detects in the default speech database containing voice number corresponding to the word content
According to.
The terminal device detects whether the preset speech database established in advance for voice data already stores voice data corresponding to word content identical to the word content identified from the currently recorded voice information. In the preset speech database, voice data are saved according to the correspondence between word content and voice data. For example, the preset speech database may store the word content "hello" together with the voice band corresponding to that word content; alternatively, the database may store only the word content recognized from recorded voice data, with the voice band data corresponding to the stored word content downloaded from other databases.
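As a concrete illustration of the storage scheme just described, the following sketch (all names are invented, not from the patent) models the preset speech database as a mapping from word content to voice-band data, with a missing entry signalling that the voice band would have to be downloaded from another database:

```python
# Hypothetical sketch of the preset speech database: voice-band data
# (modeled as bytes) stored under their recognized word content.
class PresetSpeechDatabase:
    def __init__(self):
        self._entries = {}  # word content -> voice band

    def contains(self, word_content):
        # Step S103: does the database hold voice data for this word content?
        return word_content in self._entries

    def save(self, word_content, voice_band):
        # Step S104: store the word content and its voice data correspondingly.
        self._entries[word_content] = voice_band

    def lookup(self, word_content):
        # None signals that the voice band must be fetched from elsewhere.
        return self._entries.get(word_content)

db = PresetSpeechDatabase()
db.save("hello", b"\x00\x01")
print(db.contains("hello"))          # True
print(db.lookup("goodbye") is None)  # True: would be downloaded online
```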
Step S104: establish the correspondence between the word content and the voice data in the current voice information, and store the voice data of the current voice information into the preset speech database.
If it is detected that the preset speech database does not store voice data (a voice band) corresponding to word content identical to the word content identified from the currently recorded voice information, the correspondence between the word content of the current voice information and the current voice data is established, and the word content and the voice data are saved correspondingly into the current preset speech database. Alternatively, when it is detected that the preset speech database does not even store word content identical to the identified word content of the currently recorded voice information, the identified word content is saved into the preset speech database, and when the voice data corresponding to that word content need to be output, they are downloaded online from other databases.
Step S105: determine voice data in the preset speech database based on the recognized sound quality information.
If it is detected that the preset speech database already stores voice data corresponding to word content identical to the word content of the currently recorded voice information, the terminal further detects whether the sound quality information of the voice data in the currently recorded voice information is better than that of the voice data saved in the preset speech database, and on that basis updates or replaces the voice data corresponding to the word content of the current voice information.
Further, step S105 comprises:
Step S1051: detect whether the sound quality information of the voice data corresponding to the word content stored in the preset speech database is better than the sound quality information of the voice data in the currently recognized voice information.
The terminal device detects whether, in the preset database established in advance for voice data, the sound quality information such as volume and tone of the stored voice data whose word content is identical to that of the currently recorded voice information is better than the sound quality information such as volume and tone of the voice data in the currently recognized voice information.
Step S1052: in the preset speech database, update the voice data corresponding to the word content to the voice data in the current voice information.
If the terminal device detects that the sound quality information such as volume and tone of the voice data in the currently recorded voice information is better than that of the voice data stored in the preset database under identical word content, the voice data previously saved for the current word content are deleted, and the voice data in the currently recorded voice information are saved into the current preset database instead.
Step S1053: abandon updating the voice data corresponding to the word content to the voice data in the current voice information.
If the terminal device detects that the sound quality information such as volume and tone of the voice data stored in the preset database, whose word content is identical to that of the currently recorded voice information, is better than that of the voice data in the currently recorded voice information, the terminal abandons deleting the currently stored voice data and does not re-store the voice data of the currently recorded voice information.
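The replace-or-discard decision of steps S1051 to S1053 can be sketched as follows; the patent names volume and tone as the sound quality information but gives no comparison formula, so this sketch reduces quality to a single invented numeric score:

```python
# Keep whichever voice data scores higher on sound quality.
# 'db' maps word content -> (voice band, quality score); all names invented.
def update_if_better(db, word_content, new_voice, new_quality):
    stored = db.get(word_content)
    if stored is None or new_quality > stored[1]:
        db[word_content] = (new_voice, new_quality)  # S1052: replace stored data
        return True
    return False                                     # S1053: discard new data

db = {"hello": (b"old", 0.6)}
print(update_if_better(db, "hello", b"new", 0.9))    # True: better, replaced
print(update_if_better(db, "hello", b"worse", 0.3))  # False: stored copy kept
```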
In this embodiment, text recognition and sound quality recognition are performed on the voice information recorded after noise-reduction filtering of the ambient sound received by the microphone, so that the word content and sound quality information of the current voice information are identified. It is then detected whether the preset speech database established in advance for voice data already stores voice data corresponding to word content identical to that identified from the currently recorded voice information. If not, the correspondence between the current word content and the current voice data is established, and the word content and voice data are saved correspondingly into the current preset speech database; if so, the terminal further detects whether the sound quality information of the voice data in the currently recorded voice information is better than that of the voice data saved in the preset speech database, and updates or replaces the voice data corresponding to the word content of the current voice information accordingly. In this way, previously stored voice data with identical word content are updated and replaced, and new voice data of recorded voice information not yet stored are saved correspondingly into the preset speech database, which ensures the accuracy of the voice data in the speech database and thus improves the efficiency of speech synthesis based on that voice data.
Further, on the basis of the first embodiment above, a third embodiment of the speech processing method of the present invention is proposed. Referring to FIG. 4, in the third embodiment, after voice data are determined in the preset speech database according to the voice information in step S10, the speech processing method of the present invention further comprises:
Step S40: establish an association list between word content and sentences.
The terminal device may establish the association list between word content and sentences in advance and store it in a corresponding memory. Based on this list, the terminal can embellish certain text entered by the user, so that even when the user quickly enters fuzzy text information, the terminal still obtains the exact word content for which speech synthesis is required.
In this embodiment, the association list between word content and sentences records the correspondence between common text and sentences. For example, if the word content of the fuzzy text information quickly entered by the user is "tomorrow movie", the terminal searches the association list for a sentence matching the content "tomorrow movie", for instance the matched sentence "go to the cinema together tomorrow".
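Matching a fuzzy input against the association list can be sketched as below; the list entries and the word-overlap matching criterion are assumptions for illustration, since the patent does not specify the matching rule:

```python
# Invented association list between word content and standard sentences.
association_list = {
    "tomorrow movie": "go to the cinema together tomorrow",
    "lunch noon": "let's have lunch together at noon",
}

def match_standard_sentence(fuzzy_text):
    # Pick the entry sharing the most words with the fuzzy input.
    words = set(fuzzy_text.split())
    best = max(association_list, key=lambda key: len(words & set(key.split())))
    if words & set(best.split()):
        return association_list[best]
    return fuzzy_text  # no match: keep the user's text unchanged

print(match_standard_sentence("tomorrow movie"))  # the matched standard sentence
```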
Further, in this embodiment, in step S20 above, the step of searching for target speech data in the voice data based on the text information comprises:
Step S201: perform word segmentation on the text information to obtain first word content of the text information, and match a standard sentence in the association list.
After the terminal device detects that the preset data interface has received text sentence information entered by the user, it performs word segmentation on that text sentence information to obtain the first word content of the text entered by the user, for example "tomorrow movie". Further, based on the demand for sentence embellishment carried in the detected speech synthesis instruction triggered by the user, the terminal matches the corresponding standard sentence, for example "go to the cinema together tomorrow", from the association list between word content and sentences established in advance.
Step S202: search for target speech data in the voice data stored in the preset speech database according to second word content in the standard sentence.
Further, the terminal device performs word segmentation on the matched standard sentence "go to the cinema together tomorrow" again, thereby obtaining the second word content of the standard sentence (i.e., its individual characters, rendered literally as "bright", "day", "one", "rise", "go", "see", "electric" and "shadow"), and then, based on each item of word content, searches the preset speech database established in advance for voice data for the target speech data corresponding to each second word content of the current standard sentence.
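Step S202 can be sketched as follows; English words stand in for the per-character units, and the database contents are invented for illustration:

```python
# Split the standard sentence into word-content units and look up the
# target speech data stored for each unit in the preset speech database.
speech_db = {"tomorrow": b"\x01", "go": b"\x02", "cinema": b"\x03"}

def find_target_speech(standard_sentence):
    units = standard_sentence.split()  # stands in for word segmentation
    return [speech_db[u] for u in units if u in speech_db]

print(find_target_speech("tomorrow go cinema"))  # [b'\x01', b'\x02', b'\x03']
```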
Further, in step S10 above, after the voice information in the environment is obtained, the speech processing method of the present invention further comprises:
Step B: perform voiceprint recognition on the voice data in the voice information to extract a voiceprint feature, and determine an output timbre based on the extracted voiceprint feature.
Based on technologies such as voiceprint recognition, the terminal extracts from the recorded voice information the voiceprint feature of the speaker whose voice information was received by the microphone of the current terminal device, and then, according to the currently extracted voiceprint feature, determines the output timbre in a timbre database established in advance for storing the different output timbres used when the synthesized voice sequence is output as voice.
Specifically, for example, the terminal detects whether the timbre database of different output timbres established in advance already stores an output timbre built from a voiceprint feature identical to that of the speaker of the voice information received by the microphone of the current terminal device. If no output timbre built from the identical voiceprint feature is stored, a new output timbre is immediately established in the timbre database based on the voiceprint feature of the current speaker, and the speaker information of that output timbre is marked; if an output timbre built from the identical voiceprint feature is already stored in the current timbre database, no new output timbre is established.
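The register-once behavior described here can be sketched as follows; the voiceprint feature is reduced to a hashable tuple and all names are invented:

```python
# Timbre database: one output timbre per distinct voiceprint feature,
# marked with the speaker who produced it.
timbre_db = {}

def register_timbre(voiceprint, speaker):
    if voiceprint in timbre_db:
        return timbre_db[voiceprint]  # timbre already exists: reuse it
    timbre_db[voiceprint] = speaker   # new timbre, marked with its speaker
    return speaker

print(register_timbre((0.2, 0.7), "alice"))  # alice
print(register_timbre((0.2, 0.7), "bob"))    # alice: duplicate voiceprint ignored
```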
Further, step S30 comprises:
Step S301: detect the sequence synthesis demand and the timbre synthesis demand carried in the speech synthesis instruction.
When the terminal device detects that the user has triggered the speech synthesis instruction through a preset instruction control, it obtains the speech synthesis instruction and detects the synthesis demands carried in it for performing speech synthesis on the currently entered text sentence information, for example the user's demand to embellish the entered text sentence information into a sentence and/or the user's demand to embellish the output timbre of the synthesized voice.
Step S302: based on the sequence synthesis demand, combine the target speech data according to the character order of the first word content or the character order of the second word content to form an initial voice sequence.
According to the sentence embellishment demand carried in the speech synthesis instruction triggered by the user, the terminal device combines, in character order, the target speech data found in the preset speech database for each item of word content: either in the character order of each item of word content in the text information "tomorrow movie" entered by the user, or in the character order of the second word content of the standard sentence "go to the cinema together tomorrow" matched from the entered text information, thereby forming the initial voice sequence.
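Step S302 reduces to ordered concatenation; in this sketch raw byte concatenation stands in for audio splicing, and all names are illustrative:

```python
# Combine the per-unit target speech data in the character order of the
# (first or second) word content to form the initial voice sequence.
def combine_in_order(units, speech_db):
    return b"".join(speech_db[u] for u in units if u in speech_db)

speech_db = {"bright": b"\x01", "day": b"\x02"}
initial_sequence = combine_in_order(["bright", "day"], speech_db)
print(initial_sequence)  # b'\x01\x02'
```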
Step S303: based on the timbre synthesis demand, add the output timbre to the initial voice sequence to form a final voice sequence.
According to the demand carried in the speech synthesis instruction triggered by the user to embellish the output timbre of the synthesized voice, the terminal device searches the timbre database established in advance, which stores the different output timbres used when the synthesized voice sequence is output as voice, for the target output timbre, and adds the voiceprint feature of the target output timbre to the initial voice sequence that has been synthesized, thereby forming the final voice sequence for voice output.
In another embodiment, if the terminal device detects that the speech synthesis instruction triggered by the user does not carry a demand to embellish the output timbre of the synthesized voice, no output-timbre voiceprint feature is added to the initial voice sequence that has been synthesized; that is, the voice sequence is output directly in a "machine voice".
In this embodiment, the sequence synthesis demand and the timbre synthesis demand carried in the speech synthesis instruction triggered by the user are detected. Based on the sequence synthesis demand, the target speech data found in the voice data stored in the preset speech database are combined, according to the character order of each item of word content in the text information entered by the user, or according to the character order of each character of the standard sentence matched from the association list between word content and sentences established in advance, to form the initial voice sequence. Further, based on the timbre synthesis demand, the required target output timbre is searched for in the output timbre database established by performing voiceprint recognition on the voice data in the voice information to extract voiceprint features, and the voiceprint feature of the target output timbre is added to the initial voice sequence that has been synthesized to generate the final voice sequence, so that voice output is performed on the synthesized voice sequence with the selected output timbre. In this way, the synthesis and output of voice are performed flexibly based on the user's different synthesis demands, the user's individualized customization needs are met, and the performance of voice synthesis and output is improved.
In addition, referring to FIG. 5, an embodiment of the present invention further proposes a speech processing apparatus, the speech processing apparatus comprising:
an obtaining module, configured to obtain voice information in the environment and determine voice data in a preset speech database according to the voice information;
a searching module, configured to extract text information received by a preset interface and search for target speech data in the voice data based on the text information; and
a synthesis module, configured to synthesize the target speech data into a voice sequence according to a speech synthesis instruction.
Preferably, the speech processing apparatus of the present invention further comprises:
a detection module, configured to perform noise reduction on the sound in the environment, including the voice information, according to the sound volume in the environment;
and the obtaining module comprises:
an extraction unit, configured to extract the voice information from the sound after noise reduction.
Preferably, the obtaining module further comprises:
a recognition unit, configured to identify the word content and sound quality information of the voice information;
a first detection unit, configured to detect whether the preset speech database contains voice data corresponding to the word content;
a first determination unit, configured to establish the correspondence between the word content and the voice data in the current voice information, and store the voice data of the current voice information into the preset speech database; and
a second determination unit, configured to determine voice data in the preset speech database based on the recognized sound quality information.
Preferably, the second determination unit comprises:
a second detection unit, configured to detect whether the sound quality information of the voice data corresponding to the word content stored in the preset speech database is better than the sound quality information of the voice data in the currently recognized voice information; and
an updating unit, configured to update, in the preset speech database, the voice data corresponding to the word content to the voice data in the current voice information; wherein the updating unit is further configured to abandon updating the voice data corresponding to the word content to the voice data in the current voice information.
Preferably, the speech processing apparatus of the present invention further comprises:
an establishing module, configured to establish the association list between word content and sentences;
and the searching module comprises:
a segmentation unit, configured to perform word segmentation on the text information to obtain the first word content of the text information, and to match a standard sentence in the association list; and
a searching unit, configured to search for target speech data in the voice data stored in the preset speech database according to the second word content in the standard sentence.
Preferably, the speech processing apparatus of the present invention further comprises:
a voiceprint extraction module, configured to perform voiceprint recognition on the voice data in the voice information to extract a voiceprint feature, and to determine an output timbre based on the extracted voiceprint feature.
Preferably, the synthesis module comprises:
a third detection unit, configured to detect the sequence synthesis demand and the timbre synthesis demand carried in the speech synthesis instruction;
a first synthesis unit, configured to combine, based on the sequence synthesis demand, the target speech data according to the character order of the first word content or the character order of the second word content to form an initial voice sequence; and
a second synthesis unit, configured to add, based on the timbre synthesis demand, the output timbre to the initial voice sequence to form a final voice sequence.
When running, each functional module of the speech processing apparatus proposed in this embodiment implements the steps of the speech processing method described above, which will not be repeated here.
In addition, an embodiment of the present invention further proposes a computer-readable storage medium on which a speech processing program is stored; when the speech processing program is executed by a processor, the steps of the speech processing method described above are implemented. For specific embodiments of the computer-readable storage medium of the present invention, reference may be made to the embodiments of the speech processing method above, which will not be repeated here.
It should be noted that, in this document, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or system that includes that element.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the advantages or disadvantages of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, magnetic disk or optical disc) as described above, including several instructions for causing a terminal device (which may be a mobile phone, computer, server, network device, etc.) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
1. A speech processing method, characterized in that the speech processing method comprises:
obtaining voice information in an environment, and determining voice data in a preset speech database according to the voice information;
extracting text information received by a preset interface, and searching for target speech data in the voice data based on the text information; and
synthesizing the target speech data into a voice sequence according to a speech synthesis instruction.
2. The speech processing method according to claim 1, characterized in that before the step of obtaining the voice information in the environment, the method further comprises:
performing noise reduction on the sound in the environment, including the voice information, according to the sound volume in the environment;
and the step of obtaining the voice information in the environment comprises:
extracting the voice information from the sound after noise reduction.
3. The speech processing method according to claim 1, characterized in that the step of determining voice data in the preset speech database according to the voice information comprises:
identifying the word content and sound quality information of the voice information;
detecting whether the preset speech database contains voice data corresponding to the word content;
if not, establishing the correspondence between the word content and the voice data in the current voice information, and storing the voice data of the current voice information into the preset speech database; and
if so, determining voice data in the preset speech database based on the recognized sound quality information.
4. The speech processing method according to claim 3, characterized in that the step of determining voice data in the preset speech database based on the recognized sound quality information comprises:
detecting whether the sound quality information of the voice data corresponding to the word content stored in the preset speech database is better than the sound quality information of the voice data in the currently recognized voice information;
if not, updating, in the preset speech database, the voice data corresponding to the word content to the voice data in the current voice information; and
if so, abandoning updating the voice data corresponding to the word content to the voice data in the current voice information.
5. The speech processing method according to claim 1, characterized in that after the step of determining voice data in the preset speech database according to the voice information, the method further comprises:
establishing an association list between word content and sentences;
and the step of searching for target speech data in the voice data based on the text information comprises:
performing word segmentation on the text information to obtain first word content of the text information, and matching a standard sentence in the association list; and
searching for target speech data in the voice data stored in the preset speech database according to second word content in the standard sentence.
6. The speech processing method according to claim 1, characterized in that after the step of obtaining the voice information in the environment, the method further comprises:
performing voiceprint recognition on the voice data in the voice information to extract a voiceprint feature, and determining an output timbre based on the extracted voiceprint feature.
7. The speech processing method according to any one of claims 1 to 6, characterized in that the step of synthesizing the target speech data into a voice sequence according to the speech synthesis instruction comprises:
detecting the sequence synthesis demand and the timbre synthesis demand carried in the speech synthesis instruction;
combining, based on the sequence synthesis demand, the target speech data according to the character order of the first word content or the character order of the second word content to form an initial voice sequence; and
adding, based on the timbre synthesis demand, the output timbre to the initial voice sequence to form a final voice sequence.
8. A speech processing apparatus, characterized in that the speech processing apparatus comprises:
an obtaining module, configured to obtain voice information in an environment and determine voice data in a preset speech database according to the voice information;
a searching module, configured to extract text information received by a preset interface and search for target speech data in the voice data based on the text information; and
a synthesis module, configured to synthesize the target speech data into a voice sequence according to a speech synthesis instruction.
9. A terminal device, characterized in that the terminal device comprises a memory, a processor, and a speech processing program stored in the memory and executable on the processor, wherein when the speech processing program is executed by the processor, the steps of the speech processing method according to any one of claims 1 to 7 are implemented.
10. A storage medium, characterized in that a speech processing program is stored on the storage medium, wherein when the speech processing program is executed by a processor, the steps of the speech processing method according to any one of claims 1 to 7 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910746794.9A CN110444190A (en) | 2019-08-13 | 2019-08-13 | Method of speech processing, device, terminal device and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110444190A true CN110444190A (en) | 2019-11-12 |
Family
ID=68435210
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111611208A (en) * | 2020-05-27 | 2020-09-01 | 北京太极华保科技股份有限公司 | File storage and query method and device and storage medium |
CN111667812A (en) * | 2020-05-29 | 2020-09-15 | 北京声智科技有限公司 | Voice synthesis method, device, equipment and storage medium |
CN111752524A (en) * | 2020-06-28 | 2020-10-09 | 支付宝(杭州)信息技术有限公司 | Information output method and device |
CN113053352A (en) * | 2021-03-09 | 2021-06-29 | 深圳软银思创科技有限公司 | Voice synthesis method, device, equipment and storage medium based on big data platform |
CN113299271A (en) * | 2020-02-06 | 2021-08-24 | 菜鸟智能物流控股有限公司 | Voice synthesis method, voice interaction method, device and equipment |
CN113436605A (en) * | 2021-06-22 | 2021-09-24 | 广州小鹏汽车科技有限公司 | Processing method of vehicle-mounted voice synthesis data, vehicle-mounted electronic equipment and vehicle |
TWI767197B (en) * | 2020-03-10 | 2022-06-11 | 中華電信股份有限公司 | Method and server for providing interactive voice tutorial |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1345028A (en) * | 2000-09-18 | 2002-04-17 | 松下电器产业株式会社 | Speech sunthetic device and method |
CN1770261A (en) * | 2004-11-01 | 2006-05-10 | 英业达股份有限公司 | Speech synthesis system and method |
CN102117614A (en) * | 2010-01-05 | 2011-07-06 | 索尼爱立信移动通讯有限公司 | Personalized text-to-speech synthesis and personalized speech feature extraction |
CN106448665A (en) * | 2016-10-28 | 2017-02-22 | 努比亚技术有限公司 | Voice processing device and method |
CN107293284A (en) * | 2017-07-27 | 2017-10-24 | 上海传英信息技术有限公司 | A kind of phoneme synthesizing method and speech synthesis system based on intelligent terminal |
CN108173740A (en) * | 2017-11-30 | 2018-06-15 | 维沃移动通信有限公司 | A kind of method and apparatus of voice communication |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110444190A (en) | Method of speech processing, device, terminal device and storage medium | |
CN109087669B (en) | Audio similarity detection method and device, storage medium and computer equipment | |
CN110288077B (en) | Method and related device for synthesizing speaking expression based on artificial intelligence | |
CN111261144B (en) | Voice recognition method, device, terminal and storage medium | |
CN104168353B (en) | Bluetooth headset and its interactive voice control method | |
CN111508511A (en) | Real-time sound changing method and device | |
CN107705783A (en) | Speech synthesis method and device | |
CN107291690A (en) | Punctuation adding method and device, and device for punctuation adding | |
KR100339587B1 (en) | Song title selecting method for mp3 player compatible mobile phone by voice recognition | |
CN111583944A (en) | Sound changing method and device | |
CN107992485A (en) | Simultaneous interpretation method and device | |
CN103377651B (en) | Automatic speech synthesis device and method | |
CN106302933B (en) | Voice information processing method and terminal | |
CN114401417B (en) | Live stream object tracking method, device, equipment and medium thereof | |
CN110097890A (en) | Speech processing method and device, and device for speech processing | |
CN110149548A (en) | Video dubbing method, electronic device and readable storage medium | |
CN109801618A (en) | Audio information generation method and device | |
CN110428811B (en) | Data processing method and device and electronic equipment | |
CN107291704A (en) | Processing method and apparatus, and device for processing | |
CN113033245A (en) | Function adjusting method and device, storage medium and electronic equipment | |
CN107517313A (en) | Wake-up method and device, terminal and readable storage medium | |
CN101354886A (en) | Apparatus for recognizing speech | |
CN113113040B (en) | Audio processing method and device, terminal and storage medium | |
CN107112007A (en) | Speech recognition device and speech recognition method | |
CN106372203A (en) | Information response method and device for smart terminal and smart terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20191112 ||