CN106340297A - Speech recognition method and system based on cloud computing and confidence calculation - Google Patents
Speech recognition method and system based on cloud computing and confidence calculation
- Publication number
- CN106340297A (application number CN201610840519.XA)
- Authority
- CN
- China
- Prior art keywords
- speech recognition
- clouds
- voice
- different
- confidence level
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/32—Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a speech recognition method based on cloud computing and confidence calculation, relating to the technical field of speech recognition. The method comprises the following steps: (S1) a local speech recognition system and a cloud speech recognition system each receive a speech signal; (S2) the local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains a cloud speech recognition result; (S31) confidence evaluation is carried out on the local speech recognition result to obtain its confidence; (S32) confidence evaluation is carried out on the cloud speech recognition result to obtain its confidence; (S4) the confidence of the local speech recognition result is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output. The invention also discloses a speech recognition system based on cloud computing and confidence calculation. By combining cloud and local speech recognition, the quality of speech recognition can be improved.
Description
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a speech recognition method and system based on cloud computing and confidence calculation.
Background art
With the progress of science and technology, speech recognition has matured and is gradually becoming a key technology for human-machine interfaces in information technology. A variety of speech recognition algorithms have markedly improved both recognition rate and recognition efficiency, and in recent years speech recognition has come into common use in many fields. However, traditional speech recognition mostly relies on local speech recognition software, so the speech recognition algorithm built into the software cannot be changed. Different speech recognition algorithms inevitably perform differently in different speech input environments. For example, a complex noise environment contains noise from many sources, and in such an environment the recognition rate of a speech recognition system that originally worked well may be severely affected. If the software uses template training, a mismatch between the training samples and the characteristics of the sample library causes its recognition performance to drop sharply. The shortcoming of existing speech recognition systems is therefore that their recognition performance degrades sharply as the environment changes; their adaptability and applicability are low, and they cannot meet speech recognition needs across a variety of conditions. How to give a speech recognition system broad applicability is thus particularly important.
Chinese patent application CN201310163915.X discloses a method, device, and system for updating a speech recognition apparatus, comprising: receiving a speech input signal; performing speech recognition on the speech input signal using a local speech recognition device to obtain a local speech recognition result; selecting the best recognition result from the local speech recognition result and a cloud speech recognition result as the final speech recognition result, where the cloud speech recognition result is obtained by a cloud speech recognition device performing speech recognition on the speech input signal at the same time as the local device; determining, from the obtained user feedback information and the final speech recognition result, whether the reliability of the local speech recognition result meets the requirement; and, when it does not, updating the local speech recognition device using the cloud speech recognition device. The technical solution disclosed in that application uses a cloud speech recognition device for speech recognition, but the improvement in recognition quality is not significant. It also relies on user feedback to determine the reliability of the recognition result, which requires the user to select a result, makes the user's operation more cumbersome, and does not help the user experience.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a speech recognition method and system based on cloud computing and confidence calculation that combines cloud-based speech recognition with local speech recognition, so that a speech recognition device or system can effectively adapt to a variety of speech input environments and the quality of speech recognition is improved.
To achieve the above object, the present invention adopts the following technical solutions:
A speech recognition method based on cloud computing and confidence calculation comprises the following steps:
S1. A local speech recognition system and a cloud speech recognition system each receive a speech signal;
S2. The local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains a cloud speech recognition result;
S31. Confidence evaluation is carried out on the local speech recognition result to obtain the confidence of the local speech recognition result;
S32. Confidence evaluation is carried out on the cloud speech recognition result to obtain the confidence of the cloud speech recognition result;
S4. The confidence of the local speech recognition result is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output.
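The comparison in step S4 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the `Recognition` type, the `select_result` function, the assumption that confidences lie in [0, 1], and the tie-breaking rule in favour of the local result are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """One recognition result with its confidence (hypothetical type)."""
    text: str
    confidence: float  # assumed to lie in [0, 1]
    source: str        # "local" or "cloud"


def select_result(local: Recognition, cloud: Recognition) -> Recognition:
    """Step S4: return the result with the higher confidence.

    The text does not say what happens on a tie; preferring the
    local result here is an assumption.
    """
    return local if local.confidence >= cloud.confidence else cloud


local = Recognition("turn on the light", 0.62, "local")
cloud = Recognition("turn on the lights", 0.91, "cloud")
print(select_result(local, cloud).source)
```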
Further, different speech recognition models are provided in the cloud speech recognition system. In step S2, the cloud speech recognition system obtains different candidate cloud speech recognition results based on the different speech recognition models, and step S32 comprises:
S321. Confidence evaluation is carried out on the different candidate cloud speech recognition results to obtain the confidence of each candidate cloud speech recognition result;
S322. The confidences of the different candidate cloud speech recognition results are compared, and the candidate with the highest confidence is output as the cloud speech recognition result.
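Steps S321 and S322 amount to taking the maximum over the candidate results. The sketch below assumes each cloud model is identified by a name and returns a `(text, confidence)` pair; the function name `best_cloud_result` and the model names are hypothetical.

```python
from typing import Dict, Tuple


def best_cloud_result(
    candidates: Dict[str, Tuple[str, float]]
) -> Tuple[str, str, float]:
    """Pick the candidate cloud result with the highest confidence.

    `candidates` maps a (hypothetical) model name to (text, confidence).
    Returns (model_name, text, confidence).
    """
    if not candidates:
        raise ValueError("at least one candidate cloud result is required")
    name = max(candidates, key=lambda n: candidates[n][1])
    text, confidence = candidates[name]
    return name, text, confidence


candidates = {
    "quiet_room": ("play music", 0.71),
    "street_noise": ("play muse sick", 0.40),
}
print(best_cloud_result(candidates))
```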
Further, the different speech recognition models include speech recognition models built on different speech recognition algorithms and speech recognition models built on combinations of different speech recognition algorithms; the different speech recognition models correspond to different speech input environments.
Further, before step S2 is carried out, step S20 is carried out first:
S20. The local speech recognition system and the cloud speech recognition system each perform noise reduction on the received speech signal.
Further, in step S20 the cloud speech recognition system performs noise reduction on the speech signal using different speech noise reduction models. These noise reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models; the cloud speech recognition system sends each noise-reduced speech signal to the speech recognition model corresponding to the same speech input environment.
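The one-to-one pairing of noise reduction models and recognition models can be pictured as a set of per-environment branches. The sketch below is an assumption-laden illustration: the signal is modelled as a list of floats, and `run_cloud_branches`, the environment names, and the toy denoisers and recognizers are hypothetical stand-ins for real models.

```python
from typing import Callable, Dict, List, Tuple

Signal = List[float]
Denoiser = Callable[[Signal], Signal]
Recognizer = Callable[[Signal], str]


def run_cloud_branches(
    signal: Signal,
    branches: Dict[str, Tuple[Denoiser, Recognizer]],
) -> Dict[str, str]:
    """Run each environment's denoiser, then its paired recognizer."""
    results: Dict[str, str] = {}
    for environment, (denoise, recognize) in branches.items():
        # Copy the input so one branch cannot modify another branch's data.
        cleaned = denoise(list(signal))
        results[environment] = recognize(cleaned)
    return results


# Toy stand-ins: real denoisers and recognizers would be trained models.
branches = {
    "quiet": (lambda s: s, lambda s: f"quiet:{len(s)}"),
    "noisy": (lambda s: [x for x in s if abs(x) >= 0.5],
              lambda s: f"noisy:{len(s)}"),
}
print(run_cloud_branches([0.1, 0.9, -0.7, 0.2], branches))
```

Keeping the denoiser and recognizer together in one tuple makes the one-to-one correspondence explicit: a signal cleaned for one environment can only reach the recognizer built for that environment.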
A speech recognition system based on cloud computing and confidence calculation comprises:
Local speech recognition system, for receiving a speech signal and obtaining a local speech recognition result;
Cloud speech recognition system, for receiving the speech signal and obtaining a cloud speech recognition result;
Confidence evaluation module, which uses a confidence algorithm to evaluate the confidence of the local speech recognition result and the cloud speech recognition result;
Data processing module, which compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Further, the cloud speech recognition system includes different cloud speech recognition submodules:
In the cloud speech recognition system, the different cloud speech recognition submodules contain different speech recognition models, and each cloud speech recognition submodule is used to receive the speech signal and obtain a candidate cloud speech recognition result;
Confidence evaluation module, which uses a confidence algorithm to evaluate the confidence of the local speech recognition result and the candidate cloud speech recognition results;
Data processing module, which compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Further, the different speech recognition models include speech recognition models built on different speech recognition algorithms and speech recognition models built on combinations of different speech recognition algorithms; the different speech recognition models correspond to different speech input environments.
Further, the system also includes a local speech noise reduction module and a cloud speech noise reduction module. The local speech noise reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the local speech recognition system; the cloud speech noise reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the cloud speech recognition system.
Further, the cloud speech noise reduction module includes different cloud speech noise reduction submodules, each containing a different speech noise reduction model. These noise reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models.
The beneficial effects of the present invention are as follows. The cloud speech recognition system and the local speech recognition system recognize speech simultaneously, and the cloud speech recognition system includes multiple speech recognition models corresponding to different input environments. The best of the various speech recognition results is selected for output, so that the speech recognition device or system can effectively adapt to a variety of speech input environments, which effectively improves the quality of speech recognition. A confidence algorithm is used to evaluate the various speech recognition results, which improves their reliability. The confidence estimate incorporates information that is not fully used in traditional speech recognition systems, which reduces the entropy of the speech recognition system and allows the correctness of a recognition result to be judged more accurately, thereby improving the performance of the speech recognition system.
Brief description of the drawings
Fig. 1 is a flow chart of the speech recognition method based on cloud computing and confidence calculation of the present invention.
Detailed description
The present invention is further described below with reference to the accompanying drawing and specific embodiments:
Embodiment 1
As shown in Fig. 1, a speech recognition method based on cloud computing and confidence calculation comprises the following steps:
S1. A local speech recognition system and a cloud speech recognition system each receive a speech signal;
S20. The local speech recognition system and the cloud speech recognition system each perform noise reduction on the received speech signal, where the cloud speech recognition system performs noise reduction on the speech signal using different speech noise reduction models built for different speech input environments;
S2. The local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains different candidate cloud speech recognition results based on different speech recognition models. The different speech recognition models include models built on different speech recognition algorithms and models built on combinations of different speech recognition algorithms, and they correspond to different speech input environments. The noise reduction models correspond one-to-one with the speech recognition models, and each noise reduction model sends its noise-reduced speech signal to the corresponding speech recognition model;
S31. Confidence evaluation is carried out on the local speech recognition result to obtain the confidence of the local speech recognition result;
S321. Confidence evaluation is carried out on the different candidate cloud speech recognition results to obtain the confidence of each candidate cloud speech recognition result;
S322. The confidences of the different candidate cloud speech recognition results are compared, and the candidate with the highest confidence is output as the cloud speech recognition result;
S4. If the confidence of the local speech recognition result is below a set value, the cloud speech recognition result is output directly; if the confidence of the local speech recognition result reaches the set value, it is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output.
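Step S4 of this embodiment adds a gate on the local confidence before the comparison. A minimal sketch, assuming confidences are plain floats; the function name `select_with_threshold`, the parameter names, and the handling of ties (local preferred) are hypothetical.

```python
def select_with_threshold(
    local_text: str,
    local_conf: float,
    cloud_text: str,
    cloud_conf: float,
    threshold: float,
) -> str:
    """Embodiment 1, step S4: gate on the local confidence, then compare."""
    # Below the set value: output the cloud result without comparing.
    if local_conf < threshold:
        return cloud_text
    # At or above the set value: output the higher-confidence result.
    # Preferring the local result on a tie is an assumption.
    return local_text if local_conf >= cloud_conf else cloud_text


# The local result is below the set value, so the cloud result is output
# even though its own confidence is lower.
print(select_with_threshold("call mum", 0.3, "call mom", 0.2, threshold=0.5))
```

One consequence of the gate, visible in the usage line, is that a weak local result is never output, whatever the cloud confidence happens to be.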
Embodiment 2
A speech recognition system based on cloud computing and confidence calculation comprises:
Local speech noise reduction module, for performing noise reduction on the speech signal and then sending the noise-reduced speech signal to the local speech recognition system;
Cloud speech noise reduction module, comprising different cloud speech noise reduction submodules, each containing a different speech noise reduction model; these noise reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models; the module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the cloud speech recognition system;
Local speech recognition system, for receiving the speech signal from the local speech noise reduction module and obtaining a local speech recognition result;
Cloud speech recognition system, comprising different cloud speech recognition submodules, each containing a different speech recognition model; the different speech recognition models include models built on different speech recognition algorithms and models built on combinations of different speech recognition algorithms, and they correspond to different speech input environments; the noise reduction models correspond one-to-one with the speech recognition models, and each speech recognition submodule receives the speech information from its corresponding noise reduction model and obtains a candidate cloud speech recognition result;
Confidence evaluation module, which uses a confidence algorithm to evaluate the confidence of the local speech recognition result and the candidate cloud speech recognition results;
Data processing module, which compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Embodiment 3
Building on the speech recognition method based on cloud computing and confidence calculation of Embodiment 1 or the speech recognition system based on cloud computing and confidence calculation of Embodiment 2, in this embodiment the different speech recognition algorithms include a template matching algorithm, a probabilistic model algorithm, and an artificial neural network algorithm, wherein:
Template matching algorithm: in the training stage, feature vectors that fully describe the characteristics of the speech signal are extracted to form a feature vector sequence, which is then optimized to obtain a set of feature vectors that represents the sequence; this set serves as the template. In use, the feature vectors of the speech to be recognized are extracted to form its feature vector sequence, which is compared with the feature vector sequence of each template; the speech corresponding to the template with the highest degree of match is taken as the recognition result of the template matching algorithm;
Probabilistic model algorithm: in the training stage, feature vectors that fully describe the characteristics of the speech signal are extracted, and a mathematical model is formed from the distribution of these feature vectors in feature space. In use, the feature vectors of the speech to be recognized are extracted, their distribution in feature space is compared with each mathematical model to compute a similarity, and the speech corresponding to the mathematical model with the highest similarity is taken as the recognition result of the probabilistic model algorithm.
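The two algorithms described above can be illustrated with small stand-ins. The text does not name a specific matching measure or model family, so both choices here are assumptions: dynamic time warping (DTW) is used as the matching degree for templates, and a diagonal Gaussian per label is used as the mathematical model of the feature distribution. All function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Frames = List[Sequence[float]]  # one feature vector per frame


def dtw_distance(a: Frames, b: Frames) -> float:
    """Dynamic time warping distance between two feature vector sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


def match_template(features: Frames, templates: Dict[str, Frames]) -> str:
    """Return the label whose template is closest (best match) under DTW."""
    return min(templates, key=lambda lab: dtw_distance(features, templates[lab]))


def gaussian_log_likelihood(features: Frames, mean: Sequence[float],
                            var: Sequence[float]) -> float:
    """Log-likelihood of all frames under a diagonal Gaussian."""
    total = 0.0
    for frame in features:
        for x, mu, v in zip(frame, mean, var):
            total += -0.5 * (math.log(2 * math.pi * v) + (x - mu) ** 2 / v)
    return total


def classify_by_model(
    features: Frames,
    models: Dict[str, Tuple[Sequence[float], Sequence[float]]],
) -> str:
    """Return the label whose (mean, variance) model scores highest."""
    return max(models,
               key=lambda lab: gaussian_log_likelihood(features, *models[lab]))


utterance = [[0.1], [0.9], [1.0], [2.1]]
templates = {"yes": [[0.0], [1.0], [2.0]], "no": [[5.0], [5.0], [5.0]]}
models = {"yes": ([1.0], [1.0]), "no": ([5.0], [1.0])}
print(match_template(utterance, templates), classify_by_model(utterance, models))
```

DTW is chosen for the sketch because it tolerates utterances and templates of different lengths; a real system would likely use richer features and models than these one-dimensional toys.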
Embodiment 4
Building on the speech recognition method based on cloud computing and confidence calculation of Embodiment 1 or the speech recognition system based on cloud computing and confidence calculation of Embodiment 2, in this embodiment the information used to build the confidence estimation model includes: 1) Viterbi decoding information and the trace of the hidden Markov model (HMM): state alignment information, state duration (segment length), and likelihood; 2) modeling of the alternative hypothesis $H_1$ and of the anti-word model; 3) an online garbage model formed from competing candidate results; 4) explicit filler models built for misrecognized pronunciations and out-of-vocabulary words; 5) word lattice density.
The confidence estimation model can combine the speech information in two ways: rule-based combination and statistical-model-based combination. Rule-based combination applies different information sources at different recognition stages to estimate confidence separately; its emphasis is on summarizing experience and on forming and adjusting rules. Statistical models include linear models and generalized linear models:
Define the odds of an event A occurring as [formula not reproduced in the source], and let $X$ be the set of all information. $q(c_i = 1 \mid X)$ is an estimate of the true probability $p(c_i = 1 \mid X)$; the linear model of confidence is then:
[formula not reproduced in the source]
where $x_i$ is a component of $X$, i.e. $x_i \in X$, and $c_i$ is the confidence label: $c_i = 0$ (recognition incorrect) or $c_i = 1$ (recognition correct). The generalized linear model of confidence is:
[formula not reproduced in the source]
In addition, define $w_j(t)$ as the best score of the first $t$ observations ($t$ frames) reaching state $j$ during the search, and $\gamma_i(o_t)$ as the confidence score of state $i$ at frame $t$:
[formula not reproduced in the source]
where $\log v_{ik}$ ($k = 1, 2, 3$) represent three kinds of information, namely likelihood, segment length, and likelihood ratio, respectively:
[formula for $\log v_{i1}(o_t)$ not reproduced in the source]
$\log v_{i2}(o_t) = k_2 \log w(d)$,
$\log v_{i3}(o_t) = k_3 \log w(cm)$,
where $a_{ij}$ and $b_j(o_t)$ are the transition probability and the output probability of the speech recognition model, respectively, $k_i$ is the weight coefficient for each kind of feature information, and $w(cm)$ is the likelihood ratio information, computed as follows:
Let the log-likelihood ratio LLR be [formula not reproduced in the source], where the subscripts $c$ and $a$ denote the current speech recognition model and a corresponding anti-model, respectively; then:
[formula not reproduced in the source]
where $t$ is a positive constant, $u$ is a constant, and the value of $w(cm)$ always lies between 0 and 1. When the likelihood of the current speech recognition model is higher than that of the anti-model, LLR > 0 and $w(cm)$ is close to 1; otherwise it is close to 0. $t$ and $u$ control the decay and the position of the function, and their values are determined experimentally.
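The formula for w(cm) is not reproduced in the text, but the stated properties (values between 0 and 1, close to 1 for a positive log-likelihood ratio, close to 0 otherwise, with t controlling the decay and u the position) are consistent with a logistic sigmoid. The sketch below assumes that form, $w = 1 / (1 + e^{-t(\mathrm{LLR} - u)})$; it is an illustrative guess, not the patent's confirmed formula, and the function name is hypothetical.

```python
import math


def likelihood_ratio_weight(llr: float, t: float, u: float) -> float:
    """Map a log-likelihood ratio to (0, 1) with an assumed sigmoid.

    Assumed form: w = 1 / (1 + exp(-t * (llr - u))).
    t (positive) sets how quickly the function decays; u shifts its position.
    """
    if t <= 0:
        raise ValueError("t must be a positive constant")
    z = t * (llr - u)
    # Numerically stable evaluation: never call exp() on a large positive value.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


for llr in (-5.0, 0.0, 5.0):
    print(llr, round(likelihood_ratio_weight(llr, t=1.0, u=0.0), 3))
```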
The above method can be used to compute confidence estimates at different levels (such as phoneme, syllable, whole word, and whole sentence).
It will be apparent to those skilled in the art that various other corresponding changes and modifications can be made based on the technical solutions and concepts described above, and all such changes and modifications shall fall within the scope of protection of the claims of the present invention.
Claims (10)
1. A speech recognition method based on cloud computing and confidence calculation, characterized in that it comprises the following steps:
S1. A local speech recognition system and a cloud speech recognition system each receive a speech signal;
S2. The local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains a cloud speech recognition result;
S31. Confidence evaluation is carried out on the local speech recognition result to obtain the confidence of the local speech recognition result;
S32. Confidence evaluation is carried out on the cloud speech recognition result to obtain the confidence of the cloud speech recognition result;
S4. The confidence of the local speech recognition result is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output.
2. The speech recognition method based on cloud computing and confidence calculation according to claim 1, characterized in that different speech recognition models are provided in the cloud speech recognition system; in step S2, the cloud speech recognition system obtains different candidate cloud speech recognition results based on the different speech recognition models, and step S32 comprises:
S321. Confidence evaluation is carried out on the different candidate cloud speech recognition results to obtain the confidence of each candidate cloud speech recognition result;
S322. The confidences of the different candidate cloud speech recognition results are compared, and the candidate with the highest confidence is output as the cloud speech recognition result.
3. The speech recognition method based on cloud computing and confidence calculation according to claim 2, characterized in that the different speech recognition models include speech recognition models built on different speech recognition algorithms and speech recognition models built on combinations of different speech recognition algorithms, and the different speech recognition models correspond to different speech input environments.
4. The speech recognition method based on cloud computing and confidence calculation according to claim 3, characterized in that, before step S2 is carried out, step S20 is carried out first:
S20. The local speech recognition system and the cloud speech recognition system each perform noise reduction on the received speech signal.
5. The speech recognition method based on cloud computing and confidence calculation according to claim 4, characterized in that, in step S20, the cloud speech recognition system performs noise reduction on the speech signal using different speech noise reduction models; these noise reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models; and the cloud speech recognition system sends each noise-reduced speech signal to the speech recognition model corresponding to the same speech input environment.
6. A speech recognition system based on cloud computing and confidence calculation, characterized in that it comprises:
a local speech recognition system for receiving a speech signal and producing a local speech recognition result;
a cloud speech recognition system for receiving the speech signal and producing a cloud speech recognition result;
a confidence evaluation module, which evaluates the confidence of the local speech recognition result and of the cloud speech recognition result using a confidence algorithm;
a data processing module, which compares the confidence of the local speech recognition result with that of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
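The comparison performed by the data processing module in claim 6 might look like the sketch below. The `Recognition` type and `select_result` function are hypothetical names, the confidence values are made up for illustration, and the rule that the local result wins ties is an assumption, since the claim does not say how ties are resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    text: str
    confidence: float  # assumed to be on a scale comparable across both systems
    source: str        # "local" or "cloud"


def select_result(local: Recognition, cloud: Recognition) -> Recognition:
    """Return whichever result has the higher confidence (local wins ties)."""
    return cloud if cloud.confidence > local.confidence else local


final = select_result(
    Recognition("turn on the light", 0.62, "local"),
    Recognition("turn on the lights", 0.91, "cloud"),
)
print(final.source, final.text)
```

The comparison is only meaningful if both systems are scored by the same confidence algorithm, which is why the claim places a single confidence evaluation module ahead of the data processing module.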
7. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 6, characterized in that the cloud speech recognition system includes different cloud speech recognition submodules:
within the cloud speech recognition system, the different cloud speech recognition submodules contain different speech recognition models, and each cloud speech recognition submodule receives the speech signal and produces a candidate cloud speech recognition result;
the confidence evaluation module evaluates the confidence of the local speech recognition result and of the candidate cloud speech recognition results using the confidence algorithm;
the data processing module compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with that of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
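The two-stage selection in claim 7 can be illustrated as below: first the best candidate among the cloud submodules is chosen, then it is compared against the local result. The names `best_cloud_result` and `final_result`, the tuple representation, and the sample values are assumptions made for this sketch; tie handling (local preferred) is likewise assumed.

```python
from typing import List, Tuple

Result = Tuple[str, float]  # (recognized text, confidence)


def best_cloud_result(candidates: List[Result]) -> Result:
    """Stage 1: the highest-confidence candidate becomes the cloud result."""
    if not candidates:
        raise ValueError("at least one cloud submodule result is required")
    return max(candidates, key=lambda r: r[1])


def final_result(local: Result, candidates: List[Result]) -> Result:
    """Stage 2: compare the chosen cloud result with the local result."""
    cloud = best_cloud_result(candidates)
    return cloud if cloud[1] > local[1] else local


candidates = [("play music", 0.70), ("play muse", 0.55), ("pay music", 0.40)]
chosen = final_result(("play magic", 0.65), candidates)
```

With these sample values the submodule result `("play music", 0.70)` is selected as the cloud result and then beats the local confidence of 0.65.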
8. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 7, characterized in that the different speech recognition models include speech recognition models built on different speech recognition algorithms and also speech recognition models built on combinations of different speech recognition algorithms; different speech recognition models correspond to different speech input environments.
9. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 8, characterized in that it further comprises a local speech noise-reduction module and a cloud speech noise-reduction module; the local speech noise-reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the local speech recognition system; the cloud speech noise-reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the cloud speech recognition system.
10. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 9, characterized in that the cloud speech noise-reduction module includes different cloud speech noise-reduction submodules, and the different cloud speech noise-reduction submodules contain different speech noise-reduction models; the different speech noise-reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610840519.XA CN106340297A (en) | 2016-09-21 | 2016-09-21 | Speech recognition method and system based on cloud computing and confidence calculation |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106340297A (en) | 2017-01-18 |
Family
ID=57838636
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201610840519.XA | Pending | CN106340297A (en) | 2016-09-21 | 2016-09-21 | Speech recognition method and system based on cloud computing and confidence calculation |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106340297A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1633679A (en) * | 2001-12-29 | 2005-06-29 | 摩托罗拉公司 | Method and apparatus for multi-level distributed speech recognition |
CN102439660A (en) * | 2010-06-29 | 2012-05-02 | 株式会社东芝 | Voice-tag method and apparatus based on confidence score |
CN102710539A (en) * | 2012-05-02 | 2012-10-03 | 中兴通讯股份有限公司 | Method and device for transferring voice messages |
US20140303974A1 (en) * | 2013-04-03 | 2014-10-09 | Kabushiki Kaisha Toshiba | Text generator, text generating method, and computer program product |
Non-Patent Citations (1)
Title |
---|
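Liu Jing, Liu Jia: "The principle of confidence measures and its application in speech recognition", Journal of Computer Research and Development (《计算机研究与发展》) * |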
Cited By (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107316637A (en) * | 2017-05-31 | 2017-11-03 | 广东欧珀移动通信有限公司 | Speech recognition method and related product |
CN107564525A (en) * | 2017-10-23 | 2018-01-09 | 深圳北鱼信息科技有限公司 | Speech recognition method and device |
CN107733762A (en) * | 2017-11-20 | 2018-02-23 | 马博 | Voice control method, device and system for smart home |
CN107733762B (en) * | 2017-11-20 | 2020-07-24 | 宁波向往智能科技有限公司 | Voice control method, device and system for smart home |
CN108022593A (en) * | 2018-01-16 | 2018-05-11 | 成都福兰特电子技术股份有限公司 | High-sensitivity speech recognition system and control method thereof |
CN108806682A (en) * | 2018-06-12 | 2018-11-13 | 奇瑞汽车股份有限公司 | Method and device for acquiring weather information |
CN108806682B (en) * | 2018-06-12 | 2020-12-01 | 奇瑞汽车股份有限公司 | Method and device for acquiring weather information |
WO2020082724A1 (en) * | 2018-10-26 | 2020-04-30 | 华为技术有限公司 | Method and apparatus for object classification |
CN109979454A (en) * | 2019-03-29 | 2019-07-05 | 联想(北京)有限公司 | Data processing method and device |
CN110634481B (en) * | 2019-08-06 | 2021-11-16 | 惠州市德赛西威汽车电子股份有限公司 | Voice integration method for outputting optimal recognition result |
CN110634481A (en) * | 2019-08-06 | 2019-12-31 | 惠州市德赛西威汽车电子股份有限公司 | Voice integration method for outputting optimal recognition result |
CN111145757A (en) * | 2020-02-18 | 2020-05-12 | 上海华镇电子科技有限公司 | Vehicle-mounted voice intelligent Bluetooth integration device and method |
CN113450781A (en) * | 2020-03-25 | 2021-09-28 | 阿里巴巴集团控股有限公司 | Speech processing method, speech encoder, speech decoder and speech recognition system |
CN113450781B (en) * | 2020-03-25 | 2022-08-09 | 阿里巴巴集团控股有限公司 | Speech processing method, speech encoder, speech decoder and speech recognition system |
CN113380254A (en) * | 2021-06-21 | 2021-09-10 | 紫优科技(深圳)有限公司 | Voice recognition method, device and medium based on cloud computing and edge computing |
CN113380253A (en) * | 2021-06-21 | 2021-09-10 | 紫优科技(深圳)有限公司 | Voice recognition system, device and medium based on cloud computing and edge computing |
CN113380254B (en) * | 2021-06-21 | 2024-05-24 | 枣庄福缘网络科技有限公司 | Voice recognition method, device and medium based on cloud computing and edge computing |
CN115410578A (en) * | 2022-10-27 | 2022-11-29 | 广州小鹏汽车科技有限公司 | Processing method of voice recognition, processing system thereof, vehicle and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170118 |