CN106340297A - Speech recognition method and system based on cloud computing and confidence calculation - Google Patents

Publication number
CN106340297A
Authority
CN
China
Prior art keywords
speech recognition
clouds
voice
different
confidence level
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610840519.XA
Other languages
Chinese (zh)
Inventor
李志�
田宗贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong University of Technology
Original Assignee
Guangdong University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong University of Technology
Priority to CN201610840519.XA
Publication of CN106340297A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/32 Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a speech recognition method based on cloud computing and confidence calculation, relating to the technical field of speech recognition. The method comprises the steps of: (S1) a local speech recognition system and a cloud speech recognition system each receive a speech signal; (S2) the local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains a cloud speech recognition result; (S31) confidence evaluation is carried out on the local speech recognition result to obtain its confidence; (S32) confidence evaluation is carried out on the cloud speech recognition result to obtain its confidence; (S4) the confidence of the local speech recognition result is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output. The invention also discloses a speech recognition system based on cloud computing and confidence calculation. By combining cloud and local speech recognition, the method can improve the quality of speech recognition.

Description

A speech recognition method and system based on cloud computing and confidence calculation
Technical field
The present invention relates to the technical field of speech recognition, and in particular to a speech recognition method and system based on cloud computing and confidence calculation.
Background art
With the progress of science and the development of technology, speech recognition technology has matured and is gradually becoming a key technology for human-machine interfaces in information technology. A variety of speech recognition algorithms have substantially improved both the recognition rate and the efficiency of speech recognition, and in recent years speech recognition has come into widespread use in many fields. However, traditional speech recognition mostly relies on local speech recognition software, so the speech recognition algorithm in the software cannot be changed, and different speech recognition algorithms inevitably perform differently in different speech input environments. For example, a complex noise environment contains noise from various sources, and under such conditions the recognition rate of a speech recognition system that originally worked well may be severely affected. If the software uses a template training method, recognition performance drops sharply when the training samples do not match the characteristics of the sample library. The shortcoming of existing speech recognition systems is that their recognition performance drops sharply as the environment changes; their adaptability and applicability are low, and they cannot meet speech recognition needs across varied conditions. It is therefore particularly important to give speech recognition systems broad applicability.
Chinese patent application CN201310163915.X discloses a method, device, and system for updating a speech recognition device, comprising: receiving a speech input signal; performing speech recognition on the speech input signal with a local speech recognition device to obtain a local speech recognition result; selecting the best result from the local speech recognition result and a cloud speech recognition result as the final speech recognition result, where the cloud speech recognition result is obtained by a cloud speech recognition device that recognizes the speech input signal at the same time as the local device does; determining, from acquired user feedback information and the final speech recognition result, whether the reliability of the local speech recognition result meets requirements; and, when it does not, updating the local speech recognition device using the cloud speech recognition device. The technical solution disclosed in that application uses a cloud speech recognition device for speech recognition, but the improvement in recognition performance is not significant. It also relies on user feedback to determine the reliability of the recognition result, which requires the user to select a result, makes operation more cumbersome, and detracts from the user experience.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a speech recognition method and system based on cloud computing and confidence calculation that combines cloud-based speech recognition with local speech recognition, so that a speech recognition device or system can adapt effectively to a variety of speech input environments and the quality of speech recognition is improved.
To achieve the above object, the present invention adopts the following technical solution:
A speech recognition method based on cloud computing and confidence calculation, comprising the following steps:
S1: a local speech recognition system and a cloud speech recognition system each receive the speech signal;
S2: the local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains a cloud speech recognition result;
S31: confidence evaluation is performed on the local speech recognition result to obtain the confidence of the local speech recognition result;
S32: confidence evaluation is performed on the cloud speech recognition result to obtain the confidence of the cloud speech recognition result;
S4: the confidence of the local speech recognition result is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output.
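The comparison in step S4 can be sketched as follows. This is a minimal illustration only: the `Recognition` record, its field names, and the tie-breaking rule (keep the local result on equal confidence) are assumptions that do not appear in the patent text.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    """Hypothetical container for one recognition result."""
    text: str          # recognized text
    confidence: float  # confidence score; assumed comparable across systems
    source: str        # "local" or "cloud"


def select_result(local: Recognition, cloud: Recognition) -> Recognition:
    """Step S4: return whichever result has the higher confidence.

    The patent text does not say what happens on a tie; keeping the
    local result in that case is an assumption of this sketch.
    """
    return cloud if cloud.confidence > local.confidence else local
```

For example, a local result with confidence 0.6 and a cloud result with confidence 0.8 would yield the cloud result.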
Further, different speech recognition models are provided in the cloud speech recognition system. In step S2, the cloud speech recognition system obtains different candidate cloud speech recognition results based on the different speech recognition models, and step S32 comprises:
S321: confidence evaluation is performed on the different candidate cloud speech recognition results to obtain the confidence of each candidate cloud speech recognition result;
S322: the confidences of the candidate cloud speech recognition results are compared, and the candidate with the highest confidence is output as the cloud speech recognition result.
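Steps S321 and S322 amount to taking the maximum over the candidate results by confidence. A minimal sketch, assuming each candidate is represented as a hypothetical `(model_name, text, confidence)` tuple (this representation is not from the patent text):

```python
from typing import List, Tuple

# Hypothetical representation: (model name, recognized text, confidence).
Candidate = Tuple[str, str, float]


def best_cloud_candidate(candidates: List[Candidate]) -> Candidate:
    """Step S322: pick the candidate cloud result with the highest confidence."""
    if not candidates:
        raise ValueError("at least one candidate cloud result is required")
    # max() returns the first maximal element, so earlier models win ties.
    return max(candidates, key=lambda c: c[2])
```

The selected candidate then plays the role of the cloud speech recognition result in step S4.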
Further, the different speech recognition models include models built on different speech recognition algorithms as well as models built on combinations of different speech recognition algorithms, and the different speech recognition models correspond to different speech input environments.
Further, step S20 is performed before step S2:
S20: the local speech recognition system and the cloud speech recognition system each perform noise reduction on the received speech signal.
Further, in step S20 the cloud speech recognition system performs noise reduction on the speech signal using different speech noise-reduction models. These noise-reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models; the cloud speech recognition system sends each noise-reduced speech signal to the speech recognition model that corresponds to the same speech input environment.
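The one-to-one pairing of noise-reduction models and recognition models can be pictured as a set of per-environment branches. The sketch below is illustrative only: the `Denoiser` and `Recognizer` callables, the environment names, and the dictionary layout are all hypothetical and stand in for whatever models a real system would use.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Signal = Sequence[float]
Denoiser = Callable[[Signal], Signal]                 # hypothetical noise-reduction model
Recognizer = Callable[[Signal], Tuple[str, float]]    # returns (text, confidence)


def run_cloud_branches(
    signal: Signal,
    branches: Dict[str, Tuple[Denoiser, Recognizer]],
) -> List[Tuple[str, str, float]]:
    """Run every (denoiser, recognizer) pair on the same input signal.

    Each branch is keyed by the speech input environment it was built for;
    the denoised signal is passed only to the recognizer of the same branch.
    """
    results = []
    for environment, (denoise, recognize) in branches.items():
        cleaned = denoise(signal)
        text, confidence = recognize(cleaned)
        results.append((environment, text, confidence))
    return results
```

The returned list corresponds to the candidate cloud speech recognition results that steps S321 and S322 then rank by confidence.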
A speech recognition system based on cloud computing and confidence calculation, comprising:
a local speech recognition system, for receiving a speech signal and obtaining a local speech recognition result;
a cloud speech recognition system, for receiving the speech signal and obtaining a cloud speech recognition result;
a confidence evaluation module, which uses a confidence algorithm to perform confidence evaluation on the local speech recognition result and the cloud speech recognition result;
a data processing module, which compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Further, the cloud speech recognition system includes different cloud speech recognition submodules:
in the cloud speech recognition system, the different cloud speech recognition submodules contain different speech recognition models, and each cloud speech recognition submodule receives the speech signal and obtains a candidate cloud speech recognition result;
the confidence evaluation module uses a confidence algorithm to perform confidence evaluation on the local speech recognition result and the candidate cloud speech recognition results;
the data processing module compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Further, the different speech recognition models include models built on different speech recognition algorithms as well as models built on combinations of different speech recognition algorithms, and the different speech recognition models correspond to different speech input environments.
Further, the system also includes a local speech noise-reduction module and a cloud speech noise-reduction module. The local speech noise-reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the local speech recognition system; the cloud speech noise-reduction module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the cloud speech recognition system.
Further, the cloud speech noise-reduction module includes different cloud speech noise-reduction submodules, which contain different speech noise-reduction models. These noise-reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models.
The beneficial effects of the present invention are as follows. The cloud speech recognition system and the local speech recognition system recognize speech simultaneously; the cloud speech recognition system contains multiple speech recognition models corresponding to different input environments, and the best of the various recognition results is output, so that a speech recognition device or system can adapt effectively to a variety of speech input environments and the quality of speech recognition is effectively improved. A confidence algorithm is used to evaluate the various recognition results, which improves the reliability of the recognition result. Confidence estimation incorporates information that is not fully used in traditional speech recognition systems, which reduces the entropy of the speech recognition system and allows the correctness of recognition results to be judged more accurately, thereby improving the system performance of speech recognition.
Brief description of the drawings
Fig. 1 is a flow chart of the speech recognition method based on cloud computing and confidence calculation of the present invention.
Detailed description of the embodiments
The present invention is further described below with reference to the accompanying drawing and specific embodiments:
Embodiment 1
As shown in Fig. 1, a speech recognition method based on cloud computing and confidence calculation comprises the following steps:
S1: a local speech recognition system and a cloud speech recognition system each receive the speech signal;
S20: the local speech recognition system and the cloud speech recognition system each perform noise reduction on the received speech signal, where the cloud speech recognition system performs noise reduction using different speech noise-reduction models built for different speech input environments;
S2: the local speech recognition system obtains a local speech recognition result, and the cloud speech recognition system obtains different candidate cloud speech recognition results based on different speech recognition models. The different speech recognition models include models built on different speech recognition algorithms as well as models built on combinations of different speech recognition algorithms, and they correspond to different speech input environments. The different speech noise-reduction models correspond one-to-one with the different speech recognition models, and each noise-reduction model sends its noise-reduced speech signal to the corresponding speech recognition model;
S31: confidence evaluation is performed on the local speech recognition result to obtain the confidence of the local speech recognition result;
S321: confidence evaluation is performed on the different candidate cloud speech recognition results to obtain the confidence of each candidate cloud speech recognition result;
S322: the confidences of the candidate cloud speech recognition results are compared, and the candidate with the highest confidence is output as the cloud speech recognition result;
S4: if the confidence of the local speech recognition result is below a set value, the cloud speech recognition result is output directly; if the confidence of the local speech recognition result reaches the set value, it is compared with the confidence of the cloud speech recognition result, and the speech recognition result with the higher confidence is output.
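The thresholded decision in step S4 of this embodiment can be sketched as below. The function name, argument names, and the handling of equal confidences (keep the local result) are assumptions of this sketch; the patent text gives no value for the set value.

```python
def decide_output(
    local_text: str,
    local_conf: float,
    cloud_text: str,
    cloud_conf: float,
    threshold: float,
) -> str:
    """Embodiment 1, step S4.

    If the local confidence is below the set value (threshold), the cloud
    result is output directly, without comparing confidences. Otherwise the
    result with the higher confidence is output.
    """
    if local_conf < threshold:
        return cloud_text
    # Equal confidences are not covered by the text; this sketch keeps local.
    return local_text if local_conf >= cloud_conf else cloud_text
```

Note that in the first branch the cloud result is returned even when its own confidence is lower than the local one, which is what distinguishes this rule from the plain comparison.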
Embodiment 2
A speech recognition system based on cloud computing and confidence calculation, comprising:
a local speech noise-reduction module, for performing noise reduction on the speech signal and then sending the noise-reduced speech signal to the local speech recognition system;
a cloud speech noise-reduction module, which includes different cloud speech noise-reduction submodules containing different speech noise-reduction models; these noise-reduction models are built for different speech input environments and correspond one-to-one with the different speech recognition models; the module performs noise reduction on the speech signal and then sends the noise-reduced speech signal to the cloud speech recognition system;
a local speech recognition system, for receiving the speech signal from the local speech noise-reduction module and obtaining a local speech recognition result;
a cloud speech recognition system, which includes different cloud speech recognition submodules containing different speech recognition models; the different speech recognition models include models built on different speech recognition algorithms as well as models built on combinations of different speech recognition algorithms, and they correspond to different speech input environments; the different speech noise-reduction models correspond one-to-one with the different speech recognition models, and each speech recognition submodule receives the speech information from its corresponding speech noise-reduction model and obtains a candidate cloud speech recognition result;
a confidence evaluation module, which uses a confidence algorithm to perform confidence evaluation on the local speech recognition result and the candidate cloud speech recognition results;
a data processing module, which compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
Embodiment 3
Based on the speech recognition method based on cloud computing and confidence calculation of Embodiment 1, or on the speech recognition system based on cloud computing and confidence calculation of Embodiment 2, the different speech recognition algorithms in this embodiment include a template matching algorithm, a probabilistic model algorithm, and an artificial neural network algorithm, wherein:
Template matching algorithm: in the training stage, feature vectors that fully describe the characteristics of the speech signal are extracted to form a feature vector sequence, which is then optimized to obtain a set of feature vectors that represents the sequence; this set of feature vectors serves as the template. In use, the feature vectors of the speech to be recognized are extracted to form its feature vector sequence, which is compared with the feature vector sequence of each template; the speech corresponding to the template with the highest degree of match is taken as the speech recognition result of the template matching algorithm.
Probabilistic model algorithm: in the training stage, feature vectors that fully describe the characteristics of the speech signal are extracted, and a mathematical model is formed from the distribution of these feature vectors in the feature space. In use, the feature vectors of the speech to be recognized are extracted, their distribution in the feature space is compared with each mathematical model, and a similarity is calculated; the speech corresponding to the mathematical model with the highest similarity is taken as the speech recognition result of the probabilistic model algorithm.
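As an illustration of the template matching idea, the sketch below compares a feature vector sequence against stored templates and returns the best match. The patent text does not name a matching measure; dynamic time warping (DTW) with Euclidean frame distance is swapped in here as an assumption, and the function names and the dictionary of templates are hypothetical.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def dtw_distance(a: Sequence[Vector], b: Sequence[Vector]) -> float:
    """Dynamic time warping distance between two feature vector sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated distance aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def match_template(features: Sequence[Vector],
                   templates: Dict[str, Sequence[Vector]]) -> str:
    """Return the label of the template closest to the input sequence.

    "Highest degree of match" is interpreted here as smallest DTW distance.
    """
    return min(templates, key=lambda label: dtw_distance(features, templates[label]))
```

A probabilistic model algorithm would follow the same outer structure, replacing the distance with a model likelihood and selecting the maximum instead of the minimum.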
Embodiment 4
Based on the speech recognition method based on cloud computing and confidence calculation of Embodiment 1, or on the speech recognition system based on cloud computing and confidence calculation of Embodiment 2, the information used in this embodiment to build the confidence estimation model includes: 1) the trace of the Viterbi decoding information and the hidden Markov model (HMM): state alignment information, state duration (segment length), and likelihood; 2) modeling of the alternative hypothesis $H_1$ and the anti-word model; 3) an online garbage model composed of competing candidate results; 4) explicit filler models built for mispronunciations and out-of-vocabulary words; 5) word lattice density. The way the confidence estimation model combines this speech information can be divided into rule-based combination and statistical-model-based combination. Rule-based combination applies different information sources at different recognition stages to estimate confidence separately; its emphasis is on summarizing experience and on forming and adjusting rules. Statistical models include the linear model and the generalized linear model:
Define the odds of an event $A$ as $\mathrm{odds}(A) = \dfrac{p(A)}{1 - p(A)}$, and let the set of all information be $x$. Let $q(c_i = 1 \mid x)$ be an estimate of the true probability $p(c_i = 1 \mid x)$. The linear model of confidence is then:
$$\log\left[\mathrm{odds}(c_i = 1 \mid x)\right] = \log\frac{q(c_i = 1 \mid x)}{1 - q(c_i = 1 \mid x)} = \sum_i t_i x_i,$$
where $x_i$ is a component of $x$, i.e. $x_i \in x$, and $c_i$ is the confidence label: $c_i = 0$ (recognition incorrect) or $c_i = 1$ (recognition correct). The generalized linear model of confidence is:
$$\log\left[\mathrm{odds}(c_i = 1 \mid x)\right] = \log\frac{q(c_i = 1 \mid x)}{1 - q(c_i = 1 \mid x)} = \sum_i g_i(x_i).$$
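Solving the linear model for $q$ gives the logistic function of the weighted sum, $q = 1 / (1 + e^{-\sum_i t_i x_i})$. The sketch below computes this; the function name and the plain-list representation of the information components $x_i$ and weights $t_i$ are assumptions for illustration, and no particular feature set is implied.

```python
import math
from typing import Sequence


def linear_confidence(features: Sequence[float], weights: Sequence[float]) -> float:
    """Confidence estimate q(c=1|x) under the linear log-odds model.

    log[q / (1 - q)] = sum_i t_i * x_i   =>   q = 1 / (1 + exp(-sum_i t_i * x_i))
    """
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    log_odds = sum(t * x for t, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-log_odds))
```

With a zero weighted sum the estimate is 0.5; a weighted sum of $\log 3$ corresponds to odds of 3 to 1, i.e. an estimate of 0.75. The generalized linear model has the same form with each $t_i x_i$ replaced by $g_i(x_i)$.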
In addition, define $w_j(t)$ as the best score of the first $t$ observations ($t$ frames) reaching state $j$ during the search, and $\gamma_i(o_t)$ as the confidence score of state $i$ at frame $t$:
$$w_j(t) = \max_i \left\{ w_i(t-1)\,\gamma_i(o_t) \right\},$$
$$\log \gamma_i(o_t) = \sum_{k=1}^{3} \log v_{ik}(o_t),$$
where $\log v_{ik}$ ($k = 1, 2, 3$) represent three kinds of information: likelihood, segment length, and likelihood ratio, respectively:
$$\log v_{i2}(o_t) = k_2 \log w(d),$$
$$\log v_{i3}(o_t) = k_3 \log w(cm),$$
where $a_{ij}$ and $b_j(o_t)$ are the transition probability and the output probability of the speech recognition model, $k_i$ are the weight coefficients for the different kinds of feature information, and $w(cm)$ is the likelihood-ratio information, computed as follows.
Let the log-likelihood ratio be $LLR$, where the subscripts $c$ and $a$ denote the current speech recognition model and an opposing (anti) speech recognition model, respectively. Then:
$$\log w(cm) = \log \frac{1}{1 + \exp\{-t\,(LLR + u)\}},$$
where $t$ is a positive constant and $u$ is a constant, so the value of $w(cm)$ always lies between 0 and 1. If the likelihood of the current speech recognition model is higher than that of the anti-model, then $LLR > 0$ and $w(cm)$ is close to 1; otherwise it is close to 0. $t$ and $u$ control the decay rate and the position of the function, and their values are determined experimentally.
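The likelihood-ratio term $w(cm)$ is a sigmoid of the log-likelihood ratio. A minimal sketch follows; the function name and the default values `t=1.0` and `u=0.0` are placeholders chosen for illustration, since the text states only that these constants are determined experimentally.

```python
import math


def likelihood_ratio_weight(llr: float, t: float = 1.0, u: float = 0.0) -> float:
    """w(cm) = 1 / (1 + exp(-t * (LLR + u))), a value strictly between 0 and 1.

    t (> 0) controls how steeply the function rises; u shifts its position.
    The defaults here are placeholders, not values from the patent.
    """
    if t <= 0:
        raise ValueError("t must be a positive constant")
    return 1.0 / (1.0 + math.exp(-t * (llr + u)))
```

With the placeholder defaults, a strongly positive LLR gives a weight near 1, a strongly negative LLR gives a weight near 0, and an LLR of zero gives exactly 0.5.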
The above method can be used to compute confidence estimates at different levels (such as phoneme, syllable, and whole word).
It will be apparent to those skilled in the art that various other corresponding changes and modifications can be made based on the technical solution and concept described above, and all such changes and modifications shall fall within the scope of protection of the claims of the present invention.

Claims (10)

1. a kind of audio recognition method based on cloud computing and confidence calculations is it is characterised in that include following steps:
S1, local speech recognition system and high in the clouds speech recognition system receive voice signal respectively;
S2, local speech recognition system draw local voice recognition result, and high in the clouds speech recognition system draws high in the clouds speech recognition Result;
S31, confidence level evaluation and test is carried out to local voice recognition result, draw the confidence level of local voice recognition result;
S32, confidence level evaluation and test is carried out to high in the clouds voice identification result, draw the confidence level of high in the clouds voice identification result;
S4, the confidence level of the confidence level of local voice recognition result and high in the clouds voice identification result is compared, by confidence level Higher voice identification result is exported.
2. the audio recognition method based on cloud computing and confidence calculations as claimed in claim 1 is it is characterised in that high in the clouds language It is provided with different speech recognition modelings, in step s2, high in the clouds speech recognition system is known based on different voices in sound identifying system Other model draws different plan high in the clouds voice identification results, and the content of step s32 includes:
S321, confidence level evaluation and test is carried out to different plan high in the clouds voice identification results, draw and know corresponding to different high in the clouds voices of intending The confidence level of other result;
S322, the confidence levels that will intend high in the clouds voice identification result corresponding to difference be compared, and confidence level highest is intended high in the clouds Voice identification result is exported as high in the clouds voice identification result.
3. the audio recognition method based on cloud computing and confidence calculations as claimed in claim 2 is it is characterised in that different Speech recognition modeling that speech recognition modeling includes setting up based on different speech recognition algorithms, also include based on different languages The speech recognition modeling that sound recognizer combines and sets up, different speech recognition modelings corresponds to different phonetic entry rings Border.
4. the audio recognition method based on cloud computing and confidence calculations as claimed in claim 3 is it is characterised in that carrying out Before step s2, first carry out step s20:
S20, local speech recognition system and high in the clouds speech recognition system carry out noise reduction process to the voice signal receiving respectively.
5. the audio recognition method based on cloud computing and confidence calculations as claimed in claim 4 is it is characterised in that step In s20, high in the clouds speech recognition system carries out noise reduction process using different voice de-noising models to voice signal, this different language Sound noise reduction model is based on the different foundation of phonetic entry environment, this different voice de-noising model and different speech recognition modelings Correspond, the voice signal completing noise reduction process is sent to corresponding to same phonetic entry environment high in the clouds speech recognition system Speech recognition modeling.
6. a kind of speech recognition system based on cloud computing and confidence calculations is it is characterised in that include:
Local speech recognition system, for receiving voice signal and drawing local voice recognition result;
High in the clouds speech recognition system, for receiving voice signal and drawing high in the clouds voice identification result;
Confidence level evaluates and tests module, carries out confidence using certainty factor algebra to local voice recognition result and high in the clouds voice identification result Degree evaluation and test;
Data processing module, the confidence level of the confidence level of local voice recognition result and high in the clouds voice identification result is compared Relatively, and export the higher voice identification result of confidence level.
7. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 6, characterized in that the cloud speech recognition system comprises different cloud speech recognition submodules:
In the cloud speech recognition system, the different cloud speech recognition submodules contain different speech recognition models, and each cloud speech recognition submodule is used for receiving the voice signal and producing a candidate cloud speech recognition result;
The confidence evaluation module uses the confidence algorithm to evaluate the confidence of the local speech recognition result and of the candidate cloud speech recognition results;
The data processing module compares the confidences of the candidate cloud speech recognition results output by the different cloud speech recognition submodules and takes the candidate with the highest confidence as the cloud speech recognition result; it then compares the confidence of the local speech recognition result with the confidence of the cloud speech recognition result and outputs the speech recognition result with the higher confidence.
8. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 7, characterized in that the different speech recognition models include speech recognition models built on different speech recognition algorithms and also speech recognition models built on combinations of different speech recognition algorithms, and the different speech recognition models correspond to different voice input environments.
9. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 8, characterized by further comprising a local speech noise-reduction module and a cloud speech noise-reduction module; the local speech noise-reduction module performs noise reduction on the voice signal and then sends the noise-reduced voice signal to the local speech recognition system; the cloud speech noise-reduction module performs noise reduction on the voice signal and then sends the noise-reduced voice signal to the cloud speech recognition system.
10. The speech recognition system based on cloud computing and confidence calculation as claimed in claim 9, characterized in that the cloud speech noise-reduction module comprises different cloud speech noise-reduction submodules, the different cloud speech noise-reduction submodules contain different speech noise-reduction models, the different speech noise-reduction models are established for different voice input environments, and the different speech noise-reduction models correspond one-to-one with the different speech recognition models.
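
The selection logic described in claims 6 to 10 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the names `Result`, `pick_best`, and `recognize`, the representation of a voice signal as a list of floats, and the convention that each recognizer returns a (text, confidence) pair are all assumptions made for illustration. The claims also do not say what happens on a confidence tie or when no cloud result is available; here the local result wins in both cases, which is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Audio = List[float]                                 # hypothetical stand-in for a voice signal
Denoiser = Callable[[Audio], Audio]                 # a noise-reduction model
Recognizer = Callable[[Audio], Tuple[str, float]]   # returns (text, confidence)


@dataclass(frozen=True)
class Result:
    text: str
    confidence: float
    source: str  # "local" or "cloud:<environment>"


def pick_best(results: List[Result]) -> Result:
    # max() keeps the first item on ties, so earlier entries win ties (assumption).
    return max(results, key=lambda r: r.confidence)


def recognize(
    audio: Audio,
    local: Tuple[Denoiser, Recognizer],
    cloud_by_env: Dict[str, Tuple[Denoiser, Recognizer]],
) -> Result:
    # Local path: noise reduction, then local recognition (claim 9).
    local_denoise, local_recognize = local
    text, conf = local_recognize(local_denoise(audio))
    local_result = Result(text, conf, "local")

    # Cloud path: each environment pairs one noise-reduction model with one
    # recognition model (claims 5 and 10) and yields a candidate result.
    candidates = []
    for env, (denoise, recognize_fn) in cloud_by_env.items():
        text, conf = recognize_fn(denoise(audio))
        candidates.append(Result(text, conf, "cloud:" + env))

    if not candidates:
        # Fallback when no cloud result exists (assumption, not in the claims).
        return local_result

    # The highest-confidence candidate becomes the cloud result (claim 7),
    # which is then compared with the local result (claim 6).
    cloud_result = pick_best(candidates)
    return pick_best([local_result, cloud_result])


if __name__ == "__main__":
    def passthrough(signal: Audio) -> Audio:
        return signal  # stub noise-reduction model

    best = recognize(
        [0.1, 0.2],
        (passthrough, lambda a: ("turn on light", 0.6)),
        {
            "car": (passthrough, lambda a: ("turn on the light", 0.9)),
            "street": (passthrough, lambda a: ("turn of the night", 0.4)),
        },
    )
    print(best.source, best.text)
```

The stub recognizers return fixed confidences; in a real system those values would come from the confidence algorithm applied by the confidence evaluation module.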
CN201610840519.XA 2016-09-21 2016-09-21 Speech recognition method and system based on cloud computing and confidence calculation Pending CN106340297A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610840519.XA CN106340297A (en) 2016-09-21 2016-09-21 Speech recognition method and system based on cloud computing and confidence calculation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610840519.XA CN106340297A (en) 2016-09-21 2016-09-21 Speech recognition method and system based on cloud computing and confidence calculation

Publications (1)

Publication Number Publication Date
CN106340297A true CN106340297A (en) 2017-01-18

Family

ID=57838636

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610840519.XA Pending CN106340297A (en) 2016-09-21 2016-09-21 Speech recognition method and system based on cloud computing and confidence calculation

Country Status (1)

Country Link
CN (1) CN106340297A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316637A (en) * 2017-05-31 2017-11-03 广东欧珀移动通信有限公司 Audio recognition method and Related product
CN107564525A (en) * 2017-10-23 2018-01-09 深圳北鱼信息科技有限公司 Audio recognition method and device
CN107733762A (en) * 2017-11-20 2018-02-23 马博 The sound control method and device of a kind of smart home, system
CN108022593A (en) * 2018-01-16 2018-05-11 成都福兰特电子技术股份有限公司 A kind of high sensitivity speech recognition system and its control method
CN108806682A (en) * 2018-06-12 2018-11-13 奇瑞汽车股份有限公司 The method and apparatus for obtaining Weather information
CN109979454A (en) * 2019-03-29 2019-07-05 联想(北京)有限公司 Data processing method and device
CN110634481A (en) * 2019-08-06 2019-12-31 惠州市德赛西威汽车电子股份有限公司 Voice integration method for outputting optimal recognition result
WO2020082724A1 (en) * 2018-10-26 2020-04-30 华为技术有限公司 Method and apparatus for object classification
CN111145757A (en) * 2020-02-18 2020-05-12 上海华镇电子科技有限公司 Vehicle-mounted voice intelligent Bluetooth integration device and method
CN113380253A (en) * 2021-06-21 2021-09-10 紫优科技(深圳)有限公司 Voice recognition system, device and medium based on cloud computing and edge computing
CN113380254A (en) * 2021-06-21 2021-09-10 紫优科技(深圳)有限公司 Voice recognition method, device and medium based on cloud computing and edge computing
CN113450781A (en) * 2020-03-25 2021-09-28 阿里巴巴集团控股有限公司 Speech processing method, speech encoder, speech decoder and speech recognition system
CN115410578A (en) * 2022-10-27 2022-11-29 广州小鹏汽车科技有限公司 Processing method of voice recognition, processing system thereof, vehicle and readable storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1633679A (en) * 2001-12-29 2005-06-29 摩托罗拉公司 Method and apparatus for multi-level distributed speech recognition
CN102439660A (en) * 2010-06-29 2012-05-02 株式会社东芝 Voice-tag method and apparatus based on confidence score
CN102710539A (en) * 2012-05-02 2012-10-03 中兴通讯股份有限公司 Method and device for transferring voice messages
US20140303974A1 (en) * 2013-04-03 2014-10-09 Kabushiki Kaisha Toshiba Text generator, text generating method, and computer program product

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
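Liu Jing, Liu Jia: "The principle of confidence measures and its application in speech recognition" (置信度的原理及其在语音识别中的应用), Journal of Computer Research and Development (《计算机研究与发展》) *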

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107316637A (en) * 2017-05-31 2017-11-03 广东欧珀移动通信有限公司 Audio recognition method and Related product
CN107564525A (en) * 2017-10-23 2018-01-09 深圳北鱼信息科技有限公司 Audio recognition method and device
CN107733762A (en) * 2017-11-20 2018-02-23 马博 The sound control method and device of a kind of smart home, system
CN107733762B (en) * 2017-11-20 2020-07-24 宁波向往智能科技有限公司 Voice control method, device and system for smart home
CN108022593A (en) * 2018-01-16 2018-05-11 成都福兰特电子技术股份有限公司 A kind of high sensitivity speech recognition system and its control method
CN108806682A (en) * 2018-06-12 2018-11-13 奇瑞汽车股份有限公司 The method and apparatus for obtaining Weather information
CN108806682B (en) * 2018-06-12 2020-12-01 奇瑞汽车股份有限公司 Method and device for acquiring weather information
WO2020082724A1 (en) * 2018-10-26 2020-04-30 华为技术有限公司 Method and apparatus for object classification
CN109979454A (en) * 2019-03-29 2019-07-05 联想(北京)有限公司 Data processing method and device
CN110634481B (en) * 2019-08-06 2021-11-16 惠州市德赛西威汽车电子股份有限公司 Voice integration method for outputting optimal recognition result
CN110634481A (en) * 2019-08-06 2019-12-31 惠州市德赛西威汽车电子股份有限公司 Voice integration method for outputting optimal recognition result
CN111145757A (en) * 2020-02-18 2020-05-12 上海华镇电子科技有限公司 Vehicle-mounted voice intelligent Bluetooth integration device and method
CN113450781A (en) * 2020-03-25 2021-09-28 阿里巴巴集团控股有限公司 Speech processing method, speech encoder, speech decoder and speech recognition system
CN113450781B (en) * 2020-03-25 2022-08-09 阿里巴巴集团控股有限公司 Speech processing method, speech encoder, speech decoder and speech recognition system
CN113380254A (en) * 2021-06-21 2021-09-10 紫优科技(深圳)有限公司 Voice recognition method, device and medium based on cloud computing and edge computing
CN113380253A (en) * 2021-06-21 2021-09-10 紫优科技(深圳)有限公司 Voice recognition system, device and medium based on cloud computing and edge computing
CN113380254B (en) * 2021-06-21 2024-05-24 枣庄福缘网络科技有限公司 Voice recognition method, device and medium based on cloud computing and edge computing
CN115410578A (en) * 2022-10-27 2022-11-29 广州小鹏汽车科技有限公司 Processing method of voice recognition, processing system thereof, vehicle and readable storage medium

Similar Documents

Publication Publication Date Title
CN106340297A (en) Speech recognition method and system based on cloud computing and confidence calculation
CN109326283B (en) Many-to-many voice conversion method based on text encoder under non-parallel text condition
US11410029B2 (en) Soft label generation for knowledge distillation
WO2021174757A1 (en) Method and apparatus for recognizing emotion in voice, electronic device and computer-readable storage medium
US11062699B2 (en) Speech recognition with trained GMM-HMM and LSTM models
CN106098059B (en) Customizable voice awakening method and system
CN103400577B (en) The acoustic model method for building up of multilingual speech recognition and device
CN103065620B (en) Method with which text input by user is received on mobile phone or webpage and synthetized to personalized voice in real time
CN106297800B (en) Self-adaptive voice recognition method and equipment
CN108281137A (en) A kind of universal phonetic under whole tone element frame wakes up recognition methods and system
CN111179916A (en) Re-scoring model training method, voice recognition method and related device
KR102199246B1 (en) Method And Apparatus for Learning Acoustic Model Considering Reliability Score
CN111653274B (en) Wake-up word recognition method, device and storage medium
CN111599339B (en) Speech splicing synthesis method, system, equipment and medium with high naturalness
CN110827799B (en) Method, apparatus, device and medium for processing voice signal
CN111091809B (en) Regional accent recognition method and device based on depth feature fusion
CN117115581A (en) Intelligent misoperation early warning method and system based on multi-mode deep learning
CN112331207A (en) Service content monitoring method and device, electronic equipment and storage medium
CN115312033A (en) Speech emotion recognition method, device, equipment and medium based on artificial intelligence
CN107610720B (en) Pronunciation deviation detection method and device, storage medium and equipment
Hammami et al. Tree distribution classifier for automatic spoken arabic digit recognition
CN104199811A (en) Short sentence analytic model establishing method and system
CN108182938B (en) A kind of training method of the Mongol acoustic model based on DNN
CN116189671A (en) Data mining method and system for language teaching
CN115687934A (en) Intention recognition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170118