CN1119794C - Distributed voice recognition system - Google Patents
Distributed voice recognition system
- Publication number
- CN1119794C (application number CN94194566A)
- Authority
- CN
- China
- Prior art keywords
- station
- word decoder
- speech
- parameter
- conversion
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Lifetime
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
Abstract
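A voice recognition system having a feature extraction apparatus (22) located in a remote station (40). The feature extraction apparatus (22) extracts features from an input speech frame and provides the extracted features to a central processing station (42). In the central processing station (42), the features are provided to a word decoder (48), which determines the syntax of the input speech frame.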
Description
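Technical Field
The present invention relates to speech signal processing. More particularly, it relates to a novel method and apparatus for realizing a distributed implementation of a standard voice recognition system.
Background Art
Voice recognition is one of the most important techniques for endowing a machine with simulated intelligence so that it can recognize a user or user-voiced commands and facilitate the human-machine interface. It is also a key technique for understanding human speech. Systems that use various techniques to recover a linguistic message from an acoustic speech signal are called voice recognizers (VR). A voice recognizer consists of an acoustic processor and a word decoder. The acoustic processor extracts from the incoming raw speech a sequence of information-bearing features (vectors) needed for VR, and the word decoder decodes that sequence of features (vectors) to produce a meaningful and desired output format, such as a sequence of linguistic words corresponding to the input utterance. To improve the performance of a given system, training is required to equip the system with valid parameters. In other words, the system needs to learn before it can function optimally.
The acoustic processor is the front-end speech analysis subsystem of a voice recognizer. In response to an input speech signal, it provides an appropriate representation that characterizes the time-varying speech signal. It should discard irrelevant information such as background noise, channel distortion, speaker characteristics, and manner of speaking. Efficient acoustic features give the voice recognizer higher acoustic discrimination power. The most useful characteristic is the short-time spectral envelope. The two most commonly used spectral analysis techniques for characterizing the short-time spectral envelope are linear predictive coding (LPC) and filter-bank-based spectral analysis. However, as discussed in L. R. Rabiner and R. W. Schafer, Digital Processing of Speech Signals (Prentice Hall, 1978), it is readily shown that LPC not only provides a good approximation to the vocal tract spectral envelope but is also considerably less expensive computationally than the filter-bank model in all digital implementations. Experience has also shown that the performance of LPC-based voice recognizers is comparable to or better than that of filter-bank-based recognizers; see L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition (Prentice Hall, 1993).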
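Referring to Figure 1, in an LPC-based acoustic processor, the input speech is provided to a microphone (not shown) and converted into an analog electrical signal. The electrical signal is then digitized by an A/D converter (not shown). The digitized speech signal is passed through pre-emphasis filter 2 to flatten its spectrum and to make it less susceptible to finite-precision effects in subsequent signal processing. The pre-emphasis-filtered speech is then provided to segmentation element 4, where it is segmented, or blocked, into temporally overlapping or non-overlapping frames. The frames of speech data are then provided to windowing element 6, where the DC component of each frame is removed and a digital windowing operation is performed on each frame to lessen the blocking effects caused by discontinuities at frame boundaries. The window function most commonly used in LPC analysis is the Hamming window $w(n)$, defined for a frame of $N$ samples as:
$$w(n) = 0.54 - 0.46\cos\left(\frac{2\pi n}{N-1}\right), \quad 0 \le n \le N-1$$
The windowed speech is provided to LPC analysis element 8. In LPC analysis element 8, the autocorrelation function is calculated from the windowed samples, and the corresponding LPC parameters are obtained directly from the autocorrelation function.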
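The front-end chain described above (pre-emphasis, framing, DC removal, Hamming windowing, autocorrelation, LPC analysis) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the pre-emphasis coefficient 0.95, the function names, and the use of the Levinson-Durbin recursion to solve for the LPC coefficients are assumptions. The text only states that the LPC parameters are obtained from the autocorrelation function.

```python
import math

def pre_emphasize(samples, alpha=0.95):
    """y[n] = x[n] - alpha * x[n-1]; alpha = 0.95 is an assumed, typical value."""
    if not samples:
        return []
    return [samples[0]] + [samples[n] - alpha * samples[n - 1]
                           for n in range(1, len(samples))]

def split_frames(samples, frame_len, hop):
    """Block the signal into frames; hop < frame_len gives overlapping frames."""
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]

def hamming(n_points):
    """Hamming window w(n) = 0.54 - 0.46 cos(2 pi n / (N - 1)), N >= 2."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_points - 1))
            for n in range(n_points)]

def window_frame(frame):
    """Remove the frame's DC component, then apply the Hamming window."""
    mean = sum(frame) / len(frame)
    return [(x - mean) * w for x, w in zip(frame, hamming(len(frame)))]

def autocorrelation(frame, order):
    """r[k] = sum_n frame[n] * frame[n + k] for k = 0..order."""
    return [sum(frame[n] * frame[n + k] for n in range(len(frame) - k))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve for a[1..order] in x[n] ~ sum_k a[k] x[n-k]; returns (a, error)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:], err
```

A frame would pass through `window_frame`, then `autocorrelation`, then `levinson_durbin`; the frame length, hop size, and LPC order are left to the caller because the text does not fix them.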
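Generally speaking, the word decoder translates the acoustic feature sequence produced by the acoustic processor into an estimate of the speaker's original word string. This is accomplished in two steps: acoustic pattern matching and language modeling. Language modeling can be avoided in isolated word recognition applications. The LPC parameters from LPC analysis element 8 are provided to acoustic pattern matching element 10 to detect and classify possible acoustic patterns, such as phonemes, syllables, and words. The candidate patterns are provided to language modeling element 12, which models the syntactic constraint rules that determine which word sequences are grammatically well formed and meaningful. Syntactic information can be a valuable guide to voice recognition when the acoustic information alone is ambiguous. Based on language modeling, the VR sequentially interprets the acoustic feature matching results and provides the estimated word string.
Both the acoustic pattern matching and the language modeling in the word decoder require a mathematical model, either deterministic or stochastic, to describe the speaker's phonological and acoustic-phonetic variations. The performance of a voice recognition system is directly related to the quality of these two models. Among the various classes of models for acoustic pattern matching, template-based dynamic time warping (DTW) and stochastic hidden Markov modeling (HMM) are the two most commonly used. However, it is known that a DTW-based approach can be viewed as a special case of an HMM-based one, which is a parametric, doubly stochastic model. HMM systems are currently the most successful voice recognition algorithms. The doubly stochastic property of the HMM provides better flexibility in absorbing the acoustic as well as the temporal variations associated with speech signals, which usually results in improved recognition accuracy. Concerning the language model, a stochastic model called the k-gram language model has been successfully applied in practical large-vocabulary voice recognition systems; see F. Jelinek, "The Development of an Experimental Discrete Dictation Recognizer", Proceedings of the IEEE, vol. 73, pp. 1616-1624, 1985. In the small-vocabulary case, a deterministic grammar has been formulated as a finite state network (FSN) in applications such as airline reservation and information systems; see L. R. Rabiner and S. E. Levinson, "A Speaker-Independent, Syntax-Directed, Connected Word Recognition System Based on Hidden Markov Models and Level Building", IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 33, no. 3, June 1985.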
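To make the template-based approach concrete, the sketch below computes a basic dynamic time warping distance between two feature sequences. It is an illustrative textbook form, assuming scalar features, an absolute-difference local cost, and unconstrained step patterns; the function name and these choices are assumptions and are not taken from the patent.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Minimum accumulated cost of aligning sequence a with sequence b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances alone
                                     cost[i][j - 1],      # b advances alone
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

In a template matcher of this kind, the input utterance would be compared against one stored template per vocabulary word and the word with the smallest distance chosen; for vector features, `dist` can be replaced by a Euclidean distance.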
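Statistically, in order to minimize the probability of recognition error, the voice recognition problem can be formalized as follows: given the acoustic evidence observation $O$, the recognition operation is to find the most likely word string $W^*$ such that
$$W^* = \arg\max_W P(W \mid O) \qquad (1)$$
where the maximization is over all possible word strings $W$. By Bayes' rule, the posterior probability in the above equation can be rewritten as:
$$P(W \mid O) = \frac{P(W)\,P(O \mid W)}{P(O)} \qquad (2)$$
Since $P(O)$ is irrelevant to recognition, the word string estimate can alternatively be obtained as
$$W^* = \arg\max_W P(W)\,P(O \mid W) \qquad (3)$$
Here $P(W)$ is the a priori probability that the word string $W$ will be uttered, and $P(O \mid W)$ is the probability that the acoustic evidence $O$ will be observed given that the speaker uttered the word sequence $W$. $P(O \mid W)$ is determined by acoustic pattern matching, while the a priori probability $P(W)$ is defined by the language model used.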
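The decision rule of equation (3) can be illustrated with a toy example. The sketch below assumes a small, explicit list of candidate word strings with nonzero probabilities; the candidate strings and all numbers are hypothetical, and a real decoder searches a far larger space.

```python
import math

def decode(prior, likelihood):
    """Return argmax_W P(W) * P(O|W), computed in the log domain.

    Assumes every probability is strictly positive.
    """
    def score(w):
        return math.log(prior[w]) + math.log(likelihood[w])
    return max(prior, key=score)

# Hypothetical numbers for two candidate word strings.
prior = {"call home": 0.6, "call Rome": 0.4}          # P(W), language model
likelihood = {"call home": 0.20, "call Rome": 0.25}   # P(O|W), acoustic match
best = decode(prior, likelihood)
```

Here the acoustic evidence alone slightly favors "call Rome", but the language model prior tips the decision to "call home", which is the role the text assigns to $P(W)$ when the acoustic information is ambiguous.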
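In connected word recognition, if the vocabulary is small (fewer than 100 words), a deterministic grammar can be used to rigidly govern which words can logically follow other words to form legal sentences in the language. The deterministic grammar can be incorporated implicitly in the acoustic matching algorithm to constrain the search space of potential words and dramatically reduce the computation. However, when the vocabulary is medium (more than 100 but fewer than 1000 words) or large (more than 1000 words), the probability of the word sequence $W = (w_1, w_2, \ldots, w_n)$ can be obtained by stochastic language modeling. From simple probability theory, the prior probability $P(W)$ can be decomposed as:
$$P(W) = \prod_{i=1}^{n} P(w_i \mid w_1, w_2, \ldots, w_{i-1}) \qquad (4)$$
where $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ is the probability that $w_i$ will be spoken given that the word sequence $(w_1, w_2, \ldots, w_{i-1})$ precedes it. The choice of $w_i$ depends on the entire past history of input words. For a vocabulary of size $V$, $V^i$ values are required to specify $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ completely. Even for a medium vocabulary, this requires a formidable number of samples to train the language model. An inaccurate estimate of $P(w_i \mid w_1, w_2, \ldots, w_{i-1})$ caused by insufficient training data will depreciate the results of the original acoustic matching.
A practical solution to the above problem is to assume that $w_i$ depends only on the $(k-1)$ preceding words $w_{i-1}, w_{i-2}, \ldots, w_{i-k+1}$. A stochastic language model can then be completely described in terms of $P(w_i \mid w_{i-1}, w_{i-2}, \ldots, w_{i-k+1})$, from which the k-gram language model is derived. Since most word strings will never occur in the language if $k > 3$, the unigram ($k = 1$), bigram ($k = 2$), and trigram ($k = 3$) are the most powerful stochastic language models that take grammar into account statistically. Language modeling contains both syntactic and semantic information that is valuable in recognition, but these probabilities must be trained from a large collection of speech data. When the available training data are relatively limited, so that a given k-gram may never occur in the data, $P(w_i \mid w_{i-2}, w_{i-1})$ can be estimated directly from the bigram probability $P(w_i \mid w_{i-1})$. Details of this process can be found in F. Jelinek, "The Development of an Experimental Discrete Dictation Recognizer", Proceedings of the IEEE, vol. 73, pp. 1616-1624, 1985. In connected word recognition, the whole-word model is used as the basic speech unit, while in continuous voice recognition, subband units such as phonemes, syllables, or demisyllables may be used as the basic speech unit. The word decoder is modified accordingly.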
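As an illustration of the k-gram idea with k = 2, the sketch below trains a bigram model by relative-frequency counting and evaluates the prior probability of a word string with the chain rule. It is a minimal sketch under stated assumptions: the start token `<s>`, the function names, and the tiny training set are hypothetical, and no smoothing or back-off is applied, so unseen bigrams get probability zero.

```python
from collections import Counter

def train_bigram(sentences):
    """Count how often each word follows each context word."""
    context_counts = Counter()
    pair_counts = Counter()
    for words in sentences:
        prev = "<s>"                      # assumed sentence-start token
        for w in words:
            context_counts[prev] += 1
            pair_counts[(prev, w)] += 1
            prev = w
    return context_counts, pair_counts

def bigram_prob(model, prev, word):
    """Relative-frequency estimate of P(word | prev); 0.0 if prev is unseen."""
    context_counts, pair_counts = model
    if context_counts[prev] == 0:
        return 0.0
    return pair_counts[(prev, word)] / context_counts[prev]

def sentence_prob(model, words):
    """P(W) as the product of P(w_i | w_{i-1}) over the word string."""
    p, prev = 1.0, "<s>"
    for w in words:
        p *= bigram_prob(model, prev, w)
        prev = w
    return p

# Hypothetical training data.
model = train_bigram([["call", "home"], ["call", "office"], ["dial", "home"]])
```

With this data, `sentence_prob(model, ["call", "home"])` multiplies P(call | start) = 2/3 by P(home | call) = 1/2. A vocabulary of V words needs on the order of V squared such entries, compared with V to the power i for the full history in equation (4).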
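Conventional voice recognition systems integrate the acoustic processor and the word decoder without taking into account their separability, the limitations of application systems (such as power consumption and available memory), or the characteristics of communication channels. This motivates interest in devising a distributed voice recognition system in which these two components are appropriately separated.
Summary of the Invention
The present invention is a novel and improved distributed voice recognition system in which (i) the front-end acoustic processor can be LPC based or filter bank based; (ii) the acoustic pattern matching in the word decoder can be based on hidden Markov models (HMM), dynamic time warping (DTW), or even neural networks (NN); and (iii) for connected or continuous word recognition, the language model can be based on deterministic or stochastic grammars. Unlike a conventional voice recognizer, the present invention improves system performance by appropriately separating the feature extraction and word decoding components. As the embodiments below demonstrate, if LPC-based features such as cepstrum coefficients are to be sent over a communication channel, a transformation between LPC and LSP can be used to reduce the effect of noise on the feature sequence.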
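Brief Description of the Drawings
The features, objects, and advantages of the present invention will become more apparent from the detailed description set forth below when taken in conjunction with the drawings, in which like reference characters identify corresponding parts throughout, and wherein:
Figure 1 is a block diagram of a conventional voice recognition system;
Figure 2 is a block diagram of an exemplary embodiment of the present invention in a wireless communication environment;
Figure 3 is a general block diagram of the present invention;
Figure 4 is a block diagram of an exemplary embodiment of the transform element and inverse transform element of the present invention;
Figure 5 is a block diagram of a preferred embodiment of the present invention comprising a local word decoder and a remote word decoder.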
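Detailed Description of the Embodiments
In a standard voice recognizer, whether in recognition or in training, most of the computational complexity is concentrated in the word decoder subsystem. In implementing voice recognizers with a distributed system architecture, it is often desirable to place the word decoding task in the subsystem that can appropriately absorb the computational load. The acoustic processor, however, should reside as close to the speech source as possible to reduce the effects of quantization errors introduced by signal processing and/or channel-induced errors.
Figure 2 shows an exemplary embodiment of the present invention. In this embodiment, the environment is a wireless communication system comprising a portable cellular telephone or personal communications device 40 and a central communications center 42, referred to as a cell base station. In this embodiment, a distributed VR system is presented. In the distributed VR, the acoustic processor, or feature extraction element 22, resides in personal communications device 40, and word decoder 48 resides in the central communications center. If, instead of distributed VR, VR were implemented solely in the portable cellular phone, it would be highly infeasible even for medium-vocabulary connected word recognition because of the high computation cost. On the other hand, if VR resided solely at the base station, accuracy could be decreased dramatically by the degradation of speech signals associated with the speech codec and by channel effects. Evidently, the proposed distributed system design has three advantages. First, the cost of the cellular telephone is reduced because the word decoder hardware no longer resides in telephone 40. Second, the drain on the battery (not shown) of portable telephone 40 that would result from performing the computationally intensive word decoder operation locally is reduced. Third, recognition accuracy is expected to improve, in addition to the flexibility and extendibility of the distributed system.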
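Speech is provided to microphone 20, which converts the speech signal into electrical signals that are provided to feature extraction element 22. The signals from microphone 20 may be analog or digital. If the signals are analog, an analog-to-digital converter (not shown) is needed between microphone 20 and feature extraction element 22. Feature extraction element 22 extracts the relevant characteristics of the input speech that will be used to decode the linguistic interpretation of the input speech. One example of characteristics that can be used to estimate speech is the frequency characteristics of an input speech frame, which are frequently provided as linear predictive coding parameters of the input speech frame. The extracted speech features are then provided to transmitter 24, which codes, modulates, and amplifies the extracted feature signal and provides the modulated features through duplexer 26 to antenna 28, which transmits the speech features to the cell base station or central communications center 42. Any of the digital coding, modulation, and transmission schemes known in the art may be employed.
At central communications center 42, the transmitted features are received at antenna 44 and provided to receiver 46. Receiver 46 demodulates and decodes the received features and provides them to word decoder 48. Word decoder 48 determines a linguistic estimate of the speech from the speech features and provides an action signal to transmitter 50. Transmitter 50 amplifies, modulates, and codes the action signal and provides the amplified signal to antenna 52, which transmits the estimated words or a command signal to portable phone 40. Transmitter 50 may also employ known digital coding, modulation, or transmission techniques.
At portable phone 40, the estimated words or command signals are received at antenna 28, which provides the received signal through duplexer 26 to receiver 30. Receiver 30 demodulates and decodes the signal and then provides the command signal or estimated words to control element 38. In response to the received command signal or estimated words, control element 38 provides the intended response (for example, dialing a phone number or providing information to a display screen on the portable phone).
The system shown in Figure 2 can also be used in a slightly different way: the information sent back from central communications center 42 need not be an interpretation of the transmitted speech; it may instead be a response to the decoded message sent by the portable phone. For example, one may inquire about messages on a remote answering machine (not shown) coupled via a communications network to central communications center 42, in which case the signal transmitted from central communications center 42 to portable telephone 40 may be the messages from the answering machine. A second control element 49 may be collocated in the central communications center.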
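The significance of placing feature extraction element 22 in portable phone 40 rather than at central communications center 42 is as follows. If, as opposed to distributed VR, the acoustic processor is placed at central communications center 42, a low-bandwidth digital radio channel may require a vocoder (at the first subsystem), which limits the resolution of the feature vectors because of quantization distortion. By placing the acoustic processor in the portable or cellular phone, however, the entire channel bandwidth can be dedicated to feature transmission. Usually, transmitting the extracted acoustic feature vectors requires less bandwidth than transmitting the speech signal. Since recognition accuracy is highly dependent on degradation of the input speech signal, feature extraction element 22 should be placed as close to the user as possible, so that it extracts feature vectors from microphone speech rather than from vocoded telephone speech, which may additionally be corrupted in transmission.
In real applications, voice recognizers are designed to operate under ambient conditions such as background noise. It is therefore important to consider the problem of voice recognition in the presence of noise. It has been shown that if the vocabulary (reference patterns) is trained in an environment exactly (or approximately) the same as the test condition, voice recognizers not only provide good performance even in very noisy environments but also significantly reduce the degradation in recognition accuracy caused by noise. Mismatch between training and test conditions is one of the major factors degrading recognition performance. Since, as noted earlier, acoustic features require less bandwidth to transmit than speech signals, they can be assumed to traverse communication channels more reliably than speech signals, so the proposed distributed voice recognition system is advantageous in providing matched conditions. If a voice recognizer is implemented remotely, the matched conditions can be badly broken by channel variations such as the fading encountered in wireless communications. Implementing VR locally avoids these effects if the huge training computations can be absorbed locally. Unfortunately, in many applications this is not possible. Evidently, a distributed voice recognition implementation can avoid the mismatch conditions induced by channel complexity and compensate for the shortcomings of centralized implementations.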
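Referring to Figure 3, digital speech samples are provided to feature extraction element 51, which provides the features over communication channel 56 to word estimation element 62, where an estimated word string is determined. The speech signals are provided to acoustic processor 52, which determines potential features for each speech frame. Since the word decoder requires an acoustic feature sequence as input for both recognition and training tasks, these acoustic features need to be transmitted across communication channel 56. However, not all potential features used in typical voice recognition systems are suitable for transmission through noisy channels. In some cases, transform element 54 is required to facilitate source encoding and to reduce the effects of channel noise. One example of LPC-based acoustic features widely used in voice recognizers is the cepstrum coefficients $\{c_i\}$. They can be obtained directly from the LPC coefficients $\{a_i\}$ as follows:
$$c_m = a_m + \sum_{k=1}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \quad m = 1, \ldots, P \qquad (5)$$
$$c_m = \sum_{k=m-P}^{m-1} \frac{k}{m}\, c_k\, a_{m-k}, \quad m = P+1, \ldots, Q \qquad (6)$$
where $P$ is the order of the LPC filter used and $Q$ is the size of the cepstrum feature vector. Since cepstrum feature vectors change rapidly, a sequence of frames of cepstrum coefficients is not easy to compress. However, there exists a transformation between LPC and line spectrum pair (LSP) frequencies; the latter change slowly and can be efficiently encoded by a differential (delta) pulse code modulation (DPCM) scheme. Since the cepstrum coefficients can be derived directly from the LPC coefficients, transform element 54 transforms the LPC coefficients into LSP frequencies, which are then encoded to traverse communication channel 56. At remote word estimation element 62, inverse transform element 60 inverse transforms the transformed potential features to provide the acoustic features to word decoder 64, which in response provides an estimated word string.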
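The conversion from LPC coefficients to cepstrum coefficients can be sketched with the standard recursion. The code below is a minimal illustration that assumes the all-pole model $H(z) = 1 / (1 - \sum_k a_k z^{-k})$; with the opposite sign convention for $a_k$ the signs change. The function name is hypothetical.

```python
def lpc_to_cepstrum(a, q):
    """Convert LPC coefficients a = [a_1..a_P] to cepstrum [c_1..c_Q].

    Uses c_m = a_m + sum_{k=1}^{m-1} (k/m) c_k a_{m-k}   for m <= P
         c_m =       sum_{k=m-P}^{m-1} (k/m) c_k a_{m-k}  for m >  P
    """
    p = len(a)
    c = [0.0] * (q + 1)                  # c[0] is unused padding
    for m in range(1, q + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]
```

A quick sanity check is the single-pole case `a = [0.5]`, for which the cepstrum is known in closed form as $c_m = 0.5^m / m$.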
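One embodiment of transform element 54 is illustrated in Figure 4 as transform subsystem 70. In Figure 4, the LPC coefficients from acoustic processor 52 are provided to LPC-to-LSP transformation element 72. Within LPC-to-LSP element 72, the LSP coefficients can be determined as follows. For P-th order LPC coefficients, the corresponding LSP frequencies are obtained as the $P$ roots lying between 0 and $\pi$ of the following equations (written here for $P = 10$):
$$P(\omega) = \cos 5\omega + p_1 \cos 4\omega + \cdots + p_5/2 \qquad (7)$$
$$Q(\omega) = \cos 5\omega + q_1 \cos 4\omega + \cdots + q_5/2 \qquad (8)$$
where the $p_i$ and $q_i$ can be computed recursively as:
$$p_0 = q_0 = 1 \qquad (9)$$
$$p_i = -a_i - a_{P-i} - p_{i-1}, \quad 1 \le i \le P/2 \qquad (10)$$
$$q_i = -a_i + a_{P-i} - q_{i-1}, \quad 1 \le i \le P/2 \qquad (11)$$
The LSP frequencies are provided to DPCM element 74, where they are encoded for transmission over communication channel 76.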
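At inverse transformation element 78, the signal received from the channel is passed through inverse DPCM element 80 and LSP-to-LPC element 82 to recover the LSP frequencies of the speech signal. LSP-to-LPC element 82 performs the inverse of the process of LPC-to-LSP element 72, converting the LSP frequencies back into LPC coefficients for use in deriving the cepstrum coefficients. LSP-to-LPC element 82 performs the following conversion: [the conversion equations are missing from this text]. The LPC coefficients are then provided to LPC-to-cepstrum element 84, which provides the cepstrum coefficients to word decoder 64 in accordance with equations 5 and 6.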
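The DPCM stage on either side of the channel can be sketched as a closed-loop differential coder applied to the trajectory of one LSP frequency across frames. This is an illustrative sketch only: the first-order predictor (previous reconstructed value), the uniform quantizer, the step size, and the sample values are assumptions, since the text does not specify the DPCM design.

```python
def dpcm_encode(values, step):
    """Quantize the difference between each value and the running prediction."""
    codes, prediction = [], 0.0
    for v in values:
        code = round((v - prediction) / step)   # integer sent over the channel
        codes.append(code)
        prediction += code * step               # track what the decoder will see
    return codes

def dpcm_decode(codes, step):
    """Rebuild the values by accumulating the dequantized differences."""
    out, prediction = [], 0.0
    for code in codes:
        prediction += code * step
        out.append(prediction)
    return out

# Hypothetical trajectory of one LSP frequency (radians) over four frames.
trajectory = [0.30, 0.32, 0.31, 0.35]
codes = dpcm_encode(trajectory, step=0.01)
restored = dpcm_decode(codes, step=0.01)
```

Because the encoder predicts from its own reconstruction, the quantization error does not accumulate, and after the first frame the transmitted codes are small integers. That is the sense in which slowly varying LSP frequencies are cheaper to encode than rapidly varying cepstrum coefficients.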
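Since the word decoder relies solely on an acoustic feature sequence, which can be prone to noise if transmitted directly through the communication channel, the potential acoustic feature sequence is derived and transformed in subsystem 51, as shown in Figure 3, into an alternative representation that facilitates transmission. The acoustic feature sequence used in the word decoder is obtained afterwards through inverse transformation. Hence, in a distributed implementation of VR, the feature sequence sent through the air (channel) can differ from the one actually used in the word decoder. It is anticipated that the output of transform element 70 can be further encoded by any error protection scheme known in the art.
Figure 5 shows an improved embodiment of the present invention. In wireless communication applications, users may not wish to occupy the communication channel for a small number of simple but commonly used voiced commands, in part because channel access is expensive. This can be achieved by further distributing the word decoding function between the handset and the base station, such that voice recognition with a relatively small vocabulary is performed locally at handset 100 while a second voice recognition system with a larger vocabulary resides remotely at base station 110. The two share the same acoustic processor in the handset. The vocabulary table in the local word decoder contains the most widely used words or word strings. The vocabulary table in the remote word decoder, on the other hand, contains regular words or word strings. Based on this infrastructure, as illustrated in Figure 5, the average time that the channel is busy can be shortened and the average recognition accuracy increased.
Moreover, two groups of voiced commands are available: special voiced commands, which correspond to the commands recognizable by the local VR, and regular voiced commands, which correspond to those the local VR cannot recognize. Whenever a special voiced command is issued, the real acoustic features are extracted for the local word decoder, and the voice recognition function is performed locally without accessing the communication channel. When a regular voiced command is issued, transformed acoustic feature vectors are transmitted through the channel, and word decoding is performed remotely at the base station.
Since the acoustic features need be neither transformed nor coded for any special voiced command, and since the vocabulary of the local VR is small, the required computation is much less than that of the remote VR (the computation associated with searching for the correct word string over the possible vocabulary is proportional to the vocabulary size). Additionally, because the acoustic features are fed directly to the local VR without potential corruption in the channel, the local voice recognizer may be modeled with a simplified HMM (for example, with fewer states or fewer mixture components for the state output probabilities) compared with the remote VR. This makes a local VR implementation possible, albeit with a limited vocabulary, in the handset (subsystem 1), where the computational load is limited. It is envisioned that the distributed VR structure can also be used in target applications other than wireless communication systems.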
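Referring to Figure 5, speech signals are provided to acoustic processor 102, which extracts features, for example LPC-based feature parameters, from the speech signal. These features are then provided to local word decoder 106, which searches its small vocabulary to recognize the input speech signal. If it fails to decode the input word string and determines that the remote VR should decode it, it signals transform element 104, which prepares the features for transmission. The transformed features are then transmitted over communication channel 108 to remote word decoder 110. The transformed features are received at inverse transform element 112, which performs the inverse operation of transform element 104 and provides the acoustic features to remote word decoder element 114, which in response outputs the estimated remote word string.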
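The control flow of Figure 5 (try the local decoder first, use the channel only on failure) can be sketched as below. Everything here is a hypothetical stand-in: the feature tuples, the dictionary vocabularies, and the placeholder transform bear no relation to real acoustic features or to the LPC-to-LSP and DPCM stages, and only the routing logic is the point.

```python
def recognize(features, local_decode, transform, remote_decode):
    """Try the small local vocabulary first; fall back to the remote decoder."""
    words = local_decode(features)
    if words is not None:
        return words, "local"            # channel is never accessed
    return remote_decode(transform(features)), "remote"

# Toy stand-ins (hypothetical): features are tuples, vocabularies are dicts.
LOCAL_VOCAB = {(1, 0): "call home"}
REMOTE_VOCAB = {(1, 0): "call home", (0, 1): "weather tomorrow"}

def local_decode(features):
    return LOCAL_VOCAB.get(features)     # None means "could not decode"

def transform(features):
    return tuple(reversed(features))     # placeholder for the transform element

def remote_decode(sent):
    features = tuple(reversed(sent))     # inverse transform at the base station
    return REMOTE_VOCAB.get(features)
```

Calling `recognize((1, 0), local_decode, transform, remote_decode)` resolves locally, while `(0, 1)` is not in the local vocabulary and is routed through the transform to the remote decoder, mirroring the split between special and regular voiced commands.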
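The foregoing description of the preferred embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without the use of the inventive faculty. Thus, the present invention is not intended to be limited to the embodiments described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.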
Claims (26)
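1. A remote station for use in a mobile communication system, characterized by comprising:
feature extraction means, located at the remote station, for receiving a frame of speech samples and extracting a set of parameters for speech recognition;
first word decoding means for receiving the set of parameters and extracting the meaning of the speech from the parameters in accordance with a small vocabulary; and
transmission means for wirelessly transmitting a set of parameters that the first word decoding means cannot decode to a receiving station, the receiving station having second word decoding means for extracting the meaning of the speech from the transmitted parameters in accordance with a large vocabulary.
2. The remote station of claim 1, characterized in that it further comprises a microphone for receiving an acoustic signal and providing the acoustic signal to the feature extraction means.
3. The remote station of claim 1, characterized in that it further comprises transform means, located between the feature extraction means and the transmission means, for converting the set of parameters into an alternative representation thereof in accordance with a predetermined transformation format.
4. The remote station of claim 1, characterized in that the set of parameters comprises linear prediction coefficients.
5. The remote station of claim 1, characterized in that the set of parameters comprises line spectral pair values.
6. The remote station of claim 3, characterized in that the set of parameters comprises linear prediction coefficients and the predetermined transformation format is a conversion from linear prediction coefficients to line spectral pairs.
7. The remote station of claim 1, characterized in that it further comprises receiving means for receiving a response signal in accordance with a speech recognition operation performed on the speech frame.
8. The remote station of claim 7, characterized in that it further comprises control means for receiving the response signal and providing a control signal in accordance with the response signal.
9. The remote station of claim 1, characterized in that it further comprises a transform element, located between the feature extraction means and the transmission means, having an input coupled to an output of the feature extraction means and an output coupled to an input of the transmission means.
10. The remote station of claim 7, characterized in that it further comprises a control element having an input coupled to an output of the receiving means and an output that provides a control signal in accordance with the response signal.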
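11. A central communications station for use in a mobile communication system, characterized by comprising:
a word decoder, located at the central communications station, for receiving a set of speech parameters from a remote station, wherein the remote station is spatially separated from the central communications station and communicates with the central communications station by wireless communication means, the word decoder performing a speech recognition operation on the set of speech parameters using a regular vocabulary associated with the word decoder of the central communications station, wherein the speech parameters cannot be recognized using a local vocabulary associated with a word decoder in the remote station; and
a signal generator for generating a response signal in accordance with the result of the speech recognition operation.
12. The central communications station of claim 11, characterized in that it further comprises a receiver having an input for receiving a remote station speech parameter signal, the receiver providing the remote station speech parameters to the word decoder.
13. The central communications station of claim 11, characterized in that it further comprises control means having an input coupled to an output of the word decoder and an output that provides a control signal.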
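14. A distributed voice recognition system, characterized by comprising:
a local word decoder, located at a subscriber station, for receiving extracted acoustic features of a first frame of speech samples and decoding the acoustic features in accordance with a small vocabulary; and
a remote word decoder, located at a central processing station, wherein the central processing station is spatially separated from the subscriber station, the remote word decoder being for receiving extracted acoustic features of a second frame and for decoding, in accordance with a regular vocabulary larger than the small vocabulary, the acoustic features of the second frame that the local word decoder cannot decode.
15. The system of claim 14, characterized in that it further comprises a preprocessor for extracting the acoustic features of the frame of speech samples in accordance with a predetermined feature extraction format and providing the acoustic features.
16. The system of claim 15, characterized in that the acoustic features are linear predictive coding based parameters.
17. The system of claim 15, characterized in that the acoustic features are spectral coefficients.
18. The system of claim 15, characterized in that the preprocessor comprises a vocoder.
19. The system of claim 18, characterized in that the vocoder is a linear predictive coding based vocoder.
20. The system of claim 14, characterized in that it further comprises:
a transform element, located at the subscriber station, for receiving the acoustic features and converting the acoustic features into transformed features in accordance with a predetermined transformation format, the transformed features being transmitted over a communication channel to the central processing station; and
an inverse transform element, located at the central processing station, for receiving the transformed features, converting the transformed features into estimated acoustic features in accordance with a predetermined inverse transformation format, and providing the estimated acoustic features to the remote word decoder.
21. The system of claim 20, characterized in that the acoustic features are linear predictive coding based parameters, the predetermined transformation format converts the linear predictive coding based parameters into line spectral pair frequencies, and the inverse transformation format converts the line spectral pair frequencies into linear predictive coding based data.
22. The system of claim 14, characterized in that the local word decoder performs acoustic pattern matching based on a hidden Markov model.
23. The system of claim 14, characterized in that the remote word decoder performs acoustic pattern matching based on a hidden Markov model.
24. The system of claim 14, characterized in that the local word decoder performs acoustic pattern matching based on dynamic time warping.
25. The system of claim 14, characterized in that the remote word decoder performs acoustic pattern matching based on dynamic time warping.
26. The system of claim 14, characterized in that the subscriber station communicates with the central processing station by wireless communication means.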
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US17324793A | 1993-12-22 | 1993-12-22 | |
US08/173,247 | 1993-12-22 |
Publications (2)
Publication Number | Publication Date |
---|---|
CN1138386A (zh) | 1996-12-18 |
CN1119794C (zh) | 2003-08-27 |
Family
ID=22631169
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN94194566A Expired - Lifetime CN1119794C (zh) | 1993-12-22 | 1994-12-20 | Distributed voice recognition system |
Country Status (17)
Country | Link |
---|---|
US (1) | US5956683A (zh) |
EP (3) | EP1381029A1 (zh) |
JP (1) | JP3661874B2 (zh) |
KR (1) | KR100316077B1 (zh) |
CN (1) | CN1119794C (zh) |
AT (1) | ATE261172T1 (zh) |
AU (1) | AU692820B2 (zh) |
BR (1) | BR9408413A (zh) |
CA (1) | CA2179759C (zh) |
DE (1) | DE69433593T2 (zh) |
FI (2) | FI118909B (zh) |
HK (1) | HK1011109A1 (zh) |
IL (1) | IL112057A0 (zh) |
MY (1) | MY116482A (zh) |
TW (1) | TW318239B (zh) |
WO (1) | WO1995017746A1 (zh) |
ZA (1) | ZA948426B (zh) |
Families Citing this family (283)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6594628B1 (en) * | 1995-09-21 | 2003-07-15 | Qualcomm, Incorporated | Distributed voice recognition system |
US9063719B1 (en) * | 1995-10-02 | 2015-06-23 | People Innovate For Economy Foundation, Inc. | Table format programming |
US5774858A (en) * | 1995-10-23 | 1998-06-30 | Taubkin; Vladimir L. | Speech analysis method of protecting a vehicle from unauthorized accessing and controlling |
US8209184B1 (en) | 1997-04-14 | 2012-06-26 | At&T Intellectual Property Ii, L.P. | System and method of providing generated speech via a network |
FI972723A0 (fi) * | 1997-06-24 | 1997-06-24 | Nokia Mobile Phones Ltd | Mobila kommunikationsanordningar |
CA2219008C (en) * | 1997-10-21 | 2002-11-19 | Bell Canada | A method and apparatus for improving the utility of speech recognition |
JP3055514B2 (ja) * | 1997-12-05 | 2000-06-26 | 日本電気株式会社 | 電話回線用音声認識装置 |
US6208959B1 (en) | 1997-12-15 | 2001-03-27 | Telefonaktibolaget Lm Ericsson (Publ) | Mapping of digital data symbols onto one or more formant frequencies for transmission over a coded voice channel |
US6003004A (en) | 1998-01-08 | 1999-12-14 | Advanced Recognition Technologies, Inc. | Speech recognition method and system using compressed speech data |
US6614885B2 (en) * | 1998-08-14 | 2003-09-02 | Intervoice Limited Partnership | System and method for operating a highly distributed interactive voice response system |
US6260010B1 (en) * | 1998-08-24 | 2001-07-10 | Conexant Systems, Inc. | Speech encoder using gain normalization that combines open and closed loop gains |
US7003463B1 (en) * | 1998-10-02 | 2006-02-21 | International Business Machines Corporation | System and method for providing network coordinated conversational services |
GB2342828A (en) * | 1998-10-13 | 2000-04-19 | Nokia Mobile Phones Ltd | Speech parameter compression; distributed speech recognition |
US6389389B1 (en) | 1998-10-13 | 2002-05-14 | Motorola, Inc. | Speech recognition using unequally-weighted subvector error measures for determining a codebook vector index to represent plural speech parameters |
US6185535B1 (en) * | 1998-10-16 | 2001-02-06 | Telefonaktiebolaget Lm Ericsson (Publ) | Voice control of a user interface to service applications |
GB2343777B (en) * | 1998-11-13 | 2003-07-02 | Motorola Ltd | Mitigating errors in a distributed speech recognition process |
GB2343778B (en) * | 1998-11-13 | 2003-03-05 | Motorola Ltd | Processing received data in a distributed speech recognition process |
US6490621B1 (en) * | 1998-11-20 | 2002-12-03 | Orchestream Americas Corporation | Calculation of resource availability using degradation factors |
US6336090B1 (en) | 1998-11-30 | 2002-01-01 | Lucent Technologies Inc. | Automatic speech/speaker recognition over digital wireless channels |
KR100667522B1 (ko) * | 1998-12-18 | 2007-05-17 | 주식회사 현대오토넷 | Lpc 계수를 이용한 이동통신 단말기 음성인식 방법 |
US6411926B1 (en) * | 1999-02-08 | 2002-06-25 | Qualcomm Incorporated | Distributed voice recognition system |
ATE341810T1 (de) * | 1999-02-19 | 2006-10-15 | Custom Speech Usa Inc | Automatisiertes übertragungssystem und -verfahren mit zwei instanzen zur sprachumwandlung und rechnergestützter korrektur |
DE19910236A1 (de) * | 1999-03-09 | 2000-09-21 | Philips Corp Intellectual Pty | Verfahren zur Spracherkennung |
ATE281689T1 (de) * | 1999-03-26 | 2004-11-15 | Scansoft Inc | Client-server spracherkennungssystem |
EP1088299A2 (en) * | 1999-03-26 | 2001-04-04 | Scansoft, Inc. | Client-server speech recognition |
US20050091057A1 (en) * | 1999-04-12 | 2005-04-28 | General Magic, Inc. | Voice application development methodology |
US20050261907A1 (en) * | 1999-04-12 | 2005-11-24 | Ben Franklin Patent Holding Llc | Voice integration platform |
US6408272B1 (en) * | 1999-04-12 | 2002-06-18 | General Magic, Inc. | Distributed voice user interface |
US6290646B1 (en) | 1999-04-16 | 2001-09-18 | Cardiocom | Apparatus and method for monitoring and communicating wellness parameters of ambulatory patients |
US8419650B2 (en) | 1999-04-16 | 2013-04-16 | Cariocom, LLC | Downloadable datasets for a patient monitoring system |
US6292781B1 (en) * | 1999-05-28 | 2001-09-18 | Motorola | Method and apparatus for facilitating distributed speech processing in a communication system |
US6363349B1 (en) * | 1999-05-28 | 2002-03-26 | Motorola, Inc. | Method and apparatus for performing distributed speech processing in a communication system |
DE19930407A1 (de) * | 1999-06-09 | 2000-12-14 | Philips Corp Intellectual Pty | Verfahren zur sprachbasierten Navigation in einem Kommunikationsnetzwerk und zur Implementierung einer Spracheingabemöglichkeit in private Informationseinheiten |
KR20010019786A (ko) * | 1999-08-30 | 2001-03-15 | 윤종용 | 이동통신 시스템에서 음성인식 및 문자표시 장치 및 방법 |
JP3969908B2 (ja) | 1999-09-14 | 2007-09-05 | キヤノン株式会社 | 音声入力端末器、音声認識装置、音声通信システム及び音声通信方法 |
US7194752B1 (en) | 1999-10-19 | 2007-03-20 | Iceberg Industries, Llc | Method and apparatus for automatically recognizing input audio and/or video streams |
US7689416B1 (en) * | 1999-09-29 | 2010-03-30 | Poirier Darrell A | System for transferring personalize matter from one computer to another |
US6963759B1 (en) * | 1999-10-05 | 2005-11-08 | Fastmobile, Inc. | Speech recognition technique based on local interrupt detection |
US6912496B1 (en) * | 1999-10-26 | 2005-06-28 | Silicon Automation Systems | Preprocessing modules for quality enhancement of MBE coders and decoders for signals having transmission path characteristics |
FI19992350A (fi) | 1999-10-29 | 2001-04-30 | Nokia Mobile Phones Ltd | Parannettu puheentunnistus |
EP1098297A1 (en) * | 1999-11-02 | 2001-05-09 | BRITISH TELECOMMUNICATIONS public limited company | Speech recognition |
US6725190B1 (en) * | 1999-11-02 | 2004-04-20 | International Business Machines Corporation | Method and system for speech reconstruction from speech recognition features, pitch and voicing with resampled basis functions providing reconstruction of the spectral envelope |
US6665640B1 (en) | 1999-11-12 | 2003-12-16 | Phoenix Solutions, Inc. | Interactive speech based learning/training system formulating search queries based on natural language parsing of recognized user queries |
US7725307B2 (en) | 1999-11-12 | 2010-05-25 | Phoenix Solutions, Inc. | Query engine for processing voice based queries including semantic decoding |
US6615172B1 (en) | 1999-11-12 | 2003-09-02 | Phoenix Solutions, Inc. | Intelligent query engine for processing voice based queries |
US6633846B1 (en) * | 1999-11-12 | 2003-10-14 | Phoenix Solutions, Inc. | Distributed realtime speech recognition system |
US7050977B1 (en) | 1999-11-12 | 2006-05-23 | Phoenix Solutions, Inc. | Speech-enabled server for internet website and method |
US7392185B2 (en) | 1999-11-12 | 2008-06-24 | Phoenix Solutions, Inc. | Speech based learning/training system using semantic decoding |
US9076448B2 (en) | 1999-11-12 | 2015-07-07 | Nuance Communications, Inc. | Distributed real time speech recognition system |
AU3083201A (en) * | 1999-11-22 | 2001-06-04 | Microsoft Corporation | Distributed speech recognition for mobile communication devices |
US6675027B1 (en) * | 1999-11-22 | 2004-01-06 | Microsoft Corp | Personal mobile computing device having antenna microphone for improved speech recognition |
US6532446B1 (en) | 1999-11-24 | 2003-03-11 | Openwave Systems Inc. | Server based speech recognition user interface for wireless devices |
US6424945B1 (en) * | 1999-12-15 | 2002-07-23 | Nokia Corporation | Voice packet data network browsing for mobile terminals system and method using a dual-mode wireless connection |
DE10003529A1 (de) * | 2000-01-27 | 2001-08-16 | Siemens Ag | Verfahren und Vorrichtung zum Erstellen einer Textdatei mittels Spracherkennung |
US7505921B1 (en) | 2000-03-03 | 2009-03-17 | Finali Corporation | System and method for optimizing a product configuration |
US8645137B2 (en) | 2000-03-16 | 2014-02-04 | Apple Inc. | Fast, language-independent method for user authentication by voice |
CN1315721A (zh) * | 2000-03-23 | 2001-10-03 | 韦尔博泰克有限公司 | 客户服务器语音信息传送***与方法 |
US6760699B1 (en) * | 2000-04-24 | 2004-07-06 | Lucent Technologies Inc. | Soft feature decoding in a distributed automatic speech recognition system for use over wireless channels |
US6502070B1 (en) * | 2000-04-28 | 2002-12-31 | Nortel Networks Limited | Method and apparatus for normalizing channel specific speech feature elements |
US6785653B1 (en) * | 2000-05-01 | 2004-08-31 | Nuance Communications | Distributed voice web architecture and associated components and methods |
JP3728177B2 (ja) | 2000-05-24 | 2005-12-21 | キヤノン株式会社 | 音声処理システム、装置、方法及び記憶媒体 |
EP1290678B1 (en) * | 2000-06-08 | 2007-03-28 | Nokia Corporation | Method and system for adaptive distributed speech recognition |
EP1304682A1 (en) * | 2000-07-05 | 2003-04-23 | Alcatel | Distributed speech recognition system |
CN1404603A (zh) * | 2000-09-07 | 2003-03-19 | 皇家菲利浦电子有限公司 | 语音控制及加载的用户控制信息 |
US6823306B2 (en) * | 2000-11-30 | 2004-11-23 | Telesector Resources Group, Inc. | Methods and apparatus for generating, updating and distributing speech recognition models |
US6915262B2 (en) | 2000-11-30 | 2005-07-05 | Telesector Resources Group, Inc. | Methods and apparatus for performing speech recognition and using speech recognition results |
US8135589B1 (en) | 2000-11-30 | 2012-03-13 | Google Inc. | Performing speech recognition over a network and using speech recognition results |
EP1215659A1 (en) * | 2000-12-14 | 2002-06-19 | Nokia Corporation | Locally distibuted speech recognition system and method of its operation |
US20020091515A1 (en) * | 2001-01-05 | 2002-07-11 | Harinath Garudadri | System and method for voice recognition in a distributed voice recognition system |
US20030004720A1 (en) * | 2001-01-30 | 2003-01-02 | Harinath Garudadri | System and method for computing and transmitting parameters in a distributed voice recognition system |
US7024359B2 (en) * | 2001-01-31 | 2006-04-04 | Qualcomm Incorporated | Distributed voice recognition system using acoustic feature vector modification |
US6633839B2 (en) * | 2001-02-02 | 2003-10-14 | Motorola, Inc. | Method and apparatus for speech reconstruction in a distributed speech recognition system |
FR2820872B1 (fr) * | 2001-02-13 | 2003-05-16 | Thomson Multimedia Sa | Procede, module, dispositif et serveur de reconnaissance vocale |
US6885735B2 (en) * | 2001-03-29 | 2005-04-26 | Intellisist, Llc | System and method for transmitting voice input from a remote location over a wireless data channel |
US6487494B2 (en) * | 2001-03-29 | 2002-11-26 | Wingcast, Llc | System and method for reducing the amount of repetitive data sent by a server to a client for vehicle navigation |
US7406421B2 (en) * | 2001-10-26 | 2008-07-29 | Intellisist Inc. | Systems and methods for reviewing informational content in a vehicle |
US20020143611A1 (en) * | 2001-03-29 | 2002-10-03 | Gilad Odinak | Vehicle parking validation system and method |
USRE46109E1 (en) | 2001-03-29 | 2016-08-16 | Lg Electronics Inc. | Vehicle navigation system and method |
US8175886B2 (en) | 2001-03-29 | 2012-05-08 | Intellisist, Inc. | Determination of signal-processing approach based on signal destination characteristics |
US7236777B2 (en) | 2002-05-16 | 2007-06-26 | Intellisist, Inc. | System and method for dynamically configuring wireless network geographic coverage or service levels |
US7941313B2 (en) * | 2001-05-17 | 2011-05-10 | Qualcomm Incorporated | System and method for transmitting speech activity information ahead of speech features in a distributed voice recognition system |
US7203643B2 (en) * | 2001-06-14 | 2007-04-10 | Qualcomm Incorporated | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
US7366673B2 (en) * | 2001-06-15 | 2008-04-29 | International Business Machines Corporation | Selective enablement of speech recognition grammars |
US20020198716A1 (en) * | 2001-06-25 | 2002-12-26 | Kurt Zimmerman | System and method of improved communication |
KR100777551B1 (ko) * | 2001-06-29 | 2007-11-16 | 주식회사 케이티 | 채널용량에 따른 가변 구성이 가능한 음성인식 시스템 및그 방법 |
DE10228408B4 (de) | 2001-07-10 | 2021-09-30 | Sew-Eurodrive Gmbh & Co Kg | Bussystem, umfassend mindestens einen Bus und Busteilnehmer und Verfahren zur Sprachsteuerung |
ATE310302T1 (de) * | 2001-09-28 | 2005-12-15 | Cit Alcatel | Kommunikationsvorrichtung und verfahren zum senden und empfangen von sprachsignalen unter kombination eines spracherkennungsmodules mit einer kodiereinheit |
US7139704B2 (en) * | 2001-11-30 | 2006-11-21 | Intel Corporation | Method and apparatus to perform speech recognition over a voice channel |
GB2383459B (en) * | 2001-12-20 | 2005-05-18 | Hewlett Packard Co | Speech recognition system and method |
US7013275B2 (en) * | 2001-12-28 | 2006-03-14 | Sri International | Method and apparatus for providing a dynamic speech-driven control and remote service access system |
US6898567B2 (en) * | 2001-12-29 | 2005-05-24 | Motorola, Inc. | Method and apparatus for multi-level distributed speech recognition |
US8249880B2 (en) * | 2002-02-14 | 2012-08-21 | Intellisist, Inc. | Real-time display of system instructions |
US20030154080A1 (en) * | 2002-02-14 | 2003-08-14 | Godsey Sandra L. | Method and apparatus for modification of audio input to a data processing system |
US7099825B1 (en) | 2002-03-15 | 2006-08-29 | Sprint Communications Company L.P. | User mobility in a voice recognition environment |
US7089178B2 (en) * | 2002-04-30 | 2006-08-08 | Qualcomm Inc. | Multistream network feature processing for a distributed speech recognition system |
US20030233233A1 (en) * | 2002-06-13 | 2003-12-18 | Industrial Technology Research Institute | Speech recognition involving a neural network |
US6834265B2 (en) | 2002-12-13 | 2004-12-21 | Motorola, Inc. | Method and apparatus for selective speech recognition |
US7197331B2 (en) * | 2002-12-30 | 2007-03-27 | Motorola, Inc. | Method and apparatus for selective distributed speech recognition |
US7076428B2 (en) * | 2002-12-30 | 2006-07-11 | Motorola, Inc. | Method and apparatus for selective distributed speech recognition |
KR100956941B1 (ko) * | 2003-06-27 | 2010-05-11 | 주식회사 케이티 | 네트워크 상황에 따른 선택적 음성인식 장치 및 그 방법 |
WO2005024780A2 (en) * | 2003-09-05 | 2005-03-17 | Grody Stephen D | Methods and apparatus for providing services using speech recognition |
US7283850B2 (en) | 2004-10-12 | 2007-10-16 | Microsoft Corporation | Method and apparatus for multi-sensory speech enhancement on a mobile device |
US8024194B2 (en) * | 2004-12-08 | 2011-09-20 | Nuance Communications, Inc. | Dynamic switching between local and remote speech rendering |
US7680656B2 (en) | 2005-06-28 | 2010-03-16 | Microsoft Corporation | Multi-sensory speech enhancement using a speech-state model |
US7406303B2 (en) | 2005-07-05 | 2008-07-29 | Microsoft Corporation | Multi-sensory speech enhancement using synthesized sensor signal |
US8677377B2 (en) | 2005-09-08 | 2014-03-18 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
US7970613B2 (en) | 2005-11-12 | 2011-06-28 | Sony Computer Entertainment Inc. | Method and system for Gaussian probability data bit reduction and computation |
US7930178B2 (en) | 2005-12-23 | 2011-04-19 | Microsoft Corporation | Speech modeling and enhancement based on magnitude-normalized spectra |
US20070162282A1 (en) * | 2006-01-09 | 2007-07-12 | Gilad Odinak | System and method for performing distributed speech recognition |
DE102006002604A1 (de) * | 2006-01-13 | 2007-07-19 | Deutsche Telekom Ag | Method and system for carrying out data telecommunication |
US8010358B2 (en) * | 2006-02-21 | 2011-08-30 | Sony Computer Entertainment Inc. | Voice recognition with parallel gender and age normalization |
US7778831B2 (en) * | 2006-02-21 | 2010-08-17 | Sony Computer Entertainment Inc. | Voice recognition with dynamic filter bank adjustment based on speaker categorization determined from runtime pitch |
US7599861B2 (en) | 2006-03-02 | 2009-10-06 | Convergys Customer Management Group, Inc. | System and method for closed loop decisionmaking in an automated care system |
US8644396B2 (en) | 2006-04-18 | 2014-02-04 | Qualcomm Incorporated | Waveform encoding for wireless applications |
TWI405444B (zh) * | 2006-04-26 | 2013-08-11 | Qualcomm Inc | Method, electronic device, computer program product, headset, watch, medical device, and sensor for wireless communication |
US8406794B2 (en) | 2006-04-26 | 2013-03-26 | Qualcomm Incorporated | Methods and apparatuses of initiating communication in wireless networks |
US8289159B2 (en) | 2006-04-26 | 2012-10-16 | Qualcomm Incorporated | Wireless localization apparatus and method |
US7809663B1 (en) | 2006-05-22 | 2010-10-05 | Convergys Cmg Utah, Inc. | System and method for supporting the utilization of machine language |
US8379830B1 (en) | 2006-05-22 | 2013-02-19 | Convergys Customer Management Delaware Llc | System and method for automated customer service with contingent live interaction |
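KR100794140B1 (ko) * | 2006-06-30 | 2008-01-10 | 주식회사 케이티 | Apparatus and method for extracting noise-robust speech feature vectors in a distributed speech recognition terminal by sharing the preprocessing of the speech coder |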
US9318108B2 (en) | 2010-01-18 | 2016-04-19 | Apple Inc. | Intelligent automated assistant |
US7904298B2 (en) * | 2006-11-17 | 2011-03-08 | Rao Ashwin P | Predictive speech-to-text input |
WO2008064137A2 (en) * | 2006-11-17 | 2008-05-29 | Rao Ashwin P | Predictive speech-to-text input |
JP4658022B2 (ja) * | 2006-11-20 | 2011-03-23 | 株式会社リコー | Speech recognition system |
US9830912B2 (en) | 2006-11-30 | 2017-11-28 | Ashwin P Rao | Speak and touch auto correction interface |
US20080153465A1 (en) * | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | Voice search-enabled mobile device |
US20080154870A1 (en) * | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | Collection and use of side information in voice-mediated mobile search |
US20080154612A1 (en) * | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | Local storage and use of search results for voice-enabled mobile communications devices |
US20080154608A1 (en) * | 2006-12-26 | 2008-06-26 | Voice Signal Technologies, Inc. | On a mobile device tracking use of search results delivered to the mobile device |
US8204746B2 (en) * | 2007-03-29 | 2012-06-19 | Intellisist, Inc. | System and method for providing an automated call center inline architecture |
US8977255B2 (en) | 2007-04-03 | 2015-03-10 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
JP5139747B2 (ja) * | 2007-08-17 | 2013-02-06 | 株式会社ユニバーサルエンターテインメント | Telephone terminal device and speech recognition system using the same |
US9330720B2 (en) | 2008-01-03 | 2016-05-03 | Apple Inc. | Methods and apparatus for altering audio output signals |
US8996376B2 (en) | 2008-04-05 | 2015-03-31 | Apple Inc. | Intelligent text-to-speech conversion |
US10496753B2 (en) | 2010-01-18 | 2019-12-03 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US20100030549A1 (en) | 2008-07-31 | 2010-02-04 | Lee Michael M | Mobile device having human language translation capability with positional feedback |
US9020816B2 (en) * | 2008-08-14 | 2015-04-28 | 21Ct, Inc. | Hidden markov model for speech processing with training method |
US7933777B2 (en) * | 2008-08-29 | 2011-04-26 | Multimodal Technologies, Inc. | Hybrid speech recognition |
US9922640B2 (en) | 2008-10-17 | 2018-03-20 | Ashwin P Rao | System and method for multimodal utterance detection |
US9390167B2 (en) | 2010-07-29 | 2016-07-12 | Soundhound, Inc. | System and methods for continuous audio matching |
WO2010067118A1 (en) * | 2008-12-11 | 2010-06-17 | Novauris Technologies Limited | Speech recognition involving a mobile device |
US8788256B2 (en) | 2009-02-17 | 2014-07-22 | Sony Computer Entertainment Inc. | Multiple language voice recognition |
US8442829B2 (en) * | 2009-02-17 | 2013-05-14 | Sony Computer Entertainment Inc. | Automatic computation streaming partition for voice recognition on multiple processors with limited memory |
US8442833B2 (en) * | 2009-02-17 | 2013-05-14 | Sony Computer Entertainment Inc. | Speech processing with source location estimation using signals from two or more microphones |
US10241752B2 (en) | 2011-09-30 | 2019-03-26 | Apple Inc. | Interface for a virtual digital assistant |
US20120309363A1 (en) | 2011-06-03 | 2012-12-06 | Apple Inc. | Triggering notifications associated with tasks items that represent tasks to perform |
US9858925B2 (en) | 2009-06-05 | 2018-01-02 | Apple Inc. | Using context information to facilitate processing of commands in a virtual assistant |
US10241644B2 (en) | 2011-06-03 | 2019-03-26 | Apple Inc. | Actionable reminder entries |
US9431006B2 (en) | 2009-07-02 | 2016-08-30 | Apple Inc. | Methods and apparatuses for automatic speech recognition |
US10705794B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Automatically adapting user interfaces for hands-free interaction |
US10679605B2 (en) | 2010-01-18 | 2020-06-09 | Apple Inc. | Hands-free list-reading by intelligent automated assistant |
US10553209B2 (en) | 2010-01-18 | 2020-02-04 | Apple Inc. | Systems and methods for hands-free notification summaries |
US10276170B2 (en) | 2010-01-18 | 2019-04-30 | Apple Inc. | Intelligent automated assistant |
DE202011111062U1 (de) | 2010-01-25 | 2019-02-19 | Newvaluexchange Ltd. | Apparatus and system for a digital conversation management platform |
US8682667B2 (en) | 2010-02-25 | 2014-03-25 | Apple Inc. | User profiling for selecting user specific voice input processing information |
TWI420510B (zh) * | 2010-05-28 | 2013-12-21 | Ind Tech Res Inst | Speech recognition system and method with adjustable memory usage |
US9047371B2 (en) | 2010-07-29 | 2015-06-02 | Soundhound, Inc. | System and method for matching a query against a broadcast stream |
US9484018B2 (en) * | 2010-11-23 | 2016-11-01 | At&T Intellectual Property I, L.P. | System and method for building and evaluating automatic speech recognition via an application programmer interface |
US8898065B2 (en) | 2011-01-07 | 2014-11-25 | Nuance Communications, Inc. | Configurable speech recognition system using multiple recognizers |
WO2012116110A1 (en) | 2011-02-22 | 2012-08-30 | Speak With Me, Inc. | Hybridized client-server speech recognition |
US9262612B2 (en) | 2011-03-21 | 2016-02-16 | Apple Inc. | Device access using voice authentication |
US9035163B1 (en) | 2011-05-10 | 2015-05-19 | Soundhound, Inc. | System and method for targeting content based on identified audio and multimedia |
US10057736B2 (en) | 2011-06-03 | 2018-08-21 | Apple Inc. | Active transport based notifications |
US8946812B2 (en) | 2011-07-21 | 2015-02-03 | Semiconductor Energy Laboratory Co., Ltd. | Semiconductor device and manufacturing method thereof |
US8994660B2 (en) | 2011-08-29 | 2015-03-31 | Apple Inc. | Text correction processing |
US8972263B2 (en) | 2011-11-18 | 2015-03-03 | Soundhound, Inc. | System and method for performing dual mode speech recognition |
US8918804B2 (en) | 2012-02-07 | 2014-12-23 | Turner Broadcasting System, Inc. | Method and system for a reward program based on automatic content recognition |
US10134385B2 (en) | 2012-03-02 | 2018-11-20 | Apple Inc. | Systems and methods for name pronunciation |
US9483461B2 (en) | 2012-03-06 | 2016-11-01 | Apple Inc. | Handling speech synthesis of content for multiple languages |
US9153235B2 (en) | 2012-04-09 | 2015-10-06 | Sony Computer Entertainment Inc. | Text dependent speaker recognition with long-term feature based on functional data analysis |
US9685160B2 (en) * | 2012-04-16 | 2017-06-20 | Htc Corporation | Method for offering suggestion during conversation, electronic device using the same, and non-transitory storage medium |
US9280610B2 (en) | 2012-05-14 | 2016-03-08 | Apple Inc. | Crowd sourcing information to fulfill user requests |
US9721563B2 (en) | 2012-06-08 | 2017-08-01 | Apple Inc. | Name recognition system |
US10354650B2 (en) | 2012-06-26 | 2019-07-16 | Google Llc | Recognizing speech with mixed speech recognition models to generate transcriptions |
US9495129B2 (en) | 2012-06-29 | 2016-11-15 | Apple Inc. | Device, method, and user interface for voice-activated navigation and browsing of a document |
US10957310B1 (en) | 2012-07-23 | 2021-03-23 | Soundhound, Inc. | Integrated programming framework for speech and text understanding with meaning parsing |
US9576574B2 (en) | 2012-09-10 | 2017-02-21 | Apple Inc. | Context-sensitive handling of interruptions by intelligent digital assistant |
US9547647B2 (en) | 2012-09-19 | 2017-01-17 | Apple Inc. | Voice-based media searching |
US9886944B2 (en) | 2012-10-04 | 2018-02-06 | Nuance Communications, Inc. | Hybrid controller for ASR |
US9570076B2 (en) | 2012-10-30 | 2017-02-14 | Google Technology Holdings LLC | Method and system for voice recognition employing multiple voice-recognition techniques |
US9691377B2 (en) | 2013-07-23 | 2017-06-27 | Google Technology Holdings LLC | Method and device for voice recognition training |
US9395234B2 (en) | 2012-12-05 | 2016-07-19 | Cardiocom, Llc | Stabilizing base for scale |
US9288509B2 (en) | 2012-12-28 | 2016-03-15 | Turner Broadcasting System, Inc. | Method and system for providing synchronized advertisements and services |
US9542947B2 (en) | 2013-03-12 | 2017-01-10 | Google Technology Holdings LLC | Method and apparatus including parallell processes for voice recognition |
US9275638B2 (en) | 2013-03-12 | 2016-03-01 | Google Technology Holdings LLC | Method and apparatus for training a voice recognition model database |
US9368114B2 (en) | 2013-03-14 | 2016-06-14 | Apple Inc. | Context-sensitive handling of interruptions |
WO2014144579A1 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | System and method for updating an adaptive speech recognition model |
WO2014144949A2 (en) | 2013-03-15 | 2014-09-18 | Apple Inc. | Training an at least partial voice command system |
US9058805B2 (en) | 2013-05-13 | 2015-06-16 | Google Inc. | Multiple recognizer speech recognition |
WO2014197336A1 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for detecting errors in interactions with a voice-based digital assistant |
US9582608B2 (en) | 2013-06-07 | 2017-02-28 | Apple Inc. | Unified ranking with entropy-weighted information for phrase-based semantic auto-completion |
WO2014197334A2 (en) | 2013-06-07 | 2014-12-11 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
WO2014197335A1 (en) | 2013-06-08 | 2014-12-11 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
WO2014200728A1 (en) | 2013-06-09 | 2014-12-18 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
US10176167B2 (en) | 2013-06-09 | 2019-01-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
AU2014278595B2 (en) | 2013-06-13 | 2017-04-06 | Apple Inc. | System and method for emergency calls initiated by voice command |
US9548047B2 (en) | 2013-07-31 | 2017-01-17 | Google Technology Holdings LLC | Method and apparatus for evaluating trigger phrase enrollment |
CN103531197A (zh) * | 2013-10-11 | 2014-01-22 | 安徽科大讯飞信息科技股份有限公司 | Adaptive optimization method for command word recognition based on feedback of user speech recognition results |
US9507849B2 (en) | 2013-11-28 | 2016-11-29 | Soundhound, Inc. | Method for combining a query and a communication command in a natural language computer system |
US9292488B2 (en) | 2014-02-01 | 2016-03-22 | Soundhound, Inc. | Method for embedding voice mail in a spoken utterance using a natural language processing computer system |
US10779769B2 (en) * | 2014-02-19 | 2020-09-22 | Institut National De La Recherche Scientifique | Method and system for evaluating a noise level of a biosignal |
US11295730B1 (en) | 2014-02-27 | 2022-04-05 | Soundhound, Inc. | Using phonetic variants in a local context to improve natural language understanding |
CN103915092B (zh) * | 2014-04-01 | 2019-01-25 | 百度在线网络技术(北京)有限公司 | Speech recognition method and device |
US9564123B1 (en) | 2014-05-12 | 2017-02-07 | Soundhound, Inc. | Method and system for building an integrated user profile |
US9620105B2 (en) | 2014-05-15 | 2017-04-11 | Apple Inc. | Analyzing audio input for efficient speech and music recognition |
US10592095B2 (en) | 2014-05-23 | 2020-03-17 | Apple Inc. | Instantaneous speaking of content on touch devices |
US9502031B2 (en) | 2014-05-27 | 2016-11-22 | Apple Inc. | Method for supporting dynamic grammars in WFST-based ASR |
US9715875B2 (en) | 2014-05-30 | 2017-07-25 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
US9785630B2 (en) | 2014-05-30 | 2017-10-10 | Apple Inc. | Text prediction using combined word N-gram and unigram language models |
EP3480811A1 (en) | 2014-05-30 | 2019-05-08 | Apple Inc. | Multi-command single utterance input method |
US9734193B2 (en) | 2014-05-30 | 2017-08-15 | Apple Inc. | Determining domain salience ranking from ambiguous words in natural speech |
US10078631B2 (en) | 2014-05-30 | 2018-09-18 | Apple Inc. | Entropy-guided text prediction using combined word and character n-gram language models |
US10170123B2 (en) | 2014-05-30 | 2019-01-01 | Apple Inc. | Intelligent assistant for home automation |
US9760559B2 (en) | 2014-05-30 | 2017-09-12 | Apple Inc. | Predictive text input |
US9842101B2 (en) | 2014-05-30 | 2017-12-12 | Apple Inc. | Predictive conversion of language input |
US9633004B2 (en) | 2014-05-30 | 2017-04-25 | Apple Inc. | Better resolution when referencing to concepts |
US9430463B2 (en) | 2014-05-30 | 2016-08-30 | Apple Inc. | Exemplar-based natural language processing |
US10289433B2 (en) | 2014-05-30 | 2019-05-14 | Apple Inc. | Domain specific language for encoding assistant dialog |
WO2015199653A1 (en) * | 2014-06-24 | 2015-12-30 | Nuance Communications, Inc. | Methods and apparatus for joint stochastic and deterministic dictation formatting |
US9338493B2 (en) | 2014-06-30 | 2016-05-10 | Apple Inc. | Intelligent automated assistant for TV user interactions |
US10659851B2 (en) | 2014-06-30 | 2020-05-19 | Apple Inc. | Real-time digital assistant knowledge updates |
US10446141B2 (en) | 2014-08-28 | 2019-10-15 | Apple Inc. | Automatic speech recognition based on user feedback |
US9818400B2 (en) | 2014-09-11 | 2017-11-14 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
US10789041B2 (en) | 2014-09-12 | 2020-09-29 | Apple Inc. | Dynamic thresholds for always listening speech trigger |
US9668121B2 (en) | 2014-09-30 | 2017-05-30 | Apple Inc. | Social reminders |
US9886432B2 (en) | 2014-09-30 | 2018-02-06 | Apple Inc. | Parsimonious handling of word inflection via categorical stem + suffix N-gram language models |
US10074360B2 (en) | 2014-09-30 | 2018-09-11 | Apple Inc. | Providing an indication of the suitability of speech recognition |
US9646609B2 (en) | 2014-09-30 | 2017-05-09 | Apple Inc. | Caching apparatus for serving phonetic pronunciations |
US10127911B2 (en) | 2014-09-30 | 2018-11-13 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
US10552013B2 (en) | 2014-12-02 | 2020-02-04 | Apple Inc. | Data detection |
US9711141B2 (en) | 2014-12-09 | 2017-07-18 | Apple Inc. | Disambiguating heteronyms in speech synthesis |
US9865280B2 (en) | 2015-03-06 | 2018-01-09 | Apple Inc. | Structured dictation using intelligent automated assistants |
US9886953B2 (en) | 2015-03-08 | 2018-02-06 | Apple Inc. | Virtual assistant activation |
US9721566B2 (en) | 2015-03-08 | 2017-08-01 | Apple Inc. | Competing devices responding to voice triggers |
US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
US9899019B2 (en) | 2015-03-18 | 2018-02-20 | Apple Inc. | Systems and methods for structured stem and suffix language models |
US9842105B2 (en) | 2015-04-16 | 2017-12-12 | Apple Inc. | Parsimonious continuous-space phrase representations for natural language processing |
US10083688B2 (en) | 2015-05-27 | 2018-09-25 | Apple Inc. | Device voice control for selecting a displayed affordance |
US10127220B2 (en) | 2015-06-04 | 2018-11-13 | Apple Inc. | Language identification from short strings |
US10101822B2 (en) | 2015-06-05 | 2018-10-16 | Apple Inc. | Language input correction |
US10255907B2 (en) | 2015-06-07 | 2019-04-09 | Apple Inc. | Automatic accent detection using acoustic models |
US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
US10186254B2 (en) | 2015-06-07 | 2019-01-22 | Apple Inc. | Context-based endpoint detection |
US9715498B2 (en) | 2015-08-31 | 2017-07-25 | Microsoft Technology Licensing, Llc | Distributed server system for language understanding |
US10747498B2 (en) | 2015-09-08 | 2020-08-18 | Apple Inc. | Zero latency digital assistant |
US10671428B2 (en) | 2015-09-08 | 2020-06-02 | Apple Inc. | Distributed personal assistant |
US9697820B2 (en) | 2015-09-24 | 2017-07-04 | Apple Inc. | Unit-selection text-to-speech synthesis using concatenation-sensitive neural networks |
US10366158B2 (en) | 2015-09-29 | 2019-07-30 | Apple Inc. | Efficient word encoding for recurrent neural network language models |
US11010550B2 (en) | 2015-09-29 | 2021-05-18 | Apple Inc. | Unified language modeling framework for word prediction, auto-completion and auto-correction |
US11587559B2 (en) | 2015-09-30 | 2023-02-21 | Apple Inc. | Intelligent device identification |
US10691473B2 (en) | 2015-11-06 | 2020-06-23 | Apple Inc. | Intelligent automated assistant in a messaging environment |
WO2017080604A1 (en) | 2015-11-12 | 2017-05-18 | Telefonaktiebolaget Lm Ericsson (Publ) | Server, wireless device, methods and computer programs |
US10049668B2 (en) | 2015-12-02 | 2018-08-14 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
US10223066B2 (en) | 2015-12-23 | 2019-03-05 | Apple Inc. | Proactive assistance based on dialog communication between devices |
US10446143B2 (en) | 2016-03-14 | 2019-10-15 | Apple Inc. | Identification of voice inputs providing credentials |
US9934775B2 (en) | 2016-05-26 | 2018-04-03 | Apple Inc. | Unit-selection text-to-speech synthesis based on predicted concatenation parameters |
US9972304B2 (en) | 2016-06-03 | 2018-05-15 | Apple Inc. | Privacy preserving distributed evaluation framework for embedded personalized systems |
US10249300B2 (en) | 2016-06-06 | 2019-04-02 | Apple Inc. | Intelligent list reading |
US10049663B2 (en) | 2016-06-08 | 2018-08-14 | Apple, Inc. | Intelligent automated assistant for media exploration |
DK179309B1 (en) | 2016-06-09 | 2018-04-23 | Apple Inc | Intelligent automated assistant in a home environment |
US10192552B2 (en) | 2016-06-10 | 2019-01-29 | Apple Inc. | Digital assistant providing whispered speech |
US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
US10509862B2 (en) | 2016-06-10 | 2019-12-17 | Apple Inc. | Dynamic phrase expansion of language input |
US10586535B2 (en) | 2016-06-10 | 2020-03-10 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
US10490187B2 (en) | 2016-06-10 | 2019-11-26 | Apple Inc. | Digital assistant providing automated status report |
DK179343B1 (en) | 2016-06-11 | 2018-05-14 | Apple Inc | Intelligent task discovery |
DK201670540A1 (en) | 2016-06-11 | 2018-01-08 | Apple Inc | Application integration with a digital assistant |
DK179415B1 (en) | 2016-06-11 | 2018-06-14 | Apple Inc | Intelligent device arbitration and control |
DK179049B1 (en) | 2016-06-11 | 2017-09-18 | Apple Inc | Data driven natural language event detection and classification |
US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
US10701438B2 (en) | 2016-12-31 | 2020-06-30 | Turner Broadcasting System, Inc. | Automatic content recognition and verification in a broadcast chain |
US10971157B2 (en) | 2017-01-11 | 2021-04-06 | Nuance Communications, Inc. | Methods and apparatus for hybrid speech recognition processing |
DK179745B1 (en) | 2017-05-12 | 2019-05-01 | Apple Inc. | SYNCHRONIZATION AND TASK DELEGATION OF A DIGITAL ASSISTANT |
DK201770431A1 (en) | 2017-05-15 | 2018-12-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
US10410635B2 (en) | 2017-06-09 | 2019-09-10 | Soundhound, Inc. | Dual mode speech recognition |
US11063645B2 (en) | 2018-12-18 | 2021-07-13 | XCOM Labs, Inc. | Methods of wirelessly communicating with a group of devices |
US10756795B2 (en) | 2018-12-18 | 2020-08-25 | XCOM Labs, Inc. | User equipment with cellular link and peer-to-peer link |
US11330649B2 (en) | 2019-01-25 | 2022-05-10 | XCOM Labs, Inc. | Methods and systems of multi-link peer-to-peer communications |
US10756767B1 (en) | 2019-02-05 | 2020-08-25 | XCOM Labs, Inc. | User equipment for wirelessly communicating cellular signal with another user equipment |
WO2020246649A1 (ko) * | 2019-06-07 | 2020-12-10 | 엘지전자 주식회사 | Speech recognition method in an edge computing device |
US20210104233A1 (en) * | 2019-10-03 | 2021-04-08 | Ez-Ai Corp. | Interactive voice feedback system and method thereof |
CN110970031B (zh) * | 2019-12-16 | 2022-06-24 | 思必驰科技股份有限公司 | Speech recognition system and method |
US11586964B2 (en) * | 2020-01-30 | 2023-02-21 | Dell Products L.P. | Device component management using deep learning techniques |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0108354A2 (en) * | 1982-11-03 | 1984-05-16 | International Standard Electric Corporation | A data processing apparatus and method for use in speech recognition |
EP0177405A1 (fr) * | 1984-10-02 | 1986-04-09 | Regie Nationale Des Usines Renault | Radiotelephone system, in particular for a motor vehicle |
US4991217A (en) * | 1984-11-30 | 1991-02-05 | Ibm Corporation | Dual processor speech recognition system with dedicated data acquisition bus |
US5109509A (en) * | 1984-10-29 | 1992-04-28 | Hitachi, Ltd. | System for processing natural language including identifying grammatical rule and semantic concept of an undefined word |
EP0534410A2 (en) * | 1991-09-25 | 1993-03-31 | Nippon Hoso Kyokai | Method and apparatus for hearing assistance with speech speed control function |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US108354A (en) * | 1870-10-18 | Improvement in combined eaves-troughs and lightning-rods | ||
DE3519915A1 (de) * | 1985-06-04 | 1986-12-04 | Telefonbau Und Normalzeit Gmbh, 6000 Frankfurt | Method for speech recognition at terminals of telecommunication systems, in particular telephone systems |
JPS6269297A (ja) * | 1985-09-24 | 1987-03-30 | 日本電気株式会社 | Speaker verification terminal |
US5231670A (en) * | 1987-06-01 | 1993-07-27 | Kurzweil Applied Intelligence, Inc. | Voice controlled system and method for generating text from a voice controlled input |
US5321840A (en) * | 1988-05-05 | 1994-06-14 | Transaction Technology, Inc. | Distributed-intelligence computer system including remotely reconfigurable, telephone-type user terminal |
US5040212A (en) * | 1988-06-30 | 1991-08-13 | Motorola, Inc. | Methods and apparatus for programming devices to recognize voice commands |
US5325524A (en) * | 1989-04-06 | 1994-06-28 | Digital Equipment Corporation | Locating mobile objects in a distributed computer system |
US5012518A (en) * | 1989-07-26 | 1991-04-30 | Itt Corporation | Low-bit-rate speech coder using LPC data reduction processing |
US5146538A (en) * | 1989-08-31 | 1992-09-08 | Motorola, Inc. | Communication system and method with voice steering |
US5045082A (en) * | 1990-01-10 | 1991-09-03 | Alza Corporation | Long-term delivery device including loading dose |
US5280585A (en) * | 1990-09-28 | 1994-01-18 | Hewlett-Packard Company | Device sharing system using PCL macros |
WO1993001664A1 (en) * | 1991-07-08 | 1993-01-21 | Motorola, Inc. | Remote voice control system |
DE4126882A1 (de) * | 1991-08-14 | 1993-02-18 | Philips Patentverwaltung | Arrangement for speech transmission |
Date | Country | Application | Publication | Status |
---|---|---|---|---|
1994-10-26 | ZA | ZA948426A | ZA948426B | Unknown |
1994-11-15 | TW | TW083110578A | TW318239B | Not active: IP Right Cessation |
1994-12-09 | MY | MYPI94003300A | MY116482A | Unknown |
1994-12-19 | IL | IL11205794A | IL112057A0 | Not active: IP Right Cessation |
1994-12-20 | CA | CA002179759A | CA2179759C | Not active: Expired - Lifetime |
1994-12-20 | AT | AT95904956T | ATE261172T1 | Not active: IP Right Cessation |
1994-12-20 | DE | DE69433593T | DE69433593T2 | Not active: Expired - Lifetime |
1994-12-20 | EP | EP03021806A | EP1381029A1 | Not active: Ceased |
1994-12-20 | EP | EP95904956A | EP0736211B1 | Not active: Expired - Lifetime |
1994-12-20 | WO | PCT/US1994/014803 | WO1995017746A1 | Active: Application Filing |
1994-12-20 | KR | KR1019960703304A | KR100316077B1 | Not active: IP Right Cessation |
1994-12-20 | BR | BR9408413A | BR9408413A | Not active: IP Right Cessation |
1994-12-20 | JP | JP51760595A | JP3661874B2 | Not active: Expired - Lifetime |
1994-12-20 | EP | EP08152546A | EP1942487A1 | Not active: Withdrawn |
1994-12-20 | AU | AU13753/95A | AU692820B2 | Not active: Ceased |
1994-12-20 | CN | CN94194566A | CN1119794C | Not active: Expired - Lifetime |
1996-04-04 | US | US08/627,333 | US5956683A | Not active: Expired - Lifetime |
1996-06-20 | FI | FI962572A | FI118909B | Not active: IP Right Cessation |
1998-08-21 | HK | HK98110090A | HK1011109A1 | Not active: IP Right Cessation |
2007-12-03 | FI | FI20070933A | FI20070933A | Not active: IP Right Cessation |
Also Published As
Publication number | Publication date |
---|---|
CN1138386A (zh) | 1996-12-18 |
ATE261172T1 (de) | 2004-03-15 |
US5956683A (en) | 1999-09-21 |
CA2179759C (en) | 2005-11-15 |
BR9408413A (pt) | 1997-08-05 |
EP1942487A1 (en) | 2008-07-09 |
AU1375395A (en) | 1995-07-10 |
MY116482A (en) | 2004-02-28 |
FI962572A (fi) | 1996-08-20 |
AU692820B2 (en) | 1998-06-18 |
FI118909B (fi) | 2008-04-30 |
EP0736211B1 (en) | 2004-03-03 |
EP1381029A1 (en) | 2004-01-14 |
TW318239B (zh) | 1997-10-21 |
JP3661874B2 (ja) | 2005-06-22 |
HK1011109A1 (en) | 1999-07-02 |
WO1995017746A1 (en) | 1995-06-29 |
KR100316077B1 (ko) | 2002-02-28 |
JPH09507105A (ja) | 1997-07-15 |
FI20070933A (fi) | 2007-12-03 |
FI962572A0 (fi) | 1996-06-20 |
IL112057A0 (en) | 1995-03-15 |
CA2179759A1 (en) | 1995-06-29 |
DE69433593D1 (de) | 2004-04-08 |
DE69433593T2 (de) | 2005-02-03 |
EP0736211A1 (en) | 1996-10-09 |
ZA948426B (en) | 1995-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN1119794C (zh) | Distributed voice recognition system | |
US6594628B1 (en) | Distributed voice recognition system | |
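CN1327405C (zh) | Method and apparatus for speech recognition in a distributed speech recognition system | |
KR100923896B1 (ko) | Method and apparatus for transmitting voice activity in a distributed speech recognition system | |
CN1168070C (zh) | Distributed speech recognition system | |
CN1306472C (zh) | System and method for transmitting voice activity in a distributed speech recognition system | |
CN1215491A (zh) | Language processing | |
CN100527224C (zh) | System and method for efficiently storing speech recognition models | |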
US11763801B2 (en) | Method and system for outputting target audio, readable storage medium, and electronic device | |
Rabiner et al. | Historical Perspective of the Field of ASR/NLU | |
Touazi et al. | An experimental framework for Arabic digits speech recognition in noisy environments | |
JP3531342B2 (ja) | Speech processing apparatus and speech processing method | |
CN111199747A (zh) | Artificial intelligence communication system and communication method | |
Sakka et al. | Using geometric spectral subtraction approach for feature extraction for DSR front-end Arabic system | |
CN117041430B (zh) | Method and device for improving the outbound call quality and robustness of an intelligent coordinated outbound call system | |
Stein et al. | Automatic Speech Recognition on Firefighter TETRA broadcast | |
CN117909486A (zh) | Multimodal question answering method and system based on emotion recognition and a large language model | |
Shanthamallappa et al. | Robust Automatic Speech Recognition Using Wavelet-Based Adaptive Wavelet Thresholding: A Review | |
Hokking et al. | A hybrid of fractal code descriptor and harmonic pattern generator for improving speech recognition of different sampling rates | |
US20020116180A1 (en) | Method for transmission and storage of speech | |
Flanagan et al. | Integrated information modalities for human/machine communication: HuMaNet, an experimental system for conferencing | |
Gerazov et al. | Overview of Feature Selection for Automatic Speech Recognition | |
Rabiner | Telecommunications applications of speech processing | |
Mwangi | Speaker independent isolated word recognition | |
Beaufays et al. | Porting channel robustness across languages. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
C56 | Change in the name or address of the patentee | ||
CP03 | Change of name, title or address | Address after: San Diego, California, USA; Patentee after: Qualcomm Inc.; Address before: San Diego, California, USA; Patentee before: Qualcomm Inc. |
C17 | Cessation of patent right | ||
CX01 | Expiry of patent term | Expiration termination date: 2014-12-20; Granted publication date: 2003-08-27 |