EP1058235A2 - Playback method for voice-controlled systems with text-based speech synthesis (Wiedergabeverfahren für sprachgesteuerte Systeme mit text-basierter Sprachsynthese) - Google Patents
Playback method for voice-controlled systems with text-based speech synthesis
- Publication number
- EP1058235A2 (application number EP00108486A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- character string
- variant
- converted
- speech
- speech input
- Prior art date
- 1999-05-05
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
- G10L13/02—Methods for producing synthetic speech; Speech synthesisers
- G10L13/04—Details of speech synthesis systems, e.g. synthesiser structure or memory management
Definitions
- The invention concerns the improvement of voice-controlled systems with text-based speech synthesis, in particular the improvement of the synthetic playback of stored character strings whose pronunciation is subject to peculiarities.
- The subject of speech synthesis is the machine transformation of the symbolic representation of an utterance into an acoustic signal that a human listener recognizes as sufficiently similar to human speech.
- A speech synthesis technique is a technique that allows a speech synthesizer to be built.
- Examples of speech synthesis techniques are direct synthesis, synthesis using a model, and simulation of the vocal tract.
- In direct synthesis, either parts of the speech signal are assembled into the corresponding words from stored signal pieces (e.g. one per phoneme), or the transfer function of the vocal tract, which humans use for speech production, is simulated by concentrating the energy of a signal in certain frequency ranges.
- Voiced sounds are represented by a quasi-periodic excitation of a certain frequency.
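- The following is a minimal sketch, in Python, of this source-filter idea: a quasi-periodic pulse train (the voiced excitation) is shaped by a few second-order resonators that stand in for the vocal-tract transfer function. The formant frequencies, bandwidths and other values are illustrative assumptions, not taken from the patent.

```python
# Minimal source-filter sketch (illustrative, not the patent's method):
# a quasi-periodic pulse train models the voiced excitation, and a chain of
# second-order resonators models the vocal-tract transfer function.
import numpy as np
from scipy.signal import lfilter

FS = 16000        # sampling rate in Hz
F0 = 110          # fundamental frequency of the voiced excitation in Hz

def pulse_train(f0: float, duration: float, fs: int) -> np.ndarray:
    """Quasi-periodic excitation: one impulse per pitch period."""
    signal = np.zeros(int(duration * fs))
    signal[::int(fs / f0)] = 1.0
    return signal

def resonator(x: np.ndarray, freq: float, bandwidth: float, fs: int) -> np.ndarray:
    """Second-order IIR resonator approximating one formant."""
    r = np.exp(-np.pi * bandwidth / fs)
    a = [1.0, -2.0 * r * np.cos(2.0 * np.pi * freq / fs), r * r]  # pole pair
    return lfilter([1.0 - r], a, x)

excitation = pulse_train(F0, 0.3, FS)
speech = excitation
# Rough formant frequencies/bandwidths for an /a/-like vowel (example values).
for formant_freq, formant_bw in [(730, 90), (1090, 110), (2440, 170)]:
    speech = resonator(speech, formant_freq, formant_bw, FS)
speech /= np.max(np.abs(speech))   # normalize the amplitude
```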
- The phoneme mentioned above is the smallest meaning-distinguishing, but itself not meaningful, unit of a language.
- Two words of different meaning that differ in only one phoneme (in German e.g. Fisch - Tisch; Wald - Wild) form a minimal pair.
- The number of phonemes in a language is comparatively small (between 20 and 60).
- Diphones are mostly used in direct synthesis.
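- As an illustration of the concatenative branch of direct synthesis, a toy sketch: stored signal pieces (here one placeholder array per phoneme, whereas real systems would use recorded diphones) are simply joined into the signal for a whole word. The inventory and labels are hypothetical example values.

```python
# Toy concatenative "direct synthesis": stored signal pieces, here one
# placeholder array per phoneme (real systems would use recorded diphones),
# are joined into the signal for a whole word. Labels are SAMPA-like examples.
import numpy as np

UNIT_INVENTORY = {
    "b": np.zeros(800),
    "o": np.zeros(1600),
    "x": np.zeros(1200),   # the "ch" sound of "Bochum"
    "u": np.zeros(1400),
    "m": np.zeros(1000),
}

def concatenate_units(phonemes: list[str]) -> np.ndarray:
    """Join the stored signal pieces for a phoneme sequence."""
    return np.concatenate([UNIT_INVENTORY[p] for p in phonemes])

word_signal = concatenate_units(["b", "o", "x", "u", "m"])   # synthetic "Bochum"
```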
- Phonemes or sequences of phonemes are notated with the help of the International Phonetic Alphabet (IPA).
- The conversion of a text into a sequence of phonetic alphabet characters is called phonetic transcription.
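- A toy illustration of such a phonetic transcription by general rules, and of why it can fail for proper names: with a greedy rule table the German ending "oe" is always rendered as "ö", which is wrong for a place name such as Itzehoe. The rule set and phoneme symbols below are simplified assumptions, not the converter described later.

```python
# Toy grapheme-to-phoneme conversion by "general rules" (simplified, assumed
# rules, not the converter 15): a greedy left-to-right rule table. The general
# German rule "oe" -> "ø" ("ö") gives the wrong result for place names such as
# Itzehoe, where "oe" marks a long "o".
GENERAL_RULES = [
    ("sch", "ʃ"),
    ("oe", "ø"),   # general rule: read "oe" as "ö"
    ("ei", "aɪ"),
    ("z", "ts"),
    ("h", ""),     # silent lengthening "h" (simplification)
]

def transcribe(word: str) -> str:
    """Apply the rule table greedily from left to right."""
    word = word.lower()
    result, i = [], 0
    while i < len(word):
        for grapheme, phoneme in GENERAL_RULES:
            if word.startswith(grapheme, i):
                result.append(phoneme)
                i += len(grapheme)
                break
        else:
            result.append(word[i])   # no rule matched: keep the letter
            i += 1
    return "".join(result)

print(transcribe("Itzehoe"))   # -> "ittseø": the ending is rendered as "ö"
```

- It is exactly this kind of rule-based mis-pronunciation that the comparison with the actually spoken input, described below, is meant to detect.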
- In synthesis using a model, a production model is formed, which is mostly based on minimizing the difference between a digitized human speech signal (the original signal) and a predicted signal.
- Another method consists in the simulation of the vocal tract, in which its shape and the position of the individual articulation organs (tongue, jaw, lips) are reproduced. To do this, a mathematical model of the flow conditions in such a defined vocal tract is generated, and the speech signal is computed with the help of this model.
- The phonemes or diphones used in direct synthesis must first be obtained by segmentation from natural speech. Here two approaches can be distinguished: implicit segmentation, which uses only the information contained in the speech signal itself, and explicit segmentation, which additionally uses information such as the number of phonemes contained in the utterance.
- For segmentation, features must first be extracted from the speech signal, on the basis of which it is possible to distinguish between the segments. These features are then assigned to classes.
- Possible methods for feature extraction include spectral analysis, filter-bank analysis, or the method of linear prediction.
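- A small sketch of one of these feature-extraction options, a short-time filter-bank analysis: the signal is framed, a magnitude spectrum is computed per frame, and the spectrum is collapsed into a few band energies. Frame length, hop size and band layout are assumed example values.

```python
# Short-time filter-bank analysis as one possible feature extraction
# (frame length, hop size and band layout are example values).
import numpy as np

def filterbank_features(signal: np.ndarray, fs: int,
                        frame_len: int = 400, hop: int = 160,
                        n_bands: int = 8) -> np.ndarray:
    """Return one vector of log band energies per frame."""
    edges = np.linspace(0.0, fs / 2.0, n_bands + 1)      # equally wide bands
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)
    window = np.hanning(frame_len)
    features = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spectrum = np.abs(np.fft.rfft(signal[start:start + frame_len] * window))
        features.append([spectrum[(freqs >= lo) & (freqs < hi)].sum()
                         for lo, hi in zip(edges[:-1], edges[1:])])
    return np.log(np.asarray(features) + 1e-10)
```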
- For the classification, for example, hidden Markov models (HMM), artificial neural networks, or dynamic time warping (a method for time normalization) can be used.
- A common approach is a voiced/unvoiced/silence classification, corresponding to the different forms of excitation during speech production in the vocal tract.
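- A minimal sketch of the dynamic time warping mentioned above: two feature sequences of different length are aligned, and the length-normalized accumulated cost can be checked against a threshold, as is needed later when the spoken and the synthesized form of a word are compared although their phonemes have different durations. The feature sequences, the Euclidean local distance and the threshold value are illustrative assumptions.

```python
# Minimal dynamic time warping (DTW): align two feature sequences of different
# length and return a length-normalized accumulated cost that can be checked
# against a threshold. Local distance and threshold are illustrative.
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Accumulated DTW cost between feature sequences a (n x d) and b (m x d)."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.linalg.norm(a[i - 1] - b[j - 1])
            # match, insertion, deletion: one frame may map to several frames
            cost[i, j] = local + min(cost[i - 1, j - 1],
                                     cost[i - 1, j],
                                     cost[i, j - 1])
    return cost[n, m] / (n + m)

# Example: the synthesized word may be shorter or longer than the spoken one.
spoken = np.random.rand(120, 12)        # e.g. 120 frames of 12 features
synthesized = np.random.rand(95, 12)
ACCEPT_THRESHOLD = 0.8                  # illustrative value
if dtw_distance(spoken, synthesized) < ACCEPT_THRESHOLD:
    print("synthesized form is close enough to the spoken word")
```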
- Because the special treatment of certain words of a language is extremely complex, it has been usual in voice-controlled arrangements to form the announcement that an arrangement is to output from a mixture of spoken and synthesized speech.
- For a route finder, for example, the desired destination, which often shows peculiarities of pronunciation compared to the other words of the corresponding language and which is spoken by the user in voice-controlled arrangements, is recorded and copied into the corresponding destination announcement.
- The procedure is simplified if, according to claim 4, the speech input and the converted character string or the variants formed therefrom are segmented before the comparison. This segmentation allows segments in which no differences, or only differences below the threshold, are found to be excluded from further treatment.
- Different segmentation approaches can also be used for the speech input and for the character string. This offers advantages especially with regard to the original speech input, because its segmentation must use the information contained in the speech signal, which can only be determined in a very complex step, whereas the segmentation of the character strings can very simply use the known number of phonemes contained in the utterance.
- A particularly simple procedure is achieved if, according to claim 9, at least one similar replacement phoneme is linked to each phoneme or stored in a list.
- The computing work is further reduced if, according to claim 10, for a variant of a character string that has been determined to be worth reproducing, the special features associated with rendering the string are stored together with the string. In this case the special pronunciation of the respective character string is available in the memory without much effort when it is used later (a sketch of this variant formation and caching follows below).
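- A sketch of the variant formation and storage summarized in the preceding paragraphs (claims 8 to 10), under the assumption that both the speech input and the synthesized string are already segmented per phoneme and that a segment-level distance function is available. The replacement table, the helper names and the thresholding are hypothetical, not the patent's concrete implementation.

```python
# Sketch of variant formation with replacement phonemes and caching of a
# pronunciation once it is found worth reproducing. segment_distance(), the
# replacement table and the threshold are hypothetical placeholders; both
# inputs are assumed to be segmented per phoneme.
from typing import Callable, Sequence

REPLACEMENTS: dict[str, list[str]] = {   # at least one similar replacement phoneme
    "ø": ["oː", "o"],                    # "ö" may be replaced by a long/short "o"
    "eː": ["ɛ", "ə"],
}

PRONUNCIATION_CACHE: dict[str, list[str]] = {}   # string -> accepted phoneme variant

def form_variant(synthesized: list[str],
                 spoken_segments: Sequence[object],
                 segment_distance: Callable[[str, object], float],
                 threshold: float) -> list[str]:
    """Replace phonemes in segments that deviate too much from the speech input."""
    variant = list(synthesized)
    for i, (phoneme, spoken_seg) in enumerate(zip(synthesized, spoken_segments)):
        if segment_distance(phoneme, spoken_seg) <= threshold:
            continue                     # this segment already matches, skip it
        candidates = REPLACEMENTS.get(phoneme, [])
        if candidates:                   # keep the best-fitting replacement phoneme
            variant[i] = min(candidates,
                             key=lambda p: segment_distance(p, spoken_seg))
    return variant

def playable_form(string: str, synthesized: list[str],
                  spoken_segments: Sequence[object],
                  segment_distance: Callable[[str, object], float],
                  threshold: float) -> list[str]:
    """Return a pronunciation for the string, reusing a stored one if available."""
    if string in PRONUNCIATION_CACHE:
        return PRONUNCIATION_CACHE[string]
    variant = form_variant(synthesized, spoken_segments, segment_distance, threshold)
    mean_dev = sum(segment_distance(p, s)
                   for p, s in zip(variant, spoken_segments)) / len(variant)
    if mean_dev <= threshold:            # worth reproducing: store its peculiarities
        PRONUNCIATION_CACHE[string] = variant
    return variant
```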
- The character strings can be, in the case of route finders, street or place names; in a mailbox application they can be, as in a telephone book, the names of subscribers. So that the memory can easily be loaded with the appropriate information, or the stored information can easily be updated, the respective character strings are present as text.
- In Fig. 1, a memory is designated 10.
- This memory 10, which for the illustration of the invention is assumed to contain German city names, belongs to a route finder 11.
- This route finder 11 also includes an arrangement 12 with which natural voice inputs can be recorded and temporarily stored. In the present case this is realized in such a way that the respective voice input is detected by a microphone 13 and stored in a voice memory 14. If the user of the route finder 11 is now prompted to enter his destination, the destination spoken by the user, e.g. "Bochum" or "Itzehoe", is detected by the microphone 13 and passed on to the voice memory 14.
- Because the route finder 11 has either been informed of its current location or determines it itself, it will first determine the appropriate route to the destination on the basis of the desired destination and the current location. If the route finder 11 is not only to show the corresponding route graphically but also to deliver spoken announcements, the textual character strings of the respective announcement are described phonetically according to general rules and then converted into a purely synthetic speech form. In the exemplary embodiment shown in Fig. 1, the phonetic description of the stored character strings takes place in the converter 15 and the synthesizing in the speech synthesizer arrangement 16 arranged after it.
- The respective character string, once it has passed through the converter 15 and the speech synthesizer arrangement 16, is output via a loudspeaker 17 to the environment as a word corresponding to the phonetic conditions of the respective language and can be understood as such.
- After the destination has been entered, the route finder 11 might, for example, play back approximately the following sentence: "You have chosen Berlin as the destination. If this does not meet your expectations, please enter a new destination now." Even though this information can be reproduced correctly following general rules, problems arise when the destination is not Berlin but Laboe. If the character string that textually represents the destination Laboe is written phonetically in the converter 15 according to general rules and then brought into a synthetic form in the speech synthesizer 16 for output through the loudspeaker 17, like the rest of the information above, the result given over the loudspeaker 17 would be correct only if, according to general rules, the ending "oe" were generally to be reproduced as "ö".
- To this end, the destination actually spoken by the user and the character string corresponding to the destination, after the latter has passed through the converter 15 and the speech synthesizer 16, are fed to a comparison arrangement 18 and then compared. If the synthesized string shows a high coincidence with the originally spoken destination, lying above a threshold, the synthesized string is used for the playback. If this match cannot be ascertained, a variant of the original string is formed in the speech synthesis arrangement 16, and in the comparator 18 a comparison between the originally spoken destination and the variant formed is carried out again.
- The route finder 11 is designed such that, as soon as a character string or a variant has the required agreement with the original, its playback via the loudspeaker 17 takes place and further variant formation is stopped immediately.
- The route finder 11 can also be modified such that a plurality of variants are formed and then, from these variants, the variant is selected that most closely matches the original (see the control-flow sketch below).
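- A control-flow sketch of the comparison just described: the synthesized form is compared with the actually spoken destination; if the deviation lies above the threshold, either one variant after another is formed until one lies below the threshold, or several variants are formed and the closest one is chosen. The helpers synthesize(), distance() and next_variants() are assumed placeholders standing in for the converter 15 / synthesizer 16, the comparator 18 and the variant formation.

```python
# Control-flow sketch of the comparison and variant selection (placeholders:
# synthesize() stands in for converter 15 / synthesizer 16, distance() for
# comparator 18, next_variants() for the variant formation).
from typing import Callable, Iterable

def select_playback_form(spoken_input,
                         stored_string,
                         synthesize: Callable,
                         distance: Callable,
                         next_variants: Callable[[object], Iterable],
                         threshold: float,
                         pick_best_of_all: bool = False):
    """Return the form to be played back for a stored character string."""
    candidate = synthesize(stored_string)
    if distance(candidate, spoken_input) < threshold:
        return candidate                      # synthesized string matches well enough

    if pick_best_of_all:
        # form several variants and play the one closest to the spoken input
        variants = [synthesize(v) for v in next_variants(stored_string)]
        if not variants:
            return None
        best = min(variants, key=lambda v: distance(v, spoken_input))
        return best if distance(best, spoken_input) < threshold else None
    # form one variant at a time and stop as soon as one lies below the threshold
    for variant in next_variants(stored_string):
        candidate = synthesize(variant)
        if distance(candidate, spoken_input) < threshold:
            return candidate                  # further variant formation stops here
    return None                               # no suitable variant was found
```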
- In Fig. 2a, a speech signal in the time domain of the word Itzehoe as actually spoken by a user is represented.
- Fig. 2b also shows a speech signal in the time domain of the word Itzehoe, but in Fig. 2b the word Itzehoe has been generated from a corresponding stored character string, first described phonetically in the converter 15 according to general rules and then brought into a synthetic form in the speech synthesizer 16.
- From the illustration according to Fig. 2b it can clearly be seen that, when the general rules are applied, the ending "oe" of the word Itzehoe is reproduced as "ö". In order, however, to exclude this incorrect playback, the spoken and the synthesized form are compared with each other in a comparator 18.
- In another embodiment (not shown), the converter 15' can be formed by the converter 15.
- The process sequence can also be modified. If it is found that a deviation between the spoken and the original synthetic form exists, and a plurality of replacement phonemes are contained in the list stored in the memory 21, a plurality of variants can also be formed at the same time and compared with the actually spoken word. The variant that most closely matches the spoken word is then played back.
- The additional memory 22 is not limited merely to holding information on the correct pronunciation of stored strings.
- If a comparison in the comparator 18 shows that between the spoken and the synthesized form of a word there is no deviation, or only a deviation below a threshold, this can be stored in the additional memory 22 for this word, which excludes an elaborate comparison in the comparator 18 when this word is used again in the future.
- As can be seen, the segments 19 according to Fig. 2a and the segments 20 according to Fig. 2b do not have the same format.
- For example, segment 20.1 has a larger width compared to segment 19.1, while segment 20.2 is substantially narrower than the corresponding segment 19.2. This is due to the fact that the "speech length" of the different phonemes to be compared can differ.
- The comparison arrangement 18 is therefore designed so that different speaking lengths of a phoneme do not in themselves indicate a mutual deviation.
- It should also be noted that, when different segmentation methods are used for the spoken and the synthesized form, a different number of segments 19, 20 may be obtained. If this occurs, a certain segment 19, 20 should be compared not only with one corresponding segment 19, 20, but also with the predecessor and successor of the corresponding segment 19, 20. It is thus also possible to replace one phoneme by two other phonemes; this procedure is also possible in the opposite direction. If there is no match for a segment 19, 20, it can be excluded or replaced by two better-fitting segments (a sketch of such a neighbourhood comparison follows below).
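- A small sketch of the neighbourhood comparison described above, for the case where the spoken and the synthesized form yield different numbers of segments: a segment is also compared with the predecessor and the successor of its counterpart, so that one phoneme may end up corresponding to two segments (and vice versa). segment_distance() and the threshold are placeholders.

```python
# Segment-level neighbourhood comparison (placeholders: segment_distance(),
# threshold): a synthesized segment may also be matched against the
# predecessor or successor of its spoken counterpart.
def match_segment(i, synth_segments, spoken_segments, segment_distance, threshold):
    """Return the index of the best spoken counterpart for synth segment i, or None."""
    candidates = [j for j in (i - 1, i, i + 1) if 0 <= j < len(spoken_segments)]
    if not candidates:
        return None
    best = min(candidates, key=lambda j: segment_distance(synth_segments[i],
                                                          spoken_segments[j]))
    if segment_distance(synth_segments[i], spoken_segments[best]) <= threshold:
        return best
    return None   # no match: exclude this segment or replace it by two better-fitting ones
```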
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Machine Translation (AREA)
- Document Processing Apparatus (AREA)
- Input Circuits Of Receivers And Coupling Of Receivers And Audio Equipment (AREA)
Abstract
Description
Explicit segmentation, on the other hand, uses additional information, such as the number of phonemes contained in the utterance.
These features are then assigned to classes.
Possible methods for feature extraction include spectral analysis, filter-bank analysis, or the method of linear prediction.
In more recent approaches, self-organizing feature maps (Kohonen maps) are frequently used for the classification. This special type of artificial neural network is able to model the processes taking place in the human brain.
- Fig. 1
- a schematic sequence according to the invention
- Fig. 2
- a comparison of segmented utterances
Claims (10)
- Playback method for voice-controlled systems with text-based speech synthesis,
characterized in that, when a speech input is present that has actually been spoken and corresponds to a stored character string, the converted character string is compared with the speech input before a playback of the character string that has been phonetically described according to general rules and converted into a purely synthetic form, that, upon detection of a deviation of the converted character string from the speech input lying above a threshold, at least one variant of the converted character string is formed, and that one of the variants formed, provided that it shows a deviation lying below the threshold in a comparison with the speech input, is output instead of the converted character string. - Playback method according to claim 1,
characterized in that in step two only one variant is formed at a time, and that, if in step three a comparison of the variant with the speech input still shows a deviation lying above the threshold, step two is carried out at least once more in order to form a new variant. - Playback method according to claim 1,
characterized in that in step two at least two variants are formed, and that, when variants are present that each have a deviation below the threshold in comparison with the speech input, the variant that has the smallest deviation from the speech input is always reproduced. - Method according to one of claims 1 to 3,
characterized in that, before a comparison of the speech input with the converted character string or the variant(s) formed therefrom, a segmentation of the speech input and of the converted character string or of the variant(s) formed takes place. - Playback method according to claim 4,
characterized in that the same segmentation approach is used both for segmenting the speech input and for segmenting the converted character string or the variant(s) derived therefrom. - Playback method according to claim 4,
characterized in that a different segmentation approach is used in each case for segmenting the speech input and for segmenting the converted character string or the variant(s) derived therefrom. - Playback method according to claim 4,
characterized in that an explicit segmentation approach is used for segmenting the converted character string or the variant(s) derived therefrom, and an implicit segmentation approach is used for segmenting the speech input. - Playback method according to one of claims 4 to 7,
characterized in that the converted character string present in segmented form and the segmented speech input are examined for commonalities in the corresponding segments, and that, when a deviation lying above a threshold value is present in two corresponding segments, the phoneme present in that segment of the converted character string is replaced by a replacement phoneme. - Playback method according to claim 8,
characterized in that at least one replacement phoneme similar to that phoneme is linked to each phoneme. - Playback method according to one of claims 1 to 9,
characterized in that, as soon as a variant of a character string is determined to be worth reproducing, the special features associated with the playback of the character string are stored in connection with the character string.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19920501A DE19920501A1 (de) | 1999-05-05 | 1999-05-05 | Wiedergabeverfahren für sprachgesteuerte Systeme mit textbasierter Sprachsynthese |
DE19920501 | 1999-05-05 |
Publications (3)
Publication Number | Publication Date |
---|---|
EP1058235A2 true EP1058235A2 (de) | 2000-12-06 |
EP1058235A3 EP1058235A3 (de) | 2003-02-05 |
EP1058235B1 EP1058235B1 (de) | 2003-11-05 |
Family
ID=7906935
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP00108486A Expired - Lifetime EP1058235B1 (de) | 1999-05-05 | 2000-04-19 | Wiedergabeverfahren für sprachgesteuerte Systeme mit text-basierter Sprachsynthese |
Country Status (5)
Country | Link |
---|---|
US (1) | US6546369B1 (de) |
EP (1) | EP1058235B1 (de) |
JP (1) | JP4602511B2 (de) |
AT (1) | ATE253762T1 (de) |
DE (2) | DE19920501A1 (de) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP1422638A2 (de) * | 2002-11-19 | 2004-05-26 | Detmar Schäfer | Rechnergestützte Ermittlung einer Ähnlichkeit eines elektronisch erfassten ersten Kennzeichens zu mindestens einem zweiten solchen Kennzeichen |
US7167824B2 (en) | 2002-02-14 | 2007-01-23 | Sail Labs Technology Ag | Method for generating natural language in computer-based dialog systems |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4759827B2 (ja) * | 2001-03-28 | 2011-08-31 | 日本電気株式会社 | 音声セグメンテーション装置及びその方法並びにその制御プログラム |
US7107215B2 (en) * | 2001-04-16 | 2006-09-12 | Sakhr Software Company | Determining a compact model to transcribe the arabic language acoustically in a well defined basic phonetic study |
DE60314844T2 (de) * | 2003-05-07 | 2008-03-13 | Harman Becker Automotive Systems Gmbh | Verfahren und Vorrichtung zur Sprachausgabe, Datenträger mit Sprachdaten |
CN1879146B (zh) * | 2003-11-05 | 2011-06-08 | 皇家飞利浦电子股份有限公司 | 用于语音到文本的转录***的错误检测 |
JP2006047866A (ja) * | 2004-08-06 | 2006-02-16 | Canon Inc | 電子辞書装置およびその制御方法 |
US20060136195A1 (en) * | 2004-12-22 | 2006-06-22 | International Business Machines Corporation | Text grouping for disambiguation in a speech application |
JP4385949B2 (ja) * | 2005-01-11 | 2009-12-16 | トヨタ自動車株式会社 | 車載チャットシステム |
US20070016421A1 (en) * | 2005-07-12 | 2007-01-18 | Nokia Corporation | Correcting a pronunciation of a synthetically generated speech object |
US20070129945A1 (en) * | 2005-12-06 | 2007-06-07 | Ma Changxue C | Voice quality control for high quality speech reconstruction |
US8504365B2 (en) * | 2008-04-11 | 2013-08-06 | At&T Intellectual Property I, L.P. | System and method for detecting synthetic speaker verification |
US8380503B2 (en) | 2008-06-23 | 2013-02-19 | John Nicholas and Kristin Gross Trust | System and method for generating challenge items for CAPTCHAs |
US8752141B2 (en) | 2008-06-27 | 2014-06-10 | John Nicholas | Methods for presenting and determining the efficacy of progressive pictorial and motion-based CAPTCHAs |
US9564120B2 (en) * | 2010-05-14 | 2017-02-07 | General Motors Llc | Speech adaptation in speech synthesis |
KR20170044849A (ko) * | 2015-10-16 | 2017-04-26 | 삼성전자주식회사 | 전자 장치 및 다국어/다화자의 공통 음향 데이터 셋을 활용하는 tts 변환 방법 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE2435654C2 (de) * | 1974-07-24 | 1983-11-17 | Gretag AG, 8105 Regensdorf, Zürich | Verfahren und Vorrichtung zur Analyse und Synthese von menschlicher Sprache |
NL8302985A (nl) * | 1983-08-26 | 1985-03-18 | Philips Nv | Multipulse excitatie lineair predictieve spraakcodeerder. |
US5029200A (en) * | 1989-05-02 | 1991-07-02 | At&T Bell Laboratories | Voice message system using synthetic speech |
US5293449A (en) * | 1990-11-23 | 1994-03-08 | Comsat Corporation | Analysis-by-synthesis 2,4 kbps linear predictive speech codec |
GB9223066D0 (en) * | 1992-11-04 | 1992-12-16 | Secr Defence | Children's speech training aid |
FI98163C (fi) * | 1994-02-08 | 1997-04-25 | Nokia Mobile Phones Ltd | Koodausjärjestelmä parametriseen puheenkoodaukseen |
US6005549A (en) * | 1995-07-24 | 1999-12-21 | Forest; Donald K. | User interface method and apparatus |
US5913193A (en) * | 1996-04-30 | 1999-06-15 | Microsoft Corporation | Method and system of runtime acoustic unit selection for speech synthesis |
JPH10153998A (ja) * | 1996-09-24 | 1998-06-09 | Nippon Telegr & Teleph Corp <Ntt> | 補助情報利用型音声合成方法、この方法を実施する手順を記録した記録媒体、およびこの方法を実施する装置 |
US6163769A (en) * | 1997-10-02 | 2000-12-19 | Microsoft Corporation | Text-to-speech using clustered context-dependent phoneme-based units |
US6081780A (en) * | 1998-04-28 | 2000-06-27 | International Business Machines Corporation | TTS and prosody based authoring system |
US6173263B1 (en) * | 1998-08-31 | 2001-01-09 | At&T Corp. | Method and system for performing concatenative speech synthesis using half-phonemes |
US6266638B1 (en) * | 1999-03-30 | 2001-07-24 | At&T Corp | Voice quality compensation system for speech synthesis based on unit-selection speech database |
-
1999
- 1999-05-05 DE DE19920501A patent/DE19920501A1/de not_active Withdrawn
-
2000
- 2000-04-19 EP EP00108486A patent/EP1058235B1/de not_active Expired - Lifetime
- 2000-04-19 AT AT00108486T patent/ATE253762T1/de not_active IP Right Cessation
- 2000-04-19 DE DE50004296T patent/DE50004296D1/de not_active Expired - Lifetime
- 2000-04-27 JP JP2000132902A patent/JP4602511B2/ja not_active Expired - Fee Related
- 2000-05-05 US US09/564,787 patent/US6546369B1/en not_active Expired - Lifetime
Non-Patent Citations (1)
Title |
---|
DESHMUKH N ET AL: "Automated generation of N-best pronunciations of proper nouns", 1996 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH, AND SIGNAL PROCESSING - PROCEEDINGS (ICASSP), Atlanta, May 7-10, 1996, New York, IEEE, US, Vol. 1, Conf. 21, May 7, 1996 (1996-05-07), pages 283-286, XP002164538, ISBN: 0-7803-3193-1 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7167824B2 (en) | 2002-02-14 | 2007-01-23 | Sail Labs Technology Ag | Method for generating natural language in computer-based dialog systems |
EP1422638A2 (de) * | 2002-11-19 | 2004-05-26 | Detmar Schäfer | Rechnergestützte Ermittlung einer Ähnlichkeit eines elektronisch erfassten ersten Kennzeichens zu mindestens einem zweiten solchen Kennzeichen |
EP1422638A3 (de) * | 2002-11-19 | 2005-11-16 | Detmar Schäfer | Rechnergestützte Ermittlung einer Ähnlichkeit eines elektronisch erfassten ersten Kennzeichens zu mindestens einem zweiten solchen Kennzeichen |
Also Published As
Publication number | Publication date |
---|---|
DE50004296D1 (de) | 2003-12-11 |
JP2000347681A (ja) | 2000-12-15 |
EP1058235A3 (de) | 2003-02-05 |
EP1058235B1 (de) | 2003-11-05 |
US6546369B1 (en) | 2003-04-08 |
ATE253762T1 (de) | 2003-11-15 |
JP4602511B2 (ja) | 2010-12-22 |
DE19920501A1 (de) | 2000-11-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
DE69311303T2 (de) | Sprachtrainingshilfe für kinder. | |
DE602004012909T2 (de) | Verfahren und Vorrichtung zur Modellierung eines Spracherkennungssystems und zur Schätzung einer Wort-Fehlerrate basierend auf einem Text | |
DE60203705T2 (de) | Umschreibung und anzeige eines eingegebenen sprachsignals | |
DE69514382T2 (de) | Spracherkennung | |
DE69712277T2 (de) | Verfahren und vorrichtung zur automatischen sprachsegmentierung in phonemartigen einheiten | |
DE60111329T2 (de) | Anpassung des phonetischen Kontextes zur Verbesserung der Spracherkennung | |
EP1058235B1 (de) | Wiedergabeverfahren für sprachgesteuerte Systeme mit text-basierter Sprachsynthese | |
DE69519297T2 (de) | Verfahren und vorrichtung zur spracherkennung mittels optimierter partieller buendelung von wahrscheinlichkeitsmischungen | |
DE60000138T2 (de) | Erzeugung von mehreren Aussprachen eines Eigennames für die Spracherkennung | |
DE602005002706T2 (de) | Verfahren und System für die Umsetzung von Text-zu-Sprache | |
DE69622565T2 (de) | Verfahren und vorrichtung zur dynamischen anpassung eines spracherkennungssystems mit grossem wortschatz und zur verwendung von einschränkungen aus einer datenbank in einem spracherkennungssystem mit grossem wortschatz | |
DE69719270T2 (de) | Sprachsynthese unter Verwendung von Hilfsinformationen | |
EP1466317B1 (de) | Betriebsverfahren eines automatischen spracherkenners zur sprecherunabhängigen spracherkennung von worten aus verschiedenen sprachen und automatischer spracherkenner | |
DE19942178C1 (de) | Verfahren zum Aufbereiten einer Datenbank für die automatische Sprachverarbeitung | |
EP3010014B1 (de) | Verfahren zur interpretation von automatischer spracherkennung | |
DE112006000322T5 (de) | Audioerkennungssystem zur Erzeugung von Antwort-Audio unter Verwendung extrahierter Audiodaten | |
DE60108104T2 (de) | Verfahren zur Sprecheridentifikation | |
DE60018696T2 (de) | Robuste sprachverarbeitung von verrauschten sprachmodellen | |
DE10018134A1 (de) | Verfahren und Vorrichtung zum Bestimmen prosodischer Markierungen | |
WO2001069591A1 (de) | Verfahren zur erkennung von sprachäusserungen nicht-mutter-sprachlicher sprecher in einem sprachverarbeitungssystem | |
EP1282897B1 (de) | Verfahren zum erzeugen einer sprachdatenbank für einen zielwortschatz zum trainieren eines spracherkennungssystems | |
DE10040063A1 (de) | Verfahren zur Zuordnung von Phonemen | |
DE102010040553A1 (de) | Spracherkennungsverfahren | |
EP1435087B1 (de) | Verfahren zur erzeugung von sprachbausteine beschreibenden referenzsegmenten und verfahren zur modellierung von spracheinheiten eines gesprochenen testmusters | |
DE60021666T2 (de) | Inkrementales Trainieren eines Spracherkenners für eine neue Sprache |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A2 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
|
AX | Request for extension of the european patent |
Free format text: AL;LT;LV;MK;RO;SI |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: NOKIA CORPORATION |
|
PUAL | Search report despatched |
Free format text: ORIGINAL CODE: 0009013 |
|
AK | Designated contracting states |
Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
|
AX | Request for extension of the european patent |
Extension state: AL LT LV MK RO SI |
|
17P | Request for examination filed |
Effective date: 20030120 |
|
GRAH | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOS IGRA |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AKX | Designation fees paid |
Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE CH CY DE DK ES FI FR GB GR IE IT LI LU MC NL PT SE |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: IE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20031105 Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT;WARNING: LAPSES OF ITALIAN PATENTS WITH EFFECTIVE DATE BEFORE 2007 MAY HAVE OCCURRED AT ANY TIME BEFORE 2007. THE CORRECT EFFECTIVE DATE MAY BE DIFFERENT FROM THE ONE RECORDED. Effective date: 20031105 Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20031105 Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20031105 |
|
REG | Reference to a national code |
Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP |
|
REF | Corresponds to: |
Ref document number: 50004296 Country of ref document: DE Date of ref document: 20031211 Kind code of ref document: P |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D Free format text: GERMAN |
|
GBT | Gb: translation of ep patent filed (gb section 77(6)(a)/1977) |
Effective date: 20031224 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20040205 Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20040205 Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20040205 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20040216 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040419 Ref country code: AT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040419 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040430 Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040430 Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040430 Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040430 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FD4D |
|
ET | Fr: translation filed | ||
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
26N | No opposition filed |
Effective date: 20040806 |
|
BERE | Be: lapsed |
Owner name: *NOKIA CORP. Effective date: 20040430 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PT Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20040405 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: DE Payment date: 20120425 Year of fee payment: 13 Ref country code: NL Payment date: 20120413 Year of fee payment: 13 |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: FR Payment date: 20120504 Year of fee payment: 13 Ref country code: GB Payment date: 20120418 Year of fee payment: 13 |
|
REG | Reference to a national code |
Ref country code: NL Ref legal event code: V1 Effective date: 20131101 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20130419 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131101 Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20130419 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20131231 |
|
REG | Reference to a national code |
Ref country code: DE Ref legal event code: R119 Ref document number: 50004296 Country of ref document: DE Effective date: 20131101 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20130430 Ref country code: NL Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20131101 |