EP1710788B1 - Method and device for voice conversion - Google Patents
- Publication number
- EP1710788B1 (application number EP05102714A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- diction
- excitation
- fundamental frequency
- signal
- original excitation
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Not-in-force
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
- G10L21/007—Changing voice quality, e.g. pitch or formants characterised by the process used
- G10L21/013—Adapting to target pitch
- G10L2021/0135—Voice conversion or morphing
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/04—Time compression or expansion
- G10L21/057—Time compression or expansion for improving intelligibility
- G10L2021/0575—Aids for the handicapped in speaking
Definitions
- The present invention relates to a voice aid device and method for persons who have undergone ablation of the larynx.
- Laryngectomy, an operation often performed for laryngeal cancer, deprives the patient of the ability to speak. Indeed, the partial or total removal of the vocal cords prevents the generation of the harmonic excitation necessary for the production of vowels and voiced consonants.
- The present invention overcomes this disadvantage by proposing a portable device and a method for processing altered speech that allow, on the one hand, the restoration of a voice perceived as natural and, on the other hand, a reduction of the effort needed to produce it.
- The invention also makes it possible to avoid any additional surgical procedure, and proposes a compact, portable device that can be used in all circumstances.
- The processor is furthermore programmed to estimate the probability of a voiced sound, in parallel with the extraction of the excitation and of the articulation parameters and with the restoration of the original excitation, and to mix the restored original excitation and the original excitation according to the probability of a voiced sound.
- The present invention is based on the assumption that the subject is, preferably, already able, after the necessary training, to produce, by modulating the emitted air flow, an altered speech, for example of the esophageal type, expressing the message that he or she wishes to transmit.
- The sounds the subject emits are picked up by an external microphone whose output signals are processed by a processing circuit responsible for restoring the original voice, and are then emitted, for example, by a loudspeaker.
- There are other types of altered speech than esophageal speech, for example tracheoesophageal speech, supraglottal speech, etc.
- In what follows, the term esophageal speech will be taken to cover any other type of speech impairment resulting from the total or partial removal of the larynx.
- The principle, according to the invention, of the restoration of esophageal speech is based on a known model of speech production, shown schematically in figure 1 and described in detail in the article by G. Fant (Q. Prog. Status Rep. Speech Transmiss. Lab. 1, 21-37). According to this model, the generation of a sound is decomposed into two distinct blocks: the excitation 1 and the articulation 2.
- Excitation 1 is of two types, depending on the configuration of the larynx.
- In the open configuration, the air generates turbulence in the vocal cavities without vibrating the vocal cords.
- The excitation produced is similar to a noisy signal 3 and is used to generate unvoiced consonants (p, t, s, etc.).
- The vibrations of the larynx excite the vocal cords, which produce a harmonic acoustic wave, also called the glottal wave 4, whose pattern and fundamental frequency, lying between 100 and 250 Hz, are characteristic of each individual.
- This excitation makes it possible to produce vowels and voiced consonants (b, d, z, etc.).
- The excitation 1 therefore takes the form of a sequence of harmonic signals 4 alternating or mingling with noisy signals 3.
- In a subject without a larynx, the excitation takes the form of a sequence of quasi-harmonic signals alternating or mingling with noisy signals.
- Since the vocal apparatus of a person who has undergone laryngeal ablation remains functional, the articulation 2, which constitutes the second block of speech generation, is only slightly impaired.
- Altered speech, for example of the esophageal type, although deficient in the production of a harmonic acoustic wave, is capable of producing the subtle modulations of frequency and amplitude characteristic of the human voice.
- The restoration of esophageal-type altered speech therefore mainly requires the restoration of the excitation during the voiced portions of the speech, i.e. the restoration of the glottal wave 4, the unvoiced portions and the articulation being relatively unaffected by the removal of the larynx.
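The two-block model described above (an excitation shaped by the articulation) can be illustrated with a minimal sketch. It assumes a toy impulse train in place of the glottal wave, white noise in place of the noisy excitation, and a single hypothetical two-pole resonance in place of the articulation. All function names and numeric values are illustrative and do not come from the patent.

```python
import math
import random


def harmonic_excitation(n, f0, fs):
    """Impulse train at fundamental frequency f0 (crude stand-in for the glottal wave 4)."""
    period = int(round(fs / f0))
    return [1.0 if i % period == 0 else 0.0 for i in range(n)]


def noisy_excitation(n, seed=0):
    """White noise (crude stand-in for the noisy signal 3)."""
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]


def resonator(freq, bandwidth, fs):
    """Coefficients [a1, a2] of a two-pole resonance (one formant-like peak)."""
    r = math.exp(-math.pi * bandwidth / fs)
    theta = 2.0 * math.pi * freq / fs
    return [-2.0 * r * math.cos(theta), r * r]


def all_pole_filter(x, a):
    """Articulation block: y[n] = x[n] - sum_k a[k] * y[n-1-k]."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc -= ak * y[n - 1 - k]
        y.append(acc)
    return y


fs = 8000                                  # assumed sampling rate in Hz
a = resonator(700.0, 100.0, fs)            # hypothetical articulation
voiced = all_pole_filter(harmonic_excitation(800, 125.0, fs), a)
unvoiced = all_pole_filter(noisy_excitation(800), a)
```

The same articulation filter is applied to both excitation types, which is the property the restoration relies on: only the excitation needs to be replaced during voiced portions.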
- FIG. 2 illustrates the method of reconstruction of the voice according to the invention, based on the model explained above. It comprises various steps of processing the signal delivered by the microphone, which are carried out in parallel or in series and make it possible to pass from the esophageal voice to a laryngeal voice expressed by the loudspeaker.
- The incoming signal is a digitized electrical signal representing the esophageal voice.
- This original signal undergoes signal enhancement processing (see in particular the Applicant's European patent EP 1 253 581, entitled "Method and System for Enhancing Speech in a Noisy Environment"). The signal obtained is then directed in parallel to a block 12 for estimating the probability of a voiced sound and to a series of three blocks 14, 16 and 18 that reconstruct the two types of excitation according to the speech model described with reference to figure 1.
- Block 12 estimates the probability of a voiced sound, using automatic classification modules known to those skilled in the art and as described in the article by JM Solà et al. ("Environmental Robust Features for Speech Detection", INTERSPEECH-ICSLP'04, Jeju Island, Korea, October 2004). This estimate, in the form of a number between zero and one, is then directed to a block 20 for mixing harmonic excitation and noisy excitation, as explained later.
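As an illustration of what a voicing estimate between zero and one can look like, the sketch below uses the peak of the normalised autocorrelation over plausible pitch lags. This is a deliberately simple stand-in and not the classifier of the cited article. The function name, the frequency limits and the frame length are assumptions.

```python
import math
import random


def voicing_probability(frame, fs, f_min=60.0, f_max=400.0):
    """Crude voicing score in [0, 1]: peak of the normalised
    autocorrelation over lags corresponding to f_min..f_max."""
    energy = sum(s * s for s in frame)
    if energy == 0.0:
        return 0.0
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best = 0.0
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, r / energy)
    return min(1.0, max(0.0, best))


fs = 8000                                   # assumed sampling rate in Hz
tone = [math.sin(2.0 * math.pi * 125.0 * i / fs) for i in range(320)]
rng = random.Random(0)
noise = [rng.uniform(-1.0, 1.0) for _ in range(320)]

p_tone = voicing_probability(tone, fs)      # periodic frame: high score
p_noise = voicing_probability(noise, fs)    # noise frame: low score
```

A periodic frame correlates strongly with itself one period later, while noise does not, so the score separates the two cases without any training data.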
- Blocks 14 and 16, common to both paths, respectively perform the subband decomposition and the identification of the articulation parameters, operations known to those skilled in the art.
- The subband decomposition consists of dividing into bands the frequency spectrum obtained by Fourier transform of the signal, and rebalancing this spectrum by increasing the amplitude of the less noisy bands relative to the others. This operation, although effective for filtering the signal, is optional.
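The rebalancing step can be sketched as follows, assuming that a per-bin magnitude spectrum and a per-bin noise estimate are already available. How the noise is estimated, the equal-width bands and the gain rule are all assumptions made for illustration; the source does not specify them.

```python
def rebalance_subbands(magnitudes, noise_estimate, n_bands, max_gain=2.0):
    """Split a magnitude spectrum into n_bands equal-width bands and
    amplify each band more the smaller its estimated noise share is."""
    n = len(magnitudes)
    if len(noise_estimate) != n:
        raise ValueError("spectrum and noise estimate must have equal length")
    edges = [(b * n) // n_bands for b in range(n_bands + 1)]
    out = list(magnitudes)
    for b in range(n_bands):
        lo, hi = edges[b], edges[b + 1]
        sig = sum(m * m for m in magnitudes[lo:hi])
        noise = sum(v * v for v in noise_estimate[lo:hi])
        total = sig + noise
        noise_share = noise / total if total > 0.0 else 0.0
        gain = 1.0 + (max_gain - 1.0) * (1.0 - noise_share)
        for i in range(lo, hi):
            out[i] = gain * magnitudes[i]
    return out


# Lower band is clean, upper band is as noisy as the signal itself.
spectrum = [1.0] * 8
noise = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
balanced = rebalance_subbands(spectrum, noise, n_bands=2)
```

In this toy case the clean band receives the full gain and the noisy band a smaller one, which raises the less noisy band relative to the other.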
- The identification of the articulation parameters is a crucial step for separating, within the incoming signal, the excitation from the articulation parameters.
- A description of the method used is given in an article by Yingyong Qi (J. Acoust. Soc. Am., 88 (3), September 1990).
- At the output of block 16, one thus finds, on the one hand, the original excitation in the form of a signal amplitude as a function of time and, on the other hand, the articulation parameters in the form of a vector.
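One common way to obtain an excitation signal and a parameter vector from speech is linear prediction (LPC), which the title of the cited Qi article suggests. The sketch below uses the autocorrelation method with the Levinson-Durbin recursion and is only an illustration under that assumption. The helper names, the model order and the synthetic test signal are hypothetical.

```python
import random


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of one analysis frame."""
    return [sum(x[i] * x[i + lag] for i in range(len(x) - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations. Returns (a, err) where the
    predictor is x_hat[n] = -sum_k a[k] * x[n-1-k]."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err


def inverse_filter(x, a):
    """Residual e[n] = x[n] + sum_k a[k] * x[n-1-k], taken as the excitation."""
    e = []
    for n in range(len(x)):
        acc = x[n]
        for k, ak in enumerate(a):
            if n - 1 - k >= 0:
                acc += ak * x[n - 1 - k]
        e.append(acc)
    return e


# Synthetic check: a known two-pole filter driven by white noise.
true_a = [-1.2, 0.72]
rng = random.Random(1)
x = []
for n in range(4000):
    v = rng.gauss(0.0, 1.0)
    if n >= 1:
        v -= true_a[0] * x[n - 1]
    if n >= 2:
        v -= true_a[1] * x[n - 2]
    x.append(v)

a_est, err = levinson_durbin(autocorrelation(x, 2), 2)   # parameter vector
excitation = inverse_filter(x, a_est)                    # time-domain excitation
```

Here `a_est` plays the role of the articulation parameter vector and `excitation` that of the original excitation as a function of time.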
- The original excitation is then directed, on the one hand, to block 18 for restoring the glottal wave and, on the other hand, to block 20 for mixing the two excitations.
- Block 18, which constitutes the heart of the invention, is divided into several blocks shown in figure 3 and is described in full later. Its role is to restore the harmonic excitation, which is greatly altered in a subject who has undergone laryngectomy. Once this restoration is completed, the signal is directed to the mixing block 20.
- In block 20, the restored harmonic excitation and the unchanged original excitation are mixed in proportions fixed by the estimated probability of a voiced sound. If the excitation is estimated to be purely noisy, only the original excitation is kept at the output of block 20. If, on the other hand, the excitation is considered purely harmonic, only the restored harmonic excitation is kept at the output of block 20. In intermediate cases, block 20 mixes the restored harmonic excitation with the unchanged original excitation, the latter being not a purely noisy signal but a superposition of noisy and quasi-harmonic signals. At the output of block 20, the restored excitation is of the laryngeal type.
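The behaviour of the mixing block can be sketched as a linear cross-fade controlled by the voicing probability. The source states only that the proportions are fixed by that probability, so the linear rule and the function name below are assumptions.

```python
def mix_excitations(original, restored, p_voiced):
    """Blend the two excitations sample by sample:
    p_voiced = 0 keeps the original, p_voiced = 1 keeps the restored one."""
    if not 0.0 <= p_voiced <= 1.0:
        raise ValueError("p_voiced must lie between 0 and 1")
    if len(original) != len(restored):
        raise ValueError("excitations must have the same length")
    return [p_voiced * r + (1.0 - p_voiced) * o
            for o, r in zip(original, restored)]


original = [2.0, -2.0, 4.0]
restored = [4.0, 0.0, -4.0]
purely_noisy = mix_excitations(original, restored, 0.0)
purely_harmonic = mix_excitations(original, restored, 1.0)
intermediate = mix_excitations(original, restored, 0.5)
```

The two extreme cases reproduce the limits described in the text, and any value in between yields a weighted superposition of the two signals.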
- The vector of articulation parameters is directed, at the output of block 16 for identifying the vocal apparatus, to a block 22 intended for the restoration of these parameters. Indeed, although satisfactory in a laryngectomized subject, the articulation must be corrected somewhat to match that of a healthy subject. A description of this operation is given in the article "Replacing tracheoesophageal voicing sources using LPC synthesis", Yingyong Qi, J. Acoust. Soc. Am., 88 (3), September 1990.
- The articulation parameters thus restored and the reconstituted excitation are directed to a block 24, which convolves the excitation with the articulation parameters in order to reconstitute the speech.
- A last correction is made by a block 26, which raises the signal level to correct a decrease in power observed on long vowels.
- FIG. 3 illustrates in detail the harmonic excitation restoration operations carried out by block 18 and constituting the core of the present invention.
- The restoration operations of the harmonic excitation according to the invention therefore aim at restoring the basic pattern while introducing variability, restoring the fundamental frequency while also introducing variability, and modulating the resulting signal in amplitude and frequency.
- A first operation, performed by block 18a for identifying the harmonic parameters, therefore consists in estimating and extracting from the incoming signal the instantaneous mean fundamental frequency and the instantaneous mean power, calculated over a given time interval, for example 20 ms. This operation is more complex than in a healthy subject, because of the deformation of the quasi-harmonic wave.
- A method based on the histogram of the detected upper and lower envelopes is used to determine, at regular intervals, for example every 8 ms, this instantaneous mean fundamental frequency. For more details on the method used, refer to the article by V. Parsa et al. (Journal of Speech, Language, and Hearing Research, Vol. 42, 112-126, February 1999).
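The framing described above (a 20 ms analysis interval evaluated every 8 ms) can be sketched as follows. The fundamental frequency is estimated here from the autocorrelation peak, a simpler technique swapped in for the envelope-histogram method of the cited article. The function names, the frequency limits and the sampling rate are assumptions.

```python
import math


def frame_f0_and_power(frame, fs, f_min=60.0, f_max=400.0):
    """Return (f0 in Hz, mean power) for one analysis frame.
    f0 is taken from the lag that maximises the autocorrelation."""
    power = sum(s * s for s in frame) / len(frame)
    lag_min = int(fs / f_max)
    lag_max = min(int(fs / f_min), len(frame) - 1)
    best_lag, best_r = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        r = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag, power


def track_f0_and_power(signal, fs, window_s=0.020, hop_s=0.008):
    """Slide a window_s-long frame over the signal in hop_s steps."""
    win = int(round(window_s * fs))
    hop = int(round(hop_s * fs))
    return [frame_f0_and_power(signal[s:s + win], fs)
            for s in range(0, len(signal) - win + 1, hop)]


fs = 8000                                   # assumed sampling rate in Hz
signal = [math.sin(2.0 * math.pi * 125.0 * i / fs) for i in range(800)]
track = track_f0_and_power(signal, fs)      # list of (f0, power) pairs
```

On real esophageal speech the quasi-harmonic wave is deformed, so a plain autocorrelation peak is likely to be less reliable than on this clean test tone, which is consistent with the text's remark that the operation is harder than in a healthy subject.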
- A reference table has been built beforehand from a healthy voice, preferably that of the subject before the operation.
- This table, represented by block 18c, contains a large number of basic patterns recorded during speech and exhibiting the variability characteristic of the human voice. It also contains statistics on the variability of the fundamental frequency, likewise characteristic of the recorded human voice and calculated from the recording.
- Block 18b is thus connected to block 18c so as to receive the fundamental frequency variability information contained in the reference table.
- The signal from block 18b therefore contains all the data necessary to reconstruct an amplitude-modulated harmonic signal whose instantaneous mean fundamental frequency and fundamental frequency variability correspond to those of a healthy subject.
- The basic pattern of the signal is still characteristic of esophageal speech.
- The signals from blocks 18b and 18c are then directed to block 18d for reconstruction of the glottal wave.
- A glottal wave having all the characteristics of the human glottal wave is reconstituted from the healthy basic patterns and from the parameters of the harmonic wave, corrected or not, at block 18b.
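A minimal sketch of such a reconstruction is given below, assuming the reference table is a list of single-period patterns and the variability statistic is a single relative standard deviation of the fundamental frequency. These data structures, the random pattern choice and the per-period power scaling are assumptions made for illustration, not the patented procedure.

```python
import math
import random


def resample(pattern, length):
    """Stretch one basic pattern to 'length' samples by linear interpolation."""
    if length == 1:
        return [pattern[0]]
    out = []
    for i in range(length):
        pos = i * (len(pattern) - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, len(pattern) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * pattern[lo] + frac * pattern[hi])
    return out


def reconstruct_glottal_wave(n_samples, fs, f0_track, power_track,
                             patterns, f0_jitter, seed=0):
    """Concatenate healthy patterns period by period. Each period takes its
    length from the (jittered) f0 and its level from the target power."""
    rng = random.Random(seed)
    out = []
    while len(out) < n_samples:
        t = len(out)
        f0 = max(1.0, f0_track[t] * (1.0 + rng.gauss(0.0, f0_jitter)))
        period = max(2, int(round(fs / f0)))
        cycle = resample(rng.choice(patterns), period)
        rms = math.sqrt(sum(c * c for c in cycle) / period)
        gain = math.sqrt(power_track[t]) / rms if rms > 0.0 else 0.0
        out.extend(gain * c for c in cycle)
    return out[:n_samples]


fs, n = 8000, 800                            # assumed rate and length
patterns = [                                 # hypothetical reference table
    [math.sin(math.pi * i / 31) ** 2 for i in range(32)],
    [i / 31 for i in range(32)],
]
f0_track = [120.0] * n                       # per-sample target f0 in Hz
power_track = [0.25] * n                     # per-sample target mean power
wave = reconstruct_glottal_wave(n, fs, f0_track, power_track, patterns, 0.02, seed=1)
steady = reconstruct_glottal_wave(n, fs, f0_track, power_track, patterns, 0.0)
```

Because the amplitude and frequency tracks come from the speaker's own signal, the modulation chosen by the speaker is carried over, while the waveform shape within each period comes from the healthy reference patterns.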
- The excitation emitted at the output of block 18d is perceived as that of a human voice, in which emotions are expressed through the amplitude and frequency modulation performed by the laryngectomized subject.
- The invention also relates to the voice aid device for implementing the method described above.
- This system, schematically represented in figure 4, essentially comprises a voice acquisition device 30, such as a microphone, for capturing the esophageal acoustic signal emitted by the patient and transforming it into an electrical signal.
- This microphone is connected to a first amplifier module 32 responsible for adjusting the dynamic range, itself connected to an A/D module 34 for converting the analog signal into a digital signal.
- The digital signal is processed by a digital signal processor 36 (DSP, from the English "Digital Signal Processing").
- The digital signals from the processor 36 are received by a D/A module 38 for transforming the digital signal into an analog signal, itself connected to a second amplification module 40.
- A loudspeaker 42 transforms the electrical signal into an acoustic signal.
- The signal can be processed by a suitable telephone apparatus.
- The microphone 30 may be of a portable type, for everyday use, or fixed, for example for a public address.
- The modules 32 to 40 are, for example, integrated in a single housing, portable or not, and the loudspeaker 42 can be attached to the patient's shoulders or placed in any other strategic position.
- The use of a loudspeaker 42 in combination with a digital signal processor 36 further allows compensation of the acoustic signal emitted by the subject. This possibility can be very useful in a small group discussion, in which the voice of the laryngectomee is superimposed on the voice corrected by the device.
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Quality & Reliability (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Prostheses (AREA)
Claims (8)
- Portable device for restoring impaired speech, characterized in that it comprises: - a system (30) for capturing this impaired speech, which produces an electrical signal representing it, the signal coming from a microphone, - a processor (36) for this signal, programmed to perform an extraction and a separation of the original excitation and the articulation parameters, the original excitation being essentially periodic and having a basic pattern, to perform the restoration of the original excitation on the basis of parameters derived from the impaired speech, which comprise the instantaneous mean power and the instantaneous mean fundamental frequency of the original excitation, and of predetermined elements derived from healthy speech, which comprise information on the variability of the fundamental frequency, the shape of the pattern and its variability, and to perform the reconstruction of the speech on the basis of the articulation parameters and the restored original excitation, so as to produce a signal representing the reconstructed speech, and - a loudspeaker system (42) which converts the signal delivered by the processor into an acoustic signal.
- Device according to claim 1, characterized in that the processor (36) is further programmed to estimate the probability of a voiced sound, in parallel with the extraction of the excitation and of the articulation parameters and with the restoration of the original excitation, and to mix the restored original excitation and the original excitation on the basis of the probability of a voiced sound.
- Device according to either of claims 1 and 2, characterized in that, in order to carry out the restoration of the original excitation, the processor (36) is programmed to: - calculate the instantaneous mean power and the instantaneous mean fundamental frequency of the original excitation, - shift the instantaneous mean fundamental frequency and introduce a variability of the fundamental frequency derived from healthy speech, and - reconstruct a harmonic excitation having the pattern derived from healthy speech and its characteristic variability, the calculated instantaneous mean power, the shifted instantaneous mean fundamental frequency and the variability of the fundamental frequency derived from healthy speech.
- Device according to any one of claims 1 to 3, characterized in that it further comprises: - a first amplification module (32) at the output of the speech capture system (30), - a module (34) for converting the analog signal into a digital signal, between the first amplification module (32) and the signal processor (36), - a module (38) for converting the digital signal into an analog signal, at the output of the signal processor (36), and - a second amplification module (40) between the module (38) for converting the digital signal into an analog signal and the loudspeaker system (42).
- Device according to any one of claims 1 to 4, characterized in that it further comprises a module for compensating impaired speech.
- Method for restoring impaired speech by processing an electrical signal that comes from a microphone and represents the speech, characterized in that it comprises the following main steps: - extracting and separating (16) the original excitation and the articulation parameters, the original excitation being essentially periodic and having a basic pattern, - restoring (18) the original excitation on the basis of parameters derived from the impaired speech, which comprise the instantaneous mean fundamental frequency and the instantaneous mean power of the original excitation, and of predetermined elements derived from healthy speech, which comprise information on the variability of the fundamental frequency, the shape of the pattern and its variability, and - reconstructing the speech (24) on the basis of the articulation parameters and the restored original excitation, so as to produce an acoustic signal representing the reconstructed speech.
- Method according to claim 6, characterized in that it further comprises: - a step of estimating the probability of a voiced sound (12), carried out in parallel with the steps of extracting the excitation and the articulation parameters (16) and of restoring the original excitation (18), and - a step of mixing the restored original excitation and the original excitation (20) on the basis of the probability of a voiced sound.
- Method according to either of claims 6 and 7, characterized in that the step of restoring the original excitation comprises the following operations: - calculating the instantaneous mean power and the instantaneous mean fundamental frequency of the original excitation (18a), - shifting the instantaneous mean fundamental frequency and introducing a variability of the fundamental frequency derived from healthy speech (18c), and - reconstructing a harmonic excitation (18d) having the pattern derived from healthy speech and its characteristic variability, the calculated instantaneous mean power, the shifted instantaneous mean fundamental frequency and the variability of the fundamental frequency derived from healthy speech.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05102714A EP1710788B1 (de) | 2005-04-07 | 2005-04-07 | Method and device for voice conversion |
DE602005015419T DE602005015419D1 (de) | 2005-04-07 | 2005-04-07 | Method and device for voice conversion |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05102714A EP1710788B1 (de) | 2005-04-07 | 2005-04-07 | Method and device for voice conversion |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1710788A1 (de) | 2006-10-11 |
EP1710788B1 (de) | 2009-07-15 |
Family
ID=35355638
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP05102714A Not-in-force EP1710788B1 (de) | 2005-04-07 | 2005-04-07 | Method and device for voice conversion |
Country Status (2)
Country | Link |
---|---|
EP (1) | EP1710788B1 (de) |
DE (1) | DE602005015419D1 (de) |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6336092B1 (en) * | 1997-04-28 | 2002-01-01 | Ivl Technologies Ltd | Targeted vocal transformation |
US6359988B1 (en) * | 1999-09-03 | 2002-03-19 | Trustees Of Boston University | Process for introduce realistic pitch variation in artificial larynx speech |
DE60104091T2 (de) | 2001-04-27 | 2005-08-25 | CSEM Centre Suisse d`Electronique et de Microtechnique S.A. - Recherche et Développement | Method and device for speech enhancement in a noisy environment |
US7275030B2 (en) * | 2003-06-23 | 2007-09-25 | International Business Machines Corporation | Method and apparatus to compensate for fundamental frequency changes and artifacts and reduce sensitivity to pitch information in a frame-based speech processing system |
-
2005
- 2005-04-07 DE DE602005015419T patent/DE602005015419D1/de not_active Expired - Fee Related
- 2005-04-07 EP EP05102714A patent/EP1710788B1/de not_active Not-in-force
Non-Patent Citations (1)
Title |
---|
MATSUI K.; HARA N.: "Enhancement of esophageal speech using formant synthesis", PROCEEDINGS OF ICASSP 1999, 15 March 1999 (1999-03-15) - 19 March 1999 (1999-03-19), PHOENIX (AZ), pages 81 - 84, XP000898268 * |
Also Published As
Publication number | Publication date |
---|---|
DE602005015419D1 (de) | 2009-08-27 |
EP1710788A1 (de) | 2006-10-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP1901286B1 (de) | Speech enhancement apparatus, speech recording apparatus, speech enhancement program, speech recording program, speech enhancement method and speech recording method | |
Kong et al. | Speech and melody recognition in binaurally combined acoustic and electric hearing | |
EP1006511B1 (de) | Sound processing method and device for adapting a hearing aid for the hearing impaired | |
EP0745363A1 (de) | Hearing aid with a cochlear implant operated by means of wavelets | |
Qin et al. | Effects of simulated cochlear-implant processing on speech reception in fluctuating maskers | |
EP2518724B1 (de) | Combined audio unit consisting of a microphone and headphones, comprising means for attenuating noise in a near speech signal, in particular for a hands-free telephone system | |
EP1593116B1 (de) | Method for differentiated digital speech and music processing, noise filtering and generation of special effects, and device for carrying out the method | |
CN103390408B (zh) | Method and apparatus for processing audio signals | |
KR101475894B1 (ko) | Method and apparatus for improving disordered speech | |
US9936308B2 (en) | Hearing aid apparatus with fundamental frequency modification | |
Keller | The analysis of voice quality in speech processing | |
US20060126859A1 (en) | Sound system improving speech intelligibility | |
WO2004015652A1 (fr) | Audio-intonation calibration method | |
Fogerty et al. | Sentence intelligibility during segmental interruption and masking by speech-modulated noise: Effects of age and hearing loss | |
EP1279166A1 (de) | Robust features for the recognition of noisy speech signals | |
EP1710788B1 (de) | Method and device for voice conversion | |
Gaudrain et al. | Streaming of vowel sequences based on fundamental frequency in a cochlear-implant simulation | |
Guest et al. | The role of pitch and harmonic cancellation when listening to speech in harmonic background sounds | |
Luo | Talker variability effects on vocal emotion recognition in acoustic and simulated electric hearing | |
CN115985303A (zh) | Voice-based digital human image generation method and related device | |
Clavel | Analysis and recognition of the acoustic manifestations of fear-type emotions in abnormal situations | |
Hu | A simulation study of harmonics regeneration in noise reduction for electric and acoustic stimulation | |
Xiao et al. | Reconstruction of mandarin electrolaryngeal fricatives with hybrid noise source | |
Won et al. | Improving performance in noise for hearing aids and cochlear implants using coherent modulation filtering | |
Kim et al. | Speech identification in noise: Contribution of temporal, spectral, and visual speech cues |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR |
|
AX | Request for extension of the european patent |
Extension state: AL BA HR LV MK YU |
|
17P | Request for examination filed |
Effective date: 20070314 |
|
AKX | Designation fees paid |
Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR |
|
17Q | First examination report despatched |
Effective date: 20071106 |
|
GRAP | Despatch of communication of intention to grant a patent |
Free format text: ORIGINAL CODE: EPIDOSNIGR1 |
|
GRAS | Grant fee paid |
Free format text: ORIGINAL CODE: EPIDOSNIGR3 |
|
GRAA | (expected) grant |
Free format text: ORIGINAL CODE: 0009210 |
|
AK | Designated contracting states |
Kind code of ref document: B1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LI LT LU MC NL PL PT RO SE SI SK TR |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: EP Ref country code: GB Ref legal event code: FG4D Free format text: NOT ENGLISH |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FG4D |
|
REF | Corresponds to: |
Ref document number: 602005015419 Country of ref document: DE Date of ref document: 20090827 Kind code of ref document: P |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: NV Representative=s name: GLN S.A. |
|
NLV1 | Nl: lapsed or annulled due to failure to fulfill the requirements of art. 29p and 29m of the patents act | ||
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: SE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: LT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: IS Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20091115; Ref country code: AT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: ES Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20091026 |
|
REG | Reference to a national code |
Ref country code: IE Ref legal event code: FD4D |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: PL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: SI Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: NL Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BG Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20091015; Ref country code: PT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20091115 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: EE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: CZ Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: DK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: IE Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715; Ref country code: RO Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
PLBE | No opposition filed within time limit |
Free format text: ORIGINAL CODE: 0009261 |
|
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: SK Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
26N | No opposition filed |
Effective date: 20100416 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: GR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20091016 |
|
BERE | Be: lapsed |
Owner name: CSEM CENTRE SUISSE D'ELECTRONIQUE ET DE MICROTECH Effective date: 20100430 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: MC Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100430 |
|
GBPC | Gb: european patent ceased through non-payment of renewal fee |
Effective date: 20100407 |
|
REG | Reference to a national code |
Ref country code: FR Ref legal event code: ST Effective date: 20101230 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: DE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20101103 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: BE Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100430; Ref country code: GB Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100407; Ref country code: IT Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: FR Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100430 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: CY Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LU Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20100407; Ref country code: HU Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20100116 |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: TR Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT Effective date: 20090715 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PCAR Free format text: NEW ADDRESS: AVENUE EDOUARD-DUBOIS 20, 2000 NEUCHATEL (CH) |
|
PGFP | Annual fee paid to national office [announced via postgrant information from national office to epo] |
Ref country code: CH Payment date: 20140428 Year of fee payment: 10 |
|
REG | Reference to a national code |
Ref country code: CH Ref legal event code: PL |
|
PG25 | Lapsed in a contracting state [announced via postgrant information from national office to epo] |
Ref country code: LI Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20150430; Ref country code: CH Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES Effective date: 20150430 |