GB2516208B - Noise reduction in voice communications - Google Patents
- Publication number
- GB2516208B (application GB1219175.5A)
- Authority
- GB
- United Kingdom
- Prior art keywords
- voice
- phonemes
- acoustic signal
- words
- initial
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L13/00—Speech synthesis; Text to speech systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
Description
Noise Reduction in Voice Communications
Technical Field of the Invention
The present invention relates to noise reduction in voice communications, in particular to noise reduction in reproduction of voices captured as part of a voice communication.
Background to the Invention
In voice communications, an acoustic signal captured at a first device is transmitted to a second device and reproduced. Typically, the second device is also operable to capture an acoustic signal and transmit this to the first device for reproduction. For convenience, the acoustic signal is usually converted to or encoded in another form for transmission. The captured acoustic signals generally comprise a speaker’s voice and background noise. Furthermore, in transmission of the signal there may be significant channel noise introduced to the signal. If the overall noise level is low, this will not be a significant issue. If the overall noise level is high, whether resulting from background noise, channel noise or both, this can have a significant impact on the intelligibility and/or recognisability of the captured voice reproduced at the other device.
This problem can be addressed by amplifying or filtering the captured voice signals either on capture or on reproduction, or by applying similar techniques to the converted form of the signal for transmission. Such simple techniques typically provide only very limited success. It is also possible to address this problem using noise cancellation technology. This requires the provision of noise cancellation microphones to capture the background noise independently of the voice, allowing this background noise subsequently to be cancelled from the captured signal either by emitting an opposing acoustic signal or by deleting said noise from the captured signal including the voice. This technique relies upon the provision of additional hardware and on there being suitable places to mount said additional hardware. Furthermore, whilst this approach may reduce background noise, it will have no impact on channel noise.
It is therefore an object of the present invention to provide a method and system for at least partially overcoming or alleviating the above problems.
Summary of the Invention
According to a first aspect of the present invention there is provided a method of noise reduction in voice communications, the method comprising the steps of: comparing an initial acoustic signal including a voice to a stored model of the voice; identifying elements of the initial acoustic signal corresponding to words or phonemes uttered by the voice; parsing the identified elements into an ordered data stream of said words or phonemes; retrieving data from the stored model of the voice corresponding to the words or phonemes of the ordered data stream; and utilising the retrieved data to generate a secondary acoustic signal corresponding to the parsed words or phonemes.
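The claims leave the model format and matching technique open. Purely as an illustration, the following sketch assumes a toy voice model mapping phoneme labels to stored waveform fragments, with exact lookup standing in for the statistical recognition a real system would use; all names and data here are invented for the example:

```python
# Minimal sketch of the claimed steps, assuming a voice model that maps
# phoneme labels to stored waveform fragments (lists of samples). The
# membership test is a placeholder for real statistical recognition.

def identify_elements(initial_signal, voice_model):
    """Identify elements of the initial signal corresponding to modelled phonemes."""
    return [segment for segment in initial_signal if segment in voice_model]

def parse_to_stream(elements):
    """Parse identified elements into an ordered data stream of phonemes."""
    return list(elements)  # order of utterance is preserved

def reconstruct(stream, voice_model):
    """Retrieve stored data for each phoneme and generate a secondary signal."""
    secondary = []
    for phoneme in stream:
        secondary.extend(voice_model[phoneme])
    return secondary

# Toy model: phoneme label -> stored samples of this speaker's voice.
model = {"h": [0.1, 0.2], "ei": [0.3, 0.4, 0.5]}
signal = ["h", "noise-burst", "ei"]   # "noise-burst" has no model entry
stream = parse_to_stream(identify_elements(signal, model))
clean = reconstruct(stream, model)    # noise element is excluded
```

Because the secondary signal is generated only from stored model data, elements of the input with no model counterpart (here the noise burst) simply never reach the output.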
Identifying the voiced words or phonemes in this manner allows the subsequent reconstruction of a secondary acoustic signal corresponding to the voiced words or phonemes without or with reduced noise. Since the method concentrates on identifying elements within the voice of interest, it can perform more effectively than simple filtering or amplification techniques applied to the initial acoustic signal as a whole. This method can further be applied without the provision of additional microphones to cancel background noise.
The above method may be applied in systems wherein the initial signal is captured by a first voice communication device and the secondary acoustic signal is reproduced by a second voice communication device. In such instances, the method may be applied by the first device or second device as desired or as appropriate. The transmission may take place using any suitable communication networks including but not limited to: public telephone systems, either cellular or fixed line as desired or required, internet connections, Wi-Fi (Registered Trade Mark) networks or other data networks. For transmission the initial or secondary acoustic signal may be converted or encoded in any suitable manner according to the standards of the communication network.
The method may include the step of capturing the initial acoustic signal using a suitable microphone or a device comprising a suitable microphone. The method may include the step of outputting the secondary signal using a suitable loudspeaker or a device comprising a suitable loudspeaker.
The or each voice communication device may be a fixed line or cellular telephone; desktop, laptop or tablet computer; audio or audiovisual recording device or the like.
The method may include the step of identifying the voice. The identification can be achieved by direct consideration of the initial acoustic signal. This consideration may involve comparing the captured acoustic signal to one or more stored voice models. Preferably, where possible, a specific speech model is stored for each speaker. Using individual models for each speaker in this way can significantly increase the effectiveness of the method. Additionally or alternatively, the identification may be made by identifying the voice communication device or a physical or network location of the voice communication device used to capture the acoustic signal. For example, a telephone handset may be identified by a phone number, SIM or handset IMEI.
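Device-based identification of the kind described above might reduce to a simple lookup from a handset identifier (phone number, SIM or IMEI) to a stored model; the identifiers and model names below are invented for the example:

```python
# Hypothetical mapping from handset identifier to stored voice model name.
speaker_model_by_device = {
    "+441234567890": "alice-model",
    "+449876543210": "bob-model",
}

def identify_voice_model(caller_id):
    """Map an identified handset to its stored speech model, if any."""
    return speaker_model_by_device.get(caller_id)  # None if no model exists
```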
The method may be applied on all possible occasions. Alternatively, the method may only be applied in response to a user request, or when the noise exceeds a particular threshold. In the last case, the method may include the step of measuring the background noise and/or channel noise and comparing it to a predetermined threshold.
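The threshold test mentioned above could be sketched as follows; measuring noise as the average RMS level of non-speech frames is an assumption for the example, as the patent does not prescribe a noise metric:

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def should_apply(noise_frames, threshold):
    """Apply noise reduction only when measured background noise
    exceeds a predetermined threshold."""
    level = sum(rms(f) for f in noise_frames) / len(noise_frames)
    return level > threshold

quiet = [[0.01, -0.01, 0.02]]   # low-level background
loud = [[0.5, -0.6, 0.4]]       # noisy environment
```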
Identifying and parsing the words or phonemes in the initial acoustic signal can be achieved directly by comparing the acoustic signal to the stored model. Additionally or alternatively, the identification and parsing may include a probabilistic prediction based on the syntax of other identified words or phonemes. Using a probabilistic approach can also allow for the identification of phonemes previously missing from a particular voice model.
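A probabilistic prediction from syntax might, in its simplest form, use counts of word pairs to choose among candidate words when a segment is unclear; the counts and vocabulary here are invented, and a real system would use a far richer language model:

```python
from collections import Counter

# Toy bigram counts for illustration; a Counter returns 0 for unseen pairs.
bigram_counts = Counter({
    ("good", "morning"): 9,
    ("good", "mourning"): 1,
})

def predict_next(previous_word, candidates):
    """Pick the candidate most probable given the preceding word."""
    return max(candidates, key=lambda w: bigram_counts[(previous_word, w)])

best = predict_next("good", ["morning", "mourning"])
```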
The stored model may comprise samples of the voice uttering words or phonemes. Additionally or alternatively, the stored model may comprise data indicating how characteristics of the voice differ from reference samples of the same words or phonemes. The voice characteristics may include accent, cadence, tone, excitation, inflexion, spectral characteristics, sound/pause duration or the like.
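Storing deviations from reference samples, as described above, might look like the following; the parameter names are illustrative only, since the patent names characteristics (accent, cadence, tone, and so on) without fixing a representation:

```python
# Hypothetical reference characteristics for a standard rendering of a phoneme.
reference = {"pitch_hz": 120.0, "pause_ms": 200.0}

def build_deltas(speaker_measured, reference):
    """Store only how the speaker's characteristics differ from the reference."""
    return {k: speaker_measured[k] - reference[k] for k in reference}

def apply_deltas(reference, deltas):
    """Recover the speaker's characteristics from reference plus deltas."""
    return {k: reference[k] + deltas[k] for k in reference}

deltas = build_deltas({"pitch_hz": 95.0, "pause_ms": 260.0}, reference)
recovered = apply_deltas(reference, deltas)
```

Storing deltas rather than raw samples lets one set of reference samples serve every speaker, with each personal model reduced to a compact set of differences.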
The method may include the step of updating a stored model on an ongoing basis and/or the step of building up and storing a model of any unidentified voices. This may be achieved by capturing samples of the voice, analysing the samples to identify corresponding words or phonemes and storing said samples or data indicating how the voice characteristics differ from reference samples of the same words or phonemes.
According to a second aspect of the present invention there is provided a noise reduction system for use in voice communication comprising: a library of stored voice models; a speech detection engine operable to identify elements of an initial acoustic signal corresponding to words or phonemes uttered by a voice and parse the identified elements into an ordered data stream of said words or phonemes; a speech reconstruction engine operable to retrieve data from the library of stored voice models corresponding to the words or phonemes of the ordered data stream and to utilise the retrieved data to generate a secondary acoustic signal corresponding to the parsed words or phonemes.
The noise reduction system of the second aspect of the present invention may incorporate any or all features of the first aspect of the present invention, as desired or as appropriate.
According to a third aspect of the present invention there is provided a voice communications device incorporating a noise reduction system according to the second aspect of the present invention.
The voice communications device may be a fixed line or cellular telephone, desktop, laptop or tablet computer, audio or audiovisual recording device or the like.
Detailed Description of the Invention
In order that the invention may be more clearly understood an embodiment/embodiments thereof will now be described, by way of example only, with reference to the accompanying drawings, of which:
Figure 1 is a schematic illustration of a voice communication situation in which the present invention might be implemented;
Figure 2 is a flow diagram illustrating the steps involved in creating or updating a stored voice model in the present invention;
Figure 3 is a flow diagram illustrating the steps involved in processing an initial acoustic signal to reduce background noise in the present invention; and
Figure 4 is a schematic block diagram of a mobile telephone handset adapted to implement the present invention.
In a conventional voice communication system, a first voice communication device A (such as a telephone handset) captures an acoustic signal including a speaker’s voice. This captured acoustic signal is then transmitted, in suitably encoded form, via a communication network N to a second voice communication device B. The captured signal is subsequently reproduced by device B for the benefit of a listener. Should the listener at device B wish to reply, device B is also operable to capture an acoustic signal and transmit the suitable encoded signal to device A for reproduction. On occasions where the voice is captured alongside significant amounts of background noise, this background noise forms part of the acoustic signal reproduced for the listener. Additionally or alternatively, there can be significant channel noise encountered upon transmission of a signal. These noise contributions can significantly reduce the intelligibility and/or recognisability of the voice communication.
In the present invention, the acoustic signal captured by device A is subjected to noise reduction processing before reproduction by device B. This processing can take place either at device A before transmission or at device B after receipt but before reproduction. The processing involves an initial step of analysing the captured acoustic signal with respect to a stored model of the speaker’s voice. By way of this analysis, elements of the captured acoustic signal corresponding to words or phonemes uttered by the speaker can be identified and parsed into an ordered data stream of said words or phonemes. Subsequently, data from the stored model of the voice corresponding to the words or phonemes of the ordered data stream can be retrieved and used to generate a new acoustic signal corresponding to the parsed words or phonemes. This new acoustic signal can then be reproduced for the listener. By identifying the voiced words or phonemes in this manner, the subsequent reconstruction of a new acoustic signal corresponding to the voiced words or phonemes can substantially exclude noise. This increases the intelligibility of the voice communications considerably on occasions where the voice is captured alongside significant amounts of background noise or is subject to significant amounts of channel noise.
In order for the method to operate, it is necessary to have a viable model of a speaker’s voice. Such a model may be created by processing samples of the speaker’s voice. These samples may be acquired by the speaker submitting a predetermined range of voice samples. More typically, these samples can be acquired by capturing and analysing voice samples on occasions where the noise reduction is not required. Ideally, ongoing sampling allows each speaker’s voice model to be continuously adapted. This can significantly improve models over time, as the noise contribution in the collected averaged samples will tend to zero as more samples are collected.
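The averaging described above can be sketched as an incremental mean over repeated samples of the same phoneme, so that zero-mean noise cancels as the sample count grows; the sample values are invented for illustration:

```python
# Incremental averaging of repeated noisy samples of one phoneme.
# Zero-mean noise in individual samples tends to cancel in the mean.
def update_average(current_avg, count, new_sample):
    """Fold one new sample into a running per-element mean."""
    return [(a * count + s) / (count + 1)
            for a, s in zip(current_avg, new_sample)]

avg = [0.0, 0.0]
noisy_samples = [[1.1, 2.1], [0.9, 1.9], [1.0, 2.0]]  # true values ~[1.0, 2.0]
for i, sample in enumerate(noisy_samples):
    avg = update_average(avg, i, sample)
```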
Turning now to figure 2, a schematic illustration of this analysis is provided. Following detection of a voice at S1, a determination is made at S2 as to whether the speaker has an existing voice model. This may be achieved by comparing the initially detected voice against existing speech models. Alternatively, the speaker may be identified by another method (for instance by their phone number or by direct input of an identity). If the speaker does have an existing model, then at S3, the voice sample is analysed to determine its likely content and characteristic parameters of the voice. These parameters may include accent, cadence, tone, excitation, inflexion, spectral characteristics, sound/pause duration or the like. At S4, the voice model is then updated with any additional or revised parameters.
If the speaker does not have an existing model, a choice may be made at step S5 whether to create a new model or not. If a new model is to be created, this model is assigned to the speaker identity at S6. The voice sample can then be analysed and updated as set out above in steps S3 & S4.
Turning now to figure 3, there is presented a flow chart illustrating the steps involved in a preferred implementation of noise reduction processing according to the present invention. The steps are performed on a captured or received acoustic signal. Initially, at step S11, it is determined whether a voice is detected. If a voice is detected, at S12, an attempt is made to identify the voice. This attempt may involve comparing the voice against existing voice models and/or analysing the source of the acoustic signal. For instance, an acoustic signal received from a particular phone could be directly identified as containing a voice corresponding to the user of the phone. If a voice model exists, at step S13, an assessment is made as to whether the model contains sufficient data to make use of the present method viable.
In the event that use of the voice model is viable, the model parameters are retrieved from the library of voice models at S14, where the acoustic signal is analysed probabilistically based on word/phoneme recognition, syntax considerations and the specific parameters of the voice model. At S15, this analysis is processed into an ordered data stream corresponding to the predicted words or phonemes uttered by the voice. Subsequently, at S16, the voice model can be used to generate a new acoustic signal corresponding to the successive words or phonemes of the data stream. By applying the voice model, the new acoustic signal will correspond substantially to the voice elements within the original signal, excluding noise. If desired, for a more natural sound, the new acoustic signal may be mixed with a low level of background noise.
To facilitate processing, one implementation of the invention may involve delaying the signal for a processing interval. In view of the low latency of contemporary networks, a delay of a few milliseconds may prove adequate for processing whilst having minimal impact on a user.
Turning now to figure 4, an exemplary device incorporating a system for implementing the method of the present invention is shown. The device in this example is a cellular telephone handset 10, albeit that the skilled person will appreciate that this method may be applied to or implemented by any other device useable for voice communication including but not limited to fixed line telephones, desktop, laptop or tablet computers and the like.
The handset 10 incorporates a communication unit 11 adapted to enable data, in particular encoded acoustic signals to be transmitted and received via a cellular telephone network. The handset is also provided with a microphone 12 for capturing an acoustic signal including the voice of a phone user and a loudspeaker 13 for reproducing an acoustic signal received via the communication unit 11.
Within the phone 10 is provided a noise reduction system 100 according to the present invention for implementing the above discussed method. The system 100 comprises a data storage means 110, a speech detection engine 120 and a speech reconstruction engine 130.
The data storage means 110 contains a library of stored voice models. The speech detection engine 120 is operable to retrieve data from the library and use this in the analysis of an acoustic signal. The acoustic signal may be a signal captured by the microphone 12 or may be an acoustic signal received via the communication unit 11. The analysis can allow the speech detection engine to identify elements of the acoustic signal as corresponding to words or phonemes uttered by the modelled voice and to parse the identified elements into an ordered data stream of said words or phonemes. The ordered data stream can then be passed to the speech reconstruction engine 130. Subsequently, the speech reconstruction engine 130 is operable to retrieve data from the library corresponding to the words or phonemes of the ordered data stream and to utilise the retrieved data to generate a new acoustic signal corresponding to the parsed words or phonemes. This new acoustic signal may be output by the loudspeaker 13 or may be passed to the communication unit for transmission to another device via the cellular telephone network.
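The division of system 100 into data storage 110, speech detection engine 120 and speech reconstruction engine 130 can be sketched structurally as below; the class interface is an assumption, and the lookup-based detection again stands in for real recognition:

```python
# Structural sketch of system 100: the library plays the role of data
# storage means 110, detect() that of speech detection engine 120, and
# reconstruct() that of speech reconstruction engine 130.
class NoiseReductionSystem:
    def __init__(self, voice_model_library):
        self.library = voice_model_library  # data storage means 110

    def detect(self, acoustic_signal):
        """Identify modelled phonemes and return them as an ordered stream."""
        return [p for p in acoustic_signal if p in self.library]

    def reconstruct(self, ordered_stream):
        """Generate a new acoustic signal from stored data for each phoneme."""
        out = []
        for phoneme in ordered_stream:
            out.extend(self.library[phoneme])
        return out

system = NoiseReductionSystem({"a": [1, 2], "b": [3]})
result = system.reconstruct(system.detect(["a", "hiss", "b"]))
```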
In further implementations of the invention, it is possible for the ordered data stream, or the acoustic signal recreated from the ordered data stream, to be fed to additional voice processing units. The data stream/reconstructed audio signal can provide a high quality input for such systems to undertake further processing before generating an output audio signal. In a particular example, the additional voice processing unit may include a translation engine. In such an example, the captured acoustic signal may be translated into a separate language for regeneration as text or an acoustic signal in a different language.
It is of course to be understood that the invention is not to be restricted to the details of the above embodiments, which are described by way of example only.
Claims (24)
1. A method of noise reduction in voice communications, the method comprising the steps of comparing an initial acoustic signal including a voice to a stored model of the voice; identifying elements of the initial acoustic signal corresponding to words or phonemes uttered by the voice; parsing the identified elements into an ordered data stream of said words or phonemes; retrieving data from the stored model of the voice corresponding to the words or phonemes of the ordered data stream; and utilising the retrieved data to generate a secondary acoustic signal corresponding to the parsed words or phonemes.
2. A method as claimed in claim 1 wherein the method is applied in systems wherein the initial signal is captured by a first voice communication device and the secondary acoustic signal is reproduced by a second voice communication device.
3. A method as claimed in claim 2 wherein the method of claim 1 is applied by the first device.
4. A method as claimed in claim 2 wherein the method of claim 1 is applied by the second device.
5. A method as claimed in any one of claims 2 to 4 wherein for transmission the initial or secondary acoustic signal is converted or encoded according to the standards of the communication network.
6. A method as claimed in any preceding claim wherein the method includes the step of capturing the initial acoustic signal using a suitable microphone or a device comprising a suitable microphone.
7. A method as claimed in any preceding claim wherein the method includes the step of outputting the secondary signal using a suitable loudspeaker or a device comprising a suitable loudspeaker.
8. A method as claimed in any one of claims 2 to 7 wherein the or each voice communication device is a fixed line or cellular telephone; desktop, laptop or tablet computer; audio or audiovisual recording device.
9. A method as claimed in any preceding claim wherein the method includes the step of identifying the voice.
10. A method as claimed in claim 9 wherein identification is achieved by direct consideration of the initial acoustic signal.
11. A method as claimed in claim 9 or claim 10 when dependent directly or indirectly on claim 2, wherein identification is achieved by identifying the voice communication device or a physical or network location of the voice communication device used to capture the acoustic signal.
12. A method as claimed in any preceding claim wherein the method is applied in response to a user request.
13. A method as claimed in any preceding claim wherein the method is applied when background noise exceeds a particular threshold.
14. A method as claimed in claim 13 wherein the method includes the step of measuring the background noise and comparing it to a predetermined threshold.
15. A method as claimed in any preceding claim wherein identifying and parsing the words or phonemes in the initial acoustic signal is achieved directly by comparing the acoustic signal to the stored model.
16. A method as claimed in any preceding claim wherein identification and parsing includes a probabilistic prediction based on the syntax of other identified words or phonemes.
17. A method as claimed in any preceding claim wherein the stored model comprises samples of the voice uttering words or phonemes.
18. A method as claimed in any preceding claim wherein the stored model comprises data indicating how characteristics of the voice differ from reference samples of the same words or phonemes.
19. A method as claimed in claim 18 wherein the voice characteristics include accent, cadence, tone, excitation, inflexion, spectral characteristics, or sound/pause duration.
20. A method as claimed in any preceding claim wherein the method includes the step of updating a stored model on an ongoing basis and/or the step of building up and storing a model of any unidentified voices.
21. A method as claimed in claim 20 wherein this is achieved by capturing samples of the voice, analysing the samples to identify corresponding words or phonemes and storing said samples or data indicating how the voice characteristics differ from reference samples of the same words or phonemes.
22. A noise reduction system for use in voice communication comprising: a library of stored voice models; a speech detection engine operable to identify elements of an initial acoustic signal corresponding to words or phonemes uttered by a voice and parse the identified elements into an ordered data stream of said words or phonemes; a speech reconstruction engine operable to retrieve data from the library of stored voice models corresponding to the words or phonemes of the ordered data stream and to utilise the retrieved data to generate a secondary acoustic signal corresponding to the parsed words or phonemes.
23. A noise reduction system operable to implement the method of any one of claims 1 to 21.
24. A voice communications device incorporating a noise reduction system as claimed in claim 22 or claim 23.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
GB1219175.5A GB2516208B (en) | 2012-10-25 | 2012-10-25 | Noise reduction in voice communications |
Publications (3)
Publication Number | Publication Date |
---|---|
GB201219175D0 (en) | 2012-12-12 |
GB2516208A (en) | 2015-01-21 |
GB2516208B (en) | 2019-08-28 |
Family
ID=47358616
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
GB1219175.5A Active GB2516208B (en) | 2012-10-25 | 2012-10-25 | Noise reduction in voice communications |
Country Status (1)
Country | Link |
---|---|
GB (1) | GB2516208B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106157959B (en) * | 2015-03-31 | 2019-10-18 | 讯飞智元信息科技有限公司 | Sound-groove model update method and system |
CN107481732B (en) * | 2017-08-31 | 2020-10-02 | 广东小天才科技有限公司 | Noise reduction method and device in spoken language evaluation and terminal equipment |
CN113409809B (en) * | 2021-07-07 | 2023-04-07 | 上海新氦类脑智能科技有限公司 | Voice noise reduction method, device and equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2278708A (en) * | 1992-11-04 | 1994-12-07 | Secr Defence | Children's speech training aid |
US20020087307A1 (en) * | 2000-12-29 | 2002-07-04 | Lee Victor Wai Leung | Computer-implemented progressive noise scanning method and system |
US7133827B1 (en) * | 2002-02-06 | 2006-11-07 | Voice Signal Technologies, Inc. | Training speech recognition word models from word samples synthesized by Monte Carlo techniques |