CN112017682A - Single-channel voice simultaneous noise reduction and reverberation removal system - Google Patents
Single-channel voice simultaneous noise reduction and reverberation removal system
- Publication number: CN112017682A (application CN202010985378.7A)
- Authority: CN (China)
- Prior art keywords: voice, module, noise reduction, speech, dereverberation
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G10L21/0208 — Speech enhancement; noise filtering
- G10L2021/02082 — Noise filtering, the noise being echo or reverberation of the speech
- G10L25/30 — Speech or voice analysis techniques using neural networks
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
Abstract
The invention discloses a single-channel speech simultaneous noise-reduction and dereverberation system, comprising: a speech noise-reduction module, which trains a deep embedded feature extractor with a deep clustering algorithm, extracts deep embedded features from the mixed speech signal, and maps the input mixed speech into a noise-free embedding space, so that the deep embedded features contain no noise and strongly discriminate between reverberation and direct sound; a speech dereverberation module, connected to the speech noise-reduction module, which removes the reverberant components from the deep embedded features and estimates the clean target direct sound, thereby achieving both noise reduction and dereverberation; and a joint training module, connected to the speech noise-reduction module and the speech dereverberation module respectively, which jointly optimizes the two modules to improve the quality and intelligibility of the enhanced speech.
Description
Technical Field
The invention relates to the technical field of signal processing, and in particular to a system for simultaneous noise reduction and dereverberation of single-channel speech.
Background
Speech is one of the primary means by which humans exchange information, and noise reduction and dereverberation have long occupied an important place in speech signal processing. In real environments a speech signal often contains both reverberation and noise, which seriously degrade speech quality and intelligibility and substantially hurt the performance of speech recognition and voiceprint recognition systems. Speech dereverberation and noise reduction are therefore important, and many methods have been proposed over the years. The Weighted Prediction Error (WPE) algorithm addresses dereverberation at the signal level via delayed linear prediction: WPE first estimates a frequency-dependent linear prediction filter over a number of past frames, and the filtered signal is then subtracted from the original reverberant signal in the subband domain to obtain the enhanced signal. However, when noise and reverberation are present simultaneously, the performance of WPE degrades severely, which limits its applicability.
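As a rough illustration of the delayed-linear-prediction idea behind WPE, the following sketch processes a single STFT subband. It is a simplified, single-channel variant written for clarity: the function name, the tap/delay counts, and the fixed-point iteration scheme are illustrative assumptions, not the full WPE algorithm or the patented method.

```python
import numpy as np

def wpe_subband(y, taps=10, delay=3, iters=3, eps=1e-8):
    """Simplified single-subband WPE (delayed linear prediction).

    y: complex STFT coefficients of one frequency bin over T frames.
    Returns dereverberated coefficients for that bin.
    """
    T = len(y)
    x = y.copy()
    for _ in range(iters):
        # Time-varying variance estimate of the desired (dereverberated) signal
        lam = np.maximum(np.abs(x) ** 2, eps)
        # Delayed history matrix: frame t is predicted from frames t-delay, ..., t-delay-taps+1
        Y = np.zeros((T, taps), dtype=complex)
        for k in range(taps):
            d = delay + k
            Y[d:, k] = y[:T - d]
        # Variance-weighted least squares for the prediction filter g
        W = Y / lam[:, None]
        R = W.conj().T @ Y
        p = W.conj().T @ y
        g = np.linalg.solve(R + eps * np.eye(taps), p)
        # Subtract the predicted late-reverberant part
        x = y - Y @ g
    return x
```

In the real algorithm this estimation runs per frequency bin over the whole subband-domain signal; when strong noise is present the variance estimate `lam` is corrupted, which is the failure mode the passage above describes.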
In recent years, with the development of computer technology, deep-learning-based speech dereverberation has advanced rapidly and attracted growing attention. Such methods train a dereverberation model that learns a mapping from the feature parameters of the mixed speech to the feature parameters of the target clean speech signal, so that for any input mixture the trained model can output the target clean speech. However, these methods use only the magnitude spectrum as the feature, which lacks discriminability and limits dereverberation performance; when the speech contains both noise and reverberation, the quality of the enhanced speech cannot be guaranteed.
Disclosure of Invention
To overcome the shortcomings of the prior art and keep the enhanced speech at high quality even when the input contains both noise and reverberation, the invention adopts the following technical scheme:
A single-channel speech simultaneous noise reduction and dereverberation system, comprising: a speech noise-reduction module, which trains a deep embedded feature extractor with a deep clustering algorithm, extracts deep embedded features from the mixed speech signal, and maps the input mixed speech into a noise-free embedding space, so that the deep embedded features contain no noise and strongly discriminate between reverberation and direct sound; a speech dereverberation module, connected to the speech noise-reduction module, which removes the reverberant components from the deep embedded features and estimates the clean target direct sound, thereby achieving both noise reduction and dereverberation; and a joint training module, connected to the speech noise-reduction module and the speech dereverberation module respectively, which jointly optimizes the two modules to improve the quality and intelligibility of the enhanced speech.
The speech noise-reduction module applies a short-time Fourier transform to the input mixed speech signal, models the signal after transforming it from the time domain to the frequency domain, extracts deep embedded features with a deep clustering algorithm, and maps the input mixed speech into a noise-free embedding space; the deep embedded features are trained with a deep neural network. The training loss objective function of the speech noise-reduction module is:

J_DC = ||V V^T - B B^T||_F^2

where V ∈ R^(TF×D) is the matrix of deep embedded features, R denotes the real numbers, TF is the number of time-frequency bins after the Fourier transform, B ∈ R^(TF×2) encodes the correspondence of each time-frequency bin to direct sound or reverberation, and ||·||_F^2 denotes the squared Frobenius norm; minimizing this loss achieves the goal of speech noise reduction.
The speech dereverberation module is implemented with a deep neural network whose input is the deep embedded features and whose output is the estimated target floating-point masking value, approximating the ideal mask:

M(t, f) = |X(t, f)| / |Y(t, f)|

where M̂(t, f) denotes the estimated target floating-point masking value. The training loss objective function of the speech dereverberation module is:

J = Σ_(t,f) ( M̂(t, f)·|Y(t, f)| − |X(t, f)| )²

where |Y(t, f)| is the magnitude spectrum of the mixed speech and |X(t, f)| is the magnitude spectrum of the target clean direct sound; the input mixture magnitude spectrum |Y(t, f)| is multiplied point-by-point with the estimated mask M̂(t, f) to obtain the estimated magnitude spectrum of the target clean direct sound, and the mean square error between the estimated and true magnitude spectra is computed.
The joint training module jointly optimizes the speech noise-reduction module and the speech dereverberation module: the objective functions of the two modules are linearly combined with a certain weight to form the final objective function, so that joint optimization of the two modules improves the performance of the speech enhancement system.
The overall training objective function is:

J_total = λ·J_DC + (1 − λ)·J

where λ is the weight balancing the speech noise-reduction module against the speech dereverberation module; finally, the whole noise-reduction and dereverberation system is optimized through joint training.
The invention has the advantages and beneficial effects that:
the voice noise reduction module carries out noise reduction through feature extraction, and the extracted features distinguish reverberation from direct sound, so that the distinguishing performance of a voice reverberation-free system on the reverberation and the direct sound is improved; the voice dereverberation module estimates a target clean direct sound through training a neural network, so that the voice dereverberation performance is improved; the combined training module jointly optimizes the voice noise reduction module and the voice dereverberation module, and ensures the performance of voice enhancement while obtaining the depth embedded feature with distinctiveness, so that the enhanced voice can be clearer and understandable, and the tone quality is better.
Drawings
Fig. 1 is a schematic block diagram of the present invention.
Fig. 2 is a schematic structural diagram of a speech noise reduction module according to the present invention.
Fig. 3 is a schematic diagram of the structure of the speech dereverberation module in the present invention.
Fig. 4 is a schematic structural diagram of the joint training module in the present invention.
Detailed Description
The following detailed description of embodiments of the invention refers to the accompanying drawings. It should be understood that the detailed description and specific examples are intended to illustrate and explain the invention, not to limit it.
As shown in fig. 1, a simultaneous noise reduction and dereverberation system for single-channel speech includes: a speech noise-reduction module, which trains a deep embedded feature extractor with a deep clustering algorithm, extracts deep embedded features from the mixed speech signal, and maps the input speech into a noise-free embedding space, so that the deep embedded features contain no noise and strongly discriminate between reverberation and direct sound; a speech dereverberation module, connected to the speech noise-reduction module, which exploits this discriminability to remove the reverberant components from the deep embedded features and estimate the clean target direct sound, thereby achieving both noise reduction and dereverberation; and a joint training module, connected to the speech noise-reduction module and the speech dereverberation module respectively, which jointly optimizes the two modules and improves the quality and intelligibility of the enhanced speech.
As shown in fig. 2, the speech noise-reduction module applies a short-time Fourier transform to the input mixed speech signal, transforming the time-domain signal into the frequency domain, and then models it. The module extracts deep embedded features with a deep clustering algorithm: the input speech containing noise and reverberation is mapped into a noise-free embedding space, i.e. into deep embedded features that contain only reverberation. The deep embedded features are obtained by deep neural network training, and the training loss objective function of the speech noise-reduction module is:

J_DC = ||V V^T - B B^T||_F^2

where V ∈ R^(TF×D) is the matrix of deep embedded features, R denotes the real numbers, TF is the number of time-frequency bins after the Fourier transform, B ∈ R^(TF×2) encodes the correspondence of each time-frequency bin to direct sound or reverberation, and ||·||_F^2 denotes the squared Frobenius norm. For example, if at time-frequency bin (t, f) the direct-sound energy exceeds the reverberation energy, then B_(tf,1) = 1 and B_(tf,2) = 0; otherwise B_(tf,1) = 0 and B_(tf,2) = 1. Minimizing this loss maps the input mixed speech into an embedding space that contains only reverberation and no noise, achieving the goal of speech noise reduction.
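The deep clustering loss above can be evaluated without materializing the TF×TF affinity matrices, using the low-rank identity ||VV^T − BB^T||_F^2 = ||V^T V||_F^2 − 2||V^T B||_F^2 + ||B^T B||_F^2. A numpy sketch (shapes and names are illustrative; the patent does not specify the embedding dimension D):

```python
import numpy as np

def deep_clustering_loss(V, B):
    """J_DC = ||V V^T - B B^T||_F^2 via the low-rank identity.

    V: (TF, D) embedding vector for every time-frequency bin.
    B: (TF, 2) one-hot assignment (direct sound vs. reverberation).
    """
    VtV = V.T @ V          # (D, D)
    VtB = V.T @ B          # (D, 2)
    BtB = B.T @ B          # (2, 2)
    # Frobenius-norm expansion of ||V V^T - B B^T||_F^2
    return np.sum(VtV ** 2) - 2.0 * np.sum(VtB ** 2) + np.sum(BtB ** 2)
```

The identity keeps the cost at O(TF·D²) instead of O(TF²·D), which is what makes the loss trainable on full utterances.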
As shown in fig. 3, the speech dereverberation module trains a speech dereverberation model. The module is implemented with a deep neural network whose input is the deep embedded features and whose output is the estimated target floating-point masking value, approximating the ideal mask:

M(t, f) = |X(t, f)| / |Y(t, f)|

where M̂(t, f) denotes the estimated target floating-point masking value. The training loss objective function of the speech dereverberation module is:

J = Σ_(t,f) ( M̂(t, f)·|Y(t, f)| − |X(t, f)| )²

where |Y(t, f)| is the magnitude spectrum of the mixed speech and |X(t, f)| is the magnitude spectrum of the target clean direct sound; the input mixture magnitude spectrum |Y(t, f)| is multiplied point-by-point with the estimated mask M̂(t, f) to obtain the estimated magnitude spectrum of the target clean direct sound, and the mean square error between the estimated and true magnitude spectra is computed.
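A minimal numpy sketch of the masking step and mean-square-error loss described above. Function names are illustrative, and the ratio-mask target |X|/|Y| is a standard choice assumed here:

```python
import numpy as np

def target_mask(Y_mag, X_mag, eps=1e-8):
    # Floating-point (ratio) mask target: M(t,f) = |X(t,f)| / |Y(t,f)|
    return X_mag / (Y_mag + eps)

def mask_mse_loss(mask_hat, Y_mag, X_mag):
    # Estimated direct-sound magnitude: point-by-point product M^(t,f) * |Y(t,f)|
    X_hat = mask_hat * Y_mag
    # Mean square error against the true direct-sound magnitude |X(t,f)|
    return np.mean((X_hat - X_mag) ** 2)
```

A perfect mask estimate drives the loss to (numerically) zero, while any deviation is penalized quadratically in the magnitude domain.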
As shown in fig. 4, the joint training module jointly optimizes the speech noise-reduction module and the speech dereverberation module: the objective functions of the two modules are linearly combined with a certain weight to form the final objective function, so that joint optimization of the modules improves the performance of the speech enhancement system.
The overall training objective function is:

J_total = λ·J_DC + (1 − λ)·J

where λ is the weight balancing the speech noise-reduction module against the speech dereverberation module; finally, the whole speech noise-reduction and dereverberation system is optimized through joint training.
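The overall objective is simply a convex blend of the two losses; a trivial sketch (the default λ value is illustrative, as the patent leaves it unspecified):

```python
def joint_loss(j_dc, j_mask, lam=0.1):
    """Overall training objective J_total = lam * J_DC + (1 - lam) * J."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * j_dc + (1.0 - lam) * j_mask
```

At λ = 1 only the deep-clustering term trains (noise reduction alone); at λ = 0 only the masking term trains (dereverberation alone); intermediate values trade the two off.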
After training is finished, a mixed speech signal is passed through the speech noise-reduction module and the speech dereverberation module in sequence to obtain the target clean direct-sound signal.
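The inference flow just described — STFT, embedding extraction, mask estimation, point-by-point masking combined with the mixture phase, inverse STFT — might be wired together as below. `extractor` and `masker` stand in for the two trained modules and are hypothetical callables, not the patent's actual networks:

```python
import numpy as np
from scipy.signal import stft, istft

def enhance(mix, extractor, masker, fs=16000, nfft=512):
    """Sketch of the inference pipeline: enhance a single-channel mixture.

    extractor: maps the magnitude spectrogram to deep embedded features.
    masker:    maps those features to a floating-point mask of the same shape.
    """
    _, _, Y = stft(mix, fs=fs, nperseg=nfft)            # (F, T) complex spectrum
    V = extractor(np.abs(Y))                            # deep embedded features
    M = masker(V)                                       # estimated mask
    X_hat = M * np.abs(Y) * np.exp(1j * np.angle(Y))    # masked magnitude, mixture phase
    _, x_hat = istft(X_hat, fs=fs, nperseg=nfft)        # back to the time domain
    return x_hat
```

With an identity extractor and an all-ones mask the pipeline reduces to an STFT round trip, which is a useful sanity check before plugging in trained models.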
The above examples are intended only to illustrate the technical solution of the invention, not to limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the described technical solutions may still be modified, or some or all of their technical features equivalently replaced, without departing in essence from the scope of the technical solutions of the embodiments of the invention.
Claims (4)
1. A single-channel speech simultaneous noise reduction and dereverberation system, characterized by comprising: a speech noise-reduction module, a speech dereverberation module, and a joint training module, wherein the speech noise-reduction module trains a deep embedded feature extractor with a deep clustering algorithm, extracts deep embedded features from the mixed speech signal, and maps the input mixed speech into a noise-free embedding space; the speech dereverberation module is connected to the speech noise-reduction module, removes the reverberant components from the deep embedded features, and estimates the clean target direct sound; and the joint training module is connected to the speech noise-reduction module and the speech dereverberation module respectively and jointly optimizes the two modules.
2. The system of claim 1, wherein the speech noise-reduction module performs a short-time Fourier transform on the input mixed speech signal, models the signal after transforming it from the time domain to the frequency domain, extracts deep embedded features with a deep clustering algorithm, and maps the input mixed speech into a noise-free embedding space, the deep embedded features being obtained by deep neural network training, and the training loss objective function of the speech noise-reduction module being:

J_DC = ||V V^T - B B^T||_F^2

where V is the matrix of deep embedded features, B encodes the correspondence of each time-frequency bin to direct sound or reverberation, and ||·||_F^2 denotes the squared Frobenius norm.
3. The system of claim 1, wherein the speech dereverberation module is implemented with a deep neural network whose input is the deep embedded features and whose output is the estimated target floating-point masking value M̂(t, f), the training loss objective function of the speech dereverberation module being:

J = Σ_(t,f) ( M̂(t, f)·|Y(t, f)| − |X(t, f)| )²

where |Y(t, f)| is the magnitude spectrum of the mixed speech and |X(t, f)| is the magnitude spectrum of the target clean direct sound; the input mixture magnitude spectrum |Y(t, f)| is multiplied point-by-point with the estimated mask M̂(t, f) to obtain the estimated magnitude spectrum of the target clean direct sound, and the mean square error between the estimated and true magnitude spectra is computed.
4. The system of claim 1, wherein the joint training module jointly optimizes the speech noise-reduction module and the speech dereverberation module by linearly combining their objective functions with a certain weight as the final objective function:

J_total = λ·J_DC + (1 − λ)·J

where λ is the weight balancing the speech noise-reduction module against the speech dereverberation module.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202010985378.7A (CN112017682B) | 2020-09-18 | 2020-09-18 | Single-channel voice simultaneous noise reduction and reverberation removal system |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| CN112017682A | 2020-12-01 |
| CN112017682B | 2023-05-23 |
Family
- ID: 73522656

Family Applications (1)
| Application Number | Title | Priority Date | Filing Date | Status |
|---|---|---|---|---|
| CN202010985378.7A (CN112017682B) | Single-channel voice simultaneous noise reduction and reverberation removal system | 2020-09-18 | 2020-09-18 | Active |

Country Status (1)
| Country | Publication |
|---|---|
| CN | CN112017682B |
Cited By (6)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN112837697A | 2021-02-20 | 2021-05-25 | Echo suppression method and device |
| CN112992170A | 2021-01-29 | 2021-06-18 | Model training method and device, storage medium and electronic device |
| CN113257265A | 2021-05-10 | 2021-08-13 | Voice signal dereverberation method and device and electronic equipment |
| CN113724723A | 2021-09-02 | 2021-11-30 | Reverberation and noise suppression method, device, electronic equipment and storage medium |
| CN114220448A | 2021-12-16 | 2022-03-22 | Voice signal generation method and device, computer equipment and storage medium |
| CN115424628A | 2022-07-20 | 2022-12-02 | Voice processing method and electronic equipment |
Citations (10)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| US20140270216A1 | 2013-03-13 | 2014-09-18 | Single-channel, binaural and multi-channel dereverberation |
| US20150071461A1 | 2013-03-15 | 2015-03-12 | Single-channel suppression of interfering sources |
| CN108538305A | 2018-04-20 | 2018-09-14 | Speech recognition method, device, equipment and computer-readable storage medium |
| US20190043491A1 | 2018-05-18 | 2019-02-07 | Neural network based time-frequency mask estimation and beamforming for speech pre-processing |
| CN109817209A | 2019-01-16 | 2019-05-28 | Intelligent speech interactive system based on a two-microphone array |
| CN109949821A | 2019-03-15 | 2019-06-28 | Method for far-field speech dereverberation using a CNN U-NET structure |
| CN110503972A | 2019-08-26 | 2019-11-26 | Speech enhancement method, system, computer equipment and storage medium |
| CN110544482A | 2019-09-09 | 2019-12-06 | Single-channel voice separation system |
| CN111372041A | 2019-11-01 | 2020-07-03 | Monitoring equipment and monitoring system |
| US20200219524A1 | 2017-09-21 | 2020-07-09 | Signal processor and method for providing a processed audio signal reducing noise and reverberation |
Patent Citations (12)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| US20140270216A1 | 2013-03-13 | 2014-09-18 | Single-channel, binaural and multi-channel dereverberation |
| US20180047378A1 | 2013-03-13 | 2018-02-15 | Single-channel, binaural and multi-channel dereverberation |
| US20150071461A1 | 2013-03-15 | 2015-03-12 | Single-channel suppression of interfering sources |
| US20200219524A1 | 2017-09-21 | 2020-07-09 | Signal processor and method for providing a processed audio signal reducing noise and reverberation |
| CN111512367A | 2017-09-21 | 2020-08-07 | Signal processor and method providing processed noise-reduced and reverberation-reduced audio signals |
| CN108538305A | 2018-04-20 | 2018-09-14 | Speech recognition method, device, equipment and computer-readable storage medium |
| US20190043491A1 | 2018-05-18 | 2019-02-07 | Neural network based time-frequency mask estimation and beamforming for speech pre-processing |
| CN109817209A | 2019-01-16 | 2019-05-28 | Intelligent speech interactive system based on a two-microphone array |
| CN109949821A | 2019-03-15 | 2019-06-28 | Method for far-field speech dereverberation using a CNN U-NET structure |
| CN110503972A | 2019-08-26 | 2019-11-26 | Speech enhancement method, system, computer equipment and storage medium |
| CN110544482A | 2019-09-09 | 2019-12-06 | Single-channel voice separation system |
| CN111372041A | 2019-11-01 | 2020-07-03 | Monitoring equipment and monitoring system |
Non-Patent Citations (3)
- Matthias Wölfel: "Enhanced Speech Features by Single-Channel Joint Compensation of Noise and Reverberation"
- Cao Meng (曹猛): "Reverberant speech separation based on computational auditory scene analysis and deep neural networks" (Wanfang)
- Yang Lei (杨磊), ed.: Introduction to Digital Media Technology (《数字媒体技术概论》), China Railway Press, September 2017
Cited By (9)
| Publication number | Priority date | Publication date | Title |
|---|---|---|---|
| CN112992170A | 2021-01-29 | 2021-06-18 | Model training method and device, storage medium and electronic device |
| CN112992170B | 2021-01-29 | 2022-10-28 | Model training method and device, storage medium and electronic device |
| CN112837697A | 2021-02-20 | 2021-05-25 | Echo suppression method and device |
| CN112837697B | 2021-02-20 | 2024-05-14 | Echo suppression method and device |
| CN113257265A | 2021-05-10 | 2021-08-13 | Voice signal dereverberation method and device and electronic equipment |
| CN113724723A | 2021-09-02 | 2021-11-30 | Reverberation and noise suppression method, device, electronic equipment and storage medium |
| CN113724723B | 2021-09-02 | 2024-06-11 | Reverberation and noise suppression method and device, electronic equipment and storage medium |
| CN114220448A | 2021-12-16 | 2022-03-22 | Voice signal generation method and device, computer equipment and storage medium |
| CN115424628A | 2022-07-20 | 2022-12-02 | Voice processing method and electronic equipment |
Also Published As
| Publication number | Publication date |
|---|---|
| CN112017682B | 2023-05-23 |
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |