CN111261179A - Echo cancellation method and device and intelligent equipment - Google Patents

Echo cancellation method and device and intelligent equipment

Info

Publication number
CN111261179A
Authority
CN
China
Prior art keywords
echo cancellation
signal
preset
reference signal
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811453725.0A
Other languages
Chinese (zh)
Inventor
薛少飞
陈梦喆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201811453725.0A
Publication of CN111261179A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The application discloses an echo cancellation method and device and intelligent equipment. An echo cancellation model built from a multi-layer recursive network processes the input signal, which largely mitigates the influence of nonlinear factors, so that echo cancellation performs better and practical requirements are better met. In addition, the application covers the case of multiple loudspeakers and multiple microphone arrays, which makes it more widely and conveniently applicable.

Description

Echo cancellation method and device and intelligent equipment
Technical Field
The present application relates to, but is not limited to, artificial intelligence technologies, and in particular to an echo cancellation method and apparatus, and an intelligent device.
Background
Echo feedback is a common problem in electro-acoustic devices such as telephones and hearing aids. It seriously degrades speech quality, often causes howling and whistling, reduces the gain of the system, and changes the system response.
Adaptive echo cancellation (AEC) exploits the correlation between the loudspeaker output signal and the multipath echo that this signal produces: an echo estimate is subtracted from the input signal of the sound pickup device, thereby cancelling the echo.
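As an illustration of the subtract-an-echo-estimate idea described above, the following is a minimal sketch of a conventional linear adaptive canceller. It uses the normalized least-mean-squares (NLMS) update, which is a choice made for this example and is not named in the application; the function name, tap count, step size and toy echo path are all assumptions.

```python
import random


def nlms_echo_cancel(mic, ref, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptive linear echo estimate from the microphone signal."""
    w = [0.0] * taps      # adaptive filter weights (estimate of the echo path)
    buf = [0.0] * taps    # most recent reference samples, newest first
    out = []
    for d, x in zip(mic, ref):
        buf = [x] + buf[:-1]
        echo_est = sum(wi * bi for wi, bi in zip(w, buf))
        e = d - echo_est  # residual after subtracting the echo estimate
        norm = sum(b * b for b in buf) + eps
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


# Toy scenario: the microphone hears only the loudspeaker echo (no near-end speech).
random.seed(0)
ref = [random.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.6, 0.3, -0.2, 0.1]  # hypothetical loudspeaker-to-microphone response
mic = [sum(h * ref[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(ref))]
residual = nlms_echo_cancel(mic, ref)
```

Because the filter is linear, a sketch like this can only model a linear echo path, which is the limitation of the related art that the application sets out to address.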
With the emergence of smart devices, a smart voice device also needs to cancel the sound it plays itself. As shown in Fig. 1, after the reference signal (that is, the input signal fed to the loudspeaker of the smart device) is played through the loudspeaker, it is picked up by the microphone array of the smart device together with the original signal, such as a human voice, and the two together form the radio signal (the sound-pickup signal). For example, a smart speaker needs to cancel the music it is playing, and a smart TV needs to cancel the sound of the program it is playing. Scenarios such as voice wake-up and speech recognition often require echo cancellation.
In the related art, echo cancellation processes the multi-channel sound at the signal level before voice wake-up or speech recognition is performed. This has two drawbacks. First, the original sound signal and the reference signal are processed linearly, whereas in practice reverberation, the device structure and similar factors introduce many nonlinear transformations, whose influence a linear method cannot overcome. Second, the AEC result can only be judged by listening, and an improvement in perceived audio quality does not necessarily improve voice wake-up or speech recognition performance.
Disclosure of Invention
The application provides an echo cancellation method and device and intelligent equipment, which can enable echo cancellation to generate a better effect, so that actual requirements can be better met.
The embodiment of the invention provides an echo cancellation method, which comprises the following steps:
for each channel of the received radio signal, inputting all reference signals into an echo cancellation model and calculating a reference signal estimate;
subtracting, from the radio signal of each channel, the reference signal estimate corresponding to that channel to obtain an original signal estimate;
normalizing the original signal estimates corresponding to all channels to obtain the original signal.
In one illustrative example, the method further comprises generating the echo cancellation model, comprising:
simulating a radio signal by using a preset original signal and a preset reference signal;
and training the network to be trained by taking the radio signal obtained by simulation and a preset reference signal as input and taking a preset original signal as a modeling target to obtain the echo cancellation model.
In one illustrative example, the echo cancellation model includes a multi-layer recursive network.
In one illustrative example, simulating the radio signal comprises:
applying a preset impulse response to the preset reference signal and then adding a preset ambient noise signal to obtain a first signal;
superposing the first signal and the preset original signal to obtain the simulated radio signal.
In one illustrative example, the network to be trained includes at least one of:
a feedforward sequential memory network (FSMN), a deep feedforward sequential memory network (DFSMN), a long short-term memory (LSTM) unit, a bidirectional long short-term memory (BLSTM) unit, or a gated recurrent unit (GRU).
In one illustrative example, the reference signal comprises at least one channel and the original signal comprises at least one channel.
In one illustrative example, the method further comprises:
performing joint training with a voice wake-up model and a speech recognition model using the obtained original signal.
The application also provides an echo cancellation processing method, which comprises the following steps:
simulating a radio signal by using a preset original signal and a preset reference signal;
and training the network to be trained by taking the radio signal obtained by simulation and a preset reference signal as input and taking a preset original signal as a modeling target to obtain the echo cancellation model.
In one illustrative example, simulating the radio signal comprises:
applying a preset impulse response to the preset reference signal and then adding a preset ambient noise signal to obtain a first signal;
superposing the first signal and the preset original signal to obtain the simulated radio signal.
In one illustrative example, the network to be trained includes at least one of:
a feedforward sequential memory network (FSMN), a deep feedforward sequential memory network (DFSMN), a long short-term memory (LSTM) unit, a bidirectional long short-term memory (BLSTM) unit, or a gated recurrent unit (GRU).
The present application further provides a computer-readable storage medium storing computer-executable instructions for performing the echo cancellation method of any one of the above and/or performing the echo cancellation processing method of any one of the above.
The present application further provides an echo cancellation device comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation method of any one of the above.
The application also provides an echo cancellation device comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation processing method of any one of the above.
The present application further provides a smart device comprising a memory and a processor, wherein the memory has stored therein the following instructions executable by the processor:
according to the number of channels of the received radio signals, inputting all reference signals entering a loudspeaker into an echo cancellation model respectively, and calculating to obtain a plurality of reference signal estimation values;
for each channel, subtracting a reference signal estimation value corresponding to the channel from the radio signal of each channel to obtain a plurality of original signal estimation values;
and carrying out normalization processing on the original signal estimation values corresponding to all the channels to obtain original signals entering the microphone array.
In one illustrative example, the smart device comprises a smart speaker or a smart TV.
In the echo cancellation method and device of the present application, an echo cancellation model built from a multi-layer recursive network processes the input signal, which largely mitigates the influence of nonlinear factors, so that echo cancellation performs better and practical requirements are better met. In addition, the application covers the case of multiple loudspeakers and multiple microphone arrays, which makes it more widely and conveniently applicable.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the claimed subject matter and are incorporated in and constitute a part of this specification, illustrate embodiments of the subject matter and together with the description serve to explain the principles of the subject matter and not to limit the subject matter.
FIG. 1 is a schematic diagram of radio signal generation in a smart device;
FIG. 2 is a flow chart of an echo cancellation processing method according to the present application;
FIG. 3 is a network architecture diagram of an embodiment of the present application in which echo cancellation and speech recognition are used together;
FIG. 4 is a schematic diagram of the structure of an echo cancellation processing apparatus according to the present application;
FIG. 5 is a flow chart of the echo cancellation method of the present application;
FIG. 6 is a schematic diagram of an embodiment of an echo cancellation network according to the present application;
FIG. 7 is a schematic diagram of the structure of an echo cancellation device according to the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail below with reference to the accompanying drawings. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.
In one exemplary configuration of the present application, a computing device includes one or more processors (CPUs), input/output interfaces, a network interface, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
The steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, while a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in a different order.
Through research on echo cancellation technologies, the inventors found that implementing AEC with a deep neural network makes it possible to exploit the network's strong nonlinear modeling capability and so handle the nonlinear factors that arise in practice.
Fig. 2 is a flowchart of the echo cancellation processing method of the present application, which is used to train and generate the echo cancellation model. As shown in Fig. 2, the method includes:
Step 200: simulate a radio signal using a preset original signal and a preset reference signal.
In one illustrative example, simulating the radio signal may include:
applying a preset impulse response to the preset reference signal and then adding a preset ambient noise signal to obtain a first signal;
superposing the first signal and the preset original signal to obtain the simulated radio signal.
Taking a microphone array with 4 microphones as an example, far-field data is usually simulated from near-field data as follows:
$$y_i(t) = x(t) * h_{si}(t) + n(t), \quad i = 1, 2, 3, 4$$
where $y_i(t)$ is the simulated far-field data of the $i$-th microphone, $x(t)$ is the near-field data, $h_{si}(t)$ is the impulse response of the $i$-th microphone, determined by the room, the environment and the microphone position, $*$ denotes the convolution operation, and $n(t)$ is the ambient noise.
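The simulation in Step 200 can be sketched as follows, under stated assumptions: the far-field formula is applied to the reference signal to produce the "first signal" of each microphone, and the original signal is then superposed. The function names, the toy impulse responses and the sample values are hypothetical and only serve the illustration.

```python
def convolve(signal, impulse_response):
    """Linear convolution truncated to len(signal) samples."""
    out = [0.0] * len(signal)
    for n in range(len(signal)):
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                out[n] += h * signal[n - k]
    return out


def simulate_far_field(x, impulse_responses, noise):
    """y_i(t) = x(t) * h_si(t) + n(t) for every microphone i."""
    return [[c + v for c, v in zip(convolve(x, h), noise)]
            for h in impulse_responses]


def simulate_radio_signals(original, reference, impulse_responses, noise):
    """Superpose the far-field reference (the first signal) and the original signal."""
    first_signals = simulate_far_field(reference, impulse_responses, noise)
    return [[f + s for f, s in zip(first, original)] for first in first_signals]


# Toy data: 4 microphones, each with its own made-up impulse response.
reference = [1.0, 0.0, 0.0, 0.0]
original = [0.2, 0.4, 0.6, 0.8]
noise = [0.0, 0.0, 0.0, 0.0]
impulse_responses = [[1.0], [0.5, 0.25], [0.0, 1.0], [0.3, 0.2, 0.1]]
radio = simulate_radio_signals(original, reference, impulse_responses, noise)
```

In practice the impulse responses would be measured or generated for real rooms and the noise would be recorded ambient noise; the sketch only shows how the terms combine.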
Step 201: train the network to be trained, taking the simulated radio signal and the preset reference signal as input and the preset original signal as the modeling target, to obtain the echo cancellation model.
In an exemplary embodiment, the modeling criterion may be the minimum mean square error criterion, or the target may be modeled in the form of a mask. The overall goal is to establish a mapping from the radio signal to the original signal.
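The two modeling criteria can be illustrated with a small sketch. The application does not say which kind of mask is meant, so the clipped magnitude-ratio mask below is an assumption, as are the function names and the toy values.

```python
def mse_loss(estimate, target):
    """Mean square error between the model output and the original signal."""
    return sum((e - t) ** 2 for e, t in zip(estimate, target)) / len(target)


def ratio_mask(target_mag, radio_mag, eps=1e-8):
    """Assumed mask target per time-frequency bin: |S| / |Y|, clipped to at most 1."""
    return [min(1.0, s / (y + eps)) for s, y in zip(target_mag, radio_mag)]


def apply_mask(mask, radio_mag):
    """Recover an estimate of the original magnitude from the radio-signal magnitude."""
    return [m * y for m, y in zip(mask, radio_mag)]


loss = mse_loss([1.0, 2.0], [1.0, 4.0])
mask = ratio_mask([1.0, 3.0], [2.0, 2.0])
masked = apply_mask(mask, [2.0, 2.0])
```

With the mean square error criterion the network outputs the signal estimate directly, while with a mask the network outputs a per-bin gain that is multiplied onto the radio signal.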
In an exemplary example, the network to be trained may include a multi-layer recursive network such as a feedforward sequential memory network (FSMN), a deep feedforward sequential memory network (DFSMN), a long short-term memory (LSTM) unit, a bidirectional long short-term memory (BLSTM) unit, or a gated recurrent unit (GRU). LSTM is a type of recurrent neural network (RNN).
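To give a feel for one of the listed structures, the sketch below shows the memory-block idea commonly associated with FSMN: a learnable tapped-delay filter over past hidden activations. This is a simplified illustration and not the network configuration of the application; the function name, the tap layout and the values are assumptions.

```python
def fsmn_memory(hidden, coeffs):
    """Weighted sum of the current and previous hidden vectors.

    hidden: list of T hidden vectors (each of dimension D).
    coeffs: list of N + 1 coefficient vectors; coeffs[i] weights the frame i steps back.
    """
    dim = len(hidden[0])
    out = []
    for t in range(len(hidden)):
        memory = [0.0] * dim
        for i, a in enumerate(coeffs):
            if t - i >= 0:
                for d in range(dim):
                    memory[d] += a[d] * hidden[t - i][d]
        out.append(memory)
    return out


hidden = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
coeffs = [[1.0, 1.0], [0.5, 0.0]]  # current frame plus one frame of look-back
memory = fsmn_memory(hidden, coeffs)
```

Unlike an LSTM or GRU, this structure has no feedback loop, so temporal context comes from the fixed-length look-back window.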
It should be noted that the specific way in which the network is trained to obtain the echo cancellation model does not limit the scope of the present application. What the application emphasizes is that the simulated multi-channel radio signal and the multi-channel reference signal are used as input, and that the network structure of the application is used to train the multi-layer echo cancellation neural network.
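The input and target arrangement of Step 201 can be shown with a deliberately tiny stand-in. The sketch below trains a two-parameter linear model by gradient descent on the mean square error, with the simulated radio signal and the reference signal as input and the original signal as target. A real system would use the multi-layer recursive network described above; the model, the learning rate and the toy data are assumptions.

```python
import random


def train_linear_canceller(radio, reference, original, lr=0.5, epochs=200):
    """Fit s_hat = a * y + b * r to the original signal s by gradient descent on MSE."""
    a, b = 0.0, 0.0
    n = len(original)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for y, r, s in zip(radio, reference, original):
            err = a * y + b * r - s
            grad_a += 2.0 * err * y / n
            grad_b += 2.0 * err * r / n
        a -= lr * grad_a
        b -= lr * grad_b
    return a, b


# Toy data: the radio signal is the original plus a scaled copy of the reference.
random.seed(1)
original = [random.uniform(-1.0, 1.0) for _ in range(500)]
reference = [random.uniform(-1.0, 1.0) for _ in range(500)]
radio = [s + 0.5 * r for s, r in zip(original, reference)]
a, b = train_linear_canceller(radio, reference, original)
```

For this toy data the fitted weights approach a = 1 and b = -0.5, that is, the model learns to subtract the echo contribution from the radio signal.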
In one illustrative example, the reference signal includes at least one channel and the original signal includes at least one channel.
In an exemplary embodiment, the multi-channel speech signal obtained after echo cancellation can further be used to train downstream voice wake-up and speech recognition models, and the models can be trained jointly.
In an exemplary example, take echo cancellation followed by a speech recognition model. Assume the collected signals are 2 channels of original signals (wav1 and wav2 in Fig. 3) and 2 channels of reference signals (ref1 and ref2 in Fig. 3). The input of the network is the 4 collected channels. Echo cancellation is performed on them by the dashed-frame portion labeled NN Front-end in Fig. 3, which implements the AEC function; the AEC-processed signals are then combined with the reference signals and fed to the acoustic model (AM). During training, the NN Front-end and the AM are first trained independently, and the two networks are then connected in series and trained jointly.
The structure shown in Fig. 3 suits a wide range of application scenarios. For example, multi-channel signals collected by a microphone array can be fed to the input of the neural network simultaneously. As another example, different types of signals may be present, such as an internal signal acquired at the same time as an external signal.
The trained echo cancellation model is a multi-layer recursive network, which is well suited to handling nonlinear factors, so echo cancellation performs better and practical requirements are better met.
The present application further provides a computer-readable storage medium storing computer-executable instructions for performing the echo cancellation processing method of any one of the above.
The present application further provides an echo cancellation model generation apparatus comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation processing method of any one of the above.
The present application further provides a smart device comprising a memory and a processor, wherein the memory has stored therein the following instructions executable by the processor:
according to the number of channels of the received radio signals, inputting all reference signals entering a loudspeaker into an echo cancellation model respectively, and calculating to obtain a plurality of reference signal estimation values;
for each channel, subtracting a reference signal estimation value corresponding to the channel from the radio signal of each channel to obtain a plurality of original signal estimation values;
and carrying out normalization processing on the original signal estimation values corresponding to all the channels to obtain original signals entering the microphone array.
In one illustrative example, the smart device comprises a smart speaker or a smart TV.
Fig. 4 is a schematic diagram of the structure of the echo cancellation processing apparatus of the present application. As shown in Fig. 4, the apparatus includes at least a signal processing module and a training module, wherein:
the signal processing module is configured to simulate a radio signal using the original signal and the reference signal;
the training module is configured to train the echo cancellation model, taking the simulated radio signal and the reference signal as input and the original signal as the modeling target.
In an illustrative example, the signal processing module is specifically configured to:
apply a preset impulse response to the reference signal and then add a preset ambient noise signal to obtain a first signal; and superpose the first signal and the original signal to obtain the simulated radio signal.
In an illustrative example, the network to be trained may include a multi-layer neural network such as FSMN, DFSMN, LSTM, BLSTM or GRU.
In one illustrative example, the reference signal includes at least one channel and the original signal includes at least one channel.
Fig. 5 is a flowchart of the echo cancellation method of the present application. As shown in Fig. 5, the method includes:
Step 500: for each channel of the radio signal, input all reference signals into the echo cancellation model and compute a reference signal estimate.
In an exemplary embodiment, the echo cancellation model is a multi-layer recursive network trained on a radio signal simulated from an original signal and a reference signal.
In one illustrative example, the radio signal or the reference signal may be represented as, but is not limited to: a raw waveform (WAV) signal, a fast Fourier transform (FFT) signal obtained by Fourier transformation, or the FilterBank (FBank) features commonly used in voice wake-up and speech recognition.
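The three representations can be related to one another with a small sketch that frames a waveform, computes a magnitude spectrum and derives log mel filterbank energies. The frame length, hop size, sample rate, filter count and the triangular-filter construction are assumptions for this example; a plain DFT is used instead of an optimized FFT to keep the sketch dependency-free.

```python
import cmath
import math


def frame_signal(wav, frame_len, hop):
    """Split a waveform into overlapping frames (no padding)."""
    return [wav[i:i + frame_len] for i in range(0, len(wav) - frame_len + 1, hop)]


def magnitude_spectrum(frame):
    """Magnitudes of the one-sided DFT (bins 0 .. N/2) of one frame."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def fbank(frame, sample_rate, num_filters, eps=1e-10):
    """Log energies of triangular mel filters applied to the power spectrum."""
    power = [m * m for m in magnitude_spectrum(frame)]
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    feats = []
    for j in range(num_filters):
        lo, mid, hi = edges[j], edges[j + 1], edges[j + 2]
        energy = 0.0
        for k, p in enumerate(power):
            f = k * sample_rate / len(frame)  # center frequency of bin k in Hz
            if lo <= f <= mid:
                energy += p * (f - lo) / (mid - lo)
            elif mid < f <= hi:
                energy += p * (hi - f) / (hi - mid)
        feats.append(math.log(energy + eps))
    return feats


# Toy waveform: a cosine that falls exactly on DFT bin 4 of a 32-sample frame.
wav = [math.cos(2 * math.pi * 4 * t / 32) for t in range(64)]
frames = frame_signal(wav, frame_len=32, hop=16)
spectrum = magnitude_spectrum(frames[0])
features = fbank(frames[0], sample_rate=16000, num_filters=8)
```

Whichever representation is chosen, the radio signal and the reference signals would normally be converted in the same way before being fed to the model.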
In an exemplary embodiment, the multi-layer recursive network that implements the echo cancellation model is divided into a plurality of sub-networks according to the number of channels of the radio signal. Each sub-network is an echo cancellation model that takes all reference signals as input and, after computation, outputs the reference signal estimate for its channel.
In an exemplary embodiment, within the sub-network corresponding to each channel, all reference signals undergo nonlinear (and linear) processing by a multi-layer recursive network. The sub-network comprises multiple layers, such as recursive network layers (for example FSMN or an LSTM-based RNN), several normalization layers, and direct connections in the style of a residual network.
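The layer types named above can be combined as in the following sketch of a single sub-network. A plain tanh recurrent layer stands in for the FSMN or LSTM layers, layer normalization stands in for the normalization layers, and the residual connection simply adds the block input to its output. All of these choices, the random weights and the dimensions are assumptions and do not reproduce the network of Fig. 6.

```python
import math
import random


def layer_norm(vec, eps=1e-5):
    """Normalize one frame to zero mean and (approximately) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((v - mean) ** 2 for v in vec) / len(vec)
    return [(v - mean) / math.sqrt(var + eps) for v in vec]


def recurrent_layer(frames, w_in, w_rec):
    """Elman-style recurrence: h_t = tanh(W_in x_t + W_rec h_{t-1})."""
    dim = len(w_in)
    h = [0.0] * dim
    outputs = []
    for x in frames:
        h = [math.tanh(sum(w_in[j][i] * x[i] for i in range(len(x)))
                       + sum(w_rec[j][i] * h[i] for i in range(dim)))
             for j in range(dim)]
        outputs.append(h)
    return outputs


def sub_network(frames, layers):
    """Stack of (recurrent layer, layer norm) blocks with residual connections."""
    x = frames
    for w_in, w_rec in layers:
        y = [layer_norm(h) for h in recurrent_layer(x, w_in, w_rec)]
        x = [[a + b for a, b in zip(xi, yi)] for xi, yi in zip(x, y)]  # direct connection
    return x


def random_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


random.seed(0)
dim = 4  # e.g. features of all reference channels concatenated per frame
layers = [(random_matrix(dim, dim), random_matrix(dim, dim)) for _ in range(2)]
reference_frames = [[random.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(5)]
estimate_frames = sub_network(reference_frames, layers)
```

The residual addition requires the layer width to match the input width, which is why both are set to the same `dim` here.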
Step 501: subtract, from the radio signal of each channel, the reference signal estimate corresponding to that channel to obtain an original signal estimate.
In one exemplary embodiment, this step includes: for each channel, subtracting the estimated reference signal from the radio signal to obtain the original signal estimate.
Fig. 6 is a schematic diagram of an embodiment of the echo cancellation network of the present application. As shown in Fig. 6, this embodiment assumes that the number of loudspeakers is 2 and the number of microphone array channels is also 2, so the input consists of 2 channels of radio signals and 2 channels of reference signals: radio signal 1 (channel 1), radio signal 2 (channel 2), reference signal 1 (channel 1) and reference signal 2 (channel 2). As shown by the dashed-line frame in Fig. 6, the multi-layer recursive network that implements the echo cancellation model is divided into 2 sub-networks according to the number of channels of the radio signal. Each sub-network contains, for example, two recursive network layers (such as FSMN, DFSMN, LSTM, BLSTM or GRU) and several normalization layers. In this embodiment, one normalization layer is used in processing all reference signals within each sub-network, and one normalization layer is used in processing the reference signal estimates output by each sub-network.
Step 502: normalize the original signal estimates corresponding to all channels to obtain the original signal.
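Steps 500 to 502 can be put together in a short sketch. Each per-channel model is represented by a hypothetical callable that maps all reference signals to the reference signal estimate of its channel. The application does not specify the normalization in Step 502, so joint peak scaling across channels is assumed here purely for illustration.

```python
def normalize(estimates):
    """Assumed normalization: scale all channels by the shared peak magnitude."""
    peak = max(abs(v) for channel in estimates for v in channel) or 1.0
    return [[v / peak for v in channel] for channel in estimates]


def cancel_echo(radio_channels, reference_channels, models):
    estimates = []
    for radio, model in zip(radio_channels, models):
        ref_estimate = model(reference_channels)                          # Step 500
        estimates.append([y - r for y, r in zip(radio, ref_estimate)])    # Step 501
    return normalize(estimates)                                           # Step 502


# Toy setup with 2 radio channels and 2 reference channels.
reference_channels = [[0.5, 0.0], [0.0, 1.0]]
radio_channels = [[1.5, -1.0], [2.5, 0.0]]
models = [
    lambda refs: refs[0],                                      # stand-in for sub-network 1
    lambda refs: [a + b for a, b in zip(refs[0], refs[1])],    # stand-in for sub-network 2
]
original_estimate = cancel_echo(radio_channels, reference_channels, models)
```

In a trained system each callable would be one sub-network of the multi-layer recursive network rather than a fixed function.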
In the echo cancellation method and device of the present application, an echo cancellation model built from a multi-layer recursive network processes the input signal, which largely mitigates the influence of nonlinear factors, so that echo cancellation performs better and practical requirements are better met. In addition, the application covers the case of multiple loudspeakers and multiple microphone arrays, which makes it more widely and conveniently applicable.
In an exemplary embodiment, the echo cancellation processing of the present application is implemented as a neural network, which makes joint training with back-end voice wake-up and speech recognition models more flexible.
The present application further provides a computer-readable storage medium having stored thereon computer-executable instructions for performing the echo cancellation method of any of the above.
The present application further provides an echo cancellation device comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation method of any one of the above.
Fig. 7 is a schematic structural diagram of the echo cancellation device of the present application. As shown in Fig. 7, the device includes at least a first estimation module, a second estimation module and a processing module, wherein:
the first estimation module is used for respectively inputting all reference signals into the echo cancellation model according to the number of the channels of the received signals and calculating to obtain a reference signal estimation value;
the second estimation module is used for subtracting the reference signal estimation value obtained by calculation corresponding to each channel from the radio signal of each channel to obtain an original signal estimation value;
and the processing module is used for carrying out normalization processing on the original signal estimation value corresponding to each channel to obtain an expected original signal.
Although the embodiments disclosed in the present application are described above, the descriptions are only for the convenience of understanding the present application, and are not intended to limit the present application. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims (15)

1. An echo cancellation method, comprising:
respectively inputting all reference signals into an echo cancellation model according to the number of channels of the received signals, and calculating to obtain a reference signal estimation value;
subtracting the reference signal estimation value corresponding to each channel from the radio signal of each channel to obtain an original signal estimation value;
and carrying out normalization processing on the original signal estimation values corresponding to all channels to obtain original signals.
2. The echo cancellation method of claim 1, further comprising, before the method, generating the echo cancellation model, comprising:
simulating a radio signal by using a preset original signal and a preset reference signal;
and training the network to be trained by taking the radio signal obtained by simulation and a preset reference signal as input and taking a preset original signal as a modeling target to obtain the echo cancellation model.
3. The echo cancellation method of claim 2, wherein the echo cancellation model comprises a multi-layer recursive network.
4. The echo cancellation method of claim 2, wherein simulating the radio signal comprises:
applying a preset impulse response to the preset reference signal and then adding a preset ambient noise signal to obtain a first signal;
superposing the first signal and the preset original signal to obtain the simulated radio signal.
5. The echo cancellation method according to claim 2, wherein the network to be trained comprises at least one of:
a feedforward sequential memory network (FSMN), a deep feedforward sequential memory network (DFSMN), a long short-term memory (LSTM) unit, a bidirectional long short-term memory (BLSTM) unit, or a gated recurrent unit (GRU).
6. The echo cancellation method according to claim 1, wherein the reference signal comprises at least one channel and the original signal comprises at least one channel.
7. The echo cancellation method of claim 1, the method further comprising:
performing joint training using the obtained original signal together with a voice wake-up model and a voice recognition model.
8. An echo cancellation processing method, comprising:
simulating a radio signal by using a preset original signal and a preset reference signal;
and training a network to be trained, with the simulated radio signal and the preset reference signal as input and the preset original signal as the modeling target, to obtain an echo cancellation model.
9. The echo cancellation processing method of claim 8, wherein simulating the radio signal comprises:
applying a preset impulse response to the preset reference signal and then adding a preset environmental noise signal to obtain a first signal;
and superposing the first signal and the preset original signal to obtain the simulated radio signal.
10. The echo cancellation processing method of claim 8, wherein the network to be trained comprises at least one of:
a feedforward sequential memory network (FSMN), a deep feedforward sequential memory network (DFSMN), a long short-term memory (LSTM) network, a bidirectional long short-term memory (BLSTM) network, or a gated recurrent unit (GRU) network.
11. A computer-readable storage medium storing computer-executable instructions for performing the echo cancellation method of any one of claims 1 to 7 and/or performing the echo cancellation processing method of any one of claims 8 to 10.
12. An apparatus for echo cancellation comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation method of any one of claims 1 to 7.
13. An apparatus for echo cancellation comprising a memory and a processor, wherein the memory stores instructions executable by the processor for performing the steps of the echo cancellation processing method of any one of claims 8 to 10.
14. A smart device comprising a memory and a processor, wherein the memory has stored therein the following instructions executable by the processor:
according to the number of channels of the received radio signals, inputting all reference signals entering a loudspeaker into an echo cancellation model respectively, and calculating to obtain a plurality of reference signal estimation values;
for each channel, subtracting the reference signal estimation value corresponding to that channel from the radio signal of that channel, to obtain a plurality of original signal estimation values;
and carrying out normalization processing on the original signal estimation values corresponding to all the channels to obtain original signals entering the microphone array.
15. The smart device of claim 14, wherein the smart device comprises a smart speaker or a smart TV.
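
The signal flow recited in claims 1 and 4 can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the claimed implementation: the function names (`convolve`, `simulate_radio_signal`, `cancel_echo`, `oracle_model`) are hypothetical, the trained echo cancellation model is replaced by a stand-in that already knows the impulse response, and the normalization step is assumed to be a plain average across channels, which the claims do not specify.

```python
def convolve(signal, impulse_response):
    """Causal linear convolution, truncated to the length of `signal`."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(impulse_response):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out


def simulate_radio_signal(original, reference, impulse_response, noise):
    """Claim 4 sketch: impulse response applied to the reference, plus noise,
    gives the first signal; superposing the original gives the radio signal."""
    echoed = convolve(reference, impulse_response)
    first = [e + v for e, v in zip(echoed, noise)]
    return [f + s for f, s in zip(first, original)]


def cancel_echo(radio_channels, references, model):
    """Claim 1 sketch: per channel, estimate the reference component,
    subtract it from the radio signal, then combine all channels."""
    estimates = []
    for radio in radio_channels:
        ref_est = model(radio, references)  # reference signal estimation value
        estimates.append([x - r for x, r in zip(radio, ref_est)])
    # ASSUMPTION: "normalization" is taken to be an average over channels.
    count = len(estimates)
    return [sum(samples) / count for samples in zip(*estimates)]


# Hypothetical preset signals (values chosen to be exact in binary floating point).
original = [0.5, -0.25, 0.125, 0.0]
reference = [1.0, 0.0, -1.0, 0.5]
impulse_response = [0.5, 0.25]
noise = [0.0, 0.0, 0.0, 0.0]


def oracle_model(radio, references):
    """Stand-in for the trained network: it cheats by using the known
    impulse response, so it only demonstrates the data flow."""
    total = [0.0] * len(radio)
    for ref in references:
        echoed = convolve(ref, impulse_response)
        total = [t + e for t, e in zip(total, echoed)]
    return total


radio = simulate_radio_signal(original, reference, impulse_response, noise)
recovered = cancel_echo([radio, radio], [reference], oracle_model)
print(recovered)
```

In a real system the stand-in would be replaced by a network trained as in claim 2, with the simulated radio signal and the reference signal as input and the preset original signal as the target.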
CN201811453725.0A 2018-11-30 2018-11-30 Echo cancellation method and device and intelligent equipment Pending CN111261179A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811453725.0A CN111261179A (en) 2018-11-30 2018-11-30 Echo cancellation method and device and intelligent equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811453725.0A CN111261179A (en) 2018-11-30 2018-11-30 Echo cancellation method and device and intelligent equipment

Publications (1)

Publication Number Publication Date
CN111261179A true CN111261179A (en) 2020-06-09

Family

ID=70946490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811453725.0A Pending CN111261179A (en) 2018-11-30 2018-11-30 Echo cancellation method and device and intelligent equipment

Country Status (1)

Country Link
CN (1) CN111261179A (en)

Patent Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1249889A (en) * 1997-03-06 2000-04-05 旭化成工业株式会社 Device and method for processing speech
WO2000072564A2 (en) * 1999-05-24 2000-11-30 Matthias Waldorf Echo compensation device
CN101370323A (en) * 2007-08-15 2009-02-18 美商富迪科技股份有限公司 Apparatus capable of performing acoustic echo cancellation and a method thereof
CN103152500A (en) * 2013-02-21 2013-06-12 中国对外翻译出版有限公司 Method for eliminating echo from multi-party call
CN105144674A (en) * 2013-05-03 2015-12-09 高通股份有限公司 Multi-channel echo cancellation and noise suppression
CN104157293A (en) * 2014-08-28 2014-11-19 福建师范大学福清分校 Signal processing method for enhancing target voice signal pickup in sound environment
CN106157953A (en) * 2015-04-16 2016-11-23 科大讯飞股份有限公司 continuous speech recognition method and system
CN108028982A (en) * 2015-09-23 2018-05-11 三星电子株式会社 Electronic equipment and its audio-frequency processing method
CN108604452A (en) * 2016-02-15 2018-09-28 三菱电机株式会社 Voice signal intensifier
US20170330071A1 (en) * 2016-05-10 2017-11-16 Google Inc. Audio processing with neural networks
CN107483761A (en) * 2016-06-07 2017-12-15 电信科学技术研究院 A kind of echo suppressing method and device
CN106210368A (en) * 2016-06-20 2016-12-07 百度在线网络技术(北京)有限公司 The method and apparatus eliminating multiple channel acousto echo
US20180040333A1 (en) * 2016-08-03 2018-02-08 Apple Inc. System and method for performing speech enhancement using a deep neural network-based signal
CN108429994A (en) * 2017-02-15 2018-08-21 阿里巴巴集团控股有限公司 Audio identification, echo cancel method, device and equipment
US20180261225A1 (en) * 2017-03-13 2018-09-13 Mitsubishi Electric Research Laboratories, Inc. System and Method for Multichannel End-to-End Speech Recognition
WO2018190547A1 (en) * 2017-04-14 2018-10-18 한양대학교 산학협력단 Deep neural network-based method and apparatus for combined noise and echo removal
CN107105366A (en) * 2017-06-15 2017-08-29 歌尔股份有限公司 A kind of multi-channel echo eliminates circuit, method and smart machine
CN107564539A (en) * 2017-08-29 2018-01-09 苏州奇梦者网络科技有限公司 Towards the acoustic echo removing method and device of microphone array
US9947338B1 (en) * 2017-09-19 2018-04-17 Amazon Technologies, Inc. Echo latency estimation
CN107910014A (en) * 2017-11-23 2018-04-13 苏州科达科技股份有限公司 Test method, device and the test equipment of echo cancellor

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
余尤好: "神经网络在通信***回音对消中的应用", no. 09, pages 1-5 *
崔海徽,王石刚,王高中,蒋志辉: "基于前馈神经网络的自适应回声消除方法", no. 02 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111883154A (en) * 2020-07-17 2020-11-03 海尔优家智能科技(北京)有限公司 Echo cancellation method and apparatus, computer-readable storage medium, and electronic apparatus
CN111883154B (en) * 2020-07-17 2023-11-28 海尔优家智能科技(北京)有限公司 Echo cancellation method and device, computer-readable storage medium, and electronic device
CN112102816A (en) * 2020-08-17 2020-12-18 北京百度网讯科技有限公司 Speech recognition method, apparatus, system, electronic device and storage medium
CN112202778A (en) * 2020-09-30 2021-01-08 联想(北京)有限公司 Information processing method and device and electronic equipment
CN112420073A (en) * 2020-10-12 2021-02-26 北京百度网讯科技有限公司 Voice signal processing method, device, electronic equipment and storage medium
CN112420073B (en) * 2020-10-12 2024-04-16 北京百度网讯科技有限公司 Voice signal processing method, device, electronic equipment and storage medium
CN112634923A (en) * 2020-12-14 2021-04-09 广州智讯通信***有限公司 Audio echo cancellation method, device and storage medium based on command scheduling system
CN112634923B (en) * 2020-12-14 2021-11-19 广州智讯通信***有限公司 Audio echo cancellation method, device and storage medium based on command scheduling system
CN113450819A (en) * 2021-05-21 2021-09-28 音科思(深圳)技术有限公司 Signal processing method and related product
CN114512136A (en) * 2022-03-18 2022-05-17 北京百度网讯科技有限公司 Model training method, audio processing method, device, apparatus, storage medium, and program
CN114512136B (en) * 2022-03-18 2023-09-26 北京百度网讯科技有限公司 Model training method, audio processing method, device, equipment, storage medium and program

Similar Documents

Publication Publication Date Title
CN111261179A (en) Echo cancellation method and device and intelligent equipment
CN107452389B (en) Universal single-track real-time noise reduction method
CN111161752B (en) Echo cancellation method and device
JP5587396B2 (en) System, method and apparatus for signal separation
US20190222691A1 (en) Data driven echo cancellation and suppression
Li et al. Online direction of arrival estimation based on deep learning
Mertins et al. Room impulse response shortening/reshaping with infinity-and $ p $-norm optimization
CN108429994B (en) Audio identification and echo cancellation method, device and equipment
Xiao et al. Speech dereverberation for enhancement and recognition using dynamic features constrained deep neural networks and feature adaptation
CN106537501B (en) Reverberation estimator
KR102076760B1 (en) Method for cancellating nonlinear acoustic echo based on kalman filtering using microphone array
CN114283795A (en) Training and recognition method of voice enhancement model, electronic equipment and storage medium
KR102401959B1 (en) Joint training method and apparatus for deep neural network-based dereverberation and beamforming for sound event detection in multi-channel environment
CN116030823B (en) Voice signal processing method and device, computer equipment and storage medium
Su et al. Perceptually-motivated environment-specific speech enhancement
CN109379652B (en) Earphone active noise control secondary channel off-line identification method
WO2022256577A1 (en) A method of speech enhancement and a mobile computing device implementing the method
Pfeifenberger et al. Deep complex-valued neural beamformers
KR101587844B1 (en) Microphone signal compensation apparatus and method of the same
JP2024526679A (en) Data Augmentation for Speech Improvement
KR102045953B1 (en) Method for cancellating mimo acoustic echo based on kalman filtering
Ayrapetian et al. Asynchronous acoustic echo cancellation over wireless channels
JP4094523B2 (en) Echo canceling apparatus, method, echo canceling program, and recording medium recording the program
Tang et al. A Time-Varying Forgetting Factor-Based QRRLS Algorithm for Multichannel Speech Dereverberation
US20240127842A1 (en) Apparatus, Methods and Computer Programs for Audio Signal Enhancement Using a Dataset

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40031322; Country of ref document: HK)