WO2011033924A1 - Echo removal device, echo removal method, and program for echo removal device - Google Patents

Echo removal device, echo removal method, and program for echo removal device

Info

Publication number
WO2011033924A1
Authority
WO
WIPO (PCT)
Prior art keywords
signal
reference signal
voice
input
output
Prior art date
Application number
PCT/JP2010/064678
Other languages
English (en)
Japanese (ja)
Inventor
島津宝浩
Original Assignee
ブラザー工業株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ブラザー工業株式会社
Publication of WO2011033924A1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04MTELEPHONIC COMMUNICATION
    • H04M9/00Arrangements for interconnection not involving centralised switching
    • H04M9/08Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic
    • H04M9/082Two-way loud-speaking telephone systems with means for conditioning the signal, e.g. for suppressing echoes for one or both directions of traffic using echo cancellers

Definitions

  • the present invention relates to an echo removal apparatus, an echo removal method, and an echo removal apparatus program for removing an acoustic echo component from an audio signal transmitted to a communication destination apparatus.
  • There is known a video conference system in which audio signals and video signals are transmitted and received between terminal devices installed at a plurality of sites, and users can hold a conference by exchanging audio and video in real time.
  • In such a system, the voice uttered by a user returns, slightly delayed, to the user's own site via the speaker and microphone at the remote site.
  • That is, a so-called acoustic echo is generated in which the user's own voice reverberates back.
  • Specifically, the voice uttered by the user at the local site is transmitted to the other site and output from the speaker there.
  • When that output voice is picked up by the microphone at the other site, it is transmitted back to the local site and output from the speaker at the local site.
  • Moreover, the positional relationship between the microphone and the speaker may change during the conference, for example, when a user carries the microphone from his seat to the front of a whiteboard to give an explanation.
  • In a conventional echo canceller, information on the time lag and the level lag (hereinafter also referred to as the "parameters" of the acoustic echo component) is constantly recalculated, so that the acoustic echo component is always generated based on the latest parameters (see, for example, Patent Document 1).
  • As a result, the echo canceller continuously bears the load of calculating those parameters. When there is no change in the positional relationship between the microphone and the speaker, the parameters before and after an update are identical or differ hardly at all, and performing the update only places a wasteful load on the echo canceller.
  • An object of the present invention is to provide an echo removal apparatus, an echo removal method, and an echo removal apparatus program that can newly obtain the time shift information and the level shift information when the arrangement position of the output means or the input means changes, and can remove the acoustic echo component based on the latest information.
  • An echo removal apparatus according to a first aspect includes: output means that converts a received voice signal, which is a voice signal received from a communication destination device, into voice and outputs it; input means that converts input surrounding voice into a transmission voice signal, which is a voice signal to be transmitted to the communication destination device; position detection means that detects that a change has occurred in the arrangement position of at least one of the output means and the input means; generation means that, when the position detection means detects a change in the arrangement position, generates a reference signal used as a reference for removing from the transmission voice signal an acoustic echo component generated when the voice output from the output means is input to the input means; superimposing means that superimposes the reference signal on the received voice signal; extraction means that performs filtering on the transmission voice signal converted by the input means and extracts the reference signal; calculation means that compares the generated reference signal with the extracted reference signal to obtain information on the time shift between the generation timing and the extraction timing and information on the level shift between the signal levels at those timings; removal means that generates the acoustic echo component by performing calculation on the received voice signal based on the time shift information and the level shift information, subtracts it from the transmission voice signal, and generates a removed voice signal from which the acoustic echo component has been removed; and transmission means that transmits the removed voice signal to the communication destination device as the transmission voice signal.
  • With this configuration, the reference signal generated when obtaining the time shift information and the level shift information necessary for generating the acoustic echo component can be superimposed on the received voice signal and output from the output means. Therefore, even while voice signals are being transmitted to and received from the communication destination device (hereinafter, "during operation"), the time shift information and the level shift information can be obtained and updated using the reference signal. As a result, if a change occurs during operation in the arrangement position of the output means or the input means, so that an appropriate acoustic echo component may no longer be generated with the time shift information and the level shift information used so far, new time shift information and level shift information can be obtained and updated immediately.
  • Furthermore, the reference signal is generated only when it is detected that a change has occurred in the arrangement position of at least one of the output means and the input means. If there is no change in the arrangement position of the output means and the input means, the reference signal is not generated and the calculation for obtaining the time shift information and the level shift information is not performed. In other words, the time shift information and the level shift information are updated only when a situation that requires it occurs (when the arrangement position of the output means or the input means changes), so that, compared with constant or periodic updating, no wasteful load is placed on the echo canceller.
  • The position detection means detects not only a change in the relative positional relationship between the output means and the input means but also a change in the absolute arrangement position of each. Therefore, a change in circumstances that may affect the generation accuracy of the acoustic echo component can be reliably detected.
  • The first aspect may further include photographing means that photographs, from a fixed position, an image including at least one of the output means and the input means, and analysis means that analyzes the position of at least one of the output means and the input means in the photographed image.
  • The position detection means may then detect that a change has occurred in the arrangement position based on the analysis result of the analysis means. If the photographing means photographs the output means and the input means from a fixed position, an absolute change in the arrangement position of at least one of them can be detected easily and reliably simply by analyzing the photographed image and grasping their positions in it.
  • The first aspect may further include acceleration detection means that detects an acceleration applied to at least one of the output means and the input means.
  • The position detection means may then detect that a change has occurred in the arrangement position based on the detection result of the acceleration detection means. Acceleration detection means can easily be provided integrally with the output means or the input means, and any change in their arrangement position applies an acceleration to it. Therefore, if the presence or absence of movement of the output means or the input means is grasped from the detection result of the acceleration detection means, an absolute change in the arrangement position of at least one of the output means and the input means can be detected reliably.
  • The generation means may generate, as the reference signal, a signal whose voice waveform has a frequency in the non-audible region. If the frequency of the reference signal is in the non-audible region, the user cannot hear the sound based on the reference signal even when it is superimposed on the received voice signal and output from the output means; the user hears only the voice based on the received voice signal. Therefore, outputting the reference signal during operation does not hinder the user's speaking or listening, and whenever the arrangement position of the output means or the input means changes, new time shift information and level shift information can be obtained and updated immediately.
  • The first aspect may further include determination means that determines whether or not the received voice signal is in a silent state. When the position detection means detects a change in the arrangement position and the determination means determines that the received voice signal is silent, the generation means may generate, as the reference signal, a signal whose voice waveform has a frequency in the audible region.
  • A signal at an audible frequency has wider directivity than a signal at a non-audible frequency, and the frequency of the acoustic echo component itself lies in the audible region. Therefore, obtaining the time shift information and the level shift information with a reference signal at an audible frequency, whose directivity is wide and whose frequency characteristics are close to those of the acoustic echo component, can further improve the generation accuracy of the acoustic echo component.
  • However, if a reference signal at an audible frequency is superimposed on the received voice signal and output from the output means, the user hears the sound based on the reference signal together with the voice based on the received voice signal, and the user's speaking and listening may be hindered by the reference signal. It is therefore preferable to generate the audible-frequency reference signal when the received voice signal is in a silent state.
  • An echo removal method according to a second aspect includes: an output step of converting a received voice signal, which is a voice signal received from a communication destination device, into voice and outputting it from output means; an input step of converting surrounding voice input to input means into a transmission voice signal, which is a voice signal to be transmitted to the communication destination device; a position detection step of detecting that a change has occurred in the arrangement position of at least one of the output means and the input means; a generation step of generating, when a change is detected in the position detection step, a reference signal used as a reference for removing from the transmission voice signal an acoustic echo component generated when the voice output from the output means is input to the input means; a superimposition step of superimposing the reference signal on the received voice signal; an extraction step of filtering the transmission voice signal converted in the input step and extracting the reference signal; a calculation step of comparing the generation reference signal, which is the reference signal generated in the generation step, with the extraction reference signal, which is the reference signal extracted in the extraction step, to obtain information on the time shift between the generation timing of the generation reference signal and the extraction timing of the extraction reference signal and information on the level shift between the signal level of the generation reference signal at the generation timing and the signal level of the extraction reference signal at the extraction timing; a removal step of generating the acoustic echo component by performing calculation on the received voice signal based on the time shift information and the level shift information, subtracting it from the transmission voice signal, and generating a removed voice signal from which the acoustic echo component has been removed; and a transmission step of transmitting the removed voice signal to the communication destination device as the transmission voice signal.
  • With this method as well, the reference signal generated when obtaining the time shift information and the level shift information necessary for generating the acoustic echo component can be superimposed on the received voice signal and output from the output means. Therefore, even while voice signals are being transmitted to and received from the communication destination device (during operation), the time shift information and the level shift information can be obtained and updated using the reference signal. As a result, if a change occurs during operation in the arrangement position of the output means or the input means, so that an appropriate acoustic echo component may no longer be generated with the information used so far, new time shift information and level shift information can be obtained and updated immediately.
  • In the method as well, the reference signal is generated only when it is detected that a change has occurred in the arrangement position of at least one of the output means and the input means.
  • If there is no change in the arrangement position, the reference signal is not generated and the calculation for obtaining the time shift information and the level shift information is not performed.
  • The time shift information and the level shift information are thus updated only when a situation that requires it occurs (when the arrangement position of the output means or the input means changes), so that, compared with constant or periodic updating, no wasteful load is placed on the echo canceller.
  • In the position detection step, not only a change in the relative positional relationship between the output means and the input means but also a change in the absolute arrangement position of each is detected. Therefore, a change in circumstances that may affect the generation accuracy of the acoustic echo component can be reliably detected.
  • A program for an echo removal apparatus according to a third aspect causes a computer to function as the various processing means of the echo removal apparatus according to claim 1. By causing a computer to execute this program, the effects of the invention described in claim 1 can be achieved.
  • the echo canceller is used for a terminal device of a video conference system in which users at remote locations (multiple locations) can exchange audio and video in real time via a network and proceed with a conference or the like.
  • the echo canceller is provided as a device that controls processing related to sound in the terminal device, and is incorporated in the terminal device of the video conference system as part of the hardware circuit.
  • the video conference system is a system that can transmit and receive audio signals and video signals between terminal devices 2 to 4 connected to each other via a network 1.
  • Each of the terminal devices 2 to 4 plays a role as a client or a host in the video conference system depending on the situation.
  • the terminal devices 2 to 4 may be used as clients.
  • the terminal devices 2 to 4 are all video conference dedicated terminals having the same configuration, and the details of the echo removal device will be described by taking the echo removal unit 8 of the terminal device 2 as an example.
  • three terminal devices 2 to 4 are connected to the network 1, but the number of terminal devices constituting the video conference system is not limited to three.
  • the terminal device 2 includes a known CPU 80 that controls the entire terminal device 2.
  • a ROM 82, a RAM 84, and an input / output interface 88 are connected to the CPU 80 via a bus 86.
  • An operation unit 92, a video processing unit 94, an audio processing unit 10, and a communication unit 46 are connected to the input / output interface 88.
  • the ROM 82 stores various programs and data for operating the terminal device 2.
  • the CPU 80 controls the operation of the terminal device 2 according to the program stored in the ROM 82.
  • the RAM 84 temporarily stores various data.
  • the operation unit 92 is an input device for a user to operate the terminal device 2.
  • The communication unit 46 connects the terminal device 2 at its own site to the terminal devices 3 and 4 at the other sites via the network 1, and exchanges between the terminals various signals (control signals, audio signals, video signals, etc.) converted in accordance with the communication protocol. Furthermore, the communication unit 46 exchanges audio signals and video signals with the audio processing unit 10 and the video processing unit 94 via the input / output interface 88.
  • the terminal device 2 also includes a codec, and compresses a signal to be transmitted and decompresses a received signal.
  • a video input device 96 and a video output device 98 are connected to the video processing unit 94.
  • the video processing unit 94 processes video captured by the video input device 96 (for example, a camera) and generates a video signal to be transmitted to the terminal devices 3 and 4.
  • the video processing unit 94 processes video signals received from the terminal devices 3 and 4 and displays the video on a video output device 98 (for example, a monitor).
  • a voice input device 60 and a voice output device 70 are connected to the voice processing unit 10.
  • the audio processing unit 10 processes audio input to the microphone 64 of the audio input device 60 and generates an audio signal (hereinafter referred to as “transmission audio signal”) to be transmitted to the terminal devices 3 and 4.
  • the audio processing unit 10 processes audio signals received from the terminal devices 3 and 4 (hereinafter referred to as “received audio signals”), and outputs audio from the speaker 74 of the audio output device 70.
  • The audio processing unit 10, the audio input device 60, the audio output device 70, the communication unit 46, and the components that control these processing units (devices) (the CPU 80, the ROM 82, the RAM 84, etc.) constitute the echo removal unit 8.
  • the voice input device 60 includes a microphone 64 and an acceleration sensor 62, and is configured as a movable device.
  • the microphone 64 converts input ambient sound into an electric signal (analog sound signal).
  • the acceleration sensor 62 detects acceleration applied to the voice input device 60.
  • the audio output device 70 includes a speaker 74 and an acceleration sensor 72, and is configured as a movable device like the audio input device 60.
  • the speaker 74 converts an input electric signal (analog audio signal) into a sound and outputs the sound.
  • the acceleration sensor 72 detects acceleration applied to the audio output device 70.
  • the voice input device 60 and the voice output device 70 are provided separately from the terminal device 2 so that the installation location (arrangement position) can be changed independently.
  • the voice processing unit 10 includes a movement detection unit 12, reference signal generation units 14 and 16, a switch (SW) 18, a switch control unit 22, an adder 24, an A / D converter 26, a D / A converter 28, A / D converter 30, digital filter 34, signal comparison unit 36, delay processing unit 38, attenuation processing unit 40, subtractor 42, timer 44, and distributors 20 and 32.
  • An acceleration sensor 62 of the voice input device 60 and an acceleration sensor 72 of the voice output device 70 are connected to the movement detection unit 12 via the A / D converter 26.
  • the movement detection unit 12 detects that movement from the current position has occurred in at least one of the voice input device 60 and the voice output device 70 based on the detection results of acceleration by the acceleration sensors 62 and 72. That is, the movement detection unit 12 can detect not only a change in the relative positional relationship between the audio input device 60 and the audio output device 70 but also a change in each absolute arrangement position.
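  • As a minimal sketch of this idea (an illustration only, not the embodiment's implementation), the movement detection could threshold the magnitude of recent accelerometer samples from each device; the threshold value, window handling, and function names below are assumptions.

```python
import numpy as np

def moved(accel_samples: np.ndarray, threshold_g: float = 0.3) -> bool:
    """Return True if a device appears to have been moved.

    accel_samples: array of shape (N, 3) holding recent 3-axis accelerometer
    readings (gravity removed), in g. Movement is assumed whenever the
    acceleration magnitude exceeds `threshold_g` in the window.
    """
    magnitudes = np.linalg.norm(accel_samples, axis=1)
    return bool(np.any(magnitudes > threshold_g))

def arrangement_changed(mic_accel: np.ndarray, spk_accel: np.ndarray) -> bool:
    # The movement detection unit 12 watches both sensors 62 and 72.
    return moved(mic_accel) or moved(spk_accel)
```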
  • the inputs of the reference signal generators 14 and 16 are connected to the movement detector 12 respectively. Further, the outputs of the reference signal generation units 14 and 16 are connected to an adder 24 and a signal comparison unit 36 (described later) via the switch 18 and the distributor 20, respectively.
  • the reference signal generation unit 14 generates a signal whose frequency of the audio waveform is an audible frequency (1 KHz in the present embodiment) as a reference signal, and outputs the signal to the adder 24 and the signal comparison unit 36.
  • the reference signal generation unit 16 generates a signal having a frequency of the sound waveform in the non-audible region (100 kHz in the present embodiment) as the reference signal, and outputs the signal to the adder 24 and the signal comparison unit 36.
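  • The two generators can be pictured as producing an intermittent sinusoidal burst at 1 kHz or 100 kHz, as in the sketch below; the sample rate, burst length, and gap are illustrative assumptions (the 100 kHz tone requires a sample rate above 200 kHz).

```python
import numpy as np

def make_reference(freq_hz: float, fs: int, burst_s: float = 0.05,
                   gap_s: float = 0.2, repeats: int = 3) -> np.ndarray:
    """Generate an intermittent tone: `repeats` short bursts separated by silence."""
    t = np.arange(int(burst_s * fs)) / fs
    burst = np.sin(2 * np.pi * freq_hz * t)
    gap = np.zeros(int(gap_s * fs))
    return np.concatenate([np.concatenate([burst, gap]) for _ in range(repeats)])

FS = 384_000                                   # assumed sample rate
audible_ref = make_reference(1_000, FS)        # reference signal generation unit 14
inaudible_ref = make_reference(100_000, FS)    # reference signal generation unit 16
```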
  • The switch 18 selectively connects either the reference signal generation unit 14 or the reference signal generation unit 16 to the adder 24 and the signal comparison unit 36. More specifically, under the control of the switch control unit 22, the switch 18 switches between a connection (side A in FIG. 1) that allows the 1 kHz reference signal to be input to the adder 24 and the signal comparison unit 36 and a connection (side B in FIG. 1) that allows the 100 kHz reference signal to be input to them.
  • In FIG. 1 the switch 18 is shown as a contact-type switch for convenience, but a contactless switch using a transistor or the like is preferable.
  • The switch control unit 22 is provided on the path through which the received audio signal is input to the adder 24. More specifically, the received audio signal received from the terminal devices 3 and 4 by the communication unit 46 is input to the audio processing unit 10 via the input / output interface 88, and the switch control unit 22 is provided between the input / output interface 88 and the adder 24.
  • the switch control unit 22 determines whether or not the received audio signal passing through the switch control unit 22 is in a silent state.
  • The silent state refers to a state in which the signal level of the received audio signal (the amplitude of its audio waveform) is 0 or less than a predetermined threshold; when no received audio signal is input at all, the signal level is likewise 0 and the state is considered silent.
  • the switch control unit 22 performs control so that the switch 18 is switched to the A side when the received audio signal is silent, and the switch 18 is switched to the B side when it is in a sound state.
  • The silent state may be determined instantaneously as the received audio signal passes through, as described above; however, to improve accuracy it is better to judge the signal as silent only when the state in which the signal level is below the threshold has continued for a predetermined time (for example, 1 second).
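  • A sketch of such a silence decision, assuming frame-based processing, is shown below; the level threshold and frame handling are illustrative assumptions, while the 1-second hold time comes from the example above.

```python
import numpy as np

class SilenceDetector:
    """Declare silence only after the level has stayed below a threshold for hold_s."""

    def __init__(self, fs: int, threshold: float = 1e-3, hold_s: float = 1.0):
        self.threshold = threshold
        self.hold_samples = int(hold_s * fs)
        self.quiet_samples = 0

    def update(self, frame: np.ndarray) -> bool:
        # An empty frame (no received audio signal at all) also counts as quiet.
        if frame.size == 0 or np.max(np.abs(frame)) < self.threshold:
            self.quiet_samples += max(frame.size, 1)
        else:
            self.quiet_samples = 0
        return self.quiet_samples >= self.hold_samples
```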
  • The switch control unit 22 also sends, to the digital filter 34 described later, an instruction to switch to the filter setting corresponding to the reference signal that is generated according to the signal level of the received audio signal.
  • The input of the adder 24 is connected to the reference signal generation units 14 and 16 via the switch 18, and to the communication unit 46 via the switch control unit 22 and the input / output interface 88.
  • a D / A converter 28 and a delay processing unit 38 are connected to the output of the adder 24, respectively.
  • The adder 24 superimposes the reference signal input from the reference signal generation unit 14 or 16 on the received audio signal input from the communication unit 46 (that is, combines the received audio signal and the reference signal) and outputs the result as an output audio signal to the D / A converter 28 and the delay processing unit 38.
  • However, the reference signal is not always being generated; when only the received audio signal is input, the adder 24 passes it through unchanged and outputs it to the D / A converter 28 and the delay processing unit 38.
  • Conversely, the reference signal may be generated while the received audio signal is in a silent state (including when no signal is input at all); in that case the adder 24 passes the reference signal through unchanged and outputs it to the D / A converter 28 and the delay processing unit 38.
  • In either case, the signals output from the adder 24 are likewise referred to as output audio signals.
  • the speaker 74 of the audio output device 70 is connected to the output of the D / A converter 28 via an amplifier (not shown).
  • the D / A converter 28 converts the output audio signal into an analog audio signal and outputs the analog audio signal to the speaker 74.
  • the speaker 74 converts an input audio signal into audio and outputs it.
  • the microphone 64 of the voice input device 60 is connected to the input of the A / D converter 30.
  • the sound around the sound input device 60 is input to the microphone 64 and converted into an analog sound signal, and further converted into a digital sound signal (hereinafter referred to as “input sound signal”) by the A / D converter 30.
  • The output of the A / D converter 30 is connected to the digital filter 34 and the subtractor 42 via the distributor 32.
  • the digital filter 34 performs a filtering process on the input audio signal input from the A / D converter 30 and extracts a reference signal included in the input audio signal.
  • a band pass filter (BPF) that can be set to selectively extract a 1 KHz or 100 KHz signal is adopted as the digital filter 34.
  • The digital filter 34 switches the setting of the frequency to be extracted in accordance with the instruction from the switch control unit 22. More specifically, the filter is set so that the 1 kHz reference signal is extracted when the received audio signal passing through the switch control unit 22 is silent and the 100 kHz reference signal is extracted when it carries sound.
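  • As an illustration, the switchable band-pass extraction could look like the following sketch; the bandwidth, filter order, and sample rate are assumptions, and the sample rate must exceed twice the higher reference frequency for the 100 kHz setting to be usable.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def extract_reference(input_signal: np.ndarray, center_hz: float, fs: int,
                      rel_bw: float = 0.2, order: int = 4) -> np.ndarray:
    """Band-pass the input audio signal around center_hz to recover the reference."""
    low = center_hz * (1 - rel_bw / 2)
    high = center_hz * (1 + rel_bw / 2)
    sos = butter(order, [low, high], btype="bandpass", fs=fs, output="sos")
    return sosfilt(sos, input_signal)

# Filter setting follows the instruction from the switch control unit 22:
#   extract_reference(x, 1_000, fs)    while the received audio signal is silent
#   extract_reference(x, 100_000, fs)  while the received audio signal carries sound
```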
  • the output of the digital filter 34 is connected to the signal comparison unit 36. That is, two types of reference signals are input to the signal comparison unit 36.
  • One is a reference signal (hereinafter referred to as “generated reference signal”) that is generated by the reference signal generation units 14 and 16 and is input as it is (without deterioration).
  • The other is the reference signal that, after being generated by the reference signal generation units 14 and 16, has passed through the adder 24, the D / A converter 28, the speaker 74, the microphone 64, and the A / D converter 30 and has been extracted (in a degraded state) from the input audio signal by the digital filter 34; it is hereinafter referred to as the "extraction reference signal".
  • The timer 44 is connected to the signal comparison unit 36 so that it can obtain the count value T used to calculate the time lag between the input timing of the generation reference signal (that is, the generation timing of the reference signal) and the extraction timing of the extraction reference signal.
  • the signal comparison unit 36 compares the sound waveform of the generated reference signal with the sound waveform of the extracted reference signal, and obtains a time shift (delay) and a level shift (attenuation) of the extracted reference signal with respect to the generated reference signal.
  • the output of the signal comparison unit 36 is connected to a delay processing unit 38 and an attenuation processing unit 40.
  • the delay processing unit 38 receives the output audio signal output from the adder 24 and the time shift information (P) obtained by the signal comparison unit 36.
  • The delay processing unit 38 performs a process of delaying the input output audio signal based on the time shift information and then outputting it.
  • the attenuation processing unit 40 receives the output audio signal that has been subjected to delay processing and is output from the delay processing unit 38 and the level shift information (L) obtained by the signal comparison unit 36 as described above.
  • the attenuation processing unit 40 performs a process of lowering (attenuating) the signal level of the output audio signal subjected to the delay process based on the level shift information.
  • the input of the subtractor 42 is connected to the attenuation processing unit 40 and the microphone 64 via the distributor 32 and the A / D converter 30. That is, two types of audio signals are input to the subtractor 42.
  • One audio signal is an output audio signal (hereinafter referred to as “acoustic echo component”) output from the adder 24 and subjected to delay processing and attenuation processing via the delay processing unit 38 and the attenuation processing unit 40.
  • The other is the aforementioned input audio signal: the signal output from the adder 24, converted into audio by the speaker 74, picked up by the microphone 64 together with the surrounding audio, and converted back into an audio signal.
  • The subtractor 42 superimposes the inverted waveform of the acoustic echo component on the waveform of the input audio signal, thereby generating an audio signal from which the acoustic echo component has been removed (hereinafter referred to as the "removed audio signal").
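  • A minimal sketch of this removal chain, assuming sample-domain processing with P expressed in samples and L as a linear gain, is given below; the function names are illustrative.

```python
import numpy as np

def make_echo_component(output_signal: np.ndarray, delay_samples: int,
                        attenuation: float) -> np.ndarray:
    """Delay the output audio signal by P and scale it by L (units 38 and 40)."""
    delayed = np.concatenate([np.zeros(delay_samples), output_signal])
    return attenuation * delayed[: output_signal.size]

def remove_echo(input_signal: np.ndarray, output_signal: np.ndarray,
                delay_samples: int, attenuation: float) -> np.ndarray:
    """Subtract the estimated echo from the input audio signal (subtractor 42)."""
    echo = make_echo_component(output_signal, delay_samples, attenuation)
    return input_signal - echo   # removed audio signal sent to the far end
```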
  • the output of the subtracter 42 is connected to the communication unit 46 via the input / output interface 88.
  • the removed audio signal is transmitted as a transmission audio signal from the communication unit 46 to the terminal devices 3 and 4 via the network 1.
  • the removed audio signal obtained by removing the acoustic echo component from the input audio signal based on the audio input to the microphone 64 is transmitted to the terminal devices 3 and 4 as a transmission audio signal.
  • Next, the flow of processing will be described with reference to the flowcharts of FIGS. 2 and 3. For convenience, each step in the flowcharts is abbreviated as "S".
  • the terminal device 2 shown in FIG. 1 is driven when the power is turned on. That is, the CPU 80 drives the terminal device 2 by causing each processing unit to execute a sequence at the start of driving in accordance with a program stored in the ROM 82 and controlling transmission / reception of signals between the processing units (devices). .
  • the communication unit 46 negotiates with the terminal devices 3 and 4 via the network 1 to establish communication.
  • Next, the initialization process (S9) shown in FIG. 2 is performed, and the parameters (time shift information (P) and level shift information (L)) necessary for removing the acoustic echo component are set.
  • The details of the initialization process follow the process flow shown in FIG. 3. First, the timer 44 is started (S61), and the count value of the internal timer is incremented at regular intervals.
  • a reference signal is generated (S63).
  • The initialization process is performed in a state in which no received audio signal is input from the terminal devices 3 and 4 (a state in which communication is not yet established or is interrupted). Therefore, the switch control unit 22 shown in FIG. 1 determines that the received audio signal is silent, and the connection of the switch 18 is set to side A. Accordingly, in S63 the reference signal generation unit 14 is driven, and a reference signal whose audio waveform has a frequency in the audible region (1 kHz) is generated.
  • As shown in the drawing, the reference signal is generated as a signal in which a 1 kHz tone is repeated intermittently at regular intervals (the waveform of the reference signal (generation reference signal) is indicated by a solid line).
  • the generated reference signal is input to the signal comparison unit 36 as a generated reference signal via the distributor 20.
  • the signal comparison unit 36 obtains the count value T of the timer 44 in response to the input of the generation reference signal, and uses this timing as a reference for determining the delay of the reference signal as a reference signal generation timing T0 (see FIG. 4). Hold. Further, the signal comparison unit 36 obtains the signal level of the generation reference signal and holds it as the generation level L0 (see FIG. 4).
  • the generated reference signal is output as sound from the speaker 74 of the sound output device 70 via the distributor 20, the adder 24, and the D / A converter 28 (S65). Since the received audio signal is silent, the reference signal passes through the adder 24 as it is and is output as an output audio signal, and the speaker 74 outputs an audible sound based on the 1 KHz reference signal.
  • the microphone 64 of the voice input device 60 is in a voice input waiting state (S67: NO).
  • When the 1 kHz sound output from the speaker 74 is input to the microphone 64 (S67: YES), it is converted into an input audio signal and input to the digital filter 34 via the A / D converter 30 and the distributor 32.
  • At this point the digital filter 34 has been set, by the instruction from the switch control unit 22, for the state in which the received audio signal is silent, that is, for selectively extracting a 1 kHz signal. Therefore, even if the input audio signal contains not only the reference signal but also a signal based on the sound around the microphone 64, the 1 kHz reference signal is extracted from the input audio signal and input to the signal comparison unit 36 as the extraction reference signal (S69).
  • The signal comparison unit 36 acquires the count value T of the timer 44 in response to the input of the extraction reference signal and holds it as the reference signal extraction timing T1 (in the drawing, the waveform of the extraction reference signal is indicated by a solid line and the waveform of the generation reference signal by a dotted line). Further, the signal comparison unit 36 obtains the signal level of the extraction reference signal and holds it as the extraction level L1.
  • the signal comparison unit 36 calculates T1-T0 and obtains the time shift P (S71). This time shift information (P) is transmitted to the delay processing unit 38 and set as a parameter for the delay processing. Similarly, the signal comparison unit 36 calculates L1 / L0 and obtains the level deviation L (S73). This level shift information (L) is transmitted to the attenuation processing unit 40 and set as a parameter for the attenuation processing. This is the end of the initialization process (S9).
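  • The parameter computation itself is simple; the sketch below mirrors S71 and S73, with the units (timer ticks and linear levels) and the numbers in the example being assumptions.

```python
def compute_parameters(t0: float, t1: float, l0: float, l1: float):
    """Return (P, L) from the generation/extraction timings and levels."""
    p = t1 - t0   # S71: time shift between generation and extraction
    l = l1 / l0   # S73: level shift (attenuation) of the extracted reference
    return p, l

# Example: a reference generated at tick 1200, extracted at tick 1248, with its
# level reduced from 1.0 to 0.4, gives P = 48 ticks and L = 0.4.
p, l = compute_parameters(t0=1200, t1=1248, l0=1.0, l1=0.4)
```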
  • a series of processes for removing acoustic echoes using the set parameters (P, L) are performed.
  • transmission / reception of audio signals (reception of reception audio signals and transmission of transmission audio signals) is performed by communication with the terminal devices 3 and 4 via the network 1 (S11).
  • In the audio processing unit 10, as described above, if a change (movement) occurs in the arrangement position of the audio input device 60 (microphone 64) or the audio output device 70 (speaker 74), the movement detection unit 12 detects it and causes the reference signal generation units 14 and 16 to generate a reference signal.
  • If there is no change in the arrangement position (S13: NO), the reference signal is not generated.
  • In this case, the received audio signal received from the terminal devices 3 and 4 passes through the adder 24 unchanged, is output as an output audio signal, and is output as audio from the speaker 74 of the audio output device 70 via the D / A converter 28 (S15).
  • When the sound output from the speaker 74 is input to the microphone 64, which has been waiting for voice input (S17: NO), it is converted into an input audio signal and input to the subtractor 42 via the A / D converter 30.
  • The input audio signal is also input to the digital filter 34 via the distributor 32, but since no reference signal is being generated, no processing is performed in the signal comparison unit 36 on what passes through the digital filter 34. Alternatively, while the reference signal is not being generated, the input path from the distributor 32 to the digital filter 34 may simply be blocked.
  • the output audio signal output from the adder 24 (the received audio signal on which the reference signal is not superimposed here) is also input to the delay processing unit 38.
  • the delay processing unit 38 holds the time lag information (P) transmitted from the signal comparison unit 36, delays the output audio signal input from the adder 24 by P time, and outputs it to the attenuation processing unit 40.
  • The attenuation processing unit 40 holds the level shift information (L) transmitted from the signal comparison unit 36 and attenuates the output audio signal input from the delay processing unit 38 by the factor L to generate the acoustic echo component (S21).
  • The subtractor 42 receives the input audio signal from the microphone 64 and the acoustic echo component generated by applying the delay processing and the attenuation processing to the output audio signal.
  • The subtractor 42 cancels the acoustic echo component contained in the input audio signal by superimposing the inverted waveform of the acoustic echo component on the waveform of the input audio signal, and generates the removed audio signal from which the acoustic echo has been removed (S23).
  • After S23, the process returns to S11, and the generated removed audio signal is transmitted as a transmission audio signal from the communication unit 46 to the terminal devices 3 and 4 via the network 1 (S11).
  • Of the audio around the terminal device 2 that is input to the microphone 64, this transmission audio signal no longer contains the audio output from the speaker 74 based on the received audio signals from the terminal devices 3 and 4; it is based only on the voice newly uttered on the terminal device 2 side. Therefore, even when the audio based on this transmission audio signal is output from the speakers on the terminal device 3 and 4 side, no acoustic echo is generated. Thereafter, as long as there is no change in the arrangement position of the audio input device 60 or the audio output device 70 (S13: NO), S11, S13, and S15 to S23 are repeated, and the acoustic echo is removed using the parameters (P, L) obtained in the initialization process.
  • On the other hand, when a change in the arrangement position is detected (S13: YES) and the switch control unit 22 determines that the received audio signal is silent (S31: YES), the audible 1 kHz reference signal is generated (S33). The signal comparison unit 36 acquires the count value T of the timer 44 when the generation reference signal is input and holds it as the reference signal generation timing T0; it also obtains the signal level of the generation reference signal and holds it as the generation level L0.
  • the adder 24 passes the input reference signal as it is, and outputs this reference signal as an output audio signal to the D / A converter 28 and the delay processing unit 38.
  • the output audio signal is converted into an analog audio signal via the D / A converter 28, and output as an audible sound based on the 1 KHz reference signal from the speaker 74 of the audio output device 70 (S39).
  • When the switch control unit 22 determines that the received audio signal is not silent (S31: NO), a reference signal whose audio waveform has the inaudible frequency (100 kHz) is generated, as described above (S35).
  • the signal comparison unit 36 holds the count value T of the timer 44 as the generation timing T0 of the reference signal, and holds the signal level as the generation level L0.
  • the adder 24 superimposes the reference signal on the input received audio signal, and outputs it as an output audio signal to the D / A converter 28 and the delay processing unit 38 (S37).
  • In this case, the audio based on the received audio signal is output from the speaker 74 of the audio output device 70 together with the inaudible sound based on the reference signal (S39).
  • Next, whether voice input has been detected at the microphone 64 of the audio input device 60 is determined (S41); while no input is detected, the process waits (S41: NO).
  • When the sound output from the speaker 74 is input to the microphone 64 (S41: YES), it is converted into an input audio signal and input to the digital filter 34 via the A / D converter 30 and the distributor 32.
  • As described above, the digital filter 34 is set by the switch control unit 22 to selectively extract a 1 kHz signal when the received audio signal is silent and a 100 kHz signal when it is not silent. Therefore, whether the reference signal contained in the input audio signal has a frequency in the non-audible region or in the audible region, the reference signal corresponding to the current filter setting is extracted by passing the input audio signal through the digital filter 34 (S43).
  • the extracted reference signal (extracted reference signal) is input to the signal comparison unit 36.
  • the signal comparison unit 36 obtains the extraction timing T1 and the extraction level L1 of the extraction reference signal, and obtains the time shift P and the level shift L based on the generation timing T0 and the generation level L0 obtained from the generation reference signal.
  • This processing (S45, S47) is the same as the processing of S71 and S73 described above.
  • The newly obtained parameters (P, L) are transmitted to the delay processing unit 38 and the attenuation processing unit 40, respectively, and the parameters already held there (those obtained in the previous processing, such as the initialization process) are updated.
  • The delay processing unit 38 delays the output audio signal input from the adder 24 by the time P (S49), and the attenuation processing unit 40 attenuates the output audio signal input from the delay processing unit 38 by the factor L to generate the acoustic echo component (S51); this is the same as the processing of S19 and S21 described above.
  • Likewise, the subtractor 42 generates the removed audio signal by superimposing the inverted waveform of the acoustic echo component on the waveform of the input audio signal (S53); this is the same as the processing of S23 described above.
  • the process returns to S11, and the removed audio signal generated using the new parameter is transmitted as a transmission audio signal from the communication unit 46 to the terminal devices 3 and 4 via the network 1 (S11).
  • When the arrangement position of the audio input device 60 or the audio output device 70 changes, the path along which the audio output from the speaker 74 reaches the microphone 64 changes, and the parameters for generating the acoustic echo component change as well. Therefore, when a change in the arrangement position of at least one of the audio input device 60 and the audio output device 70 is detected, updating the parameters makes it possible to reliably remove the acoustic echo component in accordance with the (current) environment after the change. Consequently, if the removed audio signal generated using the new parameters is transmitted to the terminal devices 3 and 4 as the transmission audio signal, no acoustic echo is produced even when the audio based on that signal is output from the speakers on the terminal device 3 and 4 side.
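  • The overall update flow of S13 through S47 can be summarised by the sketch below; this is an assumed orchestration for illustration, and play_and_measure is a hypothetical helper that emits the chosen reference, extracts it from the microphone signal, and returns the measured timings and levels.

```python
def update_parameters_on_move(moved: bool, received_silent: bool, play_and_measure):
    """Return new (P, L) when the arrangement position changed, else None."""
    if not moved:                      # S13: NO -> keep the current parameters
        return None
    freq_hz = 1_000 if received_silent else 100_000   # S31 / S33 / S35
    t0, t1, l0, l1 = play_and_measure(freq_hz)        # S37-S43 (assumed helper)
    return t1 - t0, l1 / l0                           # S45, S47
```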
  • As described above, in the present embodiment the reference signal generated when obtaining the parameters (time shift information (P) and level shift information (L)) necessary for generating the acoustic echo component can be superimposed on the received audio signal and output from the speaker 74. Accordingly, even while the video conference system is operating and audio signals are being transmitted and received between the terminal device 2 and the terminal devices 3 and 4 (during operation), the parameters can be obtained and updated using the reference signal. Thus, if a change occurs during operation in the arrangement position of the audio input device 60 (microphone 64) or the audio output device 70 (speaker 74), so that an appropriate acoustic echo component can no longer be generated with the parameters used so far, new parameters can be obtained and updated immediately.
  • As a result, an appropriate acoustic echo component can be generated in response to a change in circumstances that may affect its generation accuracy during operation, and the accuracy of removing the acoustic echo component from the transmission audio signal can be maintained.
  • Furthermore, the reference signal is generated only when a change is detected in the arrangement position of at least one of the audio input device 60 and the audio output device 70. If there is no change in their arrangement position, the reference signal is not generated and the calculation for obtaining the parameters (the time shift and level shift information) is not performed. In other words, the parameters are updated only when a situation that requires it occurs (when the arrangement position of the audio input device 60 or the audio output device 70 changes), so that, compared with constant or periodic updating, no wasteful load is placed on the echo removal unit 8.
  • The movement detection unit 12 detects not only changes in the relative positional relationship between the audio input device 60 and the audio output device 70 but also changes in the absolute arrangement position of each. Therefore, a change in circumstances that may affect the generation accuracy of the acoustic echo component can be reliably detected.
  • The acceleration sensors 62 and 72 can easily be provided integrally with the microphone 64 and the speaker 74, and any change in the arrangement position of the audio input device 60 (in which the acceleration sensor 62 and the microphone 64 are integrated) or the audio output device 70 (in which the acceleration sensor 72 and the speaker 74 are integrated) applies an acceleration to them. Therefore, if the presence or absence of movement of the audio input device 60 or the audio output device 70 is grasped from the detection results of the acceleration sensors 62 and 72, an absolute change in the arrangement position of at least one of them can be detected easily and reliably.
  • Moreover, because the frequency of the reference signal's waveform is in the non-audible region, the user cannot hear the sound based on the reference signal even when it is superimposed on the received audio signal and output from the speaker 74; the user hears only the voice based on the received audio signal. Therefore, outputting the reference signal during operation does not hinder the user's speaking or listening, and whenever the arrangement position of the audio input device 60 or the audio output device 70 changes, new parameters can be obtained and updated immediately.
  • As a result, an appropriate acoustic echo component can be generated in response to a change in circumstances that may affect its generation accuracy during operation, and the accuracy of removing the acoustic echo component from the transmission audio signal can be maintained.
  • signals in the audible frequency range have a wider directivity than signals in the non-audible frequency range.
  • The frequency of the acoustic echo component is also in the audible region. Therefore, if the parameters (time shift information and level shift information) are obtained using a reference signal at an audible frequency, whose directivity is wide and whose frequency characteristics are close to those of the acoustic echo component, the generation accuracy of the acoustic echo component can be increased.
  • However, if a reference signal at an audible frequency is superimposed on the received audio signal and output from the speaker 74, the user hears the sound based on the reference signal together with the voice based on the received audio signal, and the user's speaking and listening may be hindered by it. It is therefore preferable to generate the audible-frequency reference signal when the received audio signal is in a silent state.
  • the speaker 74 corresponds to the “output unit” of the first aspect
  • the microphone 64 corresponds to the “input unit”.
  • the movement detection unit 12 corresponds to “position detection unit”
  • the reference signal generation units 14 and 16 correspond to “generation unit”.
  • the adder 24 corresponds to “superimposing means”
  • the digital filter 34 corresponds to “extraction means”.
  • the signal comparison unit 36 corresponds to “calculation unit”
  • the delay processing unit 38, the attenuation processing unit 40, and the subtractor 42 correspond to “removal unit”.
  • the communication unit 46 corresponds to “transmission means”.
  • the acceleration sensors 62 and 72 correspond to “acceleration detection means”
  • the switch control unit 22 corresponds to “determination means”.
  • FIG. 6 shows a configuration example of an echo removal apparatus when a personal computer (PC) 102 is used as the terminal apparatus 2.
  • a portion that functions as an echo removal device is an echo removal unit 108.
  • Parts having the same configuration as in the terminal device 2 are denoted by the same reference numerals.
  • the PC 102 includes a known CPU 180, and a ROM 82, a RAM 84, and an input / output interface 88 are connected to the CPU 180 via a bus 86.
  • the input / output interface 88 includes an operation input device 92 such as a mouse and a keyboard, an external storage device 90 such as a hard disk drive (HDD), a flash memory drive (SSD), and a DVD-ROM drive, a video processing unit 94, and a communication unit 46. Is connected.
  • a video input device 96 such as a web camera and a video output device 98 such as a monitor are connected to the video processing unit 94.
  • An audio input device 60 including a microphone 64 and an acceleration sensor 62 and an audio output device 70 including a speaker 74 and an acceleration sensor 72 are also connected to the input / output interface 88.
  • Specifically, the speaker 74 is connected to the input / output interface 88 via the D / A converter 28, the microphone 64 via the A / D converter 30, and the acceleration sensors 62 and 72 via the A / D converter 26.
  • the audio input device 60, the audio output device 70, the operation input device 92, the video input device 96, and the video output device 98 are provided as external devices of the PC 102.
  • the echo removing unit 108 includes a voice input device 60, a voice output device 70, a communication unit 46, an external storage device 90, and various components (CPU 180, ROM 82, RAM 84, etc.) for controlling these processing units (each device). Consists of.
  • the PC 102 is connected to the network 1 via the communication unit 46, and the video conference system is constructed together with the terminal devices 3 and 4 connected through the network 1 as in the present embodiment.
  • the CPU 180 executes a program installed in the external storage device 90, so that the CPU 180 can perform processing equivalent to that of the audio processing unit 10 of the present embodiment. That is, it is only necessary to design a sound processing unit 110 that combines known modules for realizing the processes in the flowcharts of FIGS. 2 and 3 and can process a sound signal according to the process flow shown in the flowcharts as a program. Note that each processing unit constituting the audio processing unit 110 is a function realized by the CPU 180, and in FIG. 6, it is shown as a virtual processing unit so that it can be compared with the one in this embodiment (see FIG. 1). However, the same reference numerals are given in parentheses.
  • The CPU 180 performing the process of S39 functions as the "output step" of the second and third aspects, and the CPU 180 performing the process of S41 functions as the "input step".
  • The CPU 180 performing the process of S13 functions as the "position detection step", and the CPU 180 performing the process of S33 or S35 functions as the "generation step".
  • The CPU 180 performing the process of S37 functions as the "superimposition step", and the CPU 180 performing the process of S43 functions as the "extraction step".
  • The CPU 180 performing the processes of S45 and S47 functions as the "calculation step", and the CPU 180 performing the processes of S49, S51, and S53 functions as the "removal step".
  • The CPU 180 performing the process of S11 functions as the "transmission step".
  • As a modification, a change in the arrangement position of the audio input device 260 or the audio output device 270 may be detected by photographing them from a fixed position and analyzing the captured image.
  • the voice input device 260 and the voice output device 270 are configured as movable devices each including a microphone 64 and a speaker 74 without including an acceleration sensor.
  • the output of the camera 250 for photographing the voice input device 260 and the voice output device 270 is input to the input / output interface 88.
  • an image analysis unit 252 that performs a known image analysis process is provided, and the positions (for example, coordinates) of the audio input device 260 and the audio output device 270 in the image captured by the camera 250 are specified.
  • the image analysis unit 252 may be realized by, for example, the CPU 280 executing a program and performing a known image analysis process.
  • the analysis result of the image analysis unit 252 (for example, coordinate information of the audio input device 260 and the audio output device 270) is input to the movement detection unit 12. Note that, in the terminal device 202 of this modification, a portion that functions as an echo removal device is indicated as an echo removal unit 208.
  • The echo removal unit 208 includes an audio processing unit 210 (which, except for omitting the A / D converter 26, may have the same configuration as the audio processing unit 10 of the present embodiment), the audio input device 260, the audio output device 270, the communication unit 46, and the like.
  • The terminal device 202 is configured in this way, and the camera 250 is installed at an appropriate fixed position overlooking the movable range of the audio input device 260 and the audio output device 270. The image captured by the camera 250 is analyzed by the image analysis unit 252, and the positions of the audio input device 260 and the audio output device 270 in the captured image are specified. Based on the analysis result, the movement detection unit 12 determines whether or not a change has occurred in the arrangement position of the audio input device 260 or the audio output device 270. Thus, if the audio input device 260 and the audio output device 270 are photographed from a fixed position using the camera 250, an absolute change in the arrangement position of at least one of them can be detected simply by analyzing the captured image and grasping their positions in it.
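  • A sketch of the comparison the movement detection unit 12 might perform on the analysis results is shown below; pixel coordinates and the tolerance value are assumptions for illustration.

```python
import math

def position_changed(prev_xy: tuple, curr_xy: tuple, tol_px: float = 10.0) -> bool:
    """True if a device's position in the captured image shifted beyond tol_px."""
    return math.dist(prev_xy, curr_xy) > tol_px

def any_device_moved(prev: dict, curr: dict, tol_px: float = 10.0) -> bool:
    # prev/curr: {"mic": (x, y), "speaker": (x, y)} from the image analysis unit 252
    return any(position_changed(prev[k], curr[k], tol_px) for k in prev)
```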
  • the camera 250 corresponds to the “photographing means” of the first aspect.
  • The CPU 280, which realizes the image analysis unit 252 that performs a known image analysis process and specifies the positions of the audio input device 260 and the audio output device 270 in the image captured by the camera 250, functions as the "analysis means".
  • an identification marker may be marked on the voice input device 260 and the voice output device 270, and the marker position (coordinates) may be specified in the image captured by the camera 250 fixed at a fixed position. In this way, the arrangement positions of both devices in the captured image can be specified without performing shape recognition of the voice input device 260 and the voice output device 270, and the image analysis process can be simplified.
  • a change in the arrangement position of the voice input device or the voice output device may also be detected based on the phase shift, or the phase shift of reflected waves, observed when radio waves, infrared rays, laser beams, or the like are emitted from two or more fixed points and received by the voice input device or the voice output device.
  • a digital output may be used for the speaker 74, the microphone 64, and the acceleration sensors 62 and 72.
  • the voice input device 60 or the voice output device 70 may be provided with an A/D converter or a D/A converter.
  • the count value T may be acquired by using an interval timer of the CPU 80 instead of the timer 44.
  • although a band-pass filter is used as the digital filter 34, a high-pass filter (HPF), a low-pass filter (LPF), or a combination of these filters may be used instead.
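As a minimal illustration of the camera-based decision described in this list (the items on the camera 250, the image analysis unit 252, and the movement detection unit 12), the following Python sketch compares the coordinates reported for a device in the latest captured frame with the last stored coordinates and treats a displacement beyond a threshold as a change in arrangement position. The threshold, the coordinate values, and the function names are assumptions made for this sketch, not values from the embodiment.

import numpy as np

# Assumed threshold: how far (in pixels) a device may drift in the captured
# image before it is treated as "moved".
MOVE_THRESHOLD_PX = 20.0

def has_moved(prev_xy, curr_xy, threshold=MOVE_THRESHOLD_PX):
    """True if the tracked device position changed by more than `threshold` pixels."""
    prev = np.asarray(prev_xy, dtype=float)
    curr = np.asarray(curr_xy, dtype=float)
    return float(np.linalg.norm(curr - prev)) > threshold

# Coordinates such as these would come from the image analysis step; the
# values here are made up for the example.
mic_last = (412, 233)   # position stored when the echo parameters were last estimated
mic_now = (455, 251)    # position reported for the latest camera frame

if has_moved(mic_last, mic_now):
    # Corresponds to the movement detection unit deciding that the
    # time-difference and level-difference information must be obtained again.
    print("arrangement changed: emit reference signal and re-estimate parameters")
else:
    print("no change: keep the existing parameters")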

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Telephone Function (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Cable Transmission Systems, Equalization Of Radio And Reduction Of Echo (AREA)

Abstract

The invention relates to an echo removal device, an echo removal method, and a program for the echo removal device which make it possible to newly obtain time difference information and level difference information when the arrangement position of an input means or an output means is changed, and to remove acoustic echo components on the basis of the most recent information. When a change in the arrangement position of a voice input device (60) or a voice output device (70) is detected by a movement detection section (12), a reference signal is generated by reference signal generation sections (14, 16), superimposed by an adder (24) on the voice signal received from terminal devices (3, 4), and output from a speaker (74). To remove the acoustic echo, the reference signal is extracted by a digital filter (34) from the voice picked up by a microphone (64), and an acoustic echo component, which is the received voice signal delayed and attenuated, is generated on the basis of the time difference information and the level difference information obtained by comparison with the original reference signal; the acoustic echo component is then removed by a subtractor (42) from the voice signal to be transmitted, and the voice signal is transmitted to the terminal devices (3, 4).
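As a concrete illustration of the processing chain summarized above, the following Python sketch (NumPy/SciPy) mirrors the same idea: a band-limited reference signal is recovered from the microphone pick-up with a band-pass filter standing in for the digital filter (34), compared with the original reference to obtain the time difference and level difference information, and the received voice signal, delayed and attenuated accordingly, is subtracted as the acoustic echo component, as the subtractor (42) does. The sampling rate, reference frequency, bandwidth, variable names, and function names are assumptions made for this sketch and are not taken from the publication.

import numpy as np
from scipy.signal import butter, lfilter, correlate

# Assumed values for the sketch: 16 kHz sampling, a 6 kHz reference tone,
# and a 1 kHz-wide band-pass around it.
fs = 16000
f_ref = 6000.0
half_band = 500.0

def bandpass(x, low, high, fs, order=4):
    """Band-pass filter standing in for the digital filter (34)."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return lfilter(b, a, x)

def estimate_delay_and_level(reference, captured, fs):
    """Compare the captured microphone signal with the original reference
    signal and return (time difference in samples, level ratio)."""
    ref_f = bandpass(reference, f_ref - half_band, f_ref + half_band, fs)
    ext_f = bandpass(captured, f_ref - half_band, f_ref + half_band, fs)
    corr = correlate(ext_f, ref_f, mode="full")
    delay = max(int(np.argmax(corr)) - (len(ref_f) - 1), 0)
    segment = ext_f[delay:delay + len(ref_f)]
    level = np.sqrt(np.mean(segment ** 2) / np.mean(ref_f ** 2))
    return delay, level

def remove_echo(mic, received, delay, level):
    """Subtract the acoustic echo component: the received (far-end) voice
    signal delayed and attenuated according to the estimated parameters."""
    echo = np.zeros_like(mic)
    echo[delay:] = level * received[:len(mic) - delay]
    return mic - echo

# Synthetic check: the microphone hears the reference delayed by 12 ms at 40 % level.
reference = np.sin(2 * np.pi * f_ref * np.arange(int(0.05 * fs)) / fs)
mic = 0.01 * np.random.randn(fs)
true_delay = int(0.012 * fs)
mic[true_delay:true_delay + len(reference)] += 0.4 * reference
delay, level = estimate_delay_and_level(reference, mic, fs)
print(delay, round(level, 2))   # should come out close to 192 samples (12 ms) and 0.4

Passing the stored reference through the same band-pass filter as the microphone signal keeps the filter's group delay out of the time-difference estimate.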
PCT/JP2010/064678 2009-09-17 2010-08-30 Dispositif de suppression d'écho, procédé de suppression d'écho et programme pour dispositif de suppression d'écho WO2011033924A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009215283A JP2011066668A (ja) 2009-09-17 2009-09-17 エコー除去装置、エコー除去方法、およびエコー除去装置のプログラム
JP2009-215283 2009-09-17

Publications (1)

Publication Number Publication Date
WO2011033924A1 true WO2011033924A1 (fr) 2011-03-24

Family

ID=43758533

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/064678 WO2011033924A1 (fr) 2009-09-17 2010-08-30 Dispositif de suppression d'écho, procédé de suppression d'écho et programme pour dispositif de suppression d'écho

Country Status (2)

Country Link
JP (1) JP2011066668A (fr)
WO (1) WO2011033924A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013078002A1 (fr) * 2011-11-23 2013-05-30 Qualcomm Incorporated Acoustic echo cancellation based on ultrasound motion detection

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5666063B2 (ja) * 2012-08-03 2015-02-12 Mitsubishi Electric Corp. Call device
US9131041B2 (en) 2012-10-19 2015-09-08 Blackberry Limited Using an auxiliary device sensor to facilitate disambiguation of detected acoustic environment changes
JP6347029B2 (ja) * 2014-03-19 2018-06-27 Aiphone Co., Ltd. Intercom system
KR20210108232A (ko) * 2020-02-25 2021-09-02 Samsung Electronics Co., Ltd. Method and apparatus for echo cancellation
KR20220017775A (ko) * 2020-08-05 2022-02-14 Samsung Electronics Co., Ltd. Audio signal processing device and operating method thereof

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0983412A (ja) * 1995-09-08 1997-03-28 Ricoh Co Ltd Digital echo canceller device
JP2001119470A (ja) * 1999-10-15 2001-04-27 Fujitsu Ten Ltd Telephone voice processing device
JP2006080660A (ja) * 2004-09-07 2006-03-23 Oki Electric Ind Co Ltd Communication terminal with echo canceller and echo cancellation method therefor
JP2007072351A (ja) * 2005-09-09 2007-03-22 Mitsubishi Electric Corp Voice recognition device
JP2007336364A (ja) * 2006-06-16 2007-12-27 Oki Electric Ind Co Ltd Echo canceller

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013078002A1 (fr) * 2011-11-23 2013-05-30 Qualcomm Incorporated Acoustic echo cancellation based on ultrasound motion detection
CN103988487A (zh) * 2011-11-23 2014-08-13 Qualcomm Incorporated Acoustic echo cancellation based on ultrasonic motion detection
US9363386B2 (en) 2011-11-23 2016-06-07 Qualcomm Incorporated Acoustic echo cancellation based on ultrasound motion detection

Also Published As

Publication number Publication date
JP2011066668A (ja) 2011-03-31

Similar Documents

Publication Publication Date Title
US10993025B1 (en) Attenuating undesired audio at an audio canceling device
JP5085556B2 (ja) エコー除去の構成
US9494683B1 (en) Audio-based gesture detection
US8842851B2 (en) Audio source localization system and method
US9595997B1 (en) Adaption-based reduction of echo and noise
WO2011033924A1 (fr) Dispositif de suppression d'écho, procédé de suppression d'écho et programme pour dispositif de suppression d'écho
US9385779B2 (en) Acoustic echo control for automated speaker tracking systems
JP2008288785A (ja) テレビ会議装置
US9928847B1 (en) System and method for acoustic echo cancellation
JP2003060530A (ja) エコー抑制処理システム
US20230353953A1 (en) Voice input/output apparatus, hearing aid, voice input/output method, and voice input/output program
EP2795884A1 (fr) Audioconférence
JP3607625B2 (ja) 多チャネル反響抑圧方法、その装置、そのプログラム及びその記録媒体
JP2009141560A (ja) 音声信号処理装置、音声信号処理方法
KR102112018B1 (ko) 영상 회의 시스템에서의 음향 반향 제거 장치 및 방법
US20230419943A1 (en) Devices, methods, systems, and media for spatial perception assisted noise identification and cancellation
US8976956B2 (en) Speaker phone noise suppression method and apparatus
JP6569853B2 (ja) 指向性制御システム及び音声出力制御方法
CN113556652B (zh) 语音处理方法、装置、设备及***
JP2010226403A (ja) ハウリングキャンセラ
JP2008294600A (ja) 放収音装置、および放収音システム
JP2008219240A (ja) 放収音システム
JP4743085B2 (ja) エコーキャンセラ
JP6347029B2 (ja) インターホンシステム
JP2015103824A (ja) 音声発生システムおよび音声発生機器用スタンド

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10817037

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10817037

Country of ref document: EP

Kind code of ref document: A1