CN113470683A - Signal output method, device, equipment and storage medium of microphone array - Google Patents

Info

Publication number
CN113470683A
Authority
CN
China
Prior art keywords
sound source
microphone array
input signal
sound
signal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110716886.XA
Other languages
Chinese (zh)
Inventor
陈英博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Lianzhou International Technology Co Ltd
Original Assignee
Shenzhen Lianzhou International Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Lianzhou International Technology Co Ltd
Priority to CN202110716886.XA
Publication of CN113470683A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention relates to the technical field of sound source signal processing, and discloses a signal output method, a signal output device, signal output equipment and a signal output storage medium of a microphone array, which can be used for flexibly tracking, monitoring and outputting sound source signals in multiple directions. The method comprises the steps of acquiring an input signal of each frame of a microphone array, and calculating the output power of each monitoring direction according to the input signal; selecting M first sound sources from the Nth frame input signal according to a preset first rule; selecting Q second sound sources from the ith frame input signal according to a preset second rule; when the sound source direction of any one second sound source is the same as that of any one first sound source, updating the sound source information of the corresponding first sound source according to the second sound source; acquiring a second sound source with the sound source direction different from the sound source direction of the first sound source as a third sound source; determining an output signal of the microphone array from the updated first and third sound sources according to a preset third rule.

Description

Signal output method, device, equipment and storage medium of microphone array
Technical Field
The present invention relates to the field of sound source signal processing technologies, and in particular, to a method, an apparatus, a device, and a storage medium for outputting a signal of a microphone array.
Background
The traditional microphone array signal processing method is divided into two processes: sound source localization and beam forming. Sound source localization calculates the incident direction of a sound source from the original input signals of the microphone array. Beam forming adjusts the parameters of the beam former according to the original input signals of the microphone array and the incident direction obtained by sound source localization, so that the beam former points to the sound source direction and signals in other directions are suppressed while signals in the sound source direction pass through, thereby improving the signal-to-noise ratio of the output signal.
However, existing sound source localization algorithms localize only the sound source direction with the strongest power, and a beam former is then used to monitor and output that direction. As a result, signal interference arises when multiple sound sources are present in the scene simultaneously. For example, when a person speaks in the 0 degree direction of the microphone array and a washing machine runs continuously in the 90 degree direction, even if the person speaks louder than the washing machine, there are pauses between the person's words and sentences, and during those pauses the sound source may be localized in the direction of the washing machine, which strongly interferes with the output voice.
In order to solve the above technical problem, the existing signal output technology of a microphone array skips the sound source localization step entirely, sets a corresponding beam former for each direction to monitor that direction, and outputs a signal in each direction. In this way, every output signal has to be calculated, which consumes a large amount of computing power.
Disclosure of Invention
The technical problem to be solved by the embodiment of the invention is as follows: a signal output method, device, equipment and storage medium of a microphone array are provided, which can flexibly track and monitor sound source signals in a plurality of directions and output the sound source signals.
In order to solve the technical problem, in a first aspect, an embodiment of the present invention provides a signal output method for a microphone array, where the method includes:
acquiring an input signal of each frame of a microphone array, and calculating the output power of each monitoring direction according to the input signal;
selecting M first sound sources from the input signals of the Nth frame according to a preset first rule; wherein N is more than or equal to 1, and M is more than 1;
selecting Q second sound sources from the input signal of the ith frame according to a preset second rule; wherein i is more than N, and Q is more than or equal to M;
when the sound source direction of any one of the second sound sources is the same as the sound source direction of any one of the first sound sources, updating the sound source information of the corresponding first sound source according to the second sound source;
acquiring a second sound source with a sound source direction different from the sound source direction of the first sound source as a third sound source;
determining an output signal of the microphone array from the updated first and third sound sources according to a preset third rule.
As an alternative, the acquiring an input signal of each frame of the microphone array, and calculating an output power of each listening direction according to the input signal specifically includes:
acquiring an input signal of each frame of the microphone array;
calculating the output power $P_1$ of each listening direction of the input signal of the first frame based on a beam forming algorithm;
calculating the output power $P_i$ of each listening direction of the input signal of the ith frame according to the following formula:
$P_i = a \cdot P_{i0} + (1-a) \cdot P_{i-1}$
wherein $P_{i0}$ is the original output power of any monitoring direction of the input signal of the ith frame calculated based on a beam forming algorithm, $P_{i-1}$ is the output power of the corresponding monitoring direction for the input signal of the (i-1)th frame, $a$ is a smoothing factor, and $0 < a < 1$.
As an alternative, the beamforming algorithm is a delay-and-sum beamforming algorithm, a minimum variance distortionless response beamforming algorithm, a linearly constrained minimum variance beamforming algorithm, a generalized sidelobe cancellation algorithm, or a transfer function generalized sidelobe cancellation algorithm.
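As an illustration of how a per-direction output power could be obtained with the first listed option, the following is a minimal time-domain delay-and-sum sketch. It is not taken from the patent: it assumes a uniform linear array, a far-field source, integer sample delays, and a speed of sound of 343 m/s, and the function names `steering_delays` and `delay_and_sum_power` are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def steering_delays(num_mics, spacing_m, angle_deg, sample_rate):
    """Integer sample delays of each microphone relative to microphone 0.

    Assumes a far-field plane wave arriving at `angle_deg` from the axis of
    a uniform linear array with `spacing_m` metres between microphones.
    """
    tau = spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND
    return [round(k * tau * sample_rate) for k in range(num_mics)]


def delay_and_sum_power(frames, delays):
    """Mean power of the delay-compensated, averaged channels.

    frames: one list of samples per microphone, all of equal length.
    delays: integer delay (in samples) of each channel for the scanned direction.
    """
    n = len(frames[0])
    # Restrict to the sample range where every shifted index is valid.
    start = max(0, -min(delays))
    stop = n - max(0, max(delays))
    if stop <= start:
        raise ValueError("frame too short for these delays")
    total = 0.0
    for t in range(start, stop):
        # Undo each channel's delay, then average across channels.
        y = sum(ch[t + d] for ch, d in zip(frames, delays)) / len(frames)
        total += y * y
    return total / (stop - start)
```

Scanning `delay_and_sum_power` over the delays of every monitoring direction yields one raw power value per direction; the direction whose delays match the true source aligns the channels and gives the largest power.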
As an alternative, the updating the sound source information of the corresponding first sound source according to the second sound source specifically includes:
adjusting the output power of the corresponding first sound source according to the following formula:
$P_m' = P_m \cdot b + P_q \cdot (1-b)$;
wherein $P_m'$ is the output power of the mth first sound source after adjustment, $1 \le m \le M$; $P_q$ is the output power of the qth second sound source, $1 \le q \le Q$; $b$ is an adjustment factor, and $0 < b < 1$.
As an alternative, the selecting M first sound sources from the input signal of the nth frame according to a preset first rule specifically includes:
selecting M sound sources with the maximum output power from the input signals of the Nth frame as the first sound source;
or,
and in the pre-divided monitoring directions, selecting a fourth sound source with the maximum output power from each monitoring direction, and selecting M sound sources with the maximum output power from the fourth sound source as the first sound source.
As an alternative, the determining the output signal of the microphone array from the updated first sound source and the updated third sound source according to a preset third rule specifically includes:
combining the updated first sound source and the third sound source into a fifth sound source;
selecting M sixth sound sources with the maximum output power from the fifth sound sources;
determining an output signal of the microphone array according to a sound source direction in the sixth sound source.
As an alternative, before the acquiring the input signal of each frame of the microphone array, the method further comprises:
the monitoring direction is evenly divided into a plurality of parts.
In order to solve the above technical problem, in a second aspect, an embodiment of the present invention provides a signal output apparatus of a microphone array, the apparatus including: an input signal acquisition module, configured to acquire an input signal of each frame of the microphone array and calculate the output power of each monitoring direction according to the input signal;
the first sound source selection module is used for selecting M first sound sources from the input signal of the Nth frame according to a preset first rule; wherein N is more than or equal to 1, and M is more than 1;
the second sound source selection module is used for selecting Q second sound sources from the input signal of the ith frame according to a preset second rule; wherein i is more than N, and Q is more than or equal to M;
the first sound source updating module is used for updating the sound source information of the corresponding first sound source according to the second sound source when the sound source direction of any one of the second sound sources is the same as the sound source direction of any one of the first sound sources;
a third sound source selection module, configured to acquire a second sound source having a sound source direction different from the sound source direction of the first sound source, as a third sound source;
and the signal output module is used for determining the output signals of the microphone array from the updated first sound source and the updated third sound source according to a preset third rule.
In order to solve the technical problem, in a third aspect, an embodiment of the present invention provides a signal output device of a microphone array, the device including a memory, a processor, and a computer program stored in the memory and configured to be executed by the processor, the computer program, when executed by the processor, implementing the signal output method of the microphone array according to any one of the first aspect.
In order to solve the above technical problem, in a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium having stored therein a computer program, which when executed, implements the signal output method of a microphone array according to any one of the first aspect.
Compared with the prior art, the signal output method, device, equipment and storage medium of the microphone array provided by the embodiment of the invention have the following beneficial effects: the incident direction of the sound source can be selected automatically and dynamically, and sound source signals in a plurality of incident directions are tracked, monitored and output, which avoids both the missed monitoring that occurs when only a single direction is monitored and the high computing power consumption of calculating every output signal.
Drawings
In order to more clearly illustrate the technical features of the embodiments of the present invention, the drawings needed to be used in the embodiments of the present invention will be briefly described below, and it is apparent that the drawings described below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on the drawings without inventive labor.
Fig. 1 is a schematic flow chart of an alternative embodiment of a signal output method of a microphone array according to the present invention;
Fig. 2 is a schematic structural diagram of an alternative embodiment of a signal output apparatus of a microphone array according to the present invention;
Fig. 3 is a schematic structural diagram of an alternative embodiment of a signal output device of a microphone array according to the present invention.
Detailed Description
In order to clearly understand the technical features, objects and effects of the present invention, the following detailed description of the embodiments of the present invention is provided with reference to the accompanying drawings and examples. The following examples are intended to illustrate the invention, but are not intended to limit the scope of the invention. Other embodiments, which can be derived by those skilled in the art from the embodiments of the present invention without inventive step, shall fall within the scope of the present invention.
In the description of the present invention, it should be understood that the numbers themselves, such as "first", "second", etc., are used only for distinguishing the described objects, do not have a sequential or technical meaning, and cannot be understood as defining or implying the importance of the described objects.
Fig. 1 is a schematic flow chart of a signal output method of a microphone array according to a preferred embodiment of the present invention.
As shown in fig. 1, the method includes:
S11: acquiring an input signal of each frame of a microphone array, and calculating the output power of each monitoring direction according to the input signal;
S12: selecting M first sound sources from the input signals of the Nth frame according to a preset first rule; wherein N is more than or equal to 1, and M is more than 1;
S13: selecting Q second sound sources from the input signal of the ith frame according to a preset second rule; wherein i is more than N, and Q is more than or equal to M;
S14: when the sound source direction of any one of the second sound sources is the same as the sound source direction of any one of the first sound sources, updating the sound source information of the corresponding first sound source according to the second sound source;
S15: acquiring a second sound source with a sound source direction different from the sound source direction of the first sound source as a third sound source;
S16: determining an output signal of the microphone array from the updated first and third sound sources according to a preset third rule.
In the embodiment of the invention, the sound source direction of the input signal is calculated based on a minimum variance spectrum estimation calculation method.
Specifically, an input signal of each frame of the microphone line array is first acquired, and the output power of each monitoring direction is calculated using any sound source localization algorithm based on beam-scanning azimuth spectrum estimation. $P_x$ denotes the output power of the x-th monitoring direction, where $0 < x \le X$ and $X$ is the total number of monitoring directions. Next, the maximum number of sound sources to be tracked simultaneously is set to M (2 or 3 in the case of a smart speaker), and the sound source information of the m-th tracked sound source is denoted $T_m = (D_m, P_m)$, where $D_m$ is the sound source direction of that sound source and $P_m$ is its output power. Then, for the 1st frame input signal (for convenience of description, N is taken as 1), M first sound sources are selected from the input signal according to a preset first rule and recorded as $T_m = (D_m, P_m)$; for the input signal of the ith frame, Q second sound sources are selected according to a preset second rule, and the sound source information of the q-th second sound source is denoted $T_q = (D_q, P_q)$. Next, for each second sound source q, if its direction $D_q$ is already tracked, that is, some $D_m = D_q$ already exists, the output power of the first sound source with the same sound source direction is updated according to the output power of the second sound source; if $D_q$ is not tracked, the second sound source is regarded as a third sound source. With K such untracked sound sources, the sound source information of the k-th third sound source is denoted $T_k = (D_k, P_k)$. Finally, the output signal of the microphone array is determined from the updated first and third sound sources according to a preset third rule.
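The matching described in this paragraph (steps S14 and S15) can be sketched as follows. This is only an illustrative sketch under stated assumptions: directions are treated as discrete, directly comparable values (for example sector indices), the tracked sources are held in a dictionary, the power update uses a fixed weighting factor `b`, and the name `match_sources` is hypothetical.

```python
def match_sources(first, second, b=0.7):
    """Update tracked sources and collect untracked ones for one frame.

    first:  dict mapping direction D_m -> output power P_m (tracked sources).
    second: list of (D_q, P_q) pairs detected in the current frame.
    b:      weighting factor with 0 < b < 1 (fixed here as an assumption).

    Returns (updated_first, third), where `third` lists the (D_k, P_k) pairs
    whose direction was not tracked yet.
    """
    if not 0.0 < b < 1.0:
        raise ValueError("b must lie strictly between 0 and 1")
    updated = dict(first)  # leave the caller's dictionary untouched
    third = []
    for d_q, p_q in second:
        if d_q in updated:
            # Same direction as a tracked source: blend old and new power.
            updated[d_q] = updated[d_q] * b + p_q * (1.0 - b)
        else:
            # Direction not tracked yet: becomes a third sound source.
            third.append((d_q, p_q))
    return updated, third
```

For example, with tracked sources at 0° and 90° and detections at 0° and 180°, the 0° entry has its power blended while the 180° detection is returned as a third sound source.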
The signal output method of the microphone array provided by the embodiment of the invention can automatically and dynamically select the incident direction of the sound source, and track, monitor and output the sound source signals in a plurality of incident directions, thereby avoiding both the missed monitoring that occurs when only a single direction is monitored and the high computing power consumption of calculating every output signal.
In an optional embodiment, before the acquiring the input signal of each frame of the microphone array, the method further comprises:
S10: the monitoring direction is evenly divided into a plurality of parts.
Specifically, taking the XOY plane as an example, the XOY plane covers 360° in total and is divided uniformly into X parts according to the scanning interval. For example, if the scanning interval is 10°, the XOY plane is divided into 36 parts; if the scanning interval is 30°, the XOY plane is divided into 12 parts.
In this embodiment, dividing the monitoring direction evenly makes the signals easier to monitor and allows the monitoring direction in which a signal lies to be determined quickly.
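The even division can be illustrated with a short sketch. It assumes an integer scanning interval in degrees that divides 360 exactly and represents each part by its starting angle; the function name `listening_directions` is hypothetical.

```python
def listening_directions(scan_interval_deg):
    """Return the starting angle (in degrees) of each evenly divided part."""
    if scan_interval_deg <= 0 or 360 % scan_interval_deg != 0:
        raise ValueError("scan interval must be a positive divisor of 360")
    return list(range(0, 360, scan_interval_deg))


# A 10 degree interval gives 36 parts, a 30 degree interval gives 12.
print(len(listening_directions(10)), len(listening_directions(30)))
```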
In an optional embodiment, the acquiring an input signal of each frame of a microphone array, and calculating an output power of each listening direction according to the input signal specifically includes:
acquiring an input signal of each frame of the microphone array;
calculating the output power $P_1$ of each listening direction of the input signal of the first frame based on a beam forming algorithm; wherein the beam forming algorithm is a delay-and-sum beam forming algorithm, a minimum variance distortionless response beam forming algorithm, a linearly constrained minimum variance beam forming algorithm, a generalized sidelobe cancellation algorithm or a transfer function generalized sidelobe cancellation algorithm;
calculating the output power $P_i$ of each listening direction of the input signal of the ith frame according to the following formula:
$P_i = a \cdot P_{i0} + (1-a) \cdot P_{i-1}$
wherein $P_{i0}$ is the original output power of any monitoring direction of the input signal of the ith frame calculated based on a beam forming algorithm, $P_{i-1}$ is the output power of the corresponding monitoring direction for the input signal of the (i-1)th frame, $a$ is a smoothing factor, and $0 < a < 1$.
Further, the value of the smoothing factor $a$ changes with the values of $P_{i0}$ and $P_{i-1}$ (a mapping table may be set): when the absolute value of the difference between $P_{i0}$ and $P_{i-1}$ is larger, the value of $a$ is smaller; when the absolute value of the difference is smaller, the value of $a$ is larger. Varying the smoothing factor $a$ makes the voice signal smoother and reduces the degree of noise interference.
Specifically, the output power $P_1$ of the input signal of the first frame in each listening direction is calculated first. Then, in order to suppress noise, the $P_i$ of each subsequent frame is smoothed along the time axis, with the smoothing factor controlling whether historical data is used and in what proportion.
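The temporal smoothing can be sketched as below. The recursion follows the formula given for the ith frame; the mapping table from power difference to smoothing factor is purely illustrative, since the text only states that a larger difference maps to a smaller factor, and the names `smooth_power` and `adaptive_factor` as well as the table values are assumptions.

```python
def smooth_power(raw, previous, a):
    """Apply P_i = a * P_i0 + (1 - a) * P_(i-1) to every monitoring direction.

    raw:      raw beamformer powers P_i0 of the current frame, one per direction.
    previous: smoothed powers P_(i-1) of the previous frame, same ordering.
    a:        smoothing factor with 0 < a < 1.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    return [a * r + (1.0 - a) * p for r, p in zip(raw, previous)]


def adaptive_factor(raw, previous, table=((1.0, 0.8), (10.0, 0.5)), floor=0.2):
    """Look up a smoothing factor from |P_i0 - P_(i-1)|.

    `table` holds hypothetical (difference limit, factor) pairs in ascending
    order of limit; larger differences map to smaller factors, and `floor`
    is used beyond the last limit.
    """
    diff = abs(raw - previous)
    for limit, factor in table:
        if diff <= limit:
            return factor
    return floor
```

With `a = 0.25`, raw powers `[4.0, 8.0]` and previous powers `[0.0, 0.0]` smooth to `[1.0, 2.0]`, so a sudden jump in one frame moves the tracked power only a quarter of the way.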
In an optional embodiment, the updating the sound source information of the corresponding first sound source according to the second sound source specifically includes:
adjusting the output power of the corresponding first sound source according to the following formula:
$P_m' = P_m \cdot b + P_q \cdot (1-b)$;
wherein $P_m'$ is the output power of the mth first sound source after adjustment, $1 \le m \le M$; $P_q$ is the output power of the qth second sound source, $1 \le q \le Q$; $b$ is an adjustment factor, and $0 < b < 1$.
Further, the value of the adjustment factor $b$ changes with the values of $P_m$ and $P_q$ (a mapping table may be set): when the absolute value of the difference between $P_m$ and $P_q$ is larger, the value of $b$ is smaller; when the absolute value of the difference is smaller, the value of $b$ is larger. Varying the adjustment factor $b$ makes the voice signal smoother and reduces the degree of noise interference.
Specifically, when the sound source direction of the second sound source is the same as the sound source direction of the first sound source, the sound source direction in the sound source information is unchanged and does not need to be updated. The output power, however, needs to be updated according to a corresponding rule. Many update rules are possible; this embodiment adopts a smoothing algorithm based on time and energy.
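As a worked example of the adjustment formula, with illustrative values that are not taken from the patent ($P_m = 8$, $P_q = 4$, $b = 0.75$):

$$P_m' = P_m \cdot b + P_q \cdot (1-b) = 8 \cdot 0.75 + 4 \cdot 0.25 = 6 + 1 = 7$$

The tracked power moves a quarter of the way from the old value 8 toward the newly observed value 4, so a single quiet frame lowers the tracked power of a first sound source only moderately.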
In an optional embodiment, the selecting M first sound sources from the input signal of the nth frame according to a preset first rule specifically includes:
selecting M sound sources with the maximum output power from the input signals of the Nth frame as the first sound source;
or,
and in the pre-divided monitoring directions, selecting a fourth sound source with the maximum output power from each monitoring direction, and selecting M sound sources with the maximum output power from the fourth sound source as the first sound source.
Specifically, the first selection mode is global maximum selection, that is, for the nth frame input signal, no matter whether the sound source direction is close or not, only M sound sources with the maximum output power are taken as the first sound source; the second selection mode is local maximum selection, that is, for the nth frame input signal, considering the monitoring direction of the sound source, only one fourth sound source with the maximum output power is selected from each monitoring direction (if there are X monitoring directions, there are X fourth sound sources correspondingly), and then M sound sources with the maximum output power are selected from the fourth sound sources as the first sound sources, so that only one output signal in each monitoring direction can be ensured.
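The two selection modes can be contrasted in a short sketch. It assumes that candidate sound sources are given as (angle in degrees, output power) pairs and that each pre-divided monitoring direction is a sector of `sector_deg` degrees; the function names `select_global` and `select_local` are hypothetical.

```python
def select_global(candidates, m):
    """Global maximum selection: the m candidates with the largest power."""
    return sorted(candidates, key=lambda s: s[1], reverse=True)[:m]


def select_local(candidates, m, sector_deg):
    """Local maximum selection: strongest candidate per sector, then top m."""
    best = {}
    for angle, power in candidates:
        sector = int(angle % 360 // sector_deg)  # index of the monitoring direction
        if sector not in best or power > best[sector][1]:
            best[sector] = (angle, power)
    # The per-sector winners play the role of the "fourth sound sources".
    return select_global(list(best.values()), m)


candidates = [(0, 9.0), (5, 8.0), (90, 7.0), (180, 3.0)]
print(select_global(candidates, 2))     # two nearby sources may both be chosen
print(select_local(candidates, 2, 30))  # at most one source per 30 degree sector
```

In this example the global mode picks the sources at 0° and 5°, which lie in the same sector, while the local mode picks the sources at 0° and 90°.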
In an optional embodiment, the determining the output signal of the microphone array from the updated first sound source and the updated third sound source according to a preset third rule specifically includes:
combining the updated first sound source and the third sound source into a fifth sound source;
selecting M sixth sound sources with the maximum output power from the fifth sound sources;
determining an output signal of the microphone array according to a sound source direction in the sixth sound source.
Specifically, the updated sound source information of the m-th first sound source is $T_m = (D_m, P_m')$, and the sound source information of the k-th third sound source is $T_k = (D_k, P_k)$. The M updated first sound sources and the K third sound sources are sorted by output power in descending order and combined into the fifth sound sources, and the M sound sources with the maximum output power are then selected from the fifth sound sources as the sixth sound sources. For each sound source direction among the sixth sound sources, one output signal is then formed by a beam former, thereby realizing multi-signal output.
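A minimal sketch of this final selection follows. It assumes that sound sources are (direction, output power) pairs and stops at choosing the output directions; forming the actual output signal for each direction is left to whichever beam former is in use. The name `select_outputs` is hypothetical.

```python
def select_outputs(updated_first, third, m):
    """Merge updated first and third sound sources and keep the m strongest.

    updated_first: list of (D_m, P_m') pairs after the power update.
    third:         list of (D_k, P_k) pairs for newly observed directions.
    Returns the "sixth sound sources" sorted by descending output power.
    """
    fifth = sorted(updated_first + third, key=lambda s: s[1], reverse=True)
    return fifth[:m]


sixth = select_outputs([(0, 6.0), (90, 2.0)], [(180, 5.0)], 2)
output_directions = [direction for direction, _ in sixth]
print(output_directions)  # directions to hand to the beam former
```

Here the newly observed source at 180° displaces the weaker tracked source at 90°, which is how the tracked set follows changes in the scene.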
In summary, according to the signal output method of the microphone array provided by the embodiment of the present invention, the output power of the input signal on the time axis is calculated, and the sound source information in the time before and after is compared to perform dynamic tracking monitoring on the sound source, so that the incident direction of the sound source can be automatically dynamically selected, and the sound source signals in multiple incident directions are tracked, monitored and output, thereby not only avoiding the problem of missing monitoring in a single direction, but also avoiding high computational power consumption for calculating each path of signal.
It should be understood that all or part of the processes in the method for outputting a signal of a microphone array according to the present invention may also be implemented by a computer program instructing associated hardware, and the computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method for outputting a signal of a microphone array may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable medium may contain other components which may be suitably increased or decreased as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, in accordance with legislation and patent practice, the computer readable medium does not include electrical carrier signals and telecommunications signals.
Fig. 2 is a schematic structural diagram of a preferred embodiment of a signal output apparatus of a microphone array according to the present invention, which is capable of implementing the whole process of the signal output method of the microphone array according to any of the above embodiments.
As shown in fig. 2, the apparatus includes:
an input signal obtaining module 21, which obtains an input signal of each frame of the microphone array, and calculates an output power of each monitoring direction according to the input signal;
a first sound source selecting module 22, configured to select M first sound sources from the nth frame of the input signal according to a preset first rule; wherein N is more than or equal to 1, and M is more than 1;
a second sound source selecting module 23, configured to select Q second sound sources from the input signal of the ith frame according to a preset second rule; wherein i is more than N, and Q is more than or equal to M;
a first sound source updating module 24, configured to update sound source information of a corresponding first sound source according to any one of the second sound sources when a sound source direction of the any one of the second sound sources is the same as a sound source direction of the any one of the first sound sources;
a third sound source selection module 25, configured to acquire the second sound source with a sound source direction different from the sound source direction of the first sound source, as a third sound source;
a signal output module 26, configured to determine an output signal of the microphone array from the updated first sound source and the third sound source according to a preset third rule.
The signal output device of the microphone array provided by the embodiment of the invention can automatically and dynamically select the incident direction of the sound source, and track, monitor and output the sound source signals in a plurality of incident directions, thereby not only avoiding the problem of missing monitoring existing in the process of monitoring only a single direction, but also avoiding high computational power consumption for calculating each path of signal.
Optionally, the apparatus further comprises:
and the monitoring direction dividing module is used for uniformly dividing the monitoring direction into a plurality of parts.
Optionally, the input signal acquiring module 21 specifically includes:
an input signal acquisition unit for acquiring an input signal of each frame of the microphone array;
an output power calculation unit, configured to calculate the output power $P_1$ of each monitoring direction for the input signal of the first frame based on a beamforming algorithm, where the beamforming algorithm is a delay-and-sum beamforming algorithm, a minimum variance distortionless response beamforming algorithm, a linearly constrained minimum variance beamforming algorithm, a generalized sidelobe cancellation algorithm, or a transfer function generalized sidelobe cancellation algorithm;
the output power calculation unit is further configured to calculate the output power $P_i$ of each monitoring direction for the input signal of the i-th frame according to the following formula:
$P_i = a \cdot P_{i0} + (1 - a) \cdot P_{i-1}$
where $P_{i0}$ is the raw output power of any monitoring direction for the input signal of the i-th frame, calculated based on the beamforming algorithm; $P_{i-1}$ is the output power of the corresponding monitoring direction for the input signal of the (i-1)-th frame; and $a$ is a smoothing factor, $0 < a < 1$.
Further, the value of the smoothing factor $a$ varies with the values of $P_{i0}$ and $P_{i-1}$ (a mapping table may be set): when the absolute value of the difference between $P_{i0}$ and $P_{i-1}$ is large, the value of $a$ is small; when the absolute value of the difference is small, the value of $a$ is large. Varying the smoothing factor $a$ in this way makes the voice signal smoother and reduces the degree of noise interference.
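The recursive smoothing P_i = a*P_i0 + (1-a)*P_(i-1) with a difference-dependent smoothing factor can be sketched in Python as follows. This is a minimal sketch, not the implementation of the invention: the function names, the thresholds and the values of a in the mapping table are all hypothetical, since the source only states that a mapping table may be set and that a shrinks as the absolute difference grows.

```python
def smoothing_factor(p_raw, p_prev, table=((0.1, 0.8), (1.0, 0.5)), floor=0.2):
    """Look up a from |p_raw - p_prev|: a larger difference gives a smaller a.

    table: (upper bound on the absolute difference, a) pairs in ascending order;
    floor: the a used when the difference exceeds every bound.
    All numeric values here are hypothetical.
    """
    diff = abs(p_raw - p_prev)
    for bound, a in table:
        if diff <= bound:
            return a
    return floor


def smooth_power(p_raw, p_prev):
    """Apply P_i = a * P_i0 + (1 - a) * P_(i-1) for one monitoring direction."""
    a = smoothing_factor(p_raw, p_prev)
    return a * p_raw + (1 - a) * p_prev


def smooth_frames(raw_frames):
    """raw_frames: per-frame lists of raw beamformer output power, one value per direction."""
    smoothed = [list(raw_frames[0])]  # first frame: P_1 is the raw beamformer output
    for raw in raw_frames[1:]:
        prev = smoothed[-1]
        smoothed.append([smooth_power(r, p) for r, p in zip(raw, prev)])
    return smoothed
```

For example, under these assumed table values a sudden jump from 0.0 to 4.0 in one direction is smoothed with a = 0.2, so only a fifth of the jump reaches the tracked power in that frame.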
Optionally, the first sound source updating module 24 adjusts the output power of the corresponding first sound source according to the following formula:
$P_m' = P_m \cdot b + P_q \cdot (1 - b)$;
where $P_m'$ is the adjusted output power of the m-th first sound source and $P_m$ is its output power before adjustment, $1 \le m \le M$; $P_q$ is the output power of the q-th second sound source, $1 \le q \le Q$; and $b$ is an adjustment factor, $0 < b < 1$.
Further, the value of the adjustment factor $b$ varies with the values of $P_m$ and $P_q$ (a mapping table may be set): when the absolute value of the difference between $P_m$ and $P_q$ is large, the value of $b$ is small; when the absolute value of the difference is small, the value of $b$ is large. Varying the adjustment factor $b$ in this way makes the voice signal smoother and reduces the degree of noise interference.
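The power update P_m' = P_m*b + P_q*(1-b) can likewise be sketched in Python. The source only says that b decreases as the absolute difference between P_m and P_q increases and that a mapping table may be set; the continuous mapping below, its 0.9 ceiling and the `scale` parameter are hypothetical stand-ins chosen so that b stays strictly between 0 and 1.

```python
def adjustment_factor(p_m, p_q, scale=1.0):
    """Hypothetical mapping: b shrinks as |p_m - p_q| grows, staying inside (0, 1)."""
    diff = abs(p_m - p_q)
    return 0.9 / (1.0 + diff / scale)


def update_first_power(p_m, p_q):
    """Apply P_m' = P_m * b + P_q * (1 - b) for one matched pair of sources."""
    b = adjustment_factor(p_m, p_q)
    return p_m * b + p_q * (1 - b)
```

A small difference keeps b close to 0.9, so the tracked first sound source changes slowly; a large difference lowers b, so the new measurement P_q dominates the updated power.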
Optionally, the first sound source selecting module 22 is specifically configured to:
select the M sound sources with the largest output power from the input signal of the N-th frame as the first sound sources;
or,
select, in each of the pre-divided monitoring directions, the sound source with the largest output power as a fourth sound source, and select the M sound sources with the largest output power from the fourth sound sources as the first sound sources.
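The two selection options can be compared with a small Python sketch. It assumes, hypothetically, that the output power is available as a mapping from direction index to power and that each pre-divided part of the monitoring direction is given as a list of direction indices; neither data layout is specified in the source.

```python
import heapq


def top_m(powers, m):
    """First option: the m (direction, power) pairs with the largest power.

    powers: dict mapping direction index -> output power.
    """
    return heapq.nlargest(m, powers.items(), key=lambda item: item[1])


def top_m_per_sector(powers, sectors, m):
    """Second option: keep the strongest direction of each part, then take the top m.

    sectors: list of lists of direction indices (hypothetical layout of the
    pre-divided monitoring directions).
    """
    fourth = {}
    for sector in sectors:
        best = max(sector, key=lambda d: powers[d])
        fourth[best] = powers[best]  # one fourth sound source per part
    return top_m(fourth, m)


# Sample values, chosen only for illustration.
powers = {0: 0.2, 1: 0.9, 2: 0.8, 3: 0.1, 4: 0.5, 5: 0.3}
sectors = [[0, 1, 2], [3, 4, 5]]
```

With these sample values and M = 2, the first option returns directions 1 and 2, which lie in the same part, while the second option returns directions 1 and 4, one from each part.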
Optionally, the signal output module 26 specifically includes:
a fifth sound source combining unit, configured to combine the updated first sound source and the third sound source into a fifth sound source;
a sixth sound source selecting unit, configured to select M sixth sound sources with the largest output power from the fifth sound sources;
a signal output unit for determining an output signal of the microphone array according to a sound source direction in the sixth sound source.
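The three units of the signal output module 26 can be sketched in Python as a merge followed by a top-M selection. As before, the (direction index, power) tuple representation and the function name are assumptions made for illustration; how the output signal is then produced for the selected directions (for example, by taking the beamformer output steered to each of them) is not detailed here.

```python
def select_output_directions(updated_first, third, m):
    """Merge updated first sources with third sources and keep the m strongest.

    Sources are (direction_index, power) tuples (hypothetical representation).
    Returns the sixth sound sources, strongest first.
    """
    fifth = list(updated_first) + list(third)                  # fifth sound sources
    sixth = sorted(fifth, key=lambda s: s[1], reverse=True)[:m]
    return sixth


# Sample values, chosen only for illustration.
updated_first = [(3, 0.82), (10, 0.6)]
third = [(7, 0.7), (12, 0.2)]
sixth = select_output_directions(updated_first, third, m=2)
output_directions = [d for d, _ in sixth]
```

In this example the newly appeared source in direction 7 displaces the weaker tracked source in direction 10, so the output follows directions 3 and 7.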
Fig. 3 is a schematic structural diagram of an alternative embodiment of a signal output device of a microphone array according to the present invention, which is capable of implementing the entire process of the signal output method of the microphone array according to any of the above embodiments.
As shown in Fig. 3, the apparatus includes a memory 31 and a processor 32, wherein the memory 31 stores a computer program configured to be executed by the processor 32; when executed by the processor 32, the computer program implements the signal output method of the microphone array according to any of the embodiments described above.
The signal output equipment of the microphone array provided by the embodiment of the invention can automatically and dynamically select the incident directions of sound sources, and can track, monitor and output the sound source signals in a plurality of incident directions. This avoids both the missed detections that arise when only a single direction is monitored and the high computational cost of calculating every signal path.
Illustratively, the computer program may be divided into one or more modules/units, which are stored in the memory 31 and executed by the processor 32 to implement the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions; these instruction segments describe the execution of the computer program in the signal output device of the microphone array.
The Processor 32 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 31 may be used to store the computer programs and/or modules, and the processor 32 implements various functions of the signal output device of the microphone array by running or executing the computer programs and/or modules stored in the memory 31 and invoking data stored in the memory 31. The memory 31 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, a phonebook, etc.) created according to the use of the cellular phone, and the like. In addition, the memory 31 may include a high speed random access memory, and may also include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other volatile solid state storage device.
It should be noted that the signal output device of the microphone array includes, but is not limited to, a processor and a memory, and those skilled in the art will understand that the schematic diagram of the structure in fig. 3 is only an example of the signal output device of the microphone array, and does not constitute a limitation to the signal output device of the microphone array, and may include more components than those shown in the figure, or some components in combination, or different components.
The above description is only a possible embodiment of the present invention, but the protection scope of the present invention is not limited thereto, and it should be noted that, for those skilled in the art, several equivalent obvious modifications and/or equivalent substitutions can be made without departing from the technical principle of the present invention, and these obvious modifications and/or equivalent substitutions should also be regarded as the protection scope of the present invention.

Claims (10)

1. A signal output method of a microphone array, comprising:
acquiring an input signal of each frame of a microphone array, and calculating the output power of each monitoring direction according to the input signal;
selecting M first sound sources from the input signal of the N-th frame according to a preset first rule, where $N \ge 1$ and $M > 1$;
selecting Q second sound sources from the input signal of the i-th frame according to a preset second rule, where $i > N$ and $Q \ge M$;
when the sound source direction of any one of the second sound sources is the same as the sound source direction of any one of the first sound sources, updating the sound source information of the corresponding first sound source according to the second sound source;
acquiring a second sound source with a sound source direction different from the sound source direction of the first sound source as a third sound source;
determining an output signal of the microphone array from the updated first and third sound sources according to a preset third rule.
2. The method as claimed in claim 1, wherein the obtaining an input signal of each frame of the microphone array and calculating an output power of each listening direction according to the input signal comprises:
acquiring an input signal of each frame of the microphone array;
calculating the output power $P_1$ of each monitoring direction for the input signal of the first frame based on a beamforming algorithm;
calculating the output power $P_i$ of each monitoring direction for the input signal of the i-th frame according to the following formula:
$P_i = a \cdot P_{i0} + (1 - a) \cdot P_{i-1}$
where $P_{i0}$ is the raw output power of any monitoring direction for the input signal of the i-th frame, calculated based on the beamforming algorithm; $P_{i-1}$ is the output power of the corresponding monitoring direction for the input signal of the (i-1)-th frame; and $a$ is a smoothing factor, $0 < a < 1$.
3. The signal output method of the microphone array as set forth in claim 2, wherein the beamforming algorithm is a delay-and-sum beamforming algorithm, a minimum variance distortionless response beamforming algorithm, a linearly constrained minimum variance beamforming algorithm, a generalized sidelobe cancellation algorithm, or a transfer function generalized sidelobe cancellation algorithm.
4. The method for outputting signals of a microphone array according to claim 1, wherein the updating the sound source information of the corresponding first sound source according to the second sound source specifically comprises:
adjusting the output power of the corresponding first sound source according to the following formula:
$P_m' = P_m \cdot b + P_q \cdot (1 - b)$;
where $P_m'$ is the adjusted output power of the m-th first sound source and $P_m$ is its output power before adjustment, $1 \le m \le M$; $P_q$ is the output power of the q-th second sound source, $1 \le q \le Q$; and $b$ is an adjustment factor, $0 < b < 1$.
5. The signal output method of the microphone array according to claim 1, wherein the selecting M first sound sources from the input signal of the nth frame according to a preset first rule includes:
selecting the M sound sources with the largest output power from the input signal of the N-th frame as the first sound sources;
or,
selecting, in each of the pre-divided monitoring directions, the sound source with the largest output power as a fourth sound source, and selecting the M sound sources with the largest output power from the fourth sound sources as the first sound sources.
6. The method for outputting signals of a microphone array according to claim 1, wherein the determining the output signals of the microphone array from the updated first and third sound sources according to a preset third rule comprises:
combining the updated first sound source and the third sound source into a fifth sound source;
selecting M sixth sound sources with the maximum output power from the fifth sound sources;
determining an output signal of the microphone array according to a sound source direction in the sixth sound source.
7. The signal output method of a microphone array according to claim 1, wherein before the acquiring an input signal of each frame of a microphone array, the method further comprises:
evenly dividing the monitoring direction into a plurality of parts.
8. A signal output apparatus of a microphone array, comprising:
an input signal acquisition module, configured to acquire an input signal of each frame of the microphone array and calculate the output power of each monitoring direction according to the input signal;
a first sound source selection module, configured to select M first sound sources from the input signal of the N-th frame according to a preset first rule, where $N \ge 1$ and $M > 1$;
a second sound source selection module, configured to select Q second sound sources from the input signal of the i-th frame according to a preset second rule, where $i > N$ and $Q \ge M$;
a first sound source updating module, configured to, when the sound source direction of any one of the second sound sources is the same as the sound source direction of any one of the first sound sources, update the sound source information of the corresponding first sound source according to that second sound source;
a third sound source selection module, configured to acquire a second sound source having a sound source direction different from the sound source direction of the first sound source, as a third sound source;
a signal output module, configured to determine the output signal of the microphone array from the updated first sound source and the third sound source according to a preset third rule.
9. A signal output device of a microphone array, characterized in that the device comprises a memory, a processor and a computer program stored in the memory and configured to be executed by the processor, the computer program, when executed by the processor, implementing a signal output method of a microphone array as claimed in any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a computer program is stored therein, which when executed implements a signal output method of a microphone array according to any one of claims 1 to 7.
CN202110716886.XA 2021-06-25 2021-06-25 Signal output method, device, equipment and storage medium of microphone array Pending CN113470683A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110716886.XA CN113470683A (en) 2021-06-25 2021-06-25 Signal output method, device, equipment and storage medium of microphone array

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110716886.XA CN113470683A (en) 2021-06-25 2021-06-25 Signal output method, device, equipment and storage medium of microphone array

Publications (1)

Publication Number Publication Date
CN113470683A true CN113470683A (en) 2021-10-01

Family

ID=77873178

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110716886.XA Pending CN113470683A (en) 2021-06-25 2021-06-25 Signal output method, device, equipment and storage medium of microphone array

Country Status (1)

Country Link
CN (1) CN113470683A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090207131A1 (en) * 2008-02-19 2009-08-20 Hitachi, Ltd. Acoustic pointing device, pointing method of sound source position, and computer system
US20110038229A1 (en) * 2009-08-17 2011-02-17 Broadcom Corporation Audio source localization system and method
JP2012042664A (en) * 2010-08-18 2012-03-01 Nippon Telegr & Teleph Corp <Ntt> Sound source parameter estimating device, sound source separating device and their method, and program and memory medium
CN107144820A (en) * 2017-06-21 2017-09-08 歌尔股份有限公司 Sound localization method and device
CN110095755A (en) * 2019-04-01 2019-08-06 北京云知声信息技术有限公司 A kind of sound localization method
CN111696573A (en) * 2020-05-20 2020-09-22 湖南湘江地平线人工智能研发有限公司 Sound source signal processing method and device, electronic equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination