CN114173273A - Microphone array detection method, related device and readable storage medium - Google Patents

Publication number
CN114173273A
Authority
CN
China
Prior art keywords
microphone array
sound source
determining
microphone
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111612528.0A
Other languages
Chinese (zh)
Other versions
CN114173273B (en)
Inventor
许凌
庄永平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd
Priority to CN202111612528.0A
Publication of CN114173273A
Application granted
Publication of CN114173273B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R29/00 Monitoring arrangements; Testing arrangements
    • H04R29/001 Monitoring arrangements; Testing arrangements for loudspeakers
    • H04R29/002 Loudspeaker arrays
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0264 Noise filtering characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L2021/02161 Number of inputs available containing the signal or the noise to be suppressed
    • G10L2021/02166 Microphone arrays; Beamforming

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)

Abstract

The application discloses a microphone array detection method, a related device and a readable storage medium. For each microphone array group in the microphone array to be detected, the measured orientation of each sound source relative to the microphone array group is determined, and the estimated orientation of the sound source is determined based on the audio signal group acquired by the microphone array group for that sound source. A detection result of the microphone array group is determined based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group. The detection result of the microphone array is then determined based on the detection results of the individual microphone array groups. With this scheme, detection of the microphone array can be realized.

Description

Microphone array detection method, related device and readable storage medium
Technical Field
The present application relates to the field of speech processing technologies, and in particular, to a microphone array detection method, a related device, and a readable storage medium.
Background
With the continuous development of science and technology, some scenarios require recording speech and converting the recorded speech into text. For example, in a single-speaker scenario such as a lecture, the course speech needs to be recorded and converted into a course summary; in a multi-person discussion scenario such as a conference, the conference speech needs to be recorded and converted into a conference summary. Because microphone arrays offer high recording quality, more and more users choose them.
To ensure the reliability of a microphone array, the microphone array needs to be detected before it leaves the factory.
Therefore, how to provide a microphone array detection method becomes a technical problem to be solved by those skilled in the art.
Disclosure of Invention
In view of the foregoing, the present application provides a microphone array detection method, a related device and a readable storage medium. The specific scheme is as follows:
a microphone array detection method, the method comprising:
determining a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and each two microphone elements form a microphone array group;
for each microphone array group, determining a measured orientation of each sound source relative to the microphone array group, and determining an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
determining a detection result of the microphone array based on a detection result of each microphone array group.
Optionally, the determining an estimated orientation of the sound source based on the audio signal group acquired by the microphone array group for the sound source comprises:
converting the audio signal group acquired by the microphone array group for the sound source into a plurality of discrete digital signal groups through a windowing mechanism;
for each window, calculating an estimated orientation within the window based on the discrete digital signal group corresponding to the window;
determining the estimated orientation of the sound source based on the estimated orientations within the respective windows.
Optionally, the calculating an estimated orientation within the window based on the discrete digital signal group corresponding to the window comprises:
calculating an optimal time difference based on a cross-correlation of two discrete digital signals in the discrete digital signal group;
calculating the estimated orientation within the window according to the propagation speed of sound in air, the optimal time difference and the distance between the two microphone elements in the microphone array group.
Optionally, the determining the estimated orientation of the sound source based on the estimated orientations within the respective windows comprises:
screening the estimated orientations within the respective windows to determine the estimated orientations within the retained windows;
performing mean filtering on the estimated orientations within the retained windows to obtain the estimated orientation of the sound source.
Optionally, the determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group comprises:
for each sound source, determining an orientation deviation value of the sound source according to the measured orientation of the sound source relative to the microphone array group and the estimated orientation of the sound source determined by the microphone array group;
if the orientation deviation values of all the sound sources are not greater than a first preset threshold, determining that the microphone array group passes the detection;
if the orientation deviation value of any sound source is greater than the first preset threshold, determining that the microphone array group fails the detection.
Optionally, if the orientation deviation value of any sound source is greater than the first preset threshold, the method further comprises:
judging whether the orientation deviation value of the sound source is greater than a second preset threshold, wherein the second preset threshold is greater than the first preset threshold;
if the orientation deviation value that is greater than the first preset threshold is not greater than the second preset threshold, judging whether the microphone array has a correction flag; if the microphone array has no correction flag, correcting the microphone array and adding the correction flag to the microphone array after the correction.
Optionally, determining the detection result of the microphone array based on the detection result of each microphone array group includes:
judging whether each microphone array group passes the detection;
determining that the microphone array passes detection if each microphone array group passes detection;
determining that the microphone array fails the detection if any microphone array group fails the detection.
A microphone array detection apparatus, the apparatus comprising:
a determining unit, configured to determine a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and every two microphone elements form a microphone array group;
a microphone array group detection unit, configured to determine, for each microphone array group, a measured orientation of each sound source relative to the microphone array group, and to determine an estimated orientation of the sound source based on the audio signal group acquired by the microphone array group for the sound source; and to determine a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
a microphone array detection unit, configured to determine the detection result of the microphone array based on the detection result of each microphone array group.
Optionally, the microphone array group detection unit includes an estimated orientation determining unit;
the estimated orientation determining unit includes:
a signal conversion unit, configured to convert the audio signal group acquired by the microphone array group for the sound source into a plurality of discrete digital signal groups through a windowing mechanism;
an in-window estimated orientation calculation unit, configured to calculate, for each window, the estimated orientation within the window based on the discrete digital signal group corresponding to the window;
a sound source estimated orientation determining unit, configured to determine the estimated orientation of the sound source based on the estimated orientations within the respective windows.
Optionally, the in-window estimated orientation calculation unit includes:
an optimal time difference calculation unit, configured to calculate an optimal time difference based on the cross-correlation of two discrete digital signals in the discrete digital signal group;
an estimated orientation calculation unit, configured to calculate the estimated orientation within the window according to the propagation speed of sound in air, the optimal time difference and the distance between the two microphone elements in the microphone array group.
Optionally, the sound source estimated orientation determining unit includes:
a screening unit, configured to screen the estimated orientations within the respective windows and determine the estimated orientations within the retained windows;
a mean filtering unit, configured to perform mean filtering on the estimated orientations within the retained windows to obtain the estimated orientation of the sound source.
Optionally, the microphone array group detection unit includes a microphone array group detection result determining unit, which includes:
a deviation value determining unit, configured to determine, for each sound source, an orientation deviation value of the sound source according to the measured orientation of the sound source relative to the microphone array group and the estimated orientation of the sound source determined by the microphone array group;
a detection result determining unit, configured to determine that the microphone array group passes the detection if the orientation deviation values of all the sound sources are not greater than a first preset threshold, and to determine that the microphone array group fails the detection if the orientation deviation value of any sound source is greater than the first preset threshold.
Optionally, the apparatus further comprises:
a correction processing unit, configured to: if the orientation deviation value of any sound source is greater than the first preset threshold, judge whether the orientation deviation value of the sound source is greater than a second preset threshold, wherein the second preset threshold is greater than the first preset threshold; if the orientation deviation value that is greater than the first preset threshold is not greater than the second preset threshold, judge whether the microphone array has a correction flag; and if the microphone array has no correction flag, correct the microphone array and add the correction flag to the microphone array after the correction.
Optionally, the microphone array detection unit specifically includes:
a judging unit, configured to judge whether each microphone array group passes the detection;
a processing unit, configured to determine that the microphone array passes the detection if every microphone array group passes the detection, and to determine that the microphone array fails the detection if any microphone array group fails the detection.
A microphone array detection apparatus comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of the microphone array detection method as described above.
A readable storage medium having stored thereon a computer program which, when being executed by a processor, carries out the steps of the microphone array detection method as described above.
By means of the above technical scheme, the application discloses a microphone array detection method, a related device and a readable storage medium. For each microphone array group in the microphone array to be detected, the measured orientation of each sound source relative to the microphone array group is determined, and the estimated orientation of the sound source is determined based on the audio signal group acquired by the microphone array group for that sound source. A detection result of the microphone array group is determined based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group. The detection result of the microphone array is then determined based on the detection results of the individual microphone array groups. With this scheme, detection of the microphone array can be realized.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
fig. 1 is a schematic diagram of a linear microphone array including two microphone elements according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of an audio signal collected by a microphone array under normal conditions according to an embodiment of the disclosure;
FIG. 3 is a schematic diagram of an audio signal collected by a microphone array under abnormal conditions according to an embodiment of the disclosure;
fig. 4 is a schematic structural diagram of a microphone array detection apparatus disclosed in an embodiment of the present application;
fig. 5 is a schematic flow chart of a microphone array detection method disclosed in an embodiment of the present application;
FIG. 6 is a schematic diagram of an estimated azimuth distribution within a window disclosed in an embodiment of the present application;
fig. 7 is a schematic structural diagram of a microphone array detection apparatus disclosed in an embodiment of the present application;
fig. 8 is a block diagram of a hardware structure of a microphone array detection apparatus according to an embodiment of the present disclosure.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In order to detect the microphone array, the inventor researches and discovers that:
at present, a microphone array mostly adopts a scheme of realizing voice enhancement by adopting a beam forming technology, and the principle of the scheme is that sound source direction estimation is firstly carried out on audio signals collected by the microphone array, then, sound in the sound source direction is enhanced, and sound in other directions is suppressed. However, in the actual use process of the microphone array, the audio signal collected by the microphone array may deviate from the actual audio signal for some reasons, so that the sound source direction estimated by the microphone deviates from the actual sound source direction, which may result in enhancing invalid speech, suppressing valid speech, and affecting the reliability of the microphone array.
In order to understand the deviation of the audio signal collected by the microphone array from the actual audio signal, a linear microphone array comprising two microphone elements is used in this application.
Referring to fig. 1, fig. 1 is a schematic diagram of a linear microphone array including two microphone elements, mic1 and mic2, according to an embodiment of the present application. It is assumed that sound propagates in the direction from mic1 to mic2.
Referring to fig. 2, fig. 2 is a schematic diagram of the audio signals collected by the microphone array under normal conditions, as disclosed in an embodiment of the present application. In fig. 2, the upper curve is the audio signal collected by mic1 and the lower curve is the audio signal collected by mic2. As the vertical line in fig. 2 shows, the sound reaches mic1 first and then mic2, which matches the actual propagation path of the sound.
Referring to fig. 3, fig. 3 is a schematic diagram of the audio signals collected by the microphone array under abnormal conditions, as disclosed in an embodiment of the present application. In fig. 3, the upper curve is the audio signal collected by mic1 and the lower curve is the audio signal collected by mic2. As the vertical line in fig. 3 shows, the sound reaches mic2 before mic1, which does not match the actual propagation path of the sound; that is, the audio signals collected by the microphone array deviate from the actual audio signals.
In order to be more intuitive, in the present application, a relationship between a deviation of sampling points of audio signals collected by two microphone elements and a relative distance error between the two microphone elements is used to explain a deviation between the audio signals collected by a microphone array and actual audio signals, specifically as follows:
The error value D of the relative spacing between the microphone elements caused by a sampling-point deviation is calculated as:
$$D = \frac{n \cdot v}{S}$$
where S is the sampling rate of the microphone array, n is the number of sampling points by which the sampled data of mic2 deviates from its own aligned position, and v is the propagation speed of sound in air. When the consistency of the sampled data deviates, the relative error in the microphone element spacing can be calculated from this formula. With S = 16 kHz and v = 340 m/s, a deviation of one sampling point produces a relative spacing error of 21.25 mm. Assuming n = -1, in an array with a spacing of 10 mm the spacing becomes -11.25 mm once the error is superimposed; that is, mic2 is taken to be on the left of mic1, and the calculated sound propagation direction is opposite to the real one.
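The arithmetic above can be checked with a short sketch. It assumes the error formula is D = n · v / S, which reproduces the 21.25 mm figure quoted in the text; the function name `spacing_error_mm` and its parameter names are illustrative and do not come from the application.

```python
def spacing_error_mm(n, sample_rate_hz=16000, sound_speed_m_s=340.0):
    """Relative spacing error D = n * v / S, returned in millimetres.

    Assumption: this is the formula behind the 21.25 mm figure in the text.
    """
    return n * sound_speed_m_s * 1000.0 / sample_rate_hz


nominal_spacing_mm = 10.0          # element spacing used in the text's example
error_mm = spacing_error_mm(-1)    # n = -1: one sampling point of deviation
apparent_spacing_mm = nominal_spacing_mm + error_mm

print(error_mm, apparent_spacing_mm)  # -21.25 -11.25
```

A negative apparent spacing is what makes the estimated propagation direction flip relative to the real one.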
In view of the above problems, the inventors conducted in-depth research and concluded that whether the audio signal acquired by a microphone array is consistent with the real audio signal is also a factor affecting the reliability of the microphone array, and finally proposed a microphone array detection apparatus and a microphone array detection method.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a microphone array detection device disclosed in an embodiment of the present application. As shown in fig. 4, the microphone array detection device includes a silencing box and a detection terminal. The silencing box contains the microphone array to be detected and at least one sound source. The microphone array to be detected is connected to the detection terminal through a USB cable, the at least one sound source is connected to the detection terminal through an audio cable, and the detection terminal and the at least one sound source are each connected to an external power supply. Microphone array detection software installed in the detection terminal executes the microphone array detection method.
As an implementable mode, a groove for fixing the microphone array to be detected may further be formed in the silencing box. The groove lets inspectors place the microphone array quickly and consistently, which improves detection efficiency.
It should be noted that the number of sound sources may be configured based on the number of beams designed for the microphone array to be detected, with at least one standard sound source configured within each beam range, so as to improve the accuracy of the detection result.
Next, a microphone array detection method provided by the present application is described by the following embodiments.
Referring to fig. 5, fig. 5 is a schematic flowchart of a microphone array detection method disclosed in an embodiment of the present application, where the method may include:
Step S101: determining a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and every two microphone elements form a microphone array group.
In this application, the detection terminal may determine the microphone array to be detected and the at least one sound source based on interface signals. For example, the device connected to the USB interface is the microphone array to be detected, and the device connected to the audio line interface is the at least one sound source.
A microphone array often includes two or more microphone elements. In this application, every two microphone elements form one microphone array group. For example, a linear microphone array with two microphone elements contains one microphone array group, and a linear microphone array with three microphone elements contains multiple microphone array groups.
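As a minimal sketch of the grouping, the following assumes that "every two microphone elements form a microphone array group" means every unordered pair of elements; the application does not state whether only adjacent elements are paired. The function name `microphone_array_groups` and the element labels are illustrative.

```python
from itertools import combinations


def microphone_array_groups(element_ids):
    """Return every unordered pair of microphone elements as one group.

    Assumption: all pairs are used, not only adjacent elements.
    """
    return list(combinations(element_ids, 2))


print(microphone_array_groups(["mic1", "mic2"]))
print(microphone_array_groups(["mic1", "mic2", "mic3"]))
```

Under this assumption a two-element array yields one group and a three-element array yields three.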
Step S102: for each microphone array group, determining a measured orientation of each sound source relative to the microphone array group, and determining an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group.
In this application, the measured orientation of each sound source relative to the microphone array group may be obtained with an angle-measuring tool such as a universal protractor or a goniometer.
For one microphone array group, each microphone element in the group captures the sound source to obtain an audio signal, and the audio signals captured by the two microphone elements together form the audio signal group acquired by the microphone array group for the sound source.
It should be noted that, in the present application, the estimated orientation of the sound source may be determined based on the cross-correlation between the two audio signals in the audio signal group acquired by the microphone array group for the sound source; a specific implementation is described in detail in the following embodiments.
It should also be noted that the measured orientation of each sound source relative to the microphone array group may be compared with the estimated orientation of that sound source determined by the microphone array group to determine the detection result of the microphone array group; a specific implementation is described in detail in the following embodiments.
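The comparison can be sketched using the first and second preset thresholds described in the disclosure above. This is an illustration under stated assumptions, not the disclosed implementation: the deviation is taken as the absolute angle difference in degrees, the threshold values in the usage lines are hypothetical, and a group that would need correction but already carries the correction flag is treated as failing, which the text here does not specify.

```python
def check_group(measured_deg, estimated_deg, first_threshold_deg,
                second_threshold_deg, already_corrected):
    """Classify one microphone array group as 'pass', 'correct' or 'fail'."""
    # Assumption: deviation is the absolute difference of the two angles.
    deviations = [abs(m - e) for m, e in zip(measured_deg, estimated_deg)]
    worst = max(deviations)
    if worst <= first_threshold_deg:
        return "pass"        # every sound source is within the first threshold
    if worst <= second_threshold_deg and not already_corrected:
        return "correct"     # caller corrects the array, then sets the flag
    return "fail"            # assumption: no second correction attempt


# Hypothetical thresholds of 5 and 15 degrees, two sound sources.
print(check_group([30.0, 90.0], [31.0, 92.0], 5.0, 15.0, False))
print(check_group([30.0, 90.0], [40.0, 92.0], 5.0, 15.0, False))
```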
Step S103: determining a detection result of the microphone array based on a detection result of each microphone array group.
The detection result of each microphone array group can be obtained by performing step S102 multiple times, and the detection result of the microphone array can then be determined based on these results. Specifically, it is judged whether each microphone array group passes the detection; if every microphone array group passes, the microphone array is determined to pass the detection; if any microphone array group fails, the microphone array is determined to fail the detection.
Of course, as another possible implementation manner, if the proportion of the microphone array groups passing the detection to all the microphone array groups exceeds a preset threshold value, the microphone array can also be determined to pass the detection, otherwise, the microphone array is determined not to pass the detection.
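Both aggregation rules can be sketched in a few lines. The function name `array_passes` and the parameter `min_pass_ratio` are illustrative; the application only speaks of "a preset threshold value" and gives no number.

```python
def array_passes(group_results, min_pass_ratio=None):
    """Combine per-group results (True = passed) into one array-level result.

    With min_pass_ratio=None every group must pass; otherwise the share of
    passing groups must exceed min_pass_ratio.
    """
    if not group_results:
        raise ValueError("at least one microphone array group is required")
    if min_pass_ratio is None:
        return all(group_results)
    return sum(group_results) / len(group_results) > min_pass_ratio


print(array_passes([True, True, False]))                      # strict rule
print(array_passes([True, True, False], min_pass_ratio=0.6))  # ratio rule
```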
It should be noted that after determining the detection result of the microphone array, the detection result of the microphone array may also be output and displayed on the detection terminal.
This embodiment discloses a microphone array detection method. For each microphone array group in the microphone array to be detected, the measured orientation of each sound source relative to the microphone array group is determined, and the estimated orientation of the sound source is determined based on the audio signal group acquired by the microphone array group for that sound source. A detection result of the microphone array group is determined based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group. The detection result of the microphone array is then determined based on the detection results of the individual microphone array groups. With this scheme, detection of the microphone array can be realized.
Another embodiment of the present application describes a specific implementation of determining the estimated orientation of the sound source based on the audio signal group acquired by the microphone array group for the sound source in step S102, which may include the following steps:
Step S201: converting the audio signal group acquired by the microphone array group for the sound source into a plurality of discrete digital signal groups through a windowing mechanism.
In this application, while the microphone array group is capturing the sound source, a rectangular window w(t) may be applied to the audio signals acquired by the two microphone elements in the microphone array group:
$$w(t) = \begin{cases} 1, & 0 \le t \le T \\ 0, & \text{otherwise} \end{cases}$$
where T is the time length of the window. The size of the rectangular window is 256 sampling points, and the audio signal group acquired by the microphone array group for the sound source is converted into a plurality of discrete digital signal groups through this windowing mechanism.
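A minimal sketch of the windowing step follows. It assumes non-overlapping rectangular windows of 256 samples and drops any trailing partial window; the application does not say whether windows overlap. The function name `split_into_windows` is illustrative.

```python
def split_into_windows(mic1_samples, mic2_samples, window_size=256):
    """Cut two equally long sample sequences into aligned windows.

    Each returned pair is one "discrete digital signal group".
    Assumption: windows do not overlap and a trailing partial window is dropped.
    """
    if len(mic1_samples) != len(mic2_samples):
        raise ValueError("both channels must have the same number of samples")
    groups = []
    for start in range(0, len(mic1_samples) - window_size + 1, window_size):
        end = start + window_size
        groups.append((mic1_samples[start:end], mic2_samples[start:end]))
    return groups


# 600 samples per channel give two full windows of 256 samples each.
windows = split_into_windows([0.0] * 600, [0.0] * 600)
print(len(windows), len(windows[0][0]))  # 2 256
```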
Step S202: for each window, calculating the estimated orientation within the window based on the discrete digital signal group corresponding to the window.
As an implementation, calculating the estimated orientation within the window based on the discrete digital signal group corresponding to the window may include: calculating an optimal time difference based on a cross-correlation of the two discrete digital signals in the discrete digital signal group; and calculating the estimated orientation within the window according to the propagation speed of sound in air, the optimal time difference and the distance between the two microphone elements in the microphone array group.
It should be noted that the time difference at which the cross-correlation between the two discrete digital signals in the discrete digital signal group is maximal is the optimal time difference.
For ease of understanding, assume that h1 represents the audio signal of microphone element 1, h2 represents the audio signal of microphone element 2, n1(t) represents the noise signal collected by microphone element 1 at time t, n2(t) represents the noise signal collected by microphone element 2 at time t, and s(t) represents the sound source signal collected by microphone element 1 at time t. Assuming that L represents the time difference between the sound source arriving at microphone element 1 and at microphone element 2, s(t-L) represents the sound source signal collected by microphone element 2 at time t, and the signals r1 and r2 are as follows:
r1(t)=h1×w(t)=s(t)+n1(t)
r2(t)=h2×w(t)=s(t-L)+n2(t)
At a sampling rate of 16 kHz, the time interval u between two adjacent sampling points is:
u = 1/16000 s = 62.5 μs
That is, the minimum time step is 62.5 μs, and the time length of the discrete digital signal group corresponding to one window is T = u × 256 = 16 ms. Since s(t), n1(t) and n2(t) are mutually uncorrelated in the wide sense, the cross-correlation of the signals r1 and r2 is as follows:
R1,2(τ) = E[r1(t) × r2(t+τ)] = E[s(t) × s(t+τ-L)]
When R1,2(τ) reaches its maximum, τ is equal to N (in sampling points), and the optimal time difference β is 62.5 μs × N.
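The search for the optimal time difference can be sketched as follows. This is a minimal illustration, assuming a direct time-domain sum over one window and a caller-supplied search range `max_lag`. The function names are hypothetical, and the sign convention (a positive lag means microphone element 2 receives the sound later) is chosen for the sketch.

```python
from typing import Sequence

SAMPLE_RATE_HZ = 16_000                  # sampling rate stated in the text
SAMPLE_PERIOD_S = 1.0 / SAMPLE_RATE_HZ   # u = 62.5 microseconds


def best_lag(r1: Sequence[float], r2: Sequence[float], max_lag: int) -> int:
    """Return the lag N (in samples) maximizing sum_t r1[t] * r2[t + N]."""
    n = len(r1)
    if len(r2) != n:
        raise ValueError("both signals must have the same length")
    best_n, best_value = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for t in range(n):
            j = t + lag
            if 0 <= j < n:  # ignore samples that fall outside the window
                total += r1[t] * r2[j]
        if total > best_value:
            best_n, best_value = lag, total
    return best_n


def optimal_time_difference(r1: Sequence[float], r2: Sequence[float],
                            max_lag: int) -> float:
    """Optimal time difference beta = u * N, in seconds."""
    return best_lag(r1, r2, max_lag) * SAMPLE_PERIOD_S
```

A natural choice for `max_lag` is the largest physically possible delay, D / v, expressed in samples; that choice is likewise an assumption rather than something the text specifies.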
Assuming that v represents the propagation speed of sound in air, D represents the distance between the two microphone elements in the microphone array, and α represents the estimated orientation of the sound source, the estimated orientation of the sound source can be calculated from v, D and the optimal time difference β based on the formula:
Figure BDA0003435889950000113
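The relation between the optimal time difference and the estimated orientation can be sketched as follows. This sketch assumes a far-field model in which α is measured from the line joining the two microphone elements, so that cos α = v × β / D; the angle convention actually used in this embodiment may differ. The function name and the value of 343 m/s for v are also assumptions.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees Celsius


def estimate_bearing_deg(beta_s: float, mic_distance_m: float,
                         v: float = SPEED_OF_SOUND_M_S) -> float:
    """Estimated orientation in degrees under the assumed far-field model.

    Assumption: alpha = arccos(v * beta / D), measured from the axis
    through the two microphone elements.
    """
    if mic_distance_m <= 0:
        raise ValueError("mic_distance_m must be positive")
    ratio = v * beta_s / mic_distance_m
    # Noise can push the ratio slightly outside [-1, 1]; clamp before acos.
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.acos(ratio))
```

Under this convention a zero time difference corresponds to a source broadside to the pair (90 degrees), and the largest possible delay corresponds to a source on the axis.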
Step S203: determining the estimated orientation of the sound source based on the estimated orientations within the respective windows.
As an implementation, determining the estimated orientation of the sound source based on the estimated orientations within the respective windows may include: screening the estimated orientations within the windows to determine the estimated orientations of the retained windows; and performing mean filtering on the estimated orientations of the retained windows to obtain the estimated orientation of the sound source.
Based on step S202, the estimated orientations within the respective windows can be obtained. In the present application, these estimated orientations can be screened according to how densely they are distributed, so as to determine the estimated orientations of the retained windows. For ease of understanding, refer to FIG. 6, which is a distribution diagram of the estimated orientations within the windows disclosed in an embodiment of the present application. As can be seen from FIG. 6, the estimated orientations within the windows are (αn1, ……, αnn), and the estimated orientations of the retained windows are those lying in the most densely distributed region of the diagram, i.e., (αn3, αn4, …, αnj).
It should be noted that, by screening the estimated orientations in the windows, the influence of noise data on the detection result can be reduced, and the accuracy of the detection result can be improved.
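The screening and mean filtering of step S203 can be sketched as follows. The text only states that screening is based on how densely the estimates are distributed; the concrete rule used here (keep the estimates inside the fixed-width interval that contains the most estimates) and the parameter `half_width_deg` are assumptions of this sketch.

```python
from typing import List, Sequence, Tuple


def screen_and_average(bearings_deg: Sequence[float],
                       half_width_deg: float = 5.0) -> Tuple[float, List[float]]:
    """Keep the densest cluster of per-window estimates and average it.

    Returns (mean of retained estimates, retained estimates).
    """
    if not bearings_deg:
        raise ValueError("at least one per-window estimate is required")
    retained: List[float] = []
    for center in bearings_deg:
        # Estimates lying within +/- half_width_deg of this candidate center.
        nearby = [b for b in bearings_deg if abs(b - center) <= half_width_deg]
        if len(nearby) > len(retained):
            retained = nearby
    return sum(retained) / len(retained), retained
```

For example, with per-window estimates of 30, 31, 29, 30.5, 80 and 10 degrees, the outliers at 80 and 10 degrees are discarded and the remaining four values are averaged.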
In another embodiment of the present application, a specific implementation of determining, in step S102, the detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group is described, which includes:
Step S301: for each sound source, determining an orientation deviation value of the sound source according to the measured orientation of the sound source relative to the microphone array group and the estimated orientation of the sound source determined by the microphone array group; if the orientation deviation values of all the sound sources are not greater than a first preset threshold, executing step S302; if there is a sound source whose orientation deviation value is greater than the first preset threshold, executing step S303.
Step S302: determining that the microphone array group passes detection.
Step S303: determining that the microphone array group fails detection.
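Steps S301 to S303 can be sketched as follows. The function names are hypothetical, the deviation is taken as the absolute difference in degrees (an assumption), and the first preset threshold is passed in by the caller because its value is not given in this passage.

```python
from typing import List, Sequence


def orientation_deviations(measured_deg: Sequence[float],
                           estimated_deg: Sequence[float]) -> List[float]:
    """Per-source deviation between measured and estimated orientation."""
    if len(measured_deg) != len(estimated_deg):
        raise ValueError("one measured and one estimated value per sound source")
    return [abs(m - e) for m, e in zip(measured_deg, estimated_deg)]


def group_passes(measured_deg: Sequence[float],
                 estimated_deg: Sequence[float],
                 first_threshold_deg: float) -> bool:
    """True (S302) if no deviation exceeds the first threshold, else False (S303)."""
    deviations = orientation_deviations(measured_deg, estimated_deg)
    return all(d <= first_threshold_deg for d in deviations)
```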
In another embodiment of the present application, if there is a sound source whose orientation deviation value is greater than the first preset threshold, the method may further include: judging whether that orientation deviation value is greater than a second preset threshold (for example, 20 degrees), where the second preset threshold is greater than the first preset threshold; if the orientation deviation value that is greater than the first preset threshold is not greater than the second preset threshold, judging whether the microphone array has a correction flag; if the microphone array does not have the correction flag, correcting the microphone array and adding the correction flag to the microphone array after the correction; otherwise, not correcting the microphone array.
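The correction decision can be sketched as follows. The class and function names are hypothetical, the correction procedure itself is not described in this passage and is represented only by setting the flag, and treating any deviation above the second threshold as ruling out correction is an assumption about the multi-source case.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class UnitUnderTest:
    corrected: bool = False  # the correction mark (flag) described in the text


def maybe_correct(unit: UnitUnderTest,
                  deviations_deg: Sequence[float],
                  first_threshold_deg: float,
                  second_threshold_deg: float = 20.0) -> bool:
    """Return True if a correction is performed during this call."""
    if second_threshold_deg <= first_threshold_deg:
        raise ValueError("second threshold must exceed the first threshold")
    exceeding = [d for d in deviations_deg if d > first_threshold_deg]
    if not exceeding:
        return False  # nothing exceeds the first threshold
    if any(d > second_threshold_deg for d in exceeding):
        return False  # assumed: deviation too large to correct
    if unit.corrected:
        return False  # already corrected once; do not correct again
    # Placeholder for the actual correction step, which is not specified here.
    unit.corrected = True
    return True
```

The flag ensures that a unit is corrected at most once, so a unit that still deviates after correction is not corrected repeatedly.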
The following describes a microphone array detection device disclosed in an embodiment of the present application, and the microphone array detection device described below and the microphone array detection method described above may be referred to in correspondence with each other.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a microphone array detection device disclosed in the embodiment of the present application. As shown in fig. 7, the microphone array detecting apparatus may include:
a determining unit 11, configured to determine a microphone array to be detected and at least one sound source, where the microphone array includes a plurality of microphone elements, and every two microphone elements form a microphone array group;
a microphone array group detection unit 12, configured to determine, for each microphone array group, a measured orientation of each sound source relative to the microphone array group, and determine an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; and to determine a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
a microphone array detection unit 13, configured to determine a detection result of the microphone array based on the detection result of each microphone array group.
As one possible implementation, the microphone array group detection unit includes an estimated orientation determining unit;
the estimated orientation determining unit includes:
a signal conversion unit, configured to convert the audio signal group acquired by the microphone array group for the sound source into a plurality of discrete digital signal groups through a windowing mechanism;
an in-window estimated orientation calculation unit, configured to calculate, for the discrete digital signal group corresponding to each window, the estimated orientation within the window based on that discrete digital signal group;
a sound source estimated orientation determining unit, configured to determine the estimated orientation of the sound source based on the estimated orientations within the respective windows.
As one possible implementation, the in-window estimated orientation calculation unit includes:
an optimal time difference calculation unit, configured to calculate an optimal time difference based on the cross-correlation of the two discrete digital signals in the discrete digital signal group;
an estimated orientation calculation unit, configured to calculate the estimated orientation within the window according to the propagation speed of sound in air, the optimal time difference, and the distance between the two microphone elements in the microphone array group.
As one possible implementation, the sound source estimated orientation determining unit includes:
a screening unit, configured to screen the estimated orientations within the windows and determine the estimated orientations of the retained windows;
a mean filtering unit, configured to perform mean filtering on the estimated orientations of the retained windows to obtain the estimated orientation of the sound source.
As one possible implementation, the microphone array group detection unit includes a detection result determining unit of the microphone array group; the detection result determining unit of the microphone array group includes:
a deviation value determining unit, configured to determine, for each sound source, an orientation deviation value of the sound source according to the measured orientation of the sound source relative to the microphone array group and the estimated orientation of the sound source determined by the microphone array group;
a detection result determining unit, configured to determine that the microphone array group passes detection if the orientation deviation values of all the sound sources are not greater than a first preset threshold, and to determine that the microphone array group fails detection if there is a sound source whose orientation deviation value is greater than the first preset threshold.
As one possible implementation, the apparatus further includes:
a correction processing unit, configured to: if there is a sound source whose orientation deviation value is greater than the first preset threshold, judge whether that orientation deviation value is greater than a second preset threshold, where the second preset threshold is greater than the first preset threshold; if the orientation deviation value that is greater than the first preset threshold is not greater than the second preset threshold, judge whether the microphone array has a correction flag; and if the microphone array does not have the correction flag, correct the microphone array and add the correction flag to the microphone array after the correction.
As one possible implementation, the microphone array detection unit specifically includes:
a judging unit, configured to judge whether each microphone array group passes detection;
a processing unit, configured to determine that the microphone array passes detection if every microphone array group passes detection, and to determine that the microphone array fails detection if there is a microphone array group that fails detection.
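The aggregation over microphone array groups can be sketched as follows. The function names are hypothetical, and reading "every two microphone elements form a microphone array group" as all unordered pairs of elements is an assumption of this sketch.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Group = Tuple[int, int]


def microphone_groups(num_elements: int) -> List[Group]:
    """All unordered pairs of element indices (assumed meaning of 'every two')."""
    return list(combinations(range(num_elements), 2))


def array_passes(group_results: Dict[Group, bool]) -> bool:
    """The array passes only if every microphone array group passes."""
    if not group_results:
        raise ValueError("no microphone array group results supplied")
    return all(group_results.values())
```

Under this reading, an array with 4 elements yields 6 groups, and a single failing group is enough for the whole array to fail.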
Referring to FIG. 8, FIG. 8 is a block diagram of a hardware structure of a microphone array detection apparatus according to an embodiment of the present application. As shown in FIG. 8, the hardware structure of the microphone array detection apparatus may include: at least one processor 1, at least one communication interface 2, at least one memory 3 and at least one communication bus 4;
in the embodiment of the present application, there is at least one each of the processor 1, the communication interface 2, the memory 3 and the communication bus 4, and the processor 1, the communication interface 2 and the memory 3 communicate with one another through the communication bus 4;
the processor 1 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, etc.;
the memory 3 may include a high-speed RAM, and may further include a non-volatile memory, such as at least one disk memory;
wherein the memory stores a program and the processor can call the program stored in the memory, the program for:
determining a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and each two microphone elements form a microphone array group;
for each microphone array group, determining a measured orientation of each sound source relative to the microphone array group, and determining an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
determining a detection result of the microphone array based on a detection result of each microphone array group.
Optionally, the detailed functions and extended functions of the program may be as described above.
Embodiments of the present application further provide a readable storage medium, where a program suitable for being executed by a processor may be stored, where the program is configured to:
determining a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and each two microphone elements form a microphone array group;
for each microphone array group, determining a measured orientation of each sound source relative to the microphone array group, and determining an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
determining a detection result of the microphone array based on a detection result of each microphone array group.
Optionally, the detailed functions and extended functions of the program may be as described above.
Finally, it should also be noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the application. Thus, the present application is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A microphone array detection method, the method comprising:
determining a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and each two microphone elements form a microphone array group;
for each microphone array group, determining a measured orientation of each sound source relative to the microphone array group, and determining an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; determining a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
determining a detection result of the microphone array based on a detection result of each microphone array group.
2. The method of claim 1, wherein determining the estimated orientation of the sound source based on the audio signal group acquired by the microphone array group for the sound source comprises:
converting the audio signal group acquired by the microphone array group for the sound source into a plurality of discrete digital signal groups through a windowing mechanism;
for each window, calculating an estimated orientation within the window based on the discrete digital signal group corresponding to the window;
determining the estimated orientation of the sound source based on the estimated orientations within the respective windows.
3. The method of claim 2, wherein calculating the estimated orientation within the window based on the discrete digital signal group corresponding to the window comprises:
calculating an optimal time difference based on a cross-correlation of the two discrete digital signals in the discrete digital signal group;
calculating the estimated orientation within the window according to the propagation speed of sound in air, the optimal time difference, and the distance between the two microphone elements in the microphone array group.
4. The method of claim 2, wherein determining the estimated orientation of the sound source based on the estimated orientations within the respective windows comprises:
screening the estimated orientations within the windows to determine the estimated orientations of the retained windows;
performing mean filtering on the estimated orientations of the retained windows to obtain the estimated orientation of the sound source.
5. The method of claim 1, wherein determining the detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group comprises:
for each sound source, determining an orientation deviation value of the sound source according to the measured orientation of the sound source relative to the microphone array group and the estimated orientation of the sound source determined by the microphone array group;
if the orientation deviation values of all the sound sources are not greater than a first preset threshold, determining that the microphone array group passes detection;
if there is a sound source whose orientation deviation value is greater than the first preset threshold, determining that the microphone array group fails detection.
6. The method of claim 5, wherein if there is a sound source whose orientation deviation value is greater than the first preset threshold, the method further comprises:
judging whether the orientation deviation value of the sound source is greater than a second preset threshold, wherein the second preset threshold is greater than the first preset threshold;
if the orientation deviation value that is greater than the first preset threshold is not greater than the second preset threshold, judging whether the microphone array has a correction flag; and if the microphone array does not have the correction flag, correcting the microphone array and adding the correction flag to the microphone array after the correction.
7. The method of claim 1, wherein determining the detection results of the microphone array based on the detection results of the respective microphone array groups comprises:
judging whether each microphone array group passes the detection;
determining that the microphone array passes detection if each microphone array group passes detection;
determining that the microphone array fails detection if there is a microphone array group that fails detection.
8. An apparatus for detecting a microphone array, the apparatus comprising:
a determining unit, configured to determine a microphone array to be detected and at least one sound source, wherein the microphone array comprises a plurality of microphone elements, and every two microphone elements form a microphone array group;
a microphone array group detection unit, configured to determine, for each microphone array group, a measured orientation of each sound source relative to the microphone array group, and determine an estimated orientation of the sound source based on an audio signal group acquired by the microphone array group for the sound source; and to determine a detection result of the microphone array group based on the measured orientation of each sound source relative to the microphone array group and the estimated orientation of each sound source determined by the microphone array group;
a microphone array detection unit, configured to determine the detection result of the microphone array based on the detection result of each microphone array group.
9. A microphone array detection apparatus comprising a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement the steps of the microphone array detection method according to any one of claims 1 to 7.
10. A readable storage medium having stored thereon a computer program, characterized in that the computer program, when being executed by a processor, carries out the steps of the microphone array detection method as set forth in any one of claims 1 to 7.
CN202111612528.0A 2021-12-27 2021-12-27 Microphone array detection method, related device and readable storage medium Active CN114173273B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111612528.0A CN114173273B (en) 2021-12-27 2021-12-27 Microphone array detection method, related device and readable storage medium

Publications (2)

Publication Number Publication Date
CN114173273A true CN114173273A (en) 2022-03-11
CN114173273B CN114173273B (en) 2024-02-13

Family

ID=80488474

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111612528.0A Active CN114173273B (en) 2021-12-27 2021-12-27 Microphone array detection method, related device and readable storage medium

Country Status (1)

Country Link
CN (1) CN114173273B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007089058A (en) * 2005-09-26 2007-04-05 Yamaha Corp Microphone array controller
KR20080070196A (en) * 2007-01-25 2008-07-30 한국과학기술연구원 Sound source direction detecting system by sound source position-time difference of arrival interrelation reverse estimation
JP2013097273A (en) * 2011-11-02 2013-05-20 Toyota Motor Corp Sound source estimation device, method, and program and moving body
CN105467364A (en) * 2015-11-20 2016-04-06 百度在线网络技术(北京)有限公司 Method and apparatus for localizing target sound source
CN110648685A (en) * 2019-09-26 2020-01-03 百度在线网络技术(北京)有限公司 Device detection method and device, electronic device and readable storage medium
CN112859000A (en) * 2020-12-31 2021-05-28 华为技术有限公司 Sound source positioning method and device
CN112951261A (en) * 2021-03-02 2021-06-11 北京声智科技有限公司 Sound source positioning method and device and voice equipment
CN113281706A (en) * 2021-04-02 2021-08-20 南方科技大学 Target positioning method and device and computer readable storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
TATSUYA KAKO et al.: "Wiener filter design by estimating sensitivities between distributed asynchronous microphones and sound sources", 2015 IEEE WORKSHOP ON APPLICATIONS OF SIGNAL PROCESSING TO AUDIO AND ACOUSTICS (WASPAA) *
杨尚衡; 孟银阔; 林成秋; 陈想; 张家兴; 杨婷婕: "Research on a TDOA-based sound source orientation estimation algorithm for microphone arrays" (基于TDOA的麦克风阵列声源方位估计算法研究), 科技视界 (Science & Technology Vision), no. 32 *

Also Published As

Publication number Publication date
CN114173273B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
JP4812302B2 (en) Sound source direction estimation system, sound source direction estimation method, and sound source direction estimation program
KR101349268B1 (en) Method and apparatus for mesuring sound source distance using microphone array
Thiergart et al. On the spatial coherence in mixed sound fields and its application to signal-to-diffuse ratio estimation
CN112652320B (en) Sound source positioning method and device, computer readable storage medium and electronic equipment
JP6203714B2 (en) Sound source localization using phase spectrum
JP4422662B2 (en) Sound source position / sound receiving position estimation method, apparatus thereof, program thereof, and recording medium thereof
KR101483513B1 (en) Apparatus for sound source localizatioin and method for the same
Dang et al. A feature-based data association method for multiple acoustic source localization in a distributed microphone array
CN114173273A (en) Microphone array detection method, related device and readable storage medium
Archer-Boyd et al. Biomimetic direction of arrival estimation for resolving front-back confusions in hearing aids
JP2009236688A (en) Sound source direction detection method, device, and program
KR101369043B1 (en) Method of tracing the sound source and apparatus thereof
JP5265327B2 (en) Method and system for calculating acoustic impedance
JP6711205B2 (en) Acoustic signal processing device, program and method
CN104811534A (en) Information processing method and electronic equipment
KR20160107820A (en) Direction detection apparatus for using triple acoustic sensor array
JP4488177B2 (en) Angle measuring method and apparatus
JP2019103011A (en) Converter, conversion method, and program
JP2002315089A (en) Loudspeaker direction detecting circuit
JP2003344519A (en) System and method for processing wave signal
CN113156373B (en) Sound source positioning method, digital signal processing device and audio system
CN110876100A (en) Sound source orientation method and system
Lawrence et al. Highly directional pressure sensing using the phase gradient
CN113782047B (en) Voice separation method, device, equipment and storage medium
JP5713933B2 (en) Sound source distance measuring device, acoustic direct ratio estimating device, noise removing device, method and program thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant