CN106526541B - Sound localization method based on distribution matrix decision - Google Patents

Sound localization method based on distribution matrix decision

Info

Publication number
CN106526541B
CN106526541B (application CN201610893331.1A)
Authority
CN
China
Prior art keywords
positioning
frame
signal
distribution matrix
result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610893331.1A
Other languages
Chinese (zh)
Other versions
CN106526541A (en)
Inventor
王建中
叶凯
曹九稳
薛安克
王天磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Electronic Science and Technology University
Original Assignee
Hangzhou Electronic Science and Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Electronic Science and Technology University
Priority to CN201610893331.1A
Publication of CN106526541A
Application granted
Publication of CN106526541B
Legal status: Active
Anticipated expiration

Links

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

The invention discloses a sound localization method based on distribution-matrix decision. The method comprises the following steps: 1. pre-process the multi-channel sound signals collected by an acoustic array, the pre-processing including framing; 2. run a sound recognition algorithm on the single-channel data, obtaining one recognition result per frame; 3. run wideband sound localization on the multi-channel data, obtaining one localization result per frame; 4. construct a distribution matrix whose rows and columns are indexed by the recognition and localization result sets obtained in the preceding steps; 5. after obtaining the distribution matrix, find the localization distribution peak of the target sound source; 6. select the peak interval and its two neighboring angular intervals and compute the statistical mean over these three intervals. The invention improves the accuracy of sound localization, especially when interference is strong and the background environment is complex. Moreover, the localization algorithm depends only weakly on the recognition result, so the method is broadly applicable.

Description

Sound localization method based on distribution matrix decision
Technical field
The invention belongs to the field of signal processing, and more particularly relates to a sound localization method based on distribution-matrix decision.
Background technique
Traditional sound localization algorithms suffer from the following problems:
1. Poor anti-interference capability. In quiet, noise-free indoor conditions the localization accuracy is high, but in complex outdoor environments the presence of noise or interference severely degrades the localization result.
2. In the field of acoustic signal processing, recognition and localization algorithms are closely related and complement each other, yet conventional localization algorithms fail to exploit this and neglect the advantages of information fusion.
Summary of the invention
In view of the above problems, the present invention provides a sound localization method based on distribution-matrix decision, illustrated here with a cross-shaped acoustic array.
To achieve the above goals, the technical solution adopted by the present invention comprises the following steps:
Step 1: pre-process the four-channel sound signal collected by the acoustic array; the pre-processing includes framing.
Step 2: perform sound recognition on the single-channel data.
Step 3: perform wideband sound localization on the multi-channel data.
Step 4: construct a distribution matrix from the recognition and localization result sets obtained in steps 2 and 3.
Step 5: after obtaining the distribution matrix, find the localization distribution peak of the target sound source.
Step 6: select the peak interval and its two neighboring angular intervals and compute the statistical mean over these three intervals as the final localization result.
Step 1: the field sound signal is acquired with a cross acoustic array; denote the sampling frequency by f_s. The four-channel sound signal is divided into frames; let the number of frames after framing be m. Each frame signal is then processed as follows.
Step 2: each single-channel frame after framing is recognized.
The algorithm used to recognize the single-channel signal is the LPCC+SVM algorithm.
Each frame yields one recognition result, forming a recognition result array C of length m:
C = [c(1) c(2) ... c(m)]
Step 3: the wideband localization algorithm is applied to each four-channel frame after framing.
The algorithm used for wideband localization of the four-channel signal is the wideband MUSIC algorithm.
3-1. The frequency band and the center frequency f_0 are chosen as needed; they must be selected according to the frequency characteristics of the actual target signal.
3-2. An FFT is applied to each four-channel frame; the model X(f_j) of each transformed frame is expressed as:
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J   (formula 1)
where A_θ(f_j) is the steering vector, and S(f_j) and N(f_j) are the source signal and the noise after the FFT, respectively. After the transform, the selected frequency band is decomposed into a combination of narrowband signals with frequencies f_j.
3-3. Using the focusing matrices T, the narrowband at each frequency f_j is focused onto the narrowband at the center frequency f_0; the transformation is:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)   (formula 2)
The autocorrelation matrix at the center frequency f_0, used for localization, is then obtained from formula 3:
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j)R(f_j)T(f_j)^H   (formula 3)
3-4. The narrowband at the center frequency f_0 is localized, yielding the localization result of this frame. Each frame produces one localization result, forming a localization result array A of length m:
A = [a(1) a(2) ... a(m)]
Step 4: from the recognition result array C and the localization result array A obtained in steps 2 and 3, the distribution matrix M is constructed.
With the values of the recognition result array C as the abscissa and the angular value range of the localization result array A as the ordinate, the result of every frame is traversed to build the matrix M, where M(C_i, A_j) denotes the number of frames whose recognition result is C_i and whose localization result is A_j.
Step 5: after the distribution matrix is obtained, the localization distribution peak A_top of the target sound source is found via its recognition result C_i.
Step 6: on the localization distribution of recognition result C_i, the peak A_top and its two neighboring values A_top-1 and A_top+1 are selected, and the statistical mean of the matrix cells at these three values is computed:
θ_final = ( Σ_{k=top-1}^{top+1} A_k · M(C_i, A_k) ) / ( Σ_{k=top-1}^{top+1} M(C_i, A_k) )
where A_k denotes the center angle of the k-th angular interval and P denotes the angular resolution of the matrix ordinate; for example, if the 360-degree circle is divided into 36 angular intervals, the resolution is P = 10.
The beneficial effects of the present invention are as follows:
The collected sound signal is processed simultaneously by the recognition and localization algorithms, a distribution matrix is constructed from the results, and the final result is obtained by a decision algorithm. The invention makes full use of all the recognition and localization information in a sound segment: under the premise that the target sound is the recognition result, the final localization result is obtained from the localization distribution over all frames. The advantage is that the influence of interference and noise in the sound signal is rejected to the greatest possible extent, and the dependence on the recognition algorithm is low, giving the method broad applicability.
Description of drawings
Fig. 1 is the overall algorithm flowchart of the present invention.
Fig. 2 is the flowchart of the localization part of the algorithm.
Fig. 3 is a schematic diagram of the distribution matrix.
Fig. 4 is the structure diagram of the 4-channel cross acoustic array in a rectangular coordinate system.
Specific embodiment
The present invention is elaborated below with reference to the accompanying drawings and a specific embodiment. The following description serves only as demonstration and explanation and is not intended to limit the present invention in any way.
As shown in Fig. 4, the 4-channel cross acoustic array is placed in a rectangular coordinate system, where d is the spacing of two adjacent microphones, r is the radius of the cross array, s(t) is the sound source and θ its direction, and A, B, C, D in the figure correspond to channels 1, 2, 3 and 4, respectively. The signals of the four channels are then acquired and denoted x_1(t), x_2(t), x_3(t), x_4(t).
The steering vector of the signal collected by the cross array can be expressed as:
a(θ, ω) = [e^{-iωτ_1(θ)}, e^{-iωτ_2(θ)}, e^{-iωτ_3(θ)}, e^{-iωτ_4(θ)}]^T
where ω = 2πf, f is the signal frequency, and τ_p(θ) (p = 1, 2, 3, 4) are the time delays between the signals. This steering vector is used in the localization algorithm below.
Fig. 1 shows the overall flowchart of the algorithm of the invention. Following the steps in Fig. 1, after the four channel signals are received by the four-channel acoustic array, they are pre-processed; the main pre-processing operation is framing. Each of the four channel signals is framed with a frame length of 1024 sampling points and a step of half the frame length. Suppose the signal is divided into m frames of 1024 sampling points; each frame is then processed as follows.
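As an illustrative sketch (not part of the patent), the framing described above — 1024-sample frames with a step of half the frame length — can be written as follows; the sample rate and the random test signal are assumptions:

```python
import numpy as np

def frame_signal(x, frame_len=1024, step=None):
    """Split a 1-D signal into overlapping frames (50% overlap by default)."""
    step = step or frame_len // 2
    n = (len(x) - frame_len) // step + 1          # number of full frames m
    return np.stack([x[i * step: i * step + frame_len] for i in range(n)])

fs = 44100                          # assumed sampling frequency f_s
x1 = np.random.randn(fs)            # stand-in for one second of channel-1 samples
frames = frame_signal(x1)           # shape (m, 1024); repeat for channels 2-4
```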
First, the recognition algorithm is applied to each single-channel frame.
Any sound recognition algorithm can be used; LPCC feature extraction with an SVM classification learning algorithm serves as the example here. 16th-order LPCC coefficients are used, the SVM kernel is the radial basis function (RBF), and the sound types to be recognized are assumed to be the five types C1, C2, C3, C4 and C5.
The 12th-order linear prediction coefficients (Linear Prediction Coefficients, LPC) of each frame are computed, where the LPC values can be solved with the Levinson-Durbin algorithm. The 16th-order LPCC values are then obtained from the correspondence between LPCC and LPC values.
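The LPC-to-LPCC correspondence mentioned above is commonly realized with the standard cepstral recursion; the following sketch assumes that recursion (sign conventions differ between texts, so treat it as an illustration rather than the patent's exact formula):

```python
import numpy as np

def lpc_to_lpcc(a, q):
    """Derive q LPCC coefficients from LPC coefficients a = [a_1 .. a_p]
    using the standard recursion c_n = a_n + sum_k (k/n) c_k a_{n-k}."""
    p = len(a)
    c = np.zeros(q)
    for n in range(1, q + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c[n - 1] = acc
    return c

lpcc16 = lpc_to_lpcc(np.full(12, 0.1), 16)   # 12th-order LPC -> 16 LPCC values
```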
The sound fingerprint library is built as follows:
The 16th-order LPCC values extracted from every frame are arranged by rows, and a label column is prepended, where label '0' represents C1, '1' represents C2, '2' represents C3, '3' represents C4 and '4' represents C5, forming 17-dimensional feature vectors.
The SVM algorithm is realized with the existing libsvm library, with RBF chosen as the classifier kernel. The RBF setup has two parameters, the penalty factor c and the parameter gamma, whose optimal values can be selected by the grid-search function opti_svm_coeff of libsvm.
The training process uses the svmtrain function of the libsvm library with four parameters: the feature vectors, i.e. the labelled LPCC values extracted above; the kernel type, for which the RBF kernel is selected; and the RBF kernel parameters c and gamma, determined by grid search. Calling svmtrain returns a variable named model, which stores the trained model information, i.e. the sound fingerprint library; this variable is saved for the recognition step.
Sound recognition is realized by the svmtest function of the libsvm library: the LPCC values obtained for every frame are classified with svmtest, which takes three parameters. The first is the label, used to test the recognition rate (it has no practical meaning when recognizing sounds of unknown type); the second is the feature vector, i.e. the variable storing the LPCC values; the third is the matching model, which is exactly the return value of the svmtrain function from the training step. The return value of svmtest is the classification result, i.e. the label, from which the type of device that produced the sound is determined.
In practical application, features are extracted from the signal and compared with the established sound fingerprint library to achieve recognition.
After this stage, m recognition results are obtained, forming the array C:
C = [c(1) c(2) ... c(m)]
Next, the localization algorithm is applied to the four-channel signal of each frame.
Fig. 2 shows the detailed flowchart of the localization part, including the FFT of the sub-frames, the pre-estimation of the angle for each narrowband, and the wideband localization algorithm, illustrated here with the MUSIC algorithm.
To compute the autocorrelation matrix of the signal, the four-channel frame is framed a second time with a sub-frame length of 256 and a step of half the sub-frame length, and an FFT is applied to each sub-frame:
X(f_j) = Σ_{l=0}^{L-1} x(l) e^{-i2πjl/L}
where L is the sub-frame length, i.e. 256.
After the FFT, the data of the n-th sub-frame can be written as X^(n)(f_j), n = 1, 2, ..., N, where N is the number of sub-frames after the secondary framing.
The resulting frequency-domain signal model can then be expressed as:
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J
where f_j = j·f_s/L and f_s is the sampling frequency of the signal. Since actual signals are mostly wideband, a suitable wideband frequency range and a center frequency point f_0 must be chosen.
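With the sub-frame length L = 256, the narrowband frequencies are simply the FFT bin centres f_j = j·f_s/L; the sample rate and the band edges below are assumptions for illustration:

```python
import numpy as np

fs, L = 44100, 256                       # assumed sample rate; sub-frame length
f = np.fft.rfftfreq(L, d=1.0 / fs)       # f_j = j * fs / L, j = 0 .. L/2
f_lo, f_hi = 500.0, 4000.0               # assumed band around the chosen f0
band = f[(f >= f_lo) & (f <= f_hi)]      # the narrowband frequencies f_j kept
```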
A wideband signal can be regarded as the combination of multiple narrowband signals. With the focusing matrices T_j, each narrowband can be focused onto the center frequency:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)
where A(f) is the steering vector used in the localization algorithm.
When computing the focusing matrices, a narrowband MUSIC localization is first run on each narrowband as the pre-estimate. The steps are as follows:
First compute the signal autocorrelation matrix R_f of each narrowband frequency and apply an eigenvalue decomposition to R_f:
R_f = U_S Λ_S U_S^H + U_N Λ_N U_N^H
where U_S is the subspace spanned by the eigenvectors of the large eigenvalues, namely the signal subspace, and U_N is the subspace spanned by the eigenvectors of the small eigenvalues, namely the noise subspace. The spatial spectrum function of the MUSIC algorithm is
P(θ) = 1 / (a^H(θ) U_N U_N^H a(θ))
where Θ denotes the observation sector. Let θ scan over the observation sector Θ and evaluate this function at every scan position; the direction at which the peak appears, denoted β_j, is the estimated bearing.
After the MUSIC pre-estimation of every narrowband, β = [β_1 β_2 ··· β_J] is obtained.
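A minimal narrowband MUSIC sketch corresponding to the pre-estimation step; the 4-element uniform-line steering vector below is a hypothetical stand-in for illustration only (the patent's cross array needs its own delays τ_p(θ)):

```python
import numpy as np

def music_spectrum(R, steering, angles, n_src=1):
    """Eigendecompose R, keep the noise subspace U_N, and scan P(theta)."""
    w, v = np.linalg.eigh(R)                  # eigenvalues in ascending order
    Un = v[:, : R.shape[0] - n_src]           # small-eigenvalue (noise) subspace
    return np.array([1.0 / np.abs(steering(t).conj() @ Un @ Un.conj().T
                                  @ steering(t)) for t in angles])

def ula_steering(theta, n=4, d=0.05, f=1000.0, c=343.0):
    """Hypothetical uniform-line steering vector, NOT the cross-array geometry."""
    return np.exp(-1j * 2 * np.pi * f / c * d * np.arange(n) * np.cos(theta))

angles = np.linspace(0.0, np.pi, 181)             # 1-degree scan over the sector
a0 = ula_steering(np.deg2rad(60.0))
R = np.outer(a0, a0.conj()) + 0.01 * np.eye(4)    # one source at 60 deg + noise
beta = np.rad2deg(angles[np.argmax(music_spectrum(R, ula_steering, angles))])
```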
Next, the focusing matrices are constructed from the pre-estimation result:
T(f_j) = V(f_j)U(f_j)^H
where U(f_j) and V(f_j) are the left and right singular vectors of A(f_j, β)A^H(f_0, β), respectively. Applying this series of focusing matrices T(f_j) to the array data, the focused data autocorrelation matrix at the single frequency point is obtained:
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j)R(f_j)T(f_j)^H
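The T = VU^H construction above is the unitary (orthogonal Procrustes) solution that rotates A(f_j) toward A(f_0); a sketch under that assumption:

```python
import numpy as np

def focusing_matrix(A_fj, A_f0):
    """Unitary T minimising ||A(f0) - T A(fj)||_F (orthogonal Procrustes),
    equivalent to the T = V U^H construction in the text."""
    U, _, Vh = np.linalg.svd(A_f0 @ A_fj.conj().T)
    return U @ Vh
```

A quick sanity check of the design choice: because T is unitary, the focusing transform preserves signal power while aligning the steering matrices of all narrowbands at f_0.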
Likewise, once the focused autocorrelation matrix is obtained, the narrowband MUSIC algorithm is applied once more to the narrowband at the center frequency, yielding the final localization result.
After this stage, m localization results are obtained, forming the array A:
A = [a(1) a(2) a(3) a(4) ... a(m)]
As shown in Fig. 1, once the localization and recognition results are obtained, the distribution matrix M can be constructed. Fig. 3 shows a schematic diagram of the distribution matrix: the abscissa is the possible value range of the localization result A, the ordinate is the possible value range of the recognition result C, and M(C_i, A_j) denotes the total number of frames in this data segment whose recognition result is C_i and whose localization result is A_j.
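A sketch of the distribution-matrix construction, assuming class labels 0-4, 10-degree angular intervals (P = 10, so 36 bins), and hypothetical per-frame results:

```python
import numpy as np

def build_distribution_matrix(C, A, n_classes=5, P=10, span=360):
    """M[c, j] counts frames whose recognition result is class c and whose
    localisation result falls in the j-th angular interval of width P degrees."""
    n_bins = span // P
    M = np.zeros((n_classes, n_bins), dtype=int)
    for c, a in zip(C, A):
        M[c, int(a // P) % n_bins] += 1
    return M

C = [0, 0, 1, 0, 0, 0]                        # hypothetical recognition results
A = [62.0, 58.0, 175.0, 61.0, 300.0, 63.0]    # hypothetical DOA estimates (deg)
M = build_distribution_matrix(C, A)
```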
Once the distribution matrix statistics are obtained, the localization result of the target is derived from the localization distribution of the recognition result of the target sound source.
The row of the distribution matrix whose recognition result corresponds to the target sound source is selected, giving the localization distribution of the target source. The peak A_top is found, the peak and its two neighboring values A_top-1 and A_top+1 are taken, and the statistical mean of the matrix cells at these three values is computed as the final localization result. The formula can be expressed as:
θ_final = ( Σ_{k=top-1}^{top+1} A_k · M(C_i, A_k) ) / ( Σ_{k=top-1}^{top+1} M(C_i, A_k) )
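The decision step — the peak bin plus its two neighbours, weighted by frame counts — can be sketched as follows (using bin-centre angles and ignoring the 0/360-degree wrap-around are simplifying assumptions of this sketch):

```python
import numpy as np

def decide_angle(M, c, P=10):
    """Weighted mean of the peak angular bin and its two neighbours
    on row c of the distribution matrix (ignores 0/360 wrap-around)."""
    row = M[c]
    top = int(np.argmax(row))
    idx = [top - 1, top, top + 1]                      # A_top-1, A_top, A_top+1
    centres = np.array([(k + 0.5) * P for k in idx])   # bin-centre angles (deg)
    weights = row[idx]
    return float((centres * weights).sum() / weights.sum())
```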
Claims (7)

1. A sound localization method based on distribution-matrix decision, characterized by comprising the following steps:
step 1: pre-processing the four-channel sound signal collected by an acoustic array;
wherein the pre-processing divides the four-channel sound signal into frames;
step 2: performing sound recognition on each single-channel frame;
step 3: performing wideband sound localization on each four-channel frame;
step 4: constructing a distribution matrix M from the recognition and localization result sets obtained in steps 2 and 3, where M(C_i, A_j) denotes the number of frames whose recognition result is C_i and whose localization result is A_j;
step 5: after obtaining the distribution matrix, finding the localization distribution peak of the target sound source;
step 6: selecting the peak and its two neighboring angular intervals and computing the statistical mean over these three intervals as the final localization result.
2. The sound localization method based on distribution-matrix decision according to claim 1, characterized in that in step 1 the field sound signal is acquired with a cross acoustic array, the sampling frequency being denoted f_s; the four-channel sound signal is divided into frames, the number of frames after framing being m; each frame signal is then processed.
3. The sound localization method based on distribution-matrix decision according to claim 2, characterized in that the recognition algorithm applied to each single-channel frame in step 2 is the LPCC+SVM algorithm;
each frame yields one recognition result, forming a recognition result array C of length m:
C = [c(1) c(2) ... c(m)].
4. The sound localization method based on distribution-matrix decision according to claim 3, characterized in that the algorithm applied to each four-channel frame for wideband sound localization in step 3 is the wideband MUSIC algorithm, specifically:
3-1: choosing the frequency band and the center frequency f_0 as needed, the frequency band and center frequency f_0 being selected according to the frequency characteristics of the actual target signal;
3-2: applying an FFT to each four-channel frame, the model X(f_j) of each transformed frame being expressed as:
X(f_j) = A_θ(f_j)S(f_j) + N(f_j), j = 1, 2, ..., J   (formula 1)
where A_θ(f_j) is the steering vector, and S(f_j) and N(f_j) are the source signal and the noise after the FFT, respectively; after the transform, the selected frequency band is decomposed into a combination of narrowband signals with frequencies f_j;
3-3: using the focusing matrices T, focusing the narrowband at each frequency f_j onto the narrowband at the center frequency f_0:
T(f_j)A(f_j)S(f_j) = A(f_0)S(f_0)   (formula 2)
where A(f) is the steering vector; and obtaining from formula 3 the autocorrelation matrix at the center frequency f_0, used for localization:
R(f_0) = (1/J) Σ_{j=1}^{J} T(f_j)R(f_j)T(f_j)^H   (formula 3)
3-4: localizing the narrowband at the center frequency f_0 to obtain the localization result of the frame; each frame yielding one localization result, forming a localization result array A of length m:
A = [a(1) a(2) ... a(m)].
5. The sound localization method based on distribution-matrix decision according to claim 4, characterized in that in step 4 the distribution matrix M is constructed from the recognition result array C and the localization result array A obtained in steps 2 and 3:
with the values of the recognition result array C as the abscissa and the angular value range of the localization result array A as the ordinate, the result of every frame is traversed to build the distribution matrix M, where M(C_i, A_j) denotes the number of frames whose recognition result is C_i and whose localization result is A_j.
6. The sound localization method based on distribution-matrix decision according to claim 5, characterized in that in step 5, after the distribution matrix is obtained, the localization distribution peak A_top of the target sound source is found via the recognition result C_i.
7. The sound localization method based on distribution-matrix decision according to claim 6, characterized in that in step 6, on the localization distribution of the recognition result C_i, the peak A_top and its two neighboring values A_top-1 and A_top+1 are selected, and the statistical mean of the matrix cells at these three values is computed as:
θ_final = ( Σ_{k=top-1}^{top+1} A_k · M(C_i, A_k) ) / ( Σ_{k=top-1}^{top+1} M(C_i, A_k) )
where A_k denotes the center angle of the k-th angular interval, and P denotes the angular resolution of the matrix ordinate.
CN201610893331.1A 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision Active CN106526541B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610893331.1A CN106526541B (en) 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision


Publications (2)

Publication Number Publication Date
CN106526541A CN106526541A (en) 2017-03-22
CN106526541B true CN106526541B (en) 2019-01-18

Family

ID=58332047

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610893331.1A Active CN106526541B (en) 2016-10-13 2016-10-13 Sound localization method based on distribution matrix decision

Country Status (1)

Country Link
CN (1) CN106526541B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493106B (en) * 2017-08-09 2021-02-12 河海大学 Frequency and angle joint estimation method based on compressed sensing
CN112347984A (en) * 2020-11-27 2021-02-09 安徽大学 Olfactory stimulus-based EEG (electroencephalogram) acquisition and emotion recognition method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102438189A (en) * 2011-08-30 2012-05-02 东南大学 Dual-channel acoustic signal-based sound source localization method
CN103439688A (en) * 2013-08-27 2013-12-11 大连理工大学 Sound source positioning system and method used for distributed microphone arrays
CN105609113A (en) * 2015-12-15 2016-05-25 中国科学院自动化研究所 Bispectrum weighted spatial correlation matrix-based speech sound source localization method
CN106023996A (en) * 2016-06-12 2016-10-12 杭州电子科技大学 Sound identification method based on cross acoustic array broadband wave beam formation


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Wideband DOA algorithm based on data-matrix focusing"; Liu Chunjing et al.; Journal of Projectiles, Rockets, Missiles and Guidance; 2010-02-28; Vol. 30, No. 1; pp. 190-192

Also Published As

Publication number Publication date
CN106526541A (en) 2017-03-22


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant