CN109859768A

CN109859768A - Speech enhancement method for a cochlear implant

Info

Publication number: CN109859768A
Application number: CN201910184264.XA
Authority: CN
Inventors: 樊伟; 吴瑞安; 刘新东; 朱美美; 刘根芳
Original assignee: Lishengte Medical Science & Tech Co Ltd
Current assignee: Lishengte Medical Science & Tech Co Ltd
Priority date: 2019-03-12
Filing date: 2019-03-12
Publication date: 2019-06-07

Abstract

The invention discloses a speech enhancement method for a cochlear implant, comprising the following steps: (A) preprocessing the sound signal; (B) performing endpoint detection on the sound signal to identify the noise frames and speech frames in it and separate the two; (C) performing feature extraction on the noise frames to obtain a feature vector; (D) applying a CNN to the feature vector to identify the noise scene; (E) selecting the corresponding gain table G(γ_k, ξ_k), choosing the speech enhancement parameter by calculating the a priori SNR and the a posteriori SNR, and then performing speech enhancement on the sound signal; (F) converting the sound signal into encoded information for output. By training different gain tables, the method has different speech enhancement parameters to suit different auditory scenes, outputs a stimulation signal that better matches the actual auditory scene, and improves the clarity and intelligibility of speech for the patient in noisy environments.

Description

Speech enhancement method for a cochlear implant

Technical field

The present invention relates to a speech enhancement method, and in particular to a speech enhancement method for a cochlear implant.

Background technique

The cochlear implant is recognized worldwide as the only effective method and device for restoring hearing to patients with bilateral severe or profound sensorineural hearing loss. An existing cochlear implant works as follows: sound is picked up by a microphone and converted into an electrical signal, which undergoes dedicated digital processing and is then encoded according to a given strategy. The encoded signal is transmitted into the body through a transmitting coil worn behind the ear. Once the receiving coil of the implant picks up the signal, a decoding chip decodes it and the stimulating electrodes of the implant generate a current that stimulates the auditory nerve to produce hearing. Because of the conditions of use, ambient noise is inevitably mixed into the sound, so the sound signal needs some enhancement (that is, noise reduction). Given the diversity of listening environments, however, conventional enhancement is not universally applicable: the enhanced signal sometimes deviates from the actual conditions, and the best hearing result cannot be achieved.

Summary of the invention

In view of the above drawbacks of the prior art, the technical problem to be solved by the invention is to provide a speech enhancement method for a cochlear implant that has different speech enhancement parameters to suit different auditory scenes.

To achieve the above object, the present invention provides a speech enhancement method for a cochlear implant, comprising the following steps: (A) a preprocessing program module preprocesses the sound signal; (B) an endpoint detection program module performs endpoint detection on the preprocessed sound signal, identifies the noise frames and speech frames in the sound signal, and separates the two; (C) a feature extraction program module performs feature extraction on the noise frames to obtain a feature vector; (D) a scene recognition program module applies a CNN to the feature vector to obtain a probability value for each preset scene, and the scene with the highest probability value is determined to be the noise scene; (E) a noise reduction program module selects the corresponding gain table G(γ_k, ξ_k) according to the noise scene, chooses the specific speech enhancement parameter in the gain table G(γ_k, ξ_k) by calculating the a priori SNR and the a posteriori SNR, and then performs speech enhancement on the preprocessed sound signal; (F) a strategy coding program module converts the enhanced sound signal into encoded information for output.

In step A, the preprocessing includes framing, windowing, pre-emphasis and frequency-domain conversion, where the windowing uses a Hamming window or a Hanning window.

In step B, the endpoint detection uses a double-threshold method, where the two thresholds are an energy threshold and a zero-crossing-rate threshold.

In step C, the feature extraction uses MFCC, FBank or a spectrogram.

In step E, γ_k and ξ_k of the gain table G(γ_k, ξ_k) range from -19 dB to 20 dB, and the training steps are as follows:

(a) Track the local minimum of the noisy-speech power spectrum:

(b) Calculate the speech presence probability: Sr = P(λ, k)/P_min(λ, k),

where δ(k) is a frequency-dependent empirical threshold determined by experiment. A band is judged to contain speech if Sr > δ(k); otherwise it is judged to be noise, that is:

I(λ, k) = 1 if Sr > δ(k), and I(λ, k) = 0 otherwise.

The speech presence probability is then updated according to this decision rule:

p(λ, k) = α_p p(λ-1, k) + (1 - α_p) I(λ, k), where α_p = 0.2;

(c) Calculate the frequency-dependent smoothing constant: α_s(λ, k) = α_d + (1 - α_d) p(λ, k), where α_d = 0.85 and the value range of α_s(λ, k) is α_d ≤ α_s(λ, k) ≤ 1;

(d) Update the noise power spectrum: D(λ, k) = α_s(λ, k) D(λ-1, k) + (1 - α_s(λ, k)) |Y(λ, k)|², where D(λ, k) is the estimate of the noise power spectrum. The gain is calculated with the minimum mean-square error algorithm based on the log spectrum (LOGSTAS-MMSE). The amplitude spectrum estimator of the speech is:

Gain function:

The integral in the above expression can be evaluated approximately, that is:

Further, the a priori SNR is calculated as ξ_k = λ_s(k)/λ_n(k) and the a posteriori SNR is calculated as γ_k = |Y_k|²/λ_n(k), where λ_s(k) and λ_n(k) are the variances of the speech and the noise in the k-th frequency band, and Y is the noisy speech.
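As an illustration only, the gain table could be filled in as sketched below. The sketch assumes that the estimator is the widely published log-spectral amplitude MMSE gain, G = ξ/(1+ξ)·exp(½·E1(v)) with v = ξγ/(1+ξ), where E1 is the exponential integral; whether this matches the formulas of the invention is an assumption. The function names, the 0.5 dB grid step and the way E1 is approximated (power series for small arguments, continued fraction otherwise) are likewise hypothetical choices.

```python
import math

EULER_GAMMA = 0.5772156649015329


def exp_integral_e1(x, eps=1e-12, max_iter=200):
    """E1(x) = integral from x to infinity of exp(-t)/t dt, for x > 0."""
    if x <= 0.0:
        raise ValueError("x must be positive")
    if x <= 1.0:
        # Power series: E1(x) = -gamma - ln(x) - sum_{k>=1} (-x)^k / (k * k!)
        total, term = 0.0, 1.0
        for k in range(1, max_iter + 1):
            term *= -x / k          # term is now (-x)^k / k!
            total += term / k
            if abs(term / k) < eps:
                break
        return -EULER_GAMMA - math.log(x) - total
    # Continued fraction, evaluated with the modified Lentz method.
    b = x + 1.0
    c = 1.0 / 1e-300
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        a = -float(i * i)
        b += 2.0
        d = 1.0 / (a * d + b)
        c = b + a / c
        delta = c * d
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x)


def lsa_gain(xi, gamma):
    """Assumed log-spectral amplitude gain for linear-scale xi and gamma."""
    v = xi / (1.0 + xi) * gamma
    return xi / (1.0 + xi) * math.exp(0.5 * exp_integral_e1(v))


def build_gain_table(db_min=-19.0, db_max=20.0, db_step=0.5):
    """Rows are indexed by gamma, columns by xi, both on a dB grid."""
    count = int(round((db_max - db_min) / db_step)) + 1
    levels = [10.0 ** ((db_min + i * db_step) / 10.0) for i in range(count)]
    return [[lsa_gain(xi, gamma) for xi in levels] for gamma in levels]
```

Precomputing the table moves the exponential integral out of the real-time path, which is consistent with the stated aim of a low computational load in use. A scene-specific table would presumably differ in how the noise statistics behind ξ_k and γ_k are trained, which this sketch does not model.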

By training different gain tables, the cochlear implant speech enhancement method of the present invention has different speech enhancement parameters to suit different auditory scenes. It outputs a stimulation signal that better matches the actual auditory scene, improves the clarity and intelligibility of speech for the patient in noisy environments, and improves the quality of life of cochlear implant patients.

The concept, specific structure and technical effects of the invention are further described below with reference to the accompanying drawing, so that the purpose, features and effects of the invention can be fully understood.

Brief description of the drawings

Fig. 1 is a flow diagram of the cochlear implant speech enhancement method of the present invention.

Detailed description of the embodiments

The present invention provides a speech enhancement method for a cochlear implant. The method can determine the current environment and select the corresponding speech enhancement parameters for that environment, so that the cochlear implant user obtains a better hearing experience.

As shown in Fig. 1, the cochlear implant speech enhancement method includes six steps: preprocessing, endpoint detection, feature extraction, scene recognition, noise reduction and strategy coding.

Preprocessing: the preprocessing program module performs framing, windowing, pre-emphasis and frequency-domain conversion on the sound signal, where the windowing uses a Hamming window or a Hanning window and the frequency-domain conversion transforms the signal from the time domain to the frequency domain.
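The preprocessing chain can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the frame length, hop size and pre-emphasis coefficient are assumed values, pre-emphasis is applied before framing as is common practice, and a naive DFT stands in for the FFT that a real device would use.

```python
import cmath
import math


def preprocess(signal, frame_len=256, hop=128, pre_emphasis=0.97):
    """Pre-emphasise, frame, window and transform a mono signal.

    frame_len, hop and pre_emphasis are illustrative assumptions.
    Returns one magnitude spectrum (frame_len // 2 + 1 bins) per frame.
    """
    # Pre-emphasis: y[n] = x[n] - a * x[n-1] lifts the high frequencies.
    emphasised = [signal[0]] + [
        signal[n] - pre_emphasis * signal[n - 1] for n in range(1, len(signal))
    ]
    # Hamming window, one of the two windows named in the text.
    window = [
        0.54 - 0.46 * math.cos(2.0 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    spectra = []
    for start in range(0, len(emphasised) - frame_len + 1, hop):
        frame = [emphasised[start + n] * window[n] for n in range(frame_len)]
        # Naive DFT over the non-negative frequencies (time to frequency domain).
        spectrum = [
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ]
        spectra.append(spectrum)
    return spectra
```

Overlapping frames (here a hop of half the frame length) are a common choice because the window tapers each frame towards zero at its edges.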

Endpoint detection: the endpoint detection program module performs endpoint detection on the preprocessed sound signal using a double-threshold method, identifies the noise frames and speech frames in the sound signal, and separates the two, where the two thresholds are an energy threshold and a zero-crossing-rate threshold.
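A double-threshold split into speech and noise frames might look like the sketch below. The text names only the two thresholds, so the rule that combines them is an assumption: a frame counts as speech when its short-time energy exceeds the energy threshold, or when it exceeds half that threshold and the zero-crossing rate is also high (to catch weak unvoiced sounds). All function names are hypothetical.

```python
def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def split_frames(frames, energy_threshold, zcr_threshold):
    """Separate frames into (speech, noise) using the two thresholds."""
    speech, noise = [], []
    for frame in frames:
        energy = short_time_energy(frame)
        zcr = zero_crossing_rate(frame)
        # Assumed combination rule; the source does not spell it out.
        is_speech = energy > energy_threshold or (
            energy > 0.5 * energy_threshold and zcr > zcr_threshold
        )
        (speech if is_speech else noise).append(frame)
    return speech, noise
```

The noise frames returned here are the ones that go on to feature extraction and scene recognition, while the thresholds themselves would have to be tuned on recorded data.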

Feature extraction: the feature extraction program module performs feature extraction on the noise frames to obtain a feature vector, using MFCC (Mel-Frequency Cepstral Coefficients), FBank (Mel-scale Filter Bank) or a spectrogram.
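Of the three options, FBank features are the simplest to sketch: a bank of triangular filters spaced evenly on the mel scale is applied to the power spectrum of a noise frame, and the logarithm of each filter output forms the feature vector. The filter count, sample rate and function names below are assumptions made for illustration.

```python
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters over num_bins spectrum bins covering 0..sample_rate/2."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (num_bins - 1) for k in range(num_bins)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def fbank_features(power_spectrum, bank, floor=1e-10):
    """Log filter-bank energies for one frame."""
    return [
        math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), floor))
        for row in bank
    ]
```

MFCC features would follow from these log energies by applying a discrete cosine transform, and a spectrogram is simply the sequence of per-frame spectra without the mel filtering.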

Scene recognition: the scene recognition program module applies a CNN (Convolutional Neural Network) to the feature vector to obtain a probability value for each preset scene, and the scene with the highest probability value is determined to be the noise scene.
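The final decision step can be illustrated without the network itself. The sketch below assumes that the CNN ends in one raw score per preset scene, converts the scores to probabilities with a softmax, and picks the most probable scene. The scene names in the usage example are hypothetical, since the text does not list the preset scenes.

```python
import math


def pick_scene(scores, scene_names):
    """Return (most probable scene, probabilities) from raw network scores."""
    peak = max(scores)
    # Subtracting the peak keeps the exponentials numerically stable.
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return scene_names[best], probs


# Hypothetical preset scenes and scores for one noise frame.
scene, probs = pick_scene([0.1, 2.0, -1.0], ["quiet", "street", "restaurant"])
```

In this example the second score is the largest, so `scene` is `"street"`.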

Noise reduction: the noise reduction program module selects the corresponding gain table G(γ_k, ξ_k) according to the noise scene, chooses the specific speech enhancement parameter in the gain table G(γ_k, ξ_k) by calculating the a priori SNR and the a posteriori SNR, and then performs speech enhancement on the preprocessed sound signal, thereby achieving noise reduction.
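A table lookup of this kind might be organised as in the sketch below. It uses the -19 dB to 20 dB range stated for the gain table, and it assumes a 0.5 dB grid, one table per scene held in a dictionary, and rows indexed by γ_k with columns indexed by ξ_k. These layout choices and all names are hypothetical.

```python
import math

DB_MIN, DB_MAX = -19.0, 20.0   # table range given in the text
DB_STEP = 0.5                  # assumed grid step
SIZE = int(round((DB_MAX - DB_MIN) / DB_STEP)) + 1


def to_index(snr_linear):
    """Quantise a linear SNR onto the dB grid, clamping to the table range."""
    snr_db = 10.0 * math.log10(max(snr_linear, 1e-12))
    clamped = min(max(snr_db, DB_MIN), DB_MAX)
    return int(round((clamped - DB_MIN) / DB_STEP))


def lookup_gain(tables, scene, gamma, xi):
    """Pick the scene's table, then the entry for (gamma, xi)."""
    table = tables[scene]
    return table[to_index(gamma)][to_index(xi)]


def enhance_frame(spectrum, gammas, xis, tables, scene):
    """Scale each frequency band of one frame by its looked-up gain."""
    return [
        y * lookup_gain(tables, scene, g, x)
        for y, g, x in zip(spectrum, gammas, xis)
    ]
```

Because the gains are read from a precomputed table, the per-frame cost is one logarithm and one array access per band.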

For the gain table G(γ_k, ξ_k), γ_k and ξ_k range from -19 dB to 20 dB, preferably at 0.5 dB intervals, and the training steps are as follows:

(1) Track the local minimum of the noisy-speech power spectrum:

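(2) Calculate the speech presence probability: Sr = P(λ, k)/P_min(λ, k),

where δ(k) is a frequency-dependent empirical threshold determined by experiment. A band is judged to contain speech if Sr > δ(k); otherwise it is judged to be noise, that is:

I(λ, k) = 1 if Sr > δ(k), and I(λ, k) = 0 otherwise.

The speech presence probability is then updated according to this decision rule:

p(λ, k) = α_p p(λ-1, k) + (1 - α_p) I(λ, k), where α_p = 0.2;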

(3) Calculate the frequency-dependent smoothing constant: α_s(λ, k) = α_d + (1 - α_d) p(λ, k), where α_d = 0.85 and the value range of α_s(λ, k) is α_d ≤ α_s(λ, k) ≤ 1;

(4) Update the noise power spectrum: D(λ, k) = α_s(λ, k) D(λ-1, k) + (1 - α_s(λ, k)) |Y(λ, k)|², where D(λ, k) is the estimate of the noise power spectrum. The gain is calculated with the minimum mean-square error algorithm based on the log spectrum (LOGSTAS-MMSE). The amplitude spectrum estimator of the speech is:

Gain function:

The integral in the above expression can be evaluated approximately, that is:

The a priori SNR is calculated as ξ_k = λ_s(k)/λ_n(k) and the a posteriori SNR is calculated as γ_k = |Y_k|²/λ_n(k), where λ_s(k) and λ_n(k) are the variances of the speech and the noise in the k-th frequency band, and Y is the noisy speech.
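The recursions for the speech presence probability, the smoothing constant and the noise power spectrum can be sketched for one frame as follows. The smoothed power P(λ, k) and its tracked minimum P_min(λ, k) are taken as inputs, because the minimum-tracking rule is not reproduced here. The SNR helper is a further assumption: it uses the noise estimate D(λ, k) in place of λ_n(k) and the simple estimate ξ_k ≈ max(γ_k - 1, 0), since the speech variance λ_s(k) is not directly observable. All function and argument names are hypothetical.

```python
ALPHA_P = 0.2    # smoothing of the speech presence probability (from the text)
ALPHA_D = 0.85   # lower bound of the noise smoothing constant (from the text)


def update_noise(power, p_smooth, p_min, p_prev, d_prev, delta):
    """One frame of the presence, smoothing and noise updates, band by band.

    power:    |Y(lambda, k)|^2 of the current frame
    p_smooth: smoothed noisy-speech power P(lambda, k)
    p_min:    tracked local minimum P_min(lambda, k)
    p_prev:   speech presence probability of the previous frame
    d_prev:   noise power estimate of the previous frame
    delta:    frequency-dependent thresholds delta(k)
    """
    presence, noise = [], []
    for k in range(len(power)):
        ratio = p_smooth[k] / p_min[k]
        indicator = 1.0 if ratio > delta[k] else 0.0
        p = ALPHA_P * p_prev[k] + (1.0 - ALPHA_P) * indicator
        alpha_s = ALPHA_D + (1.0 - ALPHA_D) * p
        d = alpha_s * d_prev[k] + (1.0 - alpha_s) * power[k]
        presence.append(p)
        noise.append(d)
    return presence, noise


def snr_estimates(power, noise):
    """A posteriori SNR and an assumed simple a priori SNR estimate."""
    gamma = [y / max(d, 1e-12) for y, d in zip(power, noise)]
    xi = [max(g - 1.0, 0.0) for g in gamma]
    return gamma, xi
```

When speech is likely present, p approaches 1 and α_s approaches 1, so the noise estimate is nearly frozen; when speech is absent, α_s falls to α_d and the estimate follows the input power more quickly.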

Strategy coding: the strategy coding program module converts the enhanced sound signal into encoded information for output. The steps include envelope extraction, spectral maxima selection, nonlinear compression and coded output, so that the encoded information is sent to the implant inside the body through the transmitting coil.
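The maxima selection and compression steps can be illustrated as below, assuming that the channel envelopes have already been extracted. The number of selected channels, the logarithmic compression curve and the numeric threshold and comfort levels are hypothetical, since the text does not specify them.

```python
import math


def select_maxima(envelopes, n):
    """Indices of the n largest channel envelopes, returned in channel order."""
    ranked = sorted(range(len(envelopes)), key=lambda c: envelopes[c], reverse=True)
    return sorted(ranked[:n])


def compress(envelope, floor, ceiling, threshold_level, comfort_level):
    """Map an acoustic envelope onto an electrical level with a log curve."""
    clipped = min(max(envelope, floor), ceiling)
    fraction = math.log(clipped / floor) / math.log(ceiling / floor)
    return threshold_level + fraction * (comfort_level - threshold_level)


def encode_frame(envelopes, n, floor=1e-3, ceiling=1.0,
                 threshold_level=100, comfort_level=200):
    """Return (channel, stimulation level) pairs for the selected channels."""
    return [
        (c, round(compress(envelopes[c], floor, ceiling,
                           threshold_level, comfort_level)))
        for c in select_maxima(envelopes, n)
    ]
```

The compression stage exists because the electrical dynamic range between a user's threshold and comfort levels is much narrower than the acoustic range of the envelopes, so a logarithmic mapping of this kind is a common choice.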

It should be noted that the speech enhancement method can identify the auditory scene in real time and automatically applies new speech enhancement parameters when the auditory scene changes. Moreover, the CNN model and the gain tables G(γ_k, ξ_k) are trained offline, so in actual use the computational load is small, the algorithm occupies few hardware resources, and the algorithmic delay is low, which allows it to run on highly portable mobile devices.

The preferred embodiment of the present invention has been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and variations according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain through logical analysis, reasoning or limited experimentation on the basis of the prior art under the concept of the present invention shall fall within the scope of protection defined by the claims.

Claims

1. A speech enhancement method for a cochlear implant, comprising the following steps: (A) a preprocessing program module preprocesses the sound signal; (B) an endpoint detection program module performs endpoint detection on the preprocessed sound signal, identifies the noise frames and speech frames in the sound signal, and separates the two; (C) a feature extraction program module performs feature extraction on the noise frames to obtain a feature vector; (D) a scene recognition program module applies a CNN to the feature vector to obtain a probability value for each preset scene, and the scene with the highest probability value is determined to be the noise scene; (E) a noise reduction program module selects the corresponding gain table G(γ_k, ξ_k) according to the noise scene, chooses the specific speech enhancement parameter in the gain table G(γ_k, ξ_k) by calculating the a priori SNR and the a posteriori SNR, and then performs speech enhancement on the preprocessed sound signal; (F) a strategy coding program module converts the enhanced sound signal into encoded information for output.

2. The speech enhancement method for a cochlear implant according to claim 1, characterized in that: in step E, γ_k and ξ_k of the gain table G(γ_k, ξ_k) range from -19 dB to 20 dB, and the training steps are as follows:

(a) Track the local minimum of the noisy-speech power spectrum:

(b) Calculate the speech presence probability: Sr = P(λ, k)/P_min(λ, k),

where δ(k) is a frequency-dependent empirical threshold determined by experiment. A band is judged to contain speech if Sr > δ(k); otherwise it is judged to be noise, that is:

I(λ, k) = 1 if Sr > δ(k), and I(λ, k) = 0 otherwise;

p(λ, k) = α_p p(λ-1, k) + (1 - α_p) I(λ, k), where α_p = 0.2;

(c) Calculate the frequency-dependent smoothing constant: α_s(λ, k) = α_d + (1 - α_d) p(λ, k), where α_d = 0.85 and the value range of α_s(λ, k) is α_d ≤ α_s(λ, k) ≤ 1;

(d) Update the noise power spectrum: D(λ, k) = α_s(λ, k) D(λ-1, k) + (1 - α_s(λ, k)) |Y(λ, k)|², where D(λ, k) is the estimate of the noise power spectrum. The gain is calculated with the minimum mean-square error algorithm based on the log spectrum (LOGSTAS-MMSE). The amplitude spectrum estimator of the speech is:

Gain function:

The integral in the above expression can be evaluated approximately, that is:

3. The speech enhancement method for a cochlear implant according to claim 2, characterized in that: the a priori SNR is calculated as ξ_k = λ_s(k)/λ_n(k) and the a posteriori SNR is calculated as γ_k = |Y_k|²/λ_n(k), where λ_s(k) and λ_n(k) are the variances of the speech and the noise in the k-th frequency band, and Y is the noisy speech.

4. The speech enhancement method for a cochlear implant according to claim 3, characterized in that: in step A, the preprocessing includes framing, windowing, pre-emphasis and frequency-domain conversion, where the windowing uses a Hamming window or a Hanning window.

5. The speech enhancement method for a cochlear implant according to claim 3, characterized in that: in step B, the endpoint detection uses a double-threshold method, where the two thresholds are an energy threshold and a zero-crossing-rate threshold.

6. The speech enhancement method for a cochlear implant according to claim 3, characterized in that: in step C, the feature extraction uses MFCC, FBank or a spectrogram.