WO2020087716A1 - Cochlear implant auditory scene recognition method - Google Patents

Cochlear implant auditory scene recognition method

Info

Publication number
WO2020087716A1
WO2020087716A1 PCT/CN2018/123296 CN2018123296W WO2020087716A1 WO 2020087716 A1 WO2020087716 A1 WO 2020087716A1 CN 2018123296 W CN2018123296 W CN 2018123296W WO 2020087716 A1 WO2020087716 A1 WO 2020087716A1
Authority
WO
WIPO (PCT)
Prior art keywords
scene
scene recognition
sound signal
recognition method
auditory
Prior art date
Application number
PCT/CN2018/123296
Other languages
English (en)
French (fr)
Inventor
樊伟
刘新东
刘根芳
魏清
Original Assignee
上海力声特医学科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海力声特医学科技有限公司
Publication of WO2020087716A1 publication Critical patent/WO2020087716A1/zh

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L15/08 Speech classification or search
    • G10L15/16 Speech classification or search using artificial neural networks
    • G10L15/26 Speech to text systems
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/18 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window

Definitions

  • the invention relates to an auditory scene recognition method, and in particular to a cochlear implant auditory scene recognition method.
  • Cochlear implants are currently recognized as the only effective method and device in the world that can restore hearing to patients with bilateral severe or extremely severe sensorineural hearing loss.
  • the existing cochlear implant works as follows: sound is first collected by a microphone and converted into an electrical signal, which undergoes dedicated digital processing and is then encoded according to a given strategy and transmitted into the body through a transmitting coil worn behind the ear; after the receiving coil of the implant senses the signal, it is decoded by the decoding chip so that the stimulating electrodes of the implant generate current, thereby stimulating the auditory nerve to produce hearing. Because of the constraints of the use environment, the sound is inevitably mixed with environmental noise, and the sound signal must be optimized by certain algorithms.
  • the technical problem to be solved by the present invention is to provide a cochlear implant auditory scene recognition method, which can recognize different auditory scenes.
  • the present invention provides a cochlear implant auditory scene recognition method, which includes the following steps: (A) a preprocessing program module divides the sound signal into frames and applies windowing; (B) a feature extraction program module performs feature extraction on the preprocessed sound signal; (C) a scene recognition program module performs a CNN operation on the feature-extracted sound signal to obtain a probability value for each preset scene, and the scene with the largest probability value is determined to be the final scene.
  • in step A, the windowing process uses a Hamming window or a Hanning window.
  • in step B, the feature vector is extracted using MFCC, FBank, or a spectrogram.
  • the CNN includes an input layer, an intermediate layer, and an output layer, wherein the input layer is a two-dimensional data matrix composed of sound signal features, the intermediate layer includes convolutional output layers, pooling output layers, and a fully connected output layer, the fully connected output layer is composed of one-dimensional data, and there is one fewer pooling output layer than convolutional output layers.
  • the pooling process uses max pooling (Maxpooling) or mean pooling (Meanpooling).
  • the activation function uses ReLU, sigmoid, tanh, or Logistic, where the ReLU formula is f(x) = max(0, x).
  • the cochlear implant auditory scene recognition method of the present invention uses CNN processing to identify different auditory scenes and provides guidance to the signal processing modules of the speech processor, such as subsequent speech enhancement and speech strategy, so that the signal processing of the speech processor matches the auditory scene more closely and outputs a stimulation signal that better corresponds to the actual auditory scene; this improves the clarity and intelligibility of the patient's speech signal in noisy environments, improves listening in music scenes, and further improves the quality of life of cochlear implant recipients.
  • FIG. 1 is a schematic flowchart of a method for recognizing an auditory scene of a cochlear implant of the present invention.
  • FIG. 2 is a schematic flow chart of the CNN processing sound signals of the present invention.
  • FIG. 3 is a flowchart of a specific embodiment of CNN processing sound signals according to the present invention.
  • the present invention provides a cochlear implant auditory scene recognition method for identifying different auditory scenes, such as classrooms, streets, concert halls, shopping malls, train stations, and vegetable markets.
  • the cochlear implant auditory scene recognition method includes three steps: preprocessing, feature extraction, and scene recognition.
  • the preprocessing program module divides the sound signal into frames and applies a window.
  • the purpose of the preprocessing is to use a window function to smoothly segment the sampled sound signal into frames; different frame lengths and window functions affect the output of the system.
  • the purpose of windowing is to reduce the leakage in the frequency domain of the signal and reduce the amplitude of the side lobes.
  • the windowing process can also use other window functions such as Hanning window, and the frame length and frame shift can also be changed and set according to the needs of the system.
  • the feature extraction program module performs feature extraction on the preprocessed sound signal, where the feature extraction uses MFCC (Mel-Frequency Cepstral Coefficients), FBank (Mel-scale Filter Bank), or a spectrogram.
  • the feature extraction method using FBank is as follows: an FFT is applied to each preprocessed frame, X[i,k] = FFT[x_i(m)]; the spectral line energy of each frame is computed, E[i,k] = [X_i(k)]^2; the Mel filter energy is computed, S(i,m) = Σ_k E[i,k]·H_m(k); and the logarithm is taken, FBank = log[S(i,m)].
  • H_m(k) is the frequency response of the Mel filter, and m is the number of Mel filters, which is 40 here.
  • the scene recognition program module performs a CNN (Convolutional Neural Network) operation on the feature-extracted sound signal to obtain a probability value for each preset scene and determines the scene with the largest probability value to be the final scene, thereby providing guidance to the signal processing modules of the speech processor, such as subsequent speech enhancement and speech strategy, so that the signal processing of the speech processor matches the auditory scene more closely.
  • the CNN includes an input layer, an intermediate layer, and an output layer, where the input layer is a two-dimensional data matrix composed of sound signal features, and the intermediate layer includes convolutional output layers, pooling output layers, and a fully connected output layer; the convolutional output layers perform convolution, the pooling output layers perform pooling, and the fully connected output layer, which is composed of one-dimensional data, also plays a pooling role whose purpose is dimensionality reduction; convolution and pooling occur in pairs, that is, there is one fewer pooling output layer than convolutional output layers.
  • the CNN processes the sound signal as follows: the sound signal enters the first convolutional output layer from the input layer and, after convolution, the feature group C_1 is output; the feature group C_1 enters the first pooling output layer and, after pooling, the feature group S_1 is output; the feature group S_1 enters the second convolutional output layer for convolution and the feature group C_2 is output, which then enters the second pooling output layer for pooling and the feature group S_2 is output, and so on; finally the Nth convolutional output layer outputs the final feature group, which is given a last pooling by the fully connected output layer to obtain the classification result for each preset scene, that is, the probability value of each preset scene, and finally the output layer determines the preset scene with the highest probability to be the final scene, where N is greater than or equal to 2.
  • the pooling process uses Maxpooling or Meanpooling.
  • the activation function uses ReLU (Rectified Linear Units): f(x) = max(0, x).
  • the activation function can also use sigmoid, tanh or logistic.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • Prostheses (AREA)

Abstract

A cochlear implant auditory scene recognition method includes the following steps: (A) a preprocessing program module divides a sound signal into frames and applies windowing; (B) a feature extraction program module performs feature extraction on the preprocessed sound signal; (C) a scene recognition program module performs a CNN operation on the feature-extracted sound signal to obtain a probability value for each preset scene, and the scene with the largest probability value is determined to be the final scene and output. Through CNN processing, the method can identify different auditory scenes and provide guidance to the signal processing modules of the speech processor, such as subsequent speech enhancement and speech strategy, so that the signal processing of the speech processor matches the auditory scene more closely and outputs a stimulation signal that better corresponds to the actual auditory scene, improving the clarity and intelligibility of the patient's speech signal in noisy environments, improving listening in music scenes, and further improving the quality of life of cochlear implant recipients.

Description

Cochlear implant auditory scene recognition method
Technical Field
The present invention relates to an auditory scene recognition method, and in particular to a cochlear implant auditory scene recognition method.
Background Art
Cochlear implants are currently the only method and device recognized worldwide as effective in restoring hearing to patients with bilateral severe or profound sensorineural hearing loss. An existing cochlear implant works as follows: sound is first collected by a microphone and converted into an electrical signal, which undergoes dedicated digital processing and is then encoded according to a given strategy and transmitted into the body through a transmitting coil worn behind the ear; after the receiving coil of the implant senses the signal, it is decoded by the decoding chip so that the stimulating electrodes of the implant generate current, thereby stimulating the auditory nerve to produce hearing. Because of the constraints of the use environment, the sound is inevitably mixed with environmental noise, and the sound signal must be optimized by certain algorithms. However, given the diversity of use environments, if only a single optimization algorithm is used, the optimized signal sometimes deviates from the actual situation and the best hearing effect cannot be achieved. A method for recognizing auditory scenes is therefore needed, so that different scenes use different optimization algorithms in order to achieve the best hearing effect.
Summary of the Invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is to provide a cochlear implant auditory scene recognition method that can identify different auditory scenes.
To achieve the above object, the present invention provides a cochlear implant auditory scene recognition method, which includes the following steps: (A) a preprocessing program module divides the sound signal into frames and applies windowing; (B) a feature extraction program module performs feature extraction on the preprocessed sound signal; (C) a scene recognition program module performs a CNN operation on the feature-extracted sound signal to obtain a probability value for each preset scene, and the scene with the largest probability value is determined to be the final scene.
In step A, the windowing process uses a Hamming window or a Hanning window.
Further, the Hamming window is:
w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1
where the window length N = 256 and the frame shift is 128.
In step B, the feature vector is extracted using MFCC, FBank, or a spectrogram.
Further, the FBank feature extraction method is: an FFT is applied to each frame of the preprocessed sound signal, X[i,k] = FFT[x_i(m)]; the spectral line energy is computed for each frame of FFT data, E[i,k] = [X_i(k)]^2; and the Mel filter energy is computed:
S(i,m) = Σ_k E[i,k]·H_m(k)
where H_m(k) is the frequency response of the Mel filter, and m is the number of Mel filters, which is 40 here; the logarithm is then taken: FBank = log[S(i,m)].
In step C, the CNN includes an input layer, an intermediate layer, and an output layer, where the input layer is a two-dimensional data matrix composed of sound signal features, and the intermediate layer includes convolutional output layers, pooling output layers, and a fully connected output layer; the fully connected output layer is composed of one-dimensional data, and there is one fewer pooling output layer than convolutional output layers.
Further, the pooling process uses max pooling (Maxpooling) or mean pooling (Meanpooling).
Still further, the activation function uses ReLU, sigmoid, tanh, or Logistic, where the ReLU formula is:
f(x) = max(0, x)
Through CNN processing, the cochlear implant auditory scene recognition method of the present invention can identify different auditory scenes and provide guidance to the signal processing modules of the speech processor, such as subsequent speech enhancement and speech strategy, so that the signal processing of the speech processor matches the auditory scene more closely and outputs a stimulation signal that better corresponds to the actual auditory scene, improving the clarity and intelligibility of the patient's speech signal in noisy environments, improving listening in music scenes, and further improving the quality of life of cochlear implant recipients.
The concept, specific structure, and technical effects of the present invention are further described below with reference to the accompanying drawings, so that the objects, features, and effects of the present invention can be fully understood.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the cochlear implant auditory scene recognition method of the present invention.
FIG. 2 is a schematic flowchart of the CNN processing of sound signals according to the present invention.
FIG. 3 is a flowchart of a specific embodiment of CNN processing of sound signals according to the present invention.
Detailed Description
The present invention provides a cochlear implant auditory scene recognition method for identifying different auditory scenes, such as classrooms, streets, concert halls, shopping malls, train stations, and vegetable markets.
As shown in FIG. 1, the cochlear implant auditory scene recognition method includes three steps: preprocessing, feature extraction, and scene recognition.
Preprocessing: the preprocessing program module divides the sound signal into frames and applies windowing. The purpose of the preprocessing is to use a window function to smoothly segment the sampled sound signal into frames; different frame lengths and window functions affect the output of the system. The purpose of windowing is to reduce leakage in the frequency domain of the signal and to lower the side-lobe amplitude.
A system sampling frequency of 16 kHz is taken as an example.
The windowing process uses a Hamming window with a window length of N = 256 and a frame shift of half the window length, that is, 128.
Hamming window:
w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1
The windowing process can also use other window functions such as the Hanning window, and the frame length and frame shift can also be set differently according to the needs of the system.
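As a minimal illustration of this preprocessing step (not part of the patent text), the following Python/NumPy sketch frames a 16 kHz signal with a window length of 256, a frame shift of 128, and the Hamming window given above; the function name and the synthetic test signal are purely illustrative.

```python
import numpy as np

def frame_and_window(x, frame_len=256, frame_shift=128):
    """Split a 1-D sound signal into overlapping frames and apply a Hamming window,
    w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1)), as described above."""
    n = np.arange(frame_len)
    hamming = 0.54 - 0.46 * np.cos(2.0 * np.pi * n / (frame_len - 1))
    num_frames = 1 + (len(x) - frame_len) // frame_shift
    frames = np.stack([x[i * frame_shift:i * frame_shift + frame_len]
                       for i in range(num_frames)])
    return frames * hamming  # shape: (num_frames, frame_len)

# Illustrative input: one second of a synthetic 440 Hz tone sampled at 16 kHz.
fs = 16000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 440 * t)
frames = frame_and_window(signal)
print(frames.shape)  # (124, 256) with these settings
```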
Feature extraction: the feature extraction program module performs feature extraction on the preprocessed sound signal, where the feature extraction uses MFCC (Mel-Frequency Cepstral Coefficients), FBank (Mel-scale Filter Bank), or a spectrogram.
The FBank feature extraction method is as follows:
An FFT is applied to each frame of the preprocessed sound signal: X[i,k] = FFT[x_i(m)];
The spectral line energy is computed for each frame of FFT data: E[i,k] = [X_i(k)]^2;
The Mel filter energy is computed:
S(i,m) = Σ_k E[i,k]·H_m(k)
where H_m(k) is the frequency response of the Mel filter, and m is the number of Mel filters, which is 40 here;
The logarithm is taken: FBank = log[S(i,m)].
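A minimal Python/NumPy sketch of this FBank computation is given below (again, not part of the patent text). The triangular Mel filterbank construction is a common textbook formulation assumed here, since the patent only specifies that 40 Mel filters are used; the small constant added before the logarithm simply avoids log(0). The `frames` input is the windowed frame matrix produced by the preprocessing sketch above.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters=40, n_fft=256, fs=16000):
    """Triangular Mel filters H_m(k) over the positive-frequency FFT bins (assumed form)."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(fs / 2.0), num_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / fs).astype(int)
    fbank = np.zeros((num_filters, n_fft // 2 + 1))
    for m in range(1, num_filters + 1):
        left, center, right = bins[m - 1], bins[m], bins[m + 1]
        for k in range(left, center):
            fbank[m - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[m - 1, k] = (right - k) / max(right - center, 1)
    return fbank  # shape: (num_filters, n_fft // 2 + 1)

def fbank_features(frames, num_filters=40, fs=16000):
    """FFT per frame -> spectral line energy -> Mel filter energy S(i, m) -> log."""
    n_fft = frames.shape[1]
    spectrum = np.fft.rfft(frames, n=n_fft, axis=1)                  # X[i, k]
    energy = np.abs(spectrum) ** 2                                   # E[i, k]
    mel_energy = energy @ mel_filterbank(num_filters, n_fft, fs).T   # S(i, m)
    return np.log(mel_energy + 1e-10)                                # FBank = log S(i, m)

# Example: features = fbank_features(frames)  ->  matrix of shape (num_frames, 40)
```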
Scene recognition: the scene recognition program module performs a CNN (Convolutional Neural Network) operation on the feature-extracted sound signal to obtain a probability value for each preset scene and determines the scene with the largest probability value to be the final scene, thereby providing guidance to the signal processing modules of the speech processor, such as subsequent speech enhancement and speech strategy, so that the signal processing of the speech processor matches the auditory scene more closely.
As shown in FIG. 2, the CNN includes an input layer, an intermediate layer, and an output layer, where the input layer is a two-dimensional data matrix composed of sound signal features, and the intermediate layer includes convolutional output layers, pooling output layers, and a fully connected output layer; the convolutional output layers perform convolution, the pooling output layers perform pooling, and the fully connected output layer, which is composed of one-dimensional data, also plays a pooling role whose purpose is dimensionality reduction; convolution and pooling occur in pairs, that is, there is one fewer pooling output layer than convolutional output layers. The CNN processes the sound signal as follows: the sound signal enters the first convolutional output layer from the input layer and, after convolution, the feature group C_1 is output; the feature group C_1 enters the first pooling output layer and, after pooling, the feature group S_1 is output; the feature group S_1 enters the second convolutional output layer for convolution and the feature group C_2 is output, which then enters the second pooling output layer for pooling and the feature group S_2 is output, and so on; finally, the Nth convolutional output layer outputs the final feature group, which is given a last pooling by the fully connected output layer to obtain the classification result for each preset scene, that is, the probability value of each preset scene, and finally the output layer determines the preset scene with the highest probability to be the final scene, where N is greater than or equal to 2.
As shown in FIG. 3, an example CNN framework parameter configuration is described; see the table below.
[Table of the CNN framework parameter configuration (original image PCTCN2018123296-appb-000006) not reproduced.]
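Because the parameter table referenced above is an image and is not reproduced here, the following tf.keras sketch uses hypothetical layer sizes (two convolutional layers with 16 and 32 filters, one pooling layer, and six preset scenes) chosen only to illustrate the structure described in the text: alternating convolution and pooling, one fewer pooling layer than convolutional layers, a fully connected softmax output, and selection of the scene with the largest probability. It is not the configuration actually used in the patent.

```python
import numpy as np
import tensorflow as tf

NUM_SCENES = 6  # e.g. classroom, street, concert hall, shopping mall, train station, vegetable market

# Hypothetical configuration: input = 2-D FBank feature matrix (frames x 40 Mel filters),
# convolution -> pooling -> convolution -> fully connected softmax over the preset scenes.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(124, 40, 1)),                                      # feature matrix as one channel
    tf.keras.layers.Conv2D(16, (3, 3), padding="same", activation="relu"),   # 1st convolution -> C_1
    tf.keras.layers.MaxPooling2D((2, 2)),                                    # 1st pooling -> S_1
    tf.keras.layers.Conv2D(32, (3, 3), padding="same", activation="relu"),   # Nth (here 2nd) convolution
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(NUM_SCENES, activation="softmax"),                 # fully connected output layer
])

# Inference: the preset scene with the largest probability value is the final scene.
features = np.random.rand(1, 124, 40, 1).astype("float32")  # placeholder FBank input
probs = model.predict(features, verbose=0)[0]
final_scene_index = int(np.argmax(probs))
```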
The pooling process uses max pooling (Maxpooling) or mean pooling (Meanpooling).
The activation function uses ReLU (Rectified Linear Units), with the formula:
f(x) = max(0, x)
The activation function can also use sigmoid, tanh, or Logistic.
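As a small numerical illustration of the activation and pooling operations named above (the feature-map values below are arbitrary and not from the patent):

```python
import numpy as np

def relu(x):
    """ReLU activation: f(x) = max(0, x)."""
    return np.maximum(0.0, x)

def pool_2x2(x, mode="max"):
    """2x2 max pooling or mean pooling over a 2-D feature map (dimensions assumed even)."""
    h, w = x.shape
    blocks = x.reshape(h // 2, 2, w // 2, 2)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

feature_map = np.array([[ 1.0, -2.0,  3.0,  0.5],
                        [-1.0,  4.0, -0.5,  2.0],
                        [ 0.0, -3.0,  1.5, -1.0],
                        [ 2.5,  1.0, -2.0,  0.0]])

activated = relu(feature_map)        # negative values are clipped to 0
print(pool_2x2(activated, "max"))    # max pooling (Maxpooling)
print(pool_2x2(activated, "mean"))   # mean pooling (Meanpooling)
```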
The preferred specific embodiments of the present invention have been described in detail above. It should be understood that a person of ordinary skill in the art can make many modifications and changes according to the concept of the present invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation in accordance with the concept of the present invention shall fall within the scope of protection determined by the claims.

Claims (8)

  1. A cochlear implant auditory scene recognition method, comprising the following steps: (A) a preprocessing program module divides a sound signal into frames and applies windowing; (B) a feature extraction program module performs feature extraction on the preprocessed sound signal; (C) a scene recognition program module performs a CNN operation on the feature-extracted sound signal to obtain a probability value for each preset scene, and the scene with the largest probability value is determined to be the final scene.
  2. The cochlear implant auditory scene recognition method according to claim 1, wherein in step A the windowing process uses a Hamming window or a Hanning window.
  3. The cochlear implant auditory scene recognition method according to claim 2, wherein the Hamming window is:
    w(n) = 0.54 - 0.46·cos(2πn/(N-1)), 0 ≤ n ≤ N-1
    where the window length N = 256 and the frame shift is 128.
  4. The cochlear implant auditory scene recognition method according to claim 1, wherein in step B the feature vector is extracted using MFCC, FBank, or a spectrogram.
  5. The cochlear implant auditory scene recognition method according to claim 4, wherein the FBank feature extraction process is: an FFT is applied to each frame of the preprocessed sound signal, X[i,k] = FFT[x_i(m)]; the spectral line energy is computed for each frame of FFT data, E[i,k] = [X_i(k)]^2; and the Mel filter energy is computed:
    S(i,m) = Σ_k E[i,k]·H_m(k)
    where H_m(k) is the frequency response of the Mel filter, and m is the number of Mel filters, which is 40 here; the logarithm is then taken: FBank = log[S(i,m)].
  6. The cochlear implant auditory scene recognition method according to claim 1, wherein in step C the CNN includes an input layer, an intermediate layer, and an output layer, where the input layer is a two-dimensional data matrix composed of sound signal features, the intermediate layer includes convolutional output layers, pooling output layers, and a fully connected output layer, the fully connected output layer is composed of one-dimensional data, and there is one fewer pooling output layer than convolutional output layers.
  7. The cochlear implant auditory scene recognition method according to claim 6, wherein the pooling process uses max pooling (Maxpooling) or mean pooling (Meanpooling).
  8. The cochlear implant auditory scene recognition method according to claim 7, wherein the activation function uses ReLU, sigmoid, tanh, or Logistic, where the ReLU formula is:
    f(x) = max(0, x)
PCT/CN2018/123296 2018-10-30 2018-12-25 Cochlear implant auditory scene recognition method WO2020087716A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811276582.0A CN109448702A (zh) 2018-10-30 2018-10-30 Cochlear implant auditory scene recognition method
CN201811276582.0 2018-10-30

Publications (1)

Publication Number Publication Date
WO2020087716A1 true WO2020087716A1 (zh) 2020-05-07

Family

ID=65549467

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123296 WO2020087716A1 (zh) 2018-10-30 2018-12-25 Cochlear implant auditory scene recognition method

Country Status (2)

Country Link
CN (1) CN109448702A (zh)
WO (1) WO2020087716A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109859768A (zh) * 2019-03-12 2019-06-07 上海力声特医学科技有限公司 Cochlear implant speech enhancement method
CN110796027B (zh) * 2019-10-10 2023-10-17 天津大学 Sound scene recognition method based on a compact-convolution neural network model
CN111491245B (zh) * 2020-03-13 2022-03-04 天津大学 Digital hearing aid sound field recognition algorithm and implementation method based on a recurrent neural network
CN113160844A (zh) * 2021-04-27 2021-07-23 山东省计算中心(国家超级计算济南中心) Speech enhancement method and system based on noise background classification

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477798A (zh) * 2009-02-17 2009-07-08 北京邮电大学 Method for analyzing and extracting audio data of set scenes
CN103456301A (zh) * 2012-05-28 2013-12-18 中兴通讯股份有限公司 Scene recognition method and device based on environmental sound, and mobile terminal
CN107103901A (zh) * 2017-04-03 2017-08-29 浙江诺尔康神经电子科技股份有限公司 Cochlear implant sound scene recognition system and method
CN108231067A (zh) * 2018-01-13 2018-06-29 福州大学 Sound scene recognition method based on convolutional neural network and random forest classification
CN108520757A (zh) * 2018-03-31 2018-09-11 华南理工大学 Automatic classification method of music-suitable scenes based on auditory characteristics

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102016214745B4 (de) * 2016-08-09 2018-06-14 Carl Von Ossietzky Universität Oldenburg Method for stimulating an implanted electrode arrangement of a hearing prosthesis
CN106682574A (zh) * 2016-11-18 2017-05-17 哈尔滨工程大学 Underwater multi-target recognition method using a one-dimensional deep convolutional network
CN108550375A (zh) * 2018-03-14 2018-09-18 鲁东大学 Emotion recognition method, device and computer equipment based on speech signals

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101477798A (zh) * 2009-02-17 2009-07-08 北京邮电大学 Method for analyzing and extracting audio data of set scenes
CN103456301A (zh) * 2012-05-28 2013-12-18 中兴通讯股份有限公司 Scene recognition method and device based on environmental sound, and mobile terminal
CN107103901A (zh) * 2017-04-03 2017-08-29 浙江诺尔康神经电子科技股份有限公司 Cochlear implant sound scene recognition system and method
CN108231067A (zh) * 2018-01-13 2018-06-29 福州大学 Sound scene recognition method based on convolutional neural network and random forest classification
CN108520757A (zh) * 2018-03-31 2018-09-11 华南理工大学 Automatic classification method of music-suitable scenes based on auditory characteristics

Also Published As

Publication number Publication date
CN109448702A (zh) 2019-03-08

Similar Documents

Publication Publication Date Title
Lai et al. Deep learning–based noise reduction approach to improve speech intelligibility for cochlear implant recipients
WO2020087716A1 (zh) Cochlear implant auditory scene recognition method
CN108766419B (zh) Abnormal speech distinguishing method based on deep learning
CN109326302B (zh) Speech enhancement method based on voiceprint comparison and generative adversarial networks
CN105741849B (zh) Speech enhancement method combining phase estimation and human auditory characteristics in digital hearing aids
CN110120227B (zh) Speech separation method using a deep stacked residual network
Stern et al. Hearing is believing: Biologically inspired methods for robust automatic speech recognition
CN107767859B (zh) Speaker intelligibility detection method for cochlear implant signals in noisy environments
CN110111769B (zh) Cochlear implant control method and device, readable storage medium, and cochlear implant
CN109410976A (zh) Speech enhancement method based on binaural sound source localization and deep learning in binaural hearing aids
CN103761974B (zh) A cochlear implant
CN109448755A (zh) Cochlear implant auditory scene recognition method
US7787640B2 (en) System and method for spectral enhancement employing compression and expansion
CN1967659A (zh) Speech enhancement method for hearing aids
Henry et al. Noise reduction in cochlear implant signal processing: A review and recent developments
Hazrati et al. Reverberation suppression in cochlear implants using a blind channel-selection strategy
CN110992967A (zh) Speech signal processing method and device, hearing aid, and storage medium
CN106782500A (zh) Fusion feature parameter extraction method based on pitch period and MFCC
CN109859768A (zh) Cochlear implant speech enhancement method
CN104778948A (zh) Noise-robust speech recognition method based on warped cepstral features
CN105845143A (zh) Speaker verification method based on support vector machines and system thereof
CN114189781A (zh) Noise reduction method and system for dual-microphone neural-network noise-reduction earphones
Saba et al. The effects of Lombard perturbation on speech intelligibility in noise for normal hearing and cochlear implant listeners
Nogueira et al. Development of a sound coding strategy based on a deep recurrent neural network for monaural source separation in cochlear implants
Zaman et al. Classification of Harmful Noise Signals for Hearing Aid Applications using Spectrogram Images and Convolutional Neural Networks

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18938809

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18938809

Country of ref document: EP

Kind code of ref document: A1