CN108447490A - Voiceprint recognition method and device based on memory bottleneck feature - Google Patents
Voiceprint recognition method and device based on memory bottleneck feature
- Publication number
- CN108447490A
- Authority
- CN
- China
- Prior art keywords
- layer
- speaker
- bottleneck
- neural network
- models
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/30—Authentication, i.e. establishing the identity or authorisation of security principals
- G06F21/31—User authentication
- G06F21/32—User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/06—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
- G06N3/063—Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/046—Forward inferencing; Production systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N7/00—Computing arrangements based on specific mathematical models
- G06N7/01—Probabilistic graphical models, e.g. probabilistic networks
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Software Systems (AREA)
- Biophysics (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- Mathematical Physics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Signal Processing (AREA)
- Computer Security & Cryptography (AREA)
- Neurology (AREA)
- Computer Hardware Design (AREA)
- Image Analysis (AREA)
- Telephonic Communication Services (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Description
Claims (16)
Priority Applications (7)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810146310.2A CN108447490B (zh) | 2018-02-12 | 2018-02-12 | Voiceprint recognition method and device based on memory bottleneck feature |
TW107146358A TW201935464A (zh) | 2018-02-12 | 2018-12-21 | Voiceprint recognition method and device based on memory bottleneck feature |
SG11202006090RA SG11202006090RA (en) | 2018-02-12 | 2019-01-25 | Voiceprint Recognition Method And Device Based On Memory Bottleneck Feature |
EP19750725.4A EP3719798B1 (en) | 2018-02-12 | 2019-01-25 | Voiceprint recognition method and device based on memorability bottleneck feature |
PCT/CN2019/073101 WO2019154107A1 (zh) | 2018-02-12 | 2019-01-25 | Voiceprint recognition method and device based on memory bottleneck feature |
EP21198758.1A EP3955246B1 (en) | 2018-02-12 | 2019-01-25 | Voiceprint recognition method and device based on memory bottleneck feature |
US16/905,354 US20200321008A1 (en) | 2018-02-12 | 2020-06-18 | Voiceprint recognition method and device based on memory bottleneck feature |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810146310.2A CN108447490B (zh) | 2018-02-12 | 2018-02-12 | Voiceprint recognition method and device based on memory bottleneck feature |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108447490A (zh) | 2018-08-24 |
CN108447490B (zh) | 2020-08-18 |
Family
ID=63192672
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810146310.2A Active CN108447490B (zh) | Voiceprint recognition method and device based on memory bottleneck feature | 2018-02-12 | 2018-02-12 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20200321008A1 (zh) |
EP (2) | EP3719798B1 (zh) |
CN (1) | CN108447490B (zh) |
SG (1) | SG11202006090RA (zh) |
TW (1) | TW201935464A (zh) |
WO (1) | WO2019154107A1 (zh) |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
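CN109036467A (zh) * | 2018-10-26 | 2018-12-18 | 南京邮电大学 | TF-LSTM-based CFFD extraction method, and speech emotion recognition method and system |
CN109360553A (zh) * | 2018-11-20 | 2019-02-19 | 华南理工大学 | A novel time-delay recurrent neural network for speech recognition |
CN109754812A (zh) * | 2019-01-30 | 2019-05-14 | 华南理工大学 | Voiceprint authentication method with recording-attack detection based on a convolutional neural network |
WO2019154107A1 (zh) * | 2018-02-12 | 2019-08-15 | 阿里巴巴集团控股有限公司 | Voiceprint recognition method and device based on memory bottleneck feature |
CN110379412A (zh) * | 2019-09-05 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Speech processing method, apparatus, electronic device, and computer-readable storage medium |
CN111028847A (zh) * | 2019-12-17 | 2020-04-17 | 广东电网有限责任公司 | Voiceprint recognition optimization method based on a back-end model, and related apparatus |
CN111354364A (zh) * | 2020-04-23 | 2020-06-30 | 上海依图网络科技有限公司 | Voiceprint recognition method and system based on RNN aggregation |
CN111653270A (zh) * | 2020-08-05 | 2020-09-11 | 腾讯科技(深圳)有限公司 | Speech processing method, apparatus, computer-readable storage medium, and electronic device |
CN112241467A (zh) * | 2020-12-18 | 2021-01-19 | 北京爱数智慧科技有限公司 | Method and apparatus for audio duplicate checking |
CN112333545A (zh) * | 2019-07-31 | 2021-02-05 | Tcl集团股份有限公司 | Television content recommendation method, system, storage medium, and smart television |
CN114333900A (zh) * | 2021-11-30 | 2022-04-12 | 南京硅基智能科技有限公司 | Method for end-to-end extraction of BNF features, network model, training method, and system |
US11315550B2 (en) * | 2018-11-19 | 2022-04-26 | Panasonic Intellectual Property Corporation Of America | Speaker recognition device, speaker recognition method, and recording medium |
CN114882906A (zh) * | 2022-06-30 | 2022-08-09 | 广州伏羲智能科技有限公司 | A novel environmental noise recognition method and system |
CN116072123A (zh) * | 2023-03-06 | 2023-05-05 | 南昌航天广信科技有限责任公司 | Broadcast information playback method, apparatus, readable storage medium, and electronic device |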
Families Citing this family (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102637339B1 (ko) * | 2018-08-31 | 2024-02-16 | 삼성전자주식회사 | Method and apparatus for personalizing a speech recognition model |
JP7024691B2 (ja) * | 2018-11-13 | 2022-02-24 | 日本電信電話株式会社 | Non-verbal utterance detection device, non-verbal utterance detection method, and program |
WO2020199013A1 (en) * | 2019-03-29 | 2020-10-08 | Microsoft Technology Licensing, Llc | Speaker diarization with early-stop clustering |
KR102294638B1 (ko) * | 2019-04-01 | 2021-08-27 | 한양대학교 산학협력단 | Joint training method and apparatus using deep neural network-based feature enhancement and a modified loss function for speaker recognition robust to noisy environments |
US11899765B2 (en) | 2019-12-23 | 2024-02-13 | Dts Inc. | Dual-factor identification system and method with adaptive enrollment |
CN113411723A (zh) * | 2021-01-13 | 2021-09-17 | 神盾股份有限公司 | Voice assistant system |
CN112951256B (zh) * | 2021-01-25 | 2023-10-31 | 北京达佳互联信息技术有限公司 | Speech processing method and apparatus |
CN112992126B (zh) * | 2021-04-22 | 2022-02-25 | 北京远鉴信息技术有限公司 | Speech authenticity verification method, apparatus, electronic device, and readable storage medium |
CN113284508B (zh) * | 2021-07-21 | 2021-11-09 | 中国科学院自动化研究所 | Generated-audio detection system based on hierarchical discrimination |
CN117238320B (zh) * | 2023-11-16 | 2024-01-09 | 天津大学 | Noise classification method based on a multi-feature-fusion convolutional neural network |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9324320B1 (en) * | 2014-10-02 | 2016-04-26 | Microsoft Technology Licensing, Llc | Neural network-based speech processing |
CN107492382A (zh) * | 2016-06-13 | 2017-12-19 | 阿里巴巴集团控股有限公司 | Neural network-based voiceprint information extraction method and apparatus |
CN107610707A (zh) * | 2016-12-15 | 2018-01-19 | 平安科技(深圳)有限公司 | Voiceprint recognition method and apparatus |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
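CN103971690A (zh) * | 2013-01-28 | 2014-08-06 | 腾讯科技(深圳)有限公司 | Voiceprint recognition method and apparatus |
CN105575394A (zh) * | 2016-01-04 | 2016-05-11 | 北京时代瑞朗科技有限公司 | Voiceprint recognition method based on hybrid modeling of total variability space and deep learning |
US9824692B1 (en) * | 2016-09-12 | 2017-11-21 | Pindrop Security, Inc. | End-to-end speaker recognition using deep neural network |
CN106448684A (zh) * | 2016-11-16 | 2017-02-22 | 北京大学深圳研究生院 | Channel-robust voiceprint recognition system based on deep belief network feature vectors |
CN106875942B (zh) * | 2016-12-28 | 2021-01-22 | 中国科学院自动化研究所 | Acoustic model adaptation method based on accent bottleneck features |
CN106952644A (zh) * | 2017-02-24 | 2017-07-14 | 华南理工大学 | Complex audio segmentation and clustering method based on bottleneck features |
CN108447490B (zh) * | 2018-02-12 | 2020-08-18 | 阿里巴巴集团控股有限公司 | Voiceprint recognition method and device based on memory bottleneck feature |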
-
2018
- 2018-02-12 CN CN201810146310.2A patent/CN108447490B/zh active Active
- 2018-12-21 TW TW107146358A patent/TW201935464A/zh unknown
-
2019
- 2019-01-25 SG SG11202006090RA patent/SG11202006090RA/en unknown
- 2019-01-25 WO PCT/CN2019/073101 patent/WO2019154107A1/zh unknown
- 2019-01-25 EP EP19750725.4A patent/EP3719798B1/en active Active
- 2019-01-25 EP EP21198758.1A patent/EP3955246B1/en active Active
-
2020
- 2020-06-18 US US16/905,354 patent/US20200321008A1/en not_active Abandoned
Non-Patent Citations (2)
Title |
---|
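YICHI HUANG et al.: "Investigating the stacked phonetic bottleneck feature for speaker verification with short voice commands", 2017 4th IAPR Asian Conference on Pattern Recognition * |
黄光许 et al.: "LSTM recurrent neural network speech recognition system based on i-vector features under low-resource conditions", 《计算机应用研究》 (Application Research of Computers) * |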
Cited By (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
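WO2019154107A1 (zh) * | 2018-02-12 | 2019-08-15 | 阿里巴巴集团控股有限公司 | Voiceprint recognition method and device based on memory bottleneck feature |
CN109036467A (zh) * | 2018-10-26 | 2018-12-18 | 南京邮电大学 | TF-LSTM-based CFFD extraction method, and speech emotion recognition method and system |
US11315550B2 (en) * | 2018-11-19 | 2022-04-26 | Panasonic Intellectual Property Corporation Of America | Speaker recognition device, speaker recognition method, and recording medium |
CN109360553A (zh) * | 2018-11-20 | 2019-02-19 | 华南理工大学 | A novel time-delay recurrent neural network for speech recognition |
CN109360553B (zh) * | 2018-11-20 | 2023-06-20 | 华南理工大学 | A time-delay recurrent neural network for speech recognition |
CN109754812A (zh) * | 2019-01-30 | 2019-05-14 | 华南理工大学 | Voiceprint authentication method with recording-attack detection based on a convolutional neural network |
CN112333545A (zh) * | 2019-07-31 | 2021-02-05 | Tcl集团股份有限公司 | Television content recommendation method, system, storage medium, and smart television |
CN110379412B (zh) * | 2019-09-05 | 2022-06-17 | 腾讯科技(深圳)有限公司 | Speech processing method, apparatus, electronic device, and computer-readable storage medium |
CN110379412A (zh) * | 2019-09-05 | 2019-10-25 | 腾讯科技(深圳)有限公司 | Speech processing method, apparatus, electronic device, and computer-readable storage medium |
US11948552B2 (en) | 2019-09-05 | 2024-04-02 | Tencent Technology (Shenzhen) Company Limited | Speech processing method, apparatus, electronic device, and computer-readable storage medium |
CN111028847A (zh) * | 2019-12-17 | 2020-04-17 | 广东电网有限责任公司 | Voiceprint recognition optimization method based on a back-end model, and related apparatus |
CN111028847B (zh) * | 2019-12-17 | 2022-09-09 | 广东电网有限责任公司 | Voiceprint recognition optimization method based on a back-end model, and related apparatus |
CN111354364B (zh) * | 2020-04-23 | 2023-05-02 | 上海依图网络科技有限公司 | Voiceprint recognition method and system based on RNN aggregation |
CN111354364A (zh) * | 2020-04-23 | 2020-06-30 | 上海依图网络科技有限公司 | Voiceprint recognition method and system based on RNN aggregation |
CN111653270A (zh) * | 2020-08-05 | 2020-09-11 | 腾讯科技(深圳)有限公司 | Speech processing method, apparatus, computer-readable storage medium, and electronic device |
CN112241467A (zh) * | 2020-12-18 | 2021-01-19 | 北京爱数智慧科技有限公司 | Method and apparatus for audio duplicate checking |
CN114333900A (zh) * | 2021-11-30 | 2022-04-12 | 南京硅基智能科技有限公司 | Method for end-to-end extraction of BNF features, network model, training method, and system |
CN114333900B (zh) * | 2021-11-30 | 2023-09-05 | 南京硅基智能科技有限公司 | Method for end-to-end extraction of BNF features, network model, training method, and system |
CN114882906A (zh) * | 2022-06-30 | 2022-08-09 | 广州伏羲智能科技有限公司 | A novel environmental noise recognition method and system |
CN116072123A (zh) * | 2023-03-06 | 2023-05-05 | 南昌航天广信科技有限责任公司 | Broadcast information playback method, apparatus, readable storage medium, and electronic device |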
Also Published As
Publication number | Publication date |
---|---|
EP3955246B1 (en) | 2023-03-29 |
CN108447490B (zh) | 2020-08-18 |
EP3719798A4 (en) | 2021-03-24 |
EP3719798B1 (en) | 2022-09-21 |
EP3955246A1 (en) | 2022-02-16 |
WO2019154107A1 (zh) | 2019-08-15 |
US20200321008A1 (en) | 2020-10-08 |
TW201935464A (zh) | 2019-09-01 |
EP3719798A1 (en) | 2020-10-07 |
SG11202006090RA (en) | 2020-07-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108447490A (zh) | | Voiceprint recognition method and device based on memory bottleneck feature |
Liu et al. | | Voice Conversion Across Arbitrary Speakers Based on a Single Target-Speaker Utterance. |
CN110675859B (zh) | | Multi-emotion recognition method, system, medium, and device combining speech and text |
CN109887484A (zh) | | Speech recognition and speech synthesis method and apparatus based on dual learning |
Wan | | Speaker verification using support vector machines |
CN109313892A (zh) | | Robust language identification method and system |
Xia et al. | | Using denoising autoencoder for emotion recognition. |
CN111091809B (zh) | | Regional accent recognition method and apparatus with deep feature fusion |
Omar et al. | | Training Universal Background Models for Speaker Recognition. |
Pascual et al. | | Multi-output RNN-LSTM for multiple speaker speech synthesis and adaptation |
Zheng et al. | | An improved speech emotion recognition algorithm based on deep belief network |
JPH09507921A (ja) | | Speech recognition system using a neural network and method of using the same |
Gadasin et al. | | Using Formants for Human Speech Recognition by Artificial Intelligence |
Liu et al. | | Learning salient features for speech emotion recognition using CNN |
Konangi et al. | | Emotion recognition through speech: A review |
Koolagudi et al. | | Speaker recognition in the case of emotional environment using transformation of speech features |
Chen et al. | | Voice conversion using generative trained deep neural networks with multiple frame spectral envelopes. |
CN114360553B (zh) | | Method for improving voiceprint security |
Gamage et al. | | An i-vector gplda system for speech based emotion recognition |
Gao | | Audio deepfake detection based on differences in human and machine generated speech |
CN114333790A (zh) | | Data processing method, apparatus, device, storage medium, and program product |
Kang et al. | | SVLDL: Improved speaker age estimation using selective variance label distribution learning |
Yusuf et al. | | RMWSaug: robust multi-window spectrogram augmentation approach for deep learning based speech emotion recognition |
Bansal et al. | | Automatic speech recognition by cuckoo search optimization based artificial neural network classifier |
CN108510995B (zh) | | Identity information hiding method for voice communication |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | TR01 | Transfer of patent right | Effective date of registration: 20200924. Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee after: Innovative advanced technology Co.,Ltd. Address before: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee before: Advanced innovation technology Co.,Ltd. |
 | TR01 | Transfer of patent right | Effective date of registration: 20200924. Address after: Cayman Enterprise Centre, 27 Hospital Road, George Town, Grand Cayman Islands. Patentee after: Advanced innovation technology Co.,Ltd. Address before: A four-storey 847 mailbox in Grand Cayman Capital Building, British Cayman Islands. Patentee before: Alibaba Group Holding Ltd. |