WO2017162053A1 - Identity authentication method and device - Google Patents
Identity authentication method and device
- Publication number
- WO2017162053A1 (application PCT/CN2017/076336)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- text
- segmentation
- score
- target
- hmm
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
- G10L15/04—Segmentation; Word boundary detection
- G10L15/08—Speech classification or search
- G10L15/14—Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
- G10L15/142—Hidden Markov Models [HMMs]
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3226—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
- H04L9/3231—Biological data, e.g. fingerprint, voice or retina
Definitions
- performing forced segmentation on the initial segmentation units so that the total number of segmentation units equals the preset number of target texts, including:
- the voiceprint matching module is configured to match the target voice with the target voiceprint model to obtain a first voiceprint score, and match the non-target voice with the target voiceprint model to obtain a second voiceprint score;
- the processing module is further configured to: select each target text model in turn, match the voice features of the non-target text against the corresponding target text model, and obtain the mean and standard deviation of the spoofed text scores for that target text model; subtract the corresponding mean spoofed text score from the first text score and from the second text score, and divide each result by the standard deviation, to obtain the regularized text scores; combine the regularized first text score with the first voiceprint score and obtain the maximum and minimum values corresponding to each target text; and use the maximum and minimum values for normalization.
- the text matching module matches the voice features of each segmentation unit against all the target text models to obtain a segmentation unit text matching score for each segmentation unit and each target text model, including: using the voice feature of each segmentation unit as the input of each target text hidden Markov model (HMM), and using the output probability obtained with the Viterbi algorithm as the corresponding segmentation unit text matching score.
- Step d: combine the voiceprint scores and the regularized text scores, obtain the maximum and minimum values corresponding to each target text, and use the maximum and minimum values from step d to normalize the voiceprint scores and text scores; for example:
- Target speaker: a person trusted by the system who needs to pass voiceprint authentication.
- adjacent pairs of initial segmentation points are selected in turn as the start and end of a range; within that range, the average energy is calculated in units of a specified number of frames, and the point where the average energy increases continuously for a specified number of times is located; the point at which the increase begins is taken as a new initial segmentation point; otherwise, the initial segmentation points are not updated. The initial segmentation units are then delimited by the initial segmentation points.
- Step 110: the integrated decision classifier makes a decision on the input feature vector new_score. For each input, the output is 1 or 0: when the output is 1, the test voice is accepted; when the output is 0, the test voice is rejected.
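The score regularization described in the definitions above (subtract the mean of the spoofed text scores, divide by their standard deviation, then rescale with a maximum and a minimum) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the sample scores, and the use of the population standard deviation are hypothetical and do not come from the patent text.

```python
import statistics


def z_normalize(score, spoofed_scores):
    """Subtract the spoofed-score mean and divide by the spoofed-score std."""
    mean = statistics.mean(spoofed_scores)
    # Assumption: population standard deviation; the source does not say which.
    std = statistics.pstdev(spoofed_scores)
    if std == 0.0:
        raise ValueError("spoofed scores have zero variance")
    return (score - mean) / std


def min_max_normalize(value, lo, hi):
    """Rescale value linearly so that lo maps to 0.0 and hi maps to 1.0."""
    if hi == lo:
        raise ValueError("maximum and minimum coincide")
    return (value - lo) / (hi - lo)


# Hypothetical scores of non-target text against one target text model.
spoofed_scores = [-12.0, -8.0, -12.0, -8.0]
raw_text_score = -4.0

# Regularized text score: (-4 - (-10)) / 2
z = z_normalize(raw_text_score, spoofed_scores)

# Hypothetical maximum and minimum for this target text.
scaled = min_max_normalize(z, lo=-2.0, hi=6.0)
print(z, scaled)
```

The two stages are kept as separate functions because the definitions apply them at different points: the first uses per-model spoofed statistics, the second uses per-text extremes.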
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computational Linguistics (AREA)
- Computer Security & Cryptography (AREA)
- Life Sciences & Earth Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Biodiversity & Conservation Biology (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Machine Translation (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Telephonic Communication Services (AREA)
Abstract
The present invention relates to an identity authentication method, comprising: acquiring a voice feature of an input voice and matching the voice feature against a pre-stored target voiceprint model to obtain a voiceprint matching score (11); segmenting the input voice according to the voice feature and a target text model to obtain a plurality of initial segmentation units and initial speech segmentation units (12); if the number of initial speech segmentation units is greater than or equal to a first threshold, performing forced segmentation on the initial segmentation units so that the total number of segmentation units equals the preset number of target texts; matching the voice feature of each segmentation unit against each target text model to obtain a segmentation unit text matching score for each segmentation unit and each target text model (13); and performing identity authentication according to the segmentation unit text matching scores, the voiceprint matching scores, and a pre-trained probabilistic neural network (PNN) classifier (14). The method achieves two-factor authentication of a user and thereby increases the security of the system.
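To make the final decision stage more concrete, the sketch below shows a generic probabilistic neural network (a Parzen-window classifier with Gaussian kernels) that maps a score vector to 1 (accept) or 0 (reject). It is not the patent's trained classifier: the function name `pnn_classify`, the smoothing parameter `sigma`, the equal class priors, and the two-dimensional training vectors are all assumptions made for illustration.

```python
import math


def pnn_classify(x, train_by_class, sigma=0.5):
    """Return the label whose Gaussian Parzen-window density at x is largest."""
    best_label, best_density = None, -1.0
    for label, samples in train_by_class.items():
        # Pattern layer: one Gaussian kernel per training sample.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2.0 * sigma ** 2))
            for s in samples
        ]
        # Summation layer: average kernel response of this class.
        density = sum(kernels) / len(kernels)
        # Decision layer: keep the class with the largest density
        # (assumes equal class priors).
        if density > best_density:
            best_label, best_density = label, density
    return best_label


# Hypothetical normalized training vectors: [voiceprint score, text score].
train = {
    1: [[0.9, 0.8], [0.8, 0.9]],              # accept: right speaker, right text
    0: [[0.2, 0.9], [0.9, 0.1], [0.1, 0.2]],  # reject: either factor fails
}

decision = pnn_classify([0.85, 0.85], train)
print(decision)
```

Because both factors feed one classifier, a high voiceprint score with a low text score (or the reverse) lands near the reject samples, which is the two-factor behavior the abstract describes.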
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610162027.XA CN107221333B (zh) | 2016-03-21 | 2016-03-21 | Identity authentication method and device |
CN201610162027.X | 2016-03-21 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017162053A1 (fr) | 2017-09-28 |
Family
ID=59899353
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2017/076336 WO2017162053A1 (fr) | 2016-03-21 | 2017-03-10 | Identity authentication method and device |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107221333B (fr) |
WO (1) | WO2017162053A1 (fr) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108154588B (zh) * | 2017-12-29 | 2020-11-27 | 深圳市艾特智能科技有限公司 | Unlocking method, ***, readable storage medium and smart device |
CN108831484A (zh) * | 2018-05-29 | 2018-11-16 | 广东声将军科技有限公司 | Offline, language-independent voiceprint recognition method and device |
CN109545226B (zh) * | 2019-01-04 | 2022-11-22 | 平安科技(深圳)有限公司 | Speech recognition method, device and computer-readable storage medium |
EP3921833A1 (fr) * | 2019-04-05 | 2021-12-15 | Google LLC | Joint automatic speech recognition and speaker diarization |
CN110502610A (zh) * | 2019-07-24 | 2019-11-26 | 深圳壹账通智能科技有限公司 | Intelligent voice signature method, device and medium based on text semantic similarity |
CN111862967B (zh) * | 2020-04-07 | 2024-05-24 | 北京嘀嘀无限科技发展有限公司 | Speech recognition method, apparatus, electronic device and storage medium |
CN111882543B (zh) * | 2020-07-29 | 2023-12-26 | 南通大学 | Cigarette filter rod counting method based on AA R2Unet and HMM |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6671672B1 (en) * | 1999-03-30 | 2003-12-30 | Nuance Communications | Voice authentication system having cognitive recall mechanism for password verification |
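CN102413101A (zh) * | 2010-09-25 | 2012-04-11 | 盛乐信息技术(上海)有限公司 | Voiceprint authentication *** with voiceprint password voice prompts and implementation method thereof |
CN102457845A (zh) * | 2010-10-14 | 2012-05-16 | 阿里巴巴集团控股有限公司 | Wireless service identity authentication method, device and *** |
CN103220286A (zh) * | 2013-04-10 | 2013-07-24 | 郑方 | Identity confirmation *** and method based on dynamic password voice |
CN104021790A (zh) * | 2013-02-28 | 2014-09-03 | 联想(北京)有限公司 | Voice-controlled unlocking method and electronic device |
CN104064189A (zh) * | 2014-06-26 | 2014-09-24 | 厦门天聪智能软件有限公司 | Modeling and verification method for voiceprint dynamic passwords |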
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20060294390A1 (en) * | 2005-06-23 | 2006-12-28 | International Business Machines Corporation | Method and apparatus for sequential authentication using one or more error rates characterizing each security challenge |
CN102543084A (zh) * | 2010-12-29 | 2012-07-04 | 盛乐信息技术(上海)有限公司 | Online voiceprint authentication *** and implementation method thereof |
-
2016
- 2016-03-21 CN CN201610162027.XA patent/CN107221333B/zh active Active
-
2017
- 2017-03-10 WO PCT/CN2017/076336 patent/WO2017162053A1/fr active Application Filing
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110010135A (zh) * | 2018-01-05 | 2019-07-12 | 北京搜狗科技发展有限公司 | Voice-based identity recognition method, apparatus and electronic device |
CN110010135B (zh) * | 2018-01-05 | 2024-05-07 | 北京搜狗科技发展有限公司 | Voice-based identity recognition method, apparatus and electronic device |
WO2019194787A1 (fr) * | 2018-04-02 | 2019-10-10 | Visa International Service Association | Real-time entity anomaly detection |
US12041140B2 (en) | 2018-04-02 | 2024-07-16 | Visa International Service Association | Real-time entity anomaly detection |
CN111131237A (zh) * | 2019-12-23 | 2020-05-08 | 深圳供电局有限公司 | Microgrid attack identification method based on BP neural network and grid-connected interface device |
CN111131237B (zh) * | 2019-12-23 | 2020-12-29 | 深圳供电局有限公司 | Microgrid attack identification method based on BP neural network and grid-connected interface device |
CN111862933A (zh) * | 2020-07-20 | 2020-10-30 | 北京字节跳动网络技术有限公司 | Method, apparatus, device and medium for generating synthesized speech |
CN112423063A (zh) * | 2020-11-03 | 2021-02-26 | 深圳Tcl新技术有限公司 | Automatic setup method and device for a smart TV, and storage medium |
CN112751838A (zh) * | 2020-12-25 | 2021-05-04 | 中国人民解放军陆军装甲兵学院 | Identity authentication method, device and identity authentication *** |
Also Published As
Publication number | Publication date |
---|---|
CN107221333A (zh) | 2017-09-29 |
CN107221333B (zh) | 2019-11-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2017162053A1 (fr) | Identity authentication method and device | |
JP6766221B2 (ja) | Neural networks for speaker verification | |
US10325602B2 (en) | Neural networks for speaker verification | |
US8050919B2 (en) | Speaker recognition via voice sample based on multiple nearest neighbor classifiers | |
KR100655491B1 (ko) | Two-stage utterance verification method and apparatus in a speech recognition system | |
US20170236520A1 (en) | Generating Models for Text-Dependent Speaker Verification | |
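CN111199741A (zh) | Voiceprint recognition method, voiceprint verification method, apparatus, computing device and medium | |
JPH11507443A (ja) | Speaker verification system | |
JP2007133414A (ja) | Method and apparatus for estimating the discrimination capability of speech, and method and apparatus for enrollment and evaluation of speaker authentication | |
CN111524527A (zh) | Speaker separation method, apparatus, electronic device and storage medium | |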
US10909991B2 (en) | System for text-dependent speaker recognition and method thereof | |
US7050973B2 (en) | Speaker recognition using dynamic time warp template spotting | |
CN113744742B (zh) | Role recognition method, device and *** in dialogue scenarios | |
Yun et al. | An end-to-end text-independent speaker verification framework with a keyword adversarial network | |
Ozaydin | Design of a text independent speaker recognition system | |
Georgescu et al. | GMM-UBM modeling for speaker recognition on a Romanian large speech corpora | |
Das et al. | Comparison of DTW score and warping path for text dependent speaker verification system | |
CN105575385A (zh) | Voice password setting *** and method, voice password verification *** and method | |
WO2002029785A1 (fr) | Method, apparatus and system for speaker verification based on a Gaussian mixture model (GMM) | |
Sun et al. | A new study of GMM-SVM system for text-dependent speaker recognition | |
Nallagatla et al. | Sequential decision fusion for controlled detection errors | |
Mishra et al. | Speaker identification, differentiation and verification using deep learning for human machine interface | |
JP2000099090A (ja) | 記号列を用いた話者認識方法 | |
Kanrar | Dimension compactness in speaker identification | |
WO2006027844A1 (fr) | Speaker analyzer |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17769330 Country of ref document: EP Kind code of ref document: A1 |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 17769330 Country of ref document: EP Kind code of ref document: A1 |