WO2017162053A1 - Identity authentication method and device - Google Patents

Identity authentication method and device

Info

Publication number
WO2017162053A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
segmentation
score
target
hmm
Prior art date
Application number
PCT/CN2017/076336
Other languages
English (en)
Chinese (zh)
Inventor
朱长宝
李欢欢
袁浩
王金明
Original Assignee
中兴通讯股份有限公司
Priority date: 2016-03-21 (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date: 2017-03-10
Publication date: 2017-09-28
Application filed by 中兴通讯股份有限公司
Publication of WO2017162053A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/04: Segmentation; Word boundary detection
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/22: Interactive procedures; Man-machine interfaces
    • G10L17/24: Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/14: Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G10L15/142: Hidden Markov Models [HMMs]
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231: Biological data, e.g. fingerprint, voice or retina
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN

Definitions

  • performing forced segmentation on the initial segmentation units, so that the total number of segmentation units equals the preset number of target texts, includes:
  • the voiceprint matching module is configured to match the target voice with the target voiceprint model to obtain a first voiceprint score, and match the non-target voice with the target voiceprint model to obtain a second voiceprint score;
  • the processing module is further configured to: select each target text model in turn, match the voice features of non-target text against that target text model, and score the recognized text, obtaining the mean and standard deviation of the spoofed (impostor) text scores corresponding to that target text model; subtract the mean of the corresponding spoofed text scores from the first text score and from the second text score and divide each result by the standard deviation, obtaining the regularized text scores; combine the regularized first text score with the first voiceprint score and obtain the maximum value and the minimum value corresponding to each target text; the maximum value and the minimum value are then used for normalization.
  • the text matching module matching the voice features of each segmentation unit with all the target text models, to obtain a segmentation unit text matching score for each segmentation unit and each target text model, includes: using the voice features of each segmentation unit as the input of each target text hidden Markov model (HMM), and using the output probability obtained with the Viterbi algorithm as the corresponding segmentation unit text matching score.
  • Step d: combine the voiceprint scores and the regularized text scores, obtain the maximum and minimum values corresponding to each target text, and use these maximum and minimum values to normalize the voiceprint scores and the text scores; for example:
  • Target speaker: a person who is trusted by the system and who needs to pass voiceprint authentication.
  • each pair of adjacent initial segmentation points is selected in turn as the start and end of a range; within that range, the average energy is calculated in units of a specified number of frames, and the point from which the average energy increases continuously a specified number of times is located. The point at which the increase starts becomes a new initial segmentation point; otherwise, the initial segmentation point is not updated. The initial segmentation units are then delimited by the initial segmentation points.
  • Step 110: the integrated decision classifier classifies the input feature vector new_score. For each input, the output is 1 or 0: when the output is 1 the test voice is accepted, and when the output is 0 the test voice is rejected.
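
The score regularization described in the definitions above (subtract the mean of the impostor text scores, divide by their standard deviation, then rescale with a per-text minimum and maximum) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names, the choice of the population standard deviation, the rescaling to the range [0, 1], and every number are hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(score, impostor_scores):
    """Subtract the impostor mean and divide by the impostor standard deviation.

    Assumption: the population standard deviation is used; the source text
    does not say which estimator is intended.
    """
    mu = mean(impostor_scores)
    sigma = pstdev(impostor_scores)
    if sigma == 0:
        raise ValueError("impostor scores have zero spread")
    return (score - mu) / sigma


def min_max(value, lo, hi):
    """Rescale value with a previously obtained minimum and maximum."""
    if hi == lo:
        raise ValueError("minimum and maximum are equal")
    return (value - lo) / (hi - lo)


# Hypothetical numbers for a single target text model.
impostor_text_scores = [-60.0, -40.0]  # non-target text scored against this model
raw_text_score = -20.0                 # test segment scored against this model
z = z_normalize(raw_text_score, impostor_text_scores)

# Hypothetical extremes of the regularized scores for this target text.
score_min, score_max = -1.0, 4.0
scaled = min_max(z, score_min, score_max)
```

With these made-up inputs the impostor mean is -50 and the standard deviation is 10, so `z` is 3.0 and `scaled` is 0.8. A score far above the impostor distribution ends up near the top of the rescaled range, which is what makes scores from different target text models comparable before they are passed to a classifier.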

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Security & Cryptography (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention relates to an identity authentication method, comprising: acquiring a voice feature of an input voice, and matching the voice feature with a pre-stored target voiceprint model to obtain a voiceprint matching score (11); segmenting the input voice according to the voice feature and a target text model, and acquiring a plurality of initial segmentation units and initial voice segmentation units (12); if the number of initial voice segmentation units is greater than or equal to a first threshold, performing forced segmentation on the initial segmentation units, adjusting the total number of segmentation units to equal the number of preset target texts; matching the voice feature of each segmentation unit with each target text model to obtain a segmentation unit text matching score for each segmentation unit and each target text model (13); and performing identity authentication according to the segmentation unit text matching scores, the voiceprint matching scores and a pre-trained probabilistic neural network (PNN) classifier (14). The method achieves two-factor authentication of a user, thereby increasing the security of the system.
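
The abstract names a pre-trained probabilistic neural network (PNN) classifier as the final decision stage. The sketch below shows the general PNN idea only: one Gaussian kernel per stored training vector, a per-class average, and a decision for the class with the larger density. It is an illustration under assumptions, not the patent's classifier: the function name `pnn_classify`, the smoothing parameter `sigma`, the two-dimensional feature layout and the training vectors are all hypothetical.

```python
import math


def pnn_classify(x, training, sigma=0.5):
    """Return the class label whose Parzen-window density at x is largest.

    training maps a class label to a list of feature vectors.
    sigma is the Gaussian kernel width (a hypothetical default).
    """
    best_label, best_density = None, -1.0
    for label, samples in training.items():
        # Pattern layer: one Gaussian kernel per stored training vector.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, s)) / (2 * sigma ** 2))
            for s in samples
        ]
        # Summation layer: average kernel response for this class.
        density = sum(kernels) / len(kernels)
        # Decision layer: keep the class with the highest density.
        if density > best_density:
            best_label, best_density = label, density
    return best_label


# Hypothetical vectors: (normalized voiceprint score, normalized text score).
training = {
    1: [(0.9, 0.8), (0.8, 0.9), (0.85, 0.85)],  # target speaker, target text
    0: [(0.2, 0.3), (0.9, 0.1), (0.1, 0.9)],    # impostor voice or wrong text
}

accepted = pnn_classify((0.88, 0.82), training)
rejected = pnn_classify((0.15, 0.25), training)
```

Here `accepted` evaluates to 1 and `rejected` to 0. The made-up reject class deliberately contains vectors where only one of the two scores is high, which mirrors the two-factor goal stated in the abstract: a matching voice with the wrong text, or the right text in the wrong voice, should both fall on the reject side.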
PCT/CN2017/076336 2016-03-21 2017-03-10 Identity authentication method and device WO2017162053A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610162027.XA CN107221333B (zh) 2016-03-21 2016-03-21 Identity authentication method and device
CN201610162027.X 2016-03-21

Publications (1)

Publication Number Publication Date
WO2017162053A1 (fr) 2017-09-28

Family

ID=59899353

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/076336 WO2017162053A1 (fr) 2016-03-21 2017-03-10 Identity authentication method and device

Country Status (2)

Country Link
CN (1) CN107221333B (fr)
WO (1) WO2017162053A1 (fr)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108154588B (zh) * 2017-12-29 2020-11-27 深圳市艾特智能科技有限公司 Unlocking method, system, readable storage medium and smart device
CN108831484A (zh) * 2018-05-29 2018-11-16 广东声将军科技有限公司 Offline, language-independent voiceprint recognition method and device
CN109545226B (zh) * 2019-01-04 2022-11-22 平安科技(深圳)有限公司 Speech recognition method, device and computer-readable storage medium
EP3921833A1 (fr) * 2019-04-05 2021-12-15 Google LLC Joint automatic speech recognition and speaker diarization
CN110502610A (zh) * 2019-07-24 2019-11-26 深圳壹账通智能科技有限公司 Intelligent voice signature method, device and medium based on text semantic similarity
CN111862967B (zh) * 2020-04-07 2024-05-24 北京嘀嘀无限科技发展有限公司 Speech recognition method, device, electronic equipment and storage medium
CN111882543B (zh) * 2020-07-29 2023-12-26 南通大学 Cigarette filter rod counting method based on AA R2Unet and HMM

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060294390A1 (en) * 2005-06-23 2006-12-28 International Business Machines Corporation Method and apparatus for sequential authentication using one or more error rates characterizing each security challenge
CN102543084A (zh) * 2010-12-29 2012-07-04 盛乐信息技术(上海)有限公司 Online voiceprint authentication system and implementation method thereof

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6671672B1 (en) * 1999-03-30 2003-12-30 Nuance Communications Voice authentication system having cognitive recall mechanism for password verification
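CN102413101A (zh) * 2010-09-25 2012-04-11 盛乐信息技术(上海)有限公司 Voiceprint authentication system with voiceprint password voice prompt and implementation method thereof
CN102457845A (zh) * 2010-10-14 2012-05-16 阿里巴巴集团控股有限公司 Wireless service identity authentication method, device and system
CN104021790A (zh) * 2013-02-28 2014-09-03 联想(北京)有限公司 Voice-controlled unlocking method and electronic device
CN103220286A (zh) * 2013-04-10 2013-07-24 郑方 Identity confirmation system and method based on dynamic password voice
CN104064189A (zh) * 2014-06-26 2014-09-24 厦门天聪智能软件有限公司 Modeling and verification method for voiceprint dynamic passwords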

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
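CN110010135A (zh) * 2018-01-05 2019-07-12 北京搜狗科技发展有限公司 Voice-based identity recognition method, device and electronic equipment
CN110010135B (zh) * 2018-01-05 2024-05-07 北京搜狗科技发展有限公司 Voice-based identity recognition method, device and electronic equipment
WO2019194787A1 (fr) * 2018-04-02 2019-10-10 Visa International Service Association Real-time entity anomaly detection
US12041140B2 (en) 2018-04-02 2024-07-16 Visa International Service Association Real-time entity anomaly detection
CN111131237A (zh) * 2019-12-23 2020-05-08 深圳供电局有限公司 Microgrid attack identification method based on BP neural network and grid-connected interface device
CN111131237B (zh) * 2019-12-23 2020-12-29 深圳供电局有限公司 Microgrid attack identification method based on BP neural network and grid-connected interface device
CN111862933A (zh) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for generating synthesized speech
CN112423063A (zh) * 2020-11-03 2021-02-26 深圳Tcl新技术有限公司 Automatic setting method and device for smart television, and storage medium
CN112751838A (zh) * 2020-12-25 2021-05-04 中国人民解放军陆军装甲兵学院 Identity authentication method, device and identity authentication system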

Also Published As

Publication number Publication date
CN107221333A (zh) 2017-09-29
CN107221333B (zh) 2019-11-08

Similar Documents

Publication Publication Date Title
WO2017162053A1 (fr) Identity authentication method and device
JP6766221B2 (ja) Neural networks for speaker verification
US10325602B2 (en) Neural networks for speaker verification
US8050919B2 (en) Speaker recognition via voice sample based on multiple nearest neighbor classifiers
KR100655491B1 (ko) Two-stage utterance verification method and device in a speech recognition system
US20170236520A1 (en) Generating Models for Text-Dependent Speaker Verification
CN111199741A (zh) Voiceprint recognition method, voiceprint verification method, device, computing equipment and medium
JPH11507443A (ja) Speaker verification system
JP2007133414A (ja) Method and apparatus for estimating the discrimination capability of speech, and method and apparatus for enrollment and evaluation of speaker authentication
CN111524527A (zh) Speaker separation method, device, electronic equipment and storage medium
US10909991B2 (en) System for text-dependent speaker recognition and method thereof
US7050973B2 (en) Speaker recognition using dynamic time warp template spotting
CN113744742B (zh) Role recognition method, device and system in dialogue scenarios
Yun et al. An end-to-end text-independent speaker verification framework with a keyword adversarial network
Ozaydin Design of a text independent speaker recognition system
Georgescu et al. GMM-UBM modeling for speaker recognition on a Romanian large speech corpora
Das et al. Comparison of DTW score and warping path for text dependent speaker verification system
CN105575385A (zh) Voice password setting system and method, voice password verification system and method
WO2002029785A1 (fr) Method, apparatus and system for speaker verification based on a Gaussian mixture model (GMM)
Sun et al. A new study of GMM-SVM system for text-dependent speaker recognition
Nallagatla et al. Sequential decision fusion for controlled detection errors
Mishra et al. Speaker identification, differentiation and verification using deep learning for human machine interface
JP2000099090A (ja) Speaker recognition method using symbol strings
Kanrar Dimension compactness in speaker identification
WO2006027844A1 (fr) Speaker analyzer

Legal Events

Date Code Title Description
NENP: Non-entry into the national phase. Ref country code: DE
121: EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 17769330; Country of ref document: EP; Kind code of ref document: A1
122: EP: PCT application non-entry in European phase. Ref document number: 17769330; Country of ref document: EP; Kind code of ref document: A1