CN105245497B - Identity authentication method and device - Google Patents

Identity authentication method and device

Info

Publication number
CN105245497B
CN105245497B (application CN201510542515.9A)
Authority
CN
China
Prior art keywords
voice
signal
speech
authentication
reliable
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510542515.9A
Other languages
Chinese (zh)
Other versions
CN105245497A (en)
Inventor
刘申宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority claimed: CN201510542515.9A
Publication of CN105245497A
Application granted
Publication of CN105245497B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/16 Hidden Markov models [HMM]


Abstract

To carry out identity authentication accurately, the application provides an identity authentication method and device. The method includes the steps of: step 1, establishing a voice authentication database; step 2, receiving and storing the voice file to be authenticated; step 3, identifying whether the voice file to be authenticated is genuine, and if genuine, proceeding to step 4, otherwise terminating and directly outputting an authentication-failure message; step 4, extracting speech-signal features and reconstructing the features in the speech signal; step 5, authenticating the reliable feature vector of the speech and outputting the identity authentication result.

Description

Identity authentication method and device
Technical field
This application relates to the technical field of biometric identification, and more particularly to an identity authentication method and device.
Background art
With the rapid development of information technology, and of the Internet in particular, data and information reach ever deeper into daily life. More and more transactions require identity authentication, for example intelligent access control, intelligent video surveillance, intelligent robots, public-security deployment, customs identity verification, and driver's-license checks in the public-safety field, and verification of the holders of all kinds of bank cards, financial cards, credit cards, and savings cards in the civil and economic fields. For information security, a person's identity usually has to be verified before business can be transacted.
Biometric identification is a technology that uses human biological characteristics to carry out identity authentication. A biological characteristic is a unique physiological property or behavior that can be measured, or automatically identified and verified. A biometric identification system samples the biological characteristic, extracts its unique features, converts them into a digital code, and assembles these codes into a feature template. When a person interacts with the identification system to authenticate an identity, the system acquires the person's features and compares them with the feature template to determine whether they match.
Biometric identification is currently the most convenient and secure identification technology: it requires neither memorizing complex passwords nor carrying keys, smart cards, or similar items. Commonly used biometric features include fingerprints, the iris, the retina, DNA, and voice. Speech is one of the natural attributes of human beings, and a speaker's voice carries that speaker's own biological characteristics; everyone's vocal organs show both innate physiological differences and acquired behavioral differences. As a result, more and more identity authentication technologies identify the speaker through speech analysis.
However, a person's voice file is also easy to copy or steal. If the voice file of the person to be verified is fraudulently copied and used to transact business without the party's consent, a considerable risk arises.
In addition, when a person's identity is identified by voice, ambient noise present while the voice input is received affects the accuracy of voice identification. How to deal with the influence of environmental noise on speech recognition is also a problem that urgently needs to be solved.
Summary of the invention
In view of this, the application provides an identity authentication method and device that avoid fraudulent copying and the loss of recognition accuracy, so that personnel identification can be carried out with high accuracy.
The application provides an identity authentication method, the method comprising:
step 1, establishing a voice authentication database;
step 2, receiving and storing the voice file to be authenticated;
step 3, identifying whether the voice file to be authenticated is genuine; if genuine, proceeding to step 4; if false, terminating and directly outputting an authentication-failure message;
step 4, extracting speech-signal features and reconstructing the reliable speech features in the speech signal;
step 5, authenticating the reliable feature vector of the speech and outputting the identity authentication result.
In a specific embodiment of the application, step 1 includes: collecting reliable speech from all persons, extracting features from the reliable speech, and recording the speech feature information in the voice authentication database.
In a specific embodiment of the application, step 3 is: identifying whether the specific features introduced into the voice file by copying exceed a set first threshold.
In a specific embodiment of the application, calculating the copy-induced specific features includes:
letting X = {X_1[n], …, X_T[n]} denote a speech signal of T frames and computing the discrete Fourier transform of the q-th frame signal X_q[n] (1 ≤ q ≤ T, 0 ≤ n ≤ N-1), where N is the number of states of the Markov chain in the speech signal;
forming the mean frame; and
extracting the difference that exists between original genuine speech and copied speech in the frequency domain by formula (3-3), where filter(·) is any prior-art filter function.
In a specific embodiment of the application, step 4 includes:
using the speech feature extraction algorithm employed when the speech recognition library was established to extract the feature vector X of the speech signal λ input in daily life;
separating the feature vector X of the speech signal λ input in daily life into a speech vector X_r and a noise vector X_u;
dividing the feature vector X into the speech vector X_r and the noise vector X_u according to the means and variances of its Gaussian functions, and calculating the prior probability P(v | X_r) of the v-th component given the speech vector X_r; and
taking the speech vector X_r of the speech signal as the reliable feature vector of the input speech signal.
The application also discloses an identity authentication device, the device comprising:
a voice authentication database 1 for storing the speech feature information of all personnel;
an acquisition module 2 for acquiring the voice information of the person to be authenticated;
an authenticity module 3 for identifying whether the voice file to be authenticated is genuine;
a feature extraction module 4 for extracting speech-signal features and reconstructing the features in the speech signal; and
an authentication module 5 for authenticating the reliable feature vector of the speech and outputting the identity authentication result.
In a specific embodiment of the application, the voice authentication database 1 is used to collect reliable speech from all persons, extract features from the reliable speech, and record the speech feature information in the voice authentication database.
In a specific embodiment of the application, the authenticity module 3 is specifically used to identify whether the specific features introduced into the voice file by copying exceed a set first threshold.
In a specific embodiment of the application, calculating the copy-induced specific features includes:
letting X = {X_1[n], …, X_T[n]} denote a speech signal of T frames and computing the discrete Fourier transform of the q-th frame signal X_q[n] (1 ≤ q ≤ T, 0 ≤ n ≤ N-1), where N is the number of states of the Markov chain in the speech signal;
forming the mean frame; and
extracting the difference that exists between original genuine speech and copied speech in the frequency domain by formula (3-3), where filter(·) is any prior-art filter function.
In a specific embodiment of the application, the authentication module 5 is configured to:
use the speech feature extraction algorithm employed when the speech recognition library was established to extract the feature vector X of the speech signal λ input in daily life;
separate the feature vector X of the speech signal λ input in daily life into a speech vector X_r and a noise vector X_u;
divide the feature vector X into the speech vector X_r and the noise vector X_u according to the means and variances of its Gaussian functions, and calculate the prior probability P(v | X_r) of the v-th component given the speech vector X_r; and
take the speech vector X_r of the speech signal as the reliable feature vector of the input speech signal.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings in the following description show only some of the embodiments recorded in this application, and a person of ordinary skill in the art can derive other drawings from them.
Fig. 1 is a flow chart of establishing the voice authentication database in the application;
Fig. 2 is a flow chart of the identity authentication method in the application;
Fig. 3 is a structural diagram of the identity authentication device in the application.
Specific embodiments
To enable those skilled in the art to better understand the technical solutions in the application, the technical solutions in the embodiments of the application are described clearly and completely below in conjunction with the drawings of those embodiments. Obviously, the described embodiments are only some, rather than all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the application shall fall within the protection scope of the application.
To solve the prior-art problem that fraudulent copying and environmental noise reduce the accuracy of speech recognition when voice is used for identity authentication, this application discloses an identity authentication method and device.
The implementation of the application is further illustrated below with reference to the accompanying drawings.
As shown in Fig. 1, a voice authentication database must first be established before voice authentication is carried out. The database can be built with any existing speech feature extraction technique; preferably, the following set-up process is used:
As needed, the voice information of all n persons who need to be authenticated is collected. To guarantee the accuracy of speech recognition, the voice information of each person can be augmented continuously under different conditions, so as to improve the recognition rate.
The voice information collected for person i corresponds to the audio signal x(i) and is stored in the raw speech storage area of the speech recognition database, i = 1, …, n (i is a positive integer).
The feature extraction process for the speech includes the following steps, applied to each audio signal x(i) stored in the raw speech storage area:
(1) The audio signal x(i) is divided into a series of consecutive frames, and a Fourier transform is applied to each frame.
(2) The audio signal is processed with a bank of filters in order to reduce the mutual leakage of spectral energy between adjacent frequency bands. The filter function used is:
filter(t) = B^n t^(n-1) e^(-2πBt) cos(2πf_0 t + θ) u(t)    Formula (1)
where:
the parameter θ is the initial phase of the filter, and n is the order of the filter;
u(t) = 0 for t < 0 and u(t) = 1 for t > 0;
B = 1.019 ERB(f_0), where ERB(f_0) is the equivalent rectangular bandwidth of the filter, related to the filter center frequency f_0 by:
ERB(f_0) = 24.7 + 0.108 f_0    Formula (2).
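For illustration only, the following Python sketch evaluates one filter of the form given by formula (1) on a sampled time axis; the sampling rate, center frequency, and filter order are hypothetical values chosen for the example rather than values prescribed by the application.

import numpy as np

def erb(f0):
    # Equivalent rectangular bandwidth of formula (2)
    return 24.7 + 0.108 * f0

def filter_impulse_response(t, f0, n=4, theta=0.0):
    # Formula (1): B^n * t^(n-1) * exp(-2*pi*B*t) * cos(2*pi*f0*t + theta) * u(t)
    B = 1.019 * erb(f0)
    u = (t > 0).astype(float)  # step function u(t)
    return (B ** n) * t ** (n - 1) * np.exp(-2 * np.pi * B * t) * np.cos(2 * np.pi * f0 * t + theta) * u

# Example: impulse response sampled at 16 kHz for a 1 kHz center frequency (assumed values)
fs = 16000
t = np.arange(0, 0.05, 1.0 / fs)
h = filter_impulse_response(t, f0=1000.0)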
(3) The intermediate bias of the audio signal is removed.
After the audio signal has been divided into frames, a certain number of frames are grouped into a segment; in the present invention a segment preferably consists of 7 frames, and this can be configured according to the processing capacity of the system.
Most speech recognition systems use a frame length of 20 ms to 30 ms; the present invention preferably uses a 26.5 ms Hamming window with an overlap of 10 ms between frames. The intermediate quantity Q(i, j) of each frame is obtained by averaging the frame energy P(i, j) over the segment, as given by formula (3).
Since a segment preferably consists of 7 frames in the present invention, M = 3 in formula (3); i is the channel number, j is the index of the frame in question, and j' runs over the frames of the segment.
During noise-energy removal, the ratio of the arithmetic mean to the geometric mean (AM/GM) can be used to express the degree to which the speech signal is corrupted; taking the logarithm of this ratio gives the quantity of formula (4).
In formula (4), Z is a flooring coefficient that avoids negative infinite values and keeps the deviation of the result within the allowed range; J is the total number of frames.
Let B(i) be the bias caused by ambient noise, where i denotes the channel index. By conditional probability, the intermediate quantity Q'(i, j | B(i)) after bias removal is:
Q'(i, j | B(i)) = max(Q(i, j) - B(i), 10^(-3) Q(i, j))    Formula (5)
From this, formula (6) is obtained. When the AM/GM ratio under noisy conditions is closest to the AM/GM value of the clean acoustic signal, the estimate of B(i) can be obtained as:
B'(i) = min{ B(i) | G'(i | B(i)) ≥ G_c(i) }    Formula (7)
where G_c(i) denotes the value of G(i) for the clean acoustic signal. After formula (7) has been evaluated for each channel, the noise-removal ratio is obtained for each time-frequency bin signal (i, j).
To smooth the computation, the noise-removal ratios of channels i-N to i+N are averaged, yielding the adjusted final function of formula (10).
All audio signals in the filter bank are processed with formula (10); the result after removal of the intermediate bias is the filter output.
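A minimal sketch, under stated assumptions, of the bias-removal idea of step (3): Q is taken to be an array of medium-time channel energies, the log AM/GM measure plays the role of G(i), and the candidate grid for B(i) as well as the helper names are hypothetical; the unreproduced formulas (3), (4), (6), and (8)-(10) are only approximated here.

import numpy as np

def log_am_gm(q, z=1e-12):
    # Log of the arithmetic-mean / geometric-mean ratio; z acts as the flooring coefficient Z
    q = np.maximum(q, z)
    return float(np.log(np.mean(q) / np.exp(np.mean(np.log(q)))))

def floor_bias(q, b):
    # Formula (5): subtract the bias estimate B(i), never dropping below 10^-3 of the original energy
    return np.maximum(q - b, 1e-3 * q)

def estimate_bias(q, g_clean, candidates):
    # In the spirit of formula (7): smallest candidate bias whose floored energies reach the clean-signal ratio G_c(i)
    for b in sorted(candidates):
        if log_am_gm(floor_bias(q, b)) >= g_clean:
            return b
    return float(max(candidates))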
(4) A non-linear power-function operation is applied to the audio-signal data output by all filters. The power function used is:
Y = X^0.1    Formula (11).
(5) The speech feature parameters are obtained by applying a discrete cosine transform to the output of step (4).
Since the discrete cosine transform (DCT) is a processing method well known in the field of speech processing, it is not described in detail here.
The computed speech features are stored in the database.
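Putting steps (1) to (5) together, the enrollment-side feature extraction could be sketched as follows; the FFT size, the filterbank construction (build_filterbank is an assumed helper that samples formula (1) across center frequencies), and the number of retained coefficients are illustrative assumptions, and the bias removal of step (3) is abbreviated to the helpers sketched above.

import numpy as np
from scipy.fftpack import dct

def extract_features(x, fs=16000, frame_ms=26.5, hop_ms=10, n_fft=512, n_ceps=13):
    frame = int(fs * frame_ms / 1000)
    hop = int(fs * hop_ms / 1000)
    win = np.hamming(frame)                                               # 26.5 ms Hamming window
    frames = np.stack([x[i:i + frame] * win
                       for i in range(0, len(x) - frame + 1, hop)])       # step (1): framing
    spectra = np.abs(np.fft.rfft(frames, n_fft)) ** 2                     # step (1): per-frame Fourier transform
    fbank = build_filterbank(n_fft, fs)                                   # step (2): bank of formula (1) filters (assumed helper)
    energies = spectra @ fbank.T                                          # channel energies P(i, j)
    # step (3), intermediate bias removal, is omitted here; see floor_bias / estimate_bias above
    compressed = energies ** 0.1                                          # step (4): formula (11)
    return dct(compressed, type=2, axis=1, norm='ortho')[:, :n_ceps]      # step (5): DCT -> speech feature parameters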
As shown in Fig. 2, this application discloses an identity authentication method comprising the following steps:
Step 1: establish the voice authentication database.
The voice authentication database can be established with any speech feature extraction technique in the prior art, or with the preferred embodiment described above.
Step 2: receive and store the voice file to be authenticated.
A voice prompt can be set in the voice acquisition device to prompt the person whose identity is to be identified to input a voice file, for example by collecting the person's voice through a microphone; other voice acquisition devices can also be used.
Step 3: identify whether the voice file to be authenticated is genuine; if genuine, go to Step 4; if false, terminate and directly output an authentication-failure message.
In general, a false voice file is usually obtained by copying, without the consent of the party concerned. Copying a voice file, however, and in particular copying it several times, inevitably changes the characteristic information in the voice file, and this change usually accompanies the signal uniformly throughout the entire file. Authenticity is therefore judged by identifying whether the specific features introduced into the voice file by copying exceed a set first threshold.
First, let X = {X_1[n], …, X_T[n]} denote a speech signal of T frames, and compute the discrete Fourier transform of each frame signal X_q[n] (1 ≤ q ≤ T, 0 ≤ n ≤ N-1), where N is the number of states of the Markov chain in the speech signal.
The mean frame is then formed by averaging over all T frames.
In general, original genuine speech and copied speech differ in part of the frequency range, and this difference can be extracted by formula (3-3), where filter(·) is any prior-art filter function, for example the filter function of formula (1).
In this application, testing shows that the first threshold is preferably 0.53.
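A minimal sketch of this copy check, assuming that the difference measure of formula (3-3) is the filtered gap between each frame spectrum and the mean frame; the way that gap is reduced to a single score, the frequency weighting standing in for filter(·), and the comparison against the 0.53 threshold are illustrative assumptions, since the exact formula is not reproduced here.

import numpy as np

def copy_score(frames, n_fft=512):
    # frames: array of windowed time-domain frames X_1[n] .. X_T[n]
    spectra = np.abs(np.fft.rfft(frames, n_fft))       # per-frame spectrum
    mean_frame = spectra.mean(axis=0)                   # mean frame over the T frames
    diff = np.abs(spectra - mean_frame)                 # frequency-domain deviation of each frame
    weight = np.linspace(0.0, 1.0, diff.shape[1])       # stand-in for filter(.) in formula (3-3): emphasize high bands
    return float((diff * weight).mean() / (spectra.mean() + 1e-12))

def is_genuine(frames, first_threshold=0.53):
    # Step 3: a score at or below the threshold is treated as genuine, above it as a copy
    return copy_score(frames) <= first_threshold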
Step 4: extract the speech-signal features and reconstruct the features in the speech signal.
The speech signals used when constructing the voice authentication database are usually collected professionally in a quiet environment, whereas during actual authentication the voice is input in an everyday environment in which various kinds of noise may be present. If features are extracted directly from a speech signal input under noisy conditions, they are affected by the noise, which in turn lowers the accuracy of identity authentication.
The feature vector X of the speech signal λ input in daily life is extracted with the speech feature extraction algorithm used when the speech recognition library was established. The feature vector X of the speech signal λ can then be separated into a speech vector X_r and a noise vector X_u.
A model is built for the prior probability p(X) of the speech signal and is obtained by combining over the training data, where V is the number of mixture components, v is the component index, p(v) is the prior probability of a mixture component, and P(X | v) denotes the v-th Gaussian distribution; as those skilled in the art will appreciate, its mean matrix is μ_v and its diagonal covariance matrix is σ_v². Given the features of a speech signal, they are divided into the speech vector X_r and the noise vector X_u according to the means and variances of its Gaussian functions, and the prior probability P(v | X_r) of the v-th component given the speech vector X_r is then calculated.
During reconstruction, the speech vector X_r of the speech signal is retained as the reliable feature vector of the input speech signal.
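As an illustration of how the mixture model and the speech/noise split might be realized, here is a sketch using a diagonal-covariance Gaussian mixture; the use of scikit-learn, the two-component setup, and the rule that labels the higher-energy component as speech are assumptions made for the example, not details fixed by the application.

import numpy as np
from sklearn.mixture import GaussianMixture

def split_reliable(features):
    # features: (frames x dims) feature vectors X extracted from the noisy input signal lambda
    gmm = GaussianMixture(n_components=2, covariance_type='diag').fit(features)
    post = gmm.predict_proba(features)                  # P(v | X) for every frame and component v
    speech_v = int(np.argmax(np.linalg.norm(gmm.means_, axis=1)))  # assumption: larger-mean component = speech
    reliable = features[post[:, speech_v] >= 0.5]       # X_r: frames attributed to speech
    noise = features[post[:, speech_v] < 0.5]           # X_u: frames attributed to noise
    return reliable, noise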
Step 5: authenticate the reliable feature vector of the speech and output the identity authentication result.
The reliable feature vector of the speech signal is input into the voice authentication database and compared; if a matching speech signal is found in the voice authentication database, verification passes, and if the voice authentication database contains no matching speech signal, verification fails.
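One possible way to realize this comparison is sketched below: each enrolled person is represented by the mean of their stored feature vectors, and the nearest template within a distance threshold is accepted. Both the cosine distance and the threshold value are assumptions for illustration, since the application does not specify the matching rule.

import numpy as np

def authenticate(reliable_vectors, templates, max_distance=0.35):
    # reliable_vectors: X_r frames of the person to be authenticated
    # templates: dict mapping person id -> enrolled mean feature vector from the voice authentication database
    probe = reliable_vectors.mean(axis=0)
    best_id, best_d = None, float('inf')
    for person, tpl in templates.items():
        d = 1.0 - np.dot(probe, tpl) / (np.linalg.norm(probe) * np.linalg.norm(tpl) + 1e-12)
        if d < best_d:
            best_id, best_d = person, d
    return (best_id, True) if best_d <= max_distance else (None, False)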
As shown in Fig. 3, the application also provides an identity authentication device comprising:
a voice authentication database 1 for storing the speech feature information of all personnel. The voice authentication database can be established with any speech feature extraction technique in the prior art, or with the preferred embodiment described above with reference to Fig. 1.
an acquisition module 2 for acquiring the voice information of the person to be authenticated. A voice prompt can be set in the voice acquisition device to prompt the person whose identity is to be identified to input a voice file, for example by collecting the person's voice through a microphone; other voice acquisition devices can also be used.
an authenticity module 3 for identifying whether the voice file to be authenticated is genuine. As described above for Step 3, the module identifies whether the specific features introduced into the voice file by copying exceed the set first threshold: the discrete Fourier transform of each frame and the mean frame are computed, the frequency-domain difference between original genuine speech and copied speech is extracted by formula (3-3), and the result is compared against the first threshold, preferably 0.53.
a feature extraction module 4 for extracting the speech-signal features and reconstructing the features in the speech signal. As described above for Step 4, the feature vector X of the speech signal λ input in daily life is separated into a speech vector X_r and a noise vector X_u according to the means and variances of its Gaussian mixture components, and X_r is retained as the reliable feature vector of the input speech signal.
an authentication module 5 for authenticating the reliable feature vector of the speech and outputting the identity authentication result. As described above for Step 5, the reliable feature vector of the speech signal is compared against the voice authentication database; if a matching speech signal is found, verification passes, otherwise verification fails.
Of course, a technical solution implementing the application need not achieve all of the above advantages at the same time.
Those skilled in the art will understand that embodiments of the application may be provided as a method, a device (apparatus), or a computer program product. Therefore, the application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the application may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, and optical memory) containing computer-usable program code.
The application is described with reference to flowcharts and/or block diagrams of the method, device (apparatus), and computer program product according to the embodiments of the application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data-processing device to produce a machine, so that the instructions executed by that processor produce an apparatus for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data-processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that realizes the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data-processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing; the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the application. Obviously, those skilled in the art can make various modifications and variations to the application without departing from its spirit and scope. If these modifications and variations fall within the scope of the claims of the application and their technical equivalents, the application is intended to encompass them as well.

Claims (4)

1. An identity authentication method, the method comprising:
step 1, establishing a voice authentication database;
step 2, receiving and storing the voice file to be authenticated;
step 3, identifying whether the voice file to be authenticated is genuine; if genuine, proceeding to step 4; if false, terminating and directly outputting an authentication-failure message;
step 4, extracting speech-signal features and reconstructing the reliable feature vector of the speech in the speech signal;
step 5, authenticating the reliable feature vector of the speech and outputting the identity authentication result;
wherein step 3 is: identifying whether the specific features introduced into the voice file by copying exceed a set first threshold; if the first threshold is not exceeded, the voice file is genuine, otherwise it is false;
and calculating the copy-induced specific features includes:
letting X = {X_1[n], ..., X_T[n]} denote a speech signal of T frames and computing the discrete Fourier transform of the q-th frame signal X_q[n] (1 ≤ q ≤ T, 0 ≤ n ≤ N-1), where N is the number of states of the Markov chain in the speech signal;
forming the mean frame; and
extracting the difference that exists between original genuine speech and copied speech in the frequency domain by formula (3-3), where filter(·) is a prior-art filter function.
2. The method according to claim 1, wherein step 1 includes: collecting reliable speech from all persons, extracting features from the reliable speech, and recording the speech feature information in the voice authentication database.
3. An identity authentication device, the device comprising:
a voice authentication database for storing the speech feature information of all personnel;
an acquisition module for acquiring the voice information of the person to be authenticated;
an authenticity module for identifying whether the voice file to be authenticated is genuine;
a feature extraction module for extracting speech-signal features and reconstructing the reliable feature vector in the speech signal;
an authentication module for authenticating the reliable feature vector in the speech signal and outputting the identity authentication result;
wherein the authenticity module is specifically used to identify whether the specific features introduced into the voice file by copying exceed a set first threshold; if the first threshold is not exceeded, the voice file is genuine, otherwise it is false;
and calculating the copy-induced specific features includes:
letting X = {X_1[n], ..., X_T[n]} denote a speech signal of T frames and computing the discrete Fourier transform of the q-th frame signal X_q[n] (1 ≤ q ≤ T, 0 ≤ n ≤ N-1), where N is the number of states of the Markov chain in the speech signal;
forming the mean frame; and
extracting the difference that exists between original genuine speech and copied speech in the frequency domain by formula (3-3), where filter(·) is a prior-art filter function.
4. The device according to claim 3, wherein the voice authentication database is used to collect reliable speech from all persons, extract features from the reliable speech, and record the speech feature information in the voice authentication database.
CN201510542515.9A 2015-08-31 2015-08-31 Identity authentication method and device Active CN105245497B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510542515.9A CN105245497B (en) 2015-08-31 2015-08-31 Identity authentication method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510542515.9A CN105245497B (en) 2015-08-31 2015-08-31 Identity authentication method and device

Publications (2)

Publication Number Publication Date
CN105245497A CN105245497A (en) 2016-01-13
CN105245497B true CN105245497B (en) 2019-01-04

Family

ID=55042997

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510542515.9A Active CN105245497B (en) 2015-08-31 2015-08-31 Identity authentication method and device

Country Status (1)

Country Link
CN (1) CN105245497B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109215643B (en) * 2017-07-05 2023-10-24 阿里巴巴集团控股有限公司 Interaction method, electronic equipment and server
CN107886956B (en) * 2017-11-13 2020-12-11 广州酷狗计算机科技有限公司 Audio recognition method and device and computer storage medium


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6109927B2 (en) * 2012-05-04 2017-04-05 カオニックス ラブス リミテッド ライアビリティ カンパニー System and method for source signal separation

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5054083A (en) * 1989-05-09 1991-10-01 Texas Instruments Incorporated Voice verification circuit for validating the identity of an unknown person
CN1397929A (en) * 2002-07-12 2003-02-19 清华大学 Speech intensifying-characteristic weighing-logrithmic spectrum addition method for anti-noise speech recognization
CN1627365A (en) * 2003-12-09 2005-06-15 摩托罗拉公司 Method and system adaptive to speaker without vetating to speech identifying database
CN102324232A (en) * 2011-09-12 2012-01-18 辽宁工业大学 Method for recognizing sound-groove and system based on gauss hybrid models
CN103236260A (en) * 2013-03-29 2013-08-07 京东方科技集团股份有限公司 Voice recognition system
CN103730114A (en) * 2013-12-31 2014-04-16 上海交通大学无锡研究院 Mobile equipment voiceprint recognition method based on joint factor analysis model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Research on Speech Recognition Technology Based on HMM"; Wei Feng et al.; New Progress in Communication Theory and Technology 2006: Proceedings of the 11th National Youth Conference on Communications; 2006-07-01; full text
"Speaker Recognition Based on Adaptive Gaussian Mixture Models and Fusion of Static and Dynamic Auditory Features"; Wu Di et al.; Optics and Precision Engineering; 2013-06-15; full text

Also Published As

Publication number Publication date
CN105245497A (en) 2016-01-13


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant