CN106971730A - A voiceprint recognition method based on channel compensation - Google Patents

Info

Publication number
CN106971730A
Authority
CN
China
Prior art keywords
frequency range
sequence number
data group
sequence
identification feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610025193.5A
Other languages
Chinese (zh)
Inventor
祝铭明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yutou Technology Hangzhou Co Ltd
Original Assignee
Yutou Technology Hangzhou Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yutou Technology Hangzhou Co Ltd
Priority to CN201610025193.5A
Publication of CN106971730A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a voiceprint recognition method based on channel compensation, belonging to the technical field of biometric identification. The method includes: receiving an externally input sound source; converting the sound source into standard speech with a channel compensation method according to a preset compensation model; and matching the speech against a first frequency range and a second frequency range respectively. The following operations are then performed for each of the two frequency ranges: the speech is divided into multiple identification segments; a feature transformation is applied to each identification segment to obtain identification features, which together form an identification feature space; the identification feature space is divided into multiple subspaces; a feature transformation is applied to the training sentences to obtain temporal feature points, which are assigned to the subspaces; a first sequence is formed from the sequence numbers of the subspaces, and a training identification feature is formed from it. A test identification feature is obtained from the test sentences in the same way. Finally, the test identification feature is compared with the training identification feature, and the voiceprint recognition result is obtained from the comparison result.

Description

A voiceprint recognition method based on channel compensation
Technical field
The present invention relates to the technical field of biometric identification, and in particular to a voiceprint recognition method based on channel compensation.
Background art
Voiceprint recognition, like fingerprint, iris, and face recognition, is a form of biometric identification and is regarded as the most natural biometric means of identity authentication. Voiceprint recognition makes it easy to verify a speaker's identity, and this form of verification offers strong privacy because a voiceprint generally cannot be maliciously copied or stolen. Voiceprint recognition therefore has prominent advantages in many fields, especially in the field of smart devices.
The basic process of voiceprint recognition consists of voice collection, feature extraction, and classification modeling. A common speech feature extraction method exploits the short-term stationarity of speech: a Mel cepstral transform converts the speech into a set of identification features, a learning process then models the speaker's speech to obtain a classification model of the speaker, and the voiceprint recognition result is finally obtained from the various identification models. This process has the following problems: (1) the voiceprint recognition model must learn from a large number of samples before it can be applied; (2) voiceprint recognition performed with such an identification model is computationally complex; (3) the model data computed with such an identification model is large; (4) the channel that carries the sound source is variable. For example, when the sound source is a recording, noise in the recording can distort the speech during voiceprint recognition, which greatly reduces recognition accuracy. In summary, for intelligent systems with limited resources, these existing problems limit the application of prior-art voiceprint recognition algorithms.
Summary of the invention
In view of the above problems in the prior art, a technical solution for a voiceprint recognition method based on channel compensation is now provided, which specifically includes:
A voiceprint recognition method based on channel compensation, wherein a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range, the method comprising the following steps:
Step S1, receiving an externally input sound source;
Step S2, converting the sound source into standard speech with a channel compensation method according to a preset compensation model;
Step S3, matching the speech against the first frequency range and the second frequency range respectively;
Step S4, dividing the speech in the first frequency range or in the second frequency range into identification segments of a specific length;
Step S5, performing a feature transformation on each identification segment to obtain a corresponding plurality of identification features, and using all identification features associated with all identification segments to form the identification feature space corresponding to the first frequency range, or the identification feature space corresponding to the second frequency range;
Step S6, dividing the identification feature space into a plurality of subspaces, describing each resulting subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S7, in the first frequency range or in the second frequency range respectively, performing the feature transformation on each training sentence associated with a training model to obtain a temporal feature point set containing the corresponding temporal feature points, assigning each temporal feature point to one of the subspaces in the same frequency range, forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a first sequence associated with the first frequency range or the second frequency range, and thereby forming the corresponding training identification feature;
Step S8, in the first frequency range or in the second frequency range respectively, performing the feature transformation on each test sentence associated with a test model to obtain the temporal feature point set, assigning each temporal feature point to one of the subspaces, forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a second sequence associated with the first frequency range or the second frequency range, and thereby forming the corresponding test identification feature;
Step S9, comparing the training identification feature associated with the first frequency range with the test identification feature to determine whether they are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation, or
comparing the training identification feature associated with the second frequency range with the test identification feature to determine whether they are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation.
Preferably, in the voiceprint recognition method based on channel compensation, in step S7, each temporal feature point is assigned to a subspace according to the nearest-neighbor rule.
Preferably, in the voiceprint recognition method based on channel compensation, in step S7, the subspaces to which the temporal feature points are assigned form a spatial sequence according to their sequence numbers, and the spatial sequence is used as the first sequence to form the training identification feature.
Preferably, in the voiceprint recognition method based on channel compensation, in step S8, the subspaces to which the temporal feature points are assigned form a spatial sequence according to their sequence numbers, and the spatial sequence is used as the second sequence to form the test identification feature.
Preferably, in the voiceprint recognition method based on channel compensation, in step S7, the spatial sequence includes a data group associated with each subspace, each data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a first data compression process performed on the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S71, recording the sequence number of each data group, and recording the repeat count associated with each sequence number;
Step S72, judging whether any sequence number has a repeat count of 1, and proceeding to step S73 when a data group with a repeat count of 1 exists;
Step S73, deleting the data group corresponding to the sequence number whose repeat count is 1;
Step S74, judging whether the sequence number of the data group preceding the deleted data group is the same as the sequence number of the data group following the deleted data group:
if they are the same, merging the preceding data group and the following data group;
if they differ, retaining both the preceding data group and the following data group;
the first sequence is formed after the first data compression has been performed on all data groups in the spatial sequence.
Preferably, in the voiceprint recognition method based on channel compensation, in step S8, the spatial sequence includes a data group associated with each subspace, each data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a second data compression process performed on the spatial sequence in the first frequency range or the second frequency range respectively, specifically:
Step S81, recording the sequence number of each data group, and recording the repeat count associated with each sequence number;
Step S82, judging whether any sequence number has a repeat count of 1, and proceeding to step S83 when a data group with a repeat count of 1 exists;
Step S83, deleting the data group corresponding to the sequence number whose repeat count is 1;
Step S84, judging whether the sequence number of the data group preceding the deleted data group is the same as the sequence number of the data group following the deleted data group:
if they are the same, merging the preceding data group and the following data group;
if they differ, retaining both the preceding data group and the following data group;
the second sequence is formed after the second data compression has been performed on all data groups in the spatial sequence.
Preferably, in the voiceprint recognition method based on channel compensation, the feature transformation is a Mel cepstral transform.
Preferably, in the voiceprint recognition method based on channel compensation, when the Mel cepstral transform is performed, each sentence is divided into frames of 20 ms with a frame shift of 10 ms to obtain the sentence frames associated with the sentence;
silence is then removed frame by frame, the Mel cepstral transform is applied to the sentence frames, 12 coefficients are retained per frame, and the 12 coefficients constitute the identification feature.
Preferably, in the voiceprint recognition method based on channel compensation, in step S6, the identification feature space is divided into several subspaces with the K-means algorithm, and the K-means center point of each resulting subspace is recorded as the description information of that subspace.
The beneficial effects of the above technical solution are as follows. A voiceprint recognition method based on channel compensation is provided in which channel compensation is applied to the sound source before the speech is recognized, converting the sound source into standard speech and thereby ensuring the accuracy of voiceprint recognition. The method keeps the computational load of voiceprint recognition small, saves storage and computing resources, overcomes the problems of modeling methods based on probability statistics, and is suitable for intelligent systems with limited system resources. In addition, a first frequency range representing child speakers and a second frequency range representing adult speakers are preset and compared separately, which further improves the accuracy of voiceprint recognition based on channel compensation.
Brief description of the drawings
Fig. 1 is an overall flow chart of a voiceprint recognition method based on channel compensation in a preferred embodiment of the present invention;
Fig. 2 is a schematic flow chart of the first data compression in a preferred embodiment of the present invention;
Fig. 3 is a schematic flow chart of the second data compression in a preferred embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative work fall within the scope of protection of the present invention.
It should be noted that, where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.
The invention is further described below with reference to the accompanying drawings and specific embodiments, which do not limit the present invention.
In a preferred embodiment of the present invention, in view of the above problems in the prior art, a voiceprint recognition method based on channel compensation is provided. The method is applicable to smart devices with a voice control function, for example an intelligent robot used in a private space.
In this voiceprint recognition method, a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range. Specifically, the frequency of speech may differ from user to user, and a rough division by frequency yields a lower frequency range corresponding to adult speakers and a higher frequency range corresponding to child speakers.
Further, voiceprint recognition based on channel compensation may differ between adult speakers and child speakers, in particular in the extraction of the voiceprint features and in the structure of the corresponding voiceprint model. The technical solution of the present invention therefore sets two frequency ranges for speech reception and recognizes adult speech and child speech separately according to these two ranges, which further improves recognition accuracy. In other words, the first frequency range described above can represent the speech frequency range of child speakers, and the second frequency range can represent the speech frequency range of adult speakers. In a preferred embodiment of the present invention, the two frequency ranges can be adjusted as experimental data accumulate, so that they accurately represent the speech frequency ranges of adult speakers and child speakers respectively.
In a preferred embodiment of the present invention, as shown in Fig. 1, the voiceprint recognition method based on channel compensation specifically includes the following steps:
Step S1, receiving an externally input sound source;
Step S2, converting the sound source into standard speech with a channel compensation method according to a preset compensation model;
Step S3, matching the speech against the first frequency range and the second frequency range respectively;
Step S4, dividing the speech in the first frequency range or in the second frequency range into identification segments of a specific length;
Step S5, performing a feature transformation on each identification segment to obtain a corresponding plurality of identification features, and using all identification features associated with all identification segments to form the identification feature space corresponding to the first frequency range, or the identification feature space corresponding to the second frequency range;
Step S6, dividing the identification feature space into a plurality of subspaces, describing each resulting subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S7, in the first frequency range or in the second frequency range respectively, performing the feature transformation on each training sentence associated with a training model to obtain a temporal feature point set containing the corresponding temporal feature points, assigning each temporal feature point to one of the subspaces in the same frequency range, forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a first sequence associated with the first frequency range or the second frequency range, and thereby forming the corresponding training identification feature;
Step S8, in the first frequency range or in the second frequency range respectively, performing the feature transformation on each test sentence associated with a test model to obtain the temporal feature point set, assigning each temporal feature point to one of the subspaces, forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a second sequence associated with the first frequency range or the second frequency range, and thereby forming the corresponding test identification feature;
Step S9, comparing the training identification feature associated with the first frequency range with the test identification feature to determine whether they are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation, or
comparing the training identification feature associated with the second frequency range with the test identification feature to determine whether they are similar, and processing the comparison result to obtain the confirmation result of the voiceprint recognition based on channel compensation.
In this embodiment, the voiceprint recognition method applies channel compensation to the sound source before the speech is recognized, converting the sound source into standard speech. This ensures the accuracy of voiceprint recognition and avoids the problem of the recognition result being affected because the sound source is a recording or because external noise is excessive.
In a preferred embodiment of the present invention, on the basis of the above presets, in steps S4-S5, speech from different backgrounds and different voices in the first frequency range or the second frequency range is first obtained and divided into identification segments of a specific length. Specifically, each sentence of this speech is divided into sentence frames of 20 ms with a frame shift of 10 ms; silence is then removed frame by frame, the Mel cepstral transform is applied to the speech frames, 12 coefficients are retained per frame, and these 12 coefficients constitute the identification feature. The identification features of all speech segments form the identification feature set, that is, the corresponding identification feature space.
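The framing and silence-removal part of steps S4-S5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the mean-energy silence test, and the `energy_threshold` value are hypothetical, since the text specifies only the 20 ms frame length and the 10 ms frame shift. The Mel cepstral transform itself is omitted.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], sample_rate: int,
                      frame_ms: int = 20, shift_ms: int = 10) -> List[List[float]]:
    """Cut a sentence into overlapping frames (20 ms long, 10 ms shift by default)."""
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    frames = []
    start = 0
    # Only full-length frames are kept; a trailing partial frame is dropped.
    while start + frame_len <= len(samples):
        frames.append(list(samples[start:start + frame_len]))
        start += shift
    return frames


def drop_silent_frames(frames: List[List[float]],
                       energy_threshold: float = 1e-4) -> List[List[float]]:
    """Remove frames whose mean energy is at or below a threshold.

    Assumption: the source says only that silence is removed frame by frame;
    the energy criterion and the threshold value are illustrative.
    """
    kept = []
    for frame in frames:
        energy = sum(x * x for x in frame) / len(frame)
        if energy > energy_threshold:
            kept.append(frame)
    return kept
```

Each frame that survives would then be passed to a Mel cepstral transform, with 12 coefficients retained per frame as the identification feature.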
In a preferred embodiment of the present invention, in step S6, the identification feature space is divided into a plurality of subspaces with the K-means algorithm. The K-means center point of each resulting subspace is recorded as the data description of that subspace, each subspace is numbered, and the description information of each subspace is recorded together with its sequence number. This step is performed separately on the identification feature space in the first frequency range and on the identification feature space in the second frequency range.
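A minimal sketch of the partition in step S6, assuming plain Lloyd-style K-means on lists of floats. The seeding with the first `k` points, the fixed iteration count, and the names `k_means` and `squared_distance` are assumptions made for illustration; the source states only that K-means is used and that each center point serves as the description of its subspace.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points: List[Sequence[float]], k: int,
            iterations: int = 20) -> List[List[float]]:
    """Return k center points; the list position of a center is its subspace index."""
    # Assumption: deterministic seeding with the first k points.
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters: List[List[Sequence[float]]] = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[i] = [sum(m[d] for m in members) / len(members)
                              for d in range(dim)]
    return centers
```

In this sketch the description information of a subspace is its center, and its sequence number can be taken as its position in the returned list (offset by one if numbering starts at 1).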
In a preferred embodiment of the present invention, the operation of step S7 is performed on the subspaces in the first frequency range or the second frequency range respectively: the feature transformation is performed on each training sentence associated with the training model to obtain a temporal feature point set containing the corresponding temporal feature points, each temporal feature point is assigned to one of the subspaces in the same frequency range, a first sequence associated with the first frequency range or the second frequency range is formed from the sequence numbers of the subspaces corresponding to the temporal feature points, and the corresponding training identification feature is thereby formed.
Specifically, in a preferred embodiment of the present invention, the training sentences may be part of a training model that is preset in the system after repeated training and that serves as the reference when the system performs a comparison.
Specifically, in a preferred embodiment of the present invention, in step S7, each temporal feature point is assigned according to the nearest-neighbor rule to one of the subspaces in the same frequency range (the first frequency range or the second frequency range), and the sequence number of the subspace corresponding to each temporal feature point is recorded. This finally yields a first sequence composed of subspace sequence numbers, for example (2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5), from which the corresponding training identification feature is formed.
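The nearest-neighbor assignment can be sketched as below, assuming each subspace is described by a center point and that sequence numbers start at 1. Both the function name `assign_to_subspaces` and the 1-based numbering are illustrative assumptions rather than details given in the text.

```python
from typing import List, Sequence


def assign_to_subspaces(feature_points: List[Sequence[float]],
                        centers: List[Sequence[float]]) -> List[int]:
    """Map each temporal feature point to the sequence number of its nearest center."""
    sequence = []
    for p in feature_points:
        nearest = min(
            range(len(centers)),
            key=lambda i: sum((x - y) ** 2 for x, y in zip(p, centers[i])),
        )
        sequence.append(nearest + 1)  # assumption: sequence numbers are 1-based
    return sequence


centers = [(0.0, 0.0), (10.0, 10.0), (20.0, 20.0)]
points = [(1.0, 0.0), (0.0, 1.0), (19.0, 21.0), (9.0, 9.0)]
print(assign_to_subspaces(points, centers))  # [1, 1, 3, 2]
```

Because the points are processed in time order, the output preserves the temporal order of the sentence, which is what makes it usable as the first or second sequence.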
Similarly, in a preferred embodiment of the present invention, in step S8, the following operations are performed on the subspaces in the first frequency range or the second frequency range respectively: the feature transformation is performed on the test sentences associated with the test model to obtain a temporal feature point set, each temporal feature point is assigned to one of the subspaces, a second sequence associated with the first frequency range or the second frequency range is formed from the sequence numbers of the subspaces corresponding to the temporal feature points, and the corresponding test identification feature is thereby formed.
In a preferred embodiment of the present invention, the test sentences are associated with the test model, that is, they are the sentences to be compared.
Specifically, in a preferred embodiment of the present invention, in step S8, each temporal feature point in the test sentence is likewise assigned according to the nearest-neighbor rule to one of the subspaces in the same frequency range (the first frequency range or the second frequency range), and the sequence number of the subspace corresponding to each temporal feature point is recorded. This finally yields a second sequence, likewise composed of subspace sequence numbers, for example (2, 3, 3, 5, 5, 8, 6, 6, 6, 4, 4), from which the corresponding test identification feature is formed. In a preferred embodiment of the present invention, steps S7 and S8 do not depend on each other (that is, execution of step S8 is not necessarily premised on step S7 having finished), so steps S7 and S8 may be carried out simultaneously. Fig. 1 nevertheless shows an embodiment in which steps S7 and S8 are carried out in sequence.
In a preferred embodiment of the present invention, in step S9, the training identification feature and the test identification feature formed above are compared, and the final result of the voiceprint recognition based on channel compensation is obtained from the comparison result.
Specifically, in step S9, the comparison is likewise carried out separately for the first frequency range and the second frequency range. The test identification feature in the first frequency range is compared with the training identification feature in the same first frequency range, and the result of the voiceprint recognition based on channel compensation is obtained from the comparison result. Likewise, the test identification feature in the second frequency range is compared with the training identification feature in the same second frequency range, and the result of the voiceprint recognition based on channel compensation is obtained from the comparison result.
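The text does not state how similarity is measured or what threshold decides a match. Purely as an illustration, the sketch below compares two identification features, represented as lists of (sequence number, count) pairs, with the matching ratio from Python's standard `difflib.SequenceMatcher`. The choice of measure, the `threshold` value of 0.8, and the function name `is_similar` are all assumptions and are not part of the described method.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

Feature = List[Tuple[int, int]]  # (subspace sequence number, repeat count)


def is_similar(training_feature: Feature, test_feature: Feature,
               threshold: float = 0.8) -> bool:
    """Compare two identification features from the SAME frequency range.

    Assumption: similarity is the SequenceMatcher ratio (2 * matches / total
    length); the source leaves both the measure and the threshold unspecified.
    """
    matcher = SequenceMatcher(None, training_feature, test_feature, autojunk=False)
    return matcher.ratio() >= threshold
```

In use, the function would be called once per frequency range, pairing the training and test identification features that belong to that range.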
Further, in a preferred embodiment of the present invention, in step S7, the spatial sequence includes a data group associated with each subspace, each data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a first data compression process performed on the spatial sequence in the first frequency range or the second frequency range respectively, which, as shown in Fig. 2, is specifically:
Step S71, recording the sequence number of each data group, and recording the repeat count associated with each sequence number;
Step S72, judging whether any sequence number has a repeat count of 1, and proceeding to step S73 when a data group with a repeat count of 1 exists;
Step S73, deleting the data group corresponding to the sequence number whose repeat count is 1;
Step S74, judging whether the sequence number of the data group preceding the deleted data group is the same as the sequence number of the data group following the deleted data group:
if they are the same, merging the preceding data group and the following data group;
if they differ, retaining both the preceding data group and the following data group;
the first sequence is formed after the first data compression has been performed on all data groups in the spatial sequence.
Specifically, in preferred embodiment of the invention, during above-mentioned first data compression, record The sequence number of subspace and the quantity of same sequence number, regard the quantity of sequence number and same sequence number as one group of data Arranged, when the quantity of same sequence number is 1, remove this group of data.In the foot stool of the present invention Embodiment in, the data of serial number 4 only have 1, then carry out the first data compression during delete Fall this group of data.
If after this group of data were removed, sequence number and one group of rear data in front of the data in one group of data In sequence number it is identical when, then by two combination simultaneously.The sequence number of the data group newly formed and the deleted data The sequence number of the one group of data in front of group is identical, and the quantity of same sequence number is in front of this group of deleted data one The quantity sum of the quantity and deleted one group of this group of data rear data of group data.Or, deleting After this group of data, the sequence number in front of the data in one group of data is different with the sequence number in the data of one group of rear, Then retain this two groups of data simultaneously.For example, in the preferred embodiment of the present invention, working as serial number After 4 data group is removed, positioned at the serial number 2 of the data of this group of data previous group, positioned at this group of data The serial number 8,2 and 8 of the data of later group is differed, so retaining former data group.
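The compression described above can be sketched as a run-length pass followed by deletion and merging. The sketch assumes that the repeat count of a data group means the number of consecutive occurrences of a sequence number, which is how the worked example reads. The function name `compress_sequence` is hypothetical.

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def compress_sequence(spatial_sequence: Sequence[int]) -> List[Tuple[int, int]]:
    """Group equal consecutive sequence numbers, drop groups of count 1,
    and merge the neighbours of a dropped group when their numbers match."""
    # Record each data group as (sequence number, repeat count).
    groups = [(number, len(list(run))) for number, run in groupby(spatial_sequence)]

    compressed: List[Tuple[int, int]] = []
    for number, count in groups:
        if count == 1:
            continue  # delete the data group whose count is 1
        if compressed and compressed[-1][0] == number:
            # The groups before and after a deleted group share a number:
            # merge them and add their counts.
            compressed[-1] = (number, compressed[-1][1] + count)
        else:
            compressed.append((number, count))
    return compressed


print(compress_sequence([2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5]))
# [(2, 2), (8, 3), (5, 5)]
```

For the example sequence, the single 4 is deleted and its neighbours (2 and 8) differ, so both are retained, which matches the outcome described in the text. The same routine would serve for the second data compression of step S8.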
In the preferred embodiment of the present invention, the First ray after the first data compression is above-mentioned instruction Practice identification feature.
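The first data compression described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function names are hypothetical, each data group is modelled as a (sequence number, count) tuple, and the handling of several consecutive single-count groups (all are dropped before the neighbours are compared) is an assumption, since the text only describes one deleted group at a time.

```python
from itertools import groupby


def run_length_groups(sequence_numbers):
    """Steps S71: turn a list of subspace sequence numbers into
    (sequence number, repetition count) data groups."""
    return [(number, len(list(run))) for number, run in groupby(sequence_numbers)]


def compress(groups):
    """Steps S72 to S74: drop data groups whose count is 1, then merge
    the neighbours of a dropped group when their sequence numbers match."""
    result = []
    for number, count in groups:
        if count == 1:
            # S73: a sequence number that occurs only once is deleted.
            continue
        if result and result[-1][0] == number:
            # S74, identical case: merge and add the two counts.
            result[-1] = (number, result[-1][1] + count)
        else:
            # S74, different case: keep both groups unchanged.
            result.append((number, count))
    return result


# Example from the text: 4 occurs once between 2 and 8, so it is removed
# and the groups for 2 and 8 are both kept.
first_sequence = compress(run_length_groups([2, 2, 2, 4, 8, 8]))
# Hypothetical example of the merge case: 4 sits between two runs of 2.
merged_sequence = compress(run_length_groups([2, 2, 4, 2, 2, 2]))
```

Here `first_sequence` is `[(2, 3), (8, 2)]` and `merged_sequence` is `[(2, 5)]`. The same routine applies unchanged to the second data compression, since the two procedures differ only in whether the input comes from training or test sentences.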
Correspondingly, in a preferred embodiment of the invention, in step S8 above, the spatial sequence includes the data groups associated with each subspace, and each data group corresponds to one sequence number;
After the spatial sequence is formed, the method further includes a second data compression performed on the spatial sequence in the first frequency range or in the second frequency range, as shown in Fig. 3:
Step S81: record the sequence number of each data group, and record the repetition count associated with each sequence number;
Step S82: determine whether any sequence number has a repetition count of 1, and proceed to step S83 when a data group with a repetition count of 1 exists;
Step S83: delete the data group corresponding to the sequence number whose repetition count is 1;
Step S84: determine whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following the deleted data group:
If identical, merge the preceding data group and the following data group;
If different, retain both the preceding data group and the following data group;
The second sequence is formed once the second data compression has been performed on all data groups in the spatial sequence.
Specifically, as in step S7 above, in step S8 the sequence number of each subspace and the number of times that sequence number repeats are likewise recorded, and each sequence number together with its count is arranged as one data group. When the count is 1, that data group is removed.
If, after a data group is removed, the sequence number of the data group before it is identical to the sequence number of the data group after it, the two data groups are merged. The merged data group takes the sequence number of the data group before the deleted one, and its count is the sum of the counts of the data groups before and after the deleted one. Otherwise, if the sequence numbers of the data groups before and after the deleted one differ, both data groups are retained. For example, in the preferred embodiment of the invention, after the data group with sequence number 4 is removed, the preceding data group has sequence number 2 and the following data group has sequence number 8; since 2 and 8 differ, both original data groups are retained.
Similarly, in the preferred embodiment of the invention, the second sequence obtained after the second data compression is the test identification feature.
In step S9 above, the training identification feature and the test identification feature under the same frequency range (the first frequency range or the second frequency range) are compared, and the final result of the voiceprint recognition based on channel compensation is obtained from the comparison result.
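The text does not state how the training and test identification features are compared. Purely as an assumption for illustration, the sketch below compares the ordered sequence numbers of the two compressed sequences with `difflib.SequenceMatcher` from the Python standard library and applies a hypothetical threshold; the function names, the decision to ignore the counts, and the threshold value are not taken from the patent.

```python
from difflib import SequenceMatcher


def sequence_similarity(train_groups, test_groups):
    """Similarity in [0, 1] between two compressed sequences.

    Each argument is a list of (sequence number, count) data groups.
    Only the order of sequence numbers is compared (an assumption).
    """
    train_numbers = [number for number, _ in train_groups]
    test_numbers = [number for number, _ in test_groups]
    return SequenceMatcher(None, train_numbers, test_numbers).ratio()


def is_same_speaker(train_groups, test_groups, threshold=0.8):
    """Hypothetical decision rule: accept when the similarity reaches
    the threshold. The value 0.8 is illustrative only."""
    return sequence_similarity(train_groups, test_groups) >= threshold


# Both features must come from the same frequency range.
training_feature = [(2, 3), (8, 2)]
matching_test = [(2, 5), (8, 3)]
other_test = [(1, 2), (3, 4)]
accepted = is_same_speaker(training_feature, matching_test)
rejected = is_same_speaker(training_feature, other_test)
```

In this sketch `accepted` is `True` and `rejected` is `False`. A practical system would tune the measure and threshold on held-out data for each frequency range.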
Performing the above steps reduces the amount of computation required for voiceprint recognition based on channel compensation, improves the recognition rate, and keeps the amount of data to be processed relatively small.
The foregoing describes only preferred embodiments of the invention and does not thereby limit its embodiments or scope of protection. Those skilled in the art should appreciate that all solutions obtained by equivalent substitution or obvious variation based on the description and drawings of the invention fall within the scope of protection of the invention.

Claims (9)

1. A voiceprint recognition method based on channel compensation, characterized in that a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range, the method comprising the following steps:
Step S1: receive an externally input sound source;
Step S2: convert the sound source into standard speech by a channel compensation method according to a preset compensation model;
Step S3: fit the speech to the first frequency range and to the second frequency range respectively;
Step S4: split the speech under the first frequency range or the second frequency range into identification sections of a specific length;
Step S5: apply a feature transformation to each identification section to obtain a plurality of corresponding identification features, and use all the identification features associated with all the identification sections to form the identification feature space corresponding to the first frequency range or the identification feature space corresponding to the second frequency range;
Step S6: divide the identification feature space into a plurality of subspaces, describe each resulting subspace with description information, and assign a corresponding sequence number to each subspace;
Step S7: in the first frequency range or in the second frequency range, apply the feature transformation to every training sentence associated with a training model to obtain a time-sequence feature point set containing the corresponding time-sequence feature points, assign each time-sequence feature point to one of the subspaces under the same frequency range, form a first sequence associated with the first frequency range or the second frequency range from the sequence numbers of the subspaces corresponding to the time-sequence feature points, and then form the corresponding training identification feature;
Step S8: in the first frequency range or in the second frequency range, apply the feature transformation to every test sentence associated with a test model to obtain the time-sequence feature point set, assign each time-sequence feature point to one of the subspaces, form a second sequence associated with the first frequency range or the second frequency range from the sequence numbers of the subspaces corresponding to the time-sequence feature points, and then form the corresponding test identification feature;
Step S9: compare whether the training identification feature and the test identification feature associated with the first frequency range are similar, and obtain the confirmation result of the voiceprint recognition based on channel compensation from the comparison result; or
compare whether the training identification feature and the test identification feature associated with the second frequency range are similar, and obtain the confirmation result of the voiceprint recognition based on channel compensation from the comparison result.
2. The voiceprint recognition method based on channel compensation as claimed in claim 1, characterized in that, in step S7, each time-sequence feature point is assigned to one of the subspaces according to the nearest-neighbour rule.
3. The voiceprint recognition method based on channel compensation as claimed in claim 1, characterized in that, in step S7, the subspaces to which the time-sequence feature points have been assigned are arranged by sequence number into a spatial sequence, and the spatial sequence is taken as the first sequence to form the training identification feature.
4. The voiceprint recognition method based on channel compensation as claimed in claim 1, characterized in that, in step S8, the subspaces to which the time-sequence feature points have been assigned are arranged by sequence number into a spatial sequence, and the spatial sequence is taken as the second sequence to form the test identification feature.
5. The voiceprint recognition method based on channel compensation as claimed in claim 3, characterized in that, in step S7, the spatial sequence includes the data groups associated with each subspace, each data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a first data compression performed on the spatial sequence in the first frequency range or in the second frequency range, specifically:
Step S71: record the sequence number of each data group, and record the repetition count associated with each sequence number;
Step S72: determine whether any sequence number has a repetition count of 1, and proceed to step S73 when a data group with a repetition count of 1 exists;
Step S73: delete the data group corresponding to the sequence number whose repetition count is 1;
Step S74: determine whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following the deleted data group:
If identical, merge the preceding data group and the following data group;
If different, retain both the preceding data group and the following data group;
The first sequence is formed once the first data compression has been performed on all the data groups in the spatial sequence.
6. The voiceprint recognition method based on channel compensation as claimed in claim 4, characterized in that, in step S8, the spatial sequence includes the data groups associated with each subspace, each data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a second data compression performed on the spatial sequence in the first frequency range or in the second frequency range, specifically:
Step S81: record the sequence number of each data group, and record the repetition count associated with each sequence number;
Step S82: determine whether any sequence number has a repetition count of 1, and proceed to step S83 when a data group with a repetition count of 1 exists;
Step S83: delete the data group corresponding to the sequence number whose repetition count is 1;
Step S84: determine whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following the deleted data group:
If identical, merge the preceding data group and the following data group;
If different, retain both the preceding data group and the following data group;
The second sequence is formed once the second data compression has been performed on all the data groups in the spatial sequence.
7. The voiceprint recognition method based on channel compensation as claimed in claim 1, characterized in that the feature transformation is a Mel cepstrum transform.
8. The voiceprint recognition method based on channel compensation as claimed in claim 7, characterized in that, when the Mel cepstrum transform is performed, each sentence is divided into frames of 20 ms with a frame shift of 10 ms to obtain the sentence frames associated with that sentence;
silence is then removed frame by frame, 12 coefficients are retained for each frame after the Mel cepstrum transform has been applied to the sentence frames, and the 12 coefficients constitute the identification feature.
9. The voiceprint recognition method based on channel compensation as claimed in claim 1, characterized in that, in step S6, the identification feature space is divided into a plurality of subspaces using the K-means algorithm, and the K-means center point of each resulting subspace is recorded as the description information of that subspace.
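The framing in claim 8 and the nearest-neighbour assignment in claims 2 and 9 can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, the K-means center points are taken as already computed, sequence numbers are assumed to start at 1 in the order the centers are listed, and the Mel cepstrum transform and silence removal are not implemented here.

```python
import math


def frame_signal(samples, sample_rate, frame_ms=20, shift_ms=10):
    """Split a sentence into frames of frame_ms with a shift of shift_ms
    (20 ms and 10 ms in claim 8). Trailing samples that do not fill a
    whole frame are discarded (an assumption)."""
    frame_len = sample_rate * frame_ms // 1000
    shift = sample_rate * shift_ms // 1000
    last_start = len(samples) - frame_len
    return [samples[start:start + frame_len]
            for start in range(0, last_start + 1, shift)]


def nearest_subspace(point, centers):
    """Nearest-neighbour rule: return the sequence number of the subspace
    whose center point is closest to the feature point (Euclidean distance)."""
    distances = [math.dist(point, center) for center in centers]
    return distances.index(min(distances)) + 1  # 1-based numbering is assumed


def sequence_numbers(points, centers):
    """Map each time-sequence feature point to its subspace sequence number."""
    return [nearest_subspace(point, centers) for point in points]


# 1000 samples at 16 kHz: frames of 320 samples, shifted by 160 samples.
frames = frame_signal(list(range(1000)), 16000)

# Two-dimensional toy points; real features would hold 12 coefficients each.
toy_centers = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
toy_points = [[0.0, 0.1], [4.8, 5.0], [1.0, 1.2]]
labels = sequence_numbers(toy_points, toy_centers)
```

With these inputs `frames` holds 5 frames of 320 samples each and `labels` is `[1, 3, 2]`. The resulting list of sequence numbers is the input to the data compression of claims 5 and 6.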
CN201610025193.5A 2016-01-14 2016-01-14 A kind of method for recognizing sound-groove based on channel compensation Pending CN106971730A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610025193.5A CN106971730A (en) 2016-01-14 2016-01-14 A kind of method for recognizing sound-groove based on channel compensation

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610025193.5A CN106971730A (en) 2016-01-14 2016-01-14 A kind of method for recognizing sound-groove based on channel compensation

Publications (1)

Publication Number Publication Date
CN106971730A true CN106971730A (en) 2017-07-21

Family

ID=59335188

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610025193.5A Pending CN106971730A (en) 2016-01-14 2016-01-14 A kind of method for recognizing sound-groove based on channel compensation

Country Status (1)

Country Link
CN (1) CN106971730A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108492830A (en) * 2018-03-28 2018-09-04 深圳市声扬科技有限公司 Method for recognizing sound-groove, device, computer equipment and storage medium
CN111312283A (en) * 2020-02-24 2020-06-19 中国工商银行股份有限公司 Cross-channel voiceprint processing method and device
CN113488058A (en) * 2021-06-23 2021-10-08 武汉理工大学 Voiceprint recognition method based on short voice

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20030075330A (en) * 2002-03-18 2003-09-26 정희석 Channel Mis-match Compensation apparatus and method for Robust Speaker Verification system
CN101241699A (en) * 2008-03-14 2008-08-13 北京交通大学 A speaker identification system for remote Chinese teaching
CN101661754A (en) * 2003-10-03 2010-03-03 旭化成株式会社 Data processing unit, method and control program
CN101944359A (en) * 2010-07-23 2011-01-12 杭州网豆数字技术有限公司 Voice recognition method facing specific crowd
CN102129859A (en) * 2010-01-18 2011-07-20 盛乐信息技术(上海)有限公司 Voiceprint authentication system and method for rapid channel compensation
CN102623008A (en) * 2011-06-21 2012-08-01 中国科学院苏州纳米技术与纳米仿生研究所 Voiceprint identification method
CN104185868A (en) * 2012-01-24 2014-12-03 澳尔亚有限公司 Voice authentication and speech recognition system and method
CN104392718A (en) * 2014-11-26 2015-03-04 河海大学 Robust voice recognition method based on acoustic model array
US20150066494A1 (en) * 2013-09-03 2015-03-05 Amazon Technologies, Inc. Smart circular audio buffer

Similar Documents

Publication Publication Date Title
CN106971737A (en) A kind of method for recognizing sound-groove spoken based on many people
CN108597496B (en) Voice generation method and device based on generation type countermeasure network
CN108281137A (en) A kind of universal phonetic under whole tone element frame wakes up recognition methods and system
CN102509547B (en) Method and system for voiceprint recognition based on vector quantization based
CN105374356B (en) Audio recognition method, speech assessment method, speech recognition system and speech assessment system
CN108122556A (en) Reduce the method and device that driver's voice wakes up instruction word false triggering
CN108172218B (en) Voice modeling method and device
CN107492382A (en) Voiceprint extracting method and device based on neutral net
CN109346086A (en) Method for recognizing sound-groove, device, computer equipment and computer readable storage medium
CN106898355B (en) Speaker identification method based on secondary modeling
CN110047517A (en) Speech-emotion recognition method, answering method and computer equipment
CN109473105A (en) The voice print verification method, apparatus unrelated with text and computer equipment
CN106504772B (en) Speech-emotion recognition method based on weights of importance support vector machine classifier
CN111091809B (en) Regional accent recognition method and device based on depth feature fusion
CN107039036A (en) A kind of high-quality method for distinguishing speek person based on autocoding depth confidence network
CN106952648A (en) A kind of output intent and robot for robot
CN109637526A (en) The adaptive approach of DNN acoustic model based on personal identification feature
CN106971730A (en) A kind of method for recognizing sound-groove based on channel compensation
CN108461085A (en) A kind of method for distinguishing speek person under the conditions of Short Time Speech
CN105679323B (en) A kind of number discovery method and system
CN105845143A (en) Speaker confirmation method and speaker confirmation system based on support vector machine
CN106971727A (en) A kind of verification method of Application on Voiceprint Recognition
CN106971731A (en) A kind of modification method of Application on Voiceprint Recognition
CN116434758A (en) Voiceprint recognition model training method and device, electronic equipment and storage medium
CN116564315A (en) Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170721
