CN106653061A - Audio matching tracking device and tracking method thereof based on dictionary classification - Google Patents

Audio matching tracking device and tracking method thereof based on dictionary classification

Info

Publication number
CN106653061A
Authority
CN
China
Prior art keywords
signal
dictionary
atom
sparse
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610967738.4A
Other languages
Chinese (zh)
Inventor
胡瑞敏
姜林
胡霞
王晓晨
江游
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Research Institute of Wuhan University
Original Assignee
Shenzhen Research Institute of Wuhan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Research Institute of Wuhan University filed Critical Shenzhen Research Institute of Wuhan University
Priority to CN201610967738.4A priority Critical patent/CN106653061A/en
Publication of CN106653061A publication Critical patent/CN106653061A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The invention discloses an audio matching pursuit device based on dictionary classification, comprising a signal decomposition unit and a signal reconstruction unit. The signal decomposition unit comprises a dictionary building module, a signal classification module, a weight comparison module, a residual calculation module and a threshold control module; the signal reconstruction unit comprises a reconstruction coefficient extraction module and a signal synthesis module. The invention also discloses a pursuit method for the device. By classifying signals and running the MP algorithm with a different sparse dictionary for each signal type, the number of irrelevant traversals and the computational complexity are reduced. During classification preprocessing, the suitable sparse dictionary is determined by calculating the energy distribution interval of the original signal. The method reduces the dimension of the required dictionary, improves the coding rate, and works well in use.

Description

An audio matching pursuit device based on dictionary classification and a pursuit method thereof
Technical field
The present invention relates to the field of audio coding, and specifically to an audio matching pursuit device based on dictionary classification.
Background
Sparse representation generally means representing an original signal accurately with as few basis functions as possible, so as to capture the main features of the signal and thereby fundamentally reduce the cost of signal processing. Matching pursuit (MP) is one of the more widely used sparse representation algorithms. Its basic idea is to select the best atom from an overcomplete dictionary at each iteration, so that the approximation of the signal improves step by step. Because the overcomplete dictionary that MP uses to represent the signal can be chosen flexibly and adaptively according to the characteristics of the signal itself, and because atom selection follows a greedy, iteratively approximating strategy that keeps the final number of atoms small, MP is widely used in many areas of signal analysis, such as image processing, biomedical signal processing and audio processing.
As requirements on streaming media quality rise and the number of mobile terminal users keeps growing, the requirements on audio and video coding efficiency are also increasing. The traditional matching pursuit algorithm has high computational complexity and is not suitable for real-time processing. Various fast matching pursuit algorithms have been proposed, but most of them either target only run-time optimization or gain speed at the expense of sparse representation efficiency, and their computation speed still struggles to meet the needs of large-scale problems.
Summary of the invention
The object of the present invention is to provide an audio matching pursuit device based on dictionary classification, so as to solve the problems raised in the background above.
To achieve this object, the present invention provides the following technical solution:
An audio matching pursuit device based on dictionary classification comprises a signal decomposition unit and a signal reconstruction unit. The signal decomposition unit comprises a dictionary building module, a signal classification module, a weight comparison module, a residual calculation module and a threshold control module; the signal reconstruction unit comprises a reconstruction coefficient extraction module and a signal synthesis module.
The pursuit method of the audio matching pursuit device based on dictionary classification comprises a signal decomposition method and a signal reconstruction method. The signal decomposition method comprises the following steps:
Step 1: select the corresponding sparse dictionaries according to the different types of signal to be processed;
Step 2: determine the type of the signal to be processed and select the sparse dictionary suited to that type; that is, perform classification preprocessing on the original signal by calculating its degree of match with the sparse dictionaries built in Step 1;
Step 3: denote the sparse dictionary obtained in Step 2 as D; to find the weight coefficient of each atom in the dictionary on the signal to be processed, take the inner product of each atom in the sparse dictionary with the signal in turn, and find the maximum absolute value of the inner product;
Step 4: Step 3 yields the component of the signal to be processed along the maximal atom in the dictionary; the signal residual after this iteration is the vector difference between the signal and that component, and the updated sparse coefficient is recorded in the sparse coefficient vector;
Step 5: the matching pursuit algorithm processes the signal by cumulative iteration, expressing the original signal as the sum of a superposition of weights multiplied by their corresponding atoms and a residual; iteration can be terminated once the signal residual falls to a certain value, which is determined jointly by the number of iterations and the signal-to-noise ratio.
The signal reconstruction method comprises the following steps:
Step 1: extract from the sparse coefficient bitstream the atom weights, the atom labels and the dictionary type label needed to reconstruct the signal;
Step 2: determine from the dictionary type label which type of sparse dictionary was used during encoding, multiply each atom weight obtained in Step 1 by its corresponding atom in that sparse dictionary, and accumulate the products in turn; the resulting output signal is the final fit of the original signal produced by this matching pursuit algorithm.
As a further aspect of the invention: in Step 2 of the signal decomposition method, the degree of match between the signal to be processed and the sparse dictionaries built in Step 1 is calculated as follows. Compute the frequency-domain values corresponding to the signal to be processed; normalize the time-domain values and the frequency-domain values separately and divide each into j segments of length a (a ≤ N/2); then calculate the energy value of each segment, where the energy of a consecutive samples is approximated by the sum of their amplitudes. Finally, calculate twice the sum of the energies separately for the time domain and the frequency domain, and compare the two.
As a further aspect of the invention: in Step 3 of the signal decomposition method, the maximum absolute inner product is found as $i_{opt} = \arg\max_{i \in [1, M]} |\langle S, g_i \rangle|$, where $i_{opt} \in [1, M]$ is the label of the atom in the dictionary, $\langle S, g_i \rangle$ is the inner product of each atom with S, and $\langle S, g_{i_{opt}} \rangle$ is the maximum weight coefficient of the atoms in the dictionary on S.
As a further aspect of the invention: the simplified signal-to-noise ratio model used in Step 5 of the signal decomposition method is the logarithm of the ratio between the summed absolute amplitude of the original signal and that of the residual signal, $\mathrm{SNR} \propto \log\left(\sum |S| \,/\, \sum |S - S'|\right)$, where S is the original signal amplitude before sparse representation and S′ is the signal amplitude recovered from this sparse representation.
Compared with the prior art, the beneficial effects of the invention are as follows. By classifying signals and running the MP algorithm with a different sparse dictionary for each signal type, the invention reduces the number of irrelevant traversals and lowers computational complexity. During classification preprocessing, the suitable sparse dictionary is determined by calculating the energy distribution interval of the original signal. The method reduces the dimension of the required dictionary, improves the coding rate, and works well in use.
Brief description of the drawings
Fig. 1 is a flow chart of the signal decomposition method in the audio matching pursuit device based on dictionary classification.
Fig. 2 is a flow chart of the signal reconstruction method in the audio matching pursuit device based on dictionary classification.
Fig. 3 is a structural diagram of the signal decomposition unit in the audio matching pursuit device based on dictionary classification.
Fig. 4 is a structural diagram of the signal reconstruction unit in the audio matching pursuit device based on dictionary classification.
In the figures: 101 - dictionary building module, 102 - signal classification module, 103 - weight comparison module, 104 - residual calculation module, 105 - threshold control module, 201 - reconstruction coefficient extraction module, 202 - signal synthesis module.
Detailed description
The technical solution of this patent is described in more detail below with reference to specific embodiments.
Referring to Figs. 1-4, an audio matching pursuit device based on dictionary classification comprises a signal decomposition unit and a signal reconstruction unit. The signal decomposition unit comprises a dictionary building module 101, a signal classification module 102, a weight comparison module 103, a residual calculation module 104 and a threshold control module 105; the signal reconstruction unit comprises a reconstruction coefficient extraction module 201 and a signal synthesis module 202.
The dictionary building module 101 builds sparse dictionaries suited to different signal types. For example, for a speech processing system, a dictionary with speech characteristics is selected; for a transient signal processing system, a dictionary suited to relatively transient signals is selected; and for systems whose signal features are not distinct, or which need to process several types of signal at once, a more general-purpose dictionary is selected.
The signal classification module 102 determines the type of the signal to be processed and selects the sparse dictionary suited to that type: it performs classification preprocessing on the original signal by calculating its degree of match with the sparse dictionaries built by the dictionary building module. The calculation proceeds as follows. First, the frequency-domain values corresponding to the signal are computed; the time-domain values and the frequency-domain values are each normalized and divided into a number of fixed-length segments, and the energy of each segment is calculated. Because the sum of squared amplitudes is roughly proportional to the sum of amplitudes, while the sum of amplitudes is far cheaper to compute, the energy is approximated by the sum of amplitudes. Next, the sums of the time-domain and frequency-domain energies are calculated separately and compared; the larger the energy sum, the more concentrated the signal's energy distribution. If the time-domain energy sum exceeds the frequency-domain energy sum, the Gabor dictionary, whose energy is more concentrated in the time domain, is selected; otherwise the cosine dictionary, whose energy is more concentrated in the frequency domain, is selected. The sparse dictionary type is recorded.
The weight comparison module 103 finds the weight coefficient of each atom on the signal to be processed: taking the sparse dictionary selected by the signal classification module 102, it computes the inner product of each atom in the dictionary with the signal in turn and finds the maximum absolute value of the inner product.
The residual calculation module 104 calculates the residual between the signal to be processed and the atom-direction component obtained by the weight comparison module. The atom-direction component equals the weight of largest absolute value multiplied by its atom, and the residual equals the vector difference between the signal and that component. At the same time, the updated sparse coefficient is recorded in the sparse coefficient vector: the atom label gives the position of the coefficient in the vector, and the value at that position is the atom weight.
The threshold control module 105 uses a signal-to-noise ratio (SNR) threshold together with the iteration count to control the precision of the matching loop. The matching pursuit algorithm processes the signal by cumulative iteration, expressing the original signal as the sum of a superposition of weights multiplied by their corresponding atoms and a residual. Iteration can be terminated once the signal residual falls to a certain value, which is determined jointly by the iteration count and the SNR: the loop ends when the target SNR or the preset number of iterations is reached, and the sparse coefficient vector is output; otherwise the sequence from the preprocessing module through the residual calculation module is repeated until the termination condition is met. Because the sum of squared amplitudes is roughly proportional to the sum of amplitudes, while the sum of amplitudes is far cheaper to compute, the SNR can be approximated as the logarithm of the ratio of the absolute amplitudes of the original signal and the residual signal.
The reconstruction coefficient extraction module 201 extracts from the sparse coefficient bitstream the atom weights (sparse coefficients), the atom labels (the positions of the coefficients in the sparse coefficient vector) and the dictionary type label needed to reconstruct the signal.
The signal synthesis module 202 synthesizes the signal: it uses the dictionary type label obtained by the reconstruction coefficient extraction module 201 to determine the sparse dictionary type used during encoding, multiplies each atom weight by its corresponding atom in that dictionary, and accumulates the products in turn. The output signal is the final fit of the original signal produced by this matching pursuit algorithm.
The pursuit method of the audio matching pursuit device based on dictionary classification comprises a signal decomposition method and a signal reconstruction method. The signal decomposition method comprises the following steps:
Step 1: select the corresponding sparse dictionaries according to the different types of signal to be processed. For example, for a speech processing system, a dictionary with speech characteristics is selected; for a transient signal processing system, a dictionary suited to relatively transient signals is selected; and for systems whose signal features are not distinct, or which need to process several types of signal at once, a more general-purpose dictionary is selected. Commonly used dictionaries are the Gabor dictionary, whose energy is more concentrated in the time domain, and the cosine dictionary, whose energy is more concentrated in the frequency domain. The atoms of the Gabor dictionary are defined in terms of the following parameters: w denotes the frequency scale, μ the time offset, σ the time scale and λ the kernel energy. Assuming the time scale ranges from 1 to N, where N is the length of the signal to be processed, and the frequency scale w, time scale σ and kernel energy λ have $M_1$ combinations, the dictionary size is $M_1 \times N$. The atoms of the cosine dictionary are defined in terms of the following parameters: m denotes the analysis window index, u the frequency contraction coefficient, k the frequency scale and $L_0$ the initial step size, that is, half the length of the first analysis window. Assuming the frequency scale ranges from 1 to N, and the analysis window index m and frequency contraction coefficient u have $M_2$ combinations in total, the dictionary size is $M_2 \times N$.
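Step 1 amounts to preparing one table of unit-norm atoms per signal class. The atom formulas themselves are not available in this text, so the following minimal sketch uses a plain DCT-II style cosine family as a stand-in; the function name `cosine_dictionary`, its parameters and the frequency spacing are illustrative assumptions, not the patent's definitions.

```python
import math

def cosine_dictionary(n, m):
    """Return m unit-norm cosine atoms of length n.

    Stand-in family (assumption): atom k is cos(pi * (t + 0.5) * k / m),
    which is the DCT-II basis when m == n and overcomplete when m > n.
    """
    atoms = []
    for k in range(m):
        atom = [math.cos(math.pi * (t + 0.5) * k / m) for t in range(n)]
        norm = math.sqrt(sum(v * v for v in atom))
        # Normalize so that an inner product with the signal is directly
        # usable as the atom weight in matching pursuit.
        atoms.append([v / norm for v in atom])
    return atoms

# An overcomplete dictionary: 16 atoms for signals of length 8.
demo_dictionary = cosine_dictionary(8, 16)
```

Stored this way, the dictionary is an M × N table (M atoms of length N), matching the sizes quoted in the text.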
Step 2: determine the type of the signal to be processed and select the sparse dictionary suited to that type; that is, perform classification preprocessing on the original signal by calculating its degree of match with the sparse dictionaries built in Step 1. The degree of match is calculated as follows. Compute the frequency-domain values corresponding to the signal to be processed; normalize the time-domain values and the frequency-domain values separately and divide each into j segments of length a (a ≤ N/2); then calculate the energy value of each segment. Because the sum of squared amplitudes is roughly proportional to the sum of amplitudes, while the sum of amplitudes is far cheaper to compute, the energy of a consecutive samples is approximated by the sum of their amplitudes. Twice the sum of the energies is then calculated separately for the time domain and for the frequency domain, and the two are compared; the larger the energy sum, the more concentrated the signal's energy distribution. If the time-domain energy sum exceeds the frequency-domain energy sum, that is, $\mathrm{Energy}_{t,j} > \mathrm{Energy}_{f,j}$, the Gabor dictionary, whose energy is concentrated in the time domain, is selected; otherwise the cosine dictionary, whose energy is concentrated in the frequency domain, is selected. The sparse dictionary type is recorded.
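The classification in Step 2 can be sketched as follows. The text does not fully specify the normalization or exactly which sums are compared, so this sketch makes two loudly labeled assumptions: each domain is normalized to unit total amplitude, and concentration is scored by the sum of squared segment energies (a plain sum of normalized energies would be identical in both domains). All function names are hypothetical.

```python
import cmath
import math

def dft_magnitudes(x):
    """Magnitude spectrum via a direct O(N^2) DFT (adequate for a short demo)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
            for k in range(n)]

def segment_energies(values, a):
    """Approximate per-segment energy by the sum of |amplitude| over length-a segments.

    Assumption: values are first normalized to unit total amplitude.
    """
    total = sum(abs(v) for v in values) or 1.0
    return [sum(abs(v) for v in values[s:s + a]) / total
            for s in range(0, len(values), a)]

def concentration(values, a):
    # Assumption: sum of squared segment energies as the concentration score.
    return sum(e * e for e in segment_energies(values, a))

def choose_dictionary(signal, a):
    """Pick 'gabor' when energy is more concentrated in time, else 'cosine'."""
    time_score = concentration(signal, a)
    freq_score = concentration(dft_magnitudes(signal), a)
    return "gabor" if time_score > freq_score else "cosine"

# A click is concentrated in time; a steady tone is concentrated in frequency.
impulse = [0.0] * 16
impulse[5] = 1.0
tone = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
impulse_choice = choose_dictionary(impulse, 4)
tone_choice = choose_dictionary(tone, 4)
```

The segment length a = 4 satisfies the a ≤ N/2 constraint for N = 16.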
Step 3: denote the sparse dictionary obtained in Step 2 as D. To find the weight coefficient of each atom in the dictionary on the signal to be processed, take the inner product of each atom in the sparse dictionary with the signal in turn, and find the maximum absolute value of the inner product: $i_{opt} = \arg\max_{i \in [1, M]} |\langle S, g_i \rangle|$, where $i_{opt} \in [1, M]$ is the label of the atom in the dictionary, $\langle S, g_i \rangle$ is the inner product of each atom with S, and $\langle S, g_{i_{opt}} \rangle$ is the maximum weight coefficient of the atoms in the dictionary on S.
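The atom search in Step 3 can be illustrated with a short sketch. It assumes unit-norm atoms stored as plain lists and uses 0-based indices, whereas the text numbers atoms from 1 to M; the helper names `inner` and `best_atom` are hypothetical.

```python
def inner(u, v):
    """Inner product of two equal-length real vectors."""
    return sum(a * b for a, b in zip(u, v))

def best_atom(signal, dictionary):
    """Return (index, weight) of the atom whose inner product with the
    signal has the largest absolute value."""
    products = [inner(signal, atom) for atom in dictionary]
    i_opt = max(range(len(products)), key=lambda i: abs(products[i]))
    return i_opt, products[i_opt]

# Toy dictionary of three unit-norm atoms (assumption: real dictionaries
# would be the Gabor or cosine tables selected in Step 2).
toy_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
toy_signal = [0.5, -2.0, 1.0]
i_opt, weight = best_atom(toy_signal, toy_dictionary)
```

The weight keeps its sign; only the selection uses the absolute value.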
Step 4: Step 3 yields the component of the signal to be processed along the maximal atom in the dictionary; the signal residual after this iteration is the vector difference between the signal and that component, and the updated sparse coefficient is recorded in the sparse coefficient vector. The component of S along the maximal atom in the dictionary is $\langle S, g_{i_{opt}} \rangle\, g_{i_{opt}}$, so the signal residual after this iteration is $S'_{later} = S - \langle S, g_{i_{opt}} \rangle\, g_{i_{opt}}$.
At the same time the sparse coefficient is recorded and updated: in the sparse coefficient vector, the atom label $i_{opt}$ gives the position of the coefficient, and the value at that position is the atom weight $\langle S, g_{i_{opt}} \rangle$.
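A single Step 4 update can be sketched as below. It assumes unit-norm atoms, so the component along the chosen atom is simply weight times atom; adding to an existing coefficient when an atom is selected again is also an assumption, since the text only says the coefficient is updated. The name `mp_step` is hypothetical.

```python
def mp_step(residual, dictionary, coeffs):
    """Perform one matching pursuit update and return the new residual.

    coeffs is the sparse coefficient vector: position = atom label,
    value = atom weight. It is modified in place.
    """
    products = [sum(r * g for r, g in zip(residual, atom)) for atom in dictionary]
    i_opt = max(range(len(products)), key=lambda i: abs(products[i]))
    weight = products[i_opt]
    # Residual = signal minus its component along the selected atom.
    new_residual = [r - weight * g for r, g in zip(residual, dictionary[i_opt])]
    # Assumption: accumulate if the same atom is picked more than once.
    coeffs[i_opt] += weight
    return new_residual

step_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
step_coeffs = [0.0, 0.0, 0.0]
step_residual = mp_step([0.5, -2.0, 1.0], step_dictionary, step_coeffs)
```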
Step 5: the matching pursuit algorithm processes the signal by cumulative iteration, expressing the original signal as the sum of a superposition of weights multiplied by their corresponding atoms and a residual. The signal residual is $S'_{later}$; iteration can be terminated once $S'_{later}$ falls to a certain value, which is determined jointly by the iteration count and the signal-to-noise ratio (SNR). The loop ends when the target SNR or the preset number of iterations is reached, and the sparse coefficient vector and the sparse dictionary type label obtained in Step 1 are output; otherwise Steps 2 to 5 are repeated until the termination condition is met. The signal-to-noise ratio is defined in terms of sums of squared amplitudes. Because the sum of squared amplitudes is roughly proportional to the sum of amplitudes, while the sum of amplitudes is far cheaper to compute, the simplified SNR model replaces them with sums of absolute amplitudes: $\mathrm{SNR} \propto \log\left(\sum |S| \,/\, \sum |S - S'|\right)$, where S is the original signal amplitude before sparse representation and S′ is the signal amplitude recovered from this sparse representation.
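The stopping rule of Step 5 can be sketched as a loop that ends on either a target SNR or an iteration cap. This is a minimal sketch under stated assumptions: atoms are unit-norm, the dictionary is fixed for the whole loop, and the SNR approximation uses 20·log10 of the amplitude-sum ratio (the scale factor and log base are assumptions, since the text only says "logarithm of the ratio"). Function names are hypothetical.

```python
import math

def approx_snr_db(signal, residual):
    """Amplitude-sum SNR approximation; factor 20 and base 10 are assumptions."""
    noise = sum(abs(r) for r in residual)
    if noise == 0.0:
        return math.inf  # perfect fit
    return 20.0 * math.log10(sum(abs(s) for s in signal) / noise)

def matching_pursuit(signal, dictionary, target_snr_db, max_iterations):
    """Return (sparse coefficient vector, final residual, iterations used)."""
    residual = list(signal)
    coeffs = [0.0] * len(dictionary)
    iterations = 0
    while iterations < max_iterations:
        products = [sum(r * g for r, g in zip(residual, atom)) for atom in dictionary]
        i_opt = max(range(len(products)), key=lambda i: abs(products[i]))
        weight = products[i_opt]
        residual = [r - weight * g for r, g in zip(residual, dictionary[i_opt])]
        coeffs[i_opt] += weight
        iterations += 1
        # Stop early once the approximation is good enough.
        if approx_snr_db(signal, residual) >= target_snr_db:
            break
    return coeffs, residual, iterations

loop_dictionary = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
full_run = matching_pursuit([0.5, -2.0, 1.0], loop_dictionary, 40.0, 10)
capped_run = matching_pursuit([0.5, -2.0, 1.0], loop_dictionary, 40.0, 1)
```

In `full_run` the SNR target ends the loop; in `capped_run` the iteration cap does, which shows the two criteria acting jointly.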
The signal reconstruction method comprises the following steps:
Step 1: extract from the sparse coefficient bitstream the atom weights, the atom labels and the dictionary type label needed to reconstruct the signal.
Step 2: determine from the dictionary type label which type of sparse dictionary was used during encoding, multiply each atom weight $\alpha_i$ obtained in Step 1 by its corresponding atom $g_i$ in that sparse dictionary, and accumulate the products in turn. The result $S_{out}$ is the final fit of the original signal S produced by this matching pursuit algorithm: $S_{out} = \sum_{i=1}^{k} \alpha_i\, g_i$,
where k is the number of sparse coefficients.
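The two reconstruction steps can be sketched as a lookup followed by a weighted sum. The bitstream layout is not described in the text, so the dictionary keys `dictionary_type`, `labels` and `weights` below are hypothetical, as are the function name and the toy dictionaries.

```python
def reconstruct(stream, dictionaries):
    """Rebuild the fitted signal from decoded sparse coefficients.

    stream: decoded fields (hypothetical layout) holding the dictionary
    type label, the atom labels and the matching atom weights.
    dictionaries: mapping from type label to a list of atoms.
    """
    dictionary = dictionaries[stream["dictionary_type"]]
    output = [0.0] * len(dictionary[0])
    for label, weight in zip(stream["labels"], stream["weights"]):
        atom = dictionary[label]
        # Accumulate weight * atom, one selected atom at a time.
        for t in range(len(output)):
            output[t] += weight * atom[t]
    return output

# Toy stand-ins for the two dictionary types (assumption).
toy_dictionaries = {
    "cosine": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
    "gabor": [[0.0, 0.0, 1.0], [0.0, 1.0, 0.0], [1.0, 0.0, 0.0]],
}
toy_stream = {"dictionary_type": "cosine", "labels": [1, 2, 0], "weights": [-2.0, 1.0, 0.5]}
fitted = reconstruct(toy_stream, toy_dictionaries)
```

Only the selected labels and weights are needed at the decoder, which is why the dictionary type label has to travel with them.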
It will be evident to those skilled in the art that the invention is not limited to the details of the exemplary embodiments above, and that it can be realized in other specific forms without departing from its spirit or essential attributes. The embodiments should therefore be regarded in all respects as exemplary and non-restrictive; the scope of the invention is defined by the appended claims rather than by the description above, and all changes falling within the meaning and range of equivalency of the claims are intended to be embraced by the invention. No reference sign in the claims should be construed as limiting the claim concerned.
Furthermore, although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution. The specification is written this way only for clarity; those skilled in the art should take the specification as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.

Claims (5)

1. An audio matching pursuit device based on dictionary classification, characterized by comprising a signal decomposition unit and a signal reconstruction unit, wherein the signal decomposition unit comprises a dictionary building module, a signal classification module, a weight comparison module, a residual calculation module and a threshold control module, and the signal reconstruction unit comprises a reconstruction coefficient extraction module and a signal synthesis module.
2. A pursuit method of the audio matching pursuit device based on dictionary classification according to claim 1, characterized by comprising a signal decomposition method and a signal reconstruction method; the signal decomposition method comprises the following steps:
Step 1: select the corresponding sparse dictionaries according to the different types of signal to be processed;
Step 2: determine the type of the signal to be processed and select the sparse dictionary suited to that type; that is, perform classification preprocessing on the original signal by calculating its degree of match with the sparse dictionaries built in Step 1;
Step 3: denote the sparse dictionary obtained in Step 2 as D; to find the weight coefficient of each atom in the dictionary on the signal to be processed, take the inner product of each atom in the sparse dictionary with the signal in turn, and find the maximum absolute value of the inner product;
Step 4: Step 3 yields the component of the signal to be processed along the maximal atom in the dictionary; the signal residual after this iteration is the vector difference between the signal and that component, and the updated sparse coefficient is recorded in the sparse coefficient vector;
Step 5: the matching pursuit algorithm processes the signal by cumulative iteration, expressing the original signal as the sum of a superposition of weights multiplied by their corresponding atoms and a residual; iteration can be terminated once the signal residual falls to a certain value, which is determined jointly by the number of iterations and the signal-to-noise ratio;
the signal reconstruction method comprises the following steps:
Step 1: extract from the sparse coefficient bitstream the atom weights, the atom labels and the dictionary type label needed to reconstruct the signal;
Step 2: determine from the dictionary type label which type of sparse dictionary was used during encoding, multiply each atom weight obtained in Step 1 by its corresponding atom in that sparse dictionary, and accumulate the products in turn; the resulting output signal is the final fit of the original signal produced by this matching pursuit algorithm.
3. The pursuit method of the audio matching pursuit device based on dictionary classification according to claim 2, characterized in that, in Step 2 of the signal decomposition method, the degree of match between the signal to be processed and the sparse dictionaries built in Step 1 is calculated as follows: compute the frequency-domain values corresponding to the signal to be processed; normalize the time-domain values and the frequency-domain values separately and divide each into j segments of length a (a ≤ N/2); then calculate the energy value of each segment, where the energy of a consecutive samples is approximated by the sum of their amplitudes; and calculate twice the sum of the energies separately for the time domain and the frequency domain, and compare the two.
4. The pursuit method of the audio matching pursuit device based on dictionary classification according to claim 2, characterized in that the maximum absolute inner product in Step 3 of the signal decomposition method is found as $i_{opt} = \arg\max_{i \in [1, M]} |\langle S, g_i \rangle|$, where $i_{opt} \in [1, M]$ is the label of the atom in the dictionary, $\langle S, g_i \rangle$ is the inner product of each atom with S, and $\langle S, g_{i_{opt}} \rangle$ is the maximum weight coefficient of the atoms in the dictionary on S.
5. The pursuit method of the audio matching pursuit device based on dictionary classification according to claim 2, characterized in that the simplified signal-to-noise ratio model in Step 5 of the signal decomposition method is the logarithm of the ratio between the summed absolute amplitude of the original signal and that of the residual signal, $\mathrm{SNR} \propto \log\left(\sum |S| \,/\, \sum |S - S'|\right)$, where S is the original signal amplitude before sparse representation and S′ is the signal amplitude recovered from this sparse representation.
CN201610967738.4A 2016-11-01 2016-11-01 Audio matching tracking device and tracking method thereof based on dictionary classification Pending CN106653061A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610967738.4A CN106653061A (en) 2016-11-01 2016-11-01 Audio matching tracking device and tracking method thereof based on dictionary classification

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610967738.4A CN106653061A (en) 2016-11-01 2016-11-01 Audio matching tracking device and tracking method thereof based on dictionary classification

Publications (1)

Publication Number Publication Date
CN106653061A true CN106653061A (en) 2017-05-10

Family

ID=58820673

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610967738.4A Pending CN106653061A (en) 2016-11-01 2016-11-01 Audio matching tracking device and tracking method thereof based on dictionary classification

Country Status (1)

Country Link
CN (1) CN106653061A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403618A (en) * 2017-07-21 2017-11-28 山东师范大学 Based on the audio event sorting technique and computer equipment for stacking base rarefaction representation
CN112075932A (en) * 2020-10-15 2020-12-15 中国医学科学院生物医学工程研究所 High-resolution time-frequency analysis method for evoked potential signals
CN112466315A (en) * 2020-12-02 2021-03-09 公安部第三研究所 High code rate obtaining method for audio and video

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103280221A (en) * 2013-05-09 2013-09-04 北京大学 Audio frequency lossless compression coding and decoding method and system based on basis pursuit
CN103473451A (en) * 2013-09-05 2013-12-25 江苏大学 Ultrasonic professional dictionary construction and use method
CN103778919A (en) * 2014-01-21 2014-05-07 南京邮电大学 Speech coding method based on compressed sensing and sparse representation
CN105551503A (en) * 2015-12-24 2016-05-04 武汉大学 Audio matching tracking method based on atom pre-selection and system thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
MALLAT S. G.: ""Matching Pursuits with Time-Frequency Dictionaries"", 《IEEE TRANSACTIONS ON SIGNAL PROCESSING》 *
周忠根: ""Gabor和Chirplet字典中的子空间匹配追踪算法对比"", 《昆明理工大学学报(理工版)》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107403618A (en) * 2017-07-21 2017-11-28 山东师范大学 Based on the audio event sorting technique and computer equipment for stacking base rarefaction representation
CN107403618B (en) * 2017-07-21 2020-05-05 山东师范大学 Audio event classification method based on stacking base sparse representation and computer equipment
CN112075932A (en) * 2020-10-15 2020-12-15 中国医学科学院生物医学工程研究所 High-resolution time-frequency analysis method for evoked potential signals
CN112075932B (en) * 2020-10-15 2023-12-05 中国医学科学院生物医学工程研究所 High-resolution time-frequency analysis method for evoked potential signals
CN112466315A (en) * 2020-12-02 2021-03-09 公安部第三研究所 High code rate obtaining method for audio and video

Similar Documents

Publication Publication Date Title
CN109326283B (en) Many-to-many voice conversion method based on text encoder under non-parallel text condition
CN110600047B (en) Perceptual STARGAN-based multi-to-multi speaker conversion method
CN109671442B (en) Many-to-many speaker conversion method based on STARGAN and x vectors
CN109599091B (en) Star-WAN-GP and x-vector based many-to-many speaker conversion method
CN110060690B (en) Many-to-many speaker conversion method based on STARGAN and ResNet
Ren et al. Portaspeech: Portable and high-quality generative text-to-speech
CN109410917B (en) Voice data classification method based on improved capsule network
CN102800316B (en) Optimal codebook design method for voiceprint recognition system based on nerve network
CN110060701B (en) Many-to-many voice conversion method based on VAWGAN-AC
CN111816156A (en) Many-to-many voice conversion method and system based on speaker style feature modeling
CN110047501B (en) Many-to-many voice conversion method based on beta-VAE
CN103531205A (en) Asymmetrical voice conversion method based on deep neural network feature mapping
CN110060657B (en) SN-based many-to-many speaker conversion method
CN109584893B (en) VAE and i-vector based many-to-many voice conversion system under non-parallel text condition
CN111429894A (en) Many-to-many speaker conversion method based on SE-ResNet STARGAN
Cui et al. Emovie: A mandarin emotion speech dataset with a simple emotional text-to-speech model
CN106653061A (en) Audio matching tracking device and tracking method thereof based on dictionary classification
Han et al. Self-supervised learning with cluster-aware-dino for high-performance robust speaker verification
Huang et al. Speech emotion recognition using autoencoder bottleneck features and LSTM
CN116939320A (en) Method for generating multimode mutually-friendly enhanced video semantic communication
Huang et al. Speech emotion recognition using convolutional neural network with audio word-based embedding
CN110600046A (en) Many-to-many speaker conversion method based on improved STARGAN and x vectors
CN113362804A (en) Method, device, terminal and storage medium for synthesizing voice
Zhang et al. Distance-based weight transfer for fine-tuning from near-field to far-field speaker verification
Zhao et al. Research on voice cloning with a few samples

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170510