CN107622774B - A music tempo spectrogram generation method based on matching pursuit - Google Patents

A music tempo spectrogram generation method based on matching pursuit

Info

Publication number: CN107622774B
Other versions: CN107622774A
Application number: CN201710675484.3A
Authority: CN (China)
Other languages: Chinese (zh)
Inventor: 桂文明
Current assignee: Jinling Institute of Technology
Original assignee: Jinling Institute of Technology
Legal status: Active (an assumption, not a legal conclusion)
Prior art keywords: music, tempo, atom, spectrogram, frequency
Landscapes: Auxiliary Devices For Music

Abstract

The present invention provides a music tempo spectrogram generation method based on matching pursuit, in the field of content-based music information retrieval. The method comprises the following steps: input a music signal, generate the note onset detection function o(n), and divide it into frames; take the common music tempo range and convert it into a set of frequencies; for each frequency in the set, create a corresponding parent atom; shift each parent atom, generating a new atom with every shift; assemble all parent atoms and new atoms into a redundant dictionary; using this dictionary, run matching pursuit on each frame of o(n) to obtain the decomposition coefficient of each tempo, and finally generate the music tempo spectrogram of the music. The music tempo spectrogram generated by the invention has high resolution and strong sparsity, and the tempo resolution, the shift granularity of the parent atoms, and the number of matching pursuit iterations can be set flexibly as required, so that tempo spectrograms of different resolutions and sparsities can be generated.

Description

A music tempo spectrogram generation method based on matching pursuit
Technical field
The present invention relates to the field of content-based music information retrieval, and in particular to a music tempo spectrogram generation method based on matching pursuit.
Background
I. Concepts and application fields related to the invention
The speed at which music proceeds is its tempo. In contemporary music, tempo is usually measured in beats per minute (bpm). For example, a tempo marking of ♩ = 120 indicates that the music proceeds at 120 quarter notes per minute, so each quarter note lasts 0.5 seconds. The larger the bpm value, the faster the tempo.
Tempo is closely related to the beat and rhythm of music and is one of its important features. In music information retrieval, tempo estimation means estimating the tempo of a piece from its content, starting from a file that contains the music waveform, such as mp3 or wav. Tempo estimation is a challenging and important problem in its own right, and it is also groundwork for research directions such as beat perception, rhythm recognition, genre recognition, and music structure analysis. For example, beat perception generally first estimates the tempo and then infers the beat type and beat structure from it; in genre recognition, rhythm and tempo can serve as salient features for identifying the genre.
Tempo changes continually as music proceeds. One cause is that the composition itself calls for changes; such changes are usually infrequent within a piece, and many pieces have none. The other cause is error introduced in playing or singing; such changes are hard to avoid and are generally present throughout a piece. Estimating tempo therefore actually requires estimating a tempo value at each time point. Because of phenomena such as ties and rests, tempo is ambiguous and hard to determine; tempo also carries error, so the tempo at each time point can be regarded as a vector of multiple tempo components. The tempo at each time point of a piece can be described by a music tempo spectrogram (tempogram). Applications such as beat tracking, rhythm recognition, and genre recognition can extract useful information from the music tempo spectrogram.
II. Existing techniques and process for generating the music tempo spectrogram
Generating a music tempo spectrogram is generally divided into two stages. The first stage generates the note onset detection function; a note onset is the moment at which a note in the music starts to be played or sung. Some literature, such as [1], calls this stage novelty curve generation. The second stage generates the spectrogram.
The first stage mainly comprises signal transformation, feature extraction, and generation of the onset detection function. The purpose of signal transformation is to convert the music waveform from one-dimensional high-rate data into a lower-rate representation. The signal is usually divided into frames first, and each frame is then transformed, using methods such as the short-time Fourier transform (STFT) or the wavelet transform (WT). Feature extraction derives time-domain, frequency-domain, and time-frequency features from the representation produced by the transformation. A typical time-domain feature is the amplitude envelope; frequency-domain features include spectral flux and frequency-domain energy; time-frequency features are mainly based on the wavelet transform or on Cohen-class time-frequency distributions. The onset detection function is generated by computing, from the features extracted for each frame, how the features change between consecutive frames; note onsets generally occur where the positive change between consecutive frames increases suddenly. A typical procedure for generating the onset detection function is given in [1].
The second stage extracts periodicity from the values of the onset detection function produced in the first stage to form the music tempo spectrogram. There are currently two main methods for this stage: the autocorrelation function (ACF) method and the Fourier transform (FT) method [1].
The ACF method windows the note onset detection function and computes its autocorrelation, extracts the periodicity of the note onsets as a function of lag, and converts lag into tempo to form the music tempo spectrogram. Its formula is [1]:
$$A(t, l) = \frac{\sum_{n \in \mathbb{Z}} o(n)\, o(n+l)\, W(n-t)}{2N+1-l} \qquad (1.1)$$
where t and n are discrete time, l = 1...N is the lag, o(n) is the note onset detection function, and W(n) is a rectangular window centered at t = 0 with support [-N, N]. If $f_s$ is the sampling frequency of o(n), then lag l corresponds to a period of $l/f_s$, a frequency of $f_s/l$, and a music tempo of $\tau = 60 f_s / l$.
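As a rough illustration of formula (1.1), the following Python sketch evaluates the windowed autocorrelation for one time index and converts a lag to a tempo. The names `acf_frame` and `lag_to_tempo` are hypothetical, and skipping samples that fall outside the signal is an assumption; the source does not say how boundaries are handled.

```python
import numpy as np

def acf_frame(o, t, N):
    """Windowed autocorrelation A(t, l) for lags l = 1..N, after formula (1.1)."""
    o = np.asarray(o, dtype=float)
    A = np.zeros(N)
    for l in range(1, N + 1):
        acc = 0.0
        # The rectangular window W(n - t) equals 1 for n in [t - N, t + N].
        for n in range(t - N, t + N + 1):
            # Assumption: terms that would read outside the signal are skipped.
            if n >= 0 and n + l < len(o):
                acc += o[n] * o[n + l]
        A[l - 1] = acc / (2 * N + 1 - l)
    return A

def lag_to_tempo(l, fs):
    """Tempo in bpm for lag l (samples) at sampling frequency fs: tau = 60 * fs / l."""
    return 60.0 * fs / l

# Toy onset function: one impulse every 10 samples.
o = np.zeros(200)
o[::10] = 1.0
A = acf_frame(o, t=100, N=40)
```

In this toy case only lags that are multiples of 10 samples receive nonzero values, and each such lag maps to a tempo through `lag_to_tempo`.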
The FT method applies a windowed Fourier transform to the note onset detection function to obtain its frequency-domain characteristics, and converts frequency into tempo to form the music tempo spectrogram. Its formula is:
$$F(t, \omega) = \sum_{n \in \mathbb{Z}} o(n)\, W(n-t)\, e^{-2\pi i \omega n} \qquad (1.2)$$
where t and n are discrete time, ω is frequency, o(n) is the note onset detection function, and W(n) is a Hanning window centered at t = 0 with support [-N, N]. There are currently two ways to determine ω. One follows the discrete Fourier transform (DFT) approach of [2]: ω > 0 is discretized into N frequency points spaced $f_s/N$ Hz apart. The other follows the approach of [1]: take ω = τ/60 Hz, with τ ∈ [30, 480] bpm the common music tempo range, and compute the coefficient for each ω at each time point using the formula above.
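The second way of choosing ω can be sketched as follows: formula (1.2) is evaluated directly at the frequencies ω = τ/60 for a grid of tempos. This is a minimal sketch, not the implementation of [1]; the function name `fourier_tempo_frame` is hypothetical, and dividing the sample index by the sampling frequency so that ω can be given in Hz is an assumption about units.

```python
import numpy as np

def fourier_tempo_frame(o, t, N, fs, tempi_bpm):
    """Magnitude of F(t, omega) from formula (1.2) at omega = tau / 60 Hz."""
    o = np.asarray(o, dtype=float)
    n = np.arange(t - N, t + N + 1)
    n = n[(n >= 0) & (n < len(o))]          # keep indices inside the signal
    # Hann window centered at t with support [t - N, t + N].
    w = 0.5 * (1.0 + np.cos(np.pi * (n - t) / N))
    omega = np.asarray(tempi_bpm, dtype=float) / 60.0   # bpm -> Hz
    # Assumption: n / fs converts sample index to seconds.
    kernel = np.exp(-2j * np.pi * np.outer(omega, n / fs))
    return np.abs(kernel @ (o[n] * w))

# Toy onset function oscillating at 2 Hz, i.e. 120 bpm.
fs = 50.0
sig = np.cos(2 * np.pi * 2.0 * np.arange(1000) / fs)
tempi = np.arange(30, 481)
mag = fourier_tempo_frame(sig, t=500, N=250, fs=fs, tempi_bpm=tempi)
```

The peak of `mag` lies at the 120 bpm grid point, but the neighboring grid points also carry energy because of the window's main lobe, which is the blurring discussed in the text.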
III. Deficiencies of the prior art
The contribution of the present invention lies in the second stage of music tempo spectrogram generation. To explain the deficiencies of the prior art, two concepts are introduced and discussed in turn: music tempo resolution and music tempo spectrogram sparsity. Music tempo resolution borrows the notion of frequency resolution: it is expressed as the spacing between two adjacent valid points of the tempo component in the music tempo spectrogram, and the larger the spacing, the poorer the tempo resolution. Music tempo spectrogram sparsity refers to the number of nonzero elements among all spectrogram coefficients: the fewer the nonzero elements, the stronger the sparsity and the better the discriminability.
1. The prior art cannot meet the tempo resolution requirement of common music tempos
Music tempo resolution is inversely related to the spacing of two adjacent valid points in the spectrogram: the larger the spacing, the poorer the resolution, and the smaller the spacing, the better.
Consider the ACF method. The difference between the tempo components at two consecutive lags is $\Delta\tau = 60 f_s / l - 60 f_s / (l+1) = 60 f_s / (l(l+1))$. The tempo spacing therefore shrinks as the lag grows: the larger the lag (the slower the tempo), the finer the tempo resolution, and the spacing becomes coarser as tempo increases. Following [1], take $f_s$ = 1/0.023 = 43.5. At l = 51 (τ = 51.2), Δτ = 0.98; for l < 51 the spacing exceeds 1 bpm; at l = 21 (τ = 124.2), Δτ = 5.6, which can no longer resolve common music tempos (τ ∈ [30, 480]), to say nothing of l < 21. For the maximum of Δτ (at l = 1) to be below 1, $f_s$ would have to be less than 1/30, that is, the frame interval in the first stage would have to exceed 30 seconds, so the note onset error would also exceed 30 seconds, which is clearly infeasible. For ACF, therefore, the tempo resolution is not constant, and once the tempo exceeds 51 bpm the spacing is coarser than 1 bpm, which does not meet the resolution needed for common music tempos.
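The figures quoted for the ACF method can be checked with a few lines of arithmetic. Since τ = 60 f_s / l, the gap between lags l and l + 1 is 60 f_s / l − 60 f_s / (l + 1) = 60 f_s / (l (l + 1)). The helper name `acf_tempo_step` below is hypothetical.

```python
def acf_tempo_step(l, fs):
    """Tempo gap in bpm between lags l and l + 1: 60*fs/l - 60*fs/(l + 1)."""
    return 60.0 * fs / (l * (l + 1))

fs = 43.5  # sampling frequency of o(n) used in the text (about 1 / 0.023)
for l in (51, 21, 1):
    tempo = 60.0 * fs / l
    print(f"l={l:2d}  tempo={tempo:7.1f} bpm  step={acf_tempo_step(l, fs):8.2f} bpm")
```

At l = 1 the step equals 30 f_s, which is why a step below 1 bpm at every lag would need f_s < 1/30.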
Consider the DFT method of [2]. The tempo spacing is $60 f_s / N$ bpm. With $f_s$ = 1/0.023 = 43.5, reaching a tempo resolution of 1 bpm requires N ≥ 60 $f_s$ = 2610, and a window of that length lasts 60 seconds. Most music is currently shorter than 300 seconds, so the required window length is incompatible with typical music length. Another way to improve tempo resolution is to reduce $f_s$, but this conflicts with the precision required of o(n) and is therefore infeasible.
Consider the FT method of [1]. After windowing the signal, this method computes the coefficients of the frequencies ω corresponding to common music tempos by evaluating the discrete-time Fourier transform (DTFT). In effect it only samples ω approximately at discrete points; the actual frequency resolution is not improved.
In summary, the prior art cannot meet the tempo resolution requirement of common music tempos; that is, some regions of the generated music tempo spectrogram will be blurred.
2. The sparsity of the music tempo spectrogram generated by the prior art is poor
The sparser the music tempo spectrogram, the better its discriminability. Put another way, strong sparsity means the spectrogram energy is concentrated, which gives good properties and good results in applications.
For the ACF method, when the tempo exceeds 51 bpm, component coefficients at normal precision (for example 1 bpm) must be computed by interpolation, which inevitably reduces sparsity. For the FT method, spectral leakage and the resolution problem likewise clearly reduce the sparsity of the frequency coefficients. The music tempo spectrogram generated by the prior art therefore has poor sparsity and poor energy concentration.
In summary, the music tempo spectrogram generated by the prior art is deficient in both resolution and sparsity, whereas the present invention can generate a music tempo spectrogram with higher resolution and better sparsity. The references used in this patent are as follows:
1. P. Grosche, M. Müller, F. Kurth. Cyclic tempogram: a mid-level tempo representation for music signals [C]. In: Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. 2010: IEEE.
2. G. Peeters. Time variable tempo detection and beat marking [C]. In: ICMC. 2005.
3. MIREX. MIREX music test data sets. http://www.music-ir.org/evaluation/MIREX/data/2006/tempo/tempo_train_2006.zip. 2017.
Summary of the invention
The purpose of the present invention is to overcome the resolution and sparsity problems of music tempo spectrograms generated by the prior art by providing a music tempo spectrogram generation method based on matching pursuit.
To solve the above technical problem, the technical solution adopted by the present invention is:
1. Input the music signal and generate the note onset detection function o(n);
2. Divide o(n) into frames, forming a number of frame signals;
3. Take the common music tempo range and, at a chosen tempo resolution, convert the set of tempos into a set of frequencies;
4. For each frequency in the frequency set, create a corresponding parent atom;
5. Shift every parent atom at a chosen granularity, generating one atom with each shift step; the atoms generated by shifting, together with the parent atom, form the atom set of the frequency corresponding to that parent atom;
6. Assemble the atom sets of all frequencies in the frequency set into a redundant dictionary;
7. Run matching pursuit with the redundant dictionary on each frame signal of o(n) for a set number of iterations, generating a series of decomposition coefficients and corresponding atoms;
8. Using the relationship between atoms in the redundant dictionary and music tempos, attribute the decomposition coefficients of each frame signal of o(n) to the coefficient of a particular music tempo;
9. Merge the music tempo spectrum vectors of all frame signals to form the music tempo spectrogram.
Beneficial effects of the present invention:
The present invention is characterized by generating the music tempo spectrogram with a matching pursuit algorithm based on a redundant dictionary. Its advantages are the fine resolution and the sparsity of the generated spectrogram.
The good resolution of the present invention comes from the flexible configuration of atoms in the redundant dictionary: atoms of higher resolution can be generated according to the required tempo resolution to form the redundant dictionary, giving the spectrogram higher resolution. Figs. 2-4 show music tempo spectrograms generated for one piece of music (train1.wav) from the test data set [3] of the Music Information Retrieval Evaluation eXchange (MIREX), using the autocorrelation function method (Fig. 2), the Fourier transform method (Fig. 3), and the matching pursuit method of the present invention (Fig. 4). The spacing between adjacent points on the tempo axis is 1 bpm (571 points in total). In terms of resolution, the autocorrelation function method in Fig. 2 resolves the low-tempo part well, but the high-tempo part is blurred: the bands gradually widen and the resolution drops markedly. In the Fourier transform result in Fig. 3, the bands are wide in both the high-tempo and low-tempo parts, and the resolution is clearly worse than the result of the present invention in Fig. 4 (for comparison with the autocorrelation function method and the Fourier transform method, the number of iterations is 571).
The good sparsity of the present invention comes from the redundant dictionary, which provides atoms highly similar to the original signal, and from the matching pursuit algorithm, which ensures that the decomposition coefficients of these highly similar atoms are relatively large while those of dissimilar atoms are small or even zero. Comparing Figs. 2-4 shows that the coefficients in Fig. 4 are clearly sparser, with a clearly larger proportion of zero or near-zero coefficients.
Besides good resolution and sparsity, the music tempo spectrogram generated by the present invention is also flexible in application. The flexibility lies in the adjustability of the tempo resolution, the shift granularity of the parent atoms, and the number of matching pursuit iterations. The tempo resolution can be adjusted when the common tempo range is converted into the frequency set. The shift granularity of the parent atoms can be set when the atom sets are generated: the smaller the granularity, the larger the atom set and the higher the spectrogram precision. Figs. 8-10 show shift granularities of 50, 20, and 5 respectively, and the comparison shows increasing precision. The number of iterations is set in the matching pursuit algorithm: the more iterations, the more coefficients are generated and the denser the spectrogram, while the order of coefficient magnitudes remains unchanged. Figs. 5-7 show 20, 10, and 5 iterations respectively; the spectrogram has fewer and fewer coefficients, but the larger coefficients are unchanged.
Description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the drawings of the embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the flow chart of music tempo spectrogram generation provided by an embodiment of the invention;
Fig. 2 is the music tempo spectrogram generated using the autocorrelation function method;
Fig. 3 is the music tempo spectrogram generated using the Fourier transform method;
Fig. 4 is the music tempo spectrogram generated by the matching pursuit method of the present invention (571 iterations, shift granularity 2);
Fig. 5 is the music tempo spectrogram generated by the matching pursuit method of the present invention (20 iterations, shift granularity 2);
Fig. 6 is the music tempo spectrogram generated by the matching pursuit method of the present invention (10 iterations, shift granularity 2);
Fig. 7 is the music tempo spectrogram generated by the matching pursuit method of the present invention (5 iterations, shift granularity 2);
Fig. 8 is the music tempo spectrogram generated by the matching pursuit method of the present invention (20 iterations, shift granularity 50);
Fig. 9 is the music tempo spectrogram generated by the matching pursuit method of the present invention (20 iterations, shift granularity 20);
Fig. 10 is the music tempo spectrogram generated by the matching pursuit method of the present invention (20 iterations, shift granularity 5).
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Music tempo resolution borrows the notion of frequency resolution: it is expressed as the spacing between two adjacent valid points of the tempo component in the music tempo spectrogram, and the larger the spacing, the poorer the tempo resolution. Music tempo spectrogram sparsity refers to the number of nonzero elements among all spectrogram coefficients: the fewer the nonzero elements, the stronger the sparsity and the better the discriminability. An embodiment of the present invention provides a music tempo spectrogram generation method based on matching pursuit. As shown in Fig. 1, the method comprises:
1. Input the music signal and generate the note onset detection function o(n)
The input music signal is usually a file containing a waveform, in a format such as wav or mp3. The first stage of music tempo spectrogram generation comprises signal transformation, feature extraction, and generation of the onset detection function; its output is the note onset detection function o(n) of length N, that is, a vector. This stage can be implemented following [1].
2. Divide o(n) into frames, forming a number of frame signals
o(n) is divided into frames. Preferably, the frame length is 6 seconds (let there be M points in a frame) and the hop size is about 0.2 seconds, forming a detection function matrix X = X(m, n), m ∈ [1...M], n ∈ [1...N], with M rows and N columns.
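The framing step can be sketched in Python as follows. The function name `frame_signal`, the rounding of the frame and hop lengths to whole samples, and the choice to drop a trailing partial frame are assumptions made for illustration; the source only gives the 6-second frame length and the roughly 0.2-second hop.

```python
import numpy as np

def frame_signal(o, fo, frame_sec=6.0, hop_sec=0.2):
    """Split o(n), sampled at fo Hz, into overlapping frames stored as columns."""
    o = np.asarray(o, dtype=float)
    M = int(round(frame_sec * fo))            # points per frame
    hop = max(1, int(round(hop_sec * fo)))    # points between frame starts
    if len(o) < M:
        raise ValueError("signal is shorter than one frame")
    # Assumption: a trailing partial frame is dropped rather than zero-padded.
    n_frames = 1 + (len(o) - M) // hop
    X = np.stack([o[i * hop : i * hop + M] for i in range(n_frames)], axis=1)
    return X                                  # shape (M, n_frames)

X = frame_signal(np.arange(1000.0), fo=50.0)
```

With these toy values each frame has 300 points and consecutive frames start 10 samples apart.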
3. Take the common music tempo range τ ∈ [30, 480], τ ∈ R, and convert the set of tempos into a set of frequencies according to the required tempo resolution
The prior art does not let the user choose the tempo resolution of the generated spectrogram, whereas the present invention can freely choose the tempo resolution, generate the corresponding redundant dictionary, and generate the corresponding music tempo spectrogram by matching pursuit. This reflects the flexibility of the invention: the tempo resolution can be configured according to the application. The tempo resolution value of the present invention can be a positive integer 1, 2, ... that is the same in all subintervals; it can follow the values used by the autocorrelation function method or the Fourier transform method; or the range can be divided into subintervals with a different tempo resolution in each, for example 0.25 in the most common tempo interval [80, 150] and 0.5 in the other subintervals. For ease of comparison, the embodiment uses a tempo resolution of 1 over the whole range, so τ ∈ [30, 480], τ ∈ Z, is converted into the frequency set $\{f_b \mid f_b = \tau/60,\ \tau = [30, 31, \ldots, 480],\ b = [1..B]\}$, where b is the index of the frequency in the frequency set and B is the maximum index.
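The conversion from a tempo grid to a frequency set can be sketched as below, covering both the uniform 1 bpm grid of the embodiment and the piecewise example (0.25 bpm in [80, 150], 0.5 bpm elsewhere). The function names are hypothetical, and treating each segment as a half-open interval [lo, hi) so that boundary tempos are not duplicated is an assumption.

```python
import numpy as np

def tempo_to_frequency_set(lo=30.0, hi=480.0, step=1.0):
    """Uniform tempo grid in bpm and the matching frequencies f_b = tau / 60 in Hz."""
    count = int(round((hi - lo) / step)) + 1
    tempi = lo + step * np.arange(count)
    return tempi, tempi / 60.0

def piecewise_tempo_grid(segments):
    """segments: list of (lo, hi, step); each segment covers [lo, hi) in bpm."""
    return np.concatenate([np.arange(lo, hi, step) for lo, hi, step in segments])

tempi, freqs = tempo_to_frequency_set()          # 30, 31, ..., 480 bpm
fine = piecewise_tempo_grid([(30, 80, 0.5), (80, 150, 0.25), (150, 480.5, 0.5)])
fine_freqs = fine / 60.0
```

The index of a tempo in `tempi` plays the role of the index b, and `len(tempi)` is B.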
4. For each frequency in the frequency set, create a corresponding parent atom
Specifically, for each frequency $f_b$ in the frequency set obtained in step 3, a cosine function of that frequency is created as the corresponding parent atom $\alpha_b$, whose length is the frame length M of o(n). Its form is $\alpha_b = \cos(2\pi f_b t)$, $t = (0 \ldots M-1)/f_o$, where $f_o$ is the sampling rate of o(n) and t is time.
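A parent atom as defined above is a sampled cosine, which can be written directly. The function name `parent_atom` is hypothetical.

```python
import numpy as np

def parent_atom(f_b, M, fo):
    """alpha_b = cos(2*pi*f_b*t) sampled at t = (0 ... M-1) / fo."""
    t = np.arange(M) / fo
    return np.cos(2.0 * np.pi * f_b * t)

# Example: 120 bpm is 2 Hz; a 6-second frame at fo = 50 Hz has M = 300 points.
alpha = parent_atom(2.0, M=300, fo=50.0)
```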
5. Shift every parent atom to the right at a chosen granularity, generating one atom with each shift step; the atoms generated by shifting, together with the parent atom, form the atom set of the frequency corresponding to that parent atom
The support of the parent atom $\alpha_b$ is [0, M-1], and the shift granularity d = 1, 2, 3, ... is a positive integer. The parent atom $\alpha_b$ is shifted right by d*j (j = 1, 2, 3, ...). After the shift, the left part of the support, [0, M-d*j-1], is filled with the values $\cos(-2\pi f_b t)$, $t = (M-d*j \ldots 1)/f_o$. Each shift yields a new atom. Because the parent atom here is a periodic function, the maximum shift is set to no more than one period. Each parent atom $\alpha_b$, together with the atoms obtained by shifting it, constitutes the atom set $d_b$ corresponding to that parent atom.
The adjustable shift granularity of the parent atoms in this step is another aspect of the flexibility of the present invention. The smaller the granularity, the larger the atom set and the higher the spectrogram precision, but the longer the computation of the whole music tempo spectrogram takes. Figs. 8-10 show shift granularities of 50, 20, and 5 respectively; the comparison shows that the spectrogram precision becomes higher and higher. When using the present invention, the shift granularity of the parent atoms can be chosen according to the computation time and spectrogram precision requirements.
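One way to read the shifting step is as a delay of the cosine: shifting right by s samples and filling the vacated left samples with the cosine evaluated at negative time gives cos(2π f_b (m − s) / f_o) for every sample m. The sketch below uses that reading, which is an interpretation rather than a statement from the source; the function name `atom_set` and the choice to stop strictly before a full period (where the atom would repeat the parent) are also assumptions.

```python
import numpy as np

def atom_set(f_b, M, fo, d):
    """Parent atom plus its right-shifted copies, for shifts 0, d, 2d, ... < one period."""
    period = fo / f_b                      # samples per cycle of the parent atom
    shifts = np.arange(0, period, d)       # shift 0 is the parent atom itself
    m = np.arange(M)
    # Assumption: a right shift by s samples is a delayed cosine over the full support.
    return np.stack([np.cos(2.0 * np.pi * f_b * (m - s) / fo) for s in shifts])

atoms = atom_set(2.0, M=300, fo=50.0, d=5)   # period is 25 samples, so 5 atoms
```

A smaller d gives more atoms per frequency, which matches the trade-off between precision and computation time described above.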
6. Assemble the atom sets of all frequencies in the frequency set from step 5 into a redundant dictionary
The atom sets $d_b$ corresponding to all frequencies $f_b$ in the frequency set are assembled into a redundant dictionary D.
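A sketch of the assembly step follows. Besides stacking the atoms, it records which frequency index each atom belongs to, since that mapping is needed later when coefficients are attributed to tempos. The function name `build_dictionary` and the `owner` array are hypothetical, and normalizing each atom to unit length is an assumption (matching pursuit is usually stated for unit-norm atoms, but the source does not mention normalization).

```python
import numpy as np

def build_dictionary(freqs, M, fo, d):
    """Stack the atom sets of all frequencies into D and record each atom's frequency index."""
    m = np.arange(M)
    atoms, owner = [], []
    for b, f_b in enumerate(freqs):
        for s in np.arange(0, fo / f_b, d):          # shifts within one period
            g = np.cos(2.0 * np.pi * f_b * (m - s) / fo)
            atoms.append(g / np.linalg.norm(g))      # assumption: unit-norm atoms
            owner.append(b)
    return np.array(atoms), np.array(owner)

D, owner = build_dictionary([1.0, 2.0], M=300, fo=50.0, d=5)
```

Lower frequencies have longer periods and therefore contribute more shifted atoms at a given granularity.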
7. Run matching pursuit with the redundant dictionary on each frame signal of o(n) for a set number of iterations, generating a series of decomposition coefficients and corresponding atoms:
For each frame signal of o(n), that is, each column $X_i$, i ∈ [1..N], of the detection function matrix, the matching pursuit algorithm is run with the redundant dictionary D:
(1) Set the residual signal $y_n = X_i$ with n = 0 and start the loop;
(2) Compute the inner products $\langle y_n, g_j \rangle$ of all atoms $g_j \in D$ of the redundant dictionary with the residual signal $y_n$, select the atom $g_k$ whose inner product has the largest absolute value as the matched atom of this iteration, and save the decomposition coefficient of the n-th iteration $s_n = |\langle y_n, g_k \rangle|$ and the corresponding atom $g_n = g_k$;
(3) Recompute the residual signal $y_{n+1} = y_n - |\langle y_n, g_k \rangle|\, g_k$;
(4) If the number of iterations, or the ratio of the residual signal energy to the original signal energy, reaches the required value, exit the loop; otherwise set n = n + 1 and continue from step (2).
Preferably, the present invention terminates the loop by iteration count; the count can be set according to the requirements on the music tempo spectrogram, for example K = 10, 20, and so on. After the loop terminates, $s_n$, $g_n$, n = [1...K] are obtained.
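The loop in steps (1) to (4) can be sketched as below. Two points are assumptions rather than statements from the source: the dictionary rows are taken to have unit norm, and the residual is updated with the signed inner product (the usual matching pursuit update) while the magnitude is stored as the coefficient. The function name `matching_pursuit` and the `tol` parameter are hypothetical.

```python
import numpy as np

def matching_pursuit(x, D, K, tol=0.0):
    """Greedy decomposition of x over the unit-norm rows of D for at most K iterations."""
    y = np.asarray(x, dtype=float).copy()      # residual, initially the frame itself
    e0 = float(y @ y)                          # energy of the original signal
    coeffs, picks = [], []
    for _ in range(K):
        ip = D @ y                             # inner products with every atom
        k = int(np.argmax(np.abs(ip)))         # best-matching atom
        coeffs.append(abs(ip[k]))              # stored coefficient s_n
        picks.append(k)
        # Assumption: signed update, so the residual energy never increases.
        y = y - ip[k] * D[k]
        if e0 > 0 and (y @ y) / e0 <= tol:     # optional energy-ratio stop
            break
    return np.array(coeffs), np.array(picks, dtype=int), y

# Toy check with an orthonormal dictionary.
coeffs, picks, residual = matching_pursuit([3.0, -2.0, 0.5, 0.0], np.eye(4), K=2)
```

With a small K only the largest components are extracted, which is the source of the sparsity discussed in the text.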
The redundant dictionary that the present invention generates from the common music tempo range provides atoms highly similar to the original signal, and the matching pursuit algorithm ensures that the decomposition coefficients of these highly similar atoms are relatively large while those of dissimilar atoms are small or even zero, so the music tempo spectrogram generated by the present invention is sparser. Comparing Figs. 2-4 shows that the coefficients in Fig. 4 are clearly sparser, with a clearly larger proportion of zero or near-zero coefficients.
The number of matching pursuit iterations is adjustable. The more iterations, the more coefficients are generated and the denser the spectrogram (the coefficient magnitudes and the order in which they are generated remain unchanged), but the computation time grows with the number of iterations. Figs. 5-7 show 20, 10, and 5 iterations respectively; the spectrogram has fewer and fewer coefficients. In some applications a small number of large coefficients is enough to complete the task, so the iteration count can be set small and the computation time is short; other applications need many coefficients to provide enough information and only need to increase the iteration count. This too reflects the flexibility of the present invention in application, which the prior art lacks.
8. Using the relationship between atoms in the redundant dictionary and music tempos, attribute the decomposition coefficients of each frame signal of o(n) to the coefficient of a particular music tempo
For each frame signal, first create a music tempo spectrum vector $S_n$, n = [1..N], initialized to 0. The index of each component is the music tempo index b, b = [1..B], and the value of each component is the decomposition coefficient of that tempo. Then, for each decomposition coefficient $s_n$ of the frame signal, find the music tempo index b from the frequency of the atom $g_n$ in the redundant dictionary, and take $s_n$ as the decomposition coefficient of that tempo. If several atoms correspond to the same music tempo index, their decomposition coefficients are summed and the sum is taken as the decomposition coefficient of that tempo.
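The attribution step amounts to a scatter-add from atoms to tempo indices. In the sketch below, `owner` is a hypothetical array giving the tempo index of every dictionary atom, `picks` holds the indices of the atoms selected for one frame, and `coeffs` holds their decomposition coefficients; none of these names come from the source.

```python
import numpy as np

def tempo_vector(coeffs, picks, owner, B):
    """Accumulate decomposition coefficients into a length-B tempo spectrum vector."""
    S = np.zeros(B)
    idx = np.asarray(owner)[np.asarray(picks, dtype=int)]   # tempo index of each picked atom
    # np.add.at sums correctly when several atoms map to the same tempo index.
    np.add.at(S, idx, np.asarray(coeffs, dtype=float))
    return S

owner = np.array([0, 0, 1, 2])        # atoms 0 and 1 share tempo index 0
S = tempo_vector([2.0, 1.0, 0.5], picks=[1, 3, 0], owner=owner, B=3)
```

Stacking the vectors of all frames side by side, for example with `np.stack(vectors, axis=1)`, then gives an array indexed as S(b, n).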
9. Merge the music tempo spectrum vectors of all frame signals to form the music tempo spectrogram
The music tempo spectrum vectors $S_n$ of all frames are assembled in sequence into the music tempo spectrogram S = S(b, n), b = [1..B], n = [1...N].
The above are only embodiments of the present invention and do not limit its scope. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present invention, or any direct or indirect application in other related technical fields, is likewise included in the scope of protection of the present invention.

Claims (2)

1. a kind of music-tempo spectrogram generation method based on match tracing, specifically comprises the following steps:
S1. music signal is inputted, note starting point detection function o (n) is generated;
S2. to o (n) framings, several frame signals are formed;
Framing is carried out to o (n), it is preferable that the frame length of framing is 6 seconds, if there is M point in frame, often jumps 0.2 second, then forms line number Detection function matrix X=X (m, n) m ∈ [1...M] the n ∈ [1...N] for being N for M, columns;
S3. Take the common music-tempo interval τ ∈ [30, 480], τ ∈ R, and convert the set of tempos selected according to the required music-tempo resolution into a frequency set:
The music-tempo resolution takes a positive integer value 1, 2, ... that is identical in all subintervals; or it is chosen in the manner of the autocorrelation-function method or the Fourier transform; or the interval is divided into subintervals, each with its own music-tempo resolution. When a resolution of 1 is used over the entire interval, then τ ∈ [30, 480], τ ∈ Z, where Z denotes the positive integers, and the tempos are converted into the frequency set {f_b | f_b = τ/60, τ = [30, 31, ..., 480], b = [1..B]}, where b is the index of the corresponding frequency in the set and B is the largest index;
S4. For each frequency in the frequency set, create a corresponding parent atom:
For each frequency f_b in the frequency set obtained in step S3, create a cosine function of that frequency as the corresponding parent α_b, whose length equals the frame length M of o(n), of the form α_b = cos(2π f_b t), t = (0...M−1)/f_o, where f_o is the sampling rate of o(n) and t denotes time;
S5. Shift all parents to the right with a given granularity; each shift step generates one atom, and the atoms generated by shifting, together with the parent, form the atom set of the frequency corresponding to that parent:
The support of the parent α_b is [0, M−1], and the shift granularity d = 1, 2, 3, ... is a positive integer. The parent α_b is shifted to the right by d*j (j = 1, 2, 3, ...); after the shift, the left part [0, M−d*j−1] of the support is filled with the values cos(−2π f_b t), t = (M−d*j...1)/f_o, so that each shift yields one new atom. Because the parent is a periodic function, the maximum shift is set to no more than one period. Each parent α_b, together with the atoms obtained from these shifts, forms the atom set d_b corresponding to that parent;
S6. Assemble the atom sets from step S5 for all frequencies in the frequency set into a redundant dictionary:
The atom sets d_b corresponding to all frequencies f_b in the frequency set are assembled into one redundant dictionary D;
S7. Perform matching pursuit on each frame signal of o(n) with the redundant dictionary, iterating a certain number of times, to generate a series of decomposition coefficients and corresponding atoms:
For each frame signal of o(n), i.e., each row X_i, i ∈ [1..N], of the detection function matrix, run the matching pursuit algorithm with the redundant dictionary D:
(1) Set the residual signal y_n = X_i, n = 0, and start the loop;
(2) Compute the inner product <y_n, g_j> of every atom g_j ∈ D of the redundant dictionary with the residual signal y_n, select the atom g_k whose inner product has the largest absolute value as the matched atom of this iteration, and save the decomposition coefficient of the n-th iteration s_n = |<y_n, g_k>| and the corresponding atom g_n = g_k;
(3) Recompute the residual signal y_{n+1} = y_n − |<y_n, g_k>| g_k;
(4) If the iteration count reaches its limit, or the energy ratio of the residual signal to the original signal reaches the required precision, exit the loop; otherwise set n = n + 1 and continue from step (2);
S8. According to the relationship between the atoms in the redundant dictionary and the music tempos, assign each decomposition coefficient of each frame signal of o(n) to a particular music tempo:
For each frame signal, first create a music-tempo spectrum vector S_n, n = [1..N], initialized to 0; the index of each component is the music-tempo index b, b = [1..B], and the value of each component is the decomposition coefficient of that music tempo; then, for each decomposition coefficient s_n of the frame signal, find the corresponding music-tempo index b from the frequency of its atom g_n in the redundant dictionary, and take s_n as the decomposition coefficient of that music tempo; if several atoms correspond to the same music-tempo index, their decomposition coefficients are summed, and the sum is used as the decomposition coefficient of that music tempo;
S9. Merge the music-tempo spectrum vectors of all frame signals to form the music-tempo spectrogram:
The music-tempo spectrum vectors S_n of all frames are assembled row-wise into the music-tempo spectrogram S = S(b, n), b = [1..B], n = [1...N].
2. A music-tempo spectrogram generation method based on matching pursuit, characterized in that the loop exit condition in sub-step (4) of step S7 is the iteration count: the iteration count is set according to the requirements on the music-tempo spectrogram, i.e., K is assigned as the iteration count and the loop exits when the count reaches the preset value K; after the loop terminates, s_n, g_n, n = [1...K] are obtained.
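As an illustration of steps S3 to S7 with the fixed iteration count K of claim 2, the following Python/NumPy sketch builds a small redundant dictionary and runs matching pursuit on one frame. It is a hedged sketch rather than the claimed method itself. Assumptions not stated in the text: tempos are in beats per minute, atoms are normalized to unit length, the right shift is realized by evaluating the cosine at delayed times (so the vacated left part is filled by the cosine's own periodic extension), and the residual update uses the signed inner product, as in standard matching pursuit, while the stored coefficient is its absolute value. All function names and the example parameters (`fs = 50`, a 20 BPM grid, `shift_step = 5`) are hypothetical.

```python
import numpy as np

def build_dictionary(tempos, frame_len, fs, shift_step):
    """S3-S6: one cosine parent per tempo plus right-shifted copies, as unit-norm columns."""
    t = np.arange(frame_len) / fs                    # t = (0 ... M-1) / f_o
    atoms, atom_to_tempo = [], []
    for b, tempo in enumerate(tempos):
        freq = tempo / 60.0                          # f_b = tau / 60
        period = max(1, int(round(fs / freq)))       # samples in one period of the parent
        for shift in range(0, period, shift_step):   # shift 0 is the parent; stay below one period
            atom = np.cos(2 * np.pi * freq * (t - shift / fs))
            atoms.append(atom / np.linalg.norm(atom))  # unit norm (assumption)
            atom_to_tempo.append(b)
    return np.column_stack(atoms), np.array(atom_to_tempo)

def matching_pursuit(frame, dictionary, num_iters):
    """S7 with a fixed iteration count K: returns coefficients, atom indices, final residual."""
    residual = np.asarray(frame, dtype=float).copy()
    coeffs, atom_ids = [], []
    for _ in range(num_iters):
        inner = dictionary.T @ residual              # <y_n, g_j> for every atom
        k = int(np.argmax(np.abs(inner)))            # atom with the largest |inner product|
        coeffs.append(abs(inner[k]))                 # s_n = |<y_n, g_k>|
        atom_ids.append(k)
        residual -= inner[k] * dictionary[:, k]      # signed update (assumption, see lead-in)
    return np.array(coeffs), atom_ids, residual

# Hypothetical example: 6-second frame at fs = 50 Hz, coarse tempo grid.
fs, frame_len = 50.0, 300
tempos = np.arange(60, 241, 20)                      # 60, 80, ..., 240 BPM
D, atom_to_tempo = build_dictionary(tempos, frame_len, fs, shift_step=5)

t = np.arange(frame_len) / fs
frame = 3.0 * np.cos(2 * np.pi * 2.0 * t)            # synthetic 2 Hz (120 BPM) pulse
coeffs, atom_ids, residual = matching_pursuit(frame, D, num_iters=3)
best_tempo = tempos[atom_to_tempo[atom_ids[0]]]
```

A finer tempo resolution or a smaller shift granularity enlarges the dictionary quickly, so in practice the grid and `shift_step` trade resolution against memory and run time; the number of iterations controls how sparse the resulting spectrum is.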
CN201710675484.3A 2017-08-09 2017-08-09 A kind of music-tempo spectrogram generation method based on match tracing Active CN107622774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710675484.3A CN107622774B (en) 2017-08-09 2017-08-09 A kind of music-tempo spectrogram generation method based on match tracing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710675484.3A CN107622774B (en) 2017-08-09 2017-08-09 A kind of music-tempo spectrogram generation method based on match tracing

Publications (2)

Publication Number Publication Date
CN107622774A CN107622774A (en) 2018-01-23
CN107622774B true CN107622774B (en) 2018-08-21

Family

ID=61088662

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710675484.3A Active CN107622774B (en) 2017-08-09 2017-08-09 A kind of music-tempo spectrogram generation method based on match tracing

Country Status (1)

Country Link
CN (1) CN107622774B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109256146B (en) * 2018-10-30 2021-07-06 Tencent Music Entertainment Technology (Shenzhen) Co., Ltd. Audio detection method, device and storage medium

Citations (3)

Publication number Priority date Publication date Assignee Title
CN101471068A (en) * 2007-12-26 2009-07-01 Samsung Electronics Co., Ltd. Method and system for searching music files based on wave shape through humming music rhythm
CN101512636A (en) * 2006-09-11 2009-08-19 Hewlett-Packard Development Company, L.P. Computational music-tempo estimation
CN101625855A (en) * 2008-07-09 2010-01-13 SK Telecom Investment (China) Co., Ltd. Method and device for manufacturing guide sound track and background music

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
JP4940588B2 (en) * 2005-07-27 2012-05-30 Sony Corporation Beat extraction apparatus and method, music synchronization image display apparatus and method, tempo value detection apparatus and method, rhythm tracking apparatus and method, music synchronization display apparatus and method
JP5008766B2 (en) * 2008-04-11 2012-08-22 Pioneer Corporation Tempo detection device and tempo detection program

Non-Patent Citations (3)

Title
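A New Tempo Feature Extraction Based on Modulation Spectrum Analysis for Music Information Retrieval Tasks; Kim HG; The Journal of The Korea Institute of Intelligent Transport Systems; 20070831; Vol. 6 (No. 2); pp. 95-106 *
Note onset detection based on matching pursuit; Gui Wenming (桂文明) et al.; Acta Electronica Sinica (电子学报); 20130630; Vol. 4 (No. 6); pp. 1225-1230 *
Research on note onset detection algorithms; Gui Wenming (桂文明); China Doctoral Dissertations Full-text Database, Information Science and Technology; 20150615 (No. 06); pp. 1-104 *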

Also Published As

Publication number Publication date
CN107622774A (en) 2018-01-23

Similar Documents

Publication Publication Date Title
Allen et al. Signal analysis: time, frequency, scale, and structure
Sun A pitch determination algorithm based on subharmonic-to-harmonic ratio
US6745155B1 (en) Methods and apparatuses for signal analysis
CN1319042C (en) Voice analysis device, voice analysis method and voice analysis program
Hussain Coherent structures and studies of perturbed and unperturbed jets
CN103854661A (en) Method and device for extracting music characteristics
CN107622774B (en) A kind of music-tempo spectrogram generation method based on match tracing
Mohammad et al. Robust singular spectrum transform
Chittora et al. Classification of normal and pathological infant cries using bispectrum features
Ranjani et al. A compact pitch and time representation for melodic contours in Indian art music
Le et al. Hyperbolic wavelet power spectra of nonstationary signals
Chang et al. Speech feature extracted from adaptive wavelet for speech recognition
Pratama et al. Human vocal type classification using MFCC and convolutional neural network
Liu et al. A note on time-frequency analysis of finger tapping
Leyuan et al. Research on time-frequency energy distribution characteristics of PSWFs signals based on WVD
Rust et al. The fast fourier transform for experimentalists, Part IV: Autoregressive spectral analysis
JPH0218598A (en) Speech analyzing device
Tomar et al. On the development of variable length Teager energy operator (VTEO).
Rust et al. The fast Fourier transform for experimentalists. Part III. Classical spectral analysis
Kumar et al. Raaga identification using clustering algorithm
Smaragdis et al. Non-negative matrix factorization for irregularly-spaced transforms
Lee et al. Chaos in segments from Korean traditional singing and Western singing
Cantri et al. Cumulative Scores Based for Real-Time Music Beat Detection System
Abraham et al. Signal periodicity detection using Ramanujan subspace projection
Yeh et al. The expected amplitude of overlapping partials of harmonic sounds

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant