CN107622774B - A music tempo spectrogram generation method based on matching pursuit
- Publication number: CN107622774B
- Application number: CN201710675484.3A
- Authority: CN (China)
- Prior art keywords: music, tempo, atom, spectrogram, frequency
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Abstract
The invention provides a music tempo spectrogram generation method based on matching pursuit and relates to the field of content-based music information retrieval. The method comprises the following steps: input a music signal, generate the note onset detection function o(n), and divide it into frames; take the common music tempo interval and convert it into a frequency set; for each frequency in the frequency set, create a corresponding parent atom; shift each parent atom, each shift generating a new atom; assemble all parent atoms and new atoms into a redundant dictionary; with this dictionary, run matching pursuit on each frame of o(n) to obtain the decomposition coefficient of each music tempo, and finally generate the music tempo spectrogram of the music. The music tempo spectrogram generated by the invention has high resolution and strong sparsity, and the music tempo resolution, the shift granularity of the parent atoms, and the number of matching pursuit iterations can be set flexibly as required, so as to generate music tempo spectrograms of different resolutions and different sparsity.
Description
Technical field
The present invention relates to the field of content-based music information retrieval, and more particularly to a music tempo spectrogram generation method based on matching pursuit.
Background art
I. Concepts and application fields related to the invention
The speed at which music proceeds is the music tempo. In contemporary music it is usually measured in beats per minute (bpm). For example, the tempo marking ♩ = 120 indicates that the music proceeds at 120 quarter notes per minute, that is, each quarter note lasts 0.5 seconds. The larger the bpm value, the faster the tempo.
Music tempo is closely related to the beat and rhythm of music and is one of its important features. In music information retrieval, tempo estimation means estimating how fast the music proceeds from its content, starting from a file that contains the music signal waveform in a format such as mp3 or wav. Tempo estimation is a challenging and important problem in its own right, and it is also the groundwork for research directions such as beat perception, rhythm recognition, genre recognition, and music structure analysis. For example, beat perception generally first estimates the tempo and then infers the meter type and beat structure from it; likewise, in genre recognition, rhythm and tempo can serve as salient features for identifying the genre.
Music tempo changes continually as the music proceeds. One cause is that the composition itself calls for a change; changes of this kind are generally infrequent within a piece, and many pieces have none at all. The other cause is the error introduced by playing or singing; changes of this kind are hard to avoid and are generally present throughout the piece. Estimating the tempo therefore really means estimating the tempo value at each point in time. Because of phenomena such as slurs and rests, tempi are easily confused and hard to tell apart, and the tempo is also subject to error, so the tempo at each point in time can be regarded as a vector made up of several tempo components. The tempo at every point in time of a piece can be described with a music tempo spectrogram (tempogram). Applications such as beat tracking, rhythm recognition, and genre recognition can extract useful information from the music tempo spectrogram.
II. Existing techniques and process for generating music tempo spectrograms
Generating a music tempo spectrogram is generally divided into two stages. The first stage generates the note onset detection function, where a note onset is the moment at which a note in the music starts to be played or sung; some documents, such as [1], call this stage novelty curve generation. The second stage generates the spectrogram.
The first stage mainly comprises signal transformation, feature extraction, and generation of the onset detection function. The purpose of signal transformation is to convert the music waveform signal from one-dimensional high-frequency data into a low-frequency representation. Usually the signal is first divided into frames and each frame is then transformed; transformation methods include the short-time Fourier transform (STFT) and the wavelet transform (WT). Feature extraction derives time-domain, frequency-domain, and time-frequency features from the low-frequency representation produced by the transformation. A typical time-domain feature is the amplitude envelope; frequency-domain features include spectral flux and frequency-domain energy; time-frequency features are mainly representations based on the wavelet transform or Cohen-class time-frequency distributions. The onset detection function is generated by computing, from the features extracted for each frame, how they change between consecutive frames; a note onset generally occurs where the positive change between consecutive frames increases suddenly. A typical procedure for generating the note onset detection function is described in document [1].
The second stage extracts periodicity from the values of the note onset detection function produced by the first stage and forms the music tempo spectrogram. There are currently two main methods for this stage: the autocorrelation function (ACF) method and the Fourier transform (FT) method [1].
The ACF method windows the note onset detection function and computes its autocorrelation, extracts the periodicity of the note onsets as a function of lag, and converts lag into music tempo to form the music tempo spectrogram. Its formula is [1]:
$$A(t, l) = \sum_{n \in \mathbb{Z}} \frac{o(n)\, o(n+l)\, W(n-t)}{2N+1-l} \tag{1.1}$$
where $t$ and $n$ are discrete time indices, $l = 1 \ldots N$ is the lag, $o(n)$ is the note onset detection function, and $W(n)$ is a rectangular window centered at 0 with support $[-N, N]$. If $f_s$ is the sampling frequency of $o(n)$, lag $l$ corresponds to a period of $l/f_s$, a frequency of $f_s/l$, and a music tempo $\tau = 60 f_s / l$.
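As an illustration only, Eq. (1.1) can be evaluated directly for one time index. The sketch below is not taken from document [1]; the function names `acf_tempogram_column` and `lag_to_tempo` are hypothetical, and treating samples outside the signal as zero is an assumption.

```python
def acf_tempogram_column(o, t, N):
    """Evaluate A(t, l) of Eq. (1.1) for lags l = 1..N at time index t.

    o is the onset detection function as a list of floats. Samples outside
    the list are treated as zero (an assumption). W is a rectangular window
    with support [-N, N] around t.
    """
    def sample(n):
        return o[n] if 0 <= n < len(o) else 0.0

    column = []
    for lag in range(1, N + 1):
        total = 0.0
        for n in range(t - N, t + N + 1):      # W(n - t) = 1 only on this range
            total += sample(n) * sample(n + lag)
        column.append(total / (2 * N + 1 - lag))
    return column


def lag_to_tempo(lag, fs):
    """Convert a lag in samples of o(n) to a tempo in bpm: tau = 60 * fs / lag."""
    return 60.0 * fs / lag
```

Each entry of the returned column belongs to one lag, and `lag_to_tempo` maps that lag onto the tempo axis, which is why the tempo axis of an ACF spectrogram is not uniformly spaced.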
The FT method applies a windowed Fourier transform to the note onset detection function to obtain its frequency-domain characteristics, and converts frequency into music tempo to form the music tempo spectrogram. Its formula is:
$$F(t, \omega) = \sum_{n \in \mathbb{Z}} o(n)\, W(n-t)\, e^{-2\pi i \omega n} \tag{1.2}$$
where $t$ and $n$ are discrete time indices, $\omega$ is the frequency, $o(n)$ is the note onset detection function, and $W(n)$ is a Hanning window centered at 0 with support $[-N, N]$. There are currently two ways to choose $\omega$. One follows the discrete Fourier transform (DFT) approach of document [2]: $\omega > 0$ is discretized into $N$ frequency points spaced $f_s/N$ Hz apart. The other follows document [1]: take $\omega = \tau/60$ Hz with $\tau \in [30, 480]$ bpm, the common music tempo range, and compute the coefficient of each $\omega$ at each point in time with the formula above.
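The second way of choosing ω can be sketched as follows. This is a minimal illustration, not the implementation of document [1]: the function name is hypothetical, and dividing the sample index n by the sampling frequency fs inside the exponent (so that ω can be given in Hz) is an assumption that Eq. (1.2) leaves implicit.

```python
import cmath
import math


def fourier_tempogram_column(o, t, N, fs, tempi):
    """Return |F(t, w)| of Eq. (1.2) for each tempo in `tempi` (bpm)."""
    column = []
    for tau in tempi:
        omega = tau / 60.0                           # tempo in bpm -> frequency in Hz
        acc = 0j
        for n in range(max(0, t - N), min(len(o), t + N + 1)):
            # Hanning window centered on t with support [-N, N]
            hann = 0.5 * (1.0 + math.cos(math.pi * (n - t) / N))
            # n / fs converts the sample index to seconds (assumption)
            acc += o[n] * hann * cmath.exp(-2j * math.pi * omega * n / fs)
        column.append(abs(acc))
    return column
```

Evaluating the sum on a finer tempo grid only samples the same windowed spectrum more densely; the width of the spectral peaks is still set by the window length.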
III. Deficiencies of the prior art
The contribution of the present invention lies in the second stage of music tempo spectrogram generation. To explain the deficiencies of the prior art, two concepts are introduced and explained in turn: music tempo resolution and music tempo spectrogram sparsity. Music tempo resolution borrows the notion of frequency resolution: it is expressed as the gap between two adjacent valid points of the tempo component in the music tempo spectrogram, and the larger the gap, the poorer the tempo resolution. Music tempo spectrogram sparsity refers to the number of nonzero elements among all spectrogram coefficients: the fewer the nonzero elements, the stronger the sparsity and the better the discriminability.
1. The prior art cannot meet the tempo resolution that common music tempi require
Music tempo resolution varies inversely with the gap between two adjacent valid points in the spectrogram: the larger the gap, the poorer the resolution, and the smaller the gap, the better.
Consider the ACF method. The difference between the tempo components at two adjacent lags is $\Delta\tau = 60 f_s/l - 60 f_s/(l+1) = 60 f_s/(l(l+1))$. The tempo gap therefore shrinks as the lag grows: the larger the lag, the higher the tempo resolution, which means the resolution degrades as the tempo increases. Following document [1], take $f_s = 1/0.023 \approx 43.5$. At $l = 51$ ($\tau = 51.2$), $\Delta\tau = 0.98$; for $l < 51$ the gap exceeds 1 bpm. At $l = 21$ ($\tau = 124.2$), $\Delta\tau = 5.6$, which can no longer resolve common music tempi ($\tau \in [30, 480]$), and the situation is worse still for $l < 21$. To make the maximum $\Delta\tau$ (at $l = 1$) less than 1, $f_s$ must be less than 1/30, which means the frame length in the first-stage framing must exceed 30 seconds; the error of the note onsets would then also exceed 30 seconds, which is clearly infeasible. For the ACF method, therefore, the music tempo resolution is not constant: once the tempo exceeds 51 bpm the gap is larger than 1 bpm, which does not meet the resolution that common music tempi require.
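The figures quoted above can be checked with a few lines of arithmetic. The sketch takes the gap between adjacent lags as 60·fs/l − 60·fs/(l+1), which follows from τ = 60·fs/l; the function name `acf_tempo_step` is hypothetical.

```python
def acf_tempo_step(lag, fs):
    """Tempo gap in bpm between lag `lag` and lag `lag + 1`."""
    return 60.0 * fs / lag - 60.0 * fs / (lag + 1)


fs = 43.5  # sampling frequency of o(n) that the text takes from document [1]
for lag in (51, 21, 1):
    tempo = 60.0 * fs / lag
    print(f"lag={lag:3d}  tempo={tempo:8.1f} bpm  gap={acf_tempo_step(lag, fs):8.2f} bpm")
```

The gap is just under 1 bpm at lag 51 and grows quickly for shorter lags, which is the behavior the paragraph describes.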
Consider the DFT method of document [2]. The tempo gap is $60 f_s/N$ bpm. With $f_s = 1/0.023 \approx 43.5$, reaching a music tempo resolution of 1 bpm requires $N \geq 60 f_s = 2610$, and a window of that length lasts 60 seconds. Most music is currently shorter than 300 seconds, so this window length is incompatible with typical music durations. The other way to improve the tempo resolution is to lower $f_s$, but that conflicts with the precision required of $o(n)$ and is therefore infeasible.
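The window-length requirement can be worked out the same way. This is a small sketch of the arithmetic only; the function name is hypothetical.

```python
import math


def dft_window_for_resolution(fs, resolution_bpm):
    """Smallest window length N (in samples) with 60 * fs / N <= resolution_bpm,
    together with the duration of that window in seconds."""
    n_samples = math.ceil(60.0 * fs / resolution_bpm)
    return n_samples, n_samples / fs
```

For fs = 43.5 and a target of 1 bpm, the call `dft_window_for_resolution(43.5, 1.0)` gives the 2610 samples and 60 seconds quoted in the text.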
Consider the FT method of document [1]. After windowing the signal, this method computes the coefficients of the frequencies $\omega$ that correspond to common music tempi by evaluating the discrete-time Fourier transform (DTFT). In effect it only samples $\omega$ approximately at discrete points; the actual frequency resolution is not improved.
In summary, the prior art cannot meet the tempo resolution that common music tempi require; that is, parts of the generated music tempo spectrogram will be blurred.
2. The music tempo spectrograms generated by the prior art are not sparse enough
The sparser the music tempo spectrogram, the better its discriminability. Put another way, strong sparsity means that the spectrogram energy is concentrated, which gives good properties and good results in applications.
For the ACF method, when the music tempo exceeds 51 bpm, the component coefficients at normal precision (for example 1 bpm) must be computed by interpolation, which inevitably reduces sparsity. For the FT method, spectral leakage and the resolution problem likewise reduce the sparsity of the frequency coefficients. The music tempo spectrograms generated by the prior art are therefore not sparse enough, and their energy is poorly concentrated.
In summary, the music tempo spectrograms generated by the prior art are deficient in both resolution and sparsity, whereas the present invention can generate music tempo spectrograms with higher resolution and better sparsity. The references used in this patent are as follows:
1. P. Grosche, M. Müller, F. Kurth. Cyclic tempogram - a mid-level tempo representation for music signals [C]. In Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on. 2010: IEEE.
2. G. Peeters. Time variable Tempo Detection and beat Marking [C]. In ICMC. 2005.
3. MIREX. MIREX music test data sets. http://www.music-ir.org/evaluation/MIREX/data/2006/tempo/tempo_train_2006.zip. 2017.
Summary of the invention
The purpose of the invention is to overcome the resolution and sparsity problems of the music tempo spectrograms generated by the prior art by providing a music tempo spectrogram generation method based on matching pursuit.
To solve the above technical problems, the technical solution adopted by the invention is:
1. Input a music signal and generate the note onset detection function o(n);
2. Divide o(n) into frames, forming a number of frame signals;
3. Take the common music tempo interval and, at a chosen music tempo resolution, convert the set of tempi into a set of frequencies;
4. For each frequency in the frequency set, create a corresponding parent atom;
5. Shift every parent atom at a chosen granularity, generating one atom per shift step; the atoms generated by shifting, together with the parent atom, form the atom set of the frequency corresponding to that parent atom;
6. Assemble the atom sets of all frequencies in the frequency set into a redundant dictionary;
7. Run matching pursuit with the redundant dictionary on each frame signal of o(n) for a chosen number of iterations, generating a series of decomposition coefficients and their corresponding atoms;
8. Attribute the decomposition coefficients of each frame signal of o(n) to music tempi according to the relationship between the atoms in the redundant dictionary and the music tempi;
9. Merge the music tempo spectrum vectors of all frame signals to form the music tempo spectrogram.
Beneficial effects of the invention:
The invention is characterized by generating the music tempo spectrogram with a matching pursuit algorithm based on a redundant dictionary. Its advantage is that the generated spectrogram has good resolution and is sparse.
The good resolution of the invention comes from the flexible configuration of the atoms in the redundant dictionary: atoms of higher resolution can be generated according to the required music tempo resolution to form the redundant dictionary, which raises the resolution of the spectrogram. Fig. 2 to Fig. 4 show the music tempo spectrograms of one piece of music (train1.wav) from the test data set [3] of the Music Information Retrieval Evaluation eXchange (MIREX), generated with the autocorrelation function method (Fig. 2), the Fourier transform method (Fig. 3), and the matching pursuit method of the invention (Fig. 4). Adjacent points on the music tempo axis are 1 bpm apart (571 points in total). In terms of resolution, the autocorrelation function method in Fig. 2 resolves the low-tempo part well, but the high-tempo part is blurred: the bands grow progressively wider and the resolution drops markedly. With the Fourier transform method in Fig. 3, the bands are wide in both the high-tempo and low-tempo parts, and the resolution is clearly poorer than the result of the invention in Fig. 4 (for comparison with the autocorrelation function method and the Fourier transform method, the number of iterations is 571).
The good sparsity of the invention comes from the redundant dictionary, which provides atoms highly similar to the original signal, and from the matching pursuit algorithm, which ensures that the decomposition coefficients of these highly similar atoms are relatively large while those of dissimilar atoms are small or even zero. Comparing Fig. 2 to Fig. 4 shows that the coefficients in Fig. 4 are markedly sparser, with a markedly larger share of zero or near-zero coefficients.
Besides good resolution and sparsity, the music tempo spectrogram generated by the invention is flexible in application. The flexibility lies in the adjustability of the music tempo resolution, the shift granularity of the parent atoms, and the number of matching pursuit iterations. The tempo resolution can be adjusted when the common tempo interval is converted into the frequency set. The shift granularity of the parent atoms can be set when the atom sets are generated: the smaller the granularity, the larger the atom set and the higher the precision of the spectrogram. Fig. 8 to Fig. 10 correspond to shift granularities of 50, 20, and 5, and the comparison shows the precision rising. The number of iterations is set in the matching pursuit algorithm: the larger it is, the more coefficients are generated and the denser the spectrogram, but the order of the coefficients by magnitude remains unchanged. Fig. 5 to Fig. 7 correspond to 20, 10, and 5 iterations; the spectrogram clearly has fewer and fewer coefficients, while the larger coefficients remain unchanged.
Description of the drawings
To explain the technical solutions of the embodiments of the invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the drawings of the embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is the flowchart of music tempo spectrogram generation provided by an embodiment of the invention;
Fig. 2 is the music tempo spectrogram generated with the autocorrelation function method;
Fig. 3 is the music tempo spectrogram generated with the Fourier transform method;
Fig. 4 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 571, shift granularity 2);
Fig. 5 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 20, shift granularity 2);
Fig. 6 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 10, shift granularity 2);
Fig. 7 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 5, shift granularity 2);
Fig. 8 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 20, shift granularity 50);
Fig. 9 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 20, shift granularity 20);
Fig. 10 is the music tempo spectrogram generated with the matching-pursuit-based method of the invention (number of iterations 20, shift granularity 5).
Detailed description of the embodiments
The technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some, not all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the invention without creative effort fall within the scope of protection of the invention.
Music tempo resolution borrows the notion of frequency resolution: it is expressed as the gap between two adjacent valid points of the tempo component in the music tempo spectrogram, and the larger the gap, the poorer the tempo resolution. Music tempo spectrogram sparsity refers to the number of nonzero elements among all spectrogram coefficients: the fewer the nonzero elements, the stronger the sparsity and the better the discriminability. The embodiment of the invention provides a music tempo spectrogram generation method based on matching pursuit. As shown in Fig. 1, the method comprises:
1. Input a music signal and generate the note onset detection function o(n)
The input music signal is usually a file containing a waveform, in a format such as wav or mp3. The first stage of music tempo spectrogram generation comprises signal transformation, feature extraction, and generation of the onset detection function, and it outputs a note onset detection function o(n) of length N, that is, a vector. This stage can be implemented with reference to document [1].
2. Divide o(n) into frames, forming a number of frame signals
o(n) is divided into frames. Preferably, the frame length is 6 seconds (let each frame contain M points) and the hop size is about 0.2 seconds, forming a detection function matrix $X = X(m, n)$, $m \in [1 \ldots M]$, $n \in [1 \ldots N]$, with M rows and N columns.
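A minimal sketch of this framing step is given below. The function name `frame_signal` and its parameters are hypothetical, and dropping a trailing frame that is shorter than M points is an assumption the text does not address.

```python
def frame_signal(o, fo, frame_seconds=6.0, hop_seconds=0.2):
    """Split the onset detection function o into overlapping frames.

    fo is the sampling rate of o in Hz. Returns a list of frames; each frame
    holds M samples and corresponds to one column of the matrix X.
    """
    M = int(round(frame_seconds * fo))            # samples per frame
    hop = max(1, int(round(hop_seconds * fo)))    # samples between frame starts
    frames = []
    start = 0
    while start + M <= len(o):                    # incomplete last frame is dropped
        frames.append(o[start:start + M])
        start += hop
    return frames
```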
3. Take the common music tempo interval τ ∈ [30, 480], τ ∈ R, and convert the set of tempi into a set of frequencies according to the required music tempo resolution
The prior art does not let the user choose the music tempo resolution of the generated spectrogram. The present invention allows the music tempo resolution to be chosen freely: the corresponding redundant dictionary is generated, and the matching pursuit algorithm produces the corresponding music tempo spectrogram, which reflects the flexibility of configuring the music tempo resolution according to the application. The music tempo resolution of the invention can be a positive integer 1, 2, ... that is the same in all subintervals; it can follow the values used by the autocorrelation function method or the Fourier transform method; or the interval can even be divided into subintervals with a different resolution in each, for example 0.25 in the most common tempo interval [80, 150] and 0.5 in the other subintervals. For ease of comparison, the embodiment uses a music tempo resolution of 1 over the whole interval, so τ ∈ [30, 480], τ ∈ Z, is converted into the frequency set $\{f_b \mid f_b = \tau/60,\ \tau = [30, 31, \ldots, 480],\ b = [1..B]\}$, where b is the index of the frequency in the frequency set and B is the largest index.
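The conversion from a tempo grid to a frequency set can be sketched as follows for the uniform-resolution case. The function name and parameters are hypothetical; a per-subinterval resolution could be handled by calling the function once per subinterval and concatenating the results.

```python
import math


def tempo_frequency_set(tempo_min=30, tempo_max=480, resolution=1):
    """Return (tempi in bpm, frequencies in Hz) on a uniform tempo grid.

    Each frequency is f_b = tau / 60 for the tempo tau at the same position.
    """
    # Small epsilon guards against floating-point error for fractional resolutions.
    count = math.floor((tempo_max - tempo_min) / resolution + 1e-9) + 1
    tempi = [tempo_min + k * resolution for k in range(count)]
    freqs = [tau / 60.0 for tau in tempi]
    return tempi, freqs
```

The position of a frequency in `freqs` plays the role of the index b, counted from 0 here rather than from 1.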
4. For each frequency in the frequency set, create a corresponding parent atom
Specifically, for each frequency $f_b$ in the frequency set obtained in step 3, a cosine function of that frequency is created as the corresponding parent atom $\alpha_b$, whose length is the frame length M of o(n). Its form is $\alpha_b = \cos(2\pi f_b t)$, $t = (0 \ldots M-1)/f_o$, where $f_o$ is the sampling rate of o(n) and t denotes time.
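Written out sample by sample, a parent atom is simply a sampled cosine. The sketch below uses a hypothetical function name and plain Python lists.

```python
import math


def parent_atom(fb, M, fo):
    """Sampled cosine of frequency fb (Hz): cos(2*pi*fb*n/fo) for n = 0..M-1."""
    return [math.cos(2.0 * math.pi * fb * n / fo) for n in range(M)]
```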
5. Shift all parent atoms to the right at a chosen granularity, generating one atom per shift step; the atoms generated by shifting, together with the parent atom, form the atom set of the frequency corresponding to that parent atom
The support of parent atom $\alpha_b$ is [0, M-1], and the shift granularity d = 1, 2, 3, ... is a positive integer. Parent atom $\alpha_b$ is shifted to the right by d*j (j = 1, 2, 3, ...). After the shift, the left part [0, M-d*j-1] of the support is filled with the values $\cos(-2\pi f_b t)$, $t = (M-d*j \ldots 1)/f_o$. Each shift thus yields a new atom. Because the parent atom is a periodic function, the maximum shift is set to no more than one period. Each parent atom $\alpha_b$ and the atoms obtained by shifting it together form the atom set $d_b$ of that parent atom.
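One way to read the shift operation is that the j-th atom is the parent cosine delayed by d*j samples, with the vacated samples on the left filled by continuing the cosine backwards in time. The sketch below follows that reading, which is an assumption about the index ranges in the text; the function name is hypothetical, and the shift is kept strictly below one period because a shift of exactly one period would reproduce the parent atom.

```python
import math


def shifted_atoms(fb, M, fo, d):
    """Return the parent atom of frequency fb followed by its right-shifted copies.

    The j-th copy is delayed by d*j samples (assumed reading of the text).
    Shifting stops before the total shift reaches one period, fo / fb samples.
    """
    period = fo / fb                       # period of the parent atom in samples
    atoms = []
    j = 0
    while d * j < period:                  # j = 0 is the unshifted parent atom
        shift = d * j
        atoms.append([math.cos(2.0 * math.pi * fb * (n - shift) / fo)
                      for n in range(M)])
        j += 1
    return atoms
```

A smaller d produces more atoms per frequency, which is the trade-off between precision and computation time discussed next.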
The adjustability of the shift granularity in this step again reflects the flexibility of the invention. The smaller the granularity, the larger the atom set and the higher the precision of the spectrogram, but the longer the whole music tempo spectrogram takes to compute. Fig. 8 to Fig. 10 correspond to shift granularities of 50, 20, and 5; the comparison shows the precision of the spectrogram rising. When applying the invention, the shift granularity of the parent atoms can be chosen according to the computation time and spectrogram precision required.
6. Assemble the atom sets of all frequencies in the frequency set from step 5 into a redundant dictionary
The atom sets $d_b$ of all frequencies $f_b$ in the frequency set are assembled into one redundant dictionary D.
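Assembling the dictionary amounts to concatenating the atom sets while remembering which frequency each atom came from, since that link is needed later to attribute coefficients to tempi. The sketch below is self-contained and makes two assumptions not stated in the text: shifted atoms are delayed cosines, and every atom is scaled to unit norm, as is customary for matching pursuit. All names are hypothetical.

```python
import math


def build_dictionary(freqs, M, fo, d):
    """Return (atoms, owner) for the redundant dictionary D.

    atoms[k] is a unit-norm atom of length M; owner[k] is the 0-based index b
    of the frequency in `freqs` that the atom belongs to.
    """
    atoms, owner = [], []
    for b, fb in enumerate(freqs):
        period = fo / fb                             # samples per period of f_b
        shift = 0
        while shift < period:                        # shift 0 is the parent atom
            atom = [math.cos(2.0 * math.pi * fb * (n - shift) / fo)
                    for n in range(M)]
            norm = math.sqrt(sum(v * v for v in atom))
            atoms.append([v / norm for v in atom])   # unit norm (assumption)
            owner.append(b)
            shift += d
    return atoms, owner
```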
7. Run matching pursuit with the redundant dictionary on each frame signal of o(n) for a chosen number of iterations, generating a series of decomposition coefficients and their corresponding atoms:
For each frame signal of o(n), that is, each column $X_i$, $i \in [1..N]$, of the detection function matrix, the matching pursuit algorithm is run with the redundant dictionary D:
(1) Set the residual signal $y_n = X_i$ with $n = 0$ and start the loop;
(2) Compute the inner products $\langle y_n, g_j \rangle$ of the residual signal $y_n$ with all atoms $g_j \in D$ of the redundant dictionary, select the atom $g_k$ whose inner product has the largest absolute value as the atom matched in this iteration, and save the decomposition coefficient of the n-th iteration $s_n = |\langle y_n, g_k \rangle|$ and the corresponding atom $g_n = g_k$;
(3) Recompute the residual signal $y_{n+1} = y_n - |\langle y_n, g_k \rangle|\, g_k$;
(4) If the number of iterations, or the ratio of residual signal energy to original signal energy, reaches the required value, exit the loop; otherwise set $n = n+1$ and continue from step (2).
Preferably, the invention terminates the loop by iteration count; the count can be set according to what the music tempo spectrogram requires, for example K = 10, 20, and so on. When the loop ends, $s_n$, $g_n$, $n = [1 \ldots K]$ are obtained.
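The loop above can be sketched as follows, assuming unit-norm atoms stored as plain lists. One deliberate difference is labeled in the code: the residual is updated with the signed inner product, which is the standard matching pursuit update, while the absolute value is what gets recorded as the decomposition coefficient. The function name and the `energy_ratio` parameter are hypothetical.

```python
def matching_pursuit(frame, atoms, iterations, energy_ratio=0.0):
    """Greedy decomposition of `frame` over `atoms` (assumed unit-norm).

    Returns a list of (coefficient, atom_index) pairs, one per iteration.
    Stops early when residual energy / original energy <= energy_ratio.
    """
    residual = list(frame)
    energy0 = sum(v * v for v in frame)
    picks = []
    for _ in range(iterations):
        if energy0 == 0 or sum(r * r for r in residual) / energy0 <= energy_ratio:
            break
        # Inner product of the residual with every atom in the dictionary.
        products = [sum(r * g for r, g in zip(residual, atom)) for atom in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(products[i]))
        c = products[k]
        picks.append((abs(c), k))          # s_n = |<y_n, g_k>|
        # Signed update (standard matching pursuit); the text writes the
        # absolute value in this step.
        residual = [r - c * g for r, g in zip(residual, atoms[k])]
    return picks
```

Because each iteration rescans the whole dictionary, the cost grows with both the number of iterations and the dictionary size, which matches the computation-time remarks in the text.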
The redundant dictionary that the invention generates from the common music tempo interval provides atoms highly similar to the original signal, and the matching pursuit algorithm ensures that the decomposition coefficients of these highly similar atoms are relatively large while those of dissimilar atoms are small or even zero, so the music tempo spectrogram generated by the invention is sparser. Comparing Fig. 2 to Fig. 4 shows that the coefficients in Fig. 4 are markedly sparser, with a markedly larger share of zero or near-zero coefficients.
The number of iterations of the matching pursuit algorithm is adjustable. The larger it is, the more coefficients are generated and the denser the spectrogram (the magnitudes of the coefficients and the order in which they are generated remain unchanged), but the computation time grows with the number of iterations. Fig. 5 to Fig. 7 correspond to 20, 10, and 5 iterations; the spectrogram clearly has fewer and fewer coefficients. In some applications a small number of large coefficients is enough to complete the task, and the number of iterations can then be set to a small value so that less computation time is needed; other applications need many coefficients to provide enough information, and only the number of iterations has to be increased. This also reflects the flexibility of the invention in application, which the prior art does not have.
8. Attribute the decomposition coefficients of each frame signal of o(n) to music tempi according to the relationship between the atoms in the redundant dictionary and the music tempi
For each frame signal, first create a music tempo spectrum vector $S_n$, $n = [1..N]$, initialized to 0. The index of each component is the music tempo index b, b = [1..B], and the value of each component is the decomposition coefficient of that music tempo. Then, for each decomposition coefficient $s_n$ of the frame signal, find the music tempo index b from the frequency of the atom $g_n$ in the redundant dictionary and take $s_n$ as the decomposition coefficient of that music tempo. If several atoms correspond to the same music tempo index, their decomposition coefficients are summed and the sum is taken as the decomposition coefficient of that music tempo.
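The attribution step reduces to an accumulation over tempo indices. The sketch below assumes the matching pursuit result is a list of (coefficient, atom index) pairs and that a list `owner` maps each atom index to its 0-based tempo index; both conventions and the function name are hypothetical.

```python
def tempo_spectrum_vector(picks, owner, B):
    """Accumulate decomposition coefficients into a tempo vector of length B.

    picks: list of (coefficient, atom_index) pairs for one frame.
    owner: owner[atom_index] is the tempo index b of that atom.
    """
    spectrum = [0.0] * B                               # initial value 0 for every tempo
    for coefficient, atom_index in picks:
        spectrum[owner[atom_index]] += coefficient     # same tempo: coefficients add up
    return spectrum
```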
9. Merge the music tempo spectrum vectors of all frame signals to form the music tempo spectrogram
The music tempo spectrum vectors $S_n$ of all frames are assembled column by column into the music tempo spectrogram $S = S(b, n)$, $b = [1..B]$, $n = [1 \ldots N]$.
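With one tempo vector per frame, the final assembly is a transpose. A minimal sketch, with a hypothetical function name:

```python
def assemble_tempogram(frame_vectors):
    """Stack per-frame tempo vectors as columns: S[b][n] = frame_vectors[n][b]."""
    return [list(row) for row in zip(*frame_vectors)]
```

Row b of the result then traces the coefficient of one music tempo across all frames.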
The above is only an embodiment of the invention and does not limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the description and drawings of the invention, or any direct or indirect application in other related technical fields, is likewise included within the scope of protection of the invention.
Claims (2)
1. A music tempo spectrogram generation method based on matching pursuit, comprising the following steps:
S1. Input a music signal and generate the note onset detection function o(n);
S2. Divide o(n) into frames, forming a number of frame signals;
o(n) is divided into frames; preferably, the frame length is 6 seconds, each frame containing M points, and the hop is 0.2 seconds, forming a detection function matrix $X = X(m, n)$, $m \in [1 \ldots M]$, $n \in [1 \ldots N]$, with M rows and N columns;
S3. Take the common music tempo interval τ ∈ [30, 480], τ ∈ R, and convert the set of tempi into a set of frequencies according to the required music tempo resolution:
The music tempo resolution takes a positive integer value 1, 2, ... that is the same in all subintervals; or follows the values used by the autocorrelation function method or the Fourier transform method; or the interval is divided into subintervals, each with a different music tempo resolution. When the music tempo resolution is 1 over the whole interval, then for τ ∈ [30, 480], τ ∈ Z, where Z denotes the positive integers, the tempi are converted into the frequency set $\{f_b \mid f_b = \tau/60,\ \tau = [30, 31, \ldots, 480],\ b = [1..B]\}$, where b is the index of the frequency in the frequency set and B is the largest index;
S4. For each frequency in the frequency set, create a corresponding parent atom:
For each frequency $f_b$ in the frequency set obtained in step S3, a cosine function of that frequency is created as the corresponding parent atom $\alpha_b$, whose length is the frame length M of o(n); its form is $\alpha_b = \cos(2\pi f_b t)$, $t = (0 \ldots M-1)/f_o$, where $f_o$ is the sampling rate of o(n) and t denotes time;
S5. Shift every parent to the right with a given granularity; each shift step generates one atom, and the atoms generated by shifting, together with the parent, form the atom set of the frequency corresponding to that parent:
The support of the parent $\alpha_b$ is $[0, M-1]$, and the shift granularity $d = 1, 2, 3, \ldots$ is a positive integer. The parent $\alpha_b$ is shifted right by $d \cdot j$ ($j = 1, 2, 3, \ldots$); after the shift, the left part of the support, $[0, M - d \cdot j - 1]$, is filled with the values $\cos(-2\pi f_b t)$, $t = (M - d \cdot j \ldots 1)/f_o$. Each such shift yields a new atom. Because the parent is a periodic function, the maximum shift is set so that it does not exceed one period. Each parent $\alpha_b$, together with the atoms obtained from its shifts, forms the atom set $d_b$ corresponding to that parent;
S6. Assemble the atom sets of all frequencies in the frequency set, obtained in step S5, into a redundant dictionary:
The atom sets $d_b$ corresponding to all frequencies $f_b$ in the frequency set are assembled into one redundant dictionary $D$;
S7. Perform matching pursuit on each frame signal of $o(n)$ with the redundant dictionary, looping a given number of times, to generate a series of decomposition coefficients and corresponding atoms:
For each frame signal of $o(n)$, i.e. each column $X_i$, $i \in [1 \ldots N]$, of the detection function matrix, the matching pursuit algorithm is carried out with the redundant dictionary $D$:
(1) Set the residual signal $y_n = X_i$, $n = 0$, and start the loop;
(2) Compute the inner product $\langle y_n, g_j \rangle$ of every atom $g_j \in D$ of the redundant dictionary with the residual signal $y_n$, select the atom $g_k$ whose inner product has the largest absolute value as the atom matched in this iteration, and save the decomposition coefficient of the $n$-th iteration, $s_n = |\langle y_n, g_k \rangle|$, and the corresponding atom $g_n = g_k$;
(3) Recompute the residual signal $y_{n+1} = y_n - |\langle y_n, g_k \rangle| \, g_k$;
(4) If the number of iterations, or the ratio of the residual signal energy to the original signal energy, reaches the required precision, exit the loop; otherwise set $n = n + 1$ and continue from step (2);
S8. According to the relationship between the atoms in the redundant dictionary and the music tempos, attribute the decomposition coefficients of each frame signal of $o(n)$ to the coefficient of a particular music tempo:
For each frame signal, first create a music-tempo spectrum vector $S_n$, $n = [1 \ldots N]$, initialized to 0; the index of each component is the music-tempo index $b$, $b = [1 \ldots B]$, and the value of each component is the decomposition coefficient of that music tempo. Then, for each decomposition coefficient $s_n$ of the frame signal, find the music-tempo index $b$ corresponding to the frequency of the atom $g_n$ in the redundant dictionary, and take $s_n$ as the decomposition coefficient of that music tempo. If several atoms correspond to the same music-tempo index, their decomposition coefficients are summed and the sum is taken as the decomposition coefficient of that music tempo;
S9. Merge the music-tempo spectrum vectors of all frame signals to form the music-tempo spectrogram:
The music-tempo spectrum vectors $S_n$ of all frames are assembled as columns and merged into the music-tempo spectrogram $S = S(b, n)$, $b = [1 \ldots B]$, $n = [1 \ldots N]$.
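As an illustration only, steps S3 to S6 of claim 1 could be sketched in Python with NumPy as follows. The function name `build_dictionary`, its parameters, and the example sampling rate of 100 Hz for $o(n)$ are hypothetical and do not come from the patent. The sketch also makes two assumptions: a right shift by $d \cdot j$ samples is interpreted as the periodic extension $\cos(2\pi f_b (n - d \cdot j)/f_o)$, and every atom is scaled to unit norm so that inner products are comparable across frequencies.

```python
import numpy as np

def build_dictionary(frame_len, fs, tempos=range(30, 481), d=1):
    """Build a redundant dictionary of shifted cosine atoms (steps S3 to S6).

    frame_len: M, samples per frame of o(n); fs: f_o, sampling rate of o(n) in Hz.
    Returns (D, tempo_index): D has one unit-norm atom per row, and
    tempo_index[r] is the 0-based tempo index b of row r.
    """
    n = np.arange(frame_len)
    atoms, tempo_index = [], []
    for b, tau in enumerate(tempos):
        f_b = tau / 60.0                  # S3: tempo in BPM -> frequency in Hz
        period = fs / f_b                 # samples per period of the parent
        # S5: shifts 0, d, 2d, ... kept below one period (shift 0 is the parent)
        for shift in range(0, int(np.ceil(period)), d):
            # Assumption: the shifted atom is the periodic extension of the cosine.
            g = np.cos(2 * np.pi * f_b * (n - shift) / fs)
            atoms.append(g / np.linalg.norm(g))   # assumption: unit-norm atoms
            tempo_index.append(b)
    # S6: stack the atom sets of all frequencies into one dictionary
    return np.vstack(atoms), np.array(tempo_index)

# Hypothetical usage: 6-second frames of an o(n) sampled at 100 Hz, shift granularity 10
D, tempo_index = build_dictionary(frame_len=600, fs=100, tempos=[60, 120], d=10)
```

A larger shift granularity `d` gives fewer atoms per tempo and therefore a smaller dictionary, at the cost of a coarser phase grid.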
2. A music-tempo spectrogram generation method based on matching pursuit, characterized in that the exit condition of the loop in step (4) of step S7 is termination by iteration count: the iteration count is set according to the requirements of the music-tempo spectrogram, i.e. $K$ is assigned as the number of iterations and the loop exits when the count reaches the preset value $K$; after the loop terminates, $s_n$, $g_n$, $n = [1 \ldots K]$ are obtained.
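Steps S2 and S7 to S9, with the fixed iteration count $K$ of claim 2, might be sketched as below. This is a minimal sketch, not the patented implementation: the names `frame_signal`, `matching_pursuit` and `tempo_spectrogram` are hypothetical, the dictionary is assumed to hold one unit-norm atom per row, and `tempo_index` is assumed to map each dictionary row to its tempo index. The residual update subtracts the signed projection, which is the standard matching-pursuit update; the claim as worded subtracts the absolute value, and the two coincide only when the selected inner product is non-negative.

```python
import numpy as np

def frame_signal(o, frame_len, hop):
    """S2: split o(n) into overlapping frames; returns X with shape (M, N)."""
    starts = range(0, len(o) - frame_len + 1, hop)
    return np.stack([o[s:s + frame_len] for s in starts], axis=1)

def matching_pursuit(x, D, K):
    """S7 with the fixed-count exit of claim 2: K iterations on one frame x.

    D holds one unit-norm atom per row. Returns the K selected row indices
    and the K coefficient magnitudes s_n.
    """
    residual = np.array(x, dtype=float)
    picks, coeffs = [], []
    for _ in range(K):
        inner = D @ residual                  # <y_n, g_j> for every atom
        k = int(np.argmax(np.abs(inner)))     # atom with the largest |inner product|
        # Assumption: subtract the signed projection (standard matching pursuit);
        # the claim text writes the absolute value in this update.
        residual -= inner[k] * D[k]
        picks.append(k)
        coeffs.append(abs(inner[k]))
    return picks, coeffs

def tempo_spectrogram(X, D, tempo_index, B, K):
    """S8 and S9: accumulate coefficients per tempo index, one column per frame."""
    S = np.zeros((B, X.shape[1]))
    for i in range(X.shape[1]):
        picks, coeffs = matching_pursuit(X[:, i], D, K)
        for k, s in zip(picks, coeffs):
            S[tempo_index[k], i] += s         # atoms of the same tempo are summed
    return S
```

A small $K$ keeps only the strongest tempo components in each frame and so gives a sparser spectrogram; a larger $K$ retains weaker components as well.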
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710675484.3A CN107622774B (en) | 2017-08-09 | 2017-08-09 | A kind of music-tempo spectrogram generation method based on match tracing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710675484.3A CN107622774B (en) | 2017-08-09 | 2017-08-09 | A kind of music-tempo spectrogram generation method based on match tracing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107622774A CN107622774A (en) | 2018-01-23 |
CN107622774B true CN107622774B (en) | 2018-08-21 |
Family
ID=61088662
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710675484.3A Active CN107622774B (en) | 2017-08-09 | 2017-08-09 | A kind of music-tempo spectrogram generation method based on match tracing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107622774B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109256146B (en) * | 2018-10-30 | 2021-07-06 | Tencent Music Entertainment Technology (Shenzhen) Co., Ltd. | Audio detection method, device and storage medium |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101471068A (en) * | 2007-12-26 | 2009-07-01 | Samsung Electronics Co., Ltd. | Method and system for searching music files based on wave shape through humming music rhythm |
CN101512636A (en) * | 2006-09-11 | 2009-08-19 | Hewlett-Packard Development Company | Computational music-tempo estimation |
CN101625855A (en) * | 2008-07-09 | 2010-01-13 | SK Telecom Investment (China) Co., Ltd. | Method and device for manufacturing guide sound track and background music |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP4940588B2 (en) * | 2005-07-27 | 2012-05-30 | Sony Corporation | Beat extraction apparatus and method, music synchronization image display apparatus and method, tempo value detection apparatus and method, rhythm tracking apparatus and method, music synchronization display apparatus and method |
JP5008766B2 (en) * | 2008-04-11 | 2012-08-22 | Pioneer Corporation | Tempo detection device and tempo detection program |
Non-Patent Citations (3)
Title |
---|
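A New Tempo Feature Extraction Based on Modulation Spectrum Analysis for Music Information Retrieval Tasks; Kim HG; The Journal of The Korea Institute of Intelligent Transport Systems; 2007-08-31; Vol. 6 (No. 2); pp. 95-106 * |
Note onset detection based on matching pursuit; Gui Wenming et al.; Acta Electronica Sinica; 2013-06-30; Vol. 4 (No. 6); pp. 1225-1230 * |
Research on note onset detection algorithms; Gui Wenming; China Doctoral Dissertations Full-text Database, Information Science and Technology; 2015-06-15 (No. 06); pp. 1-104 * |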
Also Published As
Publication number | Publication date |
---|---|
CN107622774A (en) | 2018-01-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||