CN106782583B - Robust scale contour feature extraction algorithm based on nuclear norm - Google Patents

Robust scale contour feature extraction algorithm based on nuclear norm

Info

Publication number
CN106782583B
CN106782583B (application CN201611132721.3A)
Authority
CN
China
Prior art keywords: matrix, frequency, spectrum, music, algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201611132721.3A
Other languages
Chinese (zh)
Other versions
CN106782583A (en)
Inventor
李锵
王蒙蒙
关欣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin University
Original Assignee
Tianjin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tianjin University filed Critical Tianjin University
Priority to CN201611132721.3A priority Critical patent/CN106782583B/en
Publication of CN106782583A publication Critical patent/CN106782583A/en
Application granted granted Critical
Publication of CN106782583B publication Critical patent/CN106782583B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/013: Adapting to target pitch (speech or voice signal processing to modify quality or intelligibility; changing voice quality, e.g. pitch or formants, characterised by the process used)
    • G10L 21/0224: Noise filtering characterised by the method used for estimating noise; processing in the time domain
    • G10L 21/0232: Noise filtering characterised by the method used for estimating noise; processing in the frequency domain
    • G10L 25/18: Speech or voice analysis techniques characterised by the type of extracted parameters, the extracted parameters being spectral information of each sub-band
    • G10L 25/45: Speech or voice analysis techniques characterised by the type of analysis window
    • G10L 2021/0135: Voice conversion or morphing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Auxiliary Devices For Music (AREA)

Abstract

The invention discloses a robust scale contour feature extraction algorithm based on the nuclear norm, comprising the following steps: step 1, converting the music signal to be input; step 2, windowing the music signal and applying the Fourier transform to obtain its time-frequency matrix, and determining initial beat points; step 3, constraining the rank of the time-frequency matrix with the nuclear norm to obtain a low-rank spectrum, while constraining the noise entries of the matrix with an ℓ1 norm, so that the low-rank decomposition of the signal spectrum and the noise removal are formulated as a convex optimization problem; step 4, in the iterative constraint process, exploiting the low-rank property of the spectrum to realize an adaptive threshold adjustment algorithm; and step 5, applying an effective dimension reduction to the time-frequency matrix to obtain 12-dimensional chord features. Compared with the prior art, the method extracts robust chord features, effectively reduces the running time of the algorithm, and can accurately recover the scale contour features of music signals of different types and styles.

Description

Robust scale contour feature extraction algorithm based on nuclear norm
Technical Field
The invention belongs to the field of audio signal analysis in a computer auditory system, and particularly relates to a scale profile feature extraction algorithm.
Background
Harmonic components are important elements of music and an important subject in the field of music information retrieval. The fundamental frequencies of an audio signal and their harmonic components are the main constituents of chords and shape the timbre of the music. In addition, the extension of the different frequency components in time is a key factor in chord progression. Intuitively, within the duration of a chord the music exhibits a certain structure in the frequency domain, namely a low-rank characteristic. Chord feature extraction belongs to audio signal analysis in a computer auditory system, a field that mainly processes the various kinds of information separated from sound signals. At the same time, the chord features of music are the basis for extracting some higher-level music information.
The mid-level features of music are pieces of information extracted from the audio signal that can represent the signal and can ultimately form part of the high-level features. In recent years, many researchers have proposed a variety of mid-level features that can characterize music, the most widely used being the Pitch Class Profile (PCP). However, because the original music signal contains vocals, drum beats, plosives and Gaussian noise, the performance of PCP features depends strongly on the type of music signal being analysed. Many improvements based on the PCP have been proposed, for example HPCP (Harmonic PCP) by Gomez and EPCP (Enhanced PCP) by Lee. These schemes vary the way frequency-domain components are extracted and then obtain superior features suited to a particular music genre.
In addition, since each chord has a certain duration, the stability of the PCP feature during this time determines the accuracy of chord identification. Many researchers have therefore proposed improved schemes based on the PCP, i.e. the chromagram, as sketched below. Fujishima assumes that a chord persists for several frames and applies sliding-window mean filtering, reducing the influence of noise and avoiding frequent chord changes; Geoffroy Peeters applies sliding-window median filtering to avoid frequent chord changes; Bello assumes that the chord is invariant within a beat and uses beat synchronization to avoid frequent chord changes.
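For illustration, a minimal sketch of the sliding-window median smoothing mentioned above, applied to a 12 x T chromagram; the window width of 9 frames and the use of SciPy are assumptions of this sketch, not values from the cited schemes:

```python
import numpy as np
from scipy.ndimage import median_filter

def smooth_chromagram(C, width=9):
    """Sliding-window median filtering of a 12 x T chromagram along the time axis.

    The filter only smooths over time (size 1 in the pitch-class dimension),
    which suppresses short spurious chord changes.
    """
    return median_filter(C, size=(1, width), mode='nearest')
```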
Most beat tracking models consist of two parts: note onset detection and extraction of the period of the onset strength curve. In either case, the fundamental purpose of onset detection is to select the peaks of an effective onset curve, which is essentially the problem of deciding whether an extreme point is a beat point.
In summary, most chord feature extraction schemes do not consider the structural property that the music signal exhibits in its spectrum; instead they rely on a few known assumptions and adopt simple processing methods to optimize the chord features.
Disclosure of Invention
Against this prior art, the invention provides a robust scale contour feature extraction algorithm based on the nuclear norm, which converts the chord feature extraction problem into a convex optimization problem and realizes an adaptive threshold algorithm by using a nuclear norm constraint and an ℓ1 norm constraint together with the low-rank characteristic exhibited by the spectrum of a chord.
The invention relates to a robust scale contour feature extraction algorithm based on a nuclear norm, which comprises the following steps:
step 1, converting the music signal to be input into standard audio with a sampling rate of 22050 Hz, 16 bit, single channel, as the reference audio signal x(n), where n is the number of data points contained in the converted audio signal;
step 2, windowing the music signal x(n) with a window function W(k), where k is the window width, to obtain the signal time-domain matrix X of size k×m, whose m-th column is X_{·,m} = x(k·m/2 : k·m/2 + k) · W(k), where m is the number of frames obtained after framing; a Fourier transform is then applied to obtain the time-frequency matrix D = F·X of the music signal, where F is the Fourier transform matrix;
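As an illustration of steps 1 and 2, the following minimal NumPy sketch builds the windowed time-domain matrix X and the magnitude time-frequency matrix D; the Hann window, the window length of 4096 samples and the half-window hop are assumptions of the sketch, not values fixed by the invention:

```python
import numpy as np

def time_frequency_matrix(x, win_len=4096, hop=2048):
    """Frame the mono signal x, window each frame, and return the magnitude spectrum D (bins x frames)."""
    window = np.hanning(win_len)                 # W(k); the Hann window is an assumption of this sketch
    n_frames = 1 + (len(x) - win_len) // hop     # m: number of frames after framing
    # Signal time-domain matrix X (win_len x n_frames): one windowed frame per column
    X = np.stack([x[m * hop : m * hop + win_len] * window
                  for m in range(n_frames)], axis=1)
    # Time-frequency matrix D = |F . X|: real FFT applied to each column
    D = np.abs(np.fft.rfft(X, axis=0))
    return D

# Usage: x is the 22050 Hz, mono, float-valued signal obtained in step 1
# D = time_frequency_matrix(x)
```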
step 3, it is assumed that the harmonic components and the noise contained in the spectrum of the audio signal are mutually independent, that is, D = A + E, where the matrix A is formed by the harmonic components of the spectrum matrix and the matrix E by its noise components; under this assumption, the recovery of the harmonic matrix A can be cast as the following convex optimization problem:

min_{A,E} ||A||_* + λ||E||_1

s.t. A + E = D

where ||·||_* denotes the nuclear norm of a matrix, i.e. the sum of its singular values, and ||·||_1 denotes the ℓ1 norm of a matrix, i.e. the sum of the absolute values of its elements; the separated matrix A is the spectrum after low-rank processing, the matrix E contains the sparse large-amplitude noise and other non-harmonic components, and the matrix D is the spectrum of the original music signal;
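The two operators that appear when solving this convex problem are element-wise soft thresholding for the ℓ1 term and singular value thresholding for the nuclear norm term; a minimal sketch of both, under the notation above:

```python
import numpy as np

def soft_threshold(M, eps):
    """Element-wise shrinkage S_eps[M]: proximal operator of eps * ||.||_1."""
    return np.sign(M) * np.maximum(np.abs(M) - eps, 0.0)

def singular_value_threshold(M, eps):
    """Shrink the singular values of M by eps: proximal operator of eps * ||.||_*."""
    U, sigma, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(sigma - eps, 0.0)) @ Vt
```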
step 4, in the iterative constraint process, the low-rank property of the spectrum is used to realize an adaptive threshold adjustment algorithm, specifically: initialize the singular value truncation threshold parameter μ, the parameter λ, the iteration index k = 0, the temporary matrix Y_0 = D, and E_0 as an all-zero matrix; perform the singular value decomposition

(U, Σ, V) = svd(D - E_k + Y_k / μ_k)

to obtain the singular value matrix Σ; then select twenty data points μ_k^(i), 1 ≤ i ≤ 20, at equal intervals from μ_k to 1.5·μ_k, and for each μ_k^(i) perform the inverse singular value decomposition operation

A_k^(i) = U · S_{1/μ_k^(i)}[Σ] · V^T,

where S_ε[·] shrinks the singular values by the threshold ε; since the harmonic components are distributed over only a few frequency points, compute the column variances of each A_k^(i), select the index i for which the variance is maximum, and use the corresponding μ_k^(i), which completes the adaptive threshold selection; take the matrix A_{k+1} obtained in this step and update

E_{k+1} = S_{λ/μ_k}[D - A_{k+1} + Y_k / μ_k],
Y_{k+1} = Y_k + μ_k (D - A_{k+1} - E_{k+1}),  k = k + 1,

until convergence;
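A self-contained sketch of the iteration in step 4 follows. It uses the standard inexact ALM updates together with the adaptive choice of the truncation threshold over twenty candidates in [μ_k, 1.5·μ_k]; the initial values of μ and λ, the convergence test and the function name are assumptions of the sketch, not values prescribed by the invention:

```python
import numpy as np

def shrink(M, eps):
    """Element-wise soft thresholding S_eps[M]."""
    return np.sign(M) * np.maximum(np.abs(M) - eps, 0.0)

def asp_alm(D, lam=None, mu=None, max_iter=100, tol=1e-6):
    """Adaptive-threshold ALM: split the spectrum D into a low-rank part A and a sparse part E."""
    n1, n2 = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(n1, n2))   # assumed default
    mu = mu if mu is not None else 1.25 / np.linalg.norm(D, 2)     # assumed default
    Y = D.copy()          # Y_0 = D
    E = np.zeros_like(D)  # E_0 = 0
    for _ in range(max_iter):
        U, sigma, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        # Twenty candidate thresholds from mu to 1.5*mu; keep the candidate whose
        # low-rank estimate has the largest column variance (harmonics sit on few bins).
        candidates = np.linspace(mu, 1.5 * mu, 20)
        best_A, best_var = None, -np.inf
        for mu_i in candidates:
            A_i = U @ np.diag(np.maximum(sigma - 1.0 / mu_i, 0.0)) @ Vt
            var_i = A_i.var(axis=0).max()
            if var_i > best_var:
                best_A, best_var, mu = A_i, var_i, mu_i
        A = best_A
        E = shrink(D - A + Y / mu, lam / mu)
        Y = Y + mu * (D - A - E)
        if np.linalg.norm(D - A - E, 'fro') <= tol * np.linalg.norm(D, 'fro'):
            break
    return A, E
```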
and step 5, performing an effective dimension reduction on the time-frequency matrix to obtain 12-dimensional chord features. Normally, the note A_0 is specified to have a frequency of 440 Hz and is taken as the reference frequency, and the frequencies of the other notes are obtained through

f_b = 440 · 2^(b/12),

where b is the pitch difference in semitones between the note and A_0; then, by the mapping formula

p(x) = mod( round( 12 · log2( x / f_ref ) ), 12 ),

each frequency component of the harmonic matrix A is mapped, yielding the robust scale contour feature vector, where x is the frequency value corresponding to each row of the matrix A and f_ref is the reference frequency obtained from the 440 Hz reference through the equal-temperament relation above.
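As a small worked example of the equal-temperament relations in step 5 (the 440 Hz reference is taken from the text; the helper names are illustrative):

```python
import numpy as np

F_REF = 440.0  # reference frequency from the text (note A_0 specified as 440 Hz)

def note_frequency(b):
    """Frequency of the note lying b semitones from the reference: f_b = 440 * 2**(b/12)."""
    return F_REF * 2.0 ** (b / 12.0)

def pitch_class(freq_hz):
    """Map a frequency to one of the 12 pitch classes: mod(round(12 * log2(f / f_ref)), 12)."""
    return int(np.mod(np.round(12.0 * np.log2(freq_hz / F_REF)), 12))

print(note_frequency(3))    # 3 semitones above the reference is about 523.25 Hz
print(pitch_class(523.25))  # maps back to pitch class 3
```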
Compared with the prior art, the method removes the damage done by vocals and other noise to the chord structure without destroying the original structure of the music spectrum, and extracts robust chord features; it effectively reduces the running time of the algorithm; and it can accurately recover the scale contour features of music signals of different types and styles.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a diagram of different types of chord progression;
FIG. 3 is a schematic comparison of the results of the present invention with other algorithms: 1. the original ALM algorithm; 2. the ASP-ALM algorithm; 3. the ASP algorithm.
Detailed Description
The present invention is described in further detail below with reference to the attached drawing figures.
Step 1, music signal conversion: the music signal to be input is converted into standard audio with a sampling rate of 22050 Hz, 16 bit, single channel, which serves as the reference audio signal.
Step 2, the music signal x(n) is windowed with a window function W(k), where k is the window width, yielding the signal time-domain matrix X of size k×m, whose m-th column is X_{·,m} = x(k·m/2 : k·m/2 + k) · W(k), where m is the number of frames obtained after framing; a Fourier transform is then applied to obtain the time-frequency matrix D = F·X of the music signal, where F is the Fourier transform matrix.
step 3, spectrum low rank and noise removal: as can be seen from the spectrum, a music signal mainly contains two components: harmonic components and sparse loud noise. The harmonic components appear structurally to have a significant low rank structure; while sparse loud noise appears mainly as sparsity. Therefore, the signal spectrum is low ranked with a convex-down optimization problem and noise is removed:
Figure GDA0002202087550000041
s.t.A+E=D
wherein | · | purple*A kernel norm (kernel norm) representing a matrix, i.e., the sum of singular values of the matrix; i | · | purple wind1Represents the norm of the matrix, i.e. the sum of all non-zero elements.
The separated matrix a is the spectrum after low rank, while matrix E contains sparse loud noise and some other non-harmonic components, and D is the spectrum of the original music signal.
Step 4, PCP characteristic value extraction:
(4-1) defining a mapping matrix of the frequency spectrum to the PCP characteristics, wherein the matrix is in the form of:
Figure GDA0002202087550000051
wherein 2 pi · ωjJ is more than or equal to 0 and less than or equal to N-1 represents the frequency value represented by each frequency component in the frequency spectrum, and N represents the frequency number range obtained by the frequency spectrum; and fiAnd i is more than or equal to 1 and less than or equal to 12, the frequency values corresponding to 12 scales are represented.
Wherein the content of the first and second substances,
Figure GDA0002202087550000052
the method is a mapping function, and the function obtained according to the twelve-mean law has universality;
(4-2) obtaining a chord progression characteristic under a low rank constraint, i.e., an RPCP characteristic, by C ═ P · a.
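A minimal sketch of steps (4-1) and (4-2): build a binary 12 x N mapping matrix P that assigns each spectral bin to its pitch class, and fold the low-rank spectrum A into the RPCP feature C = P · A; the FFT-bin-to-Hz conversion and the default parameters are assumptions of the sketch:

```python
import numpy as np

def pcp_mapping_matrix(n_bins, sample_rate=22050, n_fft=4096, f_ref=440.0):
    """Binary 12 x n_bins matrix P with P[i, j] = 1 when spectral bin j belongs to pitch class i."""
    P = np.zeros((12, n_bins))
    freqs = np.arange(n_bins) * sample_rate / n_fft   # assumed Hz value of each FFT bin
    for j, f in enumerate(freqs):
        if f <= 0:
            continue  # the DC bin carries no pitch class
        i = int(np.mod(np.round(12.0 * np.log2(f / f_ref)), 12))
        P[i, j] = 1.0
    return P

def rpcp_features(A, sample_rate=22050, n_fft=4096):
    """RPCP feature C = P @ A for the recovered low-rank spectrum A (bins x frames)."""
    P = pcp_mapping_matrix(A.shape[0], sample_rate, n_fft)
    return P @ A
```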
The invention is evaluated on the automatic chord identification test database (Practice Data) of the Music Information Retrieval Evaluation eXchange (MIREX), a total of 20 music excerpts of different styles and tempos, each manually annotated with its chord types by 39 or 40 experts.
To verify the effectiveness of the algorithm, the influence on chord progression of the nuclear-norm-based robust scale contour feature algorithm is compared with that of currently popular algorithms. The smoothness of the main scale degrees during chord progression is used to quantify the different algorithms and thus to judge their influence on chord progression. The results are shown in FIG. 2. The experimental results show that, compared with the other algorithms, the proposed algorithm smooths the chord progression on the main scale degrees more effectively, so that a chord remains stable over a certain time and changes less frequently, which in turn guides chord identification over the whole song.
In addition, to verify the noise-suppression effect of the algorithm and its influence on chord identification accuracy, a template-matching algorithm is adopted and the widely used Harmonic PCP is taken as the comparison feature. The experimental results are shown in Table 1: the chord identification accuracy obtained with the proposed algorithm is improved by 9% relative to the Harmonic PCP.
TABLE 1 Comparison of average chord recognition rates for the robust PCP (RPCP) and the HPCP

Major triads  Ab    A     Bb    B     C     Db    D     Eb    E     F     Gb    G
RPCP (%)      76.1  80    76.6  69.0  76.1  71.8  80.4  72.9  79.6  77.6  73.3  63.6
HPCP (%)      73    78    63.8  66.7  71.7  69.2  78.4  64.6  71.4  61.2  68.9  63.6

Minor triads  Abm   Am    Bbm   Bm    Cm    Dbm   Dm    Ebm   Em    Fm    Gbm   Gm
RPCP (%)      84.8  74    69    63.4  88.2  87    75    43.6  65.2  80.4  76.3  66.7
HPCP (%)      72.7  73.7  67.9  58.5  74.5  85.7  73.5  41    65.2  67.9  63.2  56.4
In general, an approach to solving the nuclear-norm-constrained low-rank convex optimization problem is the Augmented Lagrange Multiplier (ALM) method, which is widely applied when the input matrix is sparse. However, as the matrix dimensions grow, the running time increases substantially.
In view of the distinctive characteristics of chord features, the invention proposes an ALM algorithm with adaptive threshold selection, denoted ASP-ALM, based on the adaptive threshold adjustment described above. The algorithm flow is as follows: initialize the singular value truncation threshold parameter μ, the parameter λ, the iteration index k = 0, the temporary matrix Y_0 = D, and E_0 as an all-zero matrix; perform the singular value decomposition

(U, Σ, V) = svd(D - E_k + Y_k / μ_k)

to obtain the singular value matrix Σ; then select twenty data points μ_k^(i), 1 ≤ i ≤ 20, at equal intervals from μ_k to 1.5·μ_k, and for each μ_k^(i) perform the inverse singular value decomposition operation

A_k^(i) = U · S_{1/μ_k^(i)}[Σ] · V^T;

since the harmonic components are distributed over only a few frequency points, compute the column variances of each A_k^(i), select the index i for which the variance is maximum, and use the corresponding μ_k^(i), which completes the adaptive threshold selection; take the matrix A_{k+1} obtained in this step and update

E_{k+1} = S_{λ/μ_k}[D - A_{k+1} + Y_k / μ_k],
Y_{k+1} = Y_k + μ_k (D - A_{k+1} - E_{k+1}),  k = k + 1,

until convergence.
The flow of the adaptive algorithm is shown in FIG. 1, where μ represents the degree of matrix recovery in the ALM algorithm. The ASP-ALM algorithm greatly reduces the time consumed by the ALM algorithm in the chord feature extraction process.
The comparison of test results is shown in FIG. 3: from the results it is clear that the time consumption is greatly reduced.

Claims (2)

1. A robust scale contour feature extraction algorithm based on a nuclear norm is characterized by comprising the following steps:
step (1), converting the music signal to be input into standard audio with a sampling rate of 22050 Hz, 16 bit, single channel, as the reference audio signal x(n), where n is the number of data points contained in the converted audio signal;
step (2), windowing the music signal x(n) with a window function W(k), where k is the window width, to obtain the signal time-domain matrix X of size k×m, whose m-th column is X_{·,m} = x(k·m/2 : k·m/2 + k) · W(k), where m is the number of frames obtained after framing; a Fourier transform is then applied to obtain the time-frequency matrix D = F·X of the music signal, where F is the Fourier transform matrix;
step (3), it is assumed that the harmonic components and the noise contained in the spectrum of the audio signal are mutually independent, that is, D = A + E, where the matrix A is formed by the harmonic components of the spectrum matrix and the matrix E by its noise components; under this assumption, the recovery of the harmonic matrix A can be cast as the following convex optimization problem:

min_{A,E} ||A||_* + λ||E||_1

s.t. A + E = D

where ||·||_* denotes the nuclear norm of a matrix, i.e. the sum of its singular values, and ||·||_1 denotes the ℓ1 norm of a matrix, i.e. the sum of the absolute values of its elements; the separated matrix A is the spectrum after low-rank processing, the matrix E contains the sparse large-amplitude noise and other non-harmonic components, and the matrix D is the spectrum of the original music signal;
step (4), in the iterative constraint process, the low-rank property of the spectrum is used to realize an adaptive threshold adjustment algorithm, specifically: initialize the singular value truncation threshold parameter μ, the parameter λ, the iteration index k = 0, the temporary matrix Y_0 = D, and E_0 as an all-zero matrix; perform the singular value decomposition

(U, Σ, V) = svd(D - E_k + Y_k / μ_k)

to obtain the singular value matrix Σ; then select twenty data points μ_k^(i), 1 ≤ i ≤ 20, at equal intervals from μ_k to 1.5·μ_k, and for each μ_k^(i) perform the inverse singular value decomposition operation

A_k^(i) = U · S_{1/μ_k^(i)}[Σ] · V^T,

where S_ε[·] shrinks the singular values by the threshold ε; since the harmonic components are distributed over only a few frequency points, compute the column variances of each A_k^(i), select the index i for which the variance is maximum, and use the corresponding μ_k^(i), which completes the adaptive threshold selection; take the matrix A_{k+1} obtained in this step and update

E_{k+1} = S_{λ/μ_k}[D - A_{k+1} + Y_k / μ_k],
Y_{k+1} = Y_k + μ_k (D - A_{k+1} - E_{k+1}),  k = k + 1,

until convergence;
step (5), performing an effective dimension reduction on the time-frequency matrix to obtain 12-dimensional chord features; normally, the note A_0 is specified to have a frequency of 440 Hz and is taken as the reference frequency, and the frequencies of the other notes are obtained through

f_b = 440 · 2^(b/12),

where b is the pitch difference in semitones between the note and A_0; then, by the mapping formula

p(x) = mod( round( 12 · log2( x / f_ref ) ), 12 ),

each frequency component of the harmonic matrix A is mapped, yielding the robust scale contour feature vector, where x is the frequency value corresponding to each row of the matrix A and f_ref is the reference frequency obtained from the 440 Hz reference through the equal-temperament relation above.
2. The robust scale contour feature extraction algorithm based on the nuclear norm as claimed in claim 1, wherein the threshold adaptive adjustment algorithm comprises the following steps:
initializing the singular value truncation threshold parameter μ, the parameter λ, the iteration index k = 0, the temporary matrix Y_0 = D, and E_0 as an all-zero matrix; performing the singular value decomposition

(U, Σ, V) = svd(D - E_k + Y_k / μ_k)

to obtain the singular value matrix Σ; then selecting twenty data points μ_k^(i), 1 ≤ i ≤ 20, at equal intervals from μ_k to 1.5·μ_k, and for each μ_k^(i) performing the inverse singular value decomposition operation

A_k^(i) = U · S_{1/μ_k^(i)}[Σ] · V^T,

where S_ε[·] shrinks the singular values by the threshold ε; since the harmonic components are distributed over only a few frequency points, computing the column variances of each A_k^(i), selecting the index i for which the variance is maximum, and using the corresponding μ_k^(i), which completes the adaptive threshold selection; taking the matrix A_{k+1} obtained in this step and updating

E_{k+1} = S_{λ/μ_k}[D - A_{k+1} + Y_k / μ_k],
Y_{k+1} = Y_k + μ_k (D - A_{k+1} - E_{k+1}),  k = k + 1,

until convergence.
CN201611132721.3A 2016-12-09 2016-12-09 Robust scale contour feature extraction algorithm based on nuclear norm Expired - Fee Related CN106782583B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611132721.3A CN106782583B (en) 2016-12-09 2016-12-09 Robust scale contour feature extraction algorithm based on nuclear norm

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611132721.3A CN106782583B (en) 2016-12-09 2016-12-09 Robust scale contour feature extraction algorithm based on nuclear norm

Publications (2)

Publication Number Publication Date
CN106782583A CN106782583A (en) 2017-05-31
CN106782583B true CN106782583B (en) 2020-04-28

Family

ID=58879705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611132721.3A Expired - Fee Related CN106782583B (en) 2016-12-09 2016-12-09 Robust scale contour feature extraction algorithm based on nuclear norm

Country Status (1)

Country Link
CN (1) CN106782583B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111753127B (en) * 2019-03-29 2024-05-07 阿里巴巴集团控股有限公司 Music information processing and recommending method and device


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129456A (en) * 2011-03-09 2011-07-20 天津大学 Method for monitoring and automatically classifying music factions based on decorrelation sparse mapping
CN104395953A (en) * 2012-04-30 2015-03-04 诺基亚公司 Evaluation of beats, chords and downbeats from a musical audio signal
CN103714806A (en) * 2014-01-07 2014-04-09 天津大学 Chord recognition method combining SVM with enhanced PCP
CN104361611A (en) * 2014-11-18 2015-02-18 南京信息工程大学 Group sparsity robust PCA-based moving object detecting method
CN104978582A (en) * 2015-05-15 2015-10-14 苏州大学 Contour chord angle feature based identification method for blocked target
CN104867162A (en) * 2015-05-26 2015-08-26 南京信息工程大学 Motion object detection method based on multi-component robustness PCA
CN106056607A (en) * 2016-05-30 2016-10-26 天津城建大学 Monitoring image background modeling method based on robustness principal component analysis

Non-Patent Citations (9)

* Cited by examiner, † Cited by third party
Title
Wang Feng, "Research of Chord Recognition based on MPCP", Computer and Automation Engineering 2010, The 2nd International Conference, 2010-12-31, full text *
Junqi Deng, "A Hybrid Gaussian-HMM-Deep-Learning Approach for Automatic Chord Estimation with Very Large Vocabulary", Proceedings of the 17th ISMIR Conference, 2016-08-11, full text *
K. Lee, "Automatic chord recognition from audio using enhanced pitch class profile", International Computer Music Conference (ICMC), 2006-12-31, full text *
E. Gomez, "Automatic Extraction of Tonal Metadata from Polyphonic Audio Recordings", Audio Engineering Society, 2004-12-31, full text *
Chao Gan, "Human motions segmentation by RPCA with augmented Lagrange multiplier", IEEE, 2012-12-31, full text *
T. Fujishima, "Realtime chord recognition of musical sound: A system using Common Lisp Music", International Computer Music Conference, 1999-12-31, full text *
Lin Z., "The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices", ArXiv e-prints, 2010-12-31, full text *
Young Jeong, "Vocal Separation Using Extended Robust Principal Component Analysis with Schatten p/Lp-Norm and Scale Compression", 2014 IEEE International Workshop on Machine Learning for Signal Processing, 2014-09-24, full text *
Yan Zhiyong (闫志勇), "Chord recognition based on SVM and enhanced PCP features", Computer Engineering (计算机工程), 2014-07-31, Vol. 40, No. 7, full text *

Also Published As

Publication number Publication date
CN106782583A (en) 2017-05-31


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 20200428
Termination date: 20211209