EP3040989B1 - Improved separation method and computer program product (Verbessertes Trennverfahren und Computerprogrammprodukt) - Google Patents

Improved separation method and computer program product

Info

Publication number
EP3040989B1
Authority
EP
European Patent Office
Prior art keywords
matrix
rev
spectrogram
component
specific
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Not-in-force
Application number
EP15198713.8A
Other languages
English (en)
French (fr)
Other versions
EP3040989A1 (de)
Inventor
Romain Hennequin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Audionamix
Original Assignee
Audionamix
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2014-12-31
Filing date
2015-12-09
Publication date
2018-10-17
Application filed by Audionamix
Priority to US14/984,089 (patent US9711165B2)
Publication of EP3040989A1
Application granted
Publication of EP3040989B1
Current legal status: Not-in-force
Anticipated expiration


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/0308 Voice signal separating characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/02 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using spectral analysis, e.g. transform vocoders or subband vocoders
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0316 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
    • G10L21/0364 Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L2021/02082 Noise filtering the noise being echo, reverberation of the speech

Definitions

  • the subject of the present invention is a method of separating a plurality of contributions in an acoustic mixture signal, and in particular of separating a voice contribution from a background musical contribution in an acoustic mixture signal.
  • a soundtrack of a song includes a vocal contribution (the lyrics sung by one or more singers) and a musical contribution (accompanying music played by one or more instruments).
  • a soundtrack of a film includes a vocal contribution (dialogues between actors) superimposed on a musical contribution (special sound effects and/or background music).
  • the vocal contribution results from the superposition of the dry voice (called the pure voice in what follows), corresponding to the recording of the sound emitted by the singer that propagated directly to the recording microphone, and of the reverberation, corresponding to the recording of the sound emitted by the singer that propagated indirectly to the recording microphone, that is to say by reflection, possibly multiple, on the walls of the recording room.
  • Reverberation, consisting of the echoes of the pure voice at a given moment, spreads over a time interval that can be significant (for example three seconds).
  • at any given moment, the vocal contribution results from the superposition of the pure voice at this moment and the different echoes of the pure voice at previous moments.
  • the type of algorithm proposed by this document applies only to multichannel signals and does not allow a correct extraction of reverberation effects, which are common in music.
  • the reverberation that affects this component is distributed in the different components obtained after the separation.
  • the separate vocal component loses its richness and the accompanying music component is not of good quality.
  • reverberation can be caused by the recording conditions, but can also be added artificially during the post-production of the soundtrack, mainly for aesthetic reasons.
  • the invention therefore aims to overcome this problem.
  • the invention therefore relates to a separation method and a program product according to the claims.
  • the separation method 100 takes a temporal acoustic mixture signal w(t) as input and delivers a vocal acoustic signal y(t) and a musical acoustic signal z(t).
  • the signals are all acoustic signals, so the qualifier "acoustic" is omitted in what follows.
  • These signals are time signals: they depend on time t.
  • the acoustic mixture signal is a source soundtrack, or at least an extract from a soundtrack.
  • the acoustic mixture signal w(t) comprises a first, so-called specific, contribution and a second, so-called accompanying, contribution.
  • the first contribution is a vocal contribution and corresponds to words sung by a singer.
  • the second contribution is a musical contribution and corresponds to the musical accompaniment of the singer.
  • the vocal acoustic signal y(t) corresponds to the vocal contribution alone, isolated from the rest of the mixture signal w(t), and the musical acoustic signal z(t) corresponds to the musical contribution alone, isolated from the rest of the mixture signal w(t).
  • the pure voice signal x(t) is the free-field signal and the impulse response r(t) is characteristic of the acoustic environment of the recording.
  • the first step 110 of the method 100 consists of sampling the mixture signal w(t) and calculating a spectrogram V of the mixture signal w(t).
  • a spectrogram is defined as the absolute value (or the square of the absolute value) of the short-time Fourier transform of a sampled signal.
  • Other time-frequency transforms are possible, such as a constant-Q transform, or a short-time Fourier transform followed by frequency filtering (using a Mel or Bark scale filter bank, for example).
  • For each time sampling step, the spectrogram comprises a frequency frame indicating, for each frequency sampling step, the instantaneous power of the signal.
  • the spectrogram V is therefore an F × U matrix of positive real numbers.
  • U is the total number of frames into which the duration of the mixture signal w(t) is subdivided.
  • F is the total number of frequency sampling steps, which is generally between 200 and 2000.
  • the method 100 then comprises a first part in which the voice signal is considered as a pure vocal signal, without reverberation.
  • the modeling spectrogram of the mixture signal is the sum of the spectrogram of the voice signal, $V^{y}$, and the spectrogram of the musical signal, $V^{z}$.
  • $V^{y}$ is the spectrogram of the signal y(t), considered unaffected by reverberation.
  • This is ultimately the usual model used in decomposition methods based on non-negative matrix factorization (NMF).
  • â refers to a quantity which is an estimate of the quantity a.
  • $W_{F0}$ is a matrix of harmonic atoms, which is predefined and specific to voice signals.
  • $H_{F0}$ is an activation matrix indicating at each instant which harmonic atoms of the matrix $W_{F0}$ are activated.
  • $W_{K}$ is a matrix of filter atoms.
  • $H_{K}$ is an activation matrix indicating at each instant which filter atoms of the matrix $W_{K}$ are activated.
  • the operator $\otimes$ corresponds to the element-wise multiplication of two matrices (also called the Hadamard product).
  • the first part of the method then consists in estimating the matrices $H_{F0}$, $W_{K}$, $H_{K}$, $W_{R}$ and $H_{R}$.
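  • $\hat{V} = \hat{V}^{y} + \hat{V}^{z}$, $C = \sum_{f,t} d\left(V_{ft} \mid \hat{V}_{ft}\right)$
  • the beta-divergence is defined by:
    $$d_{\beta}(a \mid b) = \begin{cases} \dfrac{1}{\beta(\beta-1)}\left(a^{\beta} + (\beta-1)\,b^{\beta} - \beta\,a\,b^{\beta-1}\right), & \beta \in \mathbb{R} \setminus \{0,1\} \\ a \log\dfrac{a}{b} - a + b, & \beta = 1 \\ \dfrac{a}{b} - \log\dfrac{a}{b} - 1, & \beta = 0 \end{cases}$$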
  • in step 120, the cost function C is minimized so as to determine the optimum value of each parameter of each matrix.
  • This minimization is performed iteratively, with multiplicative update rules applied successively to each of the parameters of the matrices $H_{F0}$, $W_{K}$, $H_{K}$, $W_{R}$ and $H_{R}$.
  • the update rules are derived, for example, by considering the gradient (that is to say the partial derivative) of the cost function C with respect to each parameter. More precisely, the gradient of the cost function with respect to the parameter considered is written as a difference between two positive terms, and the corresponding update rule multiplies the parameter considered by the ratio of these two terms.
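  • the update rules are as follows (divisions are element-wise):
    $$H_{F0} \leftarrow H_{F0} \otimes \frac{W_{F0}^{T}\left((W_{K}H_{K}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{F0}^{T}\left((W_{K}H_{K}) \otimes \hat{V}^{.(\beta-1)}\right)}$$
    $$H_{K} \leftarrow H_{K} \otimes \frac{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes \hat{V}^{.(\beta-1)}\right)}$$
    $$W_{K} \leftarrow W_{K} \otimes \frac{\left((W_{F0}H_{F0}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)H_{K}^{T}}{\left((W_{F0}H_{F0}) \otimes \hat{V}^{.(\beta-1)}\right)H_{K}^{T}}$$
    $$H_{R} \leftarrow H_{R} \otimes \frac{W_{R}^{T}\left(V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{R}^{T}\,\hat{V}^{.(\beta-1)}}$$
    $$W_{R} \leftarrow W_{R} \otimes \frac{\left(V \otimes \hat{V}^{.(\beta-2)}\right)H_{R}^{T}}{\hat{V}^{.(\beta-1)}\,H_{R}^{T}}$$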
  • in step 130, the matrix $H_{F0}$ is constrained by using a tracking algorithm, such as the Viterbi tracking algorithm, in order to select, for each time step, the frequency step at which the power is maximal without being too far in frequency from the power maxima selected for previous time steps.
  • in step 140, the coefficients of the matrix $H_{F0}$ that lie at a frequency distance greater than a reference distance from the selected maximum are set to 0.
  • in the second part of the method, the voice signal is considered to be affected by reverberation.
  • the first part of the method yields initial values for the parameters that will be estimated by successive iterations in the second part of the method. Other ways of defining the initial values of these parameters are conceivable.
  • $*_{t}$ denotes a row-by-row convolution operator, as explained in the right-hand side of the equation above.
  • the reverberation matrix R has T time steps (each of the same length as a time sampling step of the mixture signal) and F frequency sampling steps.
  • T is predetermined by the user and is generally between 20 and 200, for example 100.
  • $\hat{V}^{x} = (W_{F0}H_{F0}) \otimes (W_{K}H_{K})$
  • $\hat{V}^{rev} = \hat{V}^{rev,y} + \hat{V}^{z}$, $C = \sum_{f,t} d\left(V_{ft} \mid \hat{V}^{rev}_{ft}\right)$
  • the cost function of the second part is similar to that used in the first part.
  • in step 220, the cost function C is then minimized so as to determine the optimum value of each parameter of each matrix, in particular the parameters of the reverberation matrix.
  • the update rules are developed from the partial derivative of the cost function C with respect to each relevant parameter. They therefore depend on the form chosen for the cost function, in particular the divergence used in this cost function. The rules above are therefore specific to the use of beta-divergence.
  • the update rule of the reverberation matrix R is general in the sense that it does not depend on the model chosen for the spectrogram $V^{x}$ of the pure signal or for the background sound spectrogram $V^{z}$.
  • the iterations start from the matrix $H'_{F0}$ determined in the first part of the method. It should be noted that, since the update rules are multiplicative, the coefficients of the matrix $H_{F0}$ initially set to 0 remain at 0 during the minimization of the cost function in the second part of the method.
  • in step 230, suitable conventional processing (in particular Wiener-type filtering) is applied to the above spectrograms to obtain in particular the spectrograms of interest $V^{x}$ and $V^{z}$. Then, in step 240, the inverse of the transform of step 110 is applied to these spectrograms to obtain the output signals: the pure voice signal x(t) and the musical signal z(t).
  • these acoustic signals are monophonic signals.
  • as a variant, these signals are stereophonic or, more generally, multichannel. The person skilled in the art knows how to adapt the processing presented for monophonic signals to stereophonic or multichannel signals.
  • the preferred embodiment relates to a specific component or interest component that is a voice component.
  • the modeling of the reverberation of a component is general and applies to any type of component.
  • the background sound component can also be affected by reverberation.
  • any type of non-negative non-reverberated sound spectrograms may also be used instead of those used above.
  • the mixture comprises two components. Generalization to any number of components is straightforward.
  • SDR: signal-to-distortion ratio
  • SAR: signal-to-artifact ratio
  • SIR: signal-to-interference ratio
  • the method according to the invention therefore improves the results obtained, whatever the way of analyzing them.
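
The beta-divergence cost and the multiplicative update rules described for the first part can be illustrated with a minimal NumPy sketch. It is a simplification under stated assumptions, not the patented implementation: only the background model $\hat{V} = W_{R}H_{R}$ is fitted (the source-filter voice model with $W_{F0}$, $H_{F0}$, $W_{K}$, $H_{K}$ is omitted), and the names `beta_divergence`, `cost`, `nmf_beta` and the constant `EPS` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # assumed numerical floor to avoid division by zero and log(0)


def beta_divergence(a, b, beta):
    """Element-wise beta-divergence d_beta(a | b) for positive values."""
    a = np.maximum(np.asarray(a, dtype=float), EPS)
    b = np.maximum(np.asarray(b, dtype=float), EPS)
    if beta == 0:  # Itakura-Saito case
        return a / b - np.log(a / b) - 1.0
    if beta == 1:  # generalized Kullback-Leibler case
        return a * np.log(a / b) - a + b
    return (a**beta + (beta - 1) * b**beta
            - beta * a * b**(beta - 1)) / (beta * (beta - 1))


def cost(V, V_hat, beta):
    """Cost C = sum over (f, t) of d_beta(V_ft | V_hat_ft)."""
    return float(np.sum(beta_divergence(V, V_hat, beta)))


def nmf_beta(V, n_atoms, beta=1.0, n_iter=200, seed=0):
    """Fit V ~ W @ H with multiplicative updates for the beta-divergence."""
    rng = np.random.default_rng(seed)
    F, U = V.shape
    W = rng.random((F, n_atoms)) + EPS
    H = rng.random((n_atoms, U)) + EPS
    costs = []
    for _ in range(n_iter):
        # Activation update: ratio of the two positive gradient terms.
        V_hat = W @ H + EPS
        H *= (W.T @ (V * V_hat**(beta - 2))) / (W.T @ V_hat**(beta - 1) + EPS)
        # Atom update, using the refreshed model.
        V_hat = W @ H + EPS
        W *= ((V * V_hat**(beta - 2)) @ H.T) / (V_hat**(beta - 1) @ H.T + EPS)
        costs.append(cost(V, W @ H + EPS, beta))
    return W, H, costs
```

Because each update multiplies a parameter by a non-negative ratio, entries that start at 0 stay at 0, which is the property the method relies on when coefficients of $H_{F0}$ are zeroed before the second part.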

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Stereophonic System (AREA)
  • Auxiliary Devices For Music (AREA)
  • Measurement Of Mechanical Vibrations Or Ultrasonic Waves (AREA)

Claims (10)

  1. Method, carried out by a computer,
    for separating (100), in an acoustic mixture signal w(t), a specific pure contribution affected by reverberation and a background-sound contribution, characterized in that it consists in separating the specific pure contribution x(t) and the background-sound contribution z(t),
    using a modeling spectrogram of the acoustic mixture signal $\hat{V}^{rev}$ equal to the sum of a spectrogram of a specific reverberated contribution $\hat{V}^{rev,y}$ and a spectrogram of the background-sound contribution $\hat{V}^{z}$, the spectrogram of the specific reverberated contribution depending on the spectrogram of the specific pure contribution $\hat{V}^{x}$ according to the model
    $$\hat{V}^{rev,y}_{f,t} = \sum_{\tau=1}^{T} \hat{V}^{x}_{f,t-\tau+1}\,R_{f,\tau}$$
    where R is an F×T reverberation matrix, F being the frequency dimension and T the time dimension of R, f is a frequency index, t is a time index and τ is an integer between 1 and T; and
    by iteratively computing an estimate of the spectrogram of the background-sound contribution $\hat{V}^{z}$, of the spectrogram of the specific pure contribution $\hat{V}^{x}$ and of the reverberation matrix R by minimizing a cost function (C) between a spectrogram of the mixture signal V and the modeling spectrogram of the mixture signal $\hat{V}^{rev}$,
    the cost function (C) using a divergence (d) between the spectrogram of the mixture signal and the modeling spectrogram of the mixture signal, in particular the divergence known as the beta-divergence, defined by:
    $$d_{\beta}(a \mid b) = \begin{cases} \dfrac{1}{\beta(\beta-1)}\left(a^{\beta} + (\beta-1)\,b^{\beta} - \beta\,a\,b^{\beta-1}\right), & \beta \in \mathbb{R} \setminus \{0,1\} \\ a \log\dfrac{a}{b} - a + b, & \beta = 1 \\ \dfrac{a}{b} - \log\dfrac{a}{b} - 1, & \beta = 0 \end{cases}$$
    where a and b are two positive real scalars, and the minimization of the cost function carrying out, in order to obtain an estimate of the reverberation matrix, multiplicative update rules of the type:
    $$R \leftarrow R \otimes \frac{\left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right) *_{t} \hat{V}^{x}}{(\hat{V}^{rev})^{.(\beta-1)} *_{t} \hat{V}^{x}}$$
    with $\hat{V}^{rev} = \hat{V}^{rev,y} + \hat{V}^{z}$; and where $\otimes$ is an operator corresponding to the element-wise product between matrices (or vectors); $.^{(.)}$ is an operator corresponding to the element-wise exponentiation of a matrix by a scalar; $*_{t}$ is a time-convolution operator between two matrices defined by
    $$\left[A *_{t} B\right]_{f,\tau} = \sum_{\tau=t}^{T} A_{f,\tau}\,B_{f,\tau-t+1}.$$
  2. Method according to claim 1, characterized in that the specific pure contribution is a voice contribution and the spectrogram of the specific pure contribution $\hat{V}^{x}$ is modeled by:
    $$\hat{V}^{x} = (W_{F0}H_{F0}) \otimes (W_{K}H_{K})$$
    where $W_{F0}$ is a predefined matrix of harmonic atoms, $H_{F0}$ is an activation matrix of the harmonic atoms of the matrix $W_{F0}$, $W_{K}$ is a matrix of filter atoms, $H_{K}$ is an activation matrix of the filter atoms of the matrix $W_{K}$, and where $\otimes$ is an operator corresponding to the element-wise product between matrices.
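  3. Method according to claim 1 or claim 2, characterized in that the minimization of the cost function carries out multiplicative update rules of the type:
    $$H_{F0} \leftarrow H_{F0} \otimes \frac{W_{F0}^{T}\left((W_{K}H_{K}) \otimes \left(R *_{t} \left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right)\right)\right)}{W_{F0}^{T}\left((W_{K}H_{K}) \otimes \left(R *_{t} (\hat{V}^{rev})^{.(\beta-1)}\right)\right)}$$
    $$H_{K} \leftarrow H_{K} \otimes \frac{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes \left(R *_{t} \left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right)\right)\right)}{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes \left(R *_{t} (\hat{V}^{rev})^{.(\beta-1)}\right)\right)}$$
    $$W_{K} \leftarrow W_{K} \otimes \frac{\left((W_{F0}H_{F0}) \otimes \left(R *_{t} \left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right)\right)\right)H_{K}^{T}}{\left((W_{F0}H_{F0}) \otimes \left(R *_{t} (\hat{V}^{rev})^{.(\beta-1)}\right)\right)H_{K}^{T}}$$
    with $\hat{V}^{rev} = \hat{V}^{rev,y} + \hat{V}^{z}$; and where $\otimes$ is an operator corresponding to the element-wise product between matrices (or vectors); $.^{(.)}$ is an operator corresponding to the element-wise exponentiation of a matrix by a scalar; $(.)^{T}$ is the transpose of a matrix; $*_{t}$ is a time-convolution operator between two matrices defined by
    $$\left[A *_{t} B\right]_{f,\tau} = \sum_{\tau=t}^{T} A_{f,\tau}\,B_{f,\tau-t+1}.$$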
  4. Method according to any one of claims 1 to 3, characterized in that the spectrogram of the background-sound contribution $\hat{V}^{z}$ is modeled by a non-negative matrix factorization:
    $$\hat{V}^{z} = W_{R}H_{R}$$
    where $W_{R}$ is a matrix of elementary spectral patterns and $H_{R}$ is an activation matrix of the elementary spectral patterns of the matrix $W_{R}$.
  5. Method according to claim 1 and claim 4, characterized in that the minimization of the cost function carries out multiplicative update rules of the type:
    $$H_{R} \leftarrow H_{R} \otimes \frac{W_{R}^{T}\left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right)}{W_{R}^{T}\,(\hat{V}^{rev})^{.(\beta-1)}}$$
    $$W_{R} \leftarrow W_{R} \otimes \frac{\left(V \otimes (\hat{V}^{rev})^{.(\beta-2)}\right)H_{R}^{T}}{(\hat{V}^{rev})^{.(\beta-1)}\,H_{R}^{T}}$$
    with $\hat{V}^{rev} = \hat{V}^{rev,y} + \hat{V}^{z}$; and where $\otimes$ is an operator corresponding to the element-wise product between matrices (or vectors); $.^{(.)}$ is an operator corresponding to the element-wise exponentiation of a matrix by a scalar; $(.)^{T}$ is the transpose of a matrix.
  6. Method according to any one of claims 1 to 5, characterized in that the separation of the specific pure contribution x(t) and the background-sound contribution z(t) using a modeling spectrogram of the acoustic mixture signal $\hat{V}^{rev}$ forms a second part of the method, the method having a first part consisting in separating, in the acoustic mixture signal w(t), a specific contribution and a background-sound contribution without taking the reverberation into account, initialization parameters among the parameters obtained as a result of the first part of the method being used as initial values of the corresponding parameters in the spectrogram of the specific reverberated contribution $\hat{V}^{rev,y}$ of the second part of the method.
  7. Method according to claim 6, characterized in that the first part comprises the minimization of a cost function, carrying out an algorithm similar to the one carried out in the second part.
  8. Method according to claim 7, characterized in that, for the minimization of the cost function, the first part of the method carries out multiplicative update rules of the type:
    $$H_{F0} \leftarrow H_{F0} \otimes \frac{W_{F0}^{T}\left((W_{K}H_{K}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{F0}^{T}\left((W_{K}H_{K}) \otimes \hat{V}^{.(\beta-1)}\right)}$$
    $$H_{K} \leftarrow H_{K} \otimes \frac{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{K}^{T}\left((W_{F0}H_{F0}) \otimes \hat{V}^{.(\beta-1)}\right)}$$
    $$W_{K} \leftarrow W_{K} \otimes \frac{\left((W_{F0}H_{F0}) \otimes V \otimes \hat{V}^{.(\beta-2)}\right)H_{K}^{T}}{\left((W_{F0}H_{F0}) \otimes \hat{V}^{.(\beta-1)}\right)H_{K}^{T}}$$
    $$H_{R} \leftarrow H_{R} \otimes \frac{W_{R}^{T}\left(V \otimes \hat{V}^{.(\beta-2)}\right)}{W_{R}^{T}\,\hat{V}^{.(\beta-1)}}$$
    $$W_{R} \leftarrow W_{R} \otimes \frac{\left(V \otimes \hat{V}^{.(\beta-2)}\right)H_{R}^{T}}{\hat{V}^{.(\beta-1)}\,H_{R}^{T}}$$
    with $\hat{V} = \hat{V}^{x} + \hat{V}^{z}$, $\hat{V}^{z} = W_{R}H_{R}$ and $\hat{V}^{x} = (W_{F0}H_{F0}) \otimes (W_{K}H_{K})$; where $W_{R}$ is a matrix of elementary spectral patterns and $H_{R}$ is an activation matrix of the elementary spectral patterns of the matrix $W_{R}$, where $W_{F0}$ is a predefined matrix of harmonic atoms, $H_{F0}$ is an activation matrix of the harmonic atoms of the matrix $W_{F0}$, $W_{K}$ is a matrix of filter atoms, $H_{K}$ is an activation matrix of the filter atoms of the matrix $W_{K}$; and where $\otimes$ is an operator corresponding to the element-wise product between matrices (or vectors); $.^{(.)}$ is an operator corresponding to the element-wise exponentiation of a matrix by a scalar; $(.)^{T}$ is the transpose of a matrix.
  9. Method according to any one of claims 6 to 8, characterized in that it comprises, in the first part of the method, following the minimization of the cost function, the application of an algorithm for tracking the power maximum in the activation matrix of the specific contribution $H_{F0}$, the algorithm preferably being of the Viterbi type, followed by setting to zero all the terms of the activation matrix of the specific contribution $H_{F0}$ that are too far from the power maximum found, the terms of the activation matrix of the specific contribution $H_{F0}$ forming the initialization parameters used as initial values of the corresponding parameters in the spectrogram of the specific reverberated contribution $\hat{V}^{rev,y}$ of the second part of the method, the other parameters of the spectrogram of the specific reverberated contribution $\hat{V}^{rev,y}$ being initialized with arbitrary values.
  10. Computer program product, characterized in that it comprises instructions suitable for being stored in the memory of a computer for carrying out a separation method according to any one of claims 1 to 9 when they are executed by the computer.
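
The reverberation model of claim 1 and the multiplicative update of the reverberation matrix R can be sketched in NumPy as follows. This is an illustrative sketch under assumptions rather than the claimed implementation: frames before the start of the signal are taken as zero, the spectrograms $\hat{V}^{x}$ and $\hat{V}^{z}$ are held fixed while R is updated, the lag sum of the update runs over all U frames of the spectrogram, and the names `reverberate`, `correlate`, `update_R` and `EPS` are hypothetical.

```python
import numpy as np

EPS = 1e-12  # assumed numerical floor to avoid division by zero


def reverberate(Vx, R):
    """Model V_rev_y[f, t] = sum_{tau=1..T} Vx[f, t - tau + 1] * R[f, tau].

    Indices in the formula are 1-based; the loop uses the 0-based lag.
    Frames before the start of the signal are treated as zero.
    """
    F, U = Vx.shape
    T = R.shape[1]
    out = np.zeros((F, U))
    for lag in range(min(T, U)):
        out[:, lag:] += Vx[:, :U - lag] * R[:, lag:lag + 1]
    return out


def correlate(A, B, T):
    """Row-wise correlation: out[f, lag] = sum_t A[f, t] * B[f, t - lag]."""
    F, U = A.shape
    out = np.zeros((F, T))
    for lag in range(min(T, U)):
        out[:, lag] = np.sum(A[:, lag:] * B[:, :U - lag], axis=1)
    return out


def update_R(V, Vx, Vz, R, beta=1.0):
    """One multiplicative update of R for the beta-divergence cost."""
    V_rev = reverberate(Vx, R) + Vz + EPS
    num = correlate(V * V_rev**(beta - 2), Vx, R.shape[1])
    den = correlate(V_rev**(beta - 1), Vx, R.shape[1]) + EPS
    return R * num / den
```

In a full second-part iteration, this update would alternate with the updates of the other matrices; here only R changes, which keeps the sketch short and shows that the rule does not depend on how $\hat{V}^{x}$ and $\hat{V}^{z}$ are modeled.
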
EP15198713.8A 2014-12-31 2015-12-09 Improved separation method and computer program product Not-in-force EP3040989B1 (de)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US14/984,089 US9711165B2 (en) 2014-12-31 2015-12-30 Process and associated system for separating a specified audio component affected by reverberation and an audio background component from an audio mixture signal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
FR1463482A FR3031225B1 (fr) 2014-12-31 2014-12-31 Improved separation method and computer program product

Publications (2)

Publication Number Publication Date
EP3040989A1 (de) 2016-07-06
EP3040989B1 (de) 2018-10-17

Family

ID=53541694

Family Applications (1)

Application Number Title Priority Date Filing Date
EP15198713.8A Not-in-force EP3040989B1 (de) 2014-12-31 2015-12-09 Improved separation method and computer program product

Country Status (3)

Country Link
US (1) US9711165B2 (de)
EP (1) EP3040989B1 (de)
FR (1) FR3031225B1 (de)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
FR3013885B1 (fr) * 2013-11-28 2017-03-24 Audionamix Method and system for separating specific and background-sound contributions in an acoustic mixture signal
CN109644304B (zh) 2016-08-31 2021-07-13 Dolby Laboratories Licensing Corporation Source separation for reverberant environments
EP3324407A1 (de) * 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a ratio as a separation characteristic
EP3324406A1 (de) 2016-11-17 2018-05-23 Fraunhofer Gesellschaft zur Förderung der Angewand Apparatus and method for decomposing an audio signal using a variable threshold
EP3573058B1 (de) * 2018-05-23 2021-02-24 Harman Becker Automotive Systems GmbH Dry sound and ambient sound separation
US11546689B2 (en) * 2020-10-02 2023-01-03 Ford Global Technologies, Llc Systems and methods for audio processing

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5195652B2 (ja) * 2008-06-11 2013-05-08 Sony Corporation Signal processing device, signal processing method, and program
US20130282373A1 (en) * 2012-04-23 2013-10-24 Qualcomm Incorporated Systems and methods for audio signal processing
US9549253B2 (en) * 2012-09-26 2017-01-17 Foundation for Research and Technology—Hellas (FORTH) Institute of Computer Science (ICS) Sound source localization and isolation apparatuses, methods and systems

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
None *

Also Published As

Publication number Publication date
US9711165B2 (en) 2017-07-18
EP3040989A1 (de) 2016-07-06
FR3031225B1 (fr) 2018-02-02
US20160189731A1 (en) 2016-06-30
FR3031225A1 (fr) 2016-07-01

Similar Documents

Publication Publication Date Title
EP3040989B1 (de) Improved separation method and computer program product
Kilgour et al. Fréchet audio distance: A metric for evaluating music enhancement algorithms
US20210089967A1 (en) Data training in multi-sensor setups
Smaragdis Convolutive speech bases and their application to supervised speech separation
EP1730729A1 (de) Improved speech signal conversion method and system
US10614827B1 (en) System and method for speech enhancement using dynamic noise profile estimation
EP3133833B1 (de) Device, method and program for sound field reproduction
WO2005106853A1 (fr) Method and system for fast conversion of a voice signal
US20210142815A1 (en) Generating synthetic acoustic impulse responses from an acoustic impulse response
Fitzgerald et al. Projet—spatial audio separation using projections
JP5580585B2 (ja) 信号分析装置、信号分析方法及び信号分析プログラム
EP1606792B1 (de) Method for analysing the fundamental frequency, and method and device for voice conversion using it
Wisdom et al. Enhancement and recognition of reverberant and noisy speech by extending its coherence
Chennupati et al. Significance of phase in single frequency filtering outputs of speech signals
US9633665B2 (en) Process and associated system for separating a specified component and an audio background component from an audio mixture signal
Islam et al. Supervised single channel speech enhancement based on stationary wavelet transforms and non-negative matrix factorization with concatenated framing process and subband smooth ratio mask
Mirsamadi et al. Multichannel speech dereverberation based on convolutive nonnegative tensor factorization for ASR applications.
Chen et al. A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation
Zheng et al. Noise-robust blind reverberation time estimation using noise-aware time–frequency masking
Enzinger et al. Mismatched distances from speakers to telephone in a forensic-voice-comparison case
Padaki et al. Single channel speech dereverberation using the LP residual cepstrum
Gaultier Design and evaluation of sparse models and algorithms for audio inverse problems
Adiloğlu et al. A general variational Bayesian framework for robust feature extraction in multisource recordings
Valin et al. To dereverb or not to dereverb? Perceptual studies on real-time dereverberation targets
EP2901447B1 (de) Method and device for separating signals by minimum-variance spatial filtering under linear constraints

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

17P Request for examination filed

Effective date: 20161208

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

INTG Intention to grant announced

Effective date: 20180629

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

Free format text: NOT ENGLISH

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

Free format text: LANGUAGE OF EP DOCUMENT: FRENCH

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602015018225

Country of ref document: DE

Ref country code: AT

Ref legal event code: REF

Ref document number: 1054923

Country of ref document: AT

Kind code of ref document: T

Effective date: 20181115

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20181017

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1054923

Country of ref document: AT

Kind code of ref document: T

Effective date: 20181017

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190117

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190217

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190117

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190118

Ref country code: RS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: AL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20190217

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602015018225

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: IT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an EP patent application or granted EP patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: LU

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181209

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

26N No opposition filed

Effective date: 20190718

REG Reference to a national code

Ref country code: BE

Ref legal event code: MM

Effective date: 20181231

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: BE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181231

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181231

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181231

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: DE

Payment date: 20191210

Year of fee payment: 5

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: FR

Payment date: 20191203

Year of fee payment: 5

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: MK

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20181017

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20151209

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20181017

REG Reference to a national code

Ref country code: DE

Ref legal event code: R119

Ref document number: 602015018225

Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: FR

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20201231

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: DE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20210701

PGFP Annual fee paid to national office [announced via postgrant information from national office to EPO]

Ref country code: GB

Payment date: 20211117

Year of fee payment: 7

GBPC GB: European patent ceased through non-payment of renewal fee

Effective date: 20221209

PG25 Lapsed in a contracting state [announced via postgrant information from national office to EPO]

Ref country code: GB

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20221209