CN107578784A - Method and device for extracting target source from audio - Google Patents

Method and device for extracting target source from audio

Info

Publication number
CN107578784A
Authority
CN
China
Prior art keywords
signal
frequency
target source
virtual
source
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710816430.4A
Other languages
Chinese (zh)
Other versions
CN107578784B (en)
Inventor
郑羲光
尚梦宸
刘飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Yinman Technology Co.,Ltd.
Original Assignee
Yinman (Beijing) Technology Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yinman (Beijing) Technology Co., Ltd.
Priority to CN201710816430.4A
Publication of CN107578784A
Application granted
Publication of CN107578784B
Legal status: Active
Anticipated expiration

Landscapes

  • Stereophonic System (AREA)

Abstract

The present invention discloses a method and device for extracting a target source from audio. The method includes: performing a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and segmenting the frequency-domain signal with a window function to form two signals; traversing each frame of the frequency-domain signal and calculating, at a given frequency, the virtual angle of the virtual source corresponding to each frequency bin of the two signals; comparing the virtual angle with a predetermined angle threshold, taking one of the signals as the target source signal according to the comparison result, and extracting and storing the frequency-domain signal of that target source signal; and converting the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform and outputting the target source time-domain signal. The invention thereby separates the target source signal from the audio signal.

Description

Method and device for extracting a target source from audio
Technical field
The present invention relates to the technical field of audio signal processing, and in particular to a method and device for extracting a target source from audio.
Background
Most singing scoring systems on the KTV market currently score a performance by its pitch contour or its volume and cannot score according to the singer's actual voice. Such low-precision scoring systems increasingly fail to meet consumer demand. Imagine that someone who sings very well and someone who sings less well receive the same score, or that a poor singer receives a high score simply because of volume: scoring of this kind greatly reduces people's enthusiasm for it. Improving KTV scoring systems has therefore become very important. To make scoring more accurate, the original singer's voice in a song can be compared with the voice of the consumer singing in the KTV: the better the two match, the higher the score. The first step is to extract the original singer's voice on its own from the song's mixture of voice and accompaniment, but how to extract that voice well from song audio that contains both voice and accompaniment has become a difficult problem.
Summary of the invention
In view of the technical deficiencies of the prior art, the object of the present invention is to provide a method and device for extracting a target source from audio.
The technical solution adopted to achieve this object is as follows:
A method for extracting a target source from audio, including the steps of:
performing a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and segmenting the frequency-domain signal with a window function to form a first signal and a second signal;
traversing each frame of the frequency-domain signal and calculating, at a given frequency, the virtual angle of the virtual source corresponding to each frequency bin of the first signal and the second signal;
comparing the virtual angle with a predetermined angle threshold, taking the first signal or the second signal as the target source signal according to the comparison result, and extracting and storing the frequency-domain signal of the target source signal;
converting the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform, and outputting the target source time-domain signal.
Another aspect of the present invention provides a device for extracting a target source from audio, including:
a time-domain/frequency-domain transform and segmentation module, configured to perform a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and to segment the frequency-domain signal with a window function to form a first signal and a second signal;
a virtual angle calculation module, configured to traverse each frame of the frequency-domain signal and calculate, at a given frequency, the virtual angle of the virtual source corresponding to each frequency bin of the first signal and the second signal;
a target source signal storage module, configured to compare the virtual angle with a predetermined angle threshold, take the first signal or the second signal as the target source signal according to the comparison result, and extract and store the frequency-domain signal of the target source signal;
a time-domain/frequency-domain transform output module, configured to convert the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform and output the target source time-domain signal.
In the method of the invention, the audio signal to be separated is transformed from the time domain to the frequency domain frame by frame to form a first signal and a second signal. Each frame of the frequency-domain signal is then traversed to calculate the virtual angle of the virtual source corresponding to each frequency bin of the first and second signals. The virtual angle is compared with the predetermined angle threshold so that the target source signal meeting the requirement is separated and stored, and the stored signal is output after conversion from the frequency domain back to the time domain. The target source signal is thereby separated and extracted from the audio signal, which facilitates its subsequent processing and use.
Brief description of the drawings
Fig. 1 is a flow chart of the method for extracting a target source from audio;
Fig. 2 is a schematic diagram of the calculation of the virtual angle of the virtual source;
Fig. 3 is a structural diagram of the device for extracting a target source from audio.
Detailed description of the embodiments
The present invention is described in further detail below with reference to the drawings and specific embodiments. It should be understood that the specific embodiments described here only explain the invention and are not intended to limit it.
As shown in Fig. 1, a method for extracting a target source from audio includes the steps of:
performing a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and segmenting the frequency-domain signal with a window function to form a first signal a and a second signal b;
traversing each frame of the frequency-domain signal and calculating, at a given frequency k, the virtual angle $\theta_{ab}(k)$ of the virtual source corresponding to each frequency bin of the first signal and the second signal;
comparing the virtual angle $\theta_{ab}(k)$ with a predetermined angle threshold, taking the first signal or the second signal as the target source signal according to the comparison result, and extracting and storing the frequency-domain signal of the target source signal;
converting the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform, and outputting the target source time-domain signal.
The method of the invention can separate the voice (the target source) from the mixed voice-and-accompaniment audio of a song in a KTV system, store it separately and then output it, which provides the basis for a subsequent KTV scoring system to assess the singer's actual singing level accurately. When the method is used to extract the voice from the mixed voice-and-accompaniment audio of a song in a KTV system, the first signal above is taken, for example, as the accompaniment signal and the second signal as the voice signal. The virtual angle of the virtual source is calculated and compared with the predetermined angle threshold. Because the virtual angles of the virtual sources of the voice and accompaniment signals differ, a predetermined angle threshold is set as the decision criterion, and the signal corresponding to the virtual source whose virtual angle meets the requirement is stored separately as the voice signal. Every frame of the audio signal is traversed and processed in the same way, so that the voice of the song is separated from the mixed voice-and-accompaniment audio signal and stored.
The predetermined angle threshold is determined empirically and may be 5 degrees, 3 degrees or another angle. Window functions of the same size or of different sizes may be selected as needed, so as to reduce as far as possible the spectral energy leakage that occurs when the window function segments the frequency-domain signal. When the stored frequency-domain signal of the target source signal is converted into a time-domain signal with the inverse time-frequency transform (ISTFT) and the target source time-domain signal is output, the reconstruction must use the same window function size that was used for the corresponding segmentation.
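The frame-by-frame transform and its inverse can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes a periodic Hann window with 50% overlap and a naive DFT, and the names `hann`, `dft`, `stft`, `istft` and the default `frame_len=8` are hypothetical. It shows why the inverse transform has to reuse the frame length chosen for the forward transform: the overlap-add only rebuilds the signal when both sides agree on it.

```python
import cmath
import math

def hann(n):
    """Periodic Hann window; copies shifted by n // 2 sum to exactly 1."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def dft(frame, inverse=False):
    """Naive O(n^2) DFT, kept simple for clarity; use an FFT in practice."""
    n = len(frame)
    sign = 1 if inverse else -1
    out = [sum(x * cmath.exp(sign * 2j * math.pi * k * i / n)
               for i, x in enumerate(frame))
           for k in range(n)]
    return [v / n for v in out] if inverse else out

def stft(x, frame_len=8):
    """Window each half-overlapping frame and transform it to the frequency domain."""
    hop, win = frame_len // 2, hann(frame_len)
    starts = range(0, len(x) - frame_len + 1, hop)
    return [dft([x[s + i] * win[i] for i in range(frame_len)]) for s in starts]

def istft(spec, frame_len=8):
    """Inverse-transform each frame and overlap-add with the same frame length."""
    hop = frame_len // 2
    out = [0.0] * (hop * (len(spec) - 1) + frame_len)
    for f, bins in enumerate(spec):
        for i, v in enumerate(dft(bins, inverse=True)):
            out[f * hop + i] += v.real
    return out
```

In this sketch the first and last half-frame of the output are covered by only one window and are therefore attenuated; every sample in between is reconstructed exactly, up to rounding.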
The virtual angle of the virtual source is calculated as follows:
In the formula, $\theta_{ab}(k)$ denotes the virtual angle of the virtual source corresponding to each frequency bin of the first signal a and the second signal b rendered at frequency k; $A_a(k)$ and $A_b(k)$ denote the amplitudes of the first signal a and the second signal b rendered at frequency k, respectively; and the last term of the formula denotes the angle between the first signal a and the second signal b.
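A relation with exactly these inputs can be sketched in code. The sketch assumes the stereophonic tangent panning law, tan(theta) / tan(phi) = (A_a - A_b) / (A_a + A_b); the text here does not confirm that this is the patent's own formula. The function name `virtual_angle`, the parameter `phi_deg` (the angle of each signal from the centre line, 30 degrees as in Fig. 2) and the sign convention (positive toward signal a) are all hypothetical.

```python
import math

def virtual_angle(amp_a, amp_b, phi_deg=30.0):
    """Assumed tangent-law estimate of the virtual angle in degrees.

    Positive results lean toward signal a, negative results toward signal b.
    """
    total = amp_a + amp_b
    if total <= 0.0:
        # Silent bin: the ratio is undefined, so report the centre direction.
        return 0.0
    ratio = (amp_a - amp_b) / total
    return math.degrees(math.atan(ratio * math.tan(math.radians(phi_deg))))
```

Under this assumption a bin carried only by signal a maps to +30 degrees, a bin carried only by signal b maps to -30 degrees, and equal amplitudes map to 0 degrees.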
Because audio signals are sparse, at the same time and the same frequency, of the output of a time-frequency bin of the first signal a and the output of a time-frequency bin of the second signal b, one is always far larger than the other. A time-frequency bin is the amplitude of the signal, in dB, in a coordinate system whose Y axis is frequency (Hz) and whose X axis is time, as shown in the following table:
Frequency (Hz)     20    40    60    ...
First signal a     a1    a2    a3    ...
Second signal b    b1    b2    b3    ...
According to the sparsity principle, when the first signal a is compared with the second signal b, a1 denotes the output of one time-frequency bin of the short-time Fourier transform (STFT) of the first signal a at a frequency of 20 Hz, and b1 denotes the output of one STFT time-frequency bin of the second signal b at 20 Hz, so a1 and b1 are complex numbers. It always holds that either |a1| ≫ |b1| with b1 ≈ 0, or |a1| ≪ |b1| with a1 ≈ 0; the same applies to the subsequent bins.
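The sparsity assumption can be checked per bin with a short sketch. The function name `dominant_signal` and the dominance factor `ratio=10.0` are hypothetical choices for illustration; the text only states that one magnitude is far larger than the other and gives no numeric criterion.

```python
def dominant_signal(a_bin, b_bin, ratio=10.0):
    """Return which complex STFT bin dominates: 'a', 'b', or 'mixed'."""
    mag_a, mag_b = abs(a_bin), abs(b_bin)  # magnitudes of the complex outputs
    if mag_a > ratio * mag_b:
        return "a"
    if mag_b > ratio * mag_a:
        return "b"
    # Neither signal is clearly dominant, so the sparsity assumption fails here.
    return "mixed"
```

Bins labelled "mixed" are the ones where a decision based on the virtual angle is least reliable, because both signals contribute comparable energy.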
Therefore, using this sparsity of the signals, the virtual angle of the virtual source can be used to tell the two signals apart, and the desired target signal can be separated and stored.
Fig. 2 illustrates how to calculate the virtual angle $\theta_{ab}$ of the virtual source 40 that renders the amplitudes of the first signal a and the second signal b. $A_a$ and $A_b$ are the amplitudes of the first signal a and the second signal b, and the angle of the two signals ranges from -30° to 30°. The two loudspeakers in Fig. 2, the left loudspeaker 10 and the right loudspeaker 20, send audio signals to the listener 30 located midway between them. At frequency k, the sound that the two loudspeakers deliver to the ears of the listener 30 can therefore be represented by a virtual source 40 that renders the amplitudes $A_a(k)$ and $A_b(k)$ of the first signal a and the second signal b.
The virtual source at frequency k renders the amplitudes $A_a(k)$ and $A_b(k)$ of the two signals obtained after the time-frequency transform, the first signal a and the second signal b. The invention is of course not limited to processing two signals; multiple signals can also be processed. If there are more than two signals, the virtual source is rendered from the known signal (the signal containing the target source) and one further signal formed by summing the amplitudes of the other signals $A_i$: $\mathrm{Sum}(|A_i|) = \sum_{i=1}^{I} |A_i| = |A_1| + \dots + |A_I|$, $(1 \le i \le I)$. In effect this is still handled as a virtual source rendered by two signals.
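The reduction from several signals to a pair can be sketched as below. The function name `reduce_to_pair` and the list-of-lists layout (one inner list of per-bin amplitudes per signal) are hypothetical; the only operation taken from the text is the per-bin sum of the other signals' amplitudes.

```python
def reduce_to_pair(known_amps, other_amps):
    """Collapse the other signals into one by summing their per-bin amplitudes.

    known_amps: amplitudes of the signal containing the target source, one per bin.
    other_amps: one list of per-bin amplitudes for each remaining signal.
    """
    summed = [sum(abs(sig[k]) for sig in other_amps)
              for k in range(len(known_amps))]
    return list(known_amps), summed
```

The returned pair can then be treated exactly like the first signal a and the second signal b in the two-signal case.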
The virtual angle $\theta_{ab}(k)$ of the virtual source at the given frequency k, which may be positive or negative, can thus be calculated with the virtual angle formula given above. The virtual source is expressed by the amplitudes of the selected first signal a and second signal b and by the virtual angle as $\{A_a(k), A_b(k), \theta_{ab}\}$. $\theta_{ab}$ serves as side information, namely the virtual angle of the first signal a and the second signal b, i.e. the virtual source angle. With the aid of this side information the original signal can be analyzed to determine whether it is the target source, i.e. the voice.
After the virtual source angle is calculated, within the angular range of the two given signals (-30° to 30°, fixed in size), at a given frequency bin in the same frame, if the output of the time-frequency bin of the first signal a is larger than that of the second signal b (assuming the first signal a is the accompaniment and the second signal b is the voice), the virtual angle $\theta_{ab}$ tilts toward the first signal a; otherwise it tilts toward the second signal b.
To simplify the decision, a predetermined angle threshold between the two signal angles, for example zero degrees, can be chosen empirically: when the virtual angle $\theta_{ab}$ exceeds the zero-degree threshold, the bin is classified as voice, as shown in Fig. 2. Traversing and classifying every frequency bin of every frame in this way separates the target source from the mixed audio. Finally, the stored frequency-domain signal is converted to the time domain with the inverse time-frequency transform, and the target source signal, i.e. the voice, is output.
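The traversal over frames and frequency bins can be sketched as follows. It is a sketch under stated assumptions rather than the patent's procedure: the angle is computed with the tangent panning law (an assumption, since the formula is not reproduced in this text), the signal whose bins are kept is passed as `bins_a`, and the names `separate_frame`, `separate`, `threshold_deg` and `phi_deg` are hypothetical.

```python
import math

def separate_frame(bins_a, bins_b, threshold_deg=0.0, phi_deg=30.0):
    """Keep the bins of signal a whose assumed virtual angle exceeds the threshold."""
    kept = []
    for a, b in zip(bins_a, bins_b):
        amp_a, amp_b = abs(a), abs(b)
        total = amp_a + amp_b
        ratio = (amp_a - amp_b) / total if total > 0 else 0.0
        # Assumed tangent law: positive angles lean toward signal a.
        theta = math.degrees(math.atan(ratio * math.tan(math.radians(phi_deg))))
        kept.append(a if theta > threshold_deg else 0j)
    return kept

def separate(frames_a, frames_b, threshold_deg=0.0):
    """Apply the same per-bin decision to every frame in turn."""
    return [separate_frame(fa, fb, threshold_deg)
            for fa, fb in zip(frames_a, frames_b)]
```

The stored result has the same frame and bin layout as the input, so it can be passed directly to the inverse time-frequency transform.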
Specifically, whether a bin is voice is decided from the virtual angle as follows: when the virtual angle $\theta_{ab}(k)$ of the virtual source is greater than the predetermined angle threshold, the first signal or second signal corresponding to the virtual source is regarded as the target source signal, and the frequency-domain signal of the target source signal is then extracted and stored.
The target source signal is extracted as follows (assuming the first signal a is the target source signal, i.e. $A_a(k)$ contains the target source signal to be extracted):
$S(k) = A_a(k)\,M(k)$,
where
$M(k)$ is the target source extraction vector, $T$ is the given threshold, and $S(k)$ is the target source signal.
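The product S(k) = A_a(k) M(k) can be sketched with a binary extraction vector. The definition of M(k) is not reproduced in this text, so the sketch assumes M(k) = 1 when the virtual angle exceeds T and M(k) = 0 otherwise, which matches the decision rule stated above. The function name `extract_target` and its parameter names are hypothetical.

```python
def extract_target(bins_a, angles_deg, threshold_deg=0.0):
    """Return (S, M) with S(k) = A_a(k) * M(k) for an assumed binary mask M."""
    # Assumed mask: 1.0 where the virtual angle exceeds the threshold T, else 0.0.
    mask = [1.0 if theta > threshold_deg else 0.0 for theta in angles_deg]
    extracted = [a * m for a, m in zip(bins_a, mask)]
    return extracted, mask
```

A binary mask keeps the phase of the retained bins unchanged, so the extracted bins can be inverse-transformed without further processing.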
The time-domain/frequency-domain transform may use, without limitation, the Fourier transform, the wavelet transform, the MDCT, or similar methods.
As shown in Fig. 3, the present invention also provides a device for extracting a target source from audio, including:
a time-domain/frequency-domain transform and segmentation module, configured to perform a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and to segment the frequency-domain signal with a window function to form a first signal and a second signal;
a virtual angle calculation module, configured to traverse each frame of the frequency-domain signal and calculate, at a given frequency, the virtual angle $\theta_{ab}(k)$ of the virtual source corresponding to each frequency bin of the first signal and the second signal;
a target source signal storage module, configured to compare the virtual angle $\theta_{ab}(k)$ with a predetermined angle threshold, take the first signal or the second signal as the target source signal according to the comparison result, and extract and store the frequency-domain signal of the target source signal;
a time-domain/frequency-domain transform output module, configured to convert the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform and output the target source time-domain signal.
The device of the invention can separate the voice (the target source) from the mixed voice-and-accompaniment audio of a song in a KTV system, store it separately and then output it, which provides the basis for assessing the singer's actual singing level accurately in the KTV system. When the device is used to extract the voice from the mixed voice-and-accompaniment audio of a song in a KTV system, the first signal above is taken, for example, as the accompaniment signal and the second signal as the voice signal. The virtual angle of the virtual source is calculated and compared with the predetermined angle threshold. Because the virtual angles of the virtual sources of the voice and accompaniment signals differ, a predetermined angle threshold is set as the decision criterion, and the signal corresponding to the virtual source whose virtual angle meets the requirement is stored separately as the voice signal. Every frame of the audio signal is traversed and processed in the same way, so that the original singer's voice is separated from the mixed voice-and-accompaniment audio signal and stored.
The predetermined angle threshold is determined empirically and may be 5 degrees, 3 degrees or another angle. Window functions of the same size or of different sizes may be selected as needed, so as to reduce as far as possible the spectral energy leakage that occurs when the window function segments the frequency-domain signal. When the stored frequency-domain signal of the target source signal is converted into a time-domain signal with an inverse time-frequency transform (such as the inverse short-time Fourier transform, ISTFT) and the target source time-domain signal is output, the reconstruction must use the same window function size that was used for the corresponding segmentation.
The virtual angle of the virtual source is calculated as follows:
$\theta_{ab}(k)$ denotes the virtual angle of the virtual source; $A_a(k)$ and $A_b(k)$ denote the amplitudes of the first signal and the second signal rendered at frequency k, respectively; and the last term of the formula denotes the angle between the first signal and the second signal.
Specifically, whether a bin is voice is decided from the virtual angle as follows: when the virtual angle $\theta_{ab}(k)$ of the virtual source is greater than the predetermined angle threshold, the first signal or second signal corresponding to the virtual source is regarded as the target source signal, and the frequency-domain signal of the target source signal is then extracted and stored.
For an explanation of the virtual source and the virtual angle, refer to the foregoing description of how to calculate the virtual angle of the virtual source that renders the amplitudes of the first signal a and the second signal b, and to Fig. 2.
The above is only a preferred embodiment of the present invention. It should be noted that those of ordinary skill in the art can make improvements and modifications without departing from the principle of the invention, and these improvements and modifications shall also be regarded as falling within the scope of protection of the invention.

Claims (8)

  1. A method for extracting a target source from audio, characterized by including the steps of:
    performing a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, and segmenting the frequency-domain signal with a window function to form a first signal and a second signal;
    traversing each frame of the frequency-domain signal and calculating, at a given frequency, the virtual angle of the virtual source corresponding to each frequency bin of the first signal and the second signal;
    comparing the virtual angle with a predetermined angle threshold, taking the first signal or the second signal as the target source signal according to the comparison result, and extracting and storing the frequency-domain signal of the target source signal;
    converting the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform, and outputting the target source time-domain signal.
  2. The method for extracting a target source from audio as claimed in claim 1, characterized in that the virtual angle of the virtual source is calculated as follows:
    $\theta_{ab}(k)$ denotes the virtual angle of the virtual source corresponding to each frequency bin of the first signal a and the second signal b rendered at frequency k; $A_a(k)$ and $A_b(k)$ denote the amplitudes of the first signal a and the second signal b rendered at frequency k, respectively; and the last term of the formula denotes the angle between the first signal a and the second signal b.
  3. The method for extracting a target source from audio as claimed in claim 2, characterized in that when the virtual angle of the virtual source is greater than the predetermined angle threshold, the first signal or second signal corresponding to the virtual source is regarded as the target source signal, and the frequency-domain signal of the target source signal is then extracted and stored.
  4. The method for extracting a target source from audio as claimed in claim 2, characterized in that if the first signal a is the target source signal, the target source signal is extracted as follows:
    $S(k) = A_a(k)\,M(k)$,
    where
    $M(k)$ is the target source extraction vector, $T$ is the given threshold, and $S(k)$ is the target source signal.
  5. A device for extracting a target source from audio, characterized by including:
    a time-domain/frequency-domain transform and segmentation module, configured to perform a time-frequency transform frame by frame on the collected audio signal to convert the time-domain signal into a frequency-domain signal, forming a first signal and a second signal;
    a virtual angle calculation module, configured to traverse each frame of the frequency-domain signal and calculate, at a given frequency, the virtual angle of the virtual source corresponding to each frequency bin of the first signal and the second signal;
    a target source signal storage module, configured to compare the virtual angle with a predetermined angle threshold, take the first signal or the second signal as the target source signal according to the comparison result, and extract and store the frequency-domain signal of the target source signal;
    a time-domain/frequency-domain transform output module, configured to convert the stored frequency-domain signal of the target source signal into a time-domain signal with an inverse time-frequency transform and output the target source time-domain signal.
  6. The device for extracting a target source from audio as claimed in claim 5, characterized in that the virtual angle of the virtual source is calculated as follows:
    $\theta_{ab}(k)$ denotes the virtual angle of the virtual source corresponding to each frequency bin of the first signal a and the second signal b rendered at frequency k; $A_a(k)$ and $A_b(k)$ denote the amplitudes of the first signal a and the second signal b rendered at frequency k, respectively; and the last term of the formula denotes the angle between the first signal a and the second signal b.
  7. The device for extracting a target source from audio as claimed in claim 6, characterized in that when the virtual angle of the virtual source is greater than the predetermined angle threshold, the first signal or second signal corresponding to the virtual source is regarded as the target source signal, and the frequency-domain signal of the target source signal is then extracted and stored.
  8. The device for extracting a target source from audio as claimed in claim 6, characterized in that if the first signal a is the target source signal, the target source signal is extracted as follows:
    $S(k) = A_a(k)\,M(k)$,
    where
    $M(k)$ is the target source extraction vector, $T$ is the given threshold, and $S(k)$ is the target source signal.
CN201710816430.4A 2017-09-12 2017-09-12 Method and device for extracting target source from audio Active CN107578784B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710816430.4A CN107578784B (en) 2017-09-12 2017-09-12 Method and device for extracting target source from audio

Publications (2)

Publication Number Publication Date
CN107578784A (en) 2018-01-12
CN107578784B (en) 2020-12-11

Family

ID=61036413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710816430.4A Active CN107578784B (en) 2017-09-12 2017-09-12 Method and device for extracting target source from audio

Country Status (1)

Country Link
CN (1) CN107578784B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102522093A (en) * 2012-01-09 2012-06-27 武汉大学 Sound source separation method based on three-dimensional space audio frequency perception
CN104103277A (en) * 2013-04-15 2014-10-15 北京大学深圳研究生院 Time frequency mask-based single acoustic vector sensor (AVS) target voice enhancement method
JP2015125239A (en) * 2013-12-26 2015-07-06 Pioneer DJ株式会社 Sound signal processor, control method of sound signal processor, and program
EP2960899A1 (en) * 2014-06-25 2015-12-30 Thomson Licensing Method of singing voice separation from an audio mixture and corresponding apparatus
CN105723459A (en) * 2013-11-15 2016-06-29 华为技术有限公司 Apparatus and method for improving a perception of sound signal
CN106537502A (en) * 2014-03-31 2017-03-22 索尼公司 Method and apparatus for generating audio content

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108735227A (en) * 2018-06-22 2018-11-02 北京三听科技有限公司 A kind of voice signal for being picked up to microphone array carries out the method and system of Sound seperation
CN108735227B (en) * 2018-06-22 2020-05-19 北京三听科技有限公司 Method and system for separating sound source of voice signal picked up by microphone array

Also Published As

Publication number Publication date
CN107578784B (en) 2020-12-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20230322

Address after: 201, No. 33, Southeast Avenue, Changshu Hi tech Industrial Development Zone, Suzhou City, Jiangsu Province, 215500

Patentee after: Suzhou Yinman Technology Co.,Ltd.

Address before: 100029 9th Floor (08), No. 19 Ritan North Road, Chaoyang District, Beijing (1056 Chaowai Incubator)

Patentee before: YINMAN (BEIJING) TECHNOLOGY CO.,LTD.