CN105719657A - Human voice extracting method and device based on microphone - Google Patents

Publication number
CN105719657A
Authority
CN
China
Prior art keywords
signal
voice
extracting method
carry out
human voice
Prior art date
Legal status
Pending
Application number
CN201610098307.9A
Other languages
Chinese (zh)
Inventor
肖观送
黄锦昌
Current Assignee
Huizhou Desay SV Automotive Co Ltd
Original Assignee
Huizhou Desay SV Automotive Co Ltd
Priority date
2016-02-23
Filing date
2016-02-23
Publication date
2016-06-29
Application filed by Huizhou Desay SV Automotive Co Ltd
Priority to CN201610098307.9A
Publication of CN105719657A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Soundproofing, Sound Blocking, And Sound Damping (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a microphone-based human voice extraction method and device. The human voice extraction device includes at least one microphone; the acquisition system further comprises an audio signal processor for processing the sound signals picked up by the microphone, and a speech recognition core. The method comprises the following steps: performing analog-to-digital conversion on at least one channel of acquired sound signal to obtain an original sound signal; analyzing and collecting statistics on each time-frequency point of the sound signal, and extracting a preliminary human voice signal according to user voice features obtained in advance by a human voice pre-extraction method; and then extracting the human voice signal by means of phase inversion and addition. According to the invention, the human voice signal is sampled and quantized and then compared with an acoustic model, held by the system, that carries the user's voice features. In this way the user's voice signal is extracted, the extracted human voice signal is purer, and the user's voice can be extracted to the greatest extent. In addition, because voice characteristics differ from person to person, the voices of surrounding people can also be filtered out on that basis.

Description

Human voice extraction method and device based on a single microphone
Technical field
The present invention relates to the field of sound processing, and in particular to a human voice extraction method and device based on a single microphone.
Background art
At present, the usual noise reduction scheme in speech recognition is to add an independent noise reduction module, which generally uses dual-microphone active noise reduction: the phase of the noise signal from the secondary microphone is inverted and then added to the noise signal in the main microphone, thereby suppressing the noise. However, this scheme needs an independent noise reduction module and two microphones, so its cost is relatively high. Installing two microphones also imposes certain requirements, which makes installation more complex. In addition, the actual user is hard to distinguish when several people are speaking, which leads to a low recognition rate. During development, the module developer also has to write complex algorithms to guarantee that the sound signals entering from the two microphones are time-aligned during processing.
Summary of the invention
The object of the present invention is to overcome the defects of the background art described above by providing a human voice extraction method and device based on a single microphone.
A human voice extraction method based on a single microphone uses a human voice extraction device having at least one microphone; the acquisition system further includes an audio signal processor for processing the sound signal picked up by the microphone, and a speech recognition core. The audio signal processor extracts the human voice through the following steps:
S10, perform analog-to-digital conversion on at least one channel of acquired sound signal to obtain the original sound signal;
S20, analyze and collect statistics on each time-frequency point of the sound signal, and extract a preliminary human voice signal according to the user voice features obtained in advance by the human voice pre-extraction method;
S30, invert the phase of the preliminary human voice signal and add it to the original sound signal to obtain a noise signal;
S40, invert the phase of the noise signal and add it to the original sound signal to obtain the final human voice signal;
The human voice pre-extraction method is a speech feature parameter extraction method carried out in a low-noise environment.
Further, the method also includes:
S50, apply signal gain processing to the final human voice signal;
S60, send the gain-processed final human voice signal to the speech recognition core.
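Step S20 does not specify how each time-frequency point is compared with the user voice features. The following is a minimal Python sketch of one possible reading, not the method of the invention itself: the signal is cut into non-overlapping frames, each frame is transformed with an FFT, and only the frequency bins that dominate a low-noise enrollment recording of the user are kept. The function names, the frame length of 256 samples and the relative threshold of 0.1 are all illustrative assumptions.

```python
import numpy as np

def user_bin_profile(enrollment, frame_len=256, rel_threshold=0.1):
    """Return a boolean mask of frequency bins that dominate the user's
    low-noise enrollment recording (hypothetical stand-in for the user voice features)."""
    n_frames = len(enrollment) // frame_len
    frames = enrollment[: n_frames * frame_len].reshape(n_frames, frame_len)
    mean_mag = np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)
    return mean_mag > rel_threshold * mean_mag.max()

def extract_preliminary_voice(original, profile, frame_len=256):
    """Keep only the time-frequency points whose bin matches the user profile,
    then return to the time domain (a crude form of step S20)."""
    n_frames = len(original) // frame_len
    frames = original[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectrum = np.fft.rfft(frames, axis=1)
    spectrum[:, ~profile] = 0.0          # discard points that do not match the profile
    return np.fft.irfft(spectrum, n=frame_len, axis=1).reshape(-1)
```

A real implementation would more likely use overlapping windowed frames and a statistical comparison against the acoustic model; the hard per-bin mask is chosen here only because it keeps the sketch short.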
The feature parameter extraction method comprises the following steps:
S201, apply anti-aliasing filtering to the sound signal;
S202, perform analog-to-digital conversion on the signal obtained in step S201;
S203, apply high-pass filtering to the signal obtained in step S202;
S204, divide the signal obtained in step S203 into frames;
S205, apply a Hamming window to each frame of data obtained in step S204;
S206, transform the signal obtained in step S205 to the frequency domain;
S207, apply triangular-window filtering to the signal obtained in step S206;
S208, take the logarithm of the signal obtained in step S207;
S209, apply a discrete cosine transform to the signal obtained in step S208;
S210, apply spectral weighting to the signal obtained in step S209;
S211, apply cepstral mean subtraction to the signal obtained in step S210;
S212, add differential parameters characterizing the dynamic characteristics of speech to the signal obtained in step S211 to obtain the user voice features.
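The digital front-end steps S203 to S206 listed above can be sketched in a few lines of Python with NumPy. This is an illustrative sketch under assumed parameters, not values taken from the patent: the pre-emphasis coefficient 0.97, the 400-sample frame, the 160-sample hop and the 512-point FFT are common choices for 16 kHz speech and are assumptions here, as are the function names.

```python
import numpy as np

def pre_emphasis(x, alpha=0.97):
    """S203: first-order high-pass filter y[n] = x[n] - alpha * x[n-1]."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frame_signal(x, frame_len=400, hop=160):
    """S204: split into overlapping frames; samples that do not fill
    a complete final frame are dropped. Requires len(x) >= frame_len."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def power_spectrum(frames, n_fft=512):
    """S205 and S206: Hamming window each frame, then take the FFT power spectrum."""
    windowed = frames * np.hamming(frames.shape[1])
    return np.abs(np.fft.rfft(windowed, n=n_fft, axis=1)) ** 2 / n_fft
```

For a one-second signal sampled at 16 kHz, `power_spectrum(frame_signal(pre_emphasis(x)))` returns an array with one row per frame and `n_fft // 2 + 1` columns, which is the input expected by the triangular-window filtering of step S207.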
Preferably, the human voice extraction device uses one microphone.
In addition, the present invention also provides a single-microphone human voice extraction device based on the above human voice extraction method, comprising one microphone, an audio signal processor connected to the microphone, and a speech recognition core for recognizing the human voice. The audio signal processor includes a module for analog-to-digital conversion of the acquired sound signal, a module for analyzing and collecting statistics on each time-frequency point of the sound signal, a module for performing the human voice pre-extraction method in advance, and a module for inverting and/or adding sound signals.
Preferably, the audio signal processor also includes a module for applying gain processing to the sound signal.
The present invention samples and quantizes the human voice signal, compares it with the acoustic model of the user's voice features held by the system, extracts the user's voice signal, and then extracts the human voice signal again from the signal remaining after the noise signal has been filtered out. Because one pass of noise suppression has already been applied, the extracted human voice signal is purer and the user's voice can be extracted to the greatest extent; and because each person's voice characteristics differ, the voices of surrounding people can also be filtered out on that basis.
Brief description of the drawings
Fig. 1 is a flow chart of the human voice extraction method of the present invention.
Fig. 2 is a flow chart of the steps of the feature parameter extraction method of the present invention.
Fig. 3 is a schematic diagram of the architecture of the single-microphone human voice extraction device of the present invention.
Detailed description of the embodiments
The human voice extraction method and device based on a single microphone are further described below with reference to the accompanying drawings.
A human voice extraction method based on a single microphone uses a human voice extraction device having one microphone; the acquisition system further includes an audio signal processor for processing the sound signal picked up by the microphone, and a speech recognition core, as shown in Fig. 1. The audio signal processor extracts the human voice through the following steps:
S10, perform analog-to-digital conversion on the acquired single-channel sound signal, converting the original analog sound signal into a digital signal and thereby obtaining the original sound signal to be processed;
S20, analyze and collect statistics on each time-frequency point of the sound signal. In this process, each time-frequency point is compared against the user's voice features, the parts that match the voice features are identified, and the preliminary human voice signal is finally extracted. The user's voice features are obtained in advance, under low-noise or noise-free conditions, using the speech feature parameter extraction method.
S30, invert the phase of the preliminary human voice signal and add it to the original sound signal. At this point the human voice in the original sound signal is preliminarily removed, yielding a noise signal.
S40, invert the phase of the noise signal and add it to the original sound signal. The noise in the original sound signal is thereby filtered out, yielding the final human voice signal.
S50, to increase the recognition rate of the human voice, signal gain processing may optionally be applied to the final human voice signal.
S60, send the gain-processed final human voice signal to the speech recognition core.
After the sound signal enters the system through the single microphone, environmental noise can be suppressed efficiently. Because only one microphone is used, the whole processing chain runs on the same timing, which guarantees that the inverted signal and the original signal are time-aligned when they are added, so a single microphone achieves the effect of dual-microphone processing. A single microphone also saves cost and is simple to install.
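The two inversion-and-addition stages (S30 and S40) can be written out as a short worked derivation. The symbols below are notation introduced here for illustration and do not appear in the original: x[n] is the digitized original signal, s[n] the user's voice, v[n] the noise, and hatted quantities are estimates. The derivation assumes ideal, sample-aligned inversion and addition.

```latex
\begin{aligned}
x[n] &= s[n] + v[n] && \text{(S10: original signal, voice plus noise)}\\
\hat{s}[n] &\approx s[n] && \text{(S20: preliminary human voice estimate)}\\
\hat{v}[n] &= x[n] + \bigl(-\hat{s}[n]\bigr) = x[n] - \hat{s}[n] && \text{(S30: noise estimate)}\\
y[n] &= x[n] + \bigl(-\hat{v}[n]\bigr) = x[n] - \bigl(x[n] - \hat{s}[n]\bigr) = \hat{s}[n] && \text{(S40: final human voice)}
\end{aligned}
```

Under these idealized assumptions the final output y[n] coincides with the preliminary estimate, so the quality of the result is set by the accuracy of the S20 estimate and by how precisely the inverted and original signals are time-aligned, which is the property the single-microphone arrangement is said to guarantee.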
In a preferred embodiment, as shown in Fig. 2, the feature parameter extraction method comprises the following steps:
S201, apply anti-aliasing filtering to the sound signal; an anti-aliasing filter can be used to reduce the aliasing frequency components.
S202, perform analog-to-digital conversion on the signal obtained in step S201, converting the analog speech signal into a digital signal for convenient processing.
S203, apply high-pass filtering to the signal obtained in step S202, that is, pre-emphasize the data; passing the signal through a high-pass filter flattens its spectrum and makes it less susceptible to finite word length effects.
S204, divide the signal obtained in step S203 into frames; because speech is short-time stationary, it can be divided into frames, which simplifies subsequent processing.
S205, apply a Hamming window to each frame of data obtained in step S204 to reduce the influence of the Gibbs effect.
S206, transform the signal obtained in step S205 to the frequency domain; in a preferred embodiment a fast Fourier transform can be used.
S207, apply triangular-window filtering to the signal obtained in step S206; specifically, a bank of triangular-window filters is applied to the power spectrum of the signal, with each filter covering approximately one critical bandwidth of the human ear, thereby simulating the masking effect of the human ear.
S208, take the logarithm of the signal obtained in step S207, which yields a result similar to a homomorphic transformation.
S209, apply a discrete cosine transform to the signal obtained in step S208, removing the correlation between the dimensions of the signal and mapping it to a lower-dimensional space.
S210, apply spectral weighting to the signal obtained in step S209; because the low-order cepstral parameters are easily affected by speaker characteristics, channel characteristics and the like, while the high-order parameters have relatively low discriminative power, spectral weighting is needed to suppress the low-order and high-order parameters.
S211, apply cepstral mean subtraction to the signal obtained in step S210, which effectively reduces the influence of the speech input channel on the feature parameters.
S212, add differential parameters characterizing the dynamic characteristics of speech to the signal obtained in step S211, which can improve the recognition performance of the system, finally obtaining the user voice features.
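Steps S207 to S212 closely resemble the usual computation of mel-frequency cepstral coefficients, although the text does not name that technique. The Python sketch below follows that reading under stated assumptions: the triangular filters are spaced on the mel scale as a stand-in for critical bands, the weighting in S210 is a raised-sine lifter that de-emphasizes the lowest and highest orders, and the differential parameters in S212 are a simple central difference over frames. It takes a power spectrum of shape (frames, n_fft/2 + 1) as input; all function names, the 26 filters, the 13 coefficients and the 16 kHz sample rate are illustrative.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(n_filters=26, n_fft=512, fs=16000):
    """S207: triangular filters evenly spaced on the mel scale (assumed spacing)."""
    edges_hz = mel_to_hz(np.linspace(0.0, hz_to_mel(fs / 2), n_filters + 2))
    bins = np.floor((n_fft + 1) * edges_hz / fs).astype(int)
    bank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(n_filters):
        lo, mid, hi = bins[i], bins[i + 1], bins[i + 2]
        for k in range(lo, mid):            # rising edge
            bank[i, k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):            # falling edge, peak of 1 at mid
            bank[i, k] = (hi - k) / (hi - mid)
    return bank

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis: row q holds cos(pi * q * (m + 0.5) / n_in)."""
    m = np.arange(n_in)
    q = np.arange(n_out)[:, None]
    return np.cos(np.pi * q * (m + 0.5) / n_in)

def cepstral_features(power_spec, n_ceps=13, fs=16000):
    n_fft = 2 * (power_spec.shape[1] - 1)
    bank = triangular_filterbank(n_fft=n_fft, fs=fs)
    log_energy = np.log(power_spec @ bank.T + 1e-10)            # S207, S208
    ceps = log_energy @ dct_matrix(n_ceps, bank.shape[0]).T     # S209
    q = np.arange(n_ceps)
    lifter = n_ceps - 1
    ceps = ceps * (1.0 + (lifter / 2.0) * np.sin(np.pi * q / lifter))  # S210
    ceps = ceps - ceps.mean(axis=0)                             # S211
    delta = np.gradient(ceps, axis=0)                           # S212 (needs >= 2 frames)
    return np.hstack([ceps, delta])
```

The result has one row per frame and 2 * n_ceps columns: the weighted, mean-normalized cepstral coefficients followed by their frame-to-frame differences. How these features are then matched against incoming time-frequency points is not described in the text and is left open here.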
In addition, the present invention also provides a single-microphone human voice extraction device based on the above human voice extraction method, comprising one microphone, an audio signal processor connected to the microphone, and a speech recognition core for recognizing the human voice, as shown in Fig. 3. The audio signal processor contains the following modules:
An analog-to-digital converter converts the analog signal into a digital signal. The digital signal enters the voice feature extraction module, where the user's preliminary human voice signal is extracted, and then enters the first inverter, which produces the inverted preliminary human voice signal. The first adder adds this inverted signal to the original sound signal to obtain the noise signal. The noise signal is fed into the second inverter, which produces the inverted noise signal. Finally, the second adder adds the inverted noise signal to the original sound signal to extract the final human voice signal. To make recognition easier for the speech recognition core, the human voice signal is preferably amplified by an amplifier before it is sent to the speech recognition core.
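The module chain described above can be wired together as a small Python sketch. Everything here is illustrative rather than taken from the patent: the 16-bit quantizer stands in for the analog-to-digital converter, the amplifier normalizes to an assumed target peak of 0.9, and the voice feature extraction module is passed in as a callable because its internals are not specified.

```python
import numpy as np
from dataclasses import dataclass
from typing import Callable

def adc(analog, bits=16):
    """Stand-in for the analog-to-digital converter: quantize a float signal
    in [-1, 1) to signed integers of the given width, returned as floats."""
    scale = 2 ** (bits - 1)
    return np.clip(np.round(analog * scale), -scale, scale - 1) / scale

def inverter(signal):
    """Phase inversion: flip the sign of every sample."""
    return -signal

def adder(a, b):
    return a + b

def amplifier(signal, target_peak=0.9):
    """Gain stage: scale so the largest absolute sample equals target_peak."""
    peak = np.max(np.abs(signal))
    return signal if peak == 0 else signal * (target_peak / peak)

@dataclass
class SingleMicExtractor:
    # Hypothetical hook for the voice feature extraction module (step S20).
    feature_extractor: Callable[[np.ndarray], np.ndarray]

    def process(self, analog):
        original = adc(analog)                              # ADC
        preliminary = self.feature_extractor(original)      # feature extraction module
        noise = adder(inverter(preliminary), original)      # first inverter + first adder
        final = adder(inverter(noise), original)            # second inverter + second adder
        return amplifier(final), noise                      # amplifier before recognition
```

Because every stage operates on the same array of samples from one microphone, the inverted and original signals are aligned by construction, which mirrors the timing argument made for the single-microphone design.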
The present invention is highly versatile: once the developer has designed the noise reduction framework, the system can complete noise suppression and human voice extraction on its own. In actual tests, we collected a noise signal that had not been processed by this system and a noise signal that had, and compared them. Even under high ambient noise in a real vehicle, the system still output a sound signal with an average signal-to-noise ratio above 100 dB, and the recognition rate rose from an initial 30% to 90%, fully meeting the requirements of in-vehicle voice recognition control.
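The text does not say how the signal-to-noise ratio was measured. For reference, the conventional power-ratio definition in decibels can be computed as in the Python sketch below, which assumes that the voice component and the noise component are available as separate arrays; that separation is an assumption of the sketch and is not described in the original.

```python
import numpy as np

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB: 10 * log10(mean signal power / mean noise power)."""
    p_signal = np.mean(np.square(signal))
    p_noise = np.mean(np.square(noise))
    return 10.0 * np.log10(p_signal / p_noise)
```

As a sanity check of the scale, a unit-amplitude sine against a sine of amplitude 0.1 gives a power ratio of 100, that is 20 dB; each further factor of 10 in amplitude ratio adds another 20 dB.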
The embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to these embodiments; within the knowledge of a person of ordinary skill in the art, various changes can also be made without departing from the concept of the present invention.

Claims (6)

1. A human voice extraction method based on a single microphone, characterized in that it uses a human voice extraction device having at least one microphone, the acquisition system further includes an audio signal processor for processing the sound signal picked up by the microphone and a speech recognition core, and the audio signal processor extracts the human voice through the following steps:
S10, perform analog-to-digital conversion on at least one channel of acquired sound signal to obtain the original sound signal;
S20, analyze and collect statistics on each time-frequency point of the sound signal, and extract a preliminary human voice signal according to the user voice features obtained in advance by the human voice pre-extraction method;
S30, invert the phase of the preliminary human voice signal and add it to the original sound signal to obtain a noise signal;
S40, invert the phase of the noise signal and add it to the original sound signal to obtain the final human voice signal;
The human voice pre-extraction method is a speech feature parameter extraction method carried out in a low-noise environment.
2. The human voice extraction method according to claim 1, characterized in that it further includes:
S50, apply signal gain processing to the final human voice signal;
S60, send the gain-processed final human voice signal to the speech recognition core.
3. The human voice extraction method according to claim 1, characterized in that the feature parameter extraction method comprises the following steps:
S201, apply anti-aliasing filtering to the sound signal;
S202, perform analog-to-digital conversion on the signal obtained in step S201;
S203, apply high-pass filtering to the signal obtained in step S202;
S204, divide the signal obtained in step S203 into frames;
S205, apply a Hamming window to each frame of data obtained in step S204;
S206, transform the signal obtained in step S205 to the frequency domain;
S207, apply triangular-window filtering to the signal obtained in step S206;
S208, take the logarithm of the signal obtained in step S207;
S209, apply a discrete cosine transform to the signal obtained in step S208;
S210, apply spectral weighting to the signal obtained in step S209;
S211, apply cepstral mean subtraction to the signal obtained in step S210;
S212, add differential parameters characterizing the dynamic characteristics of speech to the signal obtained in step S211 to obtain the user voice features.
4. The human voice extraction method according to any one of claims 1 to 3, characterized in that the human voice extraction device uses one microphone.
5. A single-microphone human voice extraction device based on the human voice extraction method according to claim 1, comprising one microphone, an audio signal processor connected to the microphone, and a speech recognition core for recognizing the human voice, characterized in that the audio signal processor includes a module for analog-to-digital conversion of the acquired sound signal, a module for analyzing and collecting statistics on each time-frequency point of the sound signal, a module for performing the human voice pre-extraction method in advance, and a module for inverting and/or adding sound signals.
6. The single-microphone human voice extraction device according to claim 5, characterized in that the audio signal processor also includes a module for applying gain processing to the sound signal.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610098307.9A CN105719657A (en) 2016-02-23 2016-02-23 Human voice extracting method and device based on microphone

Publications (1)

Publication Number Publication Date
CN105719657A true CN105719657A (en) 2016-06-29

Family

ID=56156985

Country Status (1)

Country Link
CN (1) CN105719657A (en)

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2682532Y (en) * 2003-09-16 2005-03-02 陈修志 Convenient anti-noise speech recognition electronic installation
CN1991979A (en) * 2005-12-30 2007-07-04 宏碁股份有限公司 Method and device for eliminating surge in sound recording
CN200973119Y (en) * 2006-11-13 2007-11-07 徐海波 Single microphone anti-noise transmitter
CN104303227A (en) * 2012-03-26 2015-01-21 弗朗霍夫应用科学研究促进协会 Apparatus and method for improving the perceived quality of sound reproduction by combining active noise cancellation and perceptual noise compensation
CN104078051A (en) * 2013-03-29 2014-10-01 中兴通讯股份有限公司 Voice extracting method and system and voice audio playing method and device
CN103594092A (en) * 2013-11-25 2014-02-19 广东欧珀移动通信有限公司 Single microphone voice noise reduction method and device
CN104616662A (en) * 2015-01-27 2015-05-13 中国科学院理化技术研究所 Active noise reduction method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
严勤,吕勇 著: "《语音信号处理与识别》", 31 December 2015, 国防工业出版社 *
徐丽敏 著: "《鲁棒性说话人识别技术——在移动商务中的应用研究》", 30 September 2011, 南京大学出版社 *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106653040A (en) * 2016-12-22 2017-05-10 上海百芝龙网络科技有限公司 Voice audio signal sampling processing method
CN106653048A (en) * 2016-12-28 2017-05-10 上海语知义信息技术有限公司 Method for separating sound of single channels on basis of human sound models
CN106653048B (en) * 2016-12-28 2019-10-15 云知声(上海)智能科技有限公司 Single channel sound separation method based on voice model
CN108154886A (en) * 2017-12-29 2018-06-12 广东欧珀移动通信有限公司 Noise suppressing method and device, electronic device and computer readable storage medium
CN108418968A (en) * 2018-03-12 2018-08-17 广东欧珀移动通信有限公司 Voice communication data processing method, device, storage medium and mobile terminal
CN109218882A (en) * 2018-08-16 2019-01-15 歌尔科技有限公司 The ambient sound monitor method and earphone of earphone
CN110085251A (en) * 2019-04-26 2019-08-02 腾讯音乐娱乐科技(深圳)有限公司 Voice extracting method, voice extraction element and Related product
CN110085251B (en) * 2019-04-26 2021-06-25 腾讯音乐娱乐科技(深圳)有限公司 Human voice extraction method, human voice extraction device and related products
CN110191397A (en) * 2019-06-28 2019-08-30 歌尔科技有限公司 A kind of noise-reduction method and bluetooth headset
CN110191397B (en) * 2019-06-28 2021-10-15 歌尔科技有限公司 Noise reduction method and Bluetooth headset
WO2022017424A1 (en) * 2020-07-24 2022-01-27 华为技术有限公司 Active noise control method and apparatus, and audio playback device

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160629