CN109920405A - Multi-channel speech recognition method, apparatus, device, and readable storage medium - Google Patents
- Publication number
- CN109920405A (application number CN201910164535.5A)
- Authority
- CN
- China
- Prior art keywords
- audio
- channel
- microphone array
- collection region
- speech recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Circuit For Audible Band Transducer (AREA)
Abstract
Embodiments of the present invention provide a multi-channel speech recognition method, apparatus, device, and readable storage medium. In the method, audio data collected by multiple microphone arrays is received, and beamforming is performed on each channel of audio data to obtain the audio signal in that channel that corresponds to its audio collection region, attenuating the audio signals in that channel that arrive from other directions. Interference suppression is then performed on the multiple audio signals to obtain the speech signal corresponding to each audio collection region, reducing the interference that noise from other audio collection regions causes to that channel's speech signal. Speech recognition is performed on each speech signal to obtain the speech recognition result corresponding to each audio collection region, which improves the recognition rate. When several people speak at the same time, mutual interference between the channels' speech signals is suppressed and a speech recognition result is obtained for each audio collection position, improving the efficiency and accuracy of speech recognition.
Description
Technical field
Embodiments of the present invention relate to the technical field of speech recognition, and in particular to a multi-channel speech recognition method, apparatus, device, and readable storage medium.
Background
At present, the head unit of a vehicle is provided with only a single two-channel microphone in the front row, consisting of two microphones for the left and right channels. It is mainly used to collect audio data near the driver's seat; speech recognition is performed on the collected audio data to recognize utterances, such as instructions, that the driver issues to the head unit.
However, when a passenger in the front passenger seat or a rear seat issues an utterance to the head unit, the sound source is far from the microphone and the quality of the collected audio data is poor, so the speech recognition rate is very low. In particular, when several people speak at the same time, reverberation arises and the utterances become even harder to recognize correctly.
Summary of the invention
Embodiments of the present invention provide a multi-channel speech recognition method, apparatus, device, and readable storage medium, to solve the problem that in-vehicle speech recognition methods in the prior art have a very low speech recognition rate.
One aspect of the embodiments of the present invention provides a multi-channel speech recognition method, comprising:
receiving audio data collected by multiple microphone arrays, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data;
performing beamforming on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel of audio data that corresponds to the corresponding audio collection region;
performing interference suppression on the multiple audio signals to obtain the speech signal corresponding to each audio collection region;
performing speech recognition on the speech signal corresponding to each audio collection region to obtain the speech recognition result corresponding to each audio collection region.
Another aspect of the embodiments of the present invention provides a multi-channel speech recognition apparatus, comprising:
a data acquisition module, configured to receive audio data collected by multiple microphone arrays, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data;
a beamforming module, configured to perform beamforming on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel of audio data that corresponds to the corresponding audio collection region;
an interference suppression module, configured to perform interference suppression on the multiple audio signals to obtain the speech signal corresponding to each audio collection region;
a speech recognition module, configured to perform speech recognition on the speech signal corresponding to each audio collection region to obtain the speech recognition result corresponding to each audio collection region.
Another aspect of the embodiments of the present invention provides a multi-channel speech recognition device, comprising:
a memory, a processor, and a computer program stored in the memory and executable on the processor,
where the processor implements the multi-channel speech recognition method described above when running the computer program.
Another aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program,
where the computer program implements the multi-channel speech recognition method described above when executed by a processor.
In the multi-channel speech recognition method, apparatus, device, and readable storage medium provided by the embodiments of the present invention, audio data collected by multiple microphone arrays is received, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data. Beamforming is performed on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel that corresponds to the corresponding audio collection region; this attenuates the audio signals in that channel that arrive from other directions and thereby suppresses them. Interference suppression is then performed on the multiple audio signals to obtain the speech signal corresponding to each audio collection region, which further reduces the interference that noise from other audio collection regions causes to that channel's speech signal and yields a cleaner speech signal for each audio collection region. Speech recognition is performed on the speech signal corresponding to each audio collection region to obtain the speech recognition result corresponding to each audio collection region. As a result, no matter which audio collection region of the vehicle a sound source is in, a corresponding microphone array can accurately collect its audio data and an accurate speech recognition result can be obtained, improving the recognition rate. In addition, when several people speak at the same time from different positions, mutual interference between the channels' speech signals can be suppressed and the speech recognition result for each audio collection position can be obtained, greatly improving the efficiency and accuracy of speech recognition.
Brief description of the drawings
Fig. 1 is a flowchart of the multi-channel speech recognition method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of the multi-channel speech recognition method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic structural diagram of the multi-channel speech recognition apparatus provided by Embodiment 3 of the present invention;
Fig. 4 is a schematic structural diagram of the multi-channel speech recognition device provided by Embodiment 5 of the present invention.
The above drawings show specific embodiments of the present invention, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the concept of the embodiments of the present invention in any way, but to explain the concept of the invention to those skilled in the art by reference to specific embodiments.
Detailed description of the embodiments
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the embodiments of the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the embodiments of the present invention, as detailed in the appended claims.
Terms such as "first" and "second" in the embodiments of the present invention are used for descriptive purposes only and should not be understood as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. In the description of the following embodiments, "a plurality of" means two or more, unless otherwise specifically defined.
The following specific embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments. The embodiments of the present invention are described below with reference to the drawings.
Embodiment one
Fig. 1 is a flowchart of the multi-channel speech recognition method provided by Embodiment 1 of the present invention. To address the very low speech recognition rate of in-vehicle speech recognition methods in the prior art, this embodiment provides a multi-channel speech recognition method. The method in this embodiment is applied to a speech recognition device, which may be an in-vehicle terminal device installed on the vehicle, or a computer device that can communicate with the in-vehicle terminal device on the vehicle and performs speech recognition. In other embodiments the method may also be applied to other devices; this embodiment takes the speech recognition device as an illustrative example.
As shown in Fig. 1, the specific steps of the method are as follows:
Step S101: receive audio data collected by multiple microphone arrays, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data.
This embodiment is applied to a vehicle that performs speech recognition. The vehicle usually has multiple seats, such as the driver's seat, the front passenger seat, and other seats, and multiple microphone arrays are installed in the vehicle. Each microphone array points to one audio collection region and is used to collect audio data from the region it points to. Each audio collection region corresponds to the position of one seat, and the audio collection regions correspond one-to-one with the seats in the vehicle; that is, each microphone array points to one seat, and the microphone arrays are arranged in correspondence with the seats. For example, in a four-seat self-driving vehicle, four microphone arrays are installed, each pointing to one of the four seats.
In this embodiment, when speech recognition is performed, the microphone arrays can collect audio data in real time and send the collected audio data to the speech recognition device, which receives the audio data collected by each microphone array.
Each channel of audio data may include an identifier of the microphone array that collected it, so that the channels can be distinguished.
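The tagging described above can be sketched as follows. This is a minimal illustration only: the patent does not specify a data format, so the `AudioFrame` type, the `array_id` field, and the identifier strings are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AudioFrame:
    """One chunk of audio tagged with the identifier of the array that captured it."""
    array_id: str          # hypothetical identifier, e.g. "driver"
    samples: List[float]   # PCM samples for this chunk


def group_by_array(frames: List[AudioFrame]) -> Dict[str, List[float]]:
    """Concatenate frames per microphone array, preserving arrival order."""
    channels: Dict[str, List[float]] = defaultdict(list)
    for frame in frames:
        channels[frame.array_id].extend(frame.samples)
    return dict(channels)
```

With frames tagged this way, the speech recognition device can reassemble one sample stream per microphone array even when frames from different arrays arrive interleaved.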
Step S102: perform beamforming on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel of audio data that corresponds to the corresponding audio collection region.
Here, the audio collection region corresponding to a microphone array is the audio collection region that the microphone array points to. The audio collection region corresponding to a channel of audio data is the audio collection region pointed to by the microphone array that collected that audio data.
In this step, after the multiple channels of audio data are received, beamforming is applied to the channel of audio data collected by each microphone array, according to the position of that microphone array relative to the audio collection region it points to. This yields the audio signal in that channel that corresponds to its audio collection region and attenuates the audio signals in that channel that arrive from other directions, thereby suppressing them.
By applying beamforming to each channel of audio data in this way, the audio signal corresponding to each audio collection region is obtained.
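The patent does not name a particular beamforming algorithm. As an illustration of the general idea only, the sketch below uses a delay-and-sum beamformer, which is an assumption and not necessarily the method intended by the applicant. The function name `delay_and_sum` and the use of whole-sample delays are likewise choices made for the example.

```python
from typing import List, Sequence


def delay_and_sum(mic_signals: Sequence[Sequence[float]],
                  delays: Sequence[int]) -> List[float]:
    """Align each microphone signal by its steering delay (in samples) and average.

    delays[i] is how many samples later the target sound reaches microphone i
    than the earliest microphone; in practice it would be derived from the
    array's position relative to its audio collection region.
    """
    if len(mic_signals) != len(delays):
        raise ValueError("one delay per microphone is required")
    # Only the samples available on every microphone after alignment are kept.
    usable = min(len(sig) - d for sig, d in zip(mic_signals, delays))
    if usable <= 0:
        return []
    output = []
    for n in range(usable):
        total = sum(sig[n + d] for sig, d in zip(mic_signals, delays))
        output.append(total / len(mic_signals))
    return output
```

Sound from the steered direction adds coherently after alignment, while sound from other directions is misaligned and partly averages out, which matches the attenuation effect described in this step.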
Step S103: perform interference suppression on the multiple audio signals to obtain the speech signal corresponding to each audio collection region.
In this embodiment, because beamforming may not completely eliminate audio signals from other directions, the audio signal obtained for each audio collection region may still contain audio from other directions, that is, audio emitted by sound sources in other audio collection regions. In this step, after the audio signal corresponding to each audio collection region is obtained, interference suppression is performed on the multiple audio signals: the components belonging to other audio collection regions are removed from the audio signal of each region, yielding a cleaner speech signal for the audio collection region corresponding to that audio signal.
In addition, beamforming and interference suppression may be performed by a digital signal processing (DSP) module on the speech recognition device or by a standalone DSP chip; this embodiment does not specifically limit this.
Step S104: perform speech recognition on the speech signal corresponding to each audio collection region to obtain the speech recognition result corresponding to each audio collection region.
After the speech signal corresponding to each audio collection region is obtained, speech recognition is performed separately on each of these speech signals to obtain the recognition result for the speech signal of each audio collection region.
In addition, the speech recognition in this step may be performed by the DSP module on the speech recognition device or by a standalone speech recognition engine; this embodiment does not specifically limit this.
In this embodiment, audio data collected by multiple microphone arrays is received, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data. Beamforming is performed on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel that corresponds to the corresponding audio collection region; this attenuates the audio signals in that channel that arrive from other directions and thereby suppresses them. Interference suppression is then performed on the multiple audio signals to obtain the speech signal corresponding to each audio collection region, which further reduces the interference that noise from other audio collection regions causes to that channel's speech signal and yields a cleaner speech signal for each audio collection region. Speech recognition is performed on the speech signal corresponding to each audio collection region to obtain the speech recognition result corresponding to each audio collection region. As a result, no matter which audio collection region of the vehicle a sound source is in, a corresponding microphone array can accurately collect its audio data and an accurate speech recognition result can be obtained, improving the recognition rate. In addition, when several people speak at the same time from different positions, mutual interference between the channels' speech signals can be suppressed and the speech recognition result for each audio collection position can be obtained, greatly improving the efficiency and accuracy of speech recognition.
Embodiment two
Fig. 2 is a flowchart of the multi-channel speech recognition method provided by Embodiment 2 of the present invention. On the basis of Embodiment 1 above, in this embodiment, before beamforming is performed on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region to obtain the audio signal corresponding to that region, the method further includes: obtaining the position of each microphone array relative to its corresponding audio collection region. After speech recognition is performed on the speech signal corresponding to each audio collection region to obtain the corresponding recognition results, the method further includes: calculating the average energy amplitude of the speech signal corresponding to each audio collection region, and removing the recognition results of speech signals whose average energy amplitude is less than a preset threshold. As shown in Fig. 2, the specific steps of the method are as follows:
Step S201: receive audio data collected by multiple microphone arrays, where each microphone array points to one audio collection region in the vehicle and is used to collect one channel of audio data.
This embodiment is applied to a vehicle that performs speech recognition. The vehicle usually has multiple seats, such as the driver's seat, the front passenger seat, and other seats, and multiple microphone arrays are installed in the vehicle. Each microphone array points to one audio collection region and is used to collect audio data from the region it points to. Each audio collection region corresponds to the position of one seat, and the audio collection regions correspond one-to-one with the seats in the vehicle; that is, each microphone array points to one seat, and the microphone arrays are arranged in correspondence with the seats. For example, in a four-seat self-driving vehicle, four microphone arrays are installed, each pointing to one of the four seats.
In addition, in this embodiment each microphone array is installed near its corresponding audio collection region; the specific installation position of the microphone arrays is not limited.
For example, in a four-seat vehicle, microphone arrays that point to the four seats and collect audio data from the sound sources on those seats can be installed, with the four microphone arrays mounted on the roof above the four seats respectively.
In this embodiment, when speech recognition is performed, the microphone arrays can collect audio data in real time and send the collected audio data to the speech recognition device, which receives the audio data collected by each microphone array.
Each channel of audio data may include an identifier of the microphone array that collected it, so that the channels can be distinguished.
Step S202: obtain the position of each microphone array relative to its corresponding audio collection region.
The position of a microphone array relative to its corresponding audio collection region includes the angle range and the distance range of the corresponding audio collection region relative to that microphone array.
In one application scenario of this embodiment, a technician can use beamforming technology to preset the position of each microphone array relative to its corresponding audio collection region; once the microphone arrays have been installed, the position of each microphone array relative to its corresponding audio collection region is fixed.
One feasible implementation of this step is as follows:
The speech recognition device obtains the preset position of each microphone array relative to its corresponding audio collection region.
Optionally, the preset positions may be stored in advance in the in-vehicle terminal device of the vehicle, and the speech recognition device can obtain from the in-vehicle terminal device the position of each microphone array on the vehicle relative to its corresponding audio collection region.
In this embodiment, to capture the speech signal of a sound source more accurately, after the microphone arrays have been installed a technician can sit in each seat of the vehicle in turn and emit positioning audio, from which the position of each microphone array relative to its corresponding audio collection region is obtained. Specifically, this can be done as follows:
For any one microphone array, receive the positioning audio, collected by that microphone array, that is emitted by a sound source in its corresponding audio collection region; perform sound source localization on the positioning audio to calculate the position of its sound source relative to that microphone array; and take that position as the position of the microphone array relative to its corresponding audio collection region, thereby calibrating the position of the microphone array relative to its corresponding audio collection region.
For example, in a four-seat vehicle, microphone arrays are installed that point to the four seats and collect audio data from the sound sources on those seats. When a sound source on one of the seats makes a sound, the microphone array corresponding to that seat captures the audio data; the speech recognition device can determine the position of the sound source relative to that microphone array by sound source localization and use it as the position of the microphone array relative to the audio collection region corresponding to that seat.
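The patent leaves the sound source localization technique unspecified. One common approach, used here purely as an assumed stand-in, estimates the time difference of arrival between two microphones of an array by cross-correlation and converts it to an angle under a far-field assumption (which may hold only roughly inside a vehicle cabin). The function names, the two-microphone setup, and the default speed of sound are illustrative assumptions.

```python
import math
from typing import Sequence


def estimate_delay(ref: Sequence[float], other: Sequence[float], max_lag: int) -> int:
    """Return the lag (in samples) at which `other` best matches `ref`.

    A positive result means the sound reached `other` later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, value in enumerate(ref):
            m = n + lag
            if 0 <= m < len(other):
                score += value * other[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def arrival_angle_deg(lag: int, sample_rate: float, mic_spacing_m: float,
                      speed_of_sound: float = 343.0) -> float:
    """Convert a delay into an angle from broadside under a far-field assumption."""
    ratio = speed_of_sound * lag / (sample_rate * mic_spacing_m)
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))
```

An angle obtained this way covers only the angular part of the position; estimating distance would need additional microphones or a different technique, which this sketch does not attempt.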
In this embodiment, the speech recognition device can obtain the position of each microphone array relative to each audio collection region. After sound source localization has determined the position of a sound source relative to a microphone array, the device can determine, from the preset position of each microphone array relative to its corresponding audio collection region, whether the sound source lies in the audio collection region corresponding to that microphone array.
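Because the position is described earlier as an angle range and a distance range, the membership check can be sketched as a simple bounds test. The `RegionBounds` type, its field names, and the example values are hypothetical; the patent does not define concrete ranges or units.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RegionBounds:
    """Angle and distance range of one audio collection region, as seen from an array."""
    min_angle_deg: float
    max_angle_deg: float
    min_distance_m: float
    max_distance_m: float

    def contains(self, angle_deg: float, distance_m: float) -> bool:
        return (self.min_angle_deg <= angle_deg <= self.max_angle_deg
                and self.min_distance_m <= distance_m <= self.max_distance_m)


def find_region(regions: Dict[str, RegionBounds],
                angle_deg: float, distance_m: float) -> Optional[str]:
    """Return the name of the first region containing the source, or None."""
    for name, bounds in regions.items():
        if bounds.contains(angle_deg, distance_m):
            return name
    return None
```

For instance, `find_region(regions, 10.0, 0.5)` would report which seat's region a localized source at 10 degrees and 0.5 m falls in, given bounds expressed relative to the same microphone array.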
In another application scenario of this embodiment, when a person in the vehicle wants to control the vehicle by voice, the vehicle's speech recognition function usually has to be woken up first with a preset wake word. After recognizing the wake word, the speech recognition device can use the audio of the wake word as positioning audio: it performs sound source localization on the wake-word audio to calculate the position of its sound source relative to the microphone array and, on determining that the sound source lies in the audio collection region corresponding to that microphone array, uses the position of the sound source relative to the microphone array as the position of that microphone array relative to its corresponding audio collection region for the current speech recognition pass. Beamforming performed on each channel of audio data in this way yields a more accurate audio signal and can improve recognition accuracy for the audio data uttered by that person.
In this embodiment, the step of obtaining the position of each microphone array relative to its corresponding audio collection region may be executed the first time speech recognition is performed after the speech recognition device is powered on, and the resulting positions stored. In subsequent speech recognition the stored positions can be read and used directly, which improves the efficiency of speech recognition.
Optionally, each time speech recognition is performed, an audio segment of a preset period can be cut from the audio data collected by each microphone array and used as positioning audio to update the position of that microphone array relative to its corresponding audio collection region for the current speech recognition pass. Beamforming performed on each channel of audio data in this way yields a more accurate audio signal and can improve recognition accuracy for the audio data uttered by that person. The preset period may be a period at the start of the audio data or a period at its end, and can be set by a technician according to the practical application scenario and experience; this embodiment does not specifically limit this.
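Cutting a preset period from the start or end of the audio data can be sketched as below. The function name `positioning_segment` and its parameters are assumptions for illustration; the patent does not state how long the period should be.

```python
from typing import List, Sequence


def positioning_segment(samples: Sequence[float], sample_rate: int,
                        duration_s: float, from_start: bool = True) -> List[float]:
    """Return the first or last `duration_s` seconds of `samples`."""
    count = min(len(samples), int(round(sample_rate * duration_s)))
    if count <= 0:
        return []
    if from_start:
        return list(samples[:count])
    return list(samples[-count:])
```

The explicit check for a zero-length request matters in Python because `samples[-0:]` would otherwise return the whole sequence rather than an empty one.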
Step S203: perform beamforming on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, to obtain the audio signal in each channel of audio data that corresponds to the corresponding audio collection region.
Here, the audio collection region corresponding to a microphone array is the audio collection region that the microphone array points to. The audio collection region corresponding to a channel of audio data is the audio collection region pointed to by the microphone array that collected that audio data.
In this step, after the multiple channels of audio data are received, beamforming is applied to the channel of audio data collected by each microphone array, according to the position of that microphone array relative to the audio collection region it points to. This yields the audio signal in that channel that corresponds to its audio collection region and attenuates the audio signals in that channel that arrive from other directions, thereby suppressing them.
By applying beamforming to each channel of audio data in this way, the audio signal corresponding to each audio collection region is obtained.
Step S204, AF panel processing is carried out to multipath audio signal, obtains each audio collection region and corresponds to voice
Signal.
In the present embodiment, the audio signal on other directions may not be able to be fully eliminated due to beam forming technique,
In obtaining the corresponding audio signal in each audio collection region, the audio signal on other directions still may include, that is,
It may include the audio signal that sound source issues in other audio collection regions.In the step, each audio collection region pair is being obtained
After the audio signal answered, AF panel processing is carried out to multipath audio signal, from the corresponding audio signal in audio collection region
Middle other corresponding audio signal parts in audio collection region of removal, obtain the more clean road audio signal and correspond to audio adopting
Collect voice signal corresponding to region.
Specifically, carrying out AF panel processing to multipath audio signal, obtains each audio collection region and correspond to voice letter
Number, it can specifically realize in the following ways:
Respectively using every road audio signal as target audio, auditory localization processing is carried out to target audio, determines target sound
The sound source position of frequency;According to the sound source position of target audio, judge in target audio whether to include other audio collection regions
The audio signal that sound source issues;If the audio signal that the sound source in target audio comprising other audio collection regions issues, from
Other corresponding audio signals in audio collection region are removed in target audio, obtaining target audio, to correspond to audio pickup area institute right
The voice signal answered.
If the audio signal that the sound source in target audio not comprising other audio collection regions issues, can be directly by mesh
Mark with phonetic symbols frequency is used as it to correspond to voice signal corresponding to audio pickup area.
In the present embodiment, position of the available each microphone array of speech recognition apparatus relative to each audio collection region
It sets.It is carrying out after auditory localization determines position of a certain sound source relative to a certain microphone array, it can be according to presetting
Position of every road microphone array relative to corresponding audio pickup area, determine whether the sound source is located at the microphone array pair
In the audio collection region answered.
After the sound source position of the target audio has been determined, if the target audio corresponds to multiple sound sources, the position of each sound source relative to the microphone array corresponding to the target audio can be determined. From the position of each microphone array relative to each audio collection region, the audio collection region in which each sound source lies can then be determined. It can then be judged whether any of these sound sources lies in another audio collection region, and hence whether the target audio contains an audio signal emitted by a sound source in another audio collection region.
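The region check described above can be sketched as follows. This is only an illustrative sketch: it assumes that a position is expressed as an angle and a distance relative to one microphone array, and that each audio collection region is stored as an angle range and a distance range. The names `Region`, `Source`, `region_of` and `has_foreign_source`, and all numeric values, are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class Region:
    """Preset position of one audio collection region relative to a microphone array."""
    name: str
    angle_range: Tuple[float, float]     # degrees, inclusive (assumed convention)
    distance_range: Tuple[float, float]  # metres, inclusive (assumed convention)


@dataclass(frozen=True)
class Source:
    """A localized sound source, expressed relative to the same microphone array."""
    angle: float     # degrees
    distance: float  # metres


def region_of(source: Source, regions: Sequence[Region]) -> Optional[str]:
    """Return the name of the first region whose ranges contain the source, else None."""
    for region in regions:
        lo_a, hi_a = region.angle_range
        lo_d, hi_d = region.distance_range
        if lo_a <= source.angle <= hi_a and lo_d <= source.distance <= hi_d:
            return region.name
    return None


def has_foreign_source(sources: Sequence[Source], own: str, regions: Sequence[Region]) -> bool:
    """True if any localized source lies in a region other than `own`."""
    return any(region_of(s, regions) not in (None, own) for s in sources)


# Hypothetical layout as seen from the microphone array of the driver's seat.
regions = [
    Region("driver", (-30.0, 30.0), (0.2, 0.8)),
    Region("front_passenger", (40.0, 80.0), (0.6, 1.4)),
]
sources = [Source(angle=5.0, distance=0.5), Source(angle=60.0, distance=1.0)]
print(has_foreign_source(sources, "driver", regions))
```

Sources that fall in no known region are ignored here; a real system would have to decide how to treat them.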
For example, two people in the driver's seat and the front passenger seat utter a first utterance and a second utterance, respectively. The audio data collected by the first microphone array, which corresponds to the driver's seat, and by the second microphone array, which corresponds to the front passenger seat, may then each contain information from both utterances. Suppose that, after beam forming, the first audio signal obtained for the driver's seat still contains part of the second utterance. Sound source localization on the first audio signal can determine that there are two sound sources and give the position of each relative to the first microphone array. Combined with the position of each microphone array relative to each audio collection region, this shows that the two sound sources lie in the audio collection regions of the driver's seat and the front passenger seat, respectively. It can therefore be judged that the first audio signal contains an audio signal from another audio collection region. Using the attribute parameters of the second audio signal, which corresponds to the front passenger seat, the second audio signal is removed from the first audio signal to obtain the voice signal corresponding to the first audio signal, that is, the voice signal for the driver's seat. The second audio signal for the front passenger seat can be processed in the same way to obtain the voice signal for the front passenger seat.
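The patent does not say how the second audio signal is removed from the first. As one possible illustration only, the sketch below subtracts a least-squares scaled copy of a reference signal from the target. It assumes the two signals are time-aligned and that the leakage is a plain scaled copy of the reference. A practical system would more likely use an adaptive filter. The function name `cancel_reference` and the sample values are hypothetical.

```python
from typing import List, Sequence


def cancel_reference(target: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Subtract the least-squares scaled copy of `reference` from `target`."""
    if len(target) != len(reference):
        raise ValueError("signals must have the same length")
    ref_energy = sum(r * r for r in reference)
    if ref_energy == 0.0:
        return list(target)  # silent reference: nothing to cancel
    # Gain that minimizes the energy of (target - gain * reference).
    gain = sum(t * r for t, r in zip(target, reference)) / ref_energy
    return [t - gain * r for t, r in zip(target, reference)]


# Toy signals: the driver's channel leaks 30% of the passenger's signal.
driver = [1.0, -1.0, 1.0, -1.0]
passenger = [1.0, 1.0, -1.0, -1.0]
first_audio = [d + 0.3 * p for d, p in zip(driver, passenger)]
cleaned = cancel_reference(first_audio, passenger)
print([round(c, 6) for c in cleaned])
```

The toy signals are chosen to be uncorrelated, so the estimated gain equals the leakage factor. With correlated speech, part of the wanted signal would be removed as well.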
Step S205: perform speech recognition in parallel on the voice signals corresponding to the audio collection regions, obtaining the speech recognition result for each audio collection region.
In this embodiment, after the voice signal corresponding to each audio collection region has been obtained, speech recognition can be performed in parallel on these voice signals to obtain the recognition result of the voice signal of each audio collection region.
Specifically, the voice signal corresponding to each audio collection region can be fed into its own speech recognition module, so that speech recognition is performed in parallel on the voice signals of all audio collection regions. Obtaining the recognition result of each region's voice signal in this way can greatly improve the efficiency of speech recognition.
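A minimal sketch of this parallel step, using a thread pool from the Python standard library. The recognizer itself is outside the scope of the sketch: `fake_recognize` is a hypothetical placeholder, and `recognize_all` is an illustrative name rather than anything defined in the patent.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Sequence

Signal = Sequence[float]


def recognize_all(
    signals: Dict[str, Signal],
    recognize: Callable[[Signal], str],
) -> Dict[str, str]:
    """Run `recognize` on the voice signal of every region concurrently."""
    if not signals:
        return {}
    with ThreadPoolExecutor(max_workers=len(signals)) as pool:
        # One task per audio collection region, all submitted before any is awaited.
        futures = {region: pool.submit(recognize, sig) for region, sig in signals.items()}
        return {region: fut.result() for region, fut in futures.items()}


def fake_recognize(signal: Signal) -> str:
    """Placeholder recognizer (hypothetical): reports the number of samples."""
    return f"{len(signal)} samples"


results = recognize_all(
    {"driver": [0.1, 0.2, 0.3], "front_passenger": [0.0, 0.1]},
    fake_recognize,
)
print(results)
```

Threads suit a recognizer that waits on I/O or runs in native code. For a CPU-bound recognizer written in pure Python, a process pool would be the more natural choice.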
In this embodiment, after the recognition result of the voice signal of each audio collection region has been obtained, steps S206 and S207 can also be performed to check the speech recognition results and filter out invalid ones, improving the accuracy of speech recognition.
Step S206: calculate the average energy amplitude of the voice signal corresponding to each audio collection region.
In this embodiment, the average energy amplitude of the voice signal corresponding to an audio collection region can be calculated with any existing method for computing the average energy amplitude of a voice signal, which is not described further here.
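Since the text leaves the exact measure open, the sketch below shows one common choice: the mean absolute sample value. This is an assumption made for illustration, and other definitions (for example the mean of squared samples) would fit the description equally well. The function name is hypothetical.

```python
from typing import Sequence


def average_energy_amplitude(signal: Sequence[float]) -> float:
    """Mean absolute sample value of the signal; 0.0 for an empty signal."""
    if not signal:
        return 0.0
    return sum(abs(sample) for sample in signal) / len(signal)


print(average_energy_amplitude([0.5, -0.5, 1.0, -1.0]))
```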
Step S207: remove the recognition results of voice signals whose average energy amplitude is below a preset threshold.
After the average energy amplitude of the voice signal corresponding to each audio collection region has been calculated, the average energy amplitude of each voice signal is compared with the preset threshold. The speech recognition result of a voice signal whose average energy amplitude is below the preset threshold is treated as an invalid recognition result, and the speech recognition result of a voice signal whose average energy amplitude is greater than or equal to the preset threshold is treated as a valid recognition result. The speech recognition results obtained in step S205 are filtered accordingly: the invalid recognition results are removed, and what remains is the final speech recognition result.
The preset threshold can be set by a technician according to the actual application scenario and experience; this embodiment places no specific limit on it.
For example, after one person in the driver's seat speaks an utterance, the microphone array corresponding to the front passenger seat also collects audio data, and the speech recognition apparatus produces a speech recognition result for that channel as well; the two speech recognition results for the driver's seat and the front passenger seat should be consistent. However, after beam forming and interference suppression, the energy amplitude of the voice signal for the front passenger seat is very small. If that energy amplitude is below the preset threshold, the speech recognition result for the front passenger channel is likely to be wrong, so it can be discarded and only the result for the driver's channel retained, improving the accuracy of speech recognition.
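The filtering in step S207 can be sketched as below. The threshold value, the region names and the recognized texts are invented for illustration; the patent leaves the threshold to the implementer.

```python
from typing import Dict


def filter_results(
    results: Dict[str, str],
    amplitudes: Dict[str, float],
    threshold: float,
) -> Dict[str, str]:
    """Keep only results whose voice signal has average energy amplitude >= threshold."""
    return {
        region: text
        for region, text in results.items()
        if amplitudes[region] >= threshold
    }


# Hypothetical values: only the driver spoke, the passenger channel holds residue.
results = {"driver": "open the window", "front_passenger": "open window"}
amplitudes = {"driver": 0.42, "front_passenger": 0.03}
final = filter_results(results, amplitudes, threshold=0.1)
print(final)
```

A signal whose amplitude equals the threshold is kept, matching the "greater than or equal to" rule in the description.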
In this embodiment of the present invention, before beam forming is performed on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region to obtain the audio signal corresponding to that region, the position of each microphone array relative to its corresponding audio collection region is calibrated, so that the audio signals obtained by beam forming are more accurate. Performing speech recognition in parallel on the voice signals of the audio collection regions to obtain the speech recognition result for each region further improves the efficiency of speech recognition. Further, by calculating the average energy amplitude of the voice signal for each audio collection region and removing the recognition results of voice signals whose average energy amplitude is below the preset threshold, a secondary check of the speech recognition results is performed and invalid recognition results are removed, improving the accuracy of speech recognition.
Embodiment three
Fig. 3 is a schematic structural diagram of the multi-channel speech recognition device provided by Embodiment three of the present invention. The multi-channel speech recognition device provided by this embodiment can execute the processing flow provided by the multi-channel speech recognition method embodiments. As shown in Fig. 3, the multi-channel speech recognition device 30 includes a data acquisition module 301, a beam forming module 302, an interference suppression processing module 303, and a speech recognition module 304.
Specifically, the data acquisition module 301 is configured to receive the audio data collected by multiple microphone arrays, where each microphone array points at one audio collection region in the vehicle and collects one channel of audio data.
The beam forming module 302 is configured to perform beam forming on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region, obtaining the audio signal in each channel of audio data that corresponds to that region.
The interference suppression processing module 303 is configured to perform interference suppression on the multiple channels of audio signals, obtaining the voice signal corresponding to each audio collection region.
The speech recognition module 304 is configured to perform speech recognition on the voice signal corresponding to each audio collection region, obtaining the speech recognition result for each audio collection region.
The device provided by this embodiment of the present invention can be used to execute the method embodiment provided by Embodiment one above; its specific functions are not described again here.
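How the four modules hand data to one another can be sketched as a simple pipeline. The class name `MultiChannelRecognizer`, the callable signatures and the trivial stand-in functions are all hypothetical, chosen only to exercise the wiring. They are not the patent's implementation.

```python
from typing import Callable, Dict, List

Signal = List[float]


class MultiChannelRecognizer:
    """Chains stand-ins for modules 301 to 304; every callable is a placeholder."""

    def __init__(
        self,
        acquire: Callable[[], Dict[str, List[Signal]]],
        beamform: Callable[[str, List[Signal]], Signal],
        suppress: Callable[[Dict[str, Signal]], Dict[str, Signal]],
        recognize: Callable[[Signal], str],
    ) -> None:
        self.acquire = acquire
        self.beamform = beamform
        self.suppress = suppress
        self.recognize = recognize

    def run(self) -> Dict[str, str]:
        raw = self.acquire()  # region -> per-microphone samples of that array
        beams = {region: self.beamform(region, mics) for region, mics in raw.items()}
        voices = self.suppress(beams)  # region -> voice signal
        return {region: self.recognize(voice) for region, voice in voices.items()}


# Trivial stand-ins so the wiring can be run end to end.
pipeline = MultiChannelRecognizer(
    acquire=lambda: {
        "driver": [[0.2, 0.4], [0.4, 0.2]],
        "front_passenger": [[0.0, 0.0], [0.0, 0.0]],
    },
    beamform=lambda region, mics: [sum(frame) / len(frame) for frame in zip(*mics)],
    suppress=lambda beams: beams,
    recognize=lambda voice: "speech" if any(abs(s) > 0.1 for s in voice) else "",
)
print(pipeline.run())
```

Passing the modules in as callables keeps the sketch independent of any particular beam forming or recognition library.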
In this embodiment of the present invention, audio data collected by multiple microphone arrays is received, where each microphone array points at one audio collection region in the vehicle and collects one channel of audio data. According to the position of each microphone array relative to its corresponding audio collection region, beam forming is performed on each channel of audio data to obtain the audio signal in that channel that corresponds to the region; this attenuates the audio signals arriving from other directions in that channel. Interference suppression is then performed on the multiple channels of audio signals to obtain the voice signal for each audio collection region, which further reduces the interference of noise signals from other audio collection regions with that channel's voice signal and yields a cleaner voice signal for each region. Speech recognition is performed on the voice signal for each audio collection region to obtain the speech recognition result for each region. As a result, whichever audio collection region of the vehicle a sound source is in, a corresponding microphone array can accurately collect its audio data and an accurate speech recognition result can be obtained, improving the recognition rate. In addition, when several people speak at different positions at the same time, mutual interference between the channels of voice signals can be suppressed and the speech recognition result for each audio collection position identified, greatly improving the efficiency and accuracy of speech recognition.
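The patent does not name a specific beam forming algorithm. As one widely used illustration, the sketch below implements a basic delay-and-sum beam former with integer sample delays: each microphone channel is shifted by the delay expected for a source in the target region, and the aligned channels are averaged. The function name and the toy signals are hypothetical, and deriving the delays from the array geometry is not shown.

```python
from typing import List, Sequence


def delay_and_sum(channels: Sequence[Sequence[float]], delays: Sequence[int]) -> List[float]:
    """Align each microphone channel by its steering delay (in samples) and average."""
    if len(channels) != len(delays):
        raise ValueError("one delay per channel is required")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative sample counts")
    if not channels:
        return []
    # Only keep the samples for which every shifted channel still has data.
    length = min(len(ch) - d for ch, d in zip(channels, delays))
    if length <= 0:
        return []
    return [
        sum(ch[d + n] for ch, d in zip(channels, delays)) / len(channels)
        for n in range(length)
    ]


# A pulse reaches microphone B one sample later than microphone A.
mic_a = [0.0, 1.0, 0.0, 0.0, 0.0]
mic_b = [0.0, 0.0, 1.0, 0.0, 0.0]
steered = delay_and_sum([mic_a, mic_b], [0, 1])    # delays match the source direction
unsteered = delay_and_sum([mic_a, mic_b], [0, 0])  # delays for a different direction
print(max(steered), max(unsteered))
```

When the delays match the source direction the pulse adds coherently; with mismatched delays it is smeared and its peak is halved, which is the attenuation of other directions that the text refers to.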
Embodiment four
On the basis of Embodiment three above, in this embodiment the speech recognition module is further configured to:
Calculate the average energy amplitude of the voice signal corresponding to each audio collection region, and remove the recognition results of voice signals whose average energy amplitude is below the preset threshold.
Optionally, the speech recognition module is further configured to:
Perform speech recognition in parallel on the voice signals corresponding to the audio collection regions, obtaining the speech recognition result for each audio collection region.
Optionally, the interference suppression processing module is further configured to:
Take each channel of audio signal in turn as the target audio, and perform sound source localization on the target audio to determine its sound source position; based on that position, determine whether the target audio contains an audio signal emitted by a sound source in another audio collection region; and, if it does, remove the audio signal corresponding to the other audio collection region from the target audio, obtaining the voice signal of the audio collection region to which the target audio corresponds.
Optionally, the data acquisition module is further configured to:
Obtain the position of each microphone array relative to its corresponding audio collection region.
Optionally, the data acquisition module is further configured to:
For any one microphone array, receive the positioning audio, collected by that microphone array, that is emitted by a sound source in its corresponding audio collection region; perform sound source localization on the positioning audio to calculate the position of its sound source relative to that microphone array; and take that position as the position of the microphone array relative to its corresponding audio collection region.
Optionally, the data acquisition module is further configured to:
Obtain the preset position of each microphone array relative to its corresponding audio collection region.
Optionally, the position of each microphone array relative to its corresponding audio collection region includes:
The angle range and distance range of each microphone array relative to its corresponding audio collection region.
In this embodiment, the audio collection regions in the vehicle correspond one-to-one with the seats in the vehicle.
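One way to turn the calibration step into an angle range and a distance range is sketched below. It assumes that sound source localization on the positioning audio has already produced several (angle, distance) estimates, and widens their spread by a safety margin. The margins, the function name `calibrate_region` and the sample estimates are hypothetical; the patent does not specify how the ranges are derived.

```python
from typing import Sequence, Tuple

Range = Tuple[float, float]


def calibrate_region(
    estimates: Sequence[Tuple[float, float]],
    angle_margin: float = 5.0,
    distance_margin: float = 0.25,
) -> Tuple[Range, Range]:
    """Derive (angle_range, distance_range) from localized (angle, distance) estimates."""
    if not estimates:
        raise ValueError("at least one localization estimate is required")
    angles = [angle for angle, _ in estimates]
    distances = [distance for _, distance in estimates]
    angle_range = (min(angles) - angle_margin, max(angles) + angle_margin)
    # A distance cannot be negative, so the lower bound is clamped at zero.
    distance_range = (
        max(0.0, min(distances) - distance_margin),
        max(distances) + distance_margin,
    )
    return angle_range, distance_range


# Hypothetical estimates in degrees and metres from three positioning utterances.
angle_range, distance_range = calibrate_region([(10.0, 0.5), (14.0, 0.75), (12.0, 0.5)])
print(angle_range, distance_range)
```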
The device provided by this embodiment of the present invention can be used to execute the method embodiment provided by Embodiment two above; its specific functions are not described again here.
In this embodiment of the present invention, before beam forming is performed on each channel of audio data according to the position of each microphone array relative to its corresponding audio collection region to obtain the audio signal corresponding to that region, the position of each microphone array relative to its corresponding audio collection region is calibrated, so that the audio signals obtained by beam forming are more accurate. Performing speech recognition in parallel on the voice signals of the audio collection regions to obtain the speech recognition result for each region further improves the efficiency of speech recognition. Further, by calculating the average energy amplitude of the voice signal for each audio collection region and removing the recognition results of voice signals whose average energy amplitude is below the preset threshold, a secondary check of the speech recognition results is performed and invalid recognition results are removed, improving the accuracy of speech recognition.
Embodiment five
Fig. 4 is a schematic structural diagram of the multi-channel speech recognition equipment provided by Embodiment five of the present invention. As shown in Fig. 4, the equipment 40 includes a processor 401, a memory 402, and a computer program that is stored on the memory 402 and executable by the processor 401.
When the processor 401 executes the computer program stored on the memory 402, it implements the multi-channel speech recognition method provided by any of the method embodiments above.
In this embodiment of the present invention, audio data collected by multiple microphone arrays is received, where each microphone array points at one audio collection region in the vehicle and collects one channel of audio data. According to the position of each microphone array relative to its corresponding audio collection region, beam forming is performed on each channel of audio data to obtain the audio signal in that channel that corresponds to the region; this attenuates the audio signals arriving from other directions in that channel. Interference suppression is then performed on the multiple channels of audio signals to obtain the voice signal for each audio collection region, which further reduces the interference of noise signals from other audio collection regions with that channel's voice signal and yields a cleaner voice signal for each region. Speech recognition is performed on the voice signal for each audio collection region to obtain the speech recognition result for each region. As a result, whichever audio collection region of the vehicle a sound source is in, a corresponding microphone array can accurately collect its audio data and an accurate speech recognition result can be obtained, improving the recognition rate. In addition, when several people speak at different positions at the same time, mutual interference between the channels of voice signals can be suppressed and the speech recognition result for each audio collection position identified, greatly improving the efficiency and accuracy of speech recognition.
In addition, an embodiment of the present invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the multi-channel speech recognition method provided by any of the method embodiments above.
In the several embodiments provided by the present invention, it should be understood that the disclosed device and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function; in actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute some of the steps of the methods of the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional modules above is used only as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. For the specific working process of the device described above, refer to the corresponding process in the foregoing method embodiments; it is not repeated here.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practising the invention disclosed here. The present invention is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed by the present invention. The specification and examples are to be regarded as illustrative only; the true scope and spirit of the invention are indicated by the following claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (12)
1. A multi-channel speech recognition method, characterized by comprising:
Receiving audio data collected by multiple microphone arrays, where each microphone array points at one audio collection region in a vehicle and collects one channel of audio data;
Performing beam forming on each channel of the audio data according to the position of each microphone array relative to its corresponding audio collection region, obtaining the audio signal in each channel of the audio data that corresponds to that audio collection region;
Performing interference suppression on the multiple channels of the audio signals, obtaining the voice signal corresponding to each audio collection region;
Performing speech recognition on the voice signal corresponding to each audio collection region, obtaining the speech recognition result corresponding to each audio collection region.
2. The method according to claim 1, characterized in that performing speech recognition on the voice signal corresponding to each audio collection region, obtaining the speech recognition result corresponding to each audio collection region, comprises:
Performing speech recognition in parallel on the voice signals corresponding to the audio collection regions, obtaining the speech recognition result corresponding to each audio collection region.
3. The method according to claim 1, characterized in that performing interference suppression on the multiple channels of the audio signals, obtaining the voice signal corresponding to each audio collection region, comprises:
Taking each channel of the audio signals in turn as target audio, and performing sound source localization on the target audio to determine the sound source position of the target audio;
Determining, according to the sound source position of the target audio, whether the target audio contains an audio signal emitted by a sound source in another audio collection region;
If the target audio contains an audio signal emitted by a sound source in another audio collection region, removing the audio signal corresponding to the other audio collection region from the target audio, obtaining the voice signal of the audio collection region to which the target audio corresponds.
4. The method according to claim 1, characterized in that, before performing beam forming on each channel of the audio data according to the position of each microphone array relative to its corresponding audio collection region to obtain the audio signal in each channel of the audio data that corresponds to that audio collection region, the method further comprises:
Obtaining the position of each microphone array relative to its corresponding audio collection region.
5. The method according to claim 4, characterized in that obtaining the position of each microphone array relative to its corresponding audio collection region comprises:
For any one microphone array, receiving the positioning audio, collected by that microphone array, that is emitted by a sound source in its corresponding audio collection region;
Performing sound source localization on the positioning audio, calculating the position of the sound source of the positioning audio relative to that microphone array;
Taking the position of the sound source of the positioning audio relative to that microphone array as the position of that microphone array relative to its corresponding audio collection region.
6. The method according to claim 4, characterized in that obtaining the position of each microphone array relative to its corresponding audio collection region comprises:
Obtaining the preset position of each microphone array relative to its corresponding audio collection region.
7. The method according to claim 4, characterized in that the position of each microphone array relative to its corresponding audio collection region comprises:
The angle range and distance range of each microphone array relative to its corresponding audio collection region.
8. The method according to claim 1, characterized in that, after performing speech recognition on the voice signal corresponding to each audio collection region to obtain the recognition result corresponding to each audio collection region, the method further comprises:
Calculating the average energy amplitude of the voice signal corresponding to each audio collection region;
Removing the recognition results of voice signals whose average energy amplitude is below a preset threshold.
9. The method according to any one of claims 1-8, characterized in that the audio collection regions in the vehicle correspond one-to-one with the seats in the vehicle.
10. A multi-channel speech recognition device, characterized by comprising:
A data acquisition module, configured to receive audio data collected by multiple microphone arrays, where each microphone array points at one audio collection region in a vehicle and collects one channel of audio data;
A beam forming module, configured to perform beam forming on each channel of the audio data according to the position of each microphone array relative to its corresponding audio collection region, obtaining the audio signal in each channel of the audio data that corresponds to that audio collection region;
An interference suppression processing module, configured to perform interference suppression on the multiple channels of the audio signals, obtaining the voice signal corresponding to each audio collection region;
A speech recognition module, configured to perform speech recognition on the voice signal corresponding to each audio collection region, obtaining the speech recognition result corresponding to each audio collection region.
11. Multi-channel speech recognition equipment, characterized by comprising:
A memory, a processor, and a computer program stored on the memory and runnable on the processor,
where the processor implements the method according to any one of claims 1-9 when running the computer program.
12. A computer-readable storage medium, characterized in that it stores a computer program,
where the computer program implements the method according to any one of claims 1-9 when executed by a processor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910164535.5A CN109920405A (en) | 2019-03-05 | 2019-03-05 | Multi-path voice recognition methods, device, equipment and readable storage medium storing program for executing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910164535.5A CN109920405A (en) | 2019-03-05 | 2019-03-05 | Multi-path voice recognition methods, device, equipment and readable storage medium storing program for executing |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109920405A true CN109920405A (en) | 2019-06-21 |
Family
ID=66963410
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910164535.5A Pending CN109920405A (en) | 2019-03-05 | 2019-03-05 | Multi-path voice recognition methods, device, equipment and readable storage medium storing program for executing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109920405A (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110070868A (en) * | 2019-04-28 | 2019-07-30 | 广州小鹏汽车科技有限公司 | Voice interactive method, device, automobile and the machine readable media of onboard system |
CN110517677A (en) * | 2019-08-27 | 2019-11-29 | 腾讯科技(深圳)有限公司 | Speech processing system, method, equipment, speech recognition system and storage medium |
CN110767247A (en) * | 2019-10-29 | 2020-02-07 | 支付宝(杭州)信息技术有限公司 | Voice signal processing method, sound acquisition device and electronic equipment |
CN110970049A (en) * | 2019-12-06 | 2020-04-07 | 广州国音智能科技有限公司 | Multi-person voice recognition method, device, equipment and readable storage medium |
CN111489522A (en) * | 2020-05-29 | 2020-08-04 | 北京百度网讯科技有限公司 | Method, device and system for outputting information |
CN111489755A (en) * | 2020-04-13 | 2020-08-04 | 北京声智科技有限公司 | Voice recognition method and device |
CN111640428A (en) * | 2020-05-29 | 2020-09-08 | 北京百度网讯科技有限公司 | Voice recognition method, device, equipment and medium |
CN111968642A (en) * | 2020-08-27 | 2020-11-20 | 北京百度网讯科技有限公司 | Voice data processing method and device and intelligent vehicle |
CN112562681A (en) * | 2020-12-02 | 2021-03-26 | 腾讯科技(深圳)有限公司 | Speech recognition method and apparatus, and storage medium |
CN113270095A (en) * | 2021-04-26 | 2021-08-17 | 镁佳(北京)科技有限公司 | Voice processing method, device, storage medium and electronic equipment |
CN115223548A (en) * | 2021-06-29 | 2022-10-21 | 达闼机器人股份有限公司 | Voice interaction method, voice interaction device and storage medium |
Citations (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102164328A (en) * | 2010-12-29 | 2011-08-24 | 中国科学院声学研究所 | Audio input system used in home environment based on microphone array |
CN104810021A (en) * | 2015-05-11 | 2015-07-29 | 百度在线网络技术(北京)有限公司 | Pre-processing method and device applied to far-field recognition |
CN106653041A (en) * | 2017-01-17 | 2017-05-10 | 北京地平线信息技术有限公司 | Audio signal processing equipment and method as well as electronic equipment |
CN206312566U (en) * | 2016-12-15 | 2017-07-07 | 北京塞宾科技有限公司 | A kind of vehicle intelligent audio devices |
CN107123429A (en) * | 2017-03-22 | 2017-09-01 | 歌尔科技有限公司 | The auto gain control method and device of audio signal |
CN107481729A (en) * | 2017-09-13 | 2017-12-15 | 百度在线网络技术(北京)有限公司 | A kind of method and system that intelligent terminal is upgraded to far field speech-sound intelligent equipment |
CN108766456A (en) * | 2018-05-22 | 2018-11-06 | 出门问问信息科技有限公司 | A kind of method of speech processing and device |
CN108986838A (en) * | 2018-09-18 | 2018-12-11 | 东北大学 | A kind of adaptive voice separation method based on auditory localization |
CN109087663A (en) * | 2017-06-13 | 2018-12-25 | 恩智浦有限公司 | signal processor |
CN109192203A (en) * | 2018-09-29 | 2019-01-11 | 百度在线网络技术(北京)有限公司 | Multitone area audio recognition method, device and storage medium |
CN109273020A (en) * | 2018-09-29 | 2019-01-25 | 百度在线网络技术(北京)有限公司 | Acoustic signal processing method, device, equipment and storage medium |
CN109308908A (en) * | 2017-07-27 | 2019-02-05 | 深圳市冠旭电子股份有限公司 | A kind of voice interactive method and device |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110070868A (en) * | 2019-04-28 | 2019-07-30 | 广州小鹏汽车科技有限公司 | Voice interaction method and device for an onboard system, automobile, and machine-readable medium |
CN110517677A (en) * | 2019-08-27 | 2019-11-29 | 腾讯科技(深圳)有限公司 | Speech processing system, method, equipment, speech recognition system and storage medium |
CN110517677B (en) * | 2019-08-27 | 2022-02-08 | 腾讯科技(深圳)有限公司 | Speech processing system, method, apparatus, speech recognition system, and storage medium |
CN110767247B (en) * | 2019-10-29 | 2021-02-19 | 支付宝(杭州)信息技术有限公司 | Voice signal processing method, sound acquisition device and electronic equipment |
CN110767247A (en) * | 2019-10-29 | 2020-02-07 | 支付宝(杭州)信息技术有限公司 | Voice signal processing method, sound acquisition device and electronic equipment |
CN110970049A (en) * | 2019-12-06 | 2020-04-07 | 广州国音智能科技有限公司 | Multi-person voice recognition method, device, equipment and readable storage medium |
CN111489755A (en) * | 2020-04-13 | 2020-08-04 | 北京声智科技有限公司 | Voice recognition method and device |
CN111489522A (en) * | 2020-05-29 | 2020-08-04 | 北京百度网讯科技有限公司 | Method, device and system for outputting information |
CN111640428A (en) * | 2020-05-29 | 2020-09-08 | 北京百度网讯科技有限公司 | Voice recognition method, device, equipment and medium |
CN111640428B (en) * | 2020-05-29 | 2023-10-20 | 阿波罗智联(北京)科技有限公司 | Voice recognition method, device, equipment and medium |
CN111968642A (en) * | 2020-08-27 | 2020-11-20 | 北京百度网讯科技有限公司 | Voice data processing method and device and intelligent vehicle |
CN112562681A (en) * | 2020-12-02 | 2021-03-26 | 腾讯科技(深圳)有限公司 | Speech recognition method and apparatus, and storage medium |
CN112562681B (en) * | 2020-12-02 | 2021-11-19 | 腾讯科技(深圳)有限公司 | Speech recognition method and apparatus, and storage medium |
CN113270095A (en) * | 2021-04-26 | 2021-08-17 | 镁佳(北京)科技有限公司 | Voice processing method, device, storage medium and electronic equipment |
CN115223548A (en) * | 2021-06-29 | 2022-10-21 | 达闼机器人股份有限公司 | Voice interaction method, voice interaction device and storage medium |
CN115223548B (en) * | 2021-06-29 | 2023-03-14 | 达闼机器人股份有限公司 | Voice interaction method, voice interaction device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109920405A (en) | Multi-path voice recognition method, device, equipment and readable storage medium | |
CN108122563B (en) | Method for improving voice awakening rate and correcting DOA | |
CN110556103B (en) | Audio signal processing method, device, system, equipment and storage medium | |
CN109509465B (en) | Voice signal processing method, assembly, equipment and medium | |
CN104908688B (en) | Method and device for vehicle active noise reduction | |
CN110459234A (en) | Vehicle-mounted speech recognition method and system | |
EP3822654A1 (en) | Audio recognition method, and target audio positioning method, apparatus and device | |
CN103901401B (en) | Binaural sound source localization method based on a binaural matched filter | |
CN102438189B (en) | Dual-channel acoustic signal-based sound source localization method | |
CN106328156A (en) | Microphone array speech enhancement system and method combining audio and video information | |
CN109545230A (en) | Acoustic signal processing method and device in vehicle | |
CN109272989A (en) | Voice awakening method, device and computer readable storage medium | |
CN103999151B (en) | Computationally efficient broadband filter-and-sum array focusing | |
CN104916289A (en) | Quick acoustic event detection method under vehicle-driving noise environment | |
CN104991573A (en) | Locating and tracking method and apparatus based on sound source array | |
CN103165137B (en) | Speech enhancement method of microphone array under non-stationary noise environment | |
CN106531179A (en) | Multi-channel speech enhancement method based on semantic prior selective attention | |
CN103680512B (en) | System and method for improving the speech recognition level of a vehicle array microphone | |
CN112216295B (en) | Sound source positioning method, device and equipment | |
CN102819009A (en) | Driver sound localization system and method for automobile | |
CN109273020A (en) | Acoustic signal processing method, device, equipment and storage medium | |
CN103278801A (en) | Noise imaging detection device and detection calculation method for transformer substation | |
CN103854660B (en) | Four-microphone speech enhancement method based on independent component analysis | |
JP7326627B2 (en) | AUDIO SIGNAL PROCESSING METHOD, APPARATUS, DEVICE AND COMPUTER PROGRAM | |
CN107346664A (en) | Binaural speech separation method based on critical bands |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20211019. Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing. Applicant after: Apollo Zhilian (Beijing) Technology Co.,Ltd. Address before: 100085 Baidu Building, 10 Shangdi Tenth Street, Haidian District, Beijing. Applicant before: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) Co.,Ltd. |