WO2017045413A1 - Audio output method and apparatus - Google Patents


Info

Publication number
WO2017045413A1
WO2017045413A1 PCT/CN2016/082421 CN2016082421W WO2017045413A1 WO 2017045413 A1 WO2017045413 A1 WO 2017045413A1 CN 2016082421 W CN2016082421 W CN 2016082421W WO 2017045413 A1 WO2017045413 A1 WO 2017045413A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio data
channel
channel audio
data
input
Application number
PCT/CN2016/082421
Other languages
French (fr)
Chinese (zh)
Inventor
石武
Original Assignee
北京云知声信息技术有限公司
Application filed by 北京云知声信息技术有限公司
Publication of WO2017045413A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems

Definitions

  • the present invention relates to the field of multimedia technologies, and in particular, to an audio output method and apparatus.
  • IIS Inter-IC Sound
  • the embodiment of the invention provides an audio output method and device, which are used for collecting multi-channel audio data in speech recognition.
  • an audio output method for use in a field programmable gate array, comprising the steps of:
  • the single channel audio data is output through a single audio output interface.
  • the receiving multiple audio data input by the multiple audio input interfaces includes:
  • the multi-channel audio data input by the multi-channel audio input interface is received according to the number of clocks.
  • the converting the multi-channel audio data into single-channel audio data comprises:
  • the outputting the single channel audio data comprises:
  • the sorted single channel audio data is output.
  • the receiving multiple audio data input by the multiple audio input interfaces includes:
  • the outputting the single channel audio data by using a single audio output interface includes:
  • the single channel audio data is output through a single audio output interface at a falling edge of the clock.
  • the receiving multiple audio data input by the multiple audio input interfaces includes:
  • Multiple audio data input by multiple audio input interfaces is received at preset time intervals.
  • the above technical solution converts the received multi-channel audio data into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition.
  • this solves the problem that the central processor, having only one set of audio input interfaces, cannot receive multi-channel audio data during speech recognition, and better satisfies the needs of speech recognition.
  • an audio output method for use in a central processing unit, including the following steps:
  • when single channel audio data is received, splitting the single channel audio data into multiple channels of audio data;
  • the multi-channel audio data is output.
  • the splitting the single audio data into multiple audio data includes:
  • the single audio data is split into multiple audio data according to the number of clocks.
  • the single channel audio data is single channel left channel audio data or single channel right channel audio data; splitting the single channel audio data into multiple audio data when the single channel audio data is received includes:
  • when the single channel left channel audio data is received, splitting the single channel left channel audio data into multiple left channel audio data;
  • when the single channel right channel audio data is received, splitting the single channel right channel audio data into multiple right channel audio data;
  • the above technical solution splits the received single channel audio data into multiple audio data, so that the audio data received by the single set of audio input interfaces in speech recognition can be output as multiple channels, and complete audio data can therefore be output.
  • this solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the requirements of speech recognition.
  • an audio output device for use in a field programmable gate array, the device comprising:
  • a receiving module configured to receive multiple audio data input by multiple audio input interfaces
  • a conversion module configured to convert the multi-channel audio data into single-channel audio data
  • the first output module is configured to output the single channel audio data through a single audio output interface.
  • the receiving module comprises:
  • a first determining submodule configured to determine a second sampling rate of the multiple audio data according to the number of the multiple audio input interfaces and the preset first sampling rate of the single audio data
  • a second determining submodule configured to determine, according to the second sampling rate, a number of clocks that are input by the multiple audio data in a single input
  • the first receiving submodule is configured to receive the multiple audio data input by the multiple audio input interfaces according to the number of the clocks.
  • the conversion module includes:
  • a first buffer submodule configured to buffer the multiple audio data according to the number of clocks
  • a sorting sub-module configured to sort the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces, to obtain the sorted single-channel audio data.
  • the first output module comprises:
  • the first output submodule is configured to output the sorted single channel audio data.
  • the receiving module comprises:
  • a second receiving submodule configured to receive multiple audio data input by the multiple audio input interfaces on a rising edge of the clock
  • the first output module includes:
  • a second output submodule configured to output the single audio data through a single audio output interface on a falling edge of the clock.
  • the receiving module comprises:
  • the third receiving submodule is configured to receive the multiple audio data input by the multiple audio input interfaces according to the preset time interval.
  • the above device converts the received multi-channel audio data into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition.
  • this solves the problem that the central processor, having only one set of audio input interfaces, cannot receive multi-channel audio data during speech recognition, and better satisfies the needs of speech recognition.
  • an audio output device for use in a central processing unit, the device comprising:
  • a splitting module configured to split the single audio data into multiple audio data when receiving single audio data
  • a second output module configured to output the multiple audio data.
  • the splitting module comprises:
  • a third determining submodule configured to determine a second sampling rate of the multiple audio data according to the number of the multiple audio input interfaces and the preset first sampling rate of the single audio data
  • a fourth determining submodule configured to determine, according to the second sampling rate, a number of clocks that are input by the multiple audio data in a single input
  • the first splitting module is configured to split the single audio data into multiple audio data according to the number of clocks.
  • the single channel audio data is single channel left channel audio data or single channel right channel audio data;
  • the split module includes:
  • a second splitting module configured to split the single left channel audio data into multiple left channel audio data when receiving the single channel left channel audio data
  • a fifth determining submodule configured to determine the first invalid data in the split multi-channel left channel audio data according to the number of clocks input by the multiple audio data and the number of clocks of the received single channel audio data;
  • a first discarding sub-module configured to discard the first invalid data, and obtain multi-channel left channel audio valid data
  • a third splitting module configured to split the single right channel audio data into multiple right channel audio data when receiving the single right channel audio data
  • a sixth determining submodule configured to determine the second invalid data in the split multi-channel right channel audio data according to the number of clocks input by the multiple audio data and the number of clocks of the received single channel audio data;
  • a second discarding sub-module configured to discard the second invalid data, and obtain multi-channel right channel audio valid data
  • a combination sub-module configured to combine the multi-channel left channel audio effective data and the multi-channel right channel audio effective data according to the correspondence between the multi-channel left channel audio effective data and the multi-channel right channel audio effective data, to obtain the multi-channel audio data.
  • the above device can output the audio data received by the single group audio input interface in the voice recognition by splitting the received single channel audio data into multiple audio data, so that the complete audio data can be output.
  • an audio output device which is applied to a field programmable gate array, the device comprising:
  • a memory for storing the processor executable instructions
  • processor is configured to:
  • the multi-channel audio data is output through a single audio output interface.
  • the above processor is also configured to:
  • the multi-channel audio data input by the multi-channel audio input interface is received according to the number of clocks.
  • the above processor is also configured to:
  • the above processor is also configured to:
  • the sorted single channel audio data is output.
  • the above processor is also configured to:
  • the single channel audio data is output through a single audio output interface at a falling edge of the clock.
  • the above processor is also configured to:
  • Multiple audio data input by multiple audio input interfaces is received at preset time intervals.
  • an audio output device which is applied to a central processing unit, and the device includes:
  • a memory for storing the processor executable instructions
  • processor is configured to:
  • when single channel audio data is received, splitting the single channel audio data into multiple channels of audio data;
  • the multi-channel audio data is output.
  • the above processor is also configured to:
  • the single audio data is split into multiple audio data according to the number of clocks.
  • the above processor is also configured to:
  • the single channel left channel audio data is split into multiple left channel audio data
  • when the single channel right channel audio data is received, splitting the single channel right channel audio data into multiple right channel audio data;
  • a non-transitory computer readable recording medium having recorded thereon a computer program, the program comprising instructions for executing the method of the first aspect of the embodiment of the present invention.
  • a non-transitory computer readable recording medium having recorded thereon a computer program, the program comprising instructions for performing the method of the second aspect of the embodiments of the present invention.
  • a computer program comprising: instructions for performing the method of the first aspect of the embodiment of the invention when the program is executed by a computer.
  • a computer program comprising: instructions for performing the method of the second aspect of the embodiment of the invention when the program is executed by a computer.
  • FIG. 1 is a flowchart of an audio output method according to an embodiment of the present invention.
  • FIG. 2 is a flowchart of an audio output method according to an embodiment of the present invention.
  • FIG. 3 is a flowchart of an audio output method according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of step S32 in an audio output method according to an embodiment of the present invention.
  • FIG. 5 is a flowchart of step S32 in an audio output method according to an embodiment of the present invention.
  • FIG. 6 is a block diagram of an audio recognition system in an audio output method according to an embodiment of the present invention.
  • FIG. 7 is a timing diagram of an IIS signal in an audio output method according to an embodiment of the present invention.
  • FIG. 8 is a flowchart of an audio output method according to an embodiment of the present invention.
  • FIG. 9 is a block diagram of an audio output device according to an embodiment of the present invention.
  • FIG. 10 is a block diagram of a first receiving module in an audio output device according to an embodiment of the present invention.
  • FIG. 11 is a block diagram of a conversion module in an audio output device according to an embodiment of the present invention.
  • FIG. 12 is a block diagram of an audio output device according to an embodiment of the present invention.
  • FIG. 13 is a block diagram of an audio output device according to an embodiment of the present invention.
  • FIG. 14 is a block diagram of a splitting module in an audio output device according to an embodiment of the present invention.
  • FIG. 15 is a block diagram of a splitting module in an audio output device according to an embodiment of the present invention.
  • FIG. 16 is a block diagram of an apparatus for performing an audio output method in accordance with an embodiment of the present invention.
  • An audio output method provided by an embodiment of the present invention involves two execution entities, a field programmable gate array (FPGA) and a central processing unit (CPU). The FPGA is configured to receive multi-channel audio data from multiple audio input interfaces and to convert the multi-channel audio data into single-channel audio data that is output to the CPU; the CPU is configured to split the received single-channel audio data into multi-channel audio data for output, thereby realizing the recognition of multi-channel audio data in speech recognition.
  • The audio output method according to an embodiment of the present invention is described below from the perspective of each of the two execution entities, the FPGA and the CPU.
  • FIG. 1 is a flowchart of an audio output method according to an embodiment of the present invention. As shown in FIG. 1, the method is used in an FPGA in which a plurality of serial buffers corresponding to the multiple audio input interfaces, a transmitting component, and a single audio output interface are disposed. The audio output method includes the following steps S11-S13:
  • In step S11, multi-channel audio data input by the multi-channel audio input interfaces is received.
  • multiple audio data input by multiple audio input interfaces can be received at preset time intervals.
  • the preset time interval is set to 10 ms.
  • In step S12, the multi-channel audio data is converted into single-channel audio data.
  • In step S13, the single-channel audio data is output through a single audio output interface.
  • the above method may also be implemented as the following steps S21-S26:
  • In step S21, a second sampling rate of the multi-channel audio data is determined according to the number of multi-channel audio input interfaces and the preset first sampling rate of the single-channel audio data.
  • for example, the number of multi-channel audio input interfaces is 3, and the preset first sampling rate of the single-channel audio data is 96 kHz.
  • because the audio data is divided into left channel audio data and right channel audio data, the three multi-channel audio input interfaces can receive the audio input of six microphones, so the second sampling rate of the multi-channel audio data is 1/6 of the first sampling rate, that is, 16 kHz.
  • 16 kHz and 96 kHz are both standard sampling rates.
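The rate calculation in step S21 can be sketched as follows. This is only an illustration of the arithmetic described above; the function name `second_sampling_rate` and the parameter `channels_per_interface` are hypothetical and do not appear in the patent.

```python
def second_sampling_rate(num_interfaces: int,
                         first_rate_hz: int,
                         channels_per_interface: int = 2) -> int:
    """Return the per-microphone (second) sampling rate in Hz.

    Assumption: every input interface carries a left and a right channel,
    so the single output stream is shared by
    num_interfaces * channels_per_interface microphones.
    """
    num_microphones = num_interfaces * channels_per_interface
    if first_rate_hz % num_microphones != 0:
        raise ValueError("first sampling rate is not divisible by the microphone count")
    return first_rate_hz // num_microphones


# Example from the text: 3 interfaces, 96 kHz on the single-channel side.
rate = second_sampling_rate(num_interfaces=3, first_rate_hz=96_000)
print(rate)
```

With the values given in the text (3 interfaces, 96 kHz), the function returns 16000, matching the 16 kHz stated above.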
  • In step S22, the number of clocks for a single input of the multi-channel audio data is determined according to the second sampling rate.
  • the FPGA samples once on the rising edge of each clock; when the second sampling rate is 16 kHz, the number of clocks for a single input of the multi-channel audio data is 16.
  • In step S23, the multi-channel audio data input by the multi-channel audio input interfaces is received according to the number of clocks.
  • In step S24, the multi-channel audio data is buffered according to the number of clocks.
  • the FPGA has a serial buffer corresponding to the multi-channel audio input interface for buffering the multi-channel audio data received by the multi-channel audio input interface.
  • In step S25, the buffered multi-channel audio data is sorted according to a preset order of the multi-channel audio input interfaces to obtain the sorted single-channel audio data.
  • for example, the FPGA has three interfaces, audio input interface 1, audio input interface 2 and audio input interface 3, and the preset order is audio input interface 1, then audio input interface 2, then audio input interface 3. Within the buffered multi-channel audio data, the audio data received by audio input interface 1 is placed first, followed by the data received by audio input interface 2 and finally the data received by audio input interface 3. Sorting in this way converts the multi-channel audio data into single-channel audio data.
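A minimal software sketch of the sorting in step S25, assuming each serial buffer holds one 16-bit word and that the first interface in the preset order occupies the most significant position of the combined stream. The function name `serialize`, the dictionary layout and the bit ordering are illustrative assumptions; the patent describes FPGA logic, not this code.

```python
def serialize(buffers: dict, preset_order: list, bits_per_word: int = 16) -> int:
    """Concatenate one buffered word per interface in the preset order.

    buffers maps an interface number to its buffered word; the result is a
    single integer whose highest bits hold the first interface's word.
    """
    stream = 0
    for interface_id in preset_order:
        word = buffers[interface_id]
        if not 0 <= word < (1 << bits_per_word):
            raise ValueError(f"word from interface {interface_id} does not fit in {bits_per_word} bits")
        # Shift what has been collected so far and append the next word.
        stream = (stream << bits_per_word) | word
    return stream


# Hypothetical buffered words for interfaces 1, 2 and 3.
buffers = {1: 0x1111, 2: 0x2222, 3: 0x3333}
single_channel = serialize(buffers, preset_order=[1, 2, 3])
print(hex(single_channel))
```

Keeping the order as an explicit list makes the "preset order" a configuration value, so the receiver only needs to know the same list to undo the interleaving.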
  • In step S26, the sorted single-channel audio data is output.
  • the single-channel audio data is sent by the transmitting component in the FPGA and output to the CPU through the single audio output interface; the CPU then splits the single-channel audio data into multiple audio data for output, finally obtaining complete multi-channel audio data.
  • steps S21-S23 are an embodiment of step S11
  • steps S24-S25 are an embodiment of step S12
  • step S26 is an embodiment of step S13.
  • steps S21-S26 are performed cyclically, and the operations of serial buffer buffering audio data and transmitting component transmitting audio data in the FPGA are performed simultaneously.
  • the FPGA needs to run the above process twice, outputting the left channel audio data and the right channel audio data separately; the CPU then combines them and splits them into multiple audio data, completing the acquisition of one frame of audio data.
  • step S11 may be implemented as follows: receiving multiple audio data input by multiple audio input interfaces on a rising edge of the clock; in this case, step S13 may be implemented as follows: outputting the single-channel audio data through the single audio output interface on the falling edge of the same clock.
  • the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition, which solves the problem that the central processor has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data.
  • the circuit composed of the FPGA includes a reset signal to clear the data each time the system is powered on to ensure the accuracy of the system operation.
  • FIG. 3 is a flowchart of an audio output method according to an embodiment of the present invention. As shown in FIG. 3, the method is used on the CPU side, and includes the following steps S31-S32:
  • In step S31, when single channel audio data is received, the single channel audio data is split into multiple channels of audio data.
  • the single channel audio data received by the CPU is the single channel audio data output by the single audio output interface in the FPGA.
  • In step S32, the multi-channel audio data is output.
  • step S31 can be implemented as the following steps S311-S314:
  • In step S311, the number of multi-channel audio input interfaces is obtained.
  • In step S312, a second sampling rate of the multi-channel audio data is determined according to the number of multi-channel audio input interfaces and the preset first sampling rate of the single-channel audio data.
  • for example, the number of multi-channel audio input interfaces is 3, and the preset first sampling rate of the single-channel audio data is 96 kHz.
  • because the audio data is divided into left channel audio data and right channel audio data, the three multi-channel audio input interfaces can receive the audio input of six microphones, so the second sampling rate of the multi-channel audio data is 1/6 of the first sampling rate, that is, 16 kHz.
  • 16 kHz and 96 kHz are both standard sampling rates.
  • In step S313, the number of clocks for a single input of the multi-channel audio data is determined according to the second sampling rate.
  • the FPGA samples once on the rising edge of each clock; when the second sampling rate is 16 kHz, the number of clocks for a single input of the multi-channel audio data is 16.
  • In step S314, the single channel audio data is split into multiple channels of audio data according to the number of clocks.
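The split in step S314 can be sketched as cutting the single-channel stream into fixed-width pieces, one per input interface, where the width equals the number of clocks per input (16 in the example). The function name `split_stream` and the assumption that the first interface's data sits in the most significant bits are illustrative, not taken from the patent.

```python
def split_stream(stream: int, num_interfaces: int, clocks_per_input: int = 16) -> list:
    """Split a single-channel stream into one word per input interface.

    Assumption: the stream holds num_interfaces words of clocks_per_input
    bits each, with the first interface's word in the highest bits.
    """
    mask = (1 << clocks_per_input) - 1
    words = []
    for index in range(num_interfaces):
        # The first interface sits highest, so it needs the largest shift.
        shift = (num_interfaces - 1 - index) * clocks_per_input
        words.append((stream >> shift) & mask)
    return words


# Hypothetical 48-bit stream carrying three 16-bit words.
words = split_stream(0x111122223333, num_interfaces=3)
print([hex(w) for w in words])
```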
  • step S31 can be implemented as the following steps S51-S57:
  • In step S51, when the single channel left channel audio data is received, the single channel left channel audio data is split into multiple left channel audio data.
  • In step S52, the first invalid data in the split multi-channel left channel audio data is determined according to the number of clocks input by the multi-channel audio data and the number of clocks of the received single-channel audio data.
  • the difference between the number of clocks of the single-channel audio data received by the CPU and the number of clocks input by the multi-channel audio data is the number of bits of the first invalid data in the split multi-channel left channel audio data.
  • In step S53, the first invalid data is discarded, and the multi-channel left channel audio effective data is obtained.
  • In step S54, when the single channel right channel audio data is received, the single channel right channel audio data is split into multiple right channel audio data.
  • In step S55, the second invalid data in the split multi-channel right channel audio data is determined according to the number of clocks input by the multi-channel audio data and the number of clocks of the received single-channel audio data.
  • the difference between the number of clocks of the single-channel audio data received by the CPU and the number of clocks input by the multi-channel audio data is the number of bits of the second invalid data in the split multi-channel right channel audio data.
  • In step S56, the second invalid data is discarded, and the multi-channel right channel audio effective data is obtained.
  • In step S57, the multi-channel left channel audio effective data and the multi-channel right channel audio effective data are combined according to the correspondence between them, to obtain the multi-channel audio data.
  • whether the single channel left channel audio data is split into multiple left channel audio data or the single channel right channel audio data is split into multiple right channel audio data, the split is performed according to the number of clocks for a single input of the multi-channel audio data.
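The discard-and-combine steps can be sketched as below. The sketch assumes that the "number of clocks input by the multi-channel audio data" means the total over all interfaces (3 × 16 = 48), that the valid bits come first in each received frame (as in the specific embodiment described later), and that words are big-endian; the names `discard_invalid`, `to_words` and `combine` are hypothetical.

```python
def discard_invalid(frame: bytes, num_interfaces: int = 3, clocks_per_input: int = 16) -> bytes:
    """Keep only the valid leading part of a received single-channel frame."""
    received_clocks = len(frame) * 8
    valid_clocks = num_interfaces * clocks_per_input
    invalid_clocks = received_clocks - valid_clocks  # number of invalid bits
    if invalid_clocks < 0:
        raise ValueError("frame is shorter than the expected valid data")
    return frame[: valid_clocks // 8]


def to_words(valid: bytes, bytes_per_word: int = 2) -> list:
    """Cut the valid data into one word per input interface (big-endian assumed)."""
    return [int.from_bytes(valid[i:i + bytes_per_word], "big")
            for i in range(0, len(valid), bytes_per_word)]


def combine(left_words: list, right_words: list) -> list:
    """Pair the left and right word of each interface."""
    if len(left_words) != len(right_words):
        raise ValueError("left and right channel counts differ")
    return list(zip(left_words, right_words))


# Hypothetical 12-byte reads: 6 valid bytes followed by 6 invalid (zero) bytes.
left_frame = bytes.fromhex("111122223333") + bytes(6)
right_frame = bytes.fromhex("aaaabbbbcccc") + bytes(6)
frame = combine(to_words(discard_invalid(left_frame)),
                to_words(discard_invalid(right_frame)))
print(frame)
```

Each tuple in `frame` holds the left and right sample of one input interface, which is one way to express the "correspondence" used in step S57.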
  • the received single channel audio data is split into multiple audio data for output, so that the audio data received by the single group audio input interface in the voice recognition can be output in multiple channels. Therefore, the complete audio data can be collected, and the problem that the central processor has only one set of audio input interfaces during speech recognition and the multi-channel audio data cannot be received is solved, and the requirement of speech recognition is more satisfied.
  • the audio recognition system includes a field programmable gate array FPGA and a central processing unit CPU, and a block diagram of the system is shown in FIG. 6.
  • the audio recognition system has a total of six microphone inputs for easy sound source positioning and noise cancellation.
  • the FPGA 61 has three multi-channel audio input interfaces (interface 1, interface 2 and interface 3) and a single audio output interface 4.
  • the three multi-channel audio input interfaces respectively receive the three sets of audio data input by the six microphones.
  • the FPGA 61 has three serial buffers (serial buffer 1, serial buffer 2, and serial buffer 3) corresponding to the interface 1, the interface 2, and the interface 3, respectively.
  • the audio recognition system uses the serial digital audio bus protocol IIS (Inter-IC Sound bus).
  • the clock signal of the multi-channel audio data is obtained by dividing the clock signal of the single-channel audio data by 6. The timing diagram of the single-channel audio data and the multi-channel audio data is shown in FIG. 7, which shows BCLK, LRCK, and the single-channel audio data.
  • BCLK1, BCLK2, and BCLK3 each output 16 clock signals.
  • the CPU 62 needs to read 12 bytes of data from the single audio output interface 4 of the FPGA 61 each time.
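The 12-byte read size follows from the clock relationship above, under the reading that the single-channel clock ticks 6 times for every tick of BCLK1, BCLK2 and BCLK3. The helper names below are hypothetical and only restate that arithmetic.

```python
def bytes_per_read(clock_divider: int, clocks_per_input: int) -> int:
    """Bytes on the single-channel side while each input clocks in one word."""
    single_channel_clocks = clock_divider * clocks_per_input  # 6 * 16 = 96
    return single_channel_clocks // 8


def valid_bytes_per_read(num_interfaces: int, clocks_per_input: int) -> int:
    """Bytes of a read that actually carry buffered input data."""
    return num_interfaces * clocks_per_input // 8  # 3 * 16 = 48 bits


total = bytes_per_read(clock_divider=6, clocks_per_input=16)
valid = valid_bytes_per_read(num_interfaces=3, clocks_per_input=16)
print(total, valid)
```

With the figures from the text this gives 12 bytes per read, of which 6 bytes carry buffered data, consistent with the 6 valid and 6 invalid bytes described for steps S809 and S812.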
  • FIG. 8 is a flowchart of an audio output method in this specific embodiment. As shown in FIG. 8, the method includes the following steps S801-S814:
  • In step S801, the audio recognition system is powered on.
  • In step S802, the FPGA 61 determines that the sampling rate of the multi-channel audio data is 16 kHz according to the number of multi-channel audio input interfaces and the preset 96 kHz sampling rate of the single-channel audio data.
  • In step S803, the FPGA 61 determines, according to the sampling rate of the multi-channel audio data, that the number of clocks for a single input of the multi-channel audio data is 16.
  • In step S804, interface 1, interface 2 and interface 3 of the FPGA 61 synchronously receive the multi-channel left channel audio data.
  • the received multi-channel left channel audio data is obtained by a microphone input and converted into a digital signal by an analog-to-digital converter (A/D converter).
  • In step S805, the three serial buffers in the FPGA 61 respectively buffer the 16 clocks of left channel audio data received by interface 1, interface 2 and interface 3, and the CPU 62 reads 12 bytes of sample data from the single audio output interface 4 of the FPGA 61.
  • serial buffer 1 buffers the left channel audio data received by interface 1, serial buffer 2 buffers the left channel audio data received by interface 2, and serial buffer 3 buffers the left channel audio data received by interface 3. The buffered left channel audio data are represented by WORD1[15..0], WORD2[15..0] and WORD3[15..0], respectively.
  • the sample data read by the CPU 62 is what the transmitting unit sends from the data buffered in the serial buffers. Because this step buffers audio data for the first time, the transmitting unit has no valid data to send to the CPU 62; that is, the sample data read by the CPU 62 is invalid data.
  • In step S806, the CPU 62 discards the invalid data it has read.
  • In step S807, the three serial buffers in the FPGA 61 respectively buffer the 16 clocks of right channel audio data received by interface 1, interface 2 and interface 3; at the same time, the left channel audio data buffered last time in the serial buffers is sorted, and the sending component sends the sorted left channel audio data.
  • the FPGA 61 sorts the left channel audio data buffered last time according to the preset order of the serial buffers: for example, the left channel audio data buffered in serial buffer 1 is placed first, followed by the left channel audio data buffered in serial buffer 2, and finally the left channel audio data buffered in serial buffer 3. The buffered right channel audio data are represented by WORD1[31..16], WORD2[31..16] and WORD3[31..16], respectively.
  • In step S808, the CPU 62 receives 12 bytes of left channel audio data from the single audio output interface 4 of the FPGA 61, and splits the received left channel audio data to obtain the 16-clock left channel audio signals WORD1[15..0], WORD2[15..0] and WORD3[15..0].
  • In step S809, the CPU 62 discards the invalid data in WORD1[15..0], WORD2[15..0] and WORD3[15..0], and obtains the left channel audio effective data. Because the CPU 62 receives only left channel audio data at this point, and each read returns 12 bytes of sample data, that is, 96 clock signals, in the signals WORD1[15..0], WORD2[15..0] and WORD3[15..0] the first 6 bytes are valid left channel audio data and the last 6 bytes are invalid data.
  • In step S810, the three serial buffers in the FPGA 61 respectively buffer the 16 clocks of left channel audio data received by interface 1, interface 2 and interface 3; at the same time, the right channel audio data buffered last time in the serial buffers is sorted, and the sending component sends the sorted right channel audio data.
  • the buffered left channel audio data are represented by WORD1[15..0], WORD2[15..0] and WORD3[15..0], respectively.
  • In step S811, the CPU 62 receives 12 bytes of right channel audio data from the single audio output interface 4 of the FPGA 61, and splits the received right channel audio data to obtain the 16-clock right channel audio signals WORD1[31..16], WORD2[31..16] and WORD3[31..16].
  • In step S812, the CPU 62 discards the invalid data in WORD1[31..16], WORD2[31..16] and WORD3[31..16], and obtains the right channel audio effective data. Because the CPU 62 receives only right channel audio data at this point, and each read returns 12 bytes of sample data, that is, 96 clock signals, in the signals WORD1[31..16], WORD2[31..16] and WORD3[31..16] the first 6 bytes are valid right channel audio data and the last 6 bytes are invalid data.
  • In step S813, the CPU 62 combines the obtained left channel audio effective data and right channel audio effective data according to the correspondence between each left channel audio effective data and each right channel audio effective data, to obtain one frame of complete multi-channel audio data. The process then returns to step S807.
  • In step S814, the CPU 62 outputs the multi-channel audio data.
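The interplay of steps S805-S813 can be modelled in software as a two-stage pipeline: while the FPGA buffers the current channel it sends what was buffered one step earlier, so the first read is invalid and every later pair of reads yields one frame. This is a simplified simulation under stated assumptions (big-endian 16-bit words, valid data in the first 6 bytes of each 12-byte read, zero padding for invalid bytes); all function names are hypothetical and the real design is hardware logic.

```python
def pack(words):
    """Serialize three 16-bit words in buffer order and pad to a 12-byte read."""
    valid = b"".join(word.to_bytes(2, "big") for word in words)
    return valid + bytes(12 - len(valid))


def fpga_reads(samples):
    """Yield the 12-byte block the CPU reads at each step.

    samples is a list of (left_words, right_words) pairs, one pair per frame.
    """
    pending = bytes(12)  # nothing buffered yet, so the first read is invalid (S805)
    for left, right in samples:
        for words in (left, right):
            yield pending          # send what was buffered last time...
            pending = pack(words)  # ...while buffering the current channel
    yield pending                  # flush the last buffered channel


def unpack(read):
    """Drop the 6 invalid trailing bytes and split the rest into three words."""
    valid = read[:6]
    return tuple(int.from_bytes(valid[i:i + 2], "big") for i in range(0, 6, 2))


def cpu_frames(reads):
    """Discard the first read (S806), then pair left and right reads (S808-S813)."""
    reads = iter(reads)
    next(reads)
    frames = []
    for left_read in reads:
        right_read = next(reads)
        frames.append((unpack(left_read), unpack(right_read)))
    return frames


# Two hypothetical frames of (left words, right words) for interfaces 1 to 3.
samples = [((1, 2, 3), (4, 5, 6)), ((7, 8, 9), (10, 11, 12))]
frames = cpu_frames(fpga_reads(samples))
print(frames)
```

The generator makes the one-step delay explicit: the data sent at any step is always the data buffered in the previous step, which is why the CPU must throw away its very first read.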
  • the FPGA converts the received multi-channel audio data into single-channel audio data for output, and the CPU then receives the single-channel audio data and splits it into multi-channel audio data for output.
  • this solves the problem that the CPU has only one set of audio input interfaces and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
  • The present invention also provides an audio output device for performing the above method.
  • FIG. 9 is a block diagram of an audio output device according to an embodiment of the present invention. As shown in FIG. 9, the device is applied to a field programmable gate array (FPGA) and includes:
  • a receiving module 91, configured to receive multi-channel audio data input by multiple audio input interfaces;
  • a conversion module 92, configured to convert the multi-channel audio data into single-channel audio data;
  • a first output module 93, configured to output the single-channel audio data through a single audio output interface.
  • In one embodiment, the receiving module 91 includes:
  • a first determining sub-module 911, configured to determine a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
  • a second determining sub-module 912, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
  • a first receiving sub-module 913, configured to receive, according to the number of clocks, the multi-channel audio data input by the multiple audio input interfaces.
  • In one embodiment, the conversion module 92 includes:
  • a first buffer sub-module 921, configured to buffer the multi-channel audio data according to the number of clocks;
  • a sorting sub-module 922, configured to sort the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
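The buffering and sorting performed by sub-modules 921 and 922 can be illustrated in software. This is a sketch under stated assumptions: the names `serialize_words` and `PRESET_ORDER`, the dictionary used as the buffer, the 16-bit word width, and the big-endian packing are illustrative choices that do not come from the source, and an actual FPGA would implement the equivalent logic in hardware.

```python
import struct
from typing import Dict, Sequence

# Hypothetical preset order of the audio input interfaces.
PRESET_ORDER: Sequence[int] = (1, 2, 3)


def serialize_words(buffered: Dict[int, int],
                    order: Sequence[int] = PRESET_ORDER) -> bytes:
    """Concatenate one buffered 16-bit word per interface in the preset order."""
    out = bytearray()
    for interface in order:
        word = buffered[interface]
        if not 0 <= word <= 0xFFFF:
            raise ValueError(f"interface {interface}: word does not fit in 16 bits")
        out += struct.pack(">H", word)  # assumption: most-significant byte first
    return bytes(out)
```

Because the output order depends only on the preset interface order, not on the order in which the words were buffered, the receiver can recover each interface's word by position alone.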
  • In one embodiment, the first output module 93 includes:
  • a first output sub-module, configured to output the sorted single-channel audio data.
  • In one embodiment, the receiving module 91 includes:
  • a second receiving sub-module 914, configured to receive, on a rising edge of a clock, the multi-channel audio data input by the multiple audio input interfaces;
  • and the first output module 93 includes:
  • a second output sub-module 931, configured to output, on a falling edge of the clock, the single-channel audio data through the single audio output interface.
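The division of work between the two clock edges can be pictured with a small software model. This is only an illustrative sketch: the function name `simulate_edges` and the event labels are hypothetical, and the model abstracts each clock cycle as one rising edge followed by one falling edge rather than modeling real signal timing.

```python
from typing import List, Sequence, Tuple


def simulate_edges(rx_bits: Sequence[int],
                   tx_bits: Sequence[int]) -> List[Tuple[str, int]]:
    """Return the ordered (edge, bit) events for a shared clock.

    One bit is received on each rising edge and one bit is output on each
    falling edge, so receiving and outputting never share an edge.
    """
    if len(rx_bits) != len(tx_bits):
        raise ValueError("need one received and one output bit per clock cycle")
    events: List[Tuple[str, int]] = []
    for rx, tx in zip(rx_bits, tx_bits):
        events.append(("rising: receive", rx))  # clock goes 0 -> 1
        events.append(("falling: output", tx))  # clock goes 1 -> 0
    return events
```

The resulting event list strictly alternates between receive and output events, which is the property the rising-edge and falling-edge sub-modules rely on.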
  • In one embodiment, the receiving module 91 includes:
  • a third receiving sub-module, configured to receive, at a preset time interval, the multi-channel audio data input by the multiple audio input interfaces.
  • With the above device, the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
  • FIG. 13 is a block diagram of an audio output device according to an embodiment of the present invention. As shown in FIG. 13, the device is applied to a central processing unit (CPU) and includes:
  • a splitting module 131, configured to split single-channel audio data into multi-channel audio data when the single-channel audio data is received;
  • a second output module 132, configured to output the multi-channel audio data.
  • In one embodiment, the splitting module 131 includes:
  • an obtaining sub-module 1311, configured to obtain the number of the multiple audio input interfaces;
  • a third determining sub-module 1312, configured to determine a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
  • a fourth determining sub-module 1313, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
  • a first splitting sub-module 1314, configured to split the single-channel audio data into multi-channel audio data according to the number of clocks.
  • In one embodiment, the single-channel audio data is single-channel left channel audio data or single-channel right channel audio data;
  • the splitting module 131 includes:
  • a second splitting sub-module 1315, configured to split the single-channel left channel audio data into multi-channel left channel audio data when the single-channel left channel audio data is received;
  • a fifth determining sub-module 1316, configured to determine first invalid data in the split multi-channel left channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
  • a first discarding sub-module 1317, configured to discard the first invalid data to obtain multi-channel left channel audio valid data;
  • a third splitting sub-module 1318, configured to split the single-channel right channel audio data into multi-channel right channel audio data when the single-channel right channel audio data is received;
  • a sixth determining sub-module 1319, configured to determine second invalid data in the split multi-channel right channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
  • a second discarding sub-module 13110, configured to discard the second invalid data to obtain multi-channel right channel audio valid data;
  • a combining sub-module 13111, configured to combine the multi-channel left channel audio valid data and the multi-channel right channel audio valid data according to the correspondence between them, to obtain the multi-channel audio data.
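The combining step of sub-module 13111 can be sketched as pairing the left and right valid words that belong to the same input interface. This is an illustrative sketch only: the function name `combine_channels` is hypothetical, and it assumes that the "correspondence" is positional, that is, the n-th left word and the n-th right word come from the same interface. The source does not spell out how the correspondence is represented.

```python
from typing import List, Tuple


def combine_channels(left: List[int], right: List[int]) -> List[Tuple[int, int]]:
    """Pair left and right valid words that share the same index (interface)."""
    if len(left) != len(right):
        raise ValueError("left and right channels must cover the same interfaces")
    # Each tuple is one interface's (left, right) sample for the current frame.
    return list(zip(left, right))
```

With three input interfaces, one call produces three `(left, right)` pairs, which together form one frame of multi-channel audio data in the sense used above.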
  • With the above device, the received single-channel audio data is split into multi-channel audio data for output, so that audio data received through a single set of audio input interfaces in speech recognition can be output in multi-channel form and complete audio data can be output. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
  • FIG. 16 is a block diagram of an apparatus for performing an audio output method, according to an exemplary embodiment.
  • The device 1600 can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a gaming console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like.
  • The device 1600 can include one or more of the following components: a processor 1601, a memory 1602, and a communication component 1603.
  • The processor 1601 typically controls the overall operation of the device 1600, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • The processor 1601 can execute instructions to perform all or part of the steps of the above method.
  • Memory 1602 is configured to store various types of data to support operation at device 1600. Examples of such data include instructions for any application or method operating on device 1600, contact data, phone book data, messages, pictures, videos, and the like.
  • The memory 1602 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
  • Communication component 1603 is configured to facilitate wired or wireless communication between device 1600 and other devices.
  • The device 1600 can access a wireless network based on a communication standard, such as Wi-Fi, 2G or 3G, or a combination thereof.
  • The communication component 1603 receives a broadcast signal or broadcast-associated information from an external broadcast management system via a broadcast channel.
  • The communication component 1603 also includes a near field communication (NFC) module to facilitate short range communication.
  • The NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • The device 1600 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components, for performing the above audio output method.
  • A non-transitory computer readable storage medium comprising instructions is also provided, such as the memory 1602 comprising instructions executable by the processor 1601 of the device 1600 to perform the audio output method described above.
  • The non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • The present invention also provides a non-transitory computer readable recording medium having recorded thereon a computer program including instructions for executing the audio output method according to the above-described embodiments of the present invention.
  • The present invention also provides a computer program comprising instructions for performing the audio output method described in the above embodiments of the present invention when the program is executed by a computer.
  • Embodiments of the present invention can be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the invention can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage and optical storage) containing computer-usable program code.
  • The computer program instructions can also be stored in a computer readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer readable memory produce an article of manufacture comprising an instruction device, which implements the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephone Function (AREA)
  • Stereophonic System (AREA)

Abstract

An audio output method and apparatus, which are used for collecting multiple paths of audio data in voice recognition. The method comprises: receiving multiple paths of audio data (S11); converting the multiple paths of audio data into a single path of audio data (S12); and sending same to a processor via a single-path audio output interface (S13). In the technical solution, multiple paths of received audio data are converted into a single path of audio data, so that multiple paths of audio data can be received in voice recognition, thereby solving the problem that multiple paths of audio data cannot be received because a central processing unit of a voice recognition application platform has only one group of audio input interfaces, and satisfying voice recognition requirements to a greater extent.

Description

Audio output method and apparatus
This application is based on, and claims priority to, the invention patent application filed on September 15, 2015, with application number CN201510587775.8, entitled "Audio Output Method and Apparatus", the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of multimedia technologies, and in particular, to an audio output method and apparatus.
Background
The collection, processing and transmission of audio data is an important part of multimedia technology. With the rapid development of multimedia technology, numerous digital audio systems have entered the consumer market, such as digital audio tapes and digital sound processors. For equipment and manufacturers, a standardized information transfer structure can improve a system's adaptability. The IIS (Inter-IC Sound) bus, a bus standard for audio data transmission between digital audio devices, is responsible for data transmission between audio devices and is widely used in various multimedia systems. In the related art, speech recognition applications need input from many microphones (6 or 8 microphones) to support sound source localization, noise cancellation and the like. However, the central processing unit (CPU) used for speech recognition usually has only one set of IIS audio input interfaces, which cannot meet the needs of speech recognition.
Summary of the invention
Embodiments of the present invention provide an audio output method and apparatus for collecting multi-channel audio data in speech recognition.
In a first aspect, an audio output method is provided, applied to a field programmable gate array, comprising the following steps:
Receiving multi-channel audio data input by multiple audio input interfaces;
Converting the multi-channel audio data into single-channel audio data;
Outputting the single-channel audio data through a single audio output interface.
In one embodiment, the receiving multi-channel audio data input by multiple audio input interfaces comprises:
Determining a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
Determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
Receiving, according to the number of clocks, the multi-channel audio data input by the multiple audio input interfaces.
In one embodiment, the converting the multi-channel audio data into single-channel audio data comprises:
Buffering the multi-channel audio data according to the number of clocks;
Sorting the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
In one embodiment, the outputting the single-channel audio data comprises:
Outputting the sorted single-channel audio data.
In one embodiment, the receiving multi-channel audio data input by multiple audio input interfaces comprises:
Receiving, on a rising edge of a clock, the multi-channel audio data input by the multiple audio input interfaces;
The outputting the single-channel audio data through a single audio output interface comprises:
Outputting, on a falling edge of the clock, the single-channel audio data through the single audio output interface.
In one embodiment, the receiving multi-channel audio data input by multiple audio input interfaces comprises:
Receiving, at a preset time interval, the multi-channel audio data input by the multiple audio input interfaces.
Some beneficial effects of the embodiments of the present invention may include:
In the above technical solution, the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
In a second aspect, an audio output method is provided, applied to a central processing unit, comprising the following steps:
When single-channel audio data is received, splitting the single-channel audio data into multi-channel audio data;
Outputting the multi-channel audio data.
In one embodiment, the splitting the single-channel audio data into multi-channel audio data comprises:
Obtaining the number of the multiple audio input interfaces;
Determining a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
Determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
Splitting the single-channel audio data into multi-channel audio data according to the number of clocks.
In one embodiment, the single-channel audio data is single-channel left channel audio data or single-channel right channel audio data; the splitting the single-channel audio data into multi-channel audio data when single-channel audio data is received comprises:
When the single-channel left channel audio data is received, splitting the single-channel left channel audio data into multi-channel left channel audio data;
Determining first invalid data in the split multi-channel left channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
Discarding the first invalid data to obtain multi-channel left channel audio valid data;
When the single-channel right channel audio data is received, splitting the single-channel right channel audio data into multi-channel right channel audio data;
Determining second invalid data in the split multi-channel right channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
Discarding the second invalid data to obtain multi-channel right channel audio valid data;
Combining the multi-channel left channel audio valid data and the multi-channel right channel audio valid data according to the correspondence between them, to obtain the multi-channel audio data.
Some beneficial effects of the embodiments of the present invention may include:
In the above technical solution, the received single-channel audio data is split into multi-channel audio data for output, so that audio data received through a single set of audio input interfaces in speech recognition can be output in multi-channel form and complete audio data can be output. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
In a third aspect, an audio output device is provided, applied to a field programmable gate array, the device comprising:
A receiving module, configured to receive multi-channel audio data input by multiple audio input interfaces;
A conversion module, configured to convert the multi-channel audio data into single-channel audio data;
A first output module, configured to output the single-channel audio data through a single audio output interface.
In one embodiment, the receiving module comprises:
A first determining sub-module, configured to determine a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
A second determining sub-module, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
A first receiving sub-module, configured to receive, according to the number of clocks, the multi-channel audio data input by the multiple audio input interfaces.
In one embodiment, the conversion module comprises:
A first buffer sub-module, configured to buffer the multi-channel audio data according to the number of clocks;
A sorting sub-module, configured to sort the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
In one embodiment, the first output module comprises:
A first output sub-module, configured to output the sorted single-channel audio data.
In one embodiment, the receiving module comprises:
A second receiving sub-module, configured to receive, on a rising edge of a clock, the multi-channel audio data input by the multiple audio input interfaces;
The first output module comprises:
A second output sub-module, configured to output, on a falling edge of the clock, the single-channel audio data through the single audio output interface.
In one embodiment, the receiving module comprises:
A third receiving sub-module, configured to receive, at a preset time interval, the multi-channel audio data input by the multiple audio input interfaces.
Some beneficial effects of the embodiments of the present invention may include:
With the above device, the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
In a fourth aspect, an audio output device is provided, applied to a central processing unit, the device comprising:
A splitting module, configured to split single-channel audio data into multi-channel audio data when the single-channel audio data is received;
A second output module, configured to output the multi-channel audio data.
In one embodiment, the splitting module comprises:
An obtaining sub-module, configured to obtain the number of the multiple audio input interfaces;
A third determining sub-module, configured to determine a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
A fourth determining sub-module, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
A first splitting sub-module, configured to split the single-channel audio data into multi-channel audio data according to the number of clocks.
In one embodiment, the single-channel audio data is single-channel left channel audio data or single-channel right channel audio data; the splitting module comprises:
A second splitting sub-module, configured to split the single-channel left channel audio data into multi-channel left channel audio data when the single-channel left channel audio data is received;
A fifth determining sub-module, configured to determine first invalid data in the split multi-channel left channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
A first discarding sub-module, configured to discard the first invalid data to obtain multi-channel left channel audio valid data;
A third splitting sub-module, configured to split the single-channel right channel audio data into multi-channel right channel audio data when the single-channel right channel audio data is received;
A sixth determining sub-module, configured to determine second invalid data in the split multi-channel right channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
A second discarding sub-module, configured to discard the second invalid data to obtain multi-channel right channel audio valid data;
A combining sub-module, configured to combine the multi-channel left channel audio valid data and the multi-channel right channel audio valid data according to the correspondence between them, to obtain the multi-channel audio data.
Some beneficial effects of the embodiments of the present invention may include:
With the above device, the received single-channel audio data is split into multi-channel audio data for output, so that audio data received through a single set of audio input interfaces in speech recognition can be output in multi-channel form and complete audio data can be output. This solves the problem that the central processing unit has only one set of audio input interfaces during speech recognition and therefore cannot receive multi-channel audio data, and better satisfies the needs of speech recognition.
In a fifth aspect, an audio output device is provided, characterized in that it is applied to a field programmable gate array, the device comprising:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
Receive multi-channel audio data input by multiple audio input interfaces;
Convert the multi-channel audio data into single-channel audio data;
Output the multi-channel audio data through a single audio output interface.
The processor is further configured to:
Determine a second sampling rate of the multi-channel audio data according to the number of the multiple audio input interfaces and a preset first sampling rate of the single-channel audio data;
Determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
Receive, according to the number of clocks, the multi-channel audio data input by the multiple audio input interfaces.
The processor is further configured to:
Buffer the multi-channel audio data according to the number of clocks;
Sort the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
The processor is further configured to:
Output the sorted single-channel audio data.
The processor is further configured to:
Receive the multi-channel audio data on a rising edge of a clock;
Output the single-channel audio data through the single audio output interface on a falling edge of the clock.
The processor is further configured to:
Receive, at a preset time interval, the multi-channel audio data input by the multiple audio input interfaces.
In a sixth aspect, an audio output device is provided, which is applied to a central processing unit and includes:
A processor;
A memory for storing instructions executable by the processor;
Wherein the processor is configured to:
When single-channel audio data is received, split the single-channel audio data into multi-channel audio data;
Output the multi-channel audio data.
The processor is further configured to:
Obtain the number of multi-channel audio input interfaces;
Determine a second sampling rate of the multi-channel audio data according to the number of multi-channel audio input interfaces and a preset first sampling rate of the single-channel audio data;
Determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
Split the single-channel audio data into multi-channel audio data according to the number of clocks.
The processor is further configured to:
When single-channel left channel audio data is received, split the single-channel left channel audio data into multi-channel left channel audio data;
Determine first invalid data in the split multi-channel left channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
Discard the first invalid data to obtain multi-channel left channel audio valid data;
When single-channel right channel audio data is received, split the single-channel right channel audio data into multi-channel right channel audio data;
Determine second invalid data in the split multi-channel right channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
Discard the second invalid data to obtain multi-channel right channel audio valid data;
Combine the multi-channel left channel audio valid data and the multi-channel right channel audio valid data, according to the correspondence between them, to obtain the multi-channel audio data.
In a seventh aspect, a non-transitory computer-readable recording medium is provided, on which a computer program is recorded, the program including instructions for performing the method of the first aspect of the embodiments of the present invention.
In an eighth aspect, a non-transitory computer-readable recording medium is provided, on which a computer program is recorded, the program including instructions for performing the method of the second aspect of the embodiments of the present invention.
In a ninth aspect, a computer program is provided, the program including instructions for performing the method of the first aspect of the embodiments of the present invention when the program is executed by a computer.
In a tenth aspect, a computer program is provided, the program including instructions for performing the method of the second aspect of the embodiments of the present invention when the program is executed by a computer.
Other features and advantages of the invention will be set forth in the description that follows and will in part become apparent from the description, or may be learned by practicing the invention. The objectives and other advantages of the invention may be realized and obtained by the structures particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
Brief Description of the Drawings
The accompanying drawings provide a further understanding of the invention and constitute a part of the specification. Together with the embodiments of the invention they serve to explain the invention and do not limit it. In the drawings:
FIG. 1 is a flowchart of an audio output method according to an embodiment of the present invention;
FIG. 2 is a flowchart of an audio output method according to an embodiment of the present invention;
FIG. 3 is a flowchart of an audio output method according to an embodiment of the present invention;
FIG. 4 is a flowchart of step S32 in an audio output method according to an embodiment of the present invention;
FIG. 5 is a flowchart of step S32 in an audio output method according to an embodiment of the present invention;
FIG. 6 is a block diagram of an audio recognition system in an audio output method according to an embodiment of the present invention;
FIG. 7 is a timing diagram of IIS signals in an audio output method according to an embodiment of the present invention;
FIG. 8 is a flowchart of an audio output method according to a specific embodiment of the present invention;
FIG. 9 is a block diagram of an audio output device according to an embodiment of the present invention;
FIG. 10 is a block diagram of a first receiving module in an audio output device according to an embodiment of the present invention;
FIG. 11 is a block diagram of a conversion module in an audio output device according to an embodiment of the present invention;
FIG. 12 is a block diagram of an audio output device according to an embodiment of the present invention;
FIG. 13 is a block diagram of an audio output device according to an embodiment of the present invention;
FIG. 14 is a block diagram of a splitting module in an audio output device according to an embodiment of the present invention;
FIG. 15 is a block diagram of a splitting module in an audio output device according to an embodiment of the present invention;
FIG. 16 is a block diagram of an apparatus capable of performing an audio output method according to an embodiment of the present invention.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here serve only to illustrate and explain the invention and are not intended to limit it.
An audio output method provided by an embodiment of the present invention involves two execution entities: a field programmable gate array (FPGA) and a central processing unit (CPU). The FPGA receives multi-channel audio data from the multi-channel audio input interfaces, converts it into single-channel audio data, and outputs that data to the CPU. The CPU splits the received single-channel audio data into multi-channel audio data for output, so that multi-channel audio data can be recognized in speech recognition. The method is described below from the perspective of each execution entity in turn, first the FPGA and then the CPU.
FPGA side
FIG. 1 is a flowchart of an audio output method according to an embodiment of the present invention. As shown in FIG. 1, the method is used in an FPGA in which a plurality of serial buffers corresponding to the multi-channel audio input interfaces, a transmitting component, and a single-channel audio output interface are provided. The audio output method includes the following steps S11-S13:
Step S11: receive the multi-channel audio data input by the multi-channel audio input interfaces.
In this step, the multi-channel audio data input by the multi-channel audio input interfaces may be received at a preset time interval, for example 10 ms.
Step S12: convert the multi-channel audio data into single-channel audio data.
Step S13: output the single-channel audio data through the single-channel audio output interface.
In one embodiment, as shown in FIG. 2, the above method may also be implemented as the following steps S21-S26:
Step S21: determine a second sampling rate of the multi-channel audio data according to the number of multi-channel audio input interfaces and a preset first sampling rate of the single-channel audio data.
For example, suppose the number of multi-channel audio input interfaces is 3 and the preset first sampling rate of the single-channel audio data is 96 kHz. In the embodiments of the present invention, audio data is divided into left channel audio data and right channel audio data, so the 3 multi-channel audio input interfaces can receive audio input from 6 microphones. The second sampling rate of the multi-channel audio data is therefore 1/6 of the first sampling rate, that is, 16 kHz. Both 16 kHz and 96 kHz are standard sampling rates.
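The arithmetic in this example can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name and its parameters are hypothetical, and it assumes that every interface carries exactly one left and one right channel, as in the 3-interface, 6-microphone example.

```python
def second_sampling_rate(first_rate_hz: int,
                         num_interfaces: int,
                         channels_per_interface: int = 2) -> int:
    """Derive the per-microphone (second) sampling rate.

    Assumption: the single-channel link is shared evenly by all
    microphone streams, so the first rate is divided by their count.
    """
    streams = num_interfaces * channels_per_interface  # 3 * 2 = 6 microphones
    if first_rate_hz % streams != 0:
        raise ValueError("first rate is not an integer multiple of the stream count")
    return first_rate_hz // streams


# Values from the example: 3 interfaces, 96 kHz on the single-channel side.
print(second_sampling_rate(96_000, 3))
```

With the values from the text, the function returns 16000, matching the 16 kHz stated above.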
Step S22: determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data.
In this step, because the FPGA samples once on the rising edge of each clock, when the second sampling rate is 16 kHz the number of clocks for a single input of the multi-channel audio data is 16.
Step S23: receive, according to the number of clocks, the multi-channel audio data input by the multi-channel audio input interfaces.
Step S24: buffer the multi-channel audio data according to the number of clocks.
In this step, the FPGA contains serial buffers corresponding to the multi-channel audio input interfaces, which buffer the multi-channel audio data received by those interfaces.
Step S25: sort the buffered multi-channel audio data according to a preset order of the multi-channel audio input interfaces to obtain sorted single-channel audio data.
For example, suppose the FPGA has three interfaces, audio input interface 1, audio input interface 2, and audio input interface 3, and the preset order is interface 1, interface 2, interface 3. When the buffered multi-channel audio data is sorted, the audio data received by audio input interface 1 comes first, followed by the data received by audio input interface 2, and finally the data received by audio input interface 3. Sorting in this way converts the multi-channel audio data into single-channel audio data.
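The sorting step amounts to concatenating the buffered data in the preset interface order. The sketch below models this in software for illustration only. In the patent this happens in FPGA logic, and the names `serialize`, `buffers`, and `preset_order` are hypothetical.

```python
def serialize(buffers: dict[int, list[int]], preset_order: list[int]) -> list[int]:
    """Concatenate per-interface buffers in the preset interface order.

    buffers maps an interface number to the samples buffered for it;
    preset_order lists the interface numbers in output order.
    """
    single_channel: list[int] = []
    for interface_id in preset_order:
        single_channel.extend(buffers[interface_id])
    return single_channel


# Placeholder sample values, one short buffer per interface.
buffers = {1: [0x11, 0x12], 2: [0x21, 0x22], 3: [0x31, 0x32]}
print(serialize(buffers, preset_order=[1, 2, 3]))
```

Because the order is fixed and known to both sides, the receiver can undo the concatenation without any per-sample tagging.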
Step S26: output the sorted single-channel audio data.
In this step, the single-channel audio data is sent by the transmitting component in the FPGA and output to the CPU through the single-channel audio output interface. The CPU then splits the single-channel audio data into multi-channel audio data for output, so that complete multi-channel audio data is finally obtained.
In the above method, steps S21-S23 are one implementation of step S11, steps S24-S25 are one implementation of step S12, and step S26 is one implementation of step S13. In a specific implementation, steps S21-S26 are performed cyclically, and the serial buffers in the FPGA buffer audio data at the same time as the transmitting component sends audio data. In addition, because the audio data in the embodiments of the present invention is divided into left channel audio data and right channel audio data, the FPGA needs to run the above flow twice. The left channel audio data and right channel audio data output by the two passes are combined by the CPU and split into multi-channel audio data, which completes the acquisition of one frame of audio data.
In one embodiment, step S11 may be implemented as: receiving, on the rising edge of a clock, the multi-channel audio data input by the multi-channel audio input interfaces. In that case, step S13 may be implemented as: outputting the single-channel audio data through the single-channel audio output interface on the falling edge of the same clock. This technical solution ensures that one sampling operation and one output operation take place within the same clock.
With the technical solution provided by the embodiments of the present invention, the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output in speech recognition. This solves the problem that a central processing unit with only one set of audio input interfaces cannot receive multi-channel audio data during speech recognition, and better meets the needs of speech recognition.
In any of the above embodiments, the circuit formed by the FPGA includes a reset signal, which clears the data each time the system is powered on to ensure that the system works accurately.
CPU side
FIG. 3 is a flowchart of an audio output method according to an embodiment of the present invention. As shown in FIG. 3, the method is used on the CPU side and includes the following steps S31-S32:
Step S31: when single-channel audio data is received, split the single-channel audio data into multi-channel audio data.
In this step, the single-channel audio data received by the CPU is the single-channel audio data output by the single-channel audio output interface of the FPGA.
Step S32: output the multi-channel audio data.
In one embodiment, as shown in FIG. 4, step S31 may be implemented as the following steps S311-S314:
Step S311: obtain the number of multi-channel audio input interfaces.
Step S312: determine a second sampling rate of the multi-channel audio data according to the number of multi-channel audio input interfaces and a preset first sampling rate of the single-channel audio data.
For example, suppose the number of multi-channel audio input interfaces is 3 and the preset first sampling rate of the single-channel audio data is 96 kHz. In the embodiments of the present invention, audio data is divided into left channel audio data and right channel audio data, so the 3 multi-channel audio input interfaces can receive audio input from 6 microphones. The second sampling rate of the multi-channel audio data is therefore the first sampling rate divided by 6, that is, 16 kHz. Both 16 kHz and 96 kHz are standard sampling rates.
Step S313: determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data.
In this step, because the FPGA samples once on the rising edge of each clock, when the second sampling rate is 16 kHz the number of clocks for a single input of the multi-channel audio data is 16.
Step S314: split the single-channel audio data into multi-channel audio data according to the number of clocks.
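Step S314 can be pictured as cutting the serialized bit stream into consecutive groups, one group per interface, each as long as the number of clocks. The sketch below is illustrative only: the function name and the representation of the stream as a Python list of bits are assumptions, and any bits beyond the last group are simply ignored here.

```python
def split_by_clocks(single_bits: list[int],
                    clocks_per_input: int,
                    num_interfaces: int) -> list[list[int]]:
    """Cut a serialized bit sequence into one group of bits per interface.

    Assumption: the groups appear back to back in the preset interface
    order, each exactly clocks_per_input bits long.
    """
    needed = clocks_per_input * num_interfaces
    if len(single_bits) < needed:
        raise ValueError("not enough bits for every interface")
    return [
        single_bits[i * clocks_per_input:(i + 1) * clocks_per_input]
        for i in range(num_interfaces)
    ]


# Toy stream: 3 interfaces, 4 clocks each, so that the groups are easy to read.
stream = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 1, 0]
print(split_by_clocks(stream, clocks_per_input=4, num_interfaces=3))
```

In the example from the text the same call would use 16 clocks and 3 interfaces.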
In one embodiment, because the left channel audio data and the right channel audio data are collected separately, step S31 may be implemented as the following steps S51-S57, as shown in FIG. 5:
Step S51: when single-channel left channel audio data is received, split the single-channel left channel audio data into multi-channel left channel audio data.
Step S52: determine the first invalid data in the split multi-channel left channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data.
In this step, the difference between the number of clocks of the single-channel audio data received by the CPU and the number of clocks for a single input of the multi-channel audio data is the number of bits of the first invalid data in the split multi-channel left channel audio data.
Step S53: discard the first invalid data to obtain multi-channel left channel audio valid data.
Step S54: when single-channel right channel audio data is received, split the single-channel right channel audio data into multi-channel right channel audio data.
Step S55: determine the second invalid data in the split multi-channel right channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data.
In this step, the difference between the number of clocks of the single-channel audio data received by the CPU and the number of clocks for a single input of the multi-channel audio data is the number of bits of the second invalid data in the split multi-channel right channel audio data.
Step S56: discard the second invalid data to obtain multi-channel right channel audio valid data.
Step S57: combine the multi-channel left channel audio valid data and the multi-channel right channel audio valid data, according to the correspondence between them, to obtain the multi-channel audio data.
In this embodiment, both the splitting of single-channel left channel audio data into multi-channel left channel audio data and the splitting of single-channel right channel audio data into multi-channel right channel audio data are performed according to the number of clocks for a single input of the multi-channel audio data.
With the technical solution provided by the embodiments of the present invention, the received single-channel audio data is split into multi-channel audio data for output, so that audio data received through a single set of audio input interfaces in speech recognition can be output in multi-channel form and complete audio data can be collected. This solves the problem that a central processing unit with only one set of audio input interfaces cannot receive multi-channel audio data during speech recognition, and better meets the needs of speech recognition.
The audio output method provided by the present invention is described below through a specific embodiment.
In this specific embodiment, the audio recognition system includes a field programmable gate array (FPGA) and a central processing unit (CPU). A block diagram of the system is shown in FIG. 6. As FIG. 6 shows, the audio recognition system has 6 microphone inputs in total, which facilitates sound source localization, noise cancellation, and the like. The FPGA 61 has 3 multi-channel audio input interfaces (interface 1, interface 2, and interface 3) and one single-channel audio output interface 4. The 3 multi-channel audio input interfaces receive the 3 groups of audio data input by the 6 microphones. The FPGA 61 contains 3 serial buffers (serial buffer 1, serial buffer 2, and serial buffer 3) corresponding to interface 1, interface 2, and interface 3 respectively. The audio recognition system uses the serial digital audio bus protocol IIS (Inter-IC Sound bus). The IIS signals are defined in Table 1 below, and the timing diagram is shown in FIG. 7.
Table 1
Signal name | Signal direction | Signal description
BCLK | Output | Bit sampling clock
LRCK | Output | Left and right channel synchronization clock
RXD | Input | Input signal
TXD | Output | Output signal
In this embodiment, the left and right channels are 16 bits each, and the sampling rate of the single-channel audio data is set to 96 kHz, that is, LRCK is 96 kHz. BCLK = LRCK × data width × number of left and right channels, where the data width is 16 bits and the number of left and right channels is 2 (the audio data in this embodiment is stereo data), so BCLK is 3.072 MHz. From the number of multi-channel audio input interfaces and the preset sampling rate of the single-channel audio data, it follows that the sampling rate of the multi-channel audio data is 1/6 of the sampling rate of the single-channel audio data. In other words, the clock signal of the multi-channel audio data is obtained by dividing the clock signal of the single-channel audio data by 6. The timing of the single-channel and multi-channel audio data is shown in FIG. 7, which gives the timing of BCLK, LRCK, RXD, and TXD for the single-channel audio data and the timing of BCLK1 for the multi-channel audio data, where BCLK1 denotes the bit sampling clock of one channel of the multi-channel audio data. With LRCKm (m = 1, 2, 3) denoting the left and right channel synchronization clocks of the multi-channel audio data, LRCK1, LRCK2, and LRCK3 are all 16 kHz. Both 16 kHz and 96 kHz are standard sampling rates.
It also follows that each time the CPU 62 reads sample data it reads 96 clock signals, that is, BCLK outputs 96 clock signals each time. With BCLKn (n = 1, 2, 3) denoting the bit sampling clocks of the multi-channel audio data, BCLK1, BCLK2, and BCLK3 each output 16 clock signals each time. Furthermore, for BCLK to output 96 clock signals each time, the CPU 62 needs to read 12 bytes of data from the single-channel audio output interface 4 of the FPGA 61 on each read.
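The clock figures in this embodiment can be checked with a few lines of arithmetic. The sketch below uses only values stated in the text. The variable and function names are illustrative, and treating the 96 clocks per read as 6 × 16 is one reading of the text, which states both numbers but does not spell out the product.

```python
# Values stated in the embodiment.
LRCK_HZ = 96_000        # single-channel left/right synchronization clock
DATA_WIDTH_BITS = 16    # bits per channel
NUM_CHANNELS = 2        # stereo: left and right
DIVIDE_RATIO = 6        # multi-channel clock = single-channel clock / 6


def bit_clock_hz(lrck_hz: int, data_width_bits: int, num_channels: int) -> int:
    """BCLK = LRCK x data width x number of channels."""
    return lrck_hz * data_width_bits * num_channels


bclk_hz = bit_clock_hz(LRCK_HZ, DATA_WIDTH_BITS, NUM_CHANNELS)
lrck_m_hz = LRCK_HZ // DIVIDE_RATIO               # LRCK1, LRCK2, LRCK3
clocks_per_read = DIVIDE_RATIO * DATA_WIDTH_BITS  # assumed reading: 6 x 16
bytes_per_read = clocks_per_read // 8

print(bclk_hz, lrck_m_hz, clocks_per_read, bytes_per_read)
```

The results, 3 072 000 Hz, 16 000 Hz, 96 clocks, and 12 bytes, agree with the figures given in the embodiment.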
图8为本具体实施例中一种音频输出方法的流程图。如图8所示,包括以下步骤S801-S814: FIG. 8 is a flowchart of an audio output method in the specific embodiment. As shown in FIG. 8, the following steps S801-S814 are included:
步骤S801,音频识别***上电。In step S801, the audio recognition system is powered on.
步骤S802,FPGA61根据多路音频输入接口的数量和预设的单路音频数据的采样率96KHz,确定出多路音频数据的采样率为16KHz。In step S802, the FPGA 61 determines that the sampling rate of the multi-channel audio data is 16 KHz according to the number of the multiple audio input interfaces and the sampling rate of the preset single-channel audio data of 96 KHz.
步骤S803,FPGA61根据多路音频数据的采样率,确定多路音频数据单次输入的时钟个数为16个时钟。In step S803, the FPGA 61 determines, according to the sampling rate of the multi-channel audio data, that the number of clocks input by the multi-channel audio data is 16 clocks.
步骤S804,FPGA61中的多路音频输入接口1、接口2和接口3同步接收多路左声道音频数据。其中,接收到的多路左声道音频数据是由麦克输入、并经过模数转换器(A/D转换器)将音频模拟信号转换成数字信号得到的。In step S804, the multi-channel audio input interface 1, the interface 2 and the interface 3 in the FPGA 61 synchronously receive the multi-channel left channel audio data. The received multi-channel left channel audio data is obtained by a microphone input and converted into a digital signal by an analog-to-digital converter (A/D converter).
步骤S805,FPGA61中的3个串行缓存器分别缓存接口1、接口2和接口3接收到的16个时钟的左声道音频数据,同时CPU62从FPGA61的单路音频输出接口4接收12字节的采样数据。其中,串行缓存器1缓存接口1接收到的左声道音频数据,串行缓存器2缓存接口2接收到的左声道音频数据,串行缓存器3缓存接口3接收到的左声道音频数据,缓存的左声道音频数据分别用WORD1[15..0],WORD2[15..0]和WORD3[15..0]表示。CPU62所读取的采样数据是由发送部件将串行缓存器中缓存的数据发送出去得到的,由于该步骤是首次缓存音频数据,因此发送部件中并没有有效数据可发送至CPU62,也就是说,CPU62所读取的采样数据为无效数据。Step S805, the three serial buffers in the FPGA 61 respectively buffer the left channel audio data of the 16 clocks received by the interface 1, the interface 2 and the interface 3, and the CPU 62 receives the 12 bytes from the single audio output interface 4 of the FPGA 61. Sampling data. Wherein, the serial buffer 1 buffers the left channel audio data received by the interface 1, the serial buffer 2 buffers the left channel audio data received by the interface 2, and the serial buffer 3 buffers the buffer 3 receives the left channel. Audio data, cached left channel audio data are represented by WORD1[15..0], WORD2[15..0] and WORD3[15..0] respectively. The sample data read by the CPU 62 is obtained by the transmitting unit transmitting the data buffered in the serial buffer. Since the step is to buffer the audio data for the first time, no valid data is transmitted to the CPU 62 in the transmitting unit, that is, The sample data read by the CPU 62 is invalid data.
步骤S806,CPU62将读取的无效数据丢弃。In step S806, the CPU 62 discards the read invalid data.
步骤S807,FPGA61中的3个串行缓存器分别缓存接口1、接口2和接口3接收到的16个时钟的右声道音频数据,并将串行缓存器中上一次缓存的左声道音频数据进行排序,同时发送部件将排序后的左声道音频数据发送出去。该步骤中,FPGA61对串行缓存器中上一次缓存的左声道音频数据进行排序是按照串行缓存器的预设顺序进行排序的,例如,串行缓存器1中缓存的左声道音频数据排在最前面,然后是串行缓存器2中缓存的左声道音频数据,最后是串行缓存器3中缓存的左声道音频数据,缓存的右声道音频数据分别用WORD1[31..16],WORD2[31..16]和WORD3[31..16]表示。Step S807, the three serial buffers in the FPGA 61 respectively buffer the right channel audio data of the 16 clocks received by the interface 1, the interface 2 and the interface 3, and the left channel audio of the last buffer in the serial buffer. The data is sorted while the sending component sends the sorted left channel audio data. In this step, the FPGA 61 sorts the left-channel audio data of the last buffer in the serial buffer according to the preset order of the serial buffer, for example, the left channel audio buffered in the serial buffer 1. The data is ranked first, then the left channel audio data buffered in the serial buffer 2, and finally the left channel audio data buffered in the serial buffer 3, and the buffered right channel audio data is respectively WORD1 [31 ..16], WORD2[31..16] and WORD3[31..16] indicate.
步骤S808,CPU62从FPGA61的单路音频接口4接收12字节的左声道音频数据,并将接收到的左声道音频数据进行拆分,分别得到16个时钟的左声道音频信号WORD1[15..0],WORD2[15..0]和WORD3[15..0]。Step S808, the CPU 62 receives 12 bytes of left channel audio data from the single audio interface 4 of the FPGA 61, and splits the received left channel audio data to obtain 16 clock left channel audio signals WORD1 [ 15..0], WORD2[15..0] and WORD3[15..0].
步骤S809,CPU62丢弃WORD1[15..0],WORD2[15..0]和WORD3[15..0]中的无效数据,获得左声道音频有效数据。由于CPU62接收到的仅有左声道音频数据,但CPU62每次接收12字节的采样数据,即96个时钟信号,因此信号WORD1[15..0],WORD2[15..0]和WORD3[15..0]中,前6个字节分别是左声道音频有效数据,后6个字节则为无效数据。In step S809, the CPU 62 discards the invalid data in WORD1[15..0], WORD2[15..0] and WORD3[15..0], and obtains the left channel audio effective data. Since the CPU 62 receives only the left channel audio data, the CPU 62 receives 12 bytes of sample data each time, that is, 96 clock signals, so the signals WORD1 [15..0], WORD2 [15..0], and WORD3. In [15..0], the first 6 bytes are the left channel audio valid data, and the last 6 bytes are invalid data.
Step S810: the three serial buffers in FPGA 61 respectively buffer the 16 clocks of left-channel audio data received on interface 1, interface 2, and interface 3, and the right-channel audio data buffered in the previous round is sorted; at the same time, the sending component sends out the sorted right-channel audio data. The buffered left-channel audio data are denoted WORD1[15..0], WORD2[15..0], and WORD3[15..0].
Step S811: CPU 62 receives 12 bytes of right-channel audio data from single-channel audio interface 4 of FPGA 61 and splits the received data to obtain the 16-clock right-channel audio signals WORD1[31..16], WORD2[31..16], and WORD3[31..16].
Step S812: CPU 62 discards the invalid data in WORD1[31..16], WORD2[31..16], and WORD3[31..16] to obtain the valid right-channel audio data. Although CPU 62 receives only right-channel audio data, it receives 12 bytes of sampled data each time, that is, 96 clock signals. Therefore, of the data received for WORD1[31..16], WORD2[31..16], and WORD3[31..16], the first 6 bytes are valid right-channel audio data and the last 6 bytes are invalid data.
Step S813: CPU 62 combines the obtained valid left-channel audio data and valid right-channel audio data according to the correspondence between each channel's valid left-channel data and valid right-channel data, obtaining one complete frame of multi-channel audio data. The process then returns to step S807.
Step S814: CPU 62 outputs the multi-channel audio data.
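As an illustration of steps S811 and S812, the following Python sketch shows how a 12-byte right-channel transfer might be split into WORD1[31..16], WORD2[31..16], and WORD3[31..16], with the trailing invalid bytes discarded. It is a minimal sketch rather than the implementation of the embodiment: the function name split_half_frame, the constants, and the big-endian byte order of each 16-bit word are assumptions made for the example.

```python
import struct
from typing import List

FRAME_BYTES = 12     # bytes the CPU reads per transfer (96 clocks), as in step S812
NUM_INTERFACES = 3   # interfaces 1, 2 and 3 of the FPGA
WORD_BITS = 16       # clocks buffered per interface for one channel side


def split_half_frame(frame: bytes) -> List[int]:
    """Return the valid 16-bit words of one transfer, one word per interface.

    The first NUM_INTERFACES * WORD_BITS / 8 bytes are treated as valid data;
    the remaining bytes of the transfer are discarded as invalid.
    """
    if len(frame) != FRAME_BYTES:
        raise ValueError(f"expected {FRAME_BYTES} bytes, got {len(frame)}")
    valid_len = NUM_INTERFACES * WORD_BITS // 8   # 6 bytes in this example
    valid = frame[:valid_len]                     # drop the invalid tail
    # Assumption: each word arrives most significant byte first.
    return list(struct.unpack(f">{NUM_INTERFACES}H", valid))


if __name__ == "__main__":
    transfer = bytes.fromhex("123456789abc") + bytes(6)
    print([hex(w) for w in split_half_frame(transfer)])
```

The same routine would apply unchanged to a left-channel transfer, since both sides carry 6 valid bytes followed by 6 invalid bytes in this example.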
With the technical solution of this embodiment, the FPGA converts the received multi-channel audio data into single-channel audio data for output, and the CPU then receives the single-channel audio data and splits it back into multi-channel audio data for output. This solves the problem that a CPU with only one set of audio input interfaces cannot receive multi-channel audio data during speech recognition, and better meets the needs of speech recognition.
Corresponding to the audio output method in the above embodiments, the present invention further provides an audio output apparatus for performing the method.
FIG. 9 is a block diagram of an audio output apparatus according to an embodiment of the present invention. As shown in FIG. 9, the apparatus is used in a field programmable gate array (FPGA) and includes:
a receiving module 91, configured to receive multi-channel audio data input through multiple audio input interfaces;
a conversion module 92, configured to convert the multi-channel audio data into single-channel audio data;
a first output module 93, configured to output the single-channel audio data through a single audio output interface.
In one embodiment, as shown in FIG. 10, the receiving module 91 includes:
a first determining sub-module 911, configured to determine a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
a second determining sub-module 912, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
a first receiving sub-module 913, configured to receive, according to the number of clocks, the multi-channel audio data input through the multiple audio input interfaces.
In one embodiment, as shown in FIG. 11, the conversion module 92 includes:
a first buffering sub-module 921, configured to buffer the multi-channel audio data according to the number of clocks;
a sorting sub-module 922, configured to sort the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
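The buffering and sorting performed by sub-modules 921 and 922 can be modeled in software as follows. This is only a behavioral sketch in Python, not FPGA logic: the function name serialize_words, the dictionary keyed by interface number, and the big-endian layout of each word are assumptions made for the example.

```python
from typing import Dict, List


def serialize_words(buffers: Dict[int, int], order: List[int],
                    word_bits: int = 16) -> bytes:
    """Concatenate one buffered word per interface in the preset interface order.

    buffers maps an interface number to the word buffered for it;
    order is the preset order in which the interfaces are emitted.
    """
    if word_bits % 8 != 0:
        raise ValueError("word_bits must be a multiple of 8")
    out = bytearray()
    for interface in order:
        word = buffers[interface]
        if not 0 <= word < (1 << word_bits):
            raise ValueError(f"word for interface {interface} does not fit")
        # Assumption: most significant byte of each word is emitted first.
        out += word.to_bytes(word_bits // 8, "big")
    return bytes(out)


if __name__ == "__main__":
    buffered = {1: 0x1234, 2: 0x5678, 3: 0x9ABC}
    print(serialize_words(buffered, [1, 2, 3]).hex())
```

Because the receiver relies on the same preset order to split the stream again, the order list would have to be shared by both sides.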
In one embodiment, the first output module 93 includes:
a first output sub-module, configured to output the sorted single-channel audio data.
In one embodiment, as shown in FIG. 12, the receiving module 91 includes:
a second receiving sub-module 914, configured to receive, on the rising edge of a clock, the multi-channel audio data input through the multiple audio input interfaces;
and the first output module 93 includes:
a second output sub-module 931, configured to output the single-channel audio data through the single audio output interface on the falling edge of the clock.
In one embodiment, the receiving module 91 includes:
a third receiving sub-module, configured to receive, at a preset time interval, the multi-channel audio data input through the multiple audio input interfaces.
With the apparatus provided by this embodiment of the present invention, the received multi-channel audio data is converted into single-channel audio data for output, so that multi-channel audio data can be received and output during speech recognition. This solves the problem that a central processing unit with only one set of audio input interfaces cannot receive multi-channel audio data during speech recognition, and better meets the needs of speech recognition.
FIG. 13 is a block diagram of an audio output apparatus according to an embodiment of the present invention. As shown in FIG. 13, the apparatus is used in a central processing unit (CPU) and includes:
a splitting module 131, configured to split single-channel audio data into multi-channel audio data when the single-channel audio data is received;
a second output module 132, configured to output the multi-channel audio data.
In one embodiment, as shown in FIG. 14, the splitting module 131 includes:
an obtaining sub-module 1311, configured to obtain the number of audio input interfaces;
a third determining sub-module 1312, configured to determine a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
a fourth determining sub-module 1313, configured to determine, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
a first splitting sub-module 1314, configured to split the single-channel audio data into multi-channel audio data according to the number of clocks.
In one embodiment, as shown in FIG. 15, the single-channel audio data is single-channel left-channel audio data or single-channel right-channel audio data, and the splitting module 131 includes:
a second splitting sub-module 1315, configured to split the single-channel left-channel audio data into multi-channel left-channel audio data when the single-channel left-channel audio data is received;
a fifth determining sub-module 1316, configured to determine first invalid data in the split multi-channel left-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
a first discarding sub-module 1317, configured to discard the first invalid data to obtain valid multi-channel left-channel audio data;
a third splitting sub-module 1318, configured to split the single-channel right-channel audio data into multi-channel right-channel audio data when the single-channel right-channel audio data is received;
a sixth determining sub-module 1319, configured to determine second invalid data in the split multi-channel right-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
a second discarding sub-module 13110, configured to discard the second invalid data to obtain valid multi-channel right-channel audio data;
a combining sub-module 13111, configured to combine the valid multi-channel left-channel audio data and the valid multi-channel right-channel audio data according to the correspondence between them, to obtain the multi-channel audio data.
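The combining step of sub-module 13111 can be sketched as below, using the bit layout from the earlier example in which the left channel occupies WORDn[15..0] and the right channel occupies WORDn[31..16]. This is a hedged Python illustration: the function name combine_channels and the choice to represent each interface's frame as one 32-bit integer are assumptions made for the example.

```python
from typing import List


def combine_channels(left_words: List[int], right_words: List[int]) -> List[int]:
    """Pair the valid left and right 16-bit words of each interface.

    Element i of both lists belongs to the same input interface; the result
    holds one 32-bit word per interface with the right channel in bits 31..16
    and the left channel in bits 15..0.
    """
    if len(left_words) != len(right_words):
        raise ValueError("left and right word counts differ")
    combined = []
    for left, right in zip(left_words, right_words):
        if not (0 <= left <= 0xFFFF and 0 <= right <= 0xFFFF):
            raise ValueError("each word must fit in 16 bits")
        combined.append((right << 16) | left)
    return combined


if __name__ == "__main__":
    frame = combine_channels([0x1234, 0x5678, 0x9ABC], [0xABCD, 0xEF01, 0x2345])
    print([f"{w:08X}" for w in frame])
```

Pairing by list position reflects the correspondence between left-channel and right-channel data of the same interface; a real receiver would also need to make sure both halves belong to the same sampling instant.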
With the apparatus provided by this embodiment of the present invention, the received single-channel audio data is split into multi-channel audio data for output, so that audio data received through a single set of audio input interfaces during speech recognition can be output in multi-channel form and the complete audio data can be output. This solves the problem that a central processing unit with only one set of audio input interfaces cannot receive multi-channel audio data during speech recognition, and better meets the needs of speech recognition.
FIG. 16 is a block diagram of an apparatus capable of performing the audio output method, according to an exemplary embodiment. For example, the apparatus 1600 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to FIG. 16, the apparatus 1600 may include one or more of the following components: a processor 1601, a memory 1602, and a communication component 1603.
The processor 1601 generally controls the overall operation of the apparatus 1600, such as operations associated with display, telephone calls, data communication, camera operation, and recording operation. The processor 1601 can execute instructions to perform all or some of the steps of the method described above.
The memory 1602 is configured to store various types of data to support operation of the apparatus 1600. Examples of such data include instructions for any application or method operating on the apparatus 1600, contact data, phone book data, messages, pictures, videos, and the like. The memory 1602 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.
The communication component 1603 is configured to facilitate wired or wireless communication between the apparatus 1600 and other devices. The apparatus 1600 can access a wireless network based on a communication standard, such as Wi-Fi, 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1603 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1603 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 1600 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the audio output method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, for example the memory 1602 including instructions, where the instructions are executable by the processor 1601 of the apparatus 1600 to perform the audio output method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
The present invention further provides a non-transitory computer-readable recording medium on which a computer program is recorded, the program including instructions for performing the audio output method described in the above embodiments of the present invention.
The present invention further provides a computer program, the program including instructions for performing the audio output method described in the above embodiments of the present invention when the program is executed by a computer.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various modifications and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include them.

Claims (18)

  1. An audio output method, characterized in that it is applied to a field programmable gate array, the method comprising:
    receiving multi-channel audio data input through multiple audio input interfaces;
    converting the multi-channel audio data into single-channel audio data;
    outputting the single-channel audio data through a single audio output interface.
  2. The method according to claim 1, characterized in that receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    determining a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
    determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
    receiving, according to the number of clocks, the multi-channel audio data input through the multiple audio input interfaces.
  3. The method according to claim 2, characterized in that converting the multi-channel audio data into single-channel audio data comprises:
    buffering the multi-channel audio data according to the number of clocks;
    sorting the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
  4. The method according to claim 3, characterized in that outputting the single-channel audio data comprises:
    outputting the sorted single-channel audio data.
  5. The method according to claim 1, characterized in that
    receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    receiving the multi-channel audio data on the rising edge of a clock;
    and outputting the single-channel audio data through the single audio output interface comprises:
    outputting the single-channel audio data through the single audio output interface on the falling edge of the clock.
  6. The method according to claim 1, characterized in that receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    receiving, at a preset time interval, the multi-channel audio data input through the multiple audio input interfaces.
  7. An audio output method, characterized in that it is applied to a central processing unit, the method comprising:
    when single-channel audio data is received, splitting the single-channel audio data into multi-channel audio data;
    outputting the multi-channel audio data.
  8. The method according to claim 7, characterized in that splitting the single-channel audio data into multi-channel audio data comprises:
    obtaining the number of audio input interfaces;
    determining a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
    determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
    splitting the single-channel audio data into multi-channel audio data according to the number of clocks.
  9. The method according to claim 7, characterized in that the single-channel audio data is single-channel left-channel audio data or single-channel right-channel audio data, and splitting the single-channel audio data into multi-channel audio data when the single-channel audio data is received comprises:
    when the single-channel left-channel audio data is received, splitting the single-channel left-channel audio data into multi-channel left-channel audio data;
    determining first invalid data in the split multi-channel left-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
    discarding the first invalid data to obtain valid multi-channel left-channel audio data;
    when the single-channel right-channel audio data is received, splitting the single-channel right-channel audio data into multi-channel right-channel audio data;
    determining second invalid data in the split multi-channel right-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
    discarding the second invalid data to obtain valid multi-channel right-channel audio data;
    combining the valid multi-channel left-channel audio data and the valid multi-channel right-channel audio data according to the correspondence between them, to obtain the multi-channel audio data.
  10. An audio output apparatus, characterized in that the apparatus comprises:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform an audio output method, the method being applied to a field programmable gate array and comprising:
    receiving multi-channel audio data input through multiple audio input interfaces;
    converting the multi-channel audio data into single-channel audio data;
    outputting the single-channel audio data through a single audio output interface.
  11. The apparatus according to claim 10, wherein receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    determining a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
    determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
    receiving, according to the number of clocks, the multi-channel audio data input through the multiple audio input interfaces.
  12. The apparatus according to claim 11, wherein converting the multi-channel audio data into single-channel audio data comprises:
    buffering the multi-channel audio data according to the number of clocks;
    sorting the buffered multi-channel audio data according to a preset order of the multiple audio input interfaces to obtain sorted single-channel audio data.
  13. The apparatus according to claim 12, wherein outputting the single-channel audio data comprises:
    outputting the sorted single-channel audio data.
  14. The apparatus according to claim 10, wherein receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    receiving the multi-channel audio data on the rising edge of a clock;
    and outputting the single-channel audio data through the single audio output interface comprises:
    outputting the single-channel audio data through the single audio output interface on the falling edge of the clock.
  15. The apparatus according to claim 10, wherein receiving the multi-channel audio data input through the multiple audio input interfaces comprises:
    receiving, at a preset time interval, the multi-channel audio data input through the multiple audio input interfaces.
  16. An audio output apparatus, characterized in that the apparatus comprises:
    a processor;
    a memory for storing instructions executable by the processor;
    wherein the processor is configured to perform an audio output method, the method being applied to a central processing unit and comprising:
    when single-channel audio data is received, splitting the single-channel audio data into multi-channel audio data;
    outputting the multi-channel audio data.
  17. The apparatus according to claim 16, wherein splitting the single-channel audio data into multi-channel audio data comprises:
    obtaining the number of audio input interfaces;
    determining a second sampling rate of the multi-channel audio data according to the number of audio input interfaces and a preset first sampling rate of the single-channel audio data;
    determining, according to the second sampling rate, the number of clocks for a single input of the multi-channel audio data;
    splitting the single-channel audio data into multi-channel audio data according to the number of clocks.
  18. The apparatus according to claim 16, wherein the single-channel audio data is single-channel left-channel audio data or single-channel right-channel audio data, and splitting the single-channel audio data into multi-channel audio data when the single-channel audio data is received comprises:
    when the single-channel left-channel audio data is received, splitting the single-channel left-channel audio data into multi-channel left-channel audio data;
    determining first invalid data in the split multi-channel left-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
    discarding the first invalid data to obtain valid multi-channel left-channel audio data;
    when the single-channel right-channel audio data is received, splitting the single-channel right-channel audio data into multi-channel right-channel audio data;
    determining second invalid data in the split multi-channel right-channel audio data according to the number of clocks for a single input of the multi-channel audio data and the number of clocks of the received single-channel audio data;
    discarding the second invalid data to obtain valid multi-channel right-channel audio data;
    combining the valid multi-channel left-channel audio data and the valid multi-channel right-channel audio data according to the correspondence between them, to obtain the multi-channel audio data.
PCT/CN2016/082421 2015-09-15 2016-05-18 Audio output method and apparatus WO2017045413A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510587775.8 2015-09-15
CN201510587775.8A CN105261365A (en) 2015-09-15 2015-09-15 Audio output method and device

Publications (1)

Publication Number Publication Date
WO2017045413A1 true WO2017045413A1 (en) 2017-03-23

Family

ID=55101024

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/082421 WO2017045413A1 (en) 2015-09-15 2016-05-18 Audio output method and apparatus

Country Status (2)

Country Link
CN (1) CN105261365A (en)
WO (1) WO2017045413A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115881128A (en) * 2023-02-07 2023-03-31 北京合思信息技术有限公司 Voice behavior interaction method and device based on history matching degree
EP4283891A4 (en) * 2021-01-30 2024-03-06 Huawei Technologies Co., Ltd. Data transmission method and apparatus, and electronic device and storage medium

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105261365A (en) * 2015-09-15 2016-01-20 北京云知声信息技术有限公司 Audio output method and device
CN106782562A (en) * 2016-12-20 2017-05-31 Tcl通力电子(惠州)有限公司 Audio-frequency processing method, apparatus and system
CN107424605A (en) * 2017-03-13 2017-12-01 浙江曼悟电子科技股份有限公司 A kind of parallel intelligent sound identification all-in-one of portable multipath based on X86 and ARM chips
CN107889044B (en) * 2017-12-19 2019-10-15 维沃移动通信有限公司 The processing method and processing device of audio data
CN110085241B (en) * 2019-04-28 2021-10-08 北京地平线机器人技术研发有限公司 Data encoding method, data encoding device, computer storage medium and data encoding equipment
CN110085268B (en) * 2019-05-10 2021-02-19 深圳市智微智能科技股份有限公司 Method and system for real-time switching of double MICs of Android advertisement machine, advertisement machine and storage medium
CN112216310B (en) * 2019-07-09 2021-10-26 海信视像科技股份有限公司 Audio processing method and device and multi-channel system
CN110349584A (en) * 2019-07-31 2019-10-18 北京声智科技有限公司 A kind of audio data transmission method, device and speech recognition system
CN110838298A (en) * 2019-11-15 2020-02-25 闻泰科技(无锡)有限公司 Method, device and equipment for processing multi-channel audio data and storage medium
CN113645540B (en) * 2020-04-24 2022-11-08 矽统科技股份有限公司 Digital audio array circuit
TWI747250B (en) 2020-04-24 2021-11-21 矽統科技股份有限公司 Digital audio array circuit
CN112562681B (en) * 2020-12-02 2021-11-19 腾讯科技(深圳)有限公司 Speech recognition method and apparatus, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101075183A (en) * 2007-06-29 2007-11-21 北京中星微电子有限公司 Multi-path audio-frequency data processing system
CN101546558A (en) * 2009-05-05 2009-09-30 南京莱斯信息技术股份有限公司 Multipath input audio mixing and exchanging method
CN102065231A (en) * 2010-11-26 2011-05-18 深圳中兴力维技术有限公司 Multipath data fusion device, realization method thereof and multipath audio data processing system
US20120070004A1 (en) * 2010-09-22 2012-03-22 Crestron Electronics, Inc. Digital Audio Distribution
CN204406122U (en) * 2015-02-15 2015-06-17 科大讯飞股份有限公司 Audio signal processor
CN105261365A (en) * 2015-09-15 2016-01-20 北京云知声信息技术有限公司 Audio output method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6311161B1 (en) * 1999-03-22 2001-10-30 International Business Machines Corporation System and method for merging multiple audio streams
JP4304796B2 (en) * 1999-11-30 2009-07-29 ソニー株式会社 Dubbing equipment
CN103702047A (en) * 2013-12-13 2014-04-02 乐视致新电子科技(天津)有限公司 Audio conversion device and television system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4283891A4 (en) * 2021-01-30 2024-03-06 Huawei Technologies Co., Ltd. Data transmission method and apparatus, and electronic device and storage medium
CN115881128A (en) * 2023-02-07 2023-03-31 北京合思信息技术有限公司 Voice behavior interaction method and device based on history matching degree

Also Published As

Publication number Publication date
CN105261365A (en) 2016-01-20

Similar Documents

Publication Publication Date Title
WO2017045413A1 (en) Audio output method and apparatus
US10045140B2 (en) Utilizing digital microphones for low power keyword detection and noise suppression
US10149088B2 (en) Speaker position identification with respect to a user based on timing information for enhanced sound adjustment
US9736587B2 (en) Smart tool for headphones
US9817629B2 (en) Audio synchronization method for bluetooth speakers
CN106790940B (en) Recording method, recording playing method, device and terminal
CN105357604B (en) Audio playing device with Bluetooth function and audio playing method
WO2006050112A3 (en) Audio spatial environment engine
US11632617B2 (en) Method, apparatus and device for synchronously playing audio
US9812146B1 (en) Synchronization of inbound and outbound audio in a heterogeneous echo cancellation system
EP4336863A3 (en) Latency negotiation in a heterogeneous network of synchronized speakers
GB2462567A (en) Data processing apparatus
WO2017071183A1 (en) Voice processing method and device, and pickup circuit
US20220038769A1 (en) Synchronizing bluetooth data capture to data playback
Chatterjee et al. ClearBuds: wireless binaural earbuds for learning-based speech enhancement
US20240221774A1 (en) Systems, Devices, and Methods for Synchronizing Audio
US10431238B1 (en) Memory and computation efficient cross-correlation and delay estimation
WO2020103066A1 (en) Method and apparatus for determining relative position between two terminal devices
WO2017061023A1 (en) Audio signal processing method and device
US9106717B2 (en) Speaking participant identification
WO2019109420A1 (en) Left and right channel determining method and earphone device
US9807492B1 (en) System and/or method for enhancing hearing using a camera module, processor and/or audio input and/or output devices
CN104599691B (en) Audio frequency playing method and device
CN105407443B (en) The way of recording and device
KR101402869B1 (en) Method and system for processing audio signals in a central audio hub

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16845527

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 19/06/2018)

122 Ep: pct application non-entry in european phase

Ref document number: 16845527

Country of ref document: EP

Kind code of ref document: A1