WO2011004744A1 - Acoustic signal processing device, processing method therefor, and program - Google Patents
- Publication number
- WO2011004744A1 (application PCT/JP2010/061108)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- spectrum
- band
- frequency
- unit
- replacement
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01V—GEOPHYSICS; GRAVITATIONAL MEASUREMENTS; DETECTING MASSES OR OBJECTS; TAGS
- G01V3/00—Electric or magnetic prospecting or detecting; Measuring magnetic field characteristics of the earth, e.g. declination, deviation
- G01V3/12—Electric or magnetic prospecting or detecting; Measuring magnetic field characteristics of the earth, e.g. declination, deviation operating with electromagnetic waves
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10H—ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
- G10H1/00—Details of electrophonic musical instruments
- G10H1/36—Accompaniment arrangements
- G10H1/361—Recording/reproducing of accompaniment for use with an external source, e.g. karaoke systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R5/00—Stereophonic arrangements
- H04R5/04—Circuit arrangements, e.g. for selective connection of amplifier inputs/outputs to loudspeakers, for loudspeaker detection, or for adaptation of settings to personal preferences or hearing impairments
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S1/00—Two-channel systems
- H04S1/007—Two-channel systems in which the audio signals are in digital form
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04R—LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
- H04R2430/00—Signal processing covered by H04R, not provided for in its groups
- H04R2430/03—Synergistic effects of band splitting and sub-band processing
Definitions
- the present invention relates to an acoustic signal processing device, and more particularly to an acoustic signal processing device that suppresses an audio component contained in an acoustic signal, a processing method in these, and a program that causes a computer to execute the method.
- a number of stereo signal processing devices have been devised that suppress a vocal sound component contained in a stereo signal based on a stereo signal in which the vocal is localized in the center.
- A vocal signal removal device has been proposed that removes vocal signals of the same phase and the same level included in both channels by subtracting the right channel signal from the left channel signal (see, for example, Patent Document 1).
- a music signal from which a voice component which is a vocal signal included in a stereo signal is removed can be obtained by subtracting the right channel signal from the left channel signal.
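- The subtraction described above can be sketched in a few lines of Python. This is a minimal illustration on hypothetical toy samples, not the implementation of Patent Document 1; the function name `remove_center_vocal` and all sample values are assumptions made for this example.

```python
def remove_center_vocal(left, right):
    """Return the difference signal L - R, sample by sample."""
    if len(left) != len(right):
        raise ValueError("channels must have the same length")
    return [l - r for l, r in zip(left, right)]


# Hypothetical toy data: the vocal is identical in both channels
# (center-localized), while the accompaniment differs per channel.
vocal = [0.5, -0.25, 0.125, 0.0]
accomp_left = [0.25, 0.0, -0.5, 0.75]
accomp_right = [0.0, 0.5, 0.25, -0.25]

left = [v + a for v, a in zip(vocal, accomp_left)]
right = [v + a for v, a in zip(vocal, accomp_right)]

# The vocal cancels; what remains is accomp_left - accomp_right.
diff = remove_center_vocal(left, right)
```

- Because the vocal samples are identical in both channels, they cancel in the difference, and only the difference between the two accompaniment parts remains.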
- However, when a music signal, that is, a difference signal between the compressed signals of the left and right channels, is generated from compressed signals obtained by decoding an encoded stereo signal, auditory noise may occur. This is because the encoding process applied to the stereo signal can make the spectral levels of the left and right channels equal to each other in the same frequency band.
- The present invention has been made in view of this situation and aims to suppress the auditory noise that arises in the generated difference signal.
- A first aspect of the present invention is an acoustic signal processing apparatus, a processing method therefor, and a program for causing a computer to execute the method. The apparatus comprises:
- a difference spectrum calculation unit that calculates, as a difference spectrum, the difference between the frequency spectra of two-channel acoustic signals which, among the acoustic signals of a plurality of channels, include audio components having substantially the same frequency distribution;
- a low-level band determination unit that determines, as a low-level band, a frequency band having a sharp level drop in the envelope of the difference spectrum calculated by the difference spectrum calculation unit;
- a replacement spectrum generation unit that generates a replacement spectrum for replacing the difference spectrum based on at least one of the frequency spectra of the two-channel acoustic signals;
- a spectrum replacement unit that replaces the part of the difference spectrum corresponding to the low-level band with the replacement spectrum; and
- an accompaniment signal generation unit that generates an accompaniment signal by converting the frequency spectrum output from the spectrum replacement unit into a time-domain signal.
- As a result, a replacement spectrum is generated based on the frequency spectrum of the two-channel acoustic signals, and the difference spectrum corresponding to the low-level band, where the level drop in the envelope of the difference spectrum is steep, is replaced with the replacement spectrum.
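- The processing chain of the first aspect can be sketched as follows. This is a simplified illustration under stated assumptions, not the claimed implementation: the spectra are short lists of real-valued coefficients, a bin counts as low-level when its absolute difference falls below a fixed hypothetical threshold of 0.5, the replacement is taken from the left channel scaled by a hypothetical coefficient of 0.25, and the final conversion to a time-domain signal is omitted.

```python
def difference_spectrum(left, right):
    """Bin-by-bin difference of two real-valued spectra (left minus right)."""
    return [l - r for l, r in zip(left, right)]


def find_low_level_bins(diff, threshold):
    """Indices whose absolute difference level falls below the threshold."""
    return [i for i, d in enumerate(diff) if abs(d) < threshold]


def replace_low_level(diff, source, low_bins, coeff):
    """Copy diff, overwriting the low-level bins with coeff * source."""
    out = list(diff)
    for i in low_bins:
        out[i] = coeff * source[i]
    return out


# Hypothetical spectra: bins 1 and 2 are identical in both channels,
# as can happen when encoding makes both channels equal in a band.
left = [4.0, 2.0, 2.0, 1.0]
right = [1.0, 2.0, 2.0, -1.0]

diff = difference_spectrum(left, right)
low_bins = find_low_level_bins(diff, 0.5)
patched = replace_low_level(diff, left, low_bins, 0.25)
```

- In this toy case the difference collapses to zero in bins 1 and 2, and those bins are filled with a scaled copy of the left channel instead of being left as a hole in the spectrum.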
- The replacement spectrum generation unit may generate the replacement spectrum based on at least one frequency spectrum of the two-channel acoustic signals and a predetermined level adjustment coefficient for adjusting the level of the replacement spectrum.
- As a result, the level obtained by multiplying the level of the at least one frequency spectrum of the two-channel acoustic signals by the level adjustment coefficient is generated as the level of the replacement spectrum.
- The replacement spectrum generation unit may generate the replacement spectrum based on the level of the at least one frequency spectrum and a level adjustment coefficient for the voice band that is smaller than the level adjustment coefficient for bands other than the voice band. As a result, the level of the replacement spectrum is reduced more strongly in the voice band than in the other bands.
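- The band-dependent level adjustment can be sketched as below. The voice band limits (200 Hz to 4000 Hz) and the coefficient values 0.5 and 0.25 are hypothetical values chosen for illustration; the text only states that the voice band coefficient is smaller than the coefficient for the other bands.

```python
def replacement_levels(source_levels, bin_freqs_hz, coeff_other,
                       coeff_voice, voice_band_hz):
    """Scale each source level by the coefficient of the band it falls in."""
    low, high = voice_band_hz
    out = []
    for level, freq in zip(source_levels, bin_freqs_hz):
        coeff = coeff_voice if low <= freq <= high else coeff_other
        out.append(coeff * level)
    return out


# Hypothetical example: three bins with equal source level.
levels = replacement_levels(
    source_levels=[8.0, 8.0, 8.0],
    bin_freqs_hz=[100.0, 1000.0, 8000.0],
    coeff_other=0.5,
    coeff_voice=0.25,
    voice_band_hz=(200.0, 4000.0),
)
```

- Only the 1000 Hz bin lies inside the assumed voice band, so it receives the smaller coefficient and ends up at a lower level than its neighbors.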
- The apparatus may further include a voice coefficient setting unit that sets a voice coefficient corresponding to the voice band based on the level ratio between the frequency spectrum corresponding to bands other than the voice band and the frequency spectrum corresponding to the voice band, in the frequency spectrum of at least one of the two-channel acoustic signals. In that case, the replacement spectrum generation unit generates the replacement spectrum based on the at least one frequency spectrum and the voice coefficient set by the voice coefficient setting unit.
- As a result, the replacement spectrum is generated using the voice coefficient for the voice band, which is set based on the ratio between the average level of the frequency spectrum corresponding to bands other than the voice band and the average level of the frequency spectrum corresponding to the voice band.
- The voice coefficient setting unit may set the voice coefficient larger as the level of the frequency spectrum corresponding to bands other than the voice band increases, and smaller as the level of the frequency spectrum corresponding to the voice band increases.
- As a result, the voice coefficient becomes larger as the level in bands other than the voice band increases and smaller as the level in the voice band increases.
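- One way to obtain a coefficient with this behavior is to use the ratio of the two average levels directly. The sketch below assumes exactly that; the exact mapping from level ratio to coefficient is not given in this passage, and the clamp to `max_coeff` and the function name `voice_coefficient` are hypothetical.

```python
def voice_coefficient(levels, voice_bins, max_coeff=1.0):
    """Ratio of the average non-voice level to the average voice level."""
    voice = [levels[i] for i in voice_bins]
    other = [lv for i, lv in enumerate(levels) if i not in voice_bins]
    if not voice or not other:
        raise ValueError("need at least one voice bin and one other bin")
    avg_voice = sum(voice) / len(voice)
    avg_other = sum(other) / len(other)
    if avg_voice == 0.0:
        return max_coeff  # assumption: no voice energy, no extra reduction
    return min(max_coeff, avg_other / avg_voice)


voice_bins = {1, 2}  # hypothetical bin indices of the voice band
base = voice_coefficient([2.0, 8.0, 8.0, 2.0], voice_bins)
louder_accompaniment = voice_coefficient([4.0, 8.0, 8.0, 4.0], voice_bins)
louder_voice = voice_coefficient([2.0, 16.0, 16.0, 2.0], voice_bins)
```

- With these inputs the coefficient rises when the levels outside the voice band rise and falls when the voice band levels rise, which matches the monotonic behavior described above.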
- The low-level band determination unit may determine the low-level band based on each level of the difference spectrum and a low-level threshold for identifying a frequency band in which the level drop in the envelope is steep.
- The low-level band determination unit may determine the low-level band using the level of the difference spectrum and a low-level threshold that is set based on the level of at least one frequency spectrum of the two-channel acoustic signals.
- As a result, the low-level threshold is set based on the level of at least one frequency spectrum of the two-channel acoustic signals.
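- A level-dependent threshold of this kind can be sketched as follows. The assumption here is that the threshold for each bin is a fixed fraction (`ratio`, hypothetically 0.125) of the source channel level in that bin; the passage does not specify how the threshold is derived from the channel level.

```python
def low_level_bins(diff_levels, source_levels, ratio=0.125):
    """Bins whose difference level is below ratio * source level."""
    return [
        i
        for i, (d, s) in enumerate(zip(diff_levels, source_levels))
        if d < ratio * s
    ]


# Hypothetical levels: bins 2 and 3 have the same difference level,
# but the source channel is much louder in bin 2 than in bin 3.
diff_levels = [3.0, 0.0, 0.25, 0.25]
source_levels = [4.0, 2.0, 8.0, 1.0]
found = low_level_bins(diff_levels, source_levels)
```

- The same difference level of 0.25 is treated as a low-level bin where the source channel is loud (bin 2) but not where it is quiet (bin 3), which is the point of tying the threshold to the channel level.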
- FIG. 1 is a block diagram showing a configuration example of the music playback device in the first embodiment of the present invention. FIG. 2 is a block diagram showing a configuration of a conventional acoustic signal encoding apparatus. FIG. 3 is a conceptual diagram showing an example of the frequency spectrum divided by the normalization units 721 and 722.
- FIG. 4 is a block diagram showing a configuration example of the acoustic signal decoding processing unit 200 in the first embodiment of the present invention. It is a block diagram showing a configuration example of the audio
- FIG. 6 is a diagram relating to a low-level band generated by the shared band encoding process performed by the shared band encoding unit 800 in the acoustic signal encoding apparatus 700.
- First embodiment, accompaniment signal generation method: an example of generating a replacement spectrum based on the frequency components of the left channel
- Second embodiment, accompaniment signal generation method: an example in which a voice coefficient for adjusting the level of the replacement spectrum is set based on the frequency components of the left channel
- Third embodiment, accompaniment signal generation method: an example in which a replacement spectrum is generated based on the frequency components of the right and left channels
- FIG. 1 is a block diagram showing an example of the configuration of a music playback device according to the first embodiment of the present invention.
- the music playback device 100 includes an operation reception unit 110, a control unit 120, a display unit 130, an acoustic data storage unit 140, an acoustic data input unit 150, an analog conversion unit 160, an amplifier 170, and a speaker 180.
- the music playback device 100 is an example of the acoustic signal processing device described in the claims.
- The operation reception unit 110 accepts various settings based on operations by the user of the music playback device 100. For example, the operation reception unit 110 accepts a setting for reproducing any one of a plurality of pieces of acoustic data stored in the acoustic data storage unit 140. The operation reception unit 110 also accepts a setting for reducing the audio component included in the acoustic data and outputting the result from the speaker 180 as an accompaniment signal when the acoustic data stored in the acoustic data storage unit 140 is reproduced. Further, the operation reception unit 110 generates a setting signal based on the accepted setting and supplies it to the control unit 120.
- The control unit 120 controls the display unit 130, the acoustic data storage unit 140, the analog conversion unit 160, the acoustic signal decoding processing unit 200, and the audio component removal unit 300 based on the setting signal supplied from the operation reception unit 110.
- the control unit 120 causes the acoustic data storage unit 140 to store the acoustic data input from the acoustic data input unit 150 based on the setting signal related to the transfer from the operation reception unit 110.
- The control unit 120 stores, for example, an acoustic signal, which is a digital signal generated by PCM (Pulse Code Modulation) coding, in the acoustic data storage unit 140 as acoustic data. The control unit 120 also stores encoded acoustic data in the acoustic data storage unit 140.
- The control unit 120 supplies any one of the pieces of encoded acoustic data stored in the acoustic data storage unit 140 to the acoustic signal decoding processing unit 200 based on the setting signal related to reproduction from the operation reception unit 110. In addition, the control unit 120 supplies the encoded acoustic data from the acoustic data input unit 150 to the acoustic signal decoding processing unit 200 based on the setting signal related to reproduction from the operation reception unit 110.
- control unit 120 supplies the acoustic signal decoded by the acoustic signal decoding processing unit 200 or the acoustic signal from the acoustic data storage unit 140 to the analog conversion unit 160 as a digital signal.
- control unit 120 supplies the sound signal from the sound data storage unit 140 to the sound component removal unit 300 based on the setting signal related to the karaoke function from the operation reception unit 110.
- control unit 120 supplies the analog conversion unit 160 with the accompaniment signal from which the audio component included in the acoustic signal has been removed by the audio component removal unit 300 based on the setting signal related to the karaoke function from the operation reception unit 110.
- control unit 120 causes the display unit 130 to display various information related to the music playback device 100 based on the setting signal from the operation receiving unit 110. For example, the control unit 120 causes the display unit 130 to display information regarding the acoustic data stored in the acoustic data storage unit 140. For example, the control unit 120 causes the display unit 130 to display a reproduction status of acoustic data, a setting status such as a karaoke function, and the like.
- the display unit 130 displays various information related to the music playback device 100 from the control unit 120.
- the display unit 130 can be realized by, for example, an LCD (Liquid Crystal Display).
- the acoustic data storage unit 140 stores acoustic data supplied from the control unit 120.
- the acoustic data storage unit 140 stores the encoded acoustic data and the acoustic signal from the acoustic data input unit 150 as acoustic data. Further, the acoustic data storage unit 140 stores the acoustic signal from the acoustic signal decoding processing unit 200. In addition, the acoustic data storage unit 140 outputs the stored acoustic data to the control unit 120.
- the acoustic data input unit 150 supplies acoustic data input from an external device to the control unit 120.
- This acoustic data input unit 150 supplies, for example, encoded sound data or an acoustic signal from an external device to the control unit 120.
- the analog conversion unit 160 converts a digital signal that is an acoustic signal supplied from the control unit 120 into an analog signal.
- the analog conversion unit 160 generates an electrical signal that is an analog signal based on a digital signal that is an acoustic signal.
- the analog conversion unit 160 supplies the generated electric signal to the amplifier 170.
- the amplifier 170 amplifies the amplitude of the analog signal supplied from the analog conversion unit 160.
- the amplifier 170 supplies the amplified analog signal to the speaker 180.
- the speaker 180 converts the analog signal supplied from the amplifier 170 into a sound wave and outputs the sound wave.
- the acoustic signal decoding processing unit 200 decodes the acoustic encoded data from the control unit 120.
- the acoustic signal decoding processing unit 200 supplies the decoded acoustic encoded data as an acoustic signal to the audio component removing unit 300 via the control unit 120 or the signal line 290.
- The audio component removal unit 300 generates an accompaniment signal consisting of the accompaniment component by removing the audio component from the acoustic signal supplied from the acoustic signal decoding processing unit 200 or the acoustic data storage unit 140, which contains both an audio component and an accompaniment component.
- the audio component removal unit 300 supplies the generated accompaniment signal to the analog conversion unit 160 via the control unit 120.
- Because the music playback device 100 is provided with the audio component removal unit 300, it can generate an accompaniment signal in which the audio component included in the acoustic signal from the acoustic data storage unit 140 or the acoustic data input unit 150 is suppressed.
- an acoustic signal encoding device that generates acoustic data supplied from the acoustic data storage unit 140 or the acoustic data input unit 150 will be described below with reference to the drawings.
- FIG. 2 is a block diagram showing a configuration of a conventional acoustic signal encoding apparatus.
- an acoustic signal encoding apparatus 700 that performs encoding processing by the intensity method will be described.
- The acoustic signal encoding apparatus 700 encodes the two-channel acoustic signals input via the input lines 701 and 702 and outputs the result as encoded acoustic data via the output line 759.
- The acoustic signal encoding apparatus 700 includes frequency spectrum generation units 711 and 712, normalization units 721 and 722, quantization units 731 and 732, encoding units 741 and 742, a multiplexing unit 750, and a shared band encoding unit 800.
- Shared band encoding section 800 includes shared band selection section 810, quantization section 830, and encoding section 840.
- the frequency spectrum generation units 711 and 712 generate frequency spectra by converting the acoustic signals of the respective channels input from the input lines 701 and 702 of the right and left channels into the frequency domain. That is, the frequency spectrum generation units 711 and 712 convert time-domain signals that are acoustic signals of the respective channels into frequency components.
- The frequency spectrum generation units 711 and 712 cut out the acoustic signal, which is a discrete-time signal sampled at a constant time interval, in units of a fixed number of samples, and treat each extracted time-domain segment as a frame. They then generate a frequency spectrum by converting each frame into the frequency domain.
- the frequency spectrum generation units 711 and 712 generate, for example, Fourier coefficients calculated by performing fast Fourier transform (FFT) on the acoustic signals of the respective channels as frequency spectra.
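- The framing and transform steps can be illustrated with a small pure-Python sketch. It makes several simplifying assumptions that are not taken from the text: frames do not overlap, no window function is applied, and a naive discrete Fourier transform stands in for the FFT (it yields the same coefficients, only more slowly). The power value of each bin is computed as the squared magnitude of its Fourier coefficient.

```python
import cmath


def split_into_frames(samples, frame_size):
    """Cut the signal into consecutive frames; a trailing partial frame is dropped."""
    last_start = len(samples) - frame_size
    return [samples[i:i + frame_size] for i in range(0, last_start + 1, frame_size)]


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(frame))
        for k in range(n)
    ]


def power_spectrum(frame):
    """Power value of each bin: squared magnitude of its Fourier coefficient."""
    return [abs(c) ** 2 for c in dft(frame)]


# Hypothetical signal: a cosine that completes one cycle per 4-sample frame.
samples = [1.0, 0.0, -1.0, 0.0] * 2
frames = split_into_frames(samples, 4)
power = power_spectrum(frames[0])
```

- For this input the energy lands in bins 1 and 3 of the 4-point spectrum, the pair of bins that represents a real cosine at one cycle per frame.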
- the frequency spectrum generation units 711 and 712 generate MDCT coefficients calculated by a modified discrete cosine transform (MDCT) as a frequency spectrum.
- the normalization units 721 and 722 perform normalization based on the level of each frequency spectrum supplied from the frequency spectrum generation units 711 and 712.
- the normalization units 721 and 722 divide the frequency spectrum from the frequency spectrum generation units 711 and 712 for each predetermined frequency band.
- the normalization units 721 and 722 generate a normalization reference value (scale factor) for each divided band (scale factor band) based on the maximum level of each frequency spectrum in the divided band. Then, normalization units 721 and 722 normalize the power value based on the amplitude level of each frequency spectrum corresponding to the divided band based on the normalization reference value of the divided band. That is, normalization units 721 and 722 generate a normalized component for each divided band by normalizing the power value that is the level of each frequency spectrum for each divided band.
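- The per-band normalization can be sketched as follows. The band layout is passed in as a list of hypothetical bin boundaries (`band_edges`), and the scale factor is simply the maximum power value in the band; a real encoder would typically also quantize the scale factor, which is omitted in this sketch.

```python
def normalize_bands(levels, band_edges):
    """Return (scale_factors, normalized) for the bands delimited by band_edges.

    Band i covers levels[band_edges[i]:band_edges[i + 1]].
    """
    scale_factors = []
    normalized = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        band = levels[start:end]
        scale = max(band)  # normalization reference value of this band
        scale_factors.append(scale)
        normalized.append([lv / scale if scale > 0 else 0.0 for lv in band])
    return scale_factors, normalized


# Hypothetical power values split into a 2-bin band and a 4-bin band.
levels = [4.0, 2.0, 8.0, 2.0, 4.0, 1.0]
scale_factors, normalized = normalize_bands(levels, [0, 2, 6])
```

- After normalization every band has a maximum of 1, so the values in each band lie in the range 0 to 1 regardless of how loud the band was originally.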
- The normalization units 721 and 722 supply the normalized values, that is, the normalized power values, to the quantization units 731 and 732 and the shared band selection unit 810 via the signal lines 726 and 728. At the same time, because the normalization reference value of each divided band is needed to decode the encoded acoustic signal, the normalization units 721 and 722 supply it to the multiplexing unit 750 via the signal lines 727 and 729.
- the quantization units 731 and 732 quantize the normalized values supplied from the normalization units 721 and 722 for each divided band.
- the quantization units 731 and 732 quantize the normalized power value by the number of quantization steps set for each division band. For example, the quantizing units 731 and 732 convert the normalized power value (0 to 1) into a discrete value with a constant quantization step width. That is, the quantization units 731 and 732 generate a quantization component for each divided band by quantizing the normalized value for each divided band.
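- A uniform quantizer of this kind can be sketched as below. It is an illustration under assumptions: values are rounded to the nearest of `num_steps` equally spaced levels between 0 and 1, and the function names are hypothetical. The text does not state the rounding rule.

```python
def quantize(values, num_steps):
    """Map normalized values in [0, 1] to integer indices 0 .. num_steps - 1."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    top = num_steps - 1
    return [min(top, int(v * top + 0.5)) for v in values]


def dequantize(indices, num_steps):
    """Map indices back to the centers of their quantization levels."""
    top = num_steps - 1
    return [q / top for q in indices]


# Hypothetical band with 5 quantization steps (step width 0.25).
normalized = [1.0, 0.5, 0.3, 0.125]
indices = quantize(normalized, 5)
restored = dequantize(indices, 5)
```

- The values 0.3 and 0.125 both map to the level 0.25, which shows the information loss introduced by a coarse step count.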
- The quantization units 731 and 732 supply the quantized values, that is, the quantized power values, to the encoding units 741 and 742 via the signal lines 736 and 738. At the same time, because the number of quantization steps in each divided band is needed to decode the encoded acoustic signal, the quantization units 731 and 732 supply it to the multiplexing unit 750 via the signal lines 737 and 739.
- the encoding units 741 and 742 encode the quantization values from the quantization units 731 and 732 for each divided band with reference to the encoding table.
- the encoding units 741 and 742 refer to a fixed-length or variable-length codebook as an encoding table, and convert the code into a code having a predetermined bit length based on the quantized value. In this way, by encoding the quantization value based on the referenced encoding table, the information amount of the quantization value can be compressed.
- The encoding units 741 and 742 supply the encoded quantized values as encoded data to the multiplexing unit 750 via the signal lines 746 and 748. At the same time, because the table identification information of the referenced encoding table is needed to decode the encoded acoustic signal, the encoding units 741 and 742 supply it to the multiplexing unit 750 for each divided band via the signal lines 747 and 749.
- The shared band encoding unit 800 performs shared band encoding, in which only the normalized values of one channel are encoded in a divided band when the correlation between the normalized values of the two channels in that divided band is high.
- the shared band selecting unit 810 selects a divided band having a high correlation between the normalized value of the left channel from the normalizing unit 721 and the normalized value of the right channel from the normalizing unit 722 as the shared band.
- The shared band selection unit 810 calculates, for each divided band, a correlation based on the normalized values of the right and left channels, and when the calculated correlation exceeds a certain correlation threshold, selects that divided band as a shared band. The shared band selection unit 810 also supplies shared band information indicating the selected shared band to the multiplexing unit 750 via the signal line 819.
- the shared band selection unit 810 supplies the normalized value of one channel in the selected shared band to the quantization unit 830 via the signal line 818.
- the shared band selection unit 810 supplies the normalized value of the left channel in the selected shared band to the quantization unit 830, for example.
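- The selection of shared bands can be sketched as follows. The text does not define how the correlation is computed, so this example assumes a normalized correlation (cosine similarity) between the two channels' normalized values in each band and a hypothetical threshold of 0.95.

```python
import math


def band_correlation(a, b):
    """Normalized correlation of two equally long value lists (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def select_shared_bands(left_bands, right_bands, threshold=0.95):
    """Indices of bands whose inter-channel correlation exceeds the threshold."""
    return [
        i
        for i, (l, r) in enumerate(zip(left_bands, right_bands))
        if band_correlation(l, r) > threshold
    ]


# Hypothetical normalized values: band 0 is identical in both channels,
# band 1 has its energy in different bins.
left_bands = [[1.0, 0.5], [1.0, 0.0]]
right_bands = [[1.0, 0.5], [0.0, 1.0]]
shared = select_shared_bands(left_bands, right_bands)
```

- Only band 0 is selected here; for that band an encoder following the description above would pass on the normalized values of one channel only, for example the left channel.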
- the quantization unit 830 quantizes the normalized value supplied from the shared band selection unit 810. Since the function of the quantizing unit 830 is the same as that of the quantizing units 731 and 732, detailed description thereof is omitted here.
- the quantization unit 830 supplies the number of quantization steps to the multiplexing unit 750 via the signal line 839 and supplies the quantized value to the encoding unit 840 via the signal line 838.
- the encoding unit 840 encodes the quantization value supplied from the quantization unit 830. Since the function of the encoding unit 840 is the same as that of the encoding units 741 and 742, detailed description thereof is omitted here.
- the encoding unit 840 supplies table identification information to the multiplexing unit 750 through the signal line 849 and supplies encoded data to the multiplexing unit 750 through the signal line 848.
- The multiplexing unit 750 multiplexes the data supplied from the normalization units 721 and 722, the shared band selection unit 810, the quantization units 731, 732, and 830, and the encoding units 741, 742, and 840 into a single code string.
- Specifically, the multiplexing unit 750 multiplexes the normalization reference values, numbers of quantization steps, table identification information, and encoded data of the two channels together with the shared band information, normalization reference value, number of quantization steps, table identification information, and encoded data from the shared band encoding unit 800. That is, the multiplexing unit 750 generates one code string (bit stream) by time-division multiplexing these data.
- For example, based on the shared band information supplied from the shared band selection unit 810, the multiplexing unit 750 excludes from multiplexing the data supplied from the quantization units 731 and 732 and the encoding units 741 and 742 for the divided band corresponding to the shared band information. This makes it possible to multiplex encoded data in which only the frequency spectrum of one channel is encoded in a highly correlated divided band among the frequency spectra of the two channels.
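- The exclusion of per-channel data for shared bands can be sketched as below. The record layout (a list of `(kind, band, payload)` tuples) and the string payloads are hypothetical stand-ins; the actual bit stream format is not described in this passage.

```python
def multiplex(left, right, shared, shared_bands):
    """Build one ordered stream, using shared data instead of per-channel data
    for every band listed in shared_bands."""
    stream = []
    for band in range(len(left)):
        if band in shared_bands:
            stream.append(("shared", band, shared[band]))
        else:
            stream.append(("left", band, left[band]))
            stream.append(("right", band, right[band]))
    return stream


# Hypothetical per-band payloads for three divided bands; band 1 is shared.
left = ["L0", "L1", "L2"]
right = ["R0", "R1", "R2"]
shared = {1: "S1"}
stream = multiplex(left, right, shared, shared_bands={1})
```

- The stream carries five records instead of six, because the two per-channel records of band 1 are replaced by a single shared record.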
- the multiplexing unit 750 outputs the generated one code string to the output line 759 as acoustic encoded data.
- For example, the multiplexing unit 750 supplies the encoded acoustic data via the output line 759 to the acoustic data input unit 150 illustrated in FIG. 1, or to an external storage device or the like.
- the acoustic signal encoding apparatus 700 includes the shared band encoding unit 800 and multiplexes only the encoded data of one channel among the encoded data of two channels in the highly correlated division band. Thus, the amount of encoded audio data is reduced.
- the frequency spectrum in the divided band divided by the normalization units 721 and 722 will be briefly described below with reference to the drawings.
- FIG. 3 is a conceptual diagram illustrating an example of the frequency spectrum divided by the normalization units 721 and 722.
- FIG. 3(a) is a conceptual diagram illustrating the divided bands obtained when the left channel normalization unit 721 divides the frequency spectrum, which is the frequency component of the acoustic signal, into predetermined bands.
- FIG. 3(b) is a conceptual diagram showing the frequency spectra in the divided bands shown in FIG. 3(a).
- In FIG. 3(a), the left channel frequency spectrum envelope 725 generated by the frequency spectrum generation unit 711 and the nine divided bands B[0] to B[9] are shown as the left channel acoustic signal component 720.
- The vertical axis is the frequency component power Pl in the left channel, and the horizontal axis is the frequency spectrum number (index) f corresponding to the frequency.
- The divided bands B[0] to B[9] are the frequency bands obtained when the normalization unit 721 divides the frequency spectrum generated by the frequency spectrum generation unit 711 into nine.
- The levels (heights) of the divided bands B[0] to B[9] indicate the normalization reference values (scale factors) calculated based on the maximum level of the frequency spectrum in each divided band.
- The divided bands are narrow at low frequencies and become wider as the frequency increases.
- FIG. 3(b) shows the levels Pl(f) of the 0th to 4th frequency spectra included in the divided bands B[0] and B[1].
- These frequency spectrum levels Pl(f) are power values calculated based on the amplitude level of the f-th frequency spectrum. For example, each is a value calculated based on the square of the f-th Fourier coefficient.
- the index of divided band B is represented as [i].
- the encoding is performed by associating the plurality of frequency spectra f with respect to each divided band B [i].
- FIG. 4 is a block diagram illustrating a configuration example of the acoustic signal decoding processing unit 200 according to the first embodiment of the present invention.
- The acoustic signal decoding processing unit 200 includes a decoding unit 210, a left channel inverse quantization unit 221, a right channel inverse quantization unit 222, a shared band inverse quantization unit 223, selection units 231 and 232, and denormalization units 241 and 242.
- the decoding unit 210 decodes acoustic encoded data that is a code string supplied from the signal line 129.
- the decoding unit 210 separates the encoded audio data into a normalization reference value, a quantization step number, table identification information, and encoded data for each channel.
- The decoding unit 210 extracts the encoded data and the table identification information from the separated acoustic encoded data, and refers to the decoding table specified by the extracted table identification information, thereby decoding the encoded data into quantized values.
- The decoding unit 210 also supplies the left and right channel quantization step numbers in the separated acoustic encoded data to the left channel inverse quantization unit 221 and the right channel inverse quantization unit 222 via the signal lines 214 and 215, respectively. At the same time, the decoding unit 210 supplies the quantized values for the divided bands of the left channel and the right channel to the left channel inverse quantization unit 221 and the right channel inverse quantization unit 222 via the signal lines 211 and 212, respectively.
- From the separated acoustic encoded data, the decoding unit 210 supplies the quantized value of the shared band specified by the shared band information, together with the corresponding number of quantization steps, to the shared band inverse quantization unit 223 via the signal line 213. In addition, based on the shared band information in the separated acoustic encoded data, the decoding unit 210 supplies a selection signal for selecting the output from the shared band inverse quantization unit 223 to the selection units 231 and 232 via the signal lines 216 and 217. That is, the decoding unit 210 causes the output corresponding to the shared band from the shared band inverse quantization unit 223 to be supplied to the denormalization units 241 and 242 of both channels simultaneously.
- The decoding unit 210 supplies the left and right channel normalization reference values in the separated acoustic encoded data to the denormalization units 241 and 242 for each divided band via the signal lines 218 and 219, respectively.
- the left and right channel dequantization units 221 and 222 dequantize the quantized value based on the number of quantization steps for each divided band.
- The left and right channel inverse quantization units 221 and 222 generate the normalized value of each channel from the quantized value for each divided band supplied from the signal lines 211 and 212, based on the number of quantization steps from the signal lines 214 and 215.
- The left channel inverse quantization unit 221 generates the left channel normalized value from the quantized value on the signal line 211, based on the number of quantization steps from the signal line 214.
- The right channel inverse quantization unit 222 generates the right channel normalized value from the quantized value on the signal line 212, based on the number of quantization steps from the signal line 215.
- the left and right channel dequantization units 221 and 222 supply the generated normalized values of the respective channels to the denormalization units 241 and 242 via the selection units 231 and 232, respectively.
- the shared band inverse quantization unit 223 performs inverse quantization on the quantized value in the shared band specified by the shared band information based on the number of quantization steps corresponding thereto.
- the shared band inverse quantization unit 223 generates a normalized value in the shared band based on the quantization value and the number of quantization steps supplied from the signal line 213.
- the shared band inverse quantization unit 223 supplies the generated normalized value to the denormalization units 241 and 242 via the selection units 231 and 232, respectively.
- The selection units 231 and 232 select either the normalized value in the shared band or the normalized value in a divided band other than the shared band, and output the selected normalized value to the denormalization units 241 and 242. For example, when the normalized value corresponding to the shared band is supplied from the shared band inverse quantization unit 223, the selection units 231 and 232 output that same shared-band normalized value to both denormalization units 241 and 242, based on the selection signal from the decoding unit 210.
- Based on the selection signal from the decoding unit 210, the selection units 231 and 232 output the normalized value of each channel to the denormalization units 241 and 242, respectively.
- the denormalization units 241 and 242 denormalize the normalization value based on the normalization reference value for each divided band.
- the denormalization units 241 and 242 generate the frequency spectrum of each channel based on the normalization value for each divided band from the selection units 231 and 232 and the normalization reference value from the signal lines 218 and 219.
- The denormalization unit 241 generates the power value of the frequency spectrum of the left channel based on the normalized value from the selection unit 231 and the normalization reference value from the signal line 218. Likewise, the denormalization unit 242 generates the power value of the frequency spectrum of the right channel based on the normalized value from the selection unit 232 and the normalization reference value from the signal line 219. The denormalization units 241 and 242 then supply the generated frequency spectrum of each channel to the acoustic signal generation units 251 and 252, respectively.
- the acoustic signal generation units 251 and 252 generate acoustic signals of each channel based on the frequency spectrum of each channel supplied from the denormalization units 241 and 242. That is, the acoustic signal generation units 251 and 252 convert the frequency spectrum, which is frequency domain data, into an acoustic signal, which is a time domain signal.
- the acoustic signal generation units 251 and 252 restore time domain signals in units of frames by performing, for example, inverse fast Fourier transform (IFFT: Inverse FFT) on the frequency spectrum of each channel.
- As another example, the time domain signal may be restored by inverse modified discrete cosine transform (IMDCT).
- the acoustic signal generators 251 and 252 supply the generated acoustic signals of the respective channels to the left and right channel signal lines 291 and 292, respectively. That is, the acoustic signal generation units 251 and 252 supply the right channel and left channel acoustic signals to the audio component removal unit 300.
- an acoustic signal generated by decoding the encoded acoustic signal by the acoustic signal generation units 251 and 252 is referred to as a compressed signal.
- By providing the shared band inverse quantization unit 223 and the selection units 231 and 232, the acoustic signal decoding processing unit 200 can decode the acoustic encoded data encoded by the acoustic signal encoding unit 700.
- In a shared band where the normalization reference values of both channels are equal, the frequency distributions of the two channels are substantially equal.
- the configuration example of the acoustic signal decoding processing unit 200 that decodes acoustic signals of two channels has been described, but the present invention is not limited to this, and acoustic signals of three or more channels may be decoded.
- The audio component removal unit 300, which reduces the audio component included in the acoustic signal supplied from the acoustic signal decoding processing unit 200 or the control unit 120, will be described below with reference to the drawings.
- FIG. 5 is a block diagram illustrating a configuration example of the audio component removal unit 300 according to the first embodiment of the present invention.
- The audio component removal unit 300 reduces the audio component in the acoustic signal of each channel supplied from the acoustic signal decoding processing unit 200 via the left and right channel signal lines 291 and 292 included in the signal line 290, and outputs the result as an accompaniment signal.
- Two-channel acoustic signals containing audio components with substantially equal frequency distributions, from among acoustic signals of two or more channels, are supplied from the left and right channel signal lines 291 and 292.
- The audio component removal unit 300 includes frequency spectrum generation units 311 and 312, a difference spectrum calculation unit 320, a low level band determination unit 330, a level adjustment coefficient holding unit 340, and a replacement spectrum generation unit 350. Furthermore, the audio component removal unit 300 includes a spectrum replacement unit 360 and an accompaniment signal generation unit 370.
- the frequency spectrum generation units 311 and 312 generate frequency spectra by converting the acoustic signals of the respective channels from the left and right channel signal lines 291 and 292 into frequency components.
- the functions of the frequency spectrum generation units 311 and 312 are the same as those of the frequency spectrum generation units 711 and 712 shown in FIG. 2, and thus detailed description thereof is omitted here.
- the frequency spectrum generation unit 311 supplies each frequency spectrum indicating the generated frequency component of the left channel to the difference spectrum calculation unit 320, the low level band determination unit 330, and the replacement spectrum generation unit 350.
- the frequency spectrum generation unit 312 supplies the generated frequency spectrum of the right channel to the difference spectrum calculation unit 320.
- the difference spectrum calculation unit 320 is a calculation unit that calculates the difference absolute value of the level of the frequency spectrum corresponding to the same frequency from the frequency spectrum generation units 311 and 312 as the difference spectrum. That is, the difference spectrum calculation unit 320 calculates, as a difference spectrum, a difference between frequency spectra in two-channel acoustic signals including audio components having substantially equal frequency distributions among a plurality of channel acoustic signals. Thus, by calculating the difference between the frequency spectrum of the right channel and the frequency spectrum of the left channel, the audio component in the acoustic signal can be reduced.
- The difference spectrum calculation unit 320 subtracts the power value of the right channel frequency spectrum from the power value (level) of the left channel frequency spectrum and takes the absolute value of the result as the power value of the difference spectrum. For example, the difference spectrum calculation unit 320 subtracts the power value of the 0th frequency spectrum in the right channel from the power value of the 0th frequency spectrum in the left channel, and the absolute value of that difference is the 0th difference spectrum.
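The per-index calculation can be sketched as follows. This is only an illustration, assuming each channel's spectrum is a list of power values indexed by the frequency spectrum number f; the function name and the sample values are hypothetical.

```python
def difference_spectrum(left_power, right_power):
    """Absolute difference of left and right power values at each index f."""
    if len(left_power) != len(right_power):
        raise ValueError("both channels must have the same number of spectra")
    return [abs(l - r) for l, r in zip(left_power, right_power)]

# Hypothetical power values for three frequency spectra
left = [4.0, 9.0, 2.5]
right = [1.0, 9.0, 3.0]
diff = difference_spectrum(left, right)
```

Components that are equal in both channels, such as the second entry here, cancel to zero, which is how the centered audio component is reduced.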
- the difference spectrum calculation unit 320 supplies the calculated difference spectrum to the low-level band determination unit 330 and the spectrum replacement unit 360.
- the difference spectrum calculation unit 320 is an example of the difference spectrum calculation unit described in the claims.
- the low level band determination unit 330 determines a frequency band in which the level drop in the envelope of the difference spectrum calculated by the difference spectrum calculation unit 320 is steep as a low level band.
- The low level band determination unit 330 compares each level of the difference spectrum with a low level threshold for identifying a frequency band in which the level drop in the envelope of the frequency spectrum is steep.
- The low level band determination unit 330 compares, for example, a preset low level threshold with the power values based on the amplitude levels of all the difference spectra. As another example, the low level band determination unit 330 sets a low level threshold based on the frequency spectrum level of the left channel corresponding to the difference spectrum to be compared, and compares that threshold with the difference spectrum. In this example, the low level band determination unit 330 may use an average value or a global envelope of the frequency spectrum of the left channel.
- Based on the comparison result, the low level band determination unit 330 determines, for each difference spectrum, whether or not the level of the difference spectrum is less than the low level threshold. A difference spectrum whose level is less than the low level threshold is determined to be in the low level band.
- The low level band determination unit 330 generates replacement information for each difference spectrum so that a difference spectrum determined to be in the low level band can be replaced with another spectrum. For example, the low level band determination unit 330 generates replacement information indicating true (TRUE) when the difference spectrum is determined to be in the low level band, and replacement information indicating false (FALSE) when it is determined not to be.
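The two threshold options described above can be sketched as follows. This is a hedged illustration only: the threshold value, and the `ratio` parameter that derives a threshold from the left channel level, are assumptions introduced here and are not given in the specification.

```python
def replacement_info_fixed(diff, threshold):
    """True where the difference spectrum level is below a preset threshold."""
    return [d < threshold for d in diff]

def replacement_info_relative(diff, left_power, ratio):
    """True where the level is below a threshold set from the left channel.

    The threshold for index f is assumed to be ratio * left_power[f].
    """
    return [d < ratio * l for d, l in zip(diff, left_power)]

# Hypothetical values
diff = [3.0, 0.0, 0.5]
left = [4.0, 9.0, 2.5]
info_fixed = replacement_info_fixed(diff, threshold=0.1)
info_relative = replacement_info_relative(diff, left, ratio=0.25)
```

With a relative threshold, a difference spectrum that is small compared with the original left channel level is also flagged, even if it is not close to zero in absolute terms.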
- the low level band determination unit 330 supplies the generated replacement information to the spectrum replacement unit 360.
- the low level band determination unit 330 is an example of the low level band determination unit described in the claims.
- The replacement spectrum generation unit 350 generates, based on the frequency spectrum of the left channel corresponding to the difference spectrum, a replacement spectrum that replaces the component of the difference spectrum with another component when the difference spectrum is determined to be in the low level band. That is, the replacement spectrum generation unit 350 generates a replacement spectrum for replacing the difference spectrum based on at least one of the two-channel frequency spectra.
- the replacement spectrum generation unit 350 generates a replacement spectrum based on, for example, the frequency spectrum of the left channel and a predetermined level adjustment coefficient held in the level adjustment coefficient holding unit 340.
- The replacement spectrum generation unit 350 generates, as the level of the replacement spectrum, the product of the frequency spectrum of the left channel and the level adjustment coefficient corresponding to that frequency spectrum.
- the replacement spectrum generation unit 350 supplies the generated replacement spectrum to the spectrum replacement unit 360.
- the replacement spectrum generation unit 350 is an example of a replacement spectrum generation unit described in the claims.
- the level adjustment coefficient holding unit 340 holds a level adjustment coefficient for adjusting the level of the replacement spectrum.
- the level adjustment coefficient holding unit 340 holds, for example, a predetermined level adjustment coefficient.
- The level adjustment coefficient holding unit 340 holds, for example, level adjustment coefficients in which the coefficient corresponding to the voice band is smaller than the coefficient corresponding to bands other than the voice band. That is, the replacement spectrum generation unit 350 generates a replacement spectrum based on the frequency spectrum of the left channel and a voice band level adjustment coefficient that is smaller than the level adjustment coefficient for bands other than the voice band. The level adjustment coefficient holding unit 340 outputs the held level adjustment coefficients to the replacement spectrum generation unit 350.
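As a rough sketch of how the replacement spectrum level could be formed, the following multiplies each left channel power value by a per-frequency coefficient. The coefficient values (0.1 and 0.5) and the voice band limits of 200 Hz to 2 kHz are assumptions chosen for illustration; the specification only requires that the voice band coefficient be the smaller one.

```python
def level_coefficients(bin_freqs_hz, voice_coeff=0.1, other_coeff=0.5,
                       voice_band=(200.0, 2000.0)):
    """Assign the smaller coefficient to bins inside the assumed voice band."""
    low, high = voice_band
    return [voice_coeff if low <= f <= high else other_coeff
            for f in bin_freqs_hz]

def replacement_spectrum(left_power, coefficients):
    """Replacement level = left channel level x level adjustment coefficient."""
    return [l * c for l, c in zip(left_power, coefficients)]

# Hypothetical bin center frequencies and left channel power values
freqs = [100.0, 1000.0, 4000.0]
left = [4.0, 8.0, 2.0]
coeffs = level_coefficients(freqs)
replacement = replacement_spectrum(left, coeffs)
```

Using a smaller coefficient inside the voice band keeps the substituted left channel content, which still contains the audio component, from reintroducing much of that component into the accompaniment signal.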
- the spectrum replacement unit 360 replaces the difference spectrum corresponding to the low level band among the difference spectra calculated by the difference spectrum calculation unit 320 with the replacement spectrum.
- the spectrum replacement unit 360 replaces the difference spectrum from the difference spectrum calculation unit 320 with the replacement spectrum from the replacement spectrum generation unit 350 based on the replacement information from the low level band determination unit 330.
- The spectrum replacement unit 360 replaces the level of a difference spectrum determined to be in the low level band with the level of the replacement spectrum corresponding to that difference spectrum. For example, when the replacement information corresponding to the first difference spectrum indicates true (TRUE), the spectrum replacement unit 360 uses the replacement spectrum generated based on the first frequency spectrum in the left channel as the new first difference spectrum.
- the spectrum replacement unit 360 replaces the level of the difference spectrum determined to be the low level band with the level of the replacement spectrum corresponding to the difference spectrum, and outputs the level to the accompaniment signal generation unit 370.
- The spectrum replacement unit 360 outputs the level of a difference spectrum determined not to be in the low level band to the accompaniment signal generation unit 370 as it is.
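The selection performed by the spectrum replacement unit 360 can be sketched as follows. This is a minimal illustration with hypothetical names and values, assuming the difference spectrum, the replacement spectrum, and the replacement information are lists aligned on the same frequency spectrum index.

```python
def replace_low_level(diff, replacement, info):
    """Use the replacement level where info is True, else keep the difference."""
    return [r if flag else d
            for d, r, flag in zip(diff, replacement, info)]

# Hypothetical inputs
diff = [3.0, 0.0, 0.5]
replacement = [2.0, 0.8, 1.0]
info = [False, True, False]
output = replace_low_level(diff, replacement, info)
```

Only the second entry is replaced here; the other difference spectra pass through unchanged.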
- the spectrum replacement unit 360 is an example of a spectrum replacement unit described in the claims.
- the accompaniment signal generation unit 370 generates an accompaniment signal by converting the frequency spectrum in the entire frequency band output from the spectrum replacement unit 360 into a signal in the time domain.
- The accompaniment signal generation unit 370 converts the frequency domain data, that is, the frequency spectrum indicating the frequency components output from the spectrum replacement unit 360, into an accompaniment signal that is a time domain signal.
- The accompaniment signal generation unit 370 restores the time domain signal in units of frames, for example, by applying an inverse fast Fourier transform (IFFT) to the frequency spectrum. As another example, the accompaniment signal generation unit 370 restores the time domain signal for each frame by inverse modified discrete cosine transform (IMDCT).
- the accompaniment signal generation unit 370 outputs the generated accompaniment signal to the signal line 128. That is, the accompaniment signal generation unit 370 supplies the accompaniment signal to the control unit 120 and outputs the accompaniment signal from the speaker 180 as an accompaniment sound.
- the accompaniment signal generation unit 370 is an example of an accompaniment signal generation unit described in the claims.
- By providing the low-level band determination unit 330, the difference spectrum corresponding to the low-level band among the difference spectra calculated by the difference spectrum calculation unit 320 can be identified. Further, by providing the replacement spectrum generation unit 350, a replacement spectrum can be generated based on the frequency spectrum of the left channel, whose frequency characteristics approximate those of the difference spectrum. As a result, a replacement spectrum that approximates the frequency characteristics of the original difference spectrum can be generated, so that the difference spectrum can be corrected more naturally.
- the level of the frequency spectrum in the low level band can be replaced with the level of the replacement spectrum generated by the replacement spectrum generation unit 350.
- the difference spectrum calculated by the difference spectrum calculation unit 320 will be described below with reference to the drawings.
- FIG. 6 is a conceptual diagram showing an example of the frequency distribution of the audio component and the accompaniment component in the difference signal generated based on the difference between the acoustic signals in the left and right channels.
- The difference signal is generated by the subtractor 321, which subtracts one channel from the other in a stereo signal consisting of right and left channel acoustic signals in which the vocal sound is localized in the center and the instruments of the accompaniment are localized at scattered positions.
- (a) and (b) of FIG. 6 are diagrams showing the frequency distributions of the audio component and accompaniment component included in the left channel acoustic signal as the left channel signal component.
- (c) and (d) of FIG. 6 are diagrams showing the frequency distributions of the audio component and accompaniment component included in the right channel acoustic signal as the right channel signal component.
- the vertical axis in FIGS. 6A to 6D is power, and the horizontal axis is frequency.
- FIG. 6 (a) shows an accompaniment component Pli included in the left channel acoustic signal.
- the left channel accompaniment component Pli has a large power distribution mainly in a frequency band of 200 Hz or less.
- FIG. 6B shows an audio component Plv included in the left channel acoustic signal.
- the left channel audio component Plv has a large power distribution mainly in the frequency band of 200 Hz to 2 kHz.
- FIG. 6 (c) shows an accompaniment component Pri included in the right channel acoustic signal.
- The accompaniment component Pri of the right channel has a frequency distribution different from that of the accompaniment component Pli of the left channel, but its large power is likewise distributed mainly in a frequency band of 200 Hz or less.
- FIG. 6D shows the audio component Prv included in the right channel acoustic signal.
- the right channel audio component Prv has a frequency distribution equal to that of the left channel audio component Plv, and a large power is distributed in a frequency band of 200 Hz to 2 kHz.
- the left channel sound component and the right channel sound component have substantially the same frequency distribution.
- the localization of each instrument is spatially dispersed, so that the frequency distributions of the left channel and the right channel tend to be different from each other.
- (e) and (f) of FIG. 6 are diagrams showing the frequency distributions of the audio component and accompaniment component included in the difference signal generated from the absolute difference values of the right and left channel acoustic signals shown in (a) to (d) of FIG. 6.
- the vertical axis is power
- the horizontal axis is frequency.
- FIG. 6 (e) shows an accompaniment component Pdi included in the difference signal. Because the accompaniment components Pli and Pri of the left and right channels have different frequency distributions, the degree to which the frequency components of the two channels cancel each other in the accompaniment component Pdi of the difference signal is small.
- FIG. 6 (f) shows the audio component Pdv included in the difference signal.
- the frequency distribution of the audio component Plv or Prv of the right or left channel is indicated by a broken line.
- Because the audio components Plv and Prv of the left and right channels have the same frequency distribution, the audio component Pdv in the difference signal is cancelled by the frequency components of the two channels.
- an accompaniment signal in which the sound component is suppressed is generated by subtracting the acoustic signal of the other channel from the acoustic signal of one channel.
- Although an example in which the difference signal is generated in the time domain has been described here, the difference signal can also be generated after the two-channel acoustic signals have been converted into frequency spectra.
- When the difference signal is generated based on the compressed signal obtained by decoding the acoustic signal compressed by the acoustic signal encoding device 700 shown in FIG. 2, a low level band, in which the amplitude level of the frequency components of the difference signal is extremely low, may occur. Such a low-level band in the difference signal is perceived by human hearing as harsh noise.
- the cause of the low-level band generated in the differential signal generated based on the compressed signal that is the decoded compressed acoustic signal will be described below with reference to the drawings.
- FIG. 7 is a diagram regarding a low-level band generated due to quantization by the quantization units 731 and 732 in the acoustic signal encoding device 700.
- FIGS. 7A and 7B are diagrams illustrating examples of the left normalized component 771 and the right normalized component 772 generated by the normalizing units 721 and 722 in the acoustic signal encoding device 700, respectively.
- FIG. 7C is a diagram illustrating a normalized difference absolute value 773 that is a difference absolute value between the left normalized component 771 and the right normalized component 772.
- FIGS. 7D and 7E are diagrams showing examples of the left quantized component 781 and the right quantized component 782 obtained when the quantizing units 731 and 732 in the acoustic signal encoding device 700 quantize the left normalized component 771 and the right normalized component 772, respectively.
- FIG. 7 (f) is a diagram illustrating a quantized difference absolute value 783 that is the difference absolute value between the left quantized component 781 and the right quantized component 782.
- FIG. 7A shows the normalized values Pl of the four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the left channel.
- FIG. 7B shows normalized values Pr of the four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the right channel.
- FIG. 7 (d) shows the quantized values Q of the four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the left channel.
- the quantized value Q is set to “2” by quantizing the normalized value.
- FIG. 7 (e) shows the quantized values Q of the four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the right channel.
- the quantized value Q is set to “2”, which is the same as the left channel, by quantizing the normalized value.
- FIG. 7 (f) shows the absolute difference value Q of the quantized values of the same frequency spectrum (f1 to f4) in the right and left channels.
- the absolute difference values Q of these frequency spectra (f1 to f4) are all “0”. This is due to the fact that the normalized value of the frequency spectrum (f1 to f4) is limited to five quantized values Q (0 to 4) by quantizing the normalized value of each channel. That is, the quantization difference absolute value Q of each frequency spectrum (f1 to f4) in the i-th divided band B [i] is all “0” due to the quantization error caused by the quantization.
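The collapse of the difference to zero can be reproduced with a small numeric sketch. It assumes a uniform quantizer that maps a normalized value in the range 0 to 1 onto five levels (0 to 4) by rounding; the quantizer form and the normalized values are hypothetical and only mirror the situation in FIG. 7, where every spectrum in the band is quantized to 2.

```python
def quantize(normalized, levels=5):
    """Map values in [0, 1] to integer levels 0..levels-1 (round half up)."""
    return [int(x * (levels - 1) + 0.5) for x in normalized]

# Hypothetical normalized values for f1..f4 in divided band B[i]
left_norm = [0.45, 0.50, 0.55, 0.60]
right_norm = [0.40, 0.52, 0.58, 0.48]

left_q = quantize(left_norm)
right_q = quantize(right_norm)

# Differences before and after quantization
norm_diff = [abs(l - r) for l, r in zip(left_norm, right_norm)]
quant_diff = [abs(l - r) for l, r in zip(left_q, right_q)]
```

The normalized values differ at every index, yet both channels quantize to 2 throughout, so the quantized difference is 0 for all four spectra.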
- Thus, when the quantizing units 731 and 732 perform quantization, the quantized values of the right and left channels may become the same.
- When the normalization reference values of the two channels corresponding to the i-th divided band B [i], in which the quantized values of both channels are the same, are also equal, the frequency band corresponding to the i-th divided band B [i] becomes a low level band in the difference signal.
- FIG. 8 is a diagram relating to a low-level band generated due to the shared band encoding process performed by the shared band encoding unit 800 in the acoustic signal encoding apparatus 700.
- Here, it is assumed that the shared band encoding unit 800 determines the i-th divided band B [i], in which the normalized components of the left and right channels are highly correlated, to be the shared band, and that the normalized component of the left channel in the shared band is quantized.
- FIG. 8A and 8B are diagrams showing examples of the left normalization component 771 and the right normalization component 774 generated by the normalization units 721 and 722 in the acoustic signal encoding apparatus 700, respectively.
- FIG. 8C is a diagram showing a normalized difference absolute value 775 that is a difference absolute value between the left normalized component 771 and the right normalized component 774.
- FIG. 8 (f) is a diagram illustrating a quantization difference absolute value 785 that is the difference absolute value between the left quantization component 781 and the right quantization component 784.
- FIG. 8 (a) shows the normalized values Pl of the four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the left channel.
- FIG. 8B shows normalized values Pr of four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the right channel.
- (C) of FIG. 8 shows the absolute difference value Pd of the normalized values of the frequency spectrum (f1 to f4) in the right and left channels.
- the difference absolute values Pd of the frequency spectra (f1 to f4) indicate different levels.
- FIG. 8 (d) shows quantized values Q of four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the left channel.
- the quantized values of these four frequency spectra (f1 to f4) are the same as those in (d) of FIG.
- FIG. 8 (e) shows quantized values Q of four frequency spectra (f1 to f4) included in the i-th divided band B [i] in the right channel.
- The quantized values Q of the four frequency spectra (f1 to f4) of the right channel are the same as the quantized values Q of the left channel. That is, because the i-th divided band B [i] was determined to be the shared band by the shared band encoding unit 800, the quantized values Q of the left channel are also used as the quantized values Q of the right channel.
- FIG. 8 (f) shows the absolute difference value Q of the quantized values of the frequency spectra (f1 to f4) in the right and left channels. Unlike the normalized difference absolute value 775 shown in FIG. 8C, the absolute difference values Q of the frequency spectra (f1 to f4) are all “0”. This is because the shared band encoding unit 800 shares the normalized value of the frequency spectrum in the divided band B [i] of the left channel as the normalized value of both channels.
- The quantized value component generated by the shared band encoding unit 800 is shared as the quantized values of both channels, so the quantized values of the two channels are equal at the time of decoding. For this reason, when encoded data in which the normalized value of the i-th divided band B [i] has been shared by the shared band encoding unit 800 is decoded and the difference spectrum is calculated based on the decoded compressed signal, the difference spectrum corresponding to the i-th divided band B [i] becomes a low level band.
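The effect of sharing quantized values on the decoded difference can be sketched as follows. The decoding formula used here (quantized value divided by the top level, multiplied by a scale factor) is an assumption made for illustration, as are the scale factor values; the specification does not give the inverse quantization or denormalization formulas.

```python
def decode_band(quantized, scale_factor, levels=5):
    """Assumed decode: inverse quantize to [0, 1], then apply the scale factor."""
    return [q / (levels - 1) * scale_factor for q in quantized]

def abs_diff(a, b):
    """Absolute difference at each frequency spectrum index."""
    return [abs(x - y) for x, y in zip(a, b)]

# Quantized values of the shared band, used by both channels
shared_q = [2, 2, 2, 2]

left = decode_band(shared_q, scale_factor=8.0)
right_same_scale = decode_band(shared_q, scale_factor=8.0)
right_other_scale = decode_band(shared_q, scale_factor=6.0)

diff_same = abs_diff(left, right_same_scale)
diff_other = abs_diff(left, right_other_scale)
```

Under these assumptions the difference vanishes completely when both channels also carry the same normalization reference value, which corresponds to the low level band case described for the shared band.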
- FIG. 9 is a conceptual diagram illustrating an example of a divided band B [i] based on the difference spectrum calculated by the difference spectrum calculation unit 320 according to the first embodiment of the present invention.
- a spectrum envelope such as the frequency spectrum envelope 725 shown in FIG. 3A is omitted.
- FIGS. 9A and 9B are diagrams illustrating compressed signal components 313 and 314 in the left and right channel acoustic signals generated by the frequency spectrum generation units 311 and 312.
- FIG. 9C is a diagram illustrating a difference absolute value component 321 based on the difference spectrum calculated by the difference spectrum calculation unit 320.
- the vertical axis represents the size of the normalized reference value (scale factor) corresponding to the divided band B [i], and the horizontal axis represents the frequency.
- The left and right channel compressed signal components 313 and 314 conceptually represent, by means of 10 divided bands B [0] to B [9], the frequency distribution of the left and right channels in the compressed signal restored by decoding the encoded acoustic signal. Note that this divided band B [i] includes a plurality of frequency spectra as shown in FIG.
- the difference absolute value component 321 conceptually shows the frequency distribution of the difference absolute value of the frequency spectrum in the left and right channel compressed signal components 313 and 314 by 10 divided bands B [0] to [9].
- The first divided band B[1] is a low-level band in which the quantized values of both channels become equal to each other through quantization, so that the level of each difference spectrum is significantly reduced.
- The fifth, seventh, and eighth divided bands B[5], B[7], and B[8] are low-level bands in which the quantized values of both channels become equal through shared band encoding, so that the level of each difference spectrum is greatly reduced.
- the low level band determination unit 330 determines the low level band and replaces the difference spectrum corresponding to the determined low level band with the replacement spectrum.
- FIG. 10 is a conceptual diagram illustrating an example in which the difference spectrum corresponding to the low-level band is replaced with a replacement spectrum by the speech component removal unit 300 according to the first embodiment of the present invention.
- FIG. 10A is a diagram showing the left channel compressed signal component 313 supplied to the replacement spectrum generation unit 350.
- FIG. 10B shows the difference absolute value component 361 after the difference spectrum in the low level band in the difference absolute value component 321 shown in FIG. 9C is replaced with the replacement spectrum by the spectrum replacement unit 360.
- the vertical axis represents the size of the normalized reference value (scale factor) corresponding to the divided band B [i], and the horizontal axis represents the frequency.
- the left channel compressed signal component 313 is the same as that shown in FIG.
- The difference absolute value component 361 after replacement shows the frequency distribution in which the difference spectra of the divided bands B[1], B[5], B[7], and B[8] in the difference absolute value component 321, which were determined to be low-level bands by the low level band determination unit 330, have been replaced with replacement spectra.
- the frequency distribution is indicated not by the frequency spectrum but by the divided bands B [0] to B [9].
- The replacement spectra of these divided bands B[1], B[5], B[7], and B[8] are generated by the replacement spectrum generation unit 350 based on the left channel frequency spectra corresponding to the difference spectra determined to be in a low-level band. The level of each replacement spectrum is calculated by the replacement spectrum generation unit 350 by multiplying the level of the frequency spectrum corresponding to the low-level band by the level adjustment coefficient held in the level adjustment coefficient holding unit 340.
- The level of each replacement spectrum included in the first divided band B[1] is the product of the level adjustment coefficient g1 corresponding to the first divided band B[1] and each frequency spectrum Pl included in the divided band B[1] of the left channel. Likewise, the level of each replacement spectrum included in the fifth divided band B[5] is the product of the level adjustment coefficient g2 corresponding to the fifth divided band B[5] and each frequency spectrum Pl included in the divided band B[5] of the left channel.
- The level of each replacement spectrum included in the seventh divided band B[7] is the product of the level adjustment coefficient g3 corresponding to the seventh divided band B[7] and each frequency spectrum Pl included in the divided band B[7] of the left channel.
- The level of each replacement spectrum included in the eighth divided band B[8] is the product of the level adjustment coefficient g4 corresponding to the eighth divided band B[8] and each frequency spectrum Pl included in the divided band B[8] of the left channel.
- the low-level band in the accompaniment signal can be eliminated by replacing the differential spectrum corresponding to the low-level band with the replacement spectrum obtained by multiplying the frequency spectrum of the left channel by the level adjustment coefficient.
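As an illustration only, the band-wise replacement described above might be sketched in Python as follows. The function name `replace_low_level_bands`, the band layout in `band_edges` (ten bands of two spectra each), the spectrum values, and the numeric values assigned to g1 to g4 are all assumptions made for this example and do not come from the embodiment.

```python
def replace_low_level_bands(diff, left, band_edges, band_coeffs):
    """Return a copy of diff in which every listed band holds coeff * left."""
    out = list(diff)
    for band, coeff in band_coeffs.items():
        # Band B[band] covers spectrum indices band_edges[band] .. band_edges[band + 1] - 1.
        for f in range(band_edges[band], band_edges[band + 1]):
            out[f] = coeff * left[f]
    return out


# Hypothetical layout: ten divided bands B[0]..B[9], two spectra per band.
band_edges = list(range(0, 22, 2))
left = [float(20 - f) for f in range(20)]   # made-up left channel levels Pl
diff = [0.5 * p for p in left]              # made-up difference spectrum levels
for band in (1, 5, 7, 8):                   # bands whose difference collapsed
    for f in range(band_edges[band], band_edges[band + 1]):
        diff[f] = 0.0

# Hypothetical values for g1, g2, g3, g4 (keyed by band number).
band_coeffs = {1: 1.0, 5: 0.1, 7: 0.1, 8: 1.0}
result = replace_low_level_bands(diff, left, band_edges, band_coeffs)
```

Bands that are not listed in `band_coeffs` keep their original difference spectrum, which mirrors the fact that only low-level bands are replaced.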
- FIG. 11 is a diagram illustrating an example of the frequency characteristic 341 of the level adjustment coefficient held in the level adjustment coefficient holding unit 340 according to the first embodiment of the present invention.
- Here, the horizontal axis is frequency and the vertical axis is the magnitude of the level adjustment coefficient.
- the level adjustment coefficient frequency characteristic 341 indicates the frequency characteristic of the level adjustment coefficient g (f) for adjusting the level of the replacement spectrum generated by the replacement spectrum generation unit 350.
- In this example, the level adjustment coefficient in the mid-range voice band (fvl to fvh), which corresponds to the voice component, differs from the level adjustment coefficient for the bands outside the voice band.
- the level adjustment coefficient g (f) corresponding to a band other than the voice band in the level adjustment coefficient frequency characteristic 341 is “1.0”. Thereby, the level of the frequency spectrum of the left channel is applied as it is as the level of the replacement spectrum generated by the replacement spectrum generation unit 350.
- the level adjustment coefficient g (f) corresponding to the voice band (fvl to fvh) in the level adjustment coefficient frequency characteristic 341 is gv.
- This level adjustment coefficient gv is a value smaller than “1.0”. Because listeners perceive the voice component in the difference signal as sufficiently small at a coefficient of about 0.1, the level adjustment coefficient gv is preferably set to about 0.1. Depending on the frequency characteristics of the difference signal, however, the result may sound unnatural even at about 0.1; in such a case, the level adjustment coefficient gv may be set to about 0.2 to 0.3.
- By setting the level adjustment coefficient gv for the voice band (fvl to fvh), which contains the voice component, smaller than the level adjustment coefficient for the bands outside the voice band, the voice component is sufficiently suppressed.
- As a result, an accompaniment signal that does not sound unnatural can be generated.
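The frequency characteristic of g(f) described above is a two-valued function, which could be sketched as below. This is only an illustrative reading of the description: the function name is hypothetical, and the band limits of 300 Hz and 3400 Hz used in the usage lines are placeholder assumptions, since the text gives no numeric values for fvl and fvh.

```python
def level_adjustment_coefficient(freq_hz, fvl_hz, fvh_hz, gv=0.1):
    """g(f): gv inside the voice band [fvl, fvh], 1.0 everywhere else."""
    if not 0.0 < gv < 1.0:
        raise ValueError("gv is expected to be smaller than 1.0")
    if fvl_hz <= freq_hz <= fvh_hz:
        return gv          # voice band: attenuate the replacement spectrum
    return 1.0             # other bands: left channel level is used as is


# Assumed voice band limits, for illustration only.
inside = level_adjustment_coefficient(1000.0, 300.0, 3400.0)
outside = level_adjustment_coefficient(100.0, 300.0, 3400.0)
```

Passing `gv=0.2` or `gv=0.3` corresponds to the milder setting mentioned for cases where 0.1 sounds unnatural.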
- FIG. 12 is a diagram relating to an example of a method for determining a difference spectrum corresponding to a low level band by the low level band determination unit 330 according to the first embodiment of the present invention.
- a left channel spectrum envelope 315, a left channel spectrum smooth line 331, a difference spectrum envelope 322, and a low level threshold line 332 are shown.
- Here, the vertical axis is power and the horizontal axis is frequency.
- the left channel spectrum envelope 315 indicates the envelope of the frequency spectrum Pl (f) of the left channel generated by the frequency spectrum generator 311.
- the level Pl (f) of the frequency spectrum generally decreases as the frequency f increases.
- the left channel spectrum smooth line 331 is a smooth line SMT (f) generated by smoothing the left channel spectrum envelope 315.
- the smooth line SMT (f) is generated by calculating the slope of the straight line based on the level of the frequency spectrum of the left channel.
- the left channel spectrum smooth line 331 may be generated, for example, by moving average. Although an example in which the smooth line 331 is calculated based on the frequency spectrum of the left channel is shown here, the smooth line SMT (f) may be generated based on the difference spectrum envelope 322.
- the difference spectrum envelope 322 is an envelope of the difference spectrum D (f) calculated by the difference spectrum calculation unit 320.
- This difference spectrum envelope 322 shows the first and second low-level bands Δfa (fla to fha) and Δfb (flb to fhb), where the level drops steeply. In addition, the level D(f) of the difference spectrum generally decreases as the frequency f increases, as with the left channel spectrum envelope 315. Thus, the difference spectrum D(f) and the frequency spectrum Pl(f) of the left channel tend to have globally similar characteristics.
- The levels of the difference spectra corresponding to the first and second low-level bands (Δfa and Δfb) in the difference spectrum envelope 322 differ from each other. This arises when encoded data containing a band in which the quantized values of the left and right channels coincide, through quantization or shared band encoding, is decoded and the frequency spectrum of each channel is converted from the frequency domain to the time domain. Because this conversion introduces a slight difference between the frequency spectrum levels of the left and right channels in the shared band, the spectrum levels in the first and second low-level bands (Δfa and Δfb) of the difference spectrum envelope 322 differ.
- the low level threshold line 332 is a line of the low level threshold TH (f) set based on the left channel spectrum smooth line 331 and a certain threshold coefficient. This threshold coefficient is set according to the level of the assumed low level band. If the threshold coefficient is too large, the low-level band determination unit 330 may erroneously determine a band that is not a low-level band as a low-level band. Therefore, it is desirable to set the threshold coefficient as small as possible.
- By using the level Pl(f) of the frequency spectrum of the left channel and the threshold coefficient, the low-level band determination unit 330 can set a low level threshold line 332 that readily approximates the global frequency characteristics of the difference spectrum. As a result, the low level band determination unit 330 can determine the difference spectrum corresponding to the low-level band more accurately than when a constant threshold is applied to all frequency bands.
- Although an example in which the low level threshold line 332 is generated based on the frequency spectrum of the left channel has been described, the frequency spectrum of the right channel or the sum of the frequency spectra of the two channels may be used instead.
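One possible way to obtain a smooth line and a frequency-dependent threshold is sketched below. It uses the moving average that the description mentions as an alternative, not the straight-line fit; the function names, the window length, and the name `threshold_coeff` for the threshold coefficient are assumptions made for this sketch.

```python
def smooth_line(levels, window=5):
    """Centered moving average of spectrum levels; the window shrinks at the edges."""
    if window < 1:
        raise ValueError("window must be at least 1")
    half = window // 2
    smoothed = []
    for f in range(len(levels)):
        lo = max(0, f - half)
        hi = min(len(levels), f + half + 1)
        smoothed.append(sum(levels[lo:hi]) / (hi - lo))
    return smoothed


def low_level_threshold(smt, threshold_coeff):
    """TH(f) = threshold_coeff * SMT(f) for every spectrum number f."""
    return [threshold_coeff * s for s in smt]


smt = smooth_line([0.0, 3.0, 6.0], window=3)
th = low_level_threshold(smt, threshold_coeff=0.5)
```

Because the threshold follows the smoothed left channel level, it falls with frequency in the same way as the difference spectrum, which is the property the description relies on.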
- FIG. 13 is a flowchart illustrating an example of a processing procedure of the accompaniment signal generation method by the audio component removal unit 300 according to the first embodiment of the present invention.
- the frequency spectrum generation units 311 and 312 generate N frequency spectra for each channel based on the stereo signals supplied from the left and right channel signal lines 291 and 292 (step S911).
- The low-level band determination unit 330 calculates the spectrum smooth line SMT(f) of the left channel based on the levels Pl(0) to Pl(N-1) of the N frequency spectra in the left channel (step S912). Subsequently, the spectrum number f of the frequency spectra Pl(f) and Pr(f) of each channel for which the difference spectrum is to be calculated is set to “0” (step S913).
- The frequency spectrum generation units 311 and 312 output the levels Pl(0) and Pr(0) of the 0th frequency spectrum in the left and right channels, respectively (step S914).
- The difference spectrum calculation unit 320 calculates the 0th difference spectrum D(0), which is the absolute value of the difference (Pl(0) - Pr(0)) between the 0th frequency spectra in the left and right channels (step S915).
- Step S915 is an example of the difference spectrum calculation procedure described in the claims.
- The low level band determination unit 330 then executes a low level band determination process that determines whether the calculated 0th difference spectrum D(0) is a difference spectrum corresponding to the low-level band (step S930). The spectrum replacement unit 360 then determines whether the replacement information Info(0) corresponding to the 0th difference spectrum D(0) is true (TRUE) (step S916).
- If the replacement information Info(0) is true (TRUE), a spectrum replacement process is executed (step S940). On the other hand, if the replacement information Info(0) is not true (TRUE), the replacement spectrum generating unit 350 does not replace the 0th difference spectrum D(0) with the replacement spectrum, and the process proceeds to step S917.
- Next, “1” is added to the spectrum number f (step S917), and it is determined whether the incremented spectrum number f is less than the number of spectra N (step S918). If the spectrum number f is less than the number of spectra N, the process returns to step S914, and the series of processes in steps S914 to S918 and S930 is repeated until the spectrum number f equals the number of spectra N.
- Step S919 is an example of the accompaniment signal generation procedure described in the claims.
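The per-spectrum loop of steps S913 to S918 can be sketched as follows, under the assumption that the threshold TH(f) and the level adjustment coefficient g(f) have already been computed for each spectrum number. The function name and the sample values are hypothetical, and the conversion to a time-domain signal (step S919) is deliberately left out of this sketch.

```python
def accompaniment_spectrum(left, right, threshold, gain):
    """Spectrum-domain part of the procedure: difference, low-level test, replacement."""
    n = len(left)
    if not (len(right) == len(threshold) == len(gain) == n):
        raise ValueError("all inputs must have the same length N")
    out = []
    for f in range(n):                   # f runs from 0 to N-1 (S913, S917, S918)
        d = abs(left[f] - right[f])      # difference spectrum D(f) (S915)
        info = d < threshold[f]          # replacement information Info(f) (S930)
        if info:                         # S916
            d = gain[f] * left[f]        # replacement spectrum R(f) (S940)
        out.append(d)
    return out


left = [10.0, 8.0, 6.0, 4.0]
right = [7.0, 8.0, 6.0, 3.0]
threshold = [1.0, 0.8, 0.6, 0.4]
gain = [1.0, 0.5, 1.0, 1.0]
spectrum = accompaniment_spectrum(left, right, threshold, gain)
```

In this made-up data, spectra 1 and 2 have identical left and right levels, so their difference falls below the threshold and they are the only ones replaced.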
- FIG. 14 is a flowchart illustrating an example of a processing procedure of the low-level band determination process (step S930) by the low-level band determination unit 330 according to the first embodiment of the present invention.
- First, a low level threshold TH(f) is calculated by multiplying the spectrum smooth line SMT(f) generated in step S912 by a certain threshold coefficient (step S931).
- Although an example in which the spectrum smooth line SMT(f) is generated in step S912 based on all the frequency spectra has been described, the average value of a certain number of preceding frequency spectra Pl(f) may instead be used as the spectrum smooth line SMT(f).
- Next, it is determined whether the level D(f) of the difference spectrum output from the difference spectrum calculation unit 320 is less than the low level threshold TH(f) (step S932). That is, it is determined whether the difference spectrum D(f) output from the difference spectrum calculation unit 320 is a difference spectrum corresponding to the low-level band.
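- If the level D(f) of the difference spectrum is less than the low level threshold TH(f), the replacement information Info(f) is set to true (TRUE) (step S933). That is, the frequency band where the level drop in the envelope of the difference spectrum is steep is determined to be the low-level band. Steps S932 and S933 are an example of the low-level band determination procedure described in the claims.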
- On the other hand, when the difference spectrum D(f) is equal to or higher than the low level threshold TH(f), the difference spectrum D(f) does not need to be replaced with the replacement spectrum, and the replacement information Info(f) is set to false (FALSE) (step S934).
- After step S933 or S934 is executed, the low level band determination process ends.
- FIG. 15 is a flowchart illustrating a processing procedure example of the spectrum replacement process (step S940) by the spectrum replacement unit 360 according to the first embodiment of the present invention.
- the replacement spectrum generation unit 350 acquires the level adjustment coefficient g (f) from the level adjustment coefficient holding unit 340 (step S941). Subsequently, the replacement spectrum generation unit 350 acquires the frequency spectrum Pl (f) from the frequency spectrum generation unit 311 of the left channel (step S942).
- The replacement spectrum generation unit 350 multiplies the acquired level adjustment coefficient g(f) by the frequency spectrum Pl(f) of the left channel, thereby calculating the replacement spectrum R(f) (step S943). That is, the replacement spectrum generating unit 350 generates a replacement spectrum for replacing the difference spectrum based on the frequency spectrum in the acoustic signal of the left channel.
- Step S943 is an example of a replacement spectrum generation procedure described in the claims.
- Step S944 is an example of the spectrum replacement procedure described in the claims.
- the difference spectrum D (f) corresponding to the low level band is replaced with the replacement spectrum generated based on the frequency spectrum Pl (f) of the left channel.
- an accompaniment signal without a sense of incongruity can be generated.
- The voice component of the accompaniment signal can be sufficiently suppressed by setting the level adjustment coefficient g(f) corresponding to the voice band smaller than that of the other bands.
- In this case, however, the level of the replacement spectrum in the voice band, scaled by the level adjustment coefficient g(f), may become relatively small compared with the levels of the other difference spectra, and the resulting accompaniment signal may sound unnatural.
- FIG. 16 is a diagram illustrating a configuration example of an audio component removal unit 300 according to the second embodiment of the present invention.
- This speech component removal unit 300 includes a speech coefficient setting unit 651 and a replacement spectrum generation unit 652 instead of the replacement spectrum generation unit 350 shown in FIG.
- Because the configuration other than the speech coefficient setting unit 651 and the replacement spectrum generation unit 652 is the same as that in FIG. 5, the same reference numerals as those in FIG. 5 are used and the description is omitted here.
- the audio coefficient setting unit 651 sets an audio coefficient based on the frequency spectrum of the left channel from the frequency spectrum generation unit 311 and the level adjustment coefficient corresponding to the audio band in the level adjustment coefficient holding unit 340.
- The voice coefficient setting unit 651 sets a voice coefficient corresponding to the voice band based on the level ratio between the frequency spectra corresponding to the bands other than the voice band and the frequency spectra corresponding to the voice band, within the entire frequency spectrum of the left channel.
- For example, the voice coefficient setting unit 651 sets the voice coefficient based on the level ratio between the average level of the frequency spectra outside the voice band and the average level of the frequency spectra in the voice band, both taken from the frequency spectrum of the left channel. That is, the voice coefficient setting unit 651 sets the voice coefficient larger as the level of the frequency spectrum corresponding to the band other than the voice band increases, and smaller as the level of the frequency spectrum corresponding to the voice band increases.
- the speech coefficient setting unit 651 supplies the set speech coefficient and the level adjustment coefficient corresponding to the part other than the speech band in the level adjustment coefficient holding unit 340 to the replacement spectrum generation unit 652.
- the speech coefficient setting unit 651 is an example of the speech coefficient setting unit described in the claims.
- the replacement spectrum generation unit 652 generates a replacement spectrum based on the frequency spectrum of the left channel and the voice coefficient or level adjustment coefficient from the voice coefficient setting unit 651 corresponding to the frequency spectrum.
- the replacement spectrum generation unit 652 generates a replacement spectrum based on the frequency spectrum of the left channel from the frequency spectrum generation unit 311 and the speech coefficient set by the speech coefficient setting unit 651.
- the replacement spectrum generation unit 652 calculates the level of the replacement spectrum by, for example, multiplying the level of the frequency spectrum of the left channel by the voice coefficient or the level adjustment coefficient from the voice coefficient setting unit 651. Further, the replacement spectrum generation unit 652 supplies the calculated replacement spectrum to the spectrum replacement unit 360.
- the replacement spectrum generation unit 652 corresponds to the replacement spectrum generation unit 350 shown in FIG.
- the replacement spectrum generation unit 652 is an example of a replacement spectrum generation unit described in the claims.
- the level of the replacement spectrum corresponding to the audio band can be adjusted according to the level of the frequency spectrum of the left channel.
- a speech coefficient setting method by the speech coefficient setting unit 651 will be described below with reference to the drawings.
- FIG. 17 is a diagram illustrating an example of a speech coefficient setting method performed by the speech coefficient setting unit 651 according to the second embodiment of the present invention.
- the left channel spectrum envelope Pl (f) 316, the accompaniment band average value Pia, and the voice band average value Pva are shown.
- the vertical axis is the power value, and the horizontal axis is the frequency.
- the left channel spectrum envelope Pl (f) indicates the envelope of the left channel frequency spectrum Pl (f) generated by the frequency spectrum generation unit 311.
- the accompaniment band average value Pia indicates an average value of the frequency spectrum Pl (f) in the accompaniment band (0 to fvl).
- the accompaniment band average value Pia is calculated by the audio coefficient setting unit 651.
- the voice band average value Pva indicates an average value of the frequency spectrum Pl (f) in the voice band (fvl to fvh).
- the voice band average value Pva is calculated by the voice coefficient setting unit 651.
- The voice coefficient setting unit 651 calculates the voice coefficient V based on, for example, the following equation, where gv is the level adjustment coefficient in the level adjustment coefficient holding unit 340 corresponding to the voice band.
- V = gv × (Pia / Pva)
- Thus, the voice coefficient V, which is based on the level adjustment coefficient gv, increases as the accompaniment band average value Pia increases and decreases as the voice band average value Pva increases.
- When the accompaniment band average value Pia is larger than the voice band average value Pva, the voice coefficient V takes a value larger than the level adjustment coefficient gv. The level of the replacement spectrum corresponding to the voice band is therefore raised, the level difference from the difference spectra corresponding to the bands other than the voice band is reduced, and auditory noise in the accompaniment signal can be suppressed.
- Conversely, when the accompaniment band average value Pia is smaller than the voice band average value Pva, the voice coefficient V is smaller than the level adjustment coefficient gv.
- The level of the replacement spectrum corresponding to the voice band is then lowered, the level difference from the difference spectra corresponding to the bands other than the voice band is reduced, and auditory noise in the accompaniment signal can be suppressed.
- Moreover, because the level of the replacement spectrum corresponding to the voice component is lowered, the voice component in the accompaniment signal can be suppressed further than with the constant level adjustment coefficient gv.
- In this way, the level of the replacement spectrum corresponding to the voice band can be adjusted according to the characteristics of the frequency spectrum of the left channel. That is, the level of the replacement spectrum corresponding to the voice band can be adjusted based on the frequency characteristic of the frequency spectrum of the left channel, which approximates the frequency characteristic of the difference spectrum.
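A small numeric sketch of the voice coefficient V = gv × (Pia / Pva) follows. The function names, the use of index ranges `voice_start` and `voice_stop` to mark the voice band, and the sample levels are assumptions for illustration; the accompaniment band is taken as the indices below the voice band, following the range (0 to fvl) given in the description.

```python
def band_average(levels, start, stop):
    """Mean of levels[start:stop]; the band must contain at least one spectrum."""
    band = levels[start:stop]
    if not band:
        raise ValueError("empty band")
    return sum(band) / len(band)


def voice_coefficient(left, voice_start, voice_stop, gv=0.1):
    """V = gv * (Pia / Pva) computed from left channel levels."""
    pia = band_average(left, 0, voice_start)           # accompaniment band (0 to fvl)
    pva = band_average(left, voice_start, voice_stop)  # voice band (fvl to fvh)
    if pva == 0:
        raise ValueError("voice band average is zero")
    return gv * (pia / pva)


# Made-up levels: strong accompaniment band, weak voice band.
v_large = voice_coefficient([8.0, 8.0, 2.0, 2.0, 1.0], 2, 4)
# Made-up levels: weak accompaniment band, strong voice band.
v_small = voice_coefficient([2.0, 2.0, 8.0, 8.0, 1.0], 2, 4)
```

With these values Pia / Pva is 4 in the first case and 0.25 in the second, so V lies above gv in the first case and below it in the second, matching the two situations described above.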
- FIG. 18 is a flowchart illustrating an example of a processing procedure of spectrum replacement processing (step S950) in the speech component removal unit 300 according to the second embodiment of the present invention.
- This step S950 processing corresponds to the processing in step S940 shown in FIG.
- the accompaniment band average value Pia and the voice band average value Pva are calculated by the voice coefficient setting unit 651 based on the level of the frequency spectrum from the frequency spectrum generation unit 311.
- the level adjustment coefficient holding unit 340 holds the level adjustment coefficient gv corresponding to the voice band shown in FIG.
- the level adjustment coefficient g (f) is acquired from the level adjustment coefficient holding unit 340 by the audio coefficient setting unit 651 (step S951). Subsequently, the replacement spectrum generation unit 652 acquires the frequency spectrum Pl (f) of the left channel from the frequency spectrum generation unit 311 (step S952).
- Next, the speech coefficient setting unit 651 determines whether the spectrum number f is a number corresponding to the voice band (step S953). If the spectrum number f does not correspond to the voice band, the replacement spectrum generation unit 652 calculates the replacement spectrum R(f) by multiplying the level adjustment coefficient g(f) by the frequency spectrum Pl(f) of the left channel (step S958).
- If the spectrum number f corresponds to the voice band, the voice coefficient setting unit 651 calculates the voice coefficient V by multiplying the ratio of the accompaniment band average value Pia to the voice band average value Pva by the level adjustment coefficient gv corresponding to the voice band (step S955).
- the replacement spectrum generation unit 652 multiplies the calculated speech coefficient V by the frequency spectrum Pl (f) of the left channel to calculate a replacement spectrum R (f) (step S956). Note that steps S953 to S956 and S958 are an example of a replacement spectrum generation procedure described in the claims.
- Step S957 is an example of a spectrum replacement procedure described in the claims.
- In this way, the level of the replacement spectrum corresponding to the voice band can be adjusted appropriately according to the size of the accompaniment component in the frequency spectrum of the left channel, which approximates the frequency characteristic of the difference spectrum.
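The branch between steps S953, S955, S956, and S958 might look like the following sketch. The function name, the representation of the voice band as a half-open index range, and the sample values are assumptions; Pia and Pva are passed in as already computed, as the description of FIG. 18 assumes.

```python
def replacement_level(f, left, gain, voice_band, pia, pva, gv):
    """R(f) for spectrum number f in the second embodiment."""
    start, stop = voice_band
    if start <= f < stop:              # S953: f belongs to the voice band
        if pva == 0:
            raise ValueError("voice band average is zero")
        v = gv * (pia / pva)           # S955: voice coefficient V
        return v * left[f]             # S956
    return gain[f] * left[f]           # S958: outside the voice band


left = [8.0, 8.0, 2.0, 2.0, 1.0]
gain = [1.0] * 5                       # g(f) outside the voice band
r_outside = replacement_level(0, left, gain, (2, 4), pia=8.0, pva=2.0, gv=0.1)
r_inside = replacement_level(2, left, gain, (2, 4), pia=8.0, pva=2.0, gv=0.1)
```

Compared with the first embodiment, only the voice band branch changes: the fixed coefficient gv is scaled by the ratio Pia / Pva before it is applied.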
- An accompaniment signal that does not sound unnatural can be generated by replacing the difference spectrum corresponding to the low-level band with the replacement spectrum. That is, a more natural accompaniment signal can be generated by correcting the frequency components of the difference signal based on the frequency spectrum of the left channel, which approximates the frequency characteristic of the difference signal.
- Although an example in which the replacement spectrum is generated based on the frequency spectrum of the left channel has been described, the replacement spectrum may instead be generated based on the frequency spectrum of the right channel from the frequency spectrum generation unit 312.
- a replacement spectrum may be generated based on the level of the frequency spectrum of the right and left channels.
- FIG. 19 is a block diagram illustrating a configuration example of an audio component removal unit 300 according to the third embodiment of the present invention.
- the audio component removing unit 300 includes a frequency spectrum adding unit 380 in addition to the audio component removing unit 300 shown in FIG.
- the configuration other than the frequency spectrum addition unit 380 is the same as that shown in FIG. 5, the same reference numerals are given and the description thereof is omitted here.
- the frequency spectrum adding unit 380 adds the right and left channel frequency spectra respectively supplied from the frequency spectrum generating units 311 and 312 and divides the added value by two. That is, the frequency spectrum adding unit 380 calculates the average value of the frequency spectra of the left and right channels. Further, the frequency spectrum addition unit 380 supplies the calculated average value of the frequency spectrum to the replacement spectrum generation unit 350 and the low level band determination unit 330.
- As a result, the frequency components of the difference signal can be corrected with a replacement spectrum based on the average of the frequency characteristics of both the right and left channels.
- the deviation of components included in the right and left channel acoustic signals is removed, so that more natural spectrum correction can be performed. That is, by generating a replacement spectrum based on at least one of the frequency spectra in the two-channel acoustic signal, auditory noise for the accompaniment signal can be suppressed.
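The averaging performed by the frequency spectrum adding unit 380 amounts to an element-wise mean of the two channel spectra, as in this short sketch. The function name and the sample values are hypothetical.

```python
def average_spectrum(left, right):
    """Element-wise (Pl(f) + Pr(f)) / 2 over spectra of equal length."""
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    return [(l + r) / 2 for l, r in zip(left, right)]


averaged = average_spectrum([4.0, 2.0, 0.0], [2.0, 6.0, 1.0])
```

The averaged levels would then take the place of the left channel levels Pl(f) when the smooth line and the replacement spectrum are computed.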
- An enhancement filter that amplifies the low-frequency component and an attenuation filter that attenuates the mid-frequency component in order to attenuate the voice component may also be provided.
- The embodiments of the present invention are examples for embodying the present invention and, as stated explicitly in the embodiments, the matters in the embodiments and the invention-specifying matters in the claims correspond to each other. Similarly, the invention-specifying matters in the claims and the matters in the embodiments that bear the same names correspond to each other.
- the present invention is not limited to the embodiments, and can be embodied by making various modifications to the embodiments without departing from the gist of the present invention.
- The processing procedures described in the embodiments of the present invention may be regarded as a method having this series of procedures, or as a program for causing a computer to execute the series of procedures or a recording medium storing that program.
- As this recording medium, for example, a CD (Compact Disc), an MD (MiniDisc), a DVD (Digital Versatile Disk), a memory card, a Blu-ray Disc (registered trademark), or the like can be used.
Abstract
Description
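1. First embodiment (accompaniment signal generation method: an example in which the replacement spectrum is generated based on the frequency components of the left channel)
2. Second embodiment (accompaniment signal generation method: an example in which the voice coefficient for level adjustment of the replacement spectrum is set based on the frequency components of the left channel)
3. Third embodiment (accompaniment signal generation method: an example in which the replacement spectrum is generated based on the frequency components of the right and left channels)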
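[Configuration example of music playback device]
FIG. 1 is a block diagram illustrating a configuration example of the music playback device according to the first embodiment of the present invention. The music playback device 100 includes an operation reception unit 110, a control unit 120, a display unit 130, an acoustic data storage unit 140, an acoustic data input unit 150, an analog conversion unit 160, an amplifier 170, and a speaker 180. The music playback device 100 is an example of the acoustic signal processing device described in the claims.
FIG. 2 is a block diagram illustrating a configuration of a conventional acoustic signal encoding device. Here, as an example, an acoustic signal encoding device 700 that performs encoding by the intensity method is described. The acoustic signal encoding device 700 encodes two-channel acoustic signals input via input lines 701 and 702 and outputs the encoded acoustic signals as encoded acoustic data via an output line 759.
FIG. 3 is a conceptual diagram illustrating an example of the frequency spectra divided in the normalization units 721 and 722. FIG. 3A is a conceptual diagram showing the divided bands into which the frequency spectrum, which is the frequency component of the acoustic signal, is divided for each predetermined band by the left channel normalization unit 721. FIG. 3B is a conceptual diagram showing the frequency spectra in the divided bands shown in FIG. 3A.
FIG. 4 is a block diagram illustrating a configuration example of the acoustic signal decoding processing unit 200 according to the first embodiment of the present invention. The acoustic signal decoding processing unit 200 includes a decoding unit 210, a left channel inverse quantization unit 221, a right channel inverse quantization unit 222, a shared band inverse quantization unit 223, selection units 231 and 232, inverse normalization units 241 and 242, and acoustic signal generation units 251 and 252.
FIG. 5 is a block diagram illustrating a configuration example of the audio component removal unit 300 according to the first embodiment of the present invention. The audio component removal unit 300 reduces the voice component in the acoustic signal of each channel supplied from the acoustic signal decoding processing unit 200 via the left and right channel signal lines 291 and 292 included in the signal line 290, and outputs the result as an accompaniment signal.
FIG. 6 is a conceptual diagram illustrating an example of the frequency distribution of the voice component and the accompaniment component in a difference signal generated from the difference between the acoustic signals of the left and right channels. Here, it is assumed that the difference signal is generated by subtracting, in the subtraction unit 321, the stereo signals, that is, the right and left channel acoustic signals in which the vocal is localized at the center and the instruments of the accompaniment are localized at scattered positions.
FIG. 7 is a diagram relating to the low-level band caused by quantization in the quantization units 731 and 732 of the acoustic signal encoding device 700. FIGS. 7A and 7B are diagrams illustrating examples of the left normalized component 771 and the right normalized component 772 generated by the normalization units 721 and 722 of the acoustic signal encoding device 700, respectively. FIG. 7C is a diagram illustrating the normalized difference absolute value 773, which is the absolute value of the difference between the left normalized component 771 and the right normalized component 772.
FIG. 8 is a diagram relating to the low-level band caused by the shared band encoding process in the shared band encoding unit 800 of the acoustic signal encoding device 700. Here, it is assumed that the shared band encoding unit 800 determines the i-th divided band B[i], in which the normalized components of the left and right channels are highly correlated, to be a shared band, and that the normalized component of the left channel in that shared band is quantized.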
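FIG. 9 is a conceptual diagram illustrating an example of the divided bands B[i] based on the difference spectrum calculated by the difference spectrum calculation unit 320 according to the first embodiment of the present invention. Here, for convenience, a spectrum envelope such as the frequency spectrum envelope 725 shown in FIG. 3A is omitted.
FIG. 10 is a conceptual diagram illustrating an example in which the difference spectrum corresponding to the low-level band is replaced with a replacement spectrum by the audio component removal unit 300 according to the first embodiment of the present invention.
FIG. 11 is a diagram illustrating an example of the frequency characteristic 341 of the level adjustment coefficient held in the level adjustment coefficient holding unit 340 according to the first embodiment of the present invention. Here, the horizontal axis is frequency and the vertical axis is the magnitude of the level adjustment coefficient.
FIG. 12 is a diagram relating to an example of a method by which the low level band determination unit 330 according to the first embodiment of the present invention determines the difference spectrum corresponding to the low-level band. Here, a left channel spectrum envelope 315, a left channel spectrum smooth line 331, a difference spectrum envelope 322, and a low level threshold line 332 are shown. The vertical axis is power and the horizontal axis is frequency.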
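Next, the operation of the voice component removal unit 300 according to the first embodiment of the present invention is described with reference to the drawings.
FIG. 14 is a flowchart showing an example of the processing procedure of the low-level band determination process (step S930) performed by the low-level band determination unit 330 according to the first embodiment of the present invention.
FIG. 15 is a flowchart showing an example of the processing procedure of the spectrum replacement process (step S940) performed by the spectrum replacement unit 360 according to the first embodiment of the present invention.
[Configuration example of the voice component removal unit 300]
FIG. 16 is a diagram showing a configuration example of the voice component removal unit 300 according to the second embodiment of the present invention. This voice component removal unit 300 includes a voice coefficient setting unit 651 and a replacement spectrum generation unit 652 in place of the replacement spectrum generation unit 350 shown in FIG. 5. The components other than the voice coefficient setting unit 651 and the replacement spectrum generation unit 652 are the same as in FIG. 5, so they are given the same reference numerals as in FIG. 5 and their description is omitted here.
FIG. 17 is a diagram showing an example of how the voice coefficient setting unit 651 according to the second embodiment of the present invention sets the voice coefficient. It shows the left-channel spectrum envelope Pl(f) 316, the accompaniment band average value Pia, and the voice band average value Pva. The vertical axis is power and the horizontal axis is frequency.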
V = gv×(Pia/Pva)
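FIG. 18 is a flowchart showing an example of the processing procedure of the spectrum replacement process (step S950) in the voice component removal unit 300 according to the second embodiment of the present invention. The process of step S950 corresponds to the process of step S940 shown in FIG. 13. It is assumed here that the voice coefficient setting unit 651 has calculated the accompaniment band average value Pia and the voice band average value Pva based on the level of the frequency spectrum from the frequency spectrum generation unit 311. It is also assumed that the level adjustment coefficient holding unit 340 holds the level adjustment coefficient gv corresponding to the voice band shown in FIG. 11.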
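The relation V = gv×(Pia/Pva) can be worked through in a short sketch. The band limits, the sample power values, and the function names are hypothetical, and the sketch assumes that the accompaniment band is simply every bin outside the voice band; only the formula itself comes from the text.

```python
def band_average(power, freqs_hz, in_band):
    """Average the power of the bins whose frequency satisfies in_band."""
    selected = [p for p, f in zip(power, freqs_hz) if in_band(f)]
    return sum(selected) / len(selected)


def voice_coefficient(left_power, freqs_hz, gv, voice_band=(200.0, 4000.0)):
    """Compute V = gv * (Pia / Pva) from a left-channel power spectrum."""
    lo, hi = voice_band
    # Pva: average power inside the (assumed) voice band.
    pva = band_average(left_power, freqs_hz, lambda f: lo <= f <= hi)
    # Pia: average power outside the voice band (assumed accompaniment band).
    pia = band_average(left_power, freqs_hz, lambda f: not (lo <= f <= hi))
    return gv * (pia / pva)


# Hypothetical four-bin spectrum: two bins in the voice band, two outside.
freqs = [100.0, 500.0, 1000.0, 8000.0]
power = [2.0, 8.0, 8.0, 6.0]

v = voice_coefficient(power, freqs, gv=0.5)  # Pia = 4.0, Pva = 8.0
```

This matches the behavior stated in claim 5: V grows as the level outside the voice band rises and shrinks as the voice band level rises. A real implementation would also need to guard against Pva being zero.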
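FIG. 19 is a block diagram showing a configuration example of the voice component removal unit 300 according to the third embodiment of the present invention. In addition to the components of the voice component removal unit 300 shown in FIG. 5, this voice component removal unit 300 includes a frequency spectrum addition unit 380. The components other than the frequency spectrum addition unit 380 are the same as those shown in FIG. 5, so they are given the same reference numerals and their description is omitted here.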
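For the third embodiment, a replacement spectrum based on both channels could be sketched as below. This passage does not say how the output of the frequency spectrum addition unit 380 is used, so the sketch assumes the simplest reading: the left and right spectra are added bin by bin and the sum is scaled by a per-bin level adjustment coefficient. The function names and values are hypothetical.

```python
def sum_spectrum(left_spec, right_spec):
    """Add the left and right frequency spectra bin by bin."""
    return [l + r for l, r in zip(left_spec, right_spec)]


def replacement_from_both(left_spec, right_spec, gains):
    """Scale the summed spectrum by per-bin coefficients (assumed usage)."""
    summed = sum_spectrum(left_spec, right_spec)
    return [g * s for g, s in zip(gains, summed)]


# Hypothetical two-bin complex spectra.
left = [1 + 1j, 2 + 0j]
right = [1 - 1j, 0 + 2j]
gains = [0.5, 0.25]

replacement = replacement_from_both(left, right, gains)
```

Using the sum rather than a single channel means instruments panned to either side can both contribute to the bins that are filled in.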
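110 Operation reception unit
120 Control unit
130 Display unit
140 Acoustic data storage unit
150 Acoustic data input unit
160 Analog conversion unit
170 Amplifier
180 Speaker
200 Acoustic signal decoding processing unit
210 Decoding unit
221 Left-channel inverse quantization unit
222 Right-channel inverse quantization unit
223 Shared-band inverse quantization unit
231, 232 Selection unit
241 Inverse normalization unit
251 Acoustic signal generation unit
300 Voice component removal unit
311, 312 Frequency spectrum generation unit
320 Difference spectrum calculation unit
330 Low-level band determination unit
340 Level adjustment coefficient holding unit
350 Replacement spectrum generation unit
360 Spectrum replacement unit
370 Accompaniment signal generation unit
380 Frequency spectrum addition unit
651 Voice coefficient setting unit
652 Replacement spectrum generation unit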
Claims (9)
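- An acoustic signal processing device comprising:
a difference spectrum calculation unit that calculates, as a difference spectrum, the difference between the frequency spectra of two-channel acoustic signals that, among acoustic signals of a plurality of channels, contain voice components with substantially equal frequency distributions;
a low-level band determination unit that determines, as a low-level band, a frequency band in which the level of the envelope of the difference spectrum calculated by the difference spectrum calculation unit drops steeply;
a replacement spectrum generation unit that generates a replacement spectrum for replacing the difference spectrum, based on at least one of the frequency spectra of the two-channel acoustic signals;
a spectrum replacement unit that replaces, with the replacement spectrum, the part of the difference spectrum calculated by the difference spectrum calculation unit that corresponds to the low-level band; and
an accompaniment signal generation unit that generates an accompaniment signal by converting the frequency spectrum output from the spectrum replacement unit into a time-domain signal.
- The acoustic signal processing device according to claim 1, wherein the replacement spectrum generation unit generates the replacement spectrum based on at least one of the frequency spectra of the two-channel acoustic signals and a predetermined level adjustment coefficient for adjusting the level of the replacement spectrum.
- The acoustic signal processing device according to claim 2, wherein the replacement spectrum generation unit generates the replacement spectrum based on the level of the at least one frequency spectrum and the level adjustment coefficient for the voice band, which is smaller than the level adjustment coefficient for bands other than the voice band.
- The acoustic signal processing device according to claim 1, further comprising a voice coefficient setting unit that sets a voice coefficient corresponding to the voice band based on the ratio between the level of the frequency spectrum in bands other than the voice band and its level in the voice band, for at least one of the frequency spectra of the two-channel acoustic signals,
wherein the replacement spectrum generation unit generates the replacement spectrum based on the at least one frequency spectrum and the voice coefficient set by the voice coefficient setting unit.
- The acoustic signal processing device according to claim 4, wherein the voice coefficient setting unit sets the voice coefficient larger as the level of the frequency spectrum in bands other than the voice band increases, and smaller as the level of the frequency spectrum in the voice band increases.
- The acoustic signal processing device according to claim 1, wherein the low-level band determination unit determines the low-level band based on the level of each difference spectrum and a low-level threshold for identifying a frequency band in which the level of the envelope drops steeply.
- The acoustic signal processing device according to claim 6, wherein the low-level band determination unit determines the low-level band using the level of the difference spectrum and the low-level threshold set based on the level of at least one of the frequency spectra of the two-channel acoustic signals.
- An accompaniment signal generation method comprising:
a difference spectrum calculation step of calculating, as a difference spectrum, the difference between the frequency spectra of two-channel acoustic signals that, among acoustic signals of a plurality of channels, contain voice components with substantially equal frequency distributions;
a low-level band determination step of determining, as a low-level band, a frequency band in which the level of the envelope of the difference spectrum calculated in the difference spectrum calculation step drops steeply;
a replacement spectrum generation step of generating a replacement spectrum for replacing the difference spectrum, based on at least one of the frequency spectra of the two-channel acoustic signals;
a spectrum replacement step of replacing, with the replacement spectrum, the part of the difference spectrum calculated in the difference spectrum calculation step that corresponds to the low-level band; and
an accompaniment signal generation step of generating an accompaniment signal by converting the frequency spectrum output by the spectrum replacement step into a time-domain signal.
- A program causing a computer to execute:
a difference spectrum calculation step of calculating, as a difference spectrum, the difference between the frequency spectra of two-channel acoustic signals that, among acoustic signals of a plurality of channels, contain voice components with substantially equal frequency distributions;
a low-level band determination step of determining, as a low-level band, a frequency band in which the level of the envelope of the difference spectrum calculated in the difference spectrum calculation step drops steeply;
a replacement spectrum generation step of generating a replacement spectrum for replacing the difference spectrum, based on at least one of the frequency spectra of the two-channel acoustic signals;
a spectrum replacement step of replacing, with the replacement spectrum, the part of the difference spectrum calculated in the difference spectrum calculation step that corresponds to the low-level band; and
an accompaniment signal generation step of generating an accompaniment signal by converting the frequency spectrum output by the spectrum replacement step into a time-domain signal.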
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201080002466.5A CN102138341B (zh) | 2009-07-07 | 2010-06-30 | 声信号处理设备及其处理方法 |
US13/061,687 US8891774B2 (en) | 2009-07-07 | 2010-06-30 | Acoustic signal processing apparatus, processing method therefor, and program |
HK11113424.6A HK1159391A1 (en) | 2009-07-07 | 2011-12-13 | Acoustic signal processing device and processing method thereof |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2009160561A JP5365380B2 (ja) | 2009-07-07 | 2009-07-07 | 音響信号処理装置、その処理方法およびプログラム |
JP2009-160561 | 2009-07-07 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2011004744A1 (ja) | 2011-01-13 |
Family
ID=43429166
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2010/061108 WO2011004744A1 (ja) | 2009-07-07 | 2010-06-30 | 音響信号処理装置、その処理方法およびプログラム |
Country Status (6)
Country | Link |
---|---|
US (1) | US8891774B2 (ja) |
JP (1) | JP5365380B2 (ja) |
CN (1) | CN102138341B (ja) |
HK (1) | HK1159391A1 (ja) |
TW (1) | TWI391916B (ja) |
WO (1) | WO2011004744A1 (ja) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DK2974084T3 (da) | 2013-03-12 | 2020-11-09 | Hear Ip Pty Ltd | Fremgangsmåde og system til støjreduktion |
JP6314803B2 (ja) * | 2014-11-26 | 2018-04-25 | ソニー株式会社 | 信号処理装置、信号処理方法及びプログラム |
EP3741136B1 (de) * | 2018-01-18 | 2024-06-26 | ASK Industries GmbH | Verfahren zur ausgabe eines ein musikstück abbildenden audiosignals in einen innenraum über eine ausgabeeinrichtung |
CN111667805B (zh) * | 2019-03-05 | 2023-10-13 | 腾讯科技(深圳)有限公司 | 一种伴奏音乐的提取方法、装置、设备和介质 |
CN111613197B (zh) * | 2020-05-15 | 2023-05-26 | 腾讯音乐娱乐科技(深圳)有限公司 | 音频信号处理方法、装置、电子设备及存储介质 |
CN115914910A (zh) | 2021-08-17 | 2023-04-04 | 达发科技股份有限公司 | 适应性主动噪声消除装置以及使用其的声音播放*** |
TWI777729B (zh) * | 2021-08-17 | 2022-09-11 | 達發科技股份有限公司 | 適應性主動雜訊消除裝置以及使用其之聲音播放系統 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH10143171A (ja) * | 1996-11-07 | 1998-05-29 | Sony Corp | 信号処理装置および方法 |
JP2005141121A (ja) * | 2003-11-10 | 2005-06-02 | Matsushita Electric Ind Co Ltd | オーディオ再生装置 |
JP2005326587A (ja) * | 2004-05-13 | 2005-11-24 | Fuji Television Network Inc | 音響信号除去装置、音響信号除去方法及び音響信号除去プログラム |
JP2008072600A (ja) * | 2006-09-15 | 2008-03-27 | Kobe Steel Ltd | 音響信号処理装置、音響信号処理プログラム、音響信号処理方法 |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
SE512719C2 (sv) * | 1997-06-10 | 2000-05-02 | Lars Gustaf Liljeryd | En metod och anordning för reduktion av dataflöde baserad på harmonisk bandbreddsexpansion |
US6405163B1 (en) * | 1999-09-27 | 2002-06-11 | Creative Technology Ltd. | Process for removing voice from stereo recordings |
JP3810004B2 (ja) * | 2002-03-15 | 2006-08-16 | 日本電信電話株式会社 | ステレオ音響信号処理方法、ステレオ音響信号処理装置、ステレオ音響信号処理プログラム |
JP4594681B2 (ja) * | 2004-09-08 | 2010-12-08 | ソニー株式会社 | 音声信号処理装置および音声信号処理方法 |
JP2006100869A (ja) * | 2004-09-28 | 2006-04-13 | Sony Corp | 音声信号処理装置および音声信号処理方法 |
2009
- 2009-07-07 JP JP2009160561A patent/JP5365380B2/ja not_active Expired - Fee Related
2010
- 2010-06-09 TW TW099118783A patent/TWI391916B/zh not_active IP Right Cessation
- 2010-06-30 US US13/061,687 patent/US8891774B2/en not_active Expired - Fee Related
- 2010-06-30 WO PCT/JP2010/061108 patent/WO2011004744A1/ja active Application Filing
- 2010-06-30 CN CN201080002466.5A patent/CN102138341B/zh not_active Expired - Fee Related
2011
- 2011-12-13 HK HK11113424.6A patent/HK1159391A1/xx not_active IP Right Cessation
Also Published As
Publication number | Publication date |
---|---|
JP5365380B2 (ja) | 2013-12-11 |
US8891774B2 (en) | 2014-11-18 |
HK1159391A1 (en) | 2012-07-27 |
US20120114142A1 (en) | 2012-05-10 |
CN102138341B (zh) | 2014-03-12 |
TW201126518A (en) | 2011-08-01 |
TWI391916B (zh) | 2013-04-01 |
JP2011018962A (ja) | 2011-01-27 |
CN102138341A (zh) | 2011-07-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2016202800B2 (en) | Signal processing apparatus and method, and program | |
JP4899359B2 (ja) | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 | |
JP5365380B2 (ja) | 音響信号処理装置、その処理方法およびプログラム | |
KR101373004B1 (ko) | 고주파수 신호 부호화 및 복호화 장치 및 방법 | |
RU2494477C2 (ru) | Устройство и способ генерирования выходных данных расширения полосы пропускания | |
JP5267362B2 (ja) | オーディオ符号化装置、オーディオ符号化方法及びオーディオ符号化用コンピュータプログラムならびに映像伝送装置 | |
US20130030818A1 (en) | Signal processing apparatus and signal processing method, encoder and encoding method, decoder and decoding method, and program | |
JP6769299B2 (ja) | オーディオ符号化装置およびオーディオ符号化方法 | |
JP5873936B2 (ja) | 知覚的オーディオコーデックにおけるハーモニック信号のための位相コヒーレンス制御 | |
WO2006075563A1 (ja) | オーディオ符号化装置、オーディオ符号化方法およびオーディオ符号化プログラム | |
KR20100086000A (ko) | 오디오 신호 처리 방법 및 장치 | |
JP2010079275A (ja) | 周波数帯域拡大装置及び方法、符号化装置及び方法、復号化装置及び方法、並びにプログラム | |
JP2005202248A (ja) | オーディオ符号化装置およびオーディオ符号化装置のフレーム領域割り当て回路 | |
JP2011059714A (ja) | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 | |
KR20160120713A (ko) | 복호 장치, 부호화 장치, 복호 방법, 부호화 방법, 단말 장치, 및 기지국 장치 | |
KR100891666B1 (ko) | 믹스 신호의 처리 방법 및 장치 | |
US20130085762A1 (en) | Audio encoding device | |
JP4973397B2 (ja) | 符号化装置および符号化方法、ならびに復号化装置および復号化方法 | |
JP5569476B2 (ja) | 信号符号化装置及び方法、信号復号装置及び方法、並びにプログラム及び記録媒体 | |
JP6439843B2 (ja) | 信号処理装置および方法、並びにプログラム | |
JP2007178529A (ja) | 符号化オーディオ信号再生装置及び符号化オーディオ信号再生方法 | |
JP2009031676A (ja) | 信号処理装置及び信号処理方法、並びにプログラム | |
JP2005148539A (ja) | オーディオ信号符号化装置およびオーディオ信号符号化方法 | |
JP2016105180A (ja) | 信号処理装置および方法、並びにプログラム |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | WWE | Wipo information: entry into national phase | Ref document number: 201080002466.5; Country of ref document: CN |
 | WWE | Wipo information: entry into national phase | Ref document number: 13061687; Country of ref document: US |
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 10797055; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 10797055; Country of ref document: EP; Kind code of ref document: A1 |