WO2006022190A1 - Audio encoder - Google Patents
Audio encoder
- Publication number
- WO2006022190A1 (PCT/JP2005/015083)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frequency band
- signal
- audio
- quantization
- audio encoder
- Prior art date
Classifications
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B27/00—Editing; Indexing; Addressing; Timing or synchronising; Monitoring; Measuring tape travel
- G11B27/02—Editing, e.g. varying the order of information signals recorded on, or reproduced from, record carriers
- G11B27/022—Electronic editing of analogue information signals, e.g. audio or video signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/00007—Time or data compression or expansion
Definitions
- the present invention relates to an audio encoder that encodes a multichannel signal of at least two channels.
- the present invention relates to a technique for specifically creating auxiliary information necessary to separate a mixed signal (downmix signal) obtained by downmixing a multichannel signal into an original multichannel signal.
- Spatial Codec spatial coding
- The AAC format, a multichannel codec already widely used as the audio system for digital television, requires a bit rate of 512 kbps or 384 kbps per lch.
- With the Spatial Codec, the aim is to compress and encode multichannel signals at very low bit rates such as 128 kbps, 64 kbps, and 48 kbps.
- Patent Document 1 discloses a technique for that purpose.
- BCC auditory space
- The spectral components of the input signal are down-mixed to generate BCC parameters (e.g., inter-channel level and/or time differences) for a stereo implementation.
- BCC parameters (e.g., inter-channel level and/or time differences)
- the spectral components of the paired left and right channels are down-mixed into a mono component.
- These mono components and the spectral components of the left and right channels that were not down-mixed are inversely transformed into the time domain to generate a hybrid stereo signal, and this hybrid stereo signal is then encoded using conventional coding techniques.
- the encoded bit stream is decoded using conventional decoding techniques.
- By applying the BCC parameters using BCC synthesis techniques, an auditory scene is synthesized based on these mono components and the stereo components that were not down-mixed.
- Patent Document 1: US2003/0236583A1 (corresponding Japanese application: Japanese Patent Application Laid-Open No. 2004-78183)
- However, Patent Document 1 only states that one or more auditory spatial parameters are generated for the one or more down-mixed spectral components, and that these parameters include one or more of an inter-channel level difference and an inter-channel time difference. It does not disclose how such information (auxiliary information) should specifically be quantized and compressed.
- The present invention has been made in view of this conventional problem, and its object is to provide an audio encoder that allows the mixed signal alone to be decoded and that can specifically create the auxiliary information necessary for separating the mixed signal.
- To achieve the above object, the audio encoder according to the present invention is an audio encoder that compresses and encodes an N-channel (N > 1) audio signal, and includes mixed signal encoding means for encoding a mixed signal obtained by mixing the audio signals, and auxiliary information generating means for generating the auxiliary information necessary to decode the mixed signal encoded by the mixed signal encoding means into the N-channel audio signal. The auxiliary information generating means includes conversion means for converting each of the audio signals into a frequency domain signal, division means for dividing the frequency band of the frequency domain signals, detection means for detecting phase difference information and gain ratio information representing the degree of difference between the frequency domain signals, and quantization means for quantizing the phase difference information and the gain ratio information detected by the detection means for each corresponding frequency band.
- The present invention can be realized not only as such an audio encoder, but also as an encoding method whose steps are the characteristic means provided in the audio encoder, or as a program that causes a computer to execute those steps. It can also be configured as an LSI in which the characteristic means provided in the audio encoder are integrated. Needless to say, such a program can be distributed via a recording medium such as a CD-ROM or a transmission medium such as the Internet.
- With the audio encoder of the present invention, the mixed signal alone can be decoded, and the auxiliary information necessary for separating the mixed signal can be specifically created.
- The present invention therefore enables both simple playback and high-quality playback. Now that simple music playback on mobile devices such as mobile phones and full-scale music playback on AV devices have both become widespread, the practical value of the present invention is extremely high.
- FIG. 1 is a block diagram showing the overall configuration of an audio signal encoding/decoding system to which an audio encoder according to the present invention is applied.
- FIG. 2 is a diagram showing the relationship on the frequency axis between two-channel audio signals, mixed signals, gain ratios, and phase differences.
- FIG. 3 is a diagram showing a format structure of a bit stream output from the audio encoder 10.
- FIG. 4 is a block diagram showing a detailed configuration example of an auxiliary information generation unit shown in FIG. 1.
- FIG. 5 is a diagram showing an example of a quantization accuracy setting table 124.
- FIG. 6 is a diagram for comparing the prior art with the present invention.
- FIG. 6 (a) is a diagram showing the quantization accuracy in the prior art
- FIG. 6 (b) is a diagram showing the quantization accuracy in the present invention.
- FIG. 7 is a block diagram showing another detailed configuration example of the auxiliary information generation unit in the second embodiment.
- FIG. 8 is a block diagram showing still another detailed configuration example of the auxiliary information generating unit in the third embodiment.
- FIG. 9 is a diagram showing an example of a frequency division table 1271 related to gain ratio information.
- FIG. 10 is a diagram showing an example of a frequency division table 1272 related to phase difference information.
- FIG. 11 is a diagram showing an example of a quantization accuracy table 1281 in which the quantization accuracy of the gain ratio information and the quantization accuracy of the phase difference information are set separately.
- FIG. 12 is a diagram showing an example of a frequency division table related to gain ratio information during operation in the low bit rate mode.
- FIG. 13 is a diagram showing an example of a frequency division table related to phase difference information during operation in the low bit rate mode.
- FIG. 14 is a diagram for explaining the feature of the present invention.
- FIGS. 14 (a) and 14 (b) compare the quantization accuracy at high and low bit rates.
- FIGS. 14 (c) and 14 (d) compare the quantization accuracy of the phase difference information and the gain ratio information.
- FIG. 1 is a block diagram showing the overall configuration of an audio signal encoding / decoding system configured using an audio encoder according to the present invention.
- The audio signal encoding/decoding system 1 includes an audio encoder 10 that compresses and encodes an audio signal of N channels (N > 1), and an audio decoder 20 that decodes the audio signal compressed and encoded by the audio encoder 10.
- Here, the case where a 2-channel audio signal is encoded is shown.
- The audio encoder 10 includes a mixed signal encoding unit 11 that encodes a mixed signal obtained by mixing the two-channel input audio signals; an auxiliary information generation unit 12 that generates the auxiliary information (level ratio, phase difference) necessary to decode the mixed signal encoded by the mixed signal encoding unit 11 into the N-channel audio signal; and a formatter 13 that combines the mixed signal encoded by the mixed signal encoding unit 11 with the auxiliary information generated by the auxiliary information generation unit 12 to generate a bit stream, and outputs the generated bit stream to the audio decoder 20.
- The mixed signal encoding unit 11 synthesizes the input audio signals into the vector denoted by the symbol X, which is the mixed signal.
- One of the input audio signals is normalized so that its absolute value is 1, and the other is normalized to the level ratio D.
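The vector synthesis described above can be sketched as follows. This is only an illustration: it assumes the downmix is a plain complex sum with no scaling, and the values of `D` and `theta` are made up for the example.

```python
import cmath
import math


def downmix(first: complex, second: complex) -> complex:
    """Vector-sum two frequency-domain samples into one mixed sample."""
    return first + second


# One channel normalized to magnitude 1, the other to the level ratio D,
# separated by a phase difference theta (illustrative values only).
D = 0.5
theta = math.radians(60.0)
first = complex(1.0, 0.0)
second = cmath.rect(D, theta)
X = downmix(first, second)

# The mixed vector is the diagonal of the parallelogram spanned by the two
# inputs, so its length follows from the law of cosines.
expected_length = math.sqrt(1.0 + D * D + 2.0 * D * math.cos(theta))
```

Given the length of X together with D and the phase difference, a decoder has enough information to rebuild the two sides of the parallelogram, which is why only these two values need to travel as auxiliary information.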
- The auxiliary information generation unit 12 detects the level ratio D and the phase difference θ of the input audio signals of the two channels and quantizes them for each corresponding band. Details of the configuration of the auxiliary information generation unit 12 will be described later.
- the formatter 13 connects the mixed signal and the auxiliary information every predetermined frame to generate a bit stream.
- FIG. 3 is a diagram showing a format configuration of the bit stream. In FIG. 3, only one frame is illustrated.
- A mixed signal obtained by down-mixing the two-channel signals is compression-encoded by the MPEG-standard AAC method, and the encoded mixed signal is stored.
- Down-mixing is the process of vector synthesis of the signals.
- A further area of the frame stores auxiliary information including a value representing the gain ratio D between the two-channel audio signals and a value representing the phase difference θ between the two-channel audio signals.
- The value representing the phase difference θ does not necessarily have to be a direct encoding of the phase difference θ.
- The phase difference θ can be expressed in the range of 0° to 180° by the value of cos θ.
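A minimal sketch of carrying the phase difference as a cosine value is shown below. The function names are hypothetical; the point is that the inverse cosine only returns angles between 0° and 180°, so the sign of the phase difference is not preserved.

```python
import math


def encode_phase(theta_deg: float) -> float:
    """Represent a phase difference by its cosine."""
    return math.cos(math.radians(theta_deg))


def decode_phase(cos_value: float) -> float:
    """Recover the phase difference in degrees, always within 0..180."""
    clamped = max(-1.0, min(1.0, cos_value))  # guard against rounding overshoot
    return math.degrees(math.acos(clamped))
```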
- The audio decoder 20 includes a deformatter 21 that separates the encoded mixed signal and the auxiliary information from the bit stream received from the audio encoder 10 for each frame, a mixed signal decoding unit 22 that decodes the encoded mixed signal separated by the deformatter 21, and an output selection unit 23 that selectively outputs either the mixed audio signal or the N-channel audio signal.
- the output selection unit 23 includes an output destination selection switch 231 and a channel extension decoding unit 232.
- When the audio decoder 20 is a device such as a mobile phone on which playback is done simply through headphones, the output destination selection switch 231 outputs the mixed signal decoded by the mixed signal decoding unit 22 as it is.
- Otherwise, the output destination selection switch 231 passes the mixed signal decoded by the mixed signal decoding unit 22 to the channel extension decoding unit 232.
- The channel extension decoding unit 232 performs processing that is the reverse of that of the auxiliary information generation unit 12. That is, it performs inverse quantization to decode the level ratio and the phase difference, and then, on the frequency axis, applies to the signal input from the output destination selection switch 231 the reverse of the processing shown in FIG. 2, in which the diagonal corresponds to the mixed signal and the apex angle is the phase difference θ.
- FIG. 4 is a block diagram illustrating a detailed configuration example of the auxiliary information generation unit illustrated in FIG. 1.
- The auxiliary information generation unit 12a includes a first conversion unit 121, a second conversion unit 122, a detection unit 123, a quantization accuracy setting table 124, and a quantization unit 125.
- the first converter 121 converts the first input audio signal into a frequency band signal.
- the second converter 122 converts the second input audio signal into a frequency band signal.
- the detection unit 123 detects the degree of difference between the corresponding frequency band signals of the first input audio signal and the second input audio signal.
- the quantization accuracy setting table 124 sets the accuracy of quantization in the quantization unit 125 for each frequency band.
- the quantization unit 125 quantizes the degree of difference for each detected frequency band.
- the first converter 121 converts the first input audio signal into a plurality of frequency band signals.
- This is a method in which, for example, a Fourier transform or a cosine transform is used to convert an input audio signal into a frequency spectrum signal, and several spectrum signals are combined to form a predetermined frequency band signal.
- For example, the input audio signal is converted into 1024 frequency spectrum signals; the four lowest-frequency spectrum signals are combined into the first frequency band signal, the next four frequency spectrum signals into the second frequency band signal, and so on.
- a larger number of frequency spectrum signals may be combined into a frequency band signal.
- the frequency band signal may be obtained using a QMF filter bank or the like.
- the second conversion unit 122 converts the second input audio signal into a plurality of frequency band signals. This method is the same as the method in the first conversion unit 121.
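The conversion into frequency band signals can be sketched as below. This is an illustrative reading only: it uses a naive DFT on a short 32-sample block instead of the 1024 spectrum signals mentioned in the text, and a fixed group width of four; `dft` and `group_into_bands` are hypothetical helper names.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (fine for short illustrative blocks)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def group_into_bands(spectrum, width=4):
    """Combine consecutive spectrum signals, lowest first, into band signals."""
    return [spectrum[i:i + width] for i in range(0, len(spectrum), width)]


# A cosine that falls exactly on bin 5 of a 32-point transform.
samples = [math.cos(2 * math.pi * 5 * t / 32) for t in range(32)]
spectrum = dft(samples)[:16]  # keep the non-redundant half
bands = group_into_bands(spectrum, width=4)
energies = [sum(abs(x) ** 2 for x in band) for band in bands]
```

Here bin 5 lands in the second band (bins 4 to 7), so that band carries almost all of the energy.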
- the detection unit 123 detects the degree of difference between the corresponding frequency band signals of the first input audio signal and the second input audio signal. For example, the level difference or phase difference between corresponding frequency band signals is detected.
- Methods for detecting the level difference include a method of comparing the maximum amplitude values for the corresponding bands and a method of comparing energy levels.
- Methods for detecting the phase difference include a method for obtaining the phase angle from the real value and imaginary value of the Fourier series, and a method for obtaining it from the correlation value of the corresponding band signals. That is, when the correlation value is C, the phase angle can be calculated as $\pi (1 - C) / 2$.
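The two detection steps can be sketched as follows, under stated assumptions: the level ratio is taken from band energies (one of the options named above), the correlation value C is assumed to be the normalized correlation in the range -1 to 1, and both bands are assumed to have nonzero energy. The function names are hypothetical.

```python
import math


def band_energy(band):
    return sum(abs(x) ** 2 for x in band)


def level_ratio(band_a, band_b):
    """Amplitude ratio of band_b to band_a, derived from band energies."""
    return math.sqrt(band_energy(band_b) / band_energy(band_a))


def phase_from_correlation(band_a, band_b):
    """Phase angle pi * (1 - C) / 2 from the normalized correlation C."""
    cross = sum((x * y.conjugate()).real for x, y in zip(band_a, band_b))
    c = cross / math.sqrt(band_energy(band_a) * band_energy(band_b))
    c = max(-1.0, min(1.0, c))  # guard against rounding overshoot
    return math.pi * (1.0 - c) / 2.0
```

With this mapping, identical bands (C = 1) give 0, uncorrelated bands (C = 0) give π/2, and inverted bands (C = -1) give π.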
- the quantization unit 125 quantizes the degree of difference for each detected frequency band.
- the accuracy of quantization for each band is set in advance by the quantization accuracy setting table 124.
- FIG. 5 is a diagram illustrating an example of the quantization accuracy setting table 124.
- In FIG. 5, the lowest band is given 6 quantization bits, the next band 5 bits, the band after that 4 bits, and so on. Quantization accuracy is given for each band, down to 1 bit for the highest frequency band to be quantized, so that the lower the band, the higher the quantization accuracy.
- These values are merely examples, and other values may of course be used. Also, rather than changing the quantization accuracy in order of frequency band, the quantization accuracy may be changed according to auditory sensitivity characteristics.
- the quantization unit 125 quantizes the signal for each frequency band with the quantization accuracy set in the quantization accuracy setting table 124.
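Table-driven quantization of this kind might look like the sketch below. The bit counts follow the 6, 5, 4, ..., 1 pattern described for FIG. 5, but the exact table contents, the uniform quantizer, and the 0 to π value range are assumptions made for illustration.

```python
import math

# Hypothetical quantization accuracy table: bits per band, lowest band first.
BITS_PER_BAND = [6, 5, 4, 3, 2, 1]


def quantize(value, bits, lo=0.0, hi=math.pi):
    """Uniformly quantize value in [lo, hi] to an index with 2**bits levels."""
    levels = 1 << bits
    step = (hi - lo) / (levels - 1)
    index = round((value - lo) / step)
    return max(0, min(levels - 1, index))


def dequantize(index, bits, lo=0.0, hi=math.pi):
    levels = 1 << bits
    return lo + index * (hi - lo) / (levels - 1)


def quantize_bands(values):
    """Quantize one value per band with that band's accuracy from the table."""
    return [quantize(v, b) for v, b in zip(values, BITS_PER_BAND)]
```

The same value is therefore reproduced more precisely in a low band than in a high band, while the total side information for six bands is only 21 bits.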
- Although the quantization accuracy for each frequency band is preset by the table here, this is of course not strictly required. That is, a method may be used in which the quantization coarseness of each frequency band is set as appropriate according to the input signal, and information indicating the quantization coarseness is also encoded. In that case, in order to reduce the size of the encoded information indicating the coarseness, it is appropriate to express the quantization coarseness in two levels.
- As described above, the auxiliary information generation unit 12a, used in an audio encoder that compresses and encodes an N-channel (N > 1) audio signal, includes the first conversion unit 121 and the second conversion unit 122, which convert each audio signal into a plurality of frequency band signals; the detection unit 123, which detects the degree of difference between corresponding frequency band signals among the N-channel audio signals; and the quantization unit 125, which quantizes the detected degree of difference for each frequency band. Because the quantization accuracy in the quantization unit 125 is set for each frequency band, the audio signal can be encoded with high sound quality at a low bit rate.
- In one conventional approach, the phase difference between channels is quantized with binary quantization accuracy for each of multiple frequency bands. The phase difference can then only be expressed as either no phase difference or a phase difference of 180 degrees, which also means that control according to auditory sensitivity is not possible.
- Alternatively, quantization is performed with the same quantization accuracy (for example, 32 values for phase angle quantization) for all frequency bands.
- In contrast, the quantization accuracy of the level ratio and the phase difference can be changed according to the band, for example 32 values in the low band, 16 values in the band above it, 13 values in the band above that, and 11 values in the band above that, so the audio signal can be encoded with high sound quality at a low bit rate.
- FIG. 7 is a block diagram illustrating another detailed configuration example of the auxiliary information generation unit according to the second embodiment. The parts corresponding to the constituent parts of the auxiliary information generation unit 12a shown in FIG. 4 are denoted by the same reference numerals, and their description is omitted.
- The auxiliary information generation unit 12b includes, in addition to the components of the auxiliary information generation unit 12a, that is, the first conversion unit 121, the second conversion unit 122, the detection unit 123, the quantization accuracy setting table 124, and the quantization unit 125, a compression unit 126.
- The difference from Embodiment 1 is that the compression unit 126 is provided to further losslessly compress the quantized values obtained when the quantization unit 125 quantizes the degree of difference for each frequency band.
- The lossless compression performed by the compression unit 126 is a compression method in which the original values are restored completely on decoding, with no degradation due to compression.
- One example of such lossless compression is a method of compressing each quantized value with a Huffman code.
- A differential coding method may also be used. That is, the quantized value corresponding to the lowest frequency band is left as it is, and for each following band the difference from the quantized value of the adjacent frequency band is calculated and used as the compressed signal.
- This is a lossless compression that takes advantage of the fact that there is no significant difference in quantization values between adjacent frequency bands.
- the difference signal may be further compressed by a Huffman code.
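The differential coding step can be sketched as follows; the function names and the sample values are illustrative, and any Huffman stage applied afterwards is left out.

```python
def delta_encode(values):
    """Keep the lowest band's value; replace the rest by adjacent differences."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Undo delta_encode by accumulating the differences."""
    out = []
    total = 0
    for d in deltas:
        total += d
        out.append(total)
    return out


quantized = [12, 12, 11, 11, 10, 8]  # illustrative per-band quantized values
deltas = delta_encode(quantized)     # [12, 0, -1, 0, -1, -2]
```

Because neighbouring bands tend to have similar quantized values, the differences cluster around zero, which is what makes a subsequent entropy code effective.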
- The number of bits may also be reduced by using a run-length code that indicates how many times the same value occurs consecutively. The run-length code may then be further compressed with a Huffman code.
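A run-length code over the quantized values can be sketched like this; the (value, count) pair representation is an assumption chosen for readability rather than a bit-exact format.

```python
def run_length_encode(values):
    """Collapse consecutive repeats into (value, run length) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def run_length_decode(runs):
    """Expand (value, run length) pairs back into the original sequence."""
    out = []
    for v, n in runs:
        out.extend([v] * n)
    return out
```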
- The number of bits may be further reduced by encoding B adjacent quantized values, each quantized to A values, together as a single B-digit base-A number.
- For example, the detection unit 123 detects the phase difference of the corresponding frequency signals between the input audio signals, the quantization unit 125 quantizes the detected phase difference to five values, and the compression unit 126 can then reduce the amount of information by compressing at least two such quantized values together.
- The quantization unit 125 does not necessarily need to quantize the phase difference at five equally spaced quantization levels. In view of auditory characteristics, it is better to quantize coarsely near a phase difference of 90° and finely near 0°.
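Non-uniform quantization of this kind can be sketched with a table of reconstruction levels. The five levels below are purely hypothetical, chosen only so that the spacing is fine near 0° and coarse around 90°; the source does not give actual values.

```python
# Hypothetical five-value reconstruction levels in degrees (not from the source).
PHASE_LEVELS_DEG = [0.0, 15.0, 35.0, 90.0, 180.0]


def quantize_phase(theta_deg):
    """Return the index of the nearest reconstruction level."""
    return min(
        range(len(PHASE_LEVELS_DEG)),
        key=lambda i: abs(PHASE_LEVELS_DEG[i] - theta_deg),
    )


def dequantize_phase(index):
    return PHASE_LEVELS_DEG[index]
```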
- Alternatively, the detection unit 123 detects the phase difference of the corresponding frequency signals between the input audio signals, the quantization unit 125 quantizes the detected phase difference to three values, and the compression unit 126 can reduce the amount of information by compressing, for example, at least three such quantized values together.
- The quantization unit 125 does not necessarily need to divide the phase difference into three equal intervals. In view of auditory characteristics, it is better to quantize coarsely near a phase difference of 90° and finely near 0°.
- Alternatively, the detection unit 123 detects the phase difference of the corresponding frequency signals between the input audio signals, the quantization unit 125 quantizes the detected phase difference to 11 values, and the compression unit 126 can reduce the amount of information by compressing, for example, at least two such quantized values together.
- The quantization unit 125 does not necessarily need to divide the phase difference into 11 equal intervals.
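The saving from compressing several quantized values together can be checked with a small sketch of base-A packing. The helper names are hypothetical, and fixed-length binary coding of the packed number is assumed.

```python
def pack(values, base):
    """Treat the values as digits of one base-`base` number."""
    number = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError("value out of range for this base")
        number = number * base + v
    return number


def unpack(number, base, count):
    """Recover `count` digits from a packed base-`base` number."""
    digits = []
    for _ in range(count):
        number, d = divmod(number, base)
        digits.append(d)
    return digits[::-1]


def bits_needed(levels):
    """Bits for a fixed-length binary code covering `levels` distinct values."""
    return max(1, (levels - 1).bit_length())
```

Two five-valued quantities cost 3 + 3 = 6 bits separately but 5 bits together (25 combinations); three three-valued quantities cost 6 bits separately but 5 together (27 combinations); two eleven-valued quantities cost 8 bits separately but 7 together (121 combinations).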
- As described above, the compression unit 126 losslessly compresses a plurality of quantized values, so the audio signal can be encoded at a lower bit rate with higher sound quality.
- FIG. 8 is a block diagram showing still another detailed configuration example of the auxiliary information generation unit in the third embodiment. The parts corresponding to the constituent parts of the auxiliary information generation unit 12a shown in FIG. 4 are denoted by the same reference numerals, and their description is omitted.
- The auxiliary information generation unit 12c includes a first conversion unit 121, a second conversion unit 122, a first division unit 127a, a second division unit 127b, a third division unit 127c, a fourth division unit 127d, a first quantization unit 128a, and a second quantization unit 128b.
- the first converter 121 converts the first input audio signal into a frequency domain signal.
- the second converter 122 converts the second input audio signal into a frequency domain signal.
- The first division unit 127a has a frequency division table 1271 related to gain ratio information, and divides the frequency domain signal generated by the first conversion unit 121 into a plurality of frequency bands.
- The second division unit 127b has a frequency division table 1272 related to phase difference information, and divides the frequency domain signal generated by the first conversion unit 121 in a manner different from that of the first division unit 127a.
- The third division unit 127c has a frequency division table 1271 related to gain ratio information, and divides the frequency domain signal generated by the second conversion unit 122 in the same manner as the first division unit 127a.
- The fourth division unit 127d has a frequency division table 1272 related to phase difference information, and divides the frequency domain signal generated by the second conversion unit 122 in the same manner as the second division unit 127b.
- The first quantization unit 128a has a quantization accuracy table 1281 in which the quantization accuracy of the gain ratio information and the quantization accuracy of the phase difference information are set separately. It detects and quantizes the gain ratio for each corresponding frequency band between the frequency band signals divided by the first division unit 127a and the frequency band signals divided by the third division unit 127c.
- The second quantization unit 128b has a quantization accuracy table 1281. It detects and quantizes the phase difference for each corresponding frequency band between the frequency band signals divided by the second division unit 127b and the frequency band signals divided by the fourth division unit 127d.
- the first converter 121 converts the first input audio signal into a frequency domain signal.
- For example, the input audio signal is converted into a frequency spectrum signal by using a Fourier transform; here, it is converted into 1024 complex Fourier series.
- the second converter 122 converts the second input audio signal into a frequency domain signal.
- This method is the same as the method in the first conversion unit 121.
- The first division unit 127a divides the frequency domain signal generated by the first conversion unit 121 into a plurality of frequency bands. At this time, the division method follows the table in FIG. 9.
- FIG. 9 is a diagram showing a detailed configuration of the frequency division table 1271.
- In FIG. 9, the left column indicates the band number, the center column indicates the start frequency of the frequency band with that band number, and the right column indicates the end frequency of the frequency band with that band number.
- the second division unit 127b divides the frequency domain signal generated by the first conversion unit 121 into a plurality of frequency bands. At this time, the division method follows the table in Fig. 10.
- FIG. 10 is a diagram showing a detailed configuration of the frequency division table 1272.
- The meaning of FIG. 10 is the same as that of FIG. 9, but the specific band allocation is different.
- In FIG. 10, the division width of the high frequency band is made coarser than that shown in FIG. 9.
- The second division unit 127b divides the frequency domain signal (1024 complex Fourier series) generated by the first conversion unit 121 into the frequency bands shown in FIG. 10.
- Although the division width on the high band side is made coarser here than that shown in FIG. 9, this is not strictly required; bands may instead be coarsened selectively.
- The third division unit 127c divides the frequency domain signal generated by the second conversion unit 122 into a plurality of frequency bands, and its operation is the same as that of the first division unit 127a.
- The fourth division unit 127d divides the frequency domain signal generated by the second conversion unit 122 into a plurality of frequency bands, and its operation is the same as that of the second division unit 127b.
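The table-driven division can be sketched as below. Both tables are hypothetical stand-ins for the tables of FIG. 9 and FIG. 10: each row is (band number, start, end), with start and end given here as inclusive spectrum indices rather than frequencies, and a 16-point spectrum is used instead of 1024 for brevity.

```python
# Hypothetical table for gain ratio information (finer at high frequencies).
GAIN_TABLE = [(0, 0, 3), (1, 4, 7), (2, 8, 11), (3, 12, 15)]
# Hypothetical table for phase difference information (coarser at the top).
PHASE_TABLE = [(0, 0, 3), (1, 4, 7), (2, 8, 15)]


def split_by_table(spectrum, table):
    """Divide a spectrum into bands using (band, start, end) rows."""
    return [spectrum[start:end + 1] for _band, start, end in table]


spectrum = list(range(16))  # placeholder spectrum values
gain_bands = split_by_table(spectrum, GAIN_TABLE)
phase_bands = split_by_table(spectrum, PHASE_TABLE)
```

The same spectrum yields four bands for gain ratio detection but only three for phase difference detection, so fewer phase values need to be quantized and transmitted.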
- The first quantization unit 128a detects and quantizes the gain ratio for each corresponding frequency band between the frequency band signals divided by the first division unit 127a and the frequency band signals divided by the third division unit 127c.
- the gain ratio can be detected by any method, such as a method of comparing the maximum amplitude values for the corresponding bands or a method of comparing energy levels. Then, the detected gain ratio is quantized by the first quantization unit 128a.
- The second quantization unit 128b detects and quantizes the phase difference for each corresponding frequency band between the frequency band signals divided by the second division unit 127b and the frequency band signals divided by the fourth division unit 127d.
- The phase difference may be detected by any method, such as obtaining the phase angle from representative values of the real and imaginary values of the Fourier series in the frequency band. The detected phase difference is then quantized by the second quantization unit 128b.
- The first division unit 127a and the third division unit 127c divide the frequency signal of the first input audio signal and the frequency signal of the second input audio signal, respectively, in the manner shown in the table of FIG. 9, so the signals are finely divided up to a relatively high frequency range.
- The second division unit 127b and the fourth division unit 127d divide the frequency signal of the first input audio signal and the frequency signal of the second input audio signal, respectively, in the manner shown in the table of FIG. 10, so the signals are coarsely divided in the high frequency range.
- As a result, the gain ratio information is detected and quantized for finely divided frequency bands up to a relatively high frequency, whereas the phase difference information is detected and quantized coarsely on the high frequency side. This reflects the auditory characteristic that phase information cannot be accurately perceived for high-frequency signals, and it makes it possible to reduce the amount of information while minimizing degradation of perceived sound quality.
- The method of dividing the frequency signal does not necessarily have to be set in advance using a table.
- The method of dividing the frequency signal may instead be set as appropriate, with the information indicating the division method also being encoded.
- In that case, the division may be performed as follows: the band signals are grouped sequentially from the low frequency band, with the number of band signals in each group given by a desired width (stride). Finally, the gain ratio information and the phase difference information quantized in this way are formatted according to predetermined rules to form a bit stream.
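The stride-based grouping can be sketched as follows. The function name and the list-of-strides representation are assumptions; in this reading, the stride list is the division information that would be encoded alongside the quantized values.

```python
def group_by_strides(bands, strides):
    """Group band signals from the low-frequency side, one stride per group."""
    if sum(strides) != len(bands):
        raise ValueError("strides must cover every band exactly once")
    groups = []
    position = 0
    for stride in strides:
        groups.append(bands[position:position + stride])
        position += stride
    return groups
```

For example, strides of 1, 2, and 4 over seven band signals keep the lowest band on its own and merge the four highest bands into one group.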
- the amount of information is reduced while limiting perceptual sound quality degradation by quantizing the phase difference information with a frequency division that is coarser than that used for the gain ratio information.
- the amount of phase difference information is reduced by making the frequency division coarser.
- another method for reducing the amount of phase difference information is, for example, to make the quantization accuracy of the phase difference information for each frequency band coarser than the quantization accuracy of the gain ratio information.
- the quantization accuracy of the phase difference information is set to be coarser than that of the gain ratio information. This, too, relies on the characteristic that perceptual sensitivity to phase difference information is lower than to gain ratio information.
- the values are of course only examples, and may be set appropriately according to the sampling frequency and bit rate.
- by making the number of bits used for phase difference information smaller than the number of bits used for gain ratio information, high compression can be realized while minimizing the deterioration of sound quality.
- although the frequency division for the gain ratio information is performed based on FIG. 9 and the frequency division for the phase difference information is performed based on FIG. 10, the frequency division for the gain ratio information may instead be performed based on FIG. 12, for example, and the frequency division for the phase difference information based on FIG. 13.
- comparing FIG. 9 with FIG. 12 shows that the frequency division in FIG. 12 is coarser than in FIG. 9, and comparing FIG. 10 with FIG. 13 shows that the frequency division in FIG. 13 is coarser than in FIG. 10. Therefore, if the tables shown in FIGS. 12 and 13 are selected as the tables that determine the division method, the amount of information after quantization can be reduced. An encoder with multiple bit-rate operation modes therefore only needs to change the frequency division method when operating at a low bit rate. In this case, a quantization accuracy table 1281 matched to the number of bands may also be prepared and used.
- as shown in FIG. 14(a) and FIG. 14(b), the band division method is changed according to the bit rate. As a result, a low bit rate can be achieved while limiting deterioration in sound quality.
- the phase difference information is encoded using a band division that is coarser than that of the gain ratio information. In addition, the phase difference information is quantized with coarser quantization accuracy than the gain ratio information.
- although the division method is set in advance by a table, needless to say this is not required.
- the division can be made coarse by increasing the Stride value, and fine by decreasing the Stride value.
- although a 2-channel input audio signal has been described, the invention may also be applied to multi-channel input audio signals with more than two channels.
- a 5.1-channel multi-channel signal consists of five audio signals from sound sources placed at the front center, front right FR, front left FL, rear right BR, and rear left BL of the viewer, plus a 0.1-channel signal LFE that represents the very low frequency range of the audio signal.
- the mixed signal encoding unit 11 generates the downmix signal DL by mixing the audio signals of the front left FL, the rear left BL, the front center, and the LFE two by two.
- the downmix signal DR may be generated by mixing the audio signals of the front right FR, the rear right BR, the front center, and the LFE two by two.
- for the downmix signal DL, the auxiliary information generation unit 12 need only detect the level ratio and the phase difference among the audio signals of the front left FL, the rear left BL, the front center, and the LFE; for the downmix signal DR, it need only detect the level ratio and the phase difference among the audio signals of the front right FR, the rear right BR, the front center, and the LFE.
- the audio encoder according to the present invention is an audio encoder that encodes a multi-channel signal.
- the phase difference and level difference between the channels of a multi-channel signal can be expressed with a very small number of bits, so the encoder is suited to low-bit-rate use. It is suitable for equipment used for music broadcasting services and music distribution services, and for the corresponding receiving equipment, such as mobile devices (for example, mobile phones) and AV equipment.
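
The Stride-based grouping, the phase-angle detection, and the coarser handling of phase difference information described in the items above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of the patent: the function names, the uniform quantizer, the ±30 dB gain range, and the default stride and bit values are all hypothetical, and the table-driven division of FIGS. 9, 10, 12, and 13 is replaced by a uniform stride.

```python
import cmath
import math


def group_bands(num_bands, stride):
    """Group band indices sequentially from the low-frequency side,
    `stride` band signals per group (the last group may be shorter)."""
    return [list(range(start, min(start + stride, num_bands)))
            for start in range(0, num_bands, stride)]


def gain_ratio_db(first, second, group, eps=1e-12):
    """Level ratio in dB between two channels over one group of complex
    band signals. `eps` (an assumption) guards against division by zero."""
    e_first = sum(abs(first[k]) ** 2 for k in group)
    e_second = sum(abs(second[k]) ** 2 for k in group)
    return 10.0 * math.log10((e_first + eps) / (e_second + eps))


def phase_difference(first, second, group):
    """Phase angle of the summed cross-spectrum of the group, taken here
    as the representative real/imaginary value (one possible choice)."""
    cross = sum(first[k] * second[k].conjugate() for k in group)
    return cmath.phase(cross)  # radians


def quantize_uniform(value, lo, hi, bits):
    """Hypothetical uniform scalar quantizer: index in [0, 2**bits - 1]."""
    levels = 2 ** bits
    clipped = min(max(value, lo), hi)
    return min(int((clipped - lo) / (hi - lo) * levels), levels - 1)


def encode_aux(first, second, gain_stride=1, phase_stride=4,
               gain_bits=5, phase_bits=3):
    """Quantize gain ratios on a fine division and phase differences on a
    coarser division with fewer bits per value (all defaults assumed)."""
    n = len(first)
    gains = [quantize_uniform(gain_ratio_db(first, second, g),
                              -30.0, 30.0, gain_bits)
             for g in group_bands(n, gain_stride)]
    phases = [quantize_uniform(phase_difference(first, second, g),
                               -math.pi, math.pi, phase_bits)
              for g in group_bands(n, phase_stride)]
    return gains, phases
```

With eight band signals and these assumed defaults, the gain ratio information occupies eight values of 5 bits each, while the phase difference information occupies only two values of 3 bits each, which illustrates how a larger Stride and a smaller bit count both shrink the phase side information.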
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2006531857A JP4794448B2 (ja) | 2004-08-27 | 2005-08-18 | オーディオエンコーダ |
US11/659,949 US7848931B2 (en) | 2004-08-27 | 2005-08-18 | Audio encoder |
CN2005800287250A CN101010724B (zh) | 2004-08-27 | 2005-08-18 | 音频编码器 |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2004248990 | 2004-08-27 | ||
JP2004-248990 | 2004-08-27 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2006022190A1 true WO2006022190A1 (ja) | 2006-03-02 |
WO2006022190A9 WO2006022190A9 (ja) | 2006-05-11 |
Family
ID=35967403
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2005/015083 WO2006022190A1 (ja) | 2004-08-27 | 2005-08-18 | オーディオエンコーダ |
Country Status (4)
Country | Link |
---|---|
US (1) | US7848931B2 (ja) |
JP (1) | JP4794448B2 (ja) |
CN (1) | CN101010724B (ja) |
WO (1) | WO2006022190A1 (ja) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2013172441A (ja) * | 2012-02-23 | 2013-09-02 | Pioneer Electronic Corp | 時間差補正方法、音声信号処理装置、再生装置およびプログラム |
US10300281B2 (en) | 2012-03-09 | 2019-05-28 | Mayo Foundation For Medical Education And Research | Modulating afferent signals to treat medical conditions |
JP2021530723A (ja) * | 2018-07-02 | 2021-11-11 | ドルビー ラボラトリーズ ライセンシング コーポレイション | 没入的オーディオ信号を含むビットストリームを生成またはデコードするための方法および装置 |
US12020718B2 (en) | 2019-07-02 | 2024-06-25 | Dolby International Ab | Methods and devices for generating or decoding a bitstream comprising immersive audio signals |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8385556B1 (en) * | 2007-08-17 | 2013-02-26 | Dts, Inc. | Parametric stereo conversion system and method |
US8577485B2 (en) | 2007-12-06 | 2013-11-05 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
WO2009078681A1 (en) * | 2007-12-18 | 2009-06-25 | Lg Electronics Inc. | A method and an apparatus for processing an audio signal |
US8060042B2 (en) * | 2008-05-23 | 2011-11-15 | Lg Electronics Inc. | Method and an apparatus for processing an audio signal |
KR101428487B1 (ko) * | 2008-07-11 | 2014-08-08 | 삼성전자주식회사 | 멀티 채널 부호화 및 복호화 방법 및 장치 |
JP5309944B2 (ja) * | 2008-12-11 | 2013-10-09 | 富士通株式会社 | オーディオ復号装置、方法、及びプログラム |
US8666752B2 (en) * | 2009-03-18 | 2014-03-04 | Samsung Electronics Co., Ltd. | Apparatus and method for encoding and decoding multi-channel signal |
KR20110022252A (ko) * | 2009-08-27 | 2011-03-07 | 삼성전자주식회사 | 스테레오 오디오의 부호화, 복호화 방법 및 장치 |
EP2661746B1 (en) * | 2011-01-05 | 2018-08-01 | Nokia Technologies Oy | Multi-channel encoding and/or decoding |
CN103812824A (zh) * | 2012-11-07 | 2014-05-21 | 中兴通讯股份有限公司 | 音频多编码传输方法及相应装置 |
US9659569B2 (en) | 2013-04-26 | 2017-05-23 | Nokia Technologies Oy | Audio signal encoder |
EP2866227A1 (en) * | 2013-10-22 | 2015-04-29 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method for decoding and encoding a downmix matrix, method for presenting audio content, encoder and decoder for a downmix matrix, audio encoder and audio decoder |
CN106104684A (zh) | 2014-01-13 | 2016-11-09 | 诺基亚技术有限公司 | 多通道音频信号分类器 |
ES2829413T3 (es) * | 2015-05-20 | 2021-05-31 | Ericsson Telefon Ab L M | Codificación de señales de audio de múltiples canales |
CN108694955B (zh) * | 2017-04-12 | 2020-11-17 | 华为技术有限公司 | 多声道信号的编解码方法和编解码器 |
JP7092047B2 (ja) * | 2019-01-17 | 2022-06-28 | 日本電信電話株式会社 | 符号化復号方法、復号方法、これらの装置及びプログラム |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2001306097A (ja) * | 2000-04-26 | 2001-11-02 | Matsushita Electric Ind Co Ltd | 音声符号化方式及び装置、音声復号化方式及び装置、並びに記録媒体 |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP3277692B2 (ja) * | 1994-06-13 | 2002-04-22 | ソニー株式会社 | 情報符号化方法、情報復号化方法及び情報記録媒体 |
JP3341474B2 (ja) * | 1994-07-28 | 2002-11-05 | ソニー株式会社 | 情報符号化方法及び復号化方法、情報符号化装置及び復号化装置、並びに情報記録媒体 |
JP3557674B2 (ja) * | 1994-12-15 | 2004-08-25 | ソニー株式会社 | 高能率符号化方法及び装置 |
US5956674A (en) * | 1995-12-01 | 1999-09-21 | Digital Theater Systems, Inc. | Multi-channel predictive subband audio coder using psychoacoustic adaptive bit allocation in frequency, time and over the multiple channels |
DE19721487A1 (de) * | 1997-05-23 | 1998-11-26 | Thomson Brandt Gmbh | Verfahren und Vorrichtung zur Fehlerverschleierung bei Mehrkanaltonsignalen |
JP3352406B2 (ja) * | 1998-09-17 | 2002-12-03 | 松下電器産業株式会社 | オーディオ信号の符号化及び復号方法及び装置 |
JP2000260855A (ja) * | 1999-03-10 | 2000-09-22 | Mitsubishi Electric Corp | ウェハ処理装置 |
NL1013938C2 (nl) * | 1999-12-23 | 2001-06-26 | Asm Int | Inrichting voor het behandelen van een wafer. |
US7292901B2 (en) | 2002-06-24 | 2007-11-06 | Agere Systems Inc. | Hybrid multi-channel/cue coding/decoding of audio signals |
CN1647156B (zh) * | 2002-04-22 | 2010-05-26 | 皇家飞利浦电子股份有限公司 | 参数编码方法、参数编码器、用于提供音频信号的设备、解码方法、解码器、用于提供解码后的多声道音频信号的设备 |
AU2003216682A1 (en) * | 2002-04-22 | 2003-11-03 | Koninklijke Philips Electronics N.V. | Signal synthesizing |
ES2323294T3 (es) | 2002-04-22 | 2009-07-10 | Koninklijke Philips Electronics N.V. | Dispositivo de decodificacion con una unidad de decorrelacion. |
CN100546233C (zh) * | 2003-04-30 | 2009-09-30 | 诺基亚公司 | 用于支持多声道音频扩展的方法和设备 |
2005
- 2005-08-18 CN CN2005800287250A patent/CN101010724B/zh active Active
- 2005-08-18 US US11/659,949 patent/US7848931B2/en active Active
- 2005-08-18 JP JP2006531857A patent/JP4794448B2/ja active Active
- 2005-08-18 WO PCT/JP2005/015083 patent/WO2006022190A1/ja active Application Filing
Non-Patent Citations (1)
Title |
---|
FALLER C. AND BAUMGARTE F.: "Binaural Cue Coding-Part II: Schemes and Applications", IEEE TRANS. ON SPEECH AND AUDIO PROCESSING, vol. 11, no. 6, 2003, pages 520 - 531, XP002338415 * |
Also Published As
Publication number | Publication date |
---|---|
US7848931B2 (en) | 2010-12-07 |
US20070271095A1 (en) | 2007-11-22 |
WO2006022190A9 (ja) | 2006-05-11 |
JP4794448B2 (ja) | 2011-10-19 |
JPWO2006022190A1 (ja) | 2008-05-08 |
CN101010724A (zh) | 2007-08-01 |
CN101010724B (zh) | 2011-05-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP4794448B2 (ja) | オーディオエンコーダ | |
TWI759240B (zh) | 用以使用量化及熵寫碼來編碼或解碼方向性音訊寫碼參數之設備及方法 | |
RU2388068C2 (ru) | Временное и пространственное генерирование многоканальных аудиосигналов | |
CN1758337B (zh) | 用于低比特率音频编码应用的高效可标度参数立体声编码 | |
JP4934427B2 (ja) | 音声信号復号化装置及び音声信号符号化装置 | |
KR101102401B1 (ko) | 오브젝트 기반 오디오 신호의 부호화 및 복호화 방법과 그 장치 | |
JP4521032B2 (ja) | 空間音声パラメータの効率的符号化のためのエネルギー対応量子化 | |
JP4921365B2 (ja) | 信号処理装置 | |
WO2011013381A1 (ja) | 符号化装置および復号装置 | |
US8019614B2 (en) | Energy shaping apparatus and energy shaping method | |
MX2007009887A (es) | Esquema de codificador/descodificador de multicanal casi transparente o transparente. | |
KR100899141B1 (ko) | 인코딩된 신호의 처리 | |
WO2007089129A1 (en) | Apparatus and method for visualization of multichannel audio signals | |
KR20070001139A (ko) | 오디오 분배 시스템, 오디오 인코더, 오디오 디코더 및이들의 동작 방법들 | |
Breebaart et al. | 19th INTERNATIONAL CONGRESS ON ACOUSTICS MADRID, 2-7 SEPTEMBER 2007 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states | Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
AL | Designated countries for regional patents | Kind code of ref document: A1 Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | |
COP | Corrected version of pamphlet | Free format text: PAGES 1-22, DESCRIPTION, REPLACED BY CORRECT PAGES 1-22 |
WWE | Wipo information: entry into national phase | Ref document number: 2006531857 Country of ref document: JP |
WWE | Wipo information: entry into national phase | Ref document number: 11659949 Country of ref document: US |
WWE | Wipo information: entry into national phase | Ref document number: 200580028725.0 Country of ref document: CN |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | |
WWP | Wipo information: published in national office | Ref document number: 11659949 Country of ref document: US |