EP2610867B1 - Audio reproducing device and audio reproducing method

Audio reproducing device and audio reproducing method

Info

Publication number
EP2610867B1
Authority
EP
European Patent Office
Prior art keywords
information
audio signal
basic
stereo
processing
Prior art date
Legal status
Active
Application number
EP13161700.3A
Other languages
German (de)
French (fr)
Other versions
EP2610867A1 (en)
Inventor
Takashi Yokoyama
Current Assignee
Socionext Inc
Original Assignee
Panasonic Corp
Priority date
Filing date
Publication date
Application filed by Panasonic Corp
Publication of EP2610867A1
Application granted
Publication of EP2610867B1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/24 Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding
    • G10L19/005 Correction of errors induced by the transmission channel, if related to the coding algorithm
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/038 Speech enhancement, e.g. noise reduction or echo cancellation using band spreading techniques

Definitions

  • the present invention relates to an audio reproducing device which decodes an encoded audio signal and reproduces the decoded audio signal.
  • SBR spectral band replication
  • HQ-SBR high-quality SBR
  • LP-SBR low-power SBR
  • HQ-SBR performs complex arithmetic for overall processing of sub-band analysis, high-band generation, and sub-band synthesis.
  • HQ-SBR is suitable for enhancing sound quality, but requires a large amount of computation.
  • LP-SBR performs real number operations instead of the complex arithmetic of HQ-SBR.
  • LP-SBR is designed to reduce aliasing distortion generated by the real number operation.
  • LP-SBR significantly reduces the amount of computation while achieving, at low bit rates, sound quality equivalent to that of HQ-SBR. It is known that LP-SBR requires only approximately half the processing amount required in HQ-SBR (see Non-Patent Literature (NPL) 1).
  • AAC Advanced Audio Coding
  • HE-AAC High-Efficiency AAC
  • AAC+LP-SBR requires only approximately 70 % of the processing amount that is required in AAC+HQ-SBR (see NPL 1).
  • PS Parametric Stereo
  • QMF Quadrature Mirror Filter
  • PS is used in combination with AAC and SBR, and the combined configuration is referred to as HE-AACv2 profile.
  • PS needs to be used in combination with HQ-SBR which uses the complex QMF (see Non-Patent Literatures 2 and 3).
  • AAC may be used in combination with either HQ-SBR or LP-SBR.
  • the HE-AAC profile and HE-AACv2 profile have a concept of levels. The higher the level, the greater the variety of signal types that can be decoded. Examples of the types include the maximum sampling frequency or maximum number of channels of an encoded input audio signal, and the maximum sampling frequency of a decoded output audio signal (see NPL 3).
  • TILMAN LIEBCHEN, "Proposed 2nd Edition of ISO/IEC 14496-3:2005/Amd.2, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions", 79th MPEG Meeting, 15-19 January 2007, Marrakech (Motion Picture Expert Group or ISO/IEC JTC1/SC29/WG11), no. M14193, 10 January 2007 (2007-01-10)
  • according to NPL 3, in the case where the decoding of encoded audio signals complies with the HE-AACv2 profile and PS data is present, processing needs to be performed in combination with HQ-SBR. However, in the case where there is no PS data, processing may be performed in combination with either HQ-SBR or LP-SBR.
  • as a method for preventing the increase in the computation amount, a method can be considered which switches the SBR scheme depending on the state of the stream to be decoded. More specifically, when HQ-SBR needs to be used, that is, when PS data is present, HQ-SBR is used. In other cases, that is, when there is no PS data, LP-SBR is used to limit the increase in the computation amount.
  • for example, when PS data in a stream goes missing, HQ-SBR is switched to LP-SBR.
  • conversely, LP-SBR is switched to HQ-SBR when a stream that includes PS data without any missing data, but for which SBR and stereo processing cannot be executed because the SBR header has not yet been obtained, transitions to the state where the SBR header is obtained.
  • HQ-SBR performs complex arithmetic for QMF filtering
  • LP-SBR performs real number operations for QMF filtering.
  • HQ-SBR and LP-SBR have different formats of delay information, which does not allow HQ-SBR and LP-SBR to share the delay information of the QMF filtering.
  • delay information of the QMF filtering becomes discontinuous at the time of switching of SBR, thereby generating abnormal sounds.
  • FIG. 7 shows an output audio signal for a single channel in the case where SBR is switched at times t0 and t2. It is shown that abnormal sounds are generated during the periods between t0 and t1 and between t2 and t3 because delay information cannot be used due to the switching of SBR (in FIG. 7, (b) shows a normal audio signal). In this manner, attempts to prevent the increase in computation amount by switching SBR result in abnormal sounds at the time of switching of SBR.
  • the present invention has been conceived in order to solve the problem, and has an object to provide an audio reproducing device and an audio reproducing method which prevent occurrence of abnormal sounds without significantly increasing the computation amount even when an encoded input audio signal is a multi-channel signal.
  • two separate processes having different processing amounts are switched based on the analysis information indicating the type of the basic codec.
  • more appropriate processing can be selected.
  • processing is switched based on the analysis information; and thus, processing is not switched while the type of the basic codec is the same. As a result, it is possible to prevent abnormal sounds which may occur at the time of switching of processing.
  • the stream separating unit separates, on the frame basis, the stream into the basic codec, the bandwidth extension information, and stereo extension information that is used for performing stereo processing on the basic codec
  • the audio reproducing device further includes: a stereo extension processing unit which performs, by using the stereo extension information, stereo processing on the decoded basic codec signal having the frequency band extended by the second bandwidth extension processing unit.
  • the basic codec information analyzing unit analyzes the basic codec separated by the stream separating unit, to generate analysis information including at least one of channel information and sampling frequency information, the channel information indicating the number of channels of the basic codec, the sampling frequency information indicating a sampling frequency of the basic codec; the switching unit determines at least one of (i) whether the number of channels indicated by the channel information is greater than a predetermined first threshold and (ii) whether the sampling frequency indicated by the sampling frequency information is greater than a predetermined second threshold, and selects the first bandwidth extension processing unit when at least one of the following is determined: (i) the number of channels is greater than the predetermined first threshold and (ii) the sampling frequency is greater than the predetermined second threshold.
  • a first processing is selected which requires less processing amount but produces lower accuracy.
  • the first processing is also selected which requires less processing amount but produces lower accuracy.
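The selection rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation; the function name, return labels, and threshold values are assumptions chosen only to make the example concrete.

```python
# Illustrative thresholds (assumed, not from the patent): e.g. more
# than 2 channels or above 48 kHz counts as a heavy decoding load.
MAX_CHANNELS = 2
MAX_SAMPLING_HZ = 48000

def select_bandwidth_extension(num_channels: int, sampling_hz: int) -> str:
    """Return which bandwidth extension processing unit to use."""
    if num_channels > MAX_CHANNELS or sampling_hz > MAX_SAMPLING_HZ:
        # High decoding load: first unit (smaller processing amount,
        # lower accuracy, e.g. LP-SBR).
        return "first (LP-SBR)"
    # Otherwise: second unit (larger processing amount, higher
    # accuracy, e.g. HQ-SBR).
    return "second (HQ-SBR)"
```

The point of selecting by stream properties rather than by PS presence is that the choice stays stable across frames of the same stream, so the QMF delay information never becomes discontinuous.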
  • the audio reproducing device further includes a buffer which stores stereo extension information of a first frame, wherein the stereo extension processing unit performs stereo processing on a decoded basic codec signal of a second frame by using the stereo extension information stored in the buffer, the second frame being a frame after the first frame and being a frame in which the stereo extension information is missing.
  • stereo extension information used for stereo processing is stored in a buffer, and the stereo extension information stored in the buffer is used when stereo extension information cannot be obtained.
  • stereo processing can be properly performed on the frame.
  • the second bandwidth extension processing unit generates a high-frequency component signal from the decoded basic codec signal by using the bandwidth extension information
  • the stereo extension processing unit performs, by using the stereo extension information, stereo processing on the decoded basic codec signal and the high-frequency component signal generated by the second bandwidth extension processing unit, to generate a decoded basic codec signal and a high-frequency component signal for a first channel and a decoded basic codec signal and a high-frequency component signal for a second channel
  • the second bandwidth extension processing unit further includes a band synthesis filter for synthesizing the high-frequency component signal and the decoded basic codec signal that have been generated, and synthesizes bands of the second channel by using delay information that is stored in the band synthesis filter of the first channel, as delay information stored in the band synthesis filter of the second channel, when the stereo extension information is missing.
  • the obtained delay information is used as delay information for the other channel.
  • bands of the respective signals of two channels can be properly synthesized.
  • the basic codec is an audio signal encoded according to Advanced Audio Coding (AAC) scheme
  • the bandwidth extension information is Spectral Band Replication (SBR) information generated according to SBR scheme
  • the stereo extension information is Parametric Stereo (PS) information generated according to PS scheme
  • the first bandwidth extension processing unit extends a frequency band of the decoded basic codec signal according to Low Power-SBR (LP-SBR) scheme
  • the second bandwidth extension processing unit extends a frequency band of the decoded basic codec signal according to High Quality-SBR (HQ-SBR) scheme.
  • AAC Advanced Audio Coding
  • SBR Spectral Band Replication
  • PS Parametric Stereo
  • the present invention may be implemented not only as an audio reproducing device, but also as an audio reproducing method which includes processing units of the audio reproducing device as steps.
  • the present invention may be also implemented as a program causing a computer to execute these steps.
  • the present invention may be implemented as a computer-readable recording medium, such as a Compact Disc-Read Only Memory (CD-ROM), which records the program therein, and as information, data, or signals indicating the program.
  • CD-ROM Compact Disc-Read Only Memory
  • Such program, information, data, and signals may be distributed over a communication network such as the Internet.
  • each audio reproducing device above may be implemented in the form of a single system large-scale integration (LSI) circuit.
  • the system LSI is an ultra-multifunctional LSI which is produced by integrating a plurality of constitutional units on a single chip. More specifically, the system LSI is a computer system including, for example, a microprocessor, a ROM, and a Random Access Memory (RAM).
  • LSI system large scale integration
  • An audio reproducing device is characterized by switching between two bandwidth extension processes having different characteristics, based on an analysis result of the basic codec, regardless of the validity of stereo extension information used for performing stereo processing on a monaural audio signal.
  • the two bandwidth extension processes are: processing which requires a larger processing amount but produces higher accuracy, that is, processing which outputs an audio signal with excellent sound quality; and processing which requires a smaller processing amount but produces lower accuracy.
  • FIG. 1 is a block diagram illustrating an example of a structure of an audio reproducing device 100 according to Embodiment 1.
  • the audio reproducing device 100 in FIG. 1 includes: a stream separating unit 101; a basic codec analyzing unit 102; a basic codec decoding unit 103; a bandwidth extension data analyzing unit 104; a stereo extension data analyzing unit 105; a first bandwidth extension processing unit 106; a second bandwidth extension processing unit 107; a stereo extension processing unit 108; and a switching unit 109.
  • the stream separating unit 101 separates an input stream into basic codec, bandwidth extension data, and stereo extension data. When an input stream includes no stereo extension data, the stream separating unit 101 separates the stream into basic codec and bandwidth extension data. The stream separating unit 101 then transmits the separated basic codec to the basic codec analyzing unit 102, transmits the bandwidth extension data to the bandwidth extension data analyzing unit 104, and transmits the stereo extension data to the stereo extension data analyzing unit 105.
  • the stream input to the audio reproducing device 100 is, for example, a stream having HE-AACv2 profile.
  • the basic codec is an encoded audio signal, and is, for example, an audio signal encoded in accordance with AAC scheme.
  • the bandwidth extension data is data used for extending bandwidth of the basic codec, and is, for example, SBR data.
  • the stereo extension data is data used for performing stereo processing on a monaural audio signal, and is, for example, PS data.
  • the basic codec analyzing unit 102 generates basic codec analysis information by analyzing the basic codec transmitted from the stream separating unit 101.
  • the basic codec analysis information includes, for example, channel information representing the number of channels (CH) of the basic codec, and sampling frequency information representing the sampling frequency (FS) of the basic codec.
  • the basic codec analyzing unit 102 transmits the generated basic codec analysis information to the basic codec decoding unit 103.
  • the basic codec analyzing unit 102 also transmits the channel information and the sampling frequency information to the switching unit 109.
  • the basic codec decoding unit 103 decodes the basic codec by using the basic codec analysis information transmitted from the basic codec analyzing unit 102, and generates a decoded basic codec signal. The basic codec decoding unit 103 then transmits the decoded basic codec signal to the switching unit 109.
  • the bandwidth extension data analyzing unit 104 analyzes the bandwidth extension data transmitted from the stream separating unit 101 to generate bandwidth extension information, and transmits the generated bandwidth extension information to the switching unit 109.
  • the bandwidth extension information includes, for example, side information used for prediction for reconstruction of high band of the decoded basic codec signal using the SBR technique.
  • the stereo extension data analyzing unit 105 analyzes the stereo extension data transmitted from the stream separating unit 101 to generate stereo extension information, and transmits the generated stereo extension information to the stereo extension processing unit 108.
  • the stereo extension information is, for example, information used for performing stereo extension processing (also referred to as stereo processing) on a monaural audio signal using the PS technique.
  • the first bandwidth extension processing unit 106 extends the frequency band of the decoded basic codec signal by using the bandwidth extension information transmitted from the switching unit 109 to output an audio signal. More specifically, the first bandwidth extension processing unit 106 predicts and generates high frequency components by using the bandwidth extension information, and synthesizes the bands of the generated high frequency component signal and the decoded basic codec signal to output an audio signal.
  • the first bandwidth extension processing unit 106 has an advantage over the second bandwidth extension processing unit 107 in that it requires a smaller processing amount for processing the same signal. However, the sound quality of the audio signal output by the first bandwidth extension processing unit 106 is lower than that of the audio signal output by the second bandwidth extension processing unit 107.
  • the first bandwidth extension processing unit 106 performs, for example, bandwidth extension based on the LP-SBR scheme.
  • the second bandwidth extension processing unit 107 extends the frequency band of the decoded basic codec signal by using the bandwidth extension information transmitted from the switching unit 109 to output an audio signal. More specifically, the second bandwidth extension processing unit 107 predicts and generates high frequency components by using the bandwidth extension information, and synthesizes the bands of the generated high frequency component signal and the decoded basic codec signal to output an audio signal.
  • the sound quality of the audio signal output by the second bandwidth extension processing unit 107 is higher than that of the audio signal output by the first bandwidth extension processing unit 106.
  • the second bandwidth extension processing unit 107 requires a larger processing amount than the first bandwidth extension processing unit 106.
  • the second bandwidth extension processing unit 107 performs, for example, bandwidth extension based on the HQ-SBR scheme.
  • the decoded basic codec signal is an audio signal mainly including low frequency components.
  • the bandwidth extension performed by the first bandwidth extension processing unit 106 and the second bandwidth extension processing unit 107 is processing in which the removed high frequency components are predicted and generated by using the bandwidth extension information.
  • the first bandwidth extension processing unit 106 and the second bandwidth extension processing unit 107 each include a band synthesis filter.
  • the first and second bandwidth extension processing units 106 and 107 reconstruct an output audio signal that is close to an original sound by synthesizing the bands of the decoded basic codec signal generated by the basic codec decoding unit 103 and the high frequency component signal reconstructed based on the decoded basic codec signal by using the bandwidth extension information.
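The band reconstruction described above can be pictured with a toy sketch. Real SBR patches QMF sub-band samples and applies transmitted envelope data; the list-based version below only illustrates the "predict the high band from the low band plus side information, then synthesize both bands" idea and is not the actual algorithm.

```python
def extend_bandwidth(low_band, envelope_gains):
    # Predict the removed high-frequency components by patching the
    # decoded low band and shaping it with envelope gains carried in
    # the bandwidth extension information (illustrative only).
    high_band = [s * g for s, g in zip(low_band, envelope_gains)]
    # "Band synthesis": output the low band followed by the
    # reconstructed high band.
    return low_band + high_band
```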
  • the stereo extension processing unit 108 uses stereo extension information transmitted from the stereo extension data analyzing unit 105 to perform stereo processing on the monaural audio signal having a frequency band extended by the second bandwidth extension processing unit 107. More specifically, the stereo extension processing unit 108 performs, by using the stereo extension information, stereo processing on the decoded basic codec signal that is a monaural audio signal and the high frequency component signal generated by the second bandwidth extension processing unit 107, to generate a decoded basic codec signal and a high frequency component signal for left (L) channel, and a decoded basic codec signal and a high frequency component signal for right (R) channel.
  • the stereo extension processing unit 108 performs, for example, stereo processing based on the PS scheme.
  • the stereo extension processing unit 108 has to be used in combination with the second bandwidth extension processing unit 107. In other words, the stereo extension processing unit 108 shares the complex QMF with the second bandwidth extension processing unit 107.
  • the second bandwidth extension processing unit 107 synthesizes the bands of the generated L-channel signals and the bands of the generated R-channel signals.
  • the delay information of the L channel is copied to the delay information of the R channel.
  • band synthesis of R channel is performed using the delay information of the L channel copied for a previous frame, as the delay information of R channel.
  • the delay information of the L channel is information held over frames in the band synthesis filter of the band synthesis processing.
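The L-to-R delay-information copy can be sketched as follows. The class and field names are invented for illustration; the real state is the QMF band synthesis filter history held over frames, not a plain list.

```python
class BandSynthesisFilter:
    """Stand-in for the per-channel band synthesis filter; the delay
    line ("delay information") is state held over frames."""
    def __init__(self):
        self.delay = [0.0] * 8  # illustrative filter history length

def prepare_right_channel(left, right, stereo_info_missing):
    # When stereo extension information is missing, band synthesis of
    # the R channel reuses the L-channel delay information, so copy
    # the L-channel filter state into the R channel.
    if stereo_info_missing:
        right.delay = list(left.delay)
```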
  • the switching unit 109 determines whether the outputs of the basic codec decoding unit 103 and the bandwidth extension data analyzing unit 104 are connected to terminal A or terminal B, based on the number of channels CH and the sampling frequency FS transmitted from the basic codec analyzing unit 102. The determination procedure will be specifically described later with reference to FIG. 3 .
  • the switching unit 109 transmits the decoded basic codec signal transmitted from the basic codec decoding unit 103 and the bandwidth extension information transmitted from the bandwidth extension data analyzing unit 104, to the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107 depending on the determination result.
  • the audio reproducing device 100 includes the switching unit 109 which selects one of the two bandwidth extension processes having different characteristics, based on the analysis result of the basic codec.
  • the two bandwidth extension processes are: a first processing which requires a smaller processing amount but produces lower sound quality; and a second processing which requires a larger processing amount but produces higher sound quality.
  • FIG. 2 is a flowchart of the operations of the audio reproducing device 100 according to Embodiment 1. The following operations are performed on a frame basis.
  • the stream separating unit 101 separates an input stream into basic codec, bandwidth extension data, and stereo extension data (S101).
  • the basic codec is transmitted to the basic codec analyzing unit 102.
  • the bandwidth extension data is transmitted to the bandwidth extension data analyzing unit 104.
  • the stereo extension data is transmitted to the stereo extension data analyzing unit 105.
  • each piece of separated data is analyzed (S102). More specifically, the basic codec analyzing unit 102 analyzes the basic codec to generate basic codec analysis information.
  • the bandwidth extension data analyzing unit 104 analyzes the bandwidth extension data to generate bandwidth extension information.
  • the stereo extension data analyzing unit 105 analyzes the stereo extension data to generate stereo extension information. In the case where stereo extension information cannot be generated, such as the case where stereo extension data is missing, the stereo extension data analyzing unit 105 transmits, to the stereo extension processing unit 108, information indicating that there is no stereo extension information.
  • the basic codec decoding unit 103 decodes the basic codec in accordance with the basic codec analysis information (S103).
  • the decoded basic codec signal is transmitted to the switching unit 109.
  • the switching unit 109 determines the connection destination of the transmission path of the decoded basic codec signal based on the basic codec analysis information, and switches between the terminal A and the terminal B based on the determination result (S104). For example, the switching unit 109 refers to the channel information included in the basic codec analysis information, and selects the terminal A when the number of channels CH of the basic codec is greater than a predetermined threshold. Alternatively, the switching unit 109 refers to the sampling frequency information included in the basic codec analysis information, and selects the terminal A when the sampling frequency FS of the basic codec is equal to or greater than a predetermined threshold. In other cases, the switching unit 109 selects the terminal B.
  • the decoded basic codec signal and the bandwidth extension information are transmitted to the first bandwidth extension processing unit 106.
  • the first bandwidth extension processing unit 106 extends the frequency band of the decoded basic codec signal to generate an output audio signal (S106).
  • the first bandwidth extension processing unit 106 executes processing in accordance with the LP-SBR scheme or the like, which requires a smaller processing amount but generates an audio signal with lower sound quality.
  • the decoded basic codec signal and the bandwidth extension information are transmitted to the second bandwidth extension processing unit 107.
  • the second bandwidth extension processing unit 107 extends the frequency band of the decoded basic codec signal to generate an output audio signal (S107).
  • the second bandwidth extension processing unit 107 executes processing in accordance with the HQ-SBR scheme or the like, which requires a larger processing amount but generates an audio signal with higher sound quality.
  • the stereo extension processing unit 108 performs stereo processing on the decoded basic codec signal (monaural audio signal) having the frequency band extended by the second bandwidth extension processing unit 107.
  • the audio signal generated by the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107 is output (S108).
  • processing is selected based on the basic codec analysis information representing the type of the basic codec. Accordingly, for example, in the case where the processing amount increases due to a multi-channel signal or a high sampling frequency, it is possible to prevent the increase by selecting the first bandwidth extension processing unit 106, which requires a smaller processing amount.
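The per-frame flow (S101 to S108) can be summarized in a sketch. All helper functions, dictionary keys, and returned labels below are stand-ins invented for illustration; only the control flow mirrors the description above.

```python
def separate(frame):                      # S101: stream separation
    return frame["codec"], frame["bwe"], frame.get("ps")

def analyze(codec):                       # S102: basic codec analysis
    return {"channels": codec["ch"], "fs": codec["fs"]}

def use_first_unit(info, max_ch=2, max_fs=48000):  # S104: switching
    # Illustrative thresholds; the decision ignores PS presence.
    return info["channels"] > max_ch or info["fs"] > max_fs

def reproduce_frame(frame):
    codec, bwe, ps = separate(frame)      # bwe would feed S106/S107
    info = analyze(codec)
    if use_first_unit(info):
        return "LP-SBR"                   # S106: first unit, no PS
    if ps is not None:
        return "HQ-SBR+PS"                # S107 + stereo processing
    return "HQ-SBR"                       # S107 only
```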
  • FIG. 3 is a flowchart of a specific example of the operations of the switching unit 109 according to Embodiment 1.
  • the transmission path is connected to the terminal A, and the input bandwidth extension information and the decoded basic codec signal are transmitted to the first bandwidth extension processing unit 106 (S202).
  • the transmission path is connected to the terminal B, and the input bandwidth extension information and the decoded basic codec signal are transmitted to the second bandwidth extension processing unit 107 (S203).
  • FIG. 4 is a diagram illustrating an example of an input stream which includes stereo extension data.
  • the second bandwidth extension processing unit 107 extends the band of the decoded basic codec signal transmitted from the switching unit 109, by using the bandwidth extension information.
  • the stereo extension processing unit 108 performs stereo extension processing by using the stereo extension information, and outputs the stereo audio signal.
  • the number of channels CH is 1.
  • the stereo extension data is information used for performing stereo processing on a monaural audio signal.
  • when the number of channels is 1, the decoded basic codec signal is a monaural audio signal.
  • FIG. 5 is a diagram illustrating an example of an input stream which includes no stereo extension data.
  • the first bandwidth extension processing unit 106 extends the band of the decoded basic codec signal transmitted from the switching unit 109, by using the bandwidth extension information, and outputs an audio signal.
  • the audio reproducing device 100 receives a stream in which stereo extension data is missing in a frame at some point within the stream, and stereo extension data reappears in the subsequent frames.
  • FIG. 6 is a diagram illustrating an example of an input stream including a frame in which stereo extension data is missing.
  • stereo extension data is included in frames 201 and 203, but stereo extension data is missing in a frame 202.
  • the switching unit 109 determines that each frame meets the condition shown in FIG. 3 (Yes in S201), and connects the transmission path to the terminal B (S203).
  • the second bandwidth extension processing unit 107 performs bandwidth extension on each frame.
  • FIG. 7 is a diagram illustrating an example of waveforms of output audio signals.
  • (a) shows a conventional waveform of an output audio signal of the case where HQ-SBR is switched to LP-SBR at time t0 and LP-SBR is switched to HQ-SBR at time t2 due to the missing PS data of the frame 202.
  • Conventionally, such switching of the processing causes abnormal sounds because delay information is not available during the periods between times t0 and t1 and between times t2 and t3.
  • the audio reproducing device 100 determines the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107 for performing processing, independently of the existence of the stereo extension data within a stream. More specifically, in the case where the respective frames have the same analysis information of the basic codec, the same processing unit is used for extending the band of the decoded basic codec signal of each frame. Thus, discontinuity of the delay information does not occur, thereby preventing abnormal sounds as shown in FIG. 7(b) .
  • An audio reproducing device includes a buffer for storing stereo extension information. For example, when stereo extension data is missing due to broadcast reception conditions, stereo processing is performed by using the stereo extension information stored in the buffer.
  • FIG. 8 is a block diagram illustrating an example of a structure of an audio reproducing device 300 according to Embodiment 2.
  • the audio reproducing device 300 shown in FIG. 8 differs from the audio reproducing device 100 shown in FIG. 1 in that a stereo extension processing unit 308 is included instead of the stereo extension processing unit 108, and that a buffer 310 is further included.
  • the stereo extension processing unit 308 stores, in the buffer 310, stereo extension information used for the stereo processing. More specifically, the stereo extension processing unit 308 performs stereo processing on the decoded basic codec signal having the frequency band extended by the second bandwidth extension processing unit 107, by using the stereo extension information transmitted from the stereo extension data analyzing unit 105.
  • the stereo extension information used here is stored in the buffer 310. For example, each time stereo extension information is obtained, the stereo extension processing unit 308 updates the stereo extension information stored in the buffer 310.
  • the stereo extension processing unit 308 reads stereo extension information from the buffer 310, and performs stereo processing on the decoded basic codec signal (monaural audio signal) of the frame by using the read stereo extension information.
  • the buffer 310 stores the stereo extension information transmitted from the stereo extension data analyzing unit 105.
  • the buffer 310 may store not only the newest stereo extension information but also a plurality of pieces of stereo extension information.
  • the stereo extension processing unit 308 refers, for example, to the basic codec analysis information, and uses the stereo extension information that was used for the stereo processing of a previous decoded basic codec signal similar to the current decoded basic codec signal.
  • the audio reproducing device 300 includes the buffer 310 for storing stereo extension information. In the case where there is no stereo extension information, the audio reproducing device 300 performs stereo processing on the decoded basic codec signal by using the stereo extension information stored in the buffer 310.
  • the audio reproducing device 300 decodes input streams in accordance with the flowcharts shown in FIG. 2 and FIG. 3 .
  • the stereo extension processing unit 308 according to Embodiment 2 performs processing when the second bandwidth extension processing unit 107 performs bandwidth extension (S107).
  • FIG. 9 is a flowchart of the operations of the stereo extension processing unit 308 according to Embodiment 2.
  • the stereo extension processing unit 308 determines whether or not a stream includes stereo extension data, that is, whether or not stereo extension information is transmitted from the stereo extension data analyzing unit 105 (S301). In the case where the stereo extension information is transmitted (Yes in S301), stereo extension processing is performed by using the stereo extension information (S302). The stereo extension processing unit 308 further stores the stereo extension information used here (S303).
  • In the case where no stereo extension information is transmitted (No in S301) but stereo extension processing has been performed on a previous frame (Yes in S304), stereo extension processing is performed by using the stereo extension information stored when decoding previous frames (S305). In the case where no stereo extension processing has been performed (No in S304), the processing ends here.
  • the stereo extension processing unit 308 stores, in the buffer 310, the stereo extension information used for decoding previous frames.
  • stereo processing is performed on the decoded basic codec signal by using the stereo extension information stored in the buffer 310.
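A minimal sketch of this buffer-based fallback (the FIG. 9 flow), with hypothetical class and method names, and a toy pair of channel gains standing in for real PS parameters:

```python
class StereoExtensionProcessor:
    """Sketch of the stereo extension processing unit 308: use fresh
    stereo extension information when present, otherwise fall back to
    the information buffered from an earlier frame.
    (Names and the gain-pair parameterization are illustrative.)"""

    def __init__(self):
        self.buffer = None  # corresponds to the buffer 310

    def process(self, mono_frame, stereo_info):
        if stereo_info is not None:                    # S301: data present?
            self.buffer = stereo_info                  # S303: store for later
            return self._apply(mono_frame, stereo_info)  # S302
        if self.buffer is not None:                    # S304: processed before?
            return self._apply(mono_frame, self.buffer)  # S305
        return None  # no stereo data ever seen: skip stereo processing

    def _apply(self, mono_frame, info):
        # Placeholder for real PS up-mixing; here a simple gain pan.
        gl, gr = info
        return ([s * gl for s in mono_frame], [s * gr for s in mono_frame])

proc = StereoExtensionProcessor()
proc.process([1.0, 2.0], (0.8, 0.6))          # frame 201: PS data present
left, right = proc.process([1.0, 2.0], None)  # frame 202: PS data missing
assert left == [0.8, 1.6]  # buffered parameters of frame 201 are reused
```

The frame with missing stereo extension data is thus still rendered in stereo instead of collapsing to a single channel.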
  • the switching unit 109 connects the transmission path to the terminal B because all of the frames 201 to 203 have a single channel and a sampling frequency FS that is equal to or lower than 24 kHz.
  • the decoded basic codec signal and bandwidth extension information are transmitted to the second bandwidth extension processing unit 107.
  • the bandwidth extension processing on all of the frames 201 to 203 is performed by the second bandwidth extension processing unit 107, which allows continuity of delay information.
  • FIG. 10 is a diagram illustrating an example of waveforms of output stereo audio signals.
  • stereo extension processing is not performed during the period of the frame in which stereo extension data is missing (period between t4 and t5).
  • the R-channel audio signal is not output during this period, which sounds unnatural to a listener.
  • the stereo extension processing unit 308 performs the following operations.
  • the stereo extension processing unit 308 performs stereo extension processing (S302), and stores the stereo extension information used here (S303).
  • the frame 202 in which stereo extension data is missing is input. Since stereo extension data is missing in the frame 202 (No in S301) and the stereo extension processing is performed at the time of decoding of the frame 201 (Yes in S304), the stereo extension processing unit 308 performs stereo extension processing on the frame 202 by using the stereo extension information of the frame 201.
  • the stereo extension processing unit 308 performs stereo extension processing on the frame 203 by using the stereo extension information extracted from the frame 203 (S302).
  • the audio reproducing device 300 according to Embodiment 2 is capable of keeping continuity of an output sound, and also performing stereo extension even on a frame in which stereo extension data is missing.
  • FIG. 11 is an external view of an example of an audio reproducing apparatus incorporating an audio reproducing device according to the present invention.
  • FIG. 11 illustrates a recording medium 401, an audio reproducing apparatus 402, and earphones 403.
  • the recording medium 401 is a recording medium which is capable of recording compressed audio streams.
  • FIG. 11 shows the recording medium 401 as a medium, such as a secure digital (SD) card, removable from an apparatus; however, the recording medium 401 may also be implemented as an optical disk, a hard disk drive (HDD) incorporated in the apparatus, or the like.
  • the audio reproducing apparatus 402 is an apparatus which reproduces compressed audio streams, and includes at least one of the audio reproducing devices 100 and 300 according to Embodiments 1 and 2.
  • the earphones 403 are loudspeaker apparatuses which output, to the outside, the audio signals output from the audio reproducing apparatus 402.
  • FIG. 11 illustrates earphones which are inserted into the ears of a user; however, the earphones may be headphones which are put on the head of the user, or desktop loudspeakers.
  • With the audio reproducing apparatus 402, it is possible to obtain an output audio signal without causing abnormal sounds even when a stream includes a frame in which stereo extension data is missing.
  • the switching unit 109 makes the determination based on the determination condition that the number of channels is 1 and the sampling frequency is 24 kHz or lower; however, the determination condition is not limited to this. For example, the switching unit 109 may determine to use the second bandwidth extension processing unit 107 (connect to the terminal B) only when the number of channels is two or less. In this case, when a stream having the basic codec with 1 or 2 channels is input, bandwidth extension is performed by the second bandwidth extension processing unit 107, which generates higher sound quality but requires a larger processing amount.
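The configurable determination condition might be sketched as follows; the function name, parameters, and default values are assumptions for illustration, with `max_fs_hz=None` standing for the variant in which only the channel count is checked:

```python
def use_second_unit(channel_count, sampling_hz,
                    max_channels=1, max_fs_hz=24000):
    """Illustrative determination condition for the switching unit.
    With the defaults this matches the embodiment (mono, 24 kHz or
    lower -> terminal B / second unit); with max_channels=2 and
    max_fs_hz=None it matches the variant that checks only the
    number of channels."""
    if channel_count > max_channels:
        return False                       # too many channels -> first unit
    if max_fs_hz is not None and sampling_hz > max_fs_hz:
        return False                       # sampling rate too high -> first unit
    return True                            # terminal B: second unit (HQ-SBR)

assert use_second_unit(1, 24000)                                   # embodiment
assert not use_second_unit(2, 24000)                               # stereo
assert use_second_unit(2, 48000, max_channels=2, max_fs_hz=None)   # variant
```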
  • the present invention may be implemented not only as an audio reproducing device and an audio reproducing method as described above, but also as a program causing a computer to execute an audio reproducing method according to the embodiments.
  • the present invention may also be implemented as a recording medium, such as a computer readable CD-ROM, which stores the program.
  • the present invention may be implemented as information, data, or a signal indicating the program. Such program, information, data, and signal may be distributed over a communication network such as the Internet.
  • part or all of the constituent elements of the audio reproducing device according to the present invention may be structured as a single system LSI.
  • the system LSI is a super multi-functional LSI manufactured by integrating a plurality of structural units onto a single chip. Specifically, it is a computer system including a microprocessor, a ROM, a RAM, and the like.
  • the present invention prevents a significant increase in processing amount, and also prevents occurrence of abnormal sounds.
  • the present invention may be used for, for example, an audio reproducing device.
  • the present invention may be used for an audio reproducing apparatus, such as a portable music player, which has limited processor capability and limited memory resources.


Description

    [Technical Field]
  • The present invention relates to an audio reproducing device which decodes an encoded audio signal and reproduces the decoded audio signal.
  • [Background Art]
  • There is a conventional audio reproducing device which receives a low-band audio signal and bandwidth extension information, and generates an extended high-band audio signal by using a spectral band replication (hereinafter, referred to as SBR) technique. SBR reconstructs the high-band of the received signal by predicting the high-band with reference to side information included in the bandwidth extension information. Here, only a small amount of side information is necessary; and thus, the SBR enhances the sound quality of the encoded audio signal at low bit rates.
  • Two types of SBR are defined, which are high-quality SBR (hereinafter, referred to as HQ-SBR) and low-power SBR (hereinafter, referred to as LP-SBR).
  • HQ-SBR performs complex arithmetic for overall processing of sub-band analysis, high-band generation, and sub-band synthesis. Thus, HQ-SBR is suitable for enhancing sound quality, but requires a large amount of computation.
  • LP-SBR performs real number operations instead of the complex arithmetic of HQ-SBR. LP-SBR is designed to reduce aliasing distortion generated by the real number operation. Thus, LP-SBR is capable of significantly reducing the amount of computation, and achieving, at low bit rates, the sound quality equivalent to that of HQ-SBR. It is known that LP-SBR requires only approximately half the amount of processing that is required in HQ-SBR (See Non-Patent Literature (NPL) 1).
  • SBR is used in combination with Advanced Audio Coding (AAC), and the combined configuration is referred to as High-Efficiency AAC (HE-AAC) profile. In combination with AAC, it is known that AAC+LP-SBR requires only approximately 70 % of the processing amount that is required in AAC+HQ-SBR (see NPL 1).
  • There is also a conventional reproducing device which receives a monaural audio signal and stereo information, and performs stereo processing on the monaural audio signal based on the stereo information to generate a stereo audio signal. The stereo processing is known as Parametric Stereo (hereinafter, referred to as PS), and is used in combination with SBR. PS commonly shares a complex Quadrature Mirror Filter (QMF) bank with SBR for stereo processing (see NPL 2).
  • It is known that PS is used in combination with AAC and SBR, and the combined configuration is referred to as HE-AACv2 profile. PS needs to be used in combination with HQ-SBR which uses the complex QMF (see Non-Patent Literatures 2 and 3). When there is no PS data, AAC may be used in combination with either HQ-SBR or LP-SBR.
  • The HE-AAC profile and HE-AACv2 profile have a concept of levels. The higher the level, the greater the variety of signal types that can be decoded. Examples of such types include the maximum sampling frequency or maximum number of channels of an encoded input audio signal, and the maximum sampling frequency of a decoded output audio signal (see NPL 3).
  • [Citation List] [Non Patent Literature]
  • Further details of this technique are disclosed in TILMAN LIEBCHEN: "Proposed 2nd Edition of ISO/IEC 14496-3:2005/Amd.2, Audio Lossless Coding (ALS), new audio profiles and BSAC extensions", 79. MPEG MEETING; 15-01-2007 - 19-01-2007; MARRAKECH; (MOTION PICTURE EXPERT GROUP OR ISO/IEC JTC1/SC29/WG11), no. M14193, 10 January 2007 (2007-01-10)
  • [Summary of Invention] [Technical Problem]
  • However, in order for the conventional techniques of decoding encoded audio signals to comply with HE-AACv2 profile and higher levels, HQ-SBR needs to be used which requires a large amount of computation. As a result, for example, in the case where an encoded input audio signal is a multi-channel signal, computation amount (processing amount) significantly increases. Furthermore, attempts to solve the problem by using the conventional techniques result in generating abnormal sounds in the decoded audio signal. Details are described below.
  • According to NPL 3, as described above, in the case where the technique of decoding encoded audio signals complies with HE-AACv2 profile and there is PS data, processing needs to be performed in combination with HQ-SBR. However, in the case where there is no PS data, processing may be performed in combination with either HQ-SBR or LP-SBR.
  • For example, in view of NPL 3, for preventing the increase in the computation amount, such a method can be considered which switches SBR depending on the state of the stream to be decoded. More specifically, when HQ-SBR needs to be used, that is, when there is PS data, HQ-SBR is used. In other cases, that is, when there is no PS data, LP-SBR is used for reducing the increase in the computation amount.
  • Here, in the case where a stream includes a plurality of pieces of normal PS data but a piece of PS data is missing at some point in the stream, HQ-SBR is switched to LP-SBR. Alternatively, LP-SBR is switched to HQ-SBR when a stream includes PS data without any missing pieces but SBR and stereo processing cannot be executed because the SBR header has not yet been obtained, and this state changes to one in which the SBR header is obtained.
  • As described earlier, HQ-SBR performs complex arithmetic for QMF filtering, and LP-SBR performs real number operations for QMF filtering. Thus, HQ-SBR and LP-SBR have different formats of delay information, which does not allow HQ-SBR and LP-SBR to share the delay information of the QMF filtering. As a result, delay information of the QMF filtering becomes discontinuous at the time of switching of SBR, thereby generating abnormal sounds.
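The effect of discarding the delay information can be illustrated with a toy filter whose delay line stands in for the QMF filter state. This is a simplified analogy, not the actual SBR filter bank; the names and tap values are invented for illustration:

```python
def filter_block(samples, state, taps=(0.5, 0.3, 0.2)):
    """Toy FIR filter whose delay line persists across blocks; the delay
    line stands in for the QMF delay information held by an SBR filter."""
    out, hist = [], list(state)
    for s in samples:
        hist = [s] + hist[:len(taps) - 1]          # shift the delay line
        out.append(sum(t * h for t, h in zip(taps, hist)))
    return out, hist

block1 = [1.0, 1.0, 1.0]
block2 = [1.0, 1.0, 1.0]

# Continuous processing: state is carried over, output stays at 1.0.
y1, state = filter_block(block1, [0.0, 0.0, 0.0])
y2_cont, _ = filter_block(block2, state)

# "Switched" processing: state is discarded (as when HQ-SBR and LP-SBR
# cannot share delay information), producing a transient dip.
y2_switch, _ = filter_block(block2, [0.0, 0.0, 0.0])

assert y2_cont[0] == 1.0    # continuous: no discontinuity
assert y2_switch[0] == 0.5  # reset state: the audible transient
```

The dip at the block boundary is the toy analogue of the abnormal sounds between times t0 and t1 in FIG. 7(a).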
  • In FIG. 7, (a) shows an output audio signal for a single channel in the case where SBR is switched at times t0 and t2. It is shown that abnormal sounds are generated during the periods between t0 and t1 and between t2 and t3 because delay information cannot be used due to the switching of SBR (in FIG. 7, (b) shows a normal audio signal). In such a manner, attempts to prevent the increase in computation amount by switching SBR result in generating abnormal sounds at the time of switching of SBR.
  • The present invention has been conceived in order to solve the problem, and has an object to provide an audio reproducing device and an audio reproducing method which prevent occurrence of abnormal sounds without significantly increasing the computation amount even when an encoded input audio signal is a multi-channel signal.
  • [Solution to Problem]
  • In order to solve the problem, an audio reproducing device according to claim 1 is proposed.
  • According to the structure, two separate processing operations having different processing amounts are switched based on the analysis information indicating the type of basic codec. As a result, more appropriate processing can be selected. Thus, for example, even when an input encoded audio signal is a multi-channel signal, the computation amount (processing amount) does not significantly increase. In addition, processing is switched based on the analysis information; and thus, processing is not switched while the type of the basic codec is the same. As a result, it is possible to prevent abnormal sounds which may occur at the time of switching of processing.
  • It may also be that the stream separating unit separates, on the frame basis, the stream into the basic codec, the bandwidth extension information, and stereo extension information that is used for performing stereo processing on the basic codec, and that the audio reproducing device further includes: a stereo extension processing unit which performs, by using the stereo extension information, stereo processing on the decoded basic codec signal having the frequency band extended by the second bandwidth extension processing unit.
  • Accordingly, when the basic codec is a monaural audio signal, proper stereo processing can be performed.
  • It may also be that the basic codec information analyzing unit analyzes the basic codec separated by the stream separating unit, to generate analysis information including at least one of channel information and sampling frequency information, the channel information indicating the number of channels of the basic codec, the sampling frequency information indicating a sampling frequency of the basic codec, and that the switching unit determines at least one of (i) whether the number of channels indicated by the channel information is greater than a predetermined first threshold and (ii) whether the sampling frequency indicated by the sampling frequency information is greater than a predetermined second threshold, and selects the first bandwidth extension processing unit when at least one of the following is determined: (i) the number of channels is greater than the predetermined first threshold and (ii) the sampling frequency is greater than the predetermined second threshold.
  • According to the structure, in the case where the basic codec has a large number of channels, that is, where the basic codec is multi-channel, the first processing is selected, which requires a smaller processing amount but produces lower accuracy. As a result, it is possible to prevent the processing amount from significantly increasing compared to a single channel signal. Similarly, when the sampling frequency of the basic codec is high, the first processing is selected, preventing the processing amount from significantly increasing compared to the case where a basic codec with a lower sampling frequency is processed.
  • It may also be that the audio reproducing device further includes a buffer which stores stereo extension information of a first frame, wherein the stereo extension processing unit performs stereo processing on a decoded basic codec signal of a second frame by using the stereo extension information stored in the buffer, the second frame being a frame after the first frame and being a frame in which the stereo extension information is missing.
  • Accordingly, stereo extension information used for stereo processing is stored in a buffer, and the stereo extension information stored in the buffer is used when stereo extension information cannot be obtained. Thus, even when a stream includes a frame in which stereo extension data is missing, stereo processing can be properly performed on the frame.
  • It also may be that the second bandwidth extension processing unit generates a high-frequency component signal from the decoded basic codec signal by using the bandwidth extension information, the stereo extension processing unit performs, by using the stereo extension information, stereo processing on the decoded basic codec signal and the high-frequency component signal generated by the second bandwidth extension processing unit, to generate a decoded basic codec signal and a high-frequency component signal for a first channel and a decoded basic codec signal and a high-frequency component signal for a second channel, and the second bandwidth extension processing unit further includes a band synthesis filter for synthesizing the high-frequency component signal and the decoded basic codec signal that have been generated, and synthesizes bands of the second channel by using delay information that is stored in the band synthesis filter of the first channel, as delay information stored in the band synthesis filter of the second channel, when the stereo extension information is missing.
  • According to the structure, even when delay information only for a single channel is obtained, the obtained delay information is used as delay information for the other channel. As a result, bands of the respective signals of two channels can be properly synthesized.
  • It may also be that the basic codec is an audio signal encoded according to Advanced Audio Coding (AAC) scheme, the bandwidth extension information is Spectral Band Replication (SBR) information generated according to SBR scheme, the stereo extension information is Parametric Stereo (PS) information generated according to PS scheme, the first bandwidth extension processing unit extends a frequency band of the decoded basic codec signal according to Low Power-SBR (LP-SBR) scheme, and the second bandwidth extension processing unit extends a frequency band of the decoded basic codec signal according to High Quality-SBR (HQ-SBR) scheme.
  • The present invention may be implemented not only as an audio reproducing device, but also as an audio reproducing method which includes processing units of the audio reproducing device as steps. The present invention may be also implemented as a program causing a computer to execute these steps. Furthermore, the present invention may be implemented as a computer-readable recording medium, such as a Compact Disc-Read Only Memory (CD-ROM), which records the program therein, and as information, data, or signals indicating the program. Such program, information, data, and signals may be distributed over a communication network such as the Internet.
  • In addition, part or all of the elements included in each audio reproducing device above may be in a form of a single system large scale integration (LSI). The system LSI is an ultra-multifunctional LSI which is produced by integrating a plurality of constitutional units on a single chip. More specifically, the system LSI is a computer system including, for example, a microprocessor, a ROM, and a Random Access Memory (RAM).
  • [Advantageous Effects of Invention]
  • According to the present invention, it is possible to prevent occurrence of abnormal sounds without significantly increasing the computation amount even when an encoded input audio signal is a multi-channel signal.
  • [Brief Description of Drawings]
    • [FIG. 1]
      FIG. 1 is a block diagram illustrating an example of a structure of an audio reproducing device according to Embodiment 1.
    • [FIG. 2]
      FIG. 2 is a flowchart of an example of operations of the audio reproducing device according to Embodiment 1.
    • [FIG. 3]
      FIG. 3 is a flowchart of a specific example of operations of a switching unit according to Embodiment 1.
    • [FIG. 4]
      FIG. 4 is a diagram illustrating an example of an input stream which includes stereo extension data.
    • [FIG. 5]
      FIG. 5 is a diagram illustrating an example of an input stream which does not include stereo extension data.
    • [FIG. 6]
      FIG. 6 is a diagram illustrating an example of an input stream including a frame in which stereo extension data is missing.
    • [FIG. 7]
      FIG. 7 is a diagram illustrating an example of waveforms of output audio signals.
    • [FIG. 8]
      FIG. 8 is a block diagram illustrating an example of a structure of an audio reproducing device according to Embodiment 2.
    • [FIG. 9]
      FIG. 9 is a flowchart of an example of operations of a stereo extension processing unit according to Embodiment 2.
    • [FIG. 10]
      FIG. 10 is a diagram illustrating an example of waveforms of stereo audio signals to be output.
    • [FIG. 11]
      FIG. 11 is an external view of an example of an audio reproducing apparatus incorporating an audio reproducing device according to the present invention.
    [Description of Embodiments]
  • Hereinafter, embodiments of an audio reproducing device according to the present invention will be described with reference to the drawings.
  • (Embodiment 1)
  • An audio reproducing device according to Embodiment 1 is characterized by switching between two bandwidth extension processing operations having different characteristics based on an analysis result of the basic codec, regardless of the validity of stereo extension information used for performing stereo processing on a monaural audio signal. The two bandwidth extension processing operations are: processing which requires a larger processing amount but produces higher accuracy, that is, processing for outputting an audio signal with excellent sound quality; and processing which requires a smaller processing amount but produces lower accuracy.
  • FIG. 1 is a block diagram illustrating an example of a structure of an audio reproducing device 100 according to Embodiment 1. The audio reproducing device 100 in FIG. 1 includes: a stream separating unit 101; a basic codec analyzing unit 102; a basic codec decoding unit 103; a bandwidth extension data analyzing unit 104; a stereo extension data analyzing unit 105; a first bandwidth extension processing unit 106; a second bandwidth extension processing unit 107; a stereo extension processing unit 108; and a switching unit 109.
  • The stream separating unit 101 separates an input stream into basic codec, bandwidth extension data, and stereo extension data. When an input stream includes no stereo extension data, the stream separating unit 101 separates the stream into basic codec and bandwidth extension data. The stream separating unit 101 then transmits the separated basic codec to the basic codec analyzing unit 102, transmits the bandwidth extension data to the bandwidth extension data analyzing unit 104, and transmits the stereo extension data to the stereo extension data analyzing unit 105.
  • Here, the stream input to the audio reproducing device 100 is, for example, a stream having HE-AACv2 profile. The basic codec is an encoded audio signal, and is, for example, an audio signal encoded in accordance with AAC scheme. The bandwidth extension data is data used for extending bandwidth of the basic codec, and is, for example, SBR data. The stereo extension data is data used for performing stereo processing on a monaural audio signal, and is, for example, PS data.
  • The basic codec analyzing unit 102 generates basic codec analysis information by analyzing the basic codec transmitted from the stream separating unit 101. The basic codec analysis information includes, for example, channel information representing the number of channels (CH) of the basic codec, and sampling frequency information representing the sampling frequency (FS) of the basic codec. The basic codec analyzing unit 102 transmits the generated basic codec analysis information to the basic codec decoding unit 103. Of the basic codec analysis information, the basic codec analyzing unit 102 also transmits the channel information and the sampling frequency information to the switching unit 109.
  • The basic codec decoding unit 103 decodes the basic codec by using the basic codec analysis information transmitted from the basic codec analyzing unit 102, and generates a decoded basic codec signal. The basic codec decoding unit 103 then transmits the decoded basic codec signal to the switching unit 109.
  • The bandwidth extension data analyzing unit 104 analyzes the bandwidth extension data transmitted from the stream separating unit 101 to generate bandwidth extension information, and transmits the generated bandwidth extension information to the switching unit 109. The bandwidth extension information includes, for example, side information used for prediction for reconstruction of high band of the decoded basic codec signal using the SBR technique.
  • The stereo extension data analyzing unit 105 analyzes the stereo extension data transmitted from the stream separating unit 101 to generate stereo extension information, and transmits the generated stereo extension information to the stereo extension processing unit 108. The stereo extension information is, for example, information used for performing stereo extension processing (also referred to as stereo processing) on a monaural audio signal using the PS technique.
  • The first bandwidth extension processing unit 106 extends the frequency band of the decoded basic codec signal by using the bandwidth extension information transmitted from the switching unit 109 to output an audio signal. More specifically, the first bandwidth extension processing unit 106 predicts and generates high frequency components by using the bandwidth extension information, and synthesizes the bands of the generated high frequency component signal and the decoded basic codec signal to output an audio signal.
  • Here, the first bandwidth extension processing unit 106 has an advantage over the second bandwidth extension processing unit 107 in that the first bandwidth extension processing unit 106 requires a smaller processing amount for processing the same signal. However, the sound quality of the audio signal output by the first bandwidth extension processing unit 106 is lower than that of the audio signal output by the second bandwidth extension processing unit 107. The first bandwidth extension processing unit 106 performs, for example, bandwidth extension based on the LP-SBR scheme.
  • The second bandwidth extension processing unit 107 extends the frequency band of the decoded basic codec signal by using the bandwidth extension information transmitted from the switching unit 109 to output an audio signal. More specifically, the second bandwidth extension processing unit 107 predicts and generates high frequency components by using the bandwidth extension information, and synthesizes the bands of the generated high frequency component signal and the decoded basic codec signal to output an audio signal.
  • Here, the sound quality of the audio signal output by the second bandwidth extension processing unit 107 is higher than that of the audio signal output by the first bandwidth extension processing unit 106. However, the second bandwidth extension processing unit 107 requires a processing amount larger than that of the first bandwidth extension processing unit 106. The second bandwidth extension processing unit 107 performs, for example, bandwidth extension based on the HQ-SBR scheme.
  • Generally, when encoding an audio signal (that is, when generating basic codec), high frequency components are removed to reduce the encoding amount. Thus, the decoded basic codec signal is an audio signal mainly including low frequency components. The bandwidth extension performed by the first bandwidth extension processing unit 106 and the second bandwidth extension processing unit 107 is processing in which the removed high frequency components are predicted and generated by using the bandwidth extension information.
  • More specifically, the first bandwidth extension processing unit 106 and the second bandwidth extension processing unit 107 each includes a band synthesis filter. The first and second bandwidth extension processing units 106 and 107 reconstruct an output audio signal that is close to an original sound by synthesizing the bands of the decoded basic codec signal generated by the basic codec decoding unit 103 and the high frequency component signal reconstructed based on the decoded basic codec signal by using the bandwidth extension information.
  • The stereo extension processing unit 108 uses stereo extension information transmitted from the stereo extension data analyzing unit 105 to perform stereo processing on the monaural audio signal having a frequency band extended by the second bandwidth extension processing unit 107. More specifically, the stereo extension processing unit 108 performs, by using the stereo extension information, stereo processing on the decoded basic codec signal that is a monaural audio signal and the high frequency component signal generated by the second bandwidth extension processing unit 107, to generate a decoded basic codec signal and a high frequency component signal for left (L) channel, and a decoded basic codec signal and a high frequency component signal for right (R) channel. The stereo extension processing unit 108 performs, for example, stereo processing based on the PS scheme. Here, the stereo extension processing unit 108 has to be used in combination with the second bandwidth extension processing unit 107. In other words, the stereo extension processing unit 108 shares the complex QMF with the second bandwidth extension processing unit 107.
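  • The PS-style stereo processing described above can be pictured as deriving L-channel and R-channel signals from the monaural signal by applying per-frame stereo parameters. The sketch below is a deliberately simplified illustration, not the PS scheme itself: it uses a single panning gain per frame, whereas real PS data carries per-band inter-channel intensity and correlation parameters, and the function name is hypothetical.

```python
from typing import List, Tuple

def apply_stereo_extension(mono: List[float], pan: float) -> Tuple[List[float], List[float]]:
    """Split a monaural frame into L/R channels using one panning
    parameter in [0, 1] (0 = all left, 1 = all right).  Illustrative
    stand-in for parametric stereo processing."""
    left = [s * (1.0 - pan) for s in mono]
    right = [s * pan for s in mono]
    return left, right
```

With `pan = 0.5` the two output channels are identical, which corresponds to the degenerate case where the stereo parameters encode no inter-channel difference.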
  • The second bandwidth extension processing unit 107 synthesizes the bands of the generated L-channel signals and the bands of the generated R-channel signals. In the band synthesis processing of the second bandwidth extension processing unit 107, when an input stream includes a frame in which stereo extension data is missing, the delay information of the L channel is copied to the delay information of the R channel. When stereo extension data is obtained, band synthesis of the R channel is performed using the delay information of the L channel, copied for a previous frame, as the delay information of the R channel. The delay information of the L channel is information held over frames in the band synthesis filter of the band synthesis processing.
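  • The copying of delay information from the L channel to the R channel can be sketched as follows. The class and function names are hypothetical, and the `delay` list is a toy stand-in for the actual filter state of a QMF band synthesis filter, whose real size and contents differ.

```python
class BandSynthesisFilter:
    """Toy stand-in for a QMF band synthesis filter: 'delay' models the
    delay information held over frames."""
    def __init__(self, taps: int = 8):
        self.delay = [0.0] * taps  # illustrative state length

def handle_missing_stereo_data(left: BandSynthesisFilter,
                               right: BandSynthesisFilter) -> None:
    """For a frame whose stereo extension data is missing, copy the
    L-channel delay information into the R-channel filter so that band
    synthesis of the R channel stays continuous in subsequent frames."""
    right.delay = list(left.delay)  # copy, so the channels stay independent
```

The copy (rather than an aliased reference) keeps the two filter states independent once stereo extension data reappears.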
  • The switching unit 109 determines whether the outputs of the basic codec decoding unit 103 and the bandwidth extension data analyzing unit 104 are connected to terminal A or terminal B, based on the number of channels CH and the sampling frequency FS transmitted from the basic codec analyzing unit 102. The determination procedure will be specifically described later with reference to FIG. 3. The switching unit 109 transmits the decoded basic codec signal transmitted from the basic codec decoding unit 103 and the bandwidth extension information transmitted from the bandwidth extension data analyzing unit 104, to the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107, depending on the determination result.
  • As described above, the audio reproducing device 100 according to Embodiment 1 includes the switching unit 109, which selects one of two bandwidth extension processes having different characteristics, based on the analysis result of the basic codec. The two processes are: first processing, which requires a smaller processing amount but produces lower sound quality; and second processing, which requires a larger processing amount but produces higher sound quality.
  • Next, operations of the audio reproducing device 100 according to Embodiment 1 are described.
  • FIG. 2 is a flowchart of the operations of the audio reproducing device 100 according to Embodiment 1. The following operations are performed on a frame basis.
  • First, the stream separating unit 101 separates an input stream into basic codec, bandwidth extension data, and stereo extension data (S101). The basic codec is transmitted to the basic codec analyzing unit 102. The bandwidth extension data is transmitted to the bandwidth extension data analyzing unit 104. The stereo extension data is transmitted to the stereo extension data analyzing unit 105.
  • Next, each piece of the separated data is analyzed (S102). More specifically, the basic codec analyzing unit 102 analyzes the basic codec to generate basic codec analysis information. The bandwidth extension data analyzing unit 104 analyzes the bandwidth extension data to generate bandwidth extension information. The stereo extension data analyzing unit 105 analyzes the stereo extension data to generate stereo extension information. In the case where stereo extension information cannot be generated, such as the case where stereo extension data is missing, the stereo extension data analyzing unit 105 transmits, to the stereo extension processing unit 108, information indicating that there is no stereo extension information.
  • Next, the basic codec decoding unit 103 decodes the basic codec in accordance with the basic codec analysis information (S103). The decoded basic codec signal is transmitted to the switching unit 109.
  • The switching unit 109 determines the connection destination of the transmission path of the decoded basic codec signal based on the basic codec analysis information, and switches between the terminal A and the terminal B based on the determination result (S104). For example, the switching unit 109 refers to the channel information included in the basic codec analysis information, and selects the terminal A when the number of channels CH of the basic codec is greater than a predetermined threshold. Alternatively, the switching unit 109 refers to the sampling frequency information included in the basic codec analysis information, and selects the terminal A when the sampling frequency FS of the basic codec is equal to or greater than a predetermined threshold. In other cases, the switching unit 109 selects the terminal B.
  • When the terminal A is selected ("A" in S105), the decoded basic codec signal and the bandwidth extension information are transmitted to the first bandwidth extension processing unit 106. The first bandwidth extension processing unit 106 extends the frequency band of the decoded basic codec signal to generate an output audio signal (S106). The first bandwidth extension processing unit 106 executes processing in accordance with the LP-SBR scheme or the like, which requires a smaller processing amount but generates an audio signal with lower sound quality.
  • When the terminal B is selected ("B" in S105), the decoded basic codec signal and the bandwidth extension information are transmitted to the second bandwidth extension processing unit 107. The second bandwidth extension processing unit 107 extends the frequency band of the decoded basic codec signal to generate an output audio signal (S107). The second bandwidth extension processing unit 107 executes processing in accordance with the HQ-SBR scheme or the like, which requires a larger processing amount but generates an audio signal with higher sound quality.
  • Here, when there is stereo extension information, the stereo extension processing unit 108 performs stereo processing on the decoded basic codec signal (monaural audio signal) having the frequency band extended by the second bandwidth extension processing unit 107.
  • Lastly, the audio signal generated by the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107 is output (S108).
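  • The per-frame flow S101 to S108 described above can be summarized in code. The sketch below uses stub stages that merely tag which path was taken; all names and the `CodecInfo` structure are illustrative, and the switching condition follows the 1-channel / 24 kHz example later described with reference to FIG. 3.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CodecInfo:       # basic codec analysis information (illustrative)
    channels: int
    fs_hz: int

# Stub stages: each returns a tag so the selected path is visible.
def decode_basic_codec(codec, info): return ("decoded", codec)   # unit 103
def lp_sbr_extend(sig, bwe): return ("LP-SBR", sig)              # unit 106
def hq_sbr_extend(sig, bwe): return ("HQ-SBR", sig)              # unit 107
def stereo_extend(sig, ps): return ("PS", sig)                   # unit 108

def reproduce_frame(codec, bwe_info, ps_info: Optional[dict], info: CodecInfo):
    decoded = decode_basic_codec(codec, info)            # S103
    if info.channels == 1 and info.fs_hz <= 24000:       # S104/S105: terminal B
        out = hq_sbr_extend(decoded, bwe_info)           # S107
        if ps_info is not None:                          # stereo extension present
            out = stereo_extend(out, ps_info)
    else:                                                # terminal A
        out = lp_sbr_extend(decoded, bwe_info)           # S106
    return out                                           # S108
```

Because the branch depends only on the basic codec analysis information, a frame with missing stereo extension data still takes the HQ-SBR path, which is the property exploited later when discussing FIG. 6 and FIG. 7.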
  • In such a manner, it is possible to generate an output audio signal that is close to the original sound by predicting and reconstructing the high frequency components of the decoded basic codec signal. Here, the processing is selected based on the basic codec analysis information representing the type of the basic codec. Accordingly, for example, in the case where the processing amount would increase due to multi-channel audio or a higher sampling frequency, it is possible to prevent the increase by selecting the first bandwidth extension processing unit 106, which requires a smaller processing amount.
  • Next, reference is made to a specific example of the determination processing of the connection destination (S104).
  • FIG. 3 is a flowchart of a specific example of the operations of the switching unit 109 according to Embodiment 1.
  • First, it is determined whether or not the number of channels CH and the sampling frequency FS of an input basic codec meet a predetermined condition (S201). Here, it is determined whether the number of channels CH is 1 and the sampling frequency FS is 24 kHz or lower.
  • In the case where the number of channels CH is 2 or more, or the sampling frequency FS is higher than 24 kHz (No in S201), the transmission path is connected to the terminal A, and the input bandwidth extension information and the decoded basic codec signal are transmitted to the first bandwidth extension processing unit 106 (S202). In the case where the number of channels CH is 1, and the sampling frequency FS is 24 kHz or lower (Yes in S201), the transmission path is connected to the terminal B, and the input bandwidth extension information and the decoded basic codec signal are transmitted to the second bandwidth extension processing unit 107 (S203).
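  • The determination of S201 to S203 can be sketched as a single predicate. The thresholds (1 channel, 24 kHz) follow the example given above; the function name is illustrative, not from the patent.

```python
def select_terminal(num_channels: int, sampling_frequency_hz: int) -> str:
    """Return 'B' (second bandwidth extension processing unit 107,
    HQ-SBR) when the basic codec is monaural and sampled at 24 kHz or
    lower; otherwise 'A' (first unit 106, LP-SBR)."""
    if num_channels == 1 and sampling_frequency_hz <= 24000:
        return "B"
    return "A"
```

For the stream of FIG. 4 (CH = 1, FS = 24 kHz) this selects terminal B; for the 5.1-channel stream of FIG. 5 it selects terminal A.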
  • In the following, reference is made to the operations of the audio reproducing device 100 according to Embodiment 1 with a specific example of a stream.
  • FIG. 4 is a diagram illustrating an example of an input stream which includes stereo extension data.
  • For example, when the audio reproducing device 100 receives a stream as shown in FIG. 4, the basic codec analyzing unit 102 analyzes the basic codec, and transmits the number of channels CH (=1) and the sampling frequency information FS (=24 kHz) to the switching unit 109. Since the condition shown in FIG. 3 is met (Yes in S201), the switching unit 109 connects the transmission path to the terminal B, and transmits the decoded basic codec signal and the bandwidth extension information to the second bandwidth extension processing unit 107 (S203). The second bandwidth extension processing unit 107 extends the band of the decoded basic codec signal transmitted from the switching unit 109, by using the bandwidth extension information. Here, at the same time, the stereo extension processing unit 108 performs stereo extension processing by using the stereo extension information, and outputs the stereo audio signal.
  • As shown in FIG. 4, when the stereo extension data is included, the number of channels CH is 1. The stereo extension data is information used for performing stereo processing on a monaural audio signal. When the number of channels is 1, it represents that the decoded basic codec signal is a monaural audio signal.
  • FIG. 5 is a diagram illustrating an example of an input stream which includes no stereo extension data. When the audio reproducing device 100 receives a stream as shown in FIG. 5, the basic codec analyzing unit 102 analyzes the basic codec, and transmits the number of channels CH (= 5.1) and the sampling frequency information FS (= 24 kHz) to the switching unit 109. Since the condition shown in FIG. 3 is not met (No in S201), the switching unit 109 connects the transmission path to the terminal A, and transmits the decoded basic codec signal and the bandwidth extension information to the first bandwidth extension processing unit 106 (S202). The first bandwidth extension processing unit 106 extends the band of the decoded basic codec signal transmitted from the switching unit 109, by using the bandwidth extension information, and outputs an audio signal.
  • Next, reference is made to the case where the audio reproducing device 100 receives a stream in which stereo extension data is missing in a frame at some point within the stream, and stereo extension data reappears in the subsequent frames.
  • FIG. 6 is a diagram illustrating an example of an input stream including a frame in which stereo extension data is missing. As shown in FIG. 6, stereo extension data is included in frames 201 and 203, but stereo extension data is missing in a frame 202. However, there is no change in the basic codec analysis information generated by the analysis of the basic codec included in the frames 201, 202, and 203. More specifically, each of the number of channels CH of the basic codec of the frames 201, 202, 203 is 1, and the sampling frequency is 24 kHz.
  • Thus, the switching unit 109 determines that each frame meets the condition shown in FIG. 3 (Yes in S201), and connects the transmission path to the terminal B (S203). The second bandwidth extension processing unit 107 performs bandwidth extension on each frame.
  • FIG. 7 is a diagram illustrating an example of waveforms of output audio signals. In FIG. 7, (a) shows a conventional waveform of an output audio signal of the case where HQ-SBR is switched to LP-SBR at time t0 and LP-SBR is switched to HQ-SBR at time t2 due to the missing PS data of the frame 202. Conventionally, such switching of the processing causes abnormal sounds because delay information is not available during the periods between times t0 and t1 and between times t2 and t3.
  • On the other hand, as described above, the audio reproducing device 100 according to Embodiment 1 selects the first bandwidth extension processing unit 106 or the second bandwidth extension processing unit 107 for performing the processing independently of whether stereo extension data exists within a stream. More specifically, in the case where the respective frames have the same basic codec analysis information, the same processing unit is used for extending the band of the decoded basic codec signal of each frame. Thus, discontinuity of the delay information does not occur, thereby preventing abnormal sounds, as shown in FIG. 7(b).
  • As described above, in the audio reproducing device 100 according to Embodiment 1, the second bandwidth extension processing unit 107 performs bandwidth extension on a stream including stereo extension data (that is, the stream having CH = 1); and thus, it is possible to perform stereo extension processing without any problems. Furthermore, the first bandwidth extension processing unit 106 performs bandwidth extension on a stream that is multi-channel and includes no stereo extension data; and thus, it is possible to reduce processing amount (computation amount).
  • As a result, for example, it is possible to properly decode and reproduce an audio signal of a stream having the HE-AAC v2 profile, without increasing the computation amount required when reproducing a multi-channel audio signal. Here, it is possible to reproduce audio signals without abnormal sounds even in the case where no PS data is input and then PS data is input.
  • (Embodiment 2)
  • An audio reproducing device according to Embodiment 2 includes a buffer for storing stereo extension information. For example, when stereo extension data is missing due to the influence of broadcast reception, stereo processing is performed by using the stereo extension information stored in the buffer.
  • FIG. 8 is a block diagram illustrating an example of a structure of an audio reproducing device 300 according to Embodiment 2. The audio reproducing device 300 shown in FIG. 8 differs from the audio reproducing device 100 shown in FIG. 1 in that a stereo extension processing unit 308 is included instead of the stereo extension processing unit 108, and that a buffer 310 is further included. In the following, only the differences from Embodiment 1 are described, and the descriptions of the same points are omitted.
  • In addition to the processing performed by the stereo extension processing unit 108, the stereo extension processing unit 308 stores, in the buffer 310, stereo extension information used for the stereo processing. More specifically, the stereo extension processing unit 308 performs stereo processing on the decoded basic codec signal having the frequency band extended by the second bandwidth extension processing unit 107, by using the stereo extension information transmitted from the stereo extension data analyzing unit 105. The stereo extension information used here is stored in the buffer 310. For example, each time stereo extension information is obtained, the stereo extension processing unit 308 updates the stereo extension information stored in the buffer 310.
  • Furthermore, in the case where there is no stereo extension information such as the case where the stereo extension information in a frame is missing, the stereo extension processing unit 308 reads stereo extension information from the buffer 310, and performs stereo processing on the decoded basic codec signal (monaural audio signal) of the frame by using the read stereo extension information.
  • The buffer 310 stores the stereo extension information transmitted from the stereo extension data analyzing unit 105. The buffer 310 not only stores the newest stereo extension information, but may also store a plurality of pieces of stereo extension information. In the case where the buffer 310 stores a plurality of pieces of stereo extension information, the stereo extension processing unit 308, for example, refers to the basic codec analysis information and uses the stereo extension information used for the stereo processing of a previous decoded basic codec signal similar to the current decoded basic codec signal.
  • As described above, the audio reproducing device 300 according to Embodiment 2 includes the buffer 310 for storing stereo extension information. In the case where there is no stereo extension information, the audio reproducing device 300 performs stereo processing on the decoded basic codec signal by using the stereo extension information stored in the buffer 310.
  • Next, of the operations of the audio reproducing device 300 according to Embodiment 2, the operations of the stereo extension processing unit 308 are described. The audio reproducing device 300 decodes input streams in accordance with the flowcharts shown in FIG. 2 and FIG. 3. The stereo extension processing unit 308 according to Embodiment 2 performs processing when the second bandwidth extension processing unit 107 performs bandwidth extension (S107).
  • FIG. 9 is a flowchart of the operations of the stereo extension processing unit 308 according to Embodiment 2.
  • First, the stereo extension processing unit 308 determines whether or not a stream includes stereo extension data, that is, whether or not stereo extension information is transmitted from the stereo extension data analyzing unit 105 (S301). In the case where the stereo extension information is transmitted (Yes in S301), stereo extension processing is performed by using the stereo extension information (S302). The stereo extension processing unit 308 further stores the stereo extension information used here (S303).
  • In the case where the stereo extension information is not transmitted (No in S301), it is determined whether or not stereo extension processing has been performed for decoding previous frames (S304). In the case where the stereo extension processing has been performed (Yes in S304), stereo extension processing is performed by using the stereo extension information stored when decoding previous frames (S305). In the case where no stereo extension processing has been performed (No in S304), the processing ends here.
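  • The S301 to S305 flow of FIG. 9 can be sketched as a small stateful processor. The class and method names are hypothetical, and `_apply` is a placeholder for the actual stereo extension processing; `_buffer` corresponds to the buffer 310.

```python
class StereoExtensionProcessor:
    """Sketch of the stereo extension processing unit 308: buffer the
    stereo extension information of each frame and fall back to it when
    a later frame's stereo extension data is missing."""
    def __init__(self):
        self._buffer = None                          # corresponds to buffer 310

    def process(self, signal, stereo_info):
        if stereo_info is not None:                  # S301: information transmitted
            self._buffer = stereo_info               # S303: store for later frames
            return self._apply(signal, stereo_info)  # S302
        if self._buffer is not None:                 # S304: processed before?
            return self._apply(signal, self._buffer) # S305: reuse stored info
        return signal                                # no stereo processing at all

    def _apply(self, signal, info):
        return ("stereo", info, signal)              # placeholder for real PS processing
```

Fed the frames of FIG. 6 in order, the processor applies the frame 201 information to the frame 202 (whose data is missing) and then switches to the fresh information of the frame 203.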
  • In such a manner, the stereo extension processing unit 308 according to Embodiment 2 stores, in the buffer 310, the stereo extension information used for decoding previous frames. When stereo extension data is missing in a subsequent frame, stereo processing is performed on the decoded basic codec signal by using the stereo extension information stored in the buffer 310.
  • In the following, reference is made to the operations performed by the audio reproducing device 300 according to Embodiment 2 when a stream as shown in FIG. 6 is input.
  • According to Embodiment 2, in the case where a stream is input which includes a frame in which stereo extension data is missing at some point within the stream, as shown in FIG. 6, the switching unit 109 connects the transmission path to the terminal B because all of the frames 201 to 203 have a single channel and a sampling frequency FS of 24 kHz or lower. The decoded basic codec signal and the bandwidth extension information are transmitted to the second bandwidth extension processing unit 107. In such a manner, the bandwidth extension processing on all of the frames 201 to 203 is performed by the second bandwidth extension processing unit 107, which maintains the continuity of the delay information.
  • FIG. 10 is a diagram illustrating an example of waveforms of output stereo audio signals. Conventionally, stereo extension processing is not performed during the period of the frame in which stereo extension data is missing (the period between t4 and t5). As shown in (a) in FIG. 10, the R-channel audio signal is not output, which gives a listener a feeling of strangeness. In order to overcome such a strange feeling and properly output the R-channel audio signal as shown in (b) in FIG. 10, the stereo extension processing unit 308 performs the following operations.
  • Since the frame 201 includes stereo extension data (Yes in S301), the stereo extension processing unit 308 performs stereo extension processing (S302), and stores the stereo extension information used here (S303).
  • Next, the frame 202 in which stereo extension data is missing is input. Since stereo extension data is missing in the frame 202 (No in S301) and the stereo extension processing is performed at the time of decoding of the frame 201 (Yes in S304), the stereo extension processing unit 308 performs stereo extension processing on the frame 202 by using the stereo extension information of the frame 201.
  • Subsequently, the next frame 203 with stereo extension data is input. Since the frame 203 includes stereo extension data (Yes in S301), the stereo extension processing unit 308 performs stereo extension processing on the frame 203 by using the stereo extension information extracted from the frame 203 (S302).
  • In such a manner, the audio reproducing device 300 according to Embodiment 2 is capable of keeping continuity of an output sound, and also performing stereo extension even on a frame in which stereo extension data is missing.
  • As a result, for example, it is possible to properly decode and reproduce an audio signal of a stream having the HE-AAC v2 profile, without increasing the computation amount required when reproducing a multi-channel audio signal. Here, it is possible to reproduce audio signals without abnormal sounds even in the case where no PS data is input and then PS data is input. Alternatively, in the case where PS data is input in a preceding frame and then no PS data is input in a subsequent frame, a stereo audio signal can be reproduced by using the previous PS data.
  • FIG. 11 is an external view of an example of an audio reproducing apparatus incorporating an audio reproducing device according to the present invention. FIG. 11 illustrates a recording medium 401, an audio reproducing apparatus 402, and earphones 403.
  • The recording medium 401 is a recording medium which is capable of recording compressed audio streams. FIG. 11 shows the recording medium 401 as a medium, such as a secure digital (SD) card, removable from an apparatus; however, the recording medium 401 may also be implemented as an optical disk, a hard disk drive (HDD) incorporated in the apparatus, or the like.
  • The audio reproducing apparatus 402 is an apparatus which reproduces compressed audio streams, and includes at least one of the audio reproducing devices 100 and 300 according to Embodiments 1 and 2.
  • The earphones 403 are loudspeaker apparatuses which output, to the outside, audio signals output from the audio reproducing apparatus 402. FIG. 11 illustrates earphones which are inserted into the ears of a user; however, the earphones may be headphones which are put on the head of the user, or desktop loudspeakers.
  • According to such structure of the audio reproducing apparatus 402, it is possible to obtain an output audio signal without causing abnormal sounds even when a stream includes a frame in which stereo extension data is missing.
  • The audio reproducing device and the audio reproducing method according to the present invention have been described based on the embodiments; however, the present invention is not limited to these embodiments. Those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this invention, which is defined by the appended claims.
  • For example, the switching unit 109 makes the determination based on the determination condition that the number of channels is 1 and the sampling frequency is 24 kHz or lower; however, the determination condition is not limited to this. For example, the switching unit 109 may determine to use the second bandwidth extension processing unit 107 (connect to the terminal B) only when the number of channels is two or less. In this case, when a stream having a basic codec with 1 or 2 channels is input, bandwidth extension is performed by the second bandwidth extension processing unit 107, which generates higher sound quality but requires a larger processing amount.
  • In the case where a stream of 3 or more channels is input, it is possible to perform bandwidth extension by using the first bandwidth extension processing unit 106 which requires less processing amount but generates lower sound quality, to reduce the overall processing amount. In such a manner, it is possible to provide high-quality sound even for multi-channel processing as long as the processing capability and memory resources permit.
  • The present invention may be implemented not only as an audio reproducing device and an audio reproducing method as described above, but also as a program causing a computer to execute an audio reproducing method according to the embodiments. The present invention may also be implemented as a recording medium, such as a computer readable CD-ROM, which stores the program. Furthermore, the present invention may be implemented as information, data, or a signal indicating the program. Such program, information, data, and signal may be distributed over a communication network such as the Internet.
  • Furthermore, part or all of the constituent elements of the audio reproducing device according to the present invention may be structured as a single system LSI. The system LSI is a super multi-functional LSI manufactured by integrating a plurality of structural units onto a single chip. Specifically, it is a computer system including a microprocessor, a ROM, a RAM, and the like.
  • [Industrial Applicability]
  • The present invention prevents a significant increase in processing amount, and also prevents occurrence of abnormal sounds. The present invention may be used for, for example, an audio reproducing device. For example, the present invention may be used for an audio reproducing apparatus, such as a portable music player, which has limited processor capability and limited memory resources.
  • [Reference Signs List]
    • 100, 300 Audio reproducing device
    • 101 Stream separating unit
    • 102 Basic codec analyzing unit
    • 103 Basic codec decoding unit
    • 104 Bandwidth extension data analyzing unit
    • 105 Stereo extension data analyzing unit
    • 106 First bandwidth extension processing unit
    • 107 Second bandwidth extension processing unit
    • 108, 308 Stereo extension processing unit
    • 109 Switching unit
    • 201, 202, 203 Frame
    • 310 Buffer
    • 401 Recording medium
    • 402 Audio reproducing apparatus
    • 403 Earphones

Claims (8)

  1. An audio reproducing device configured to reproduce a stream including a basic encoded audio signal, said audio reproducing device comprising:
    a stream separating unit (101) configured to separate, on a frame basis, the stream into the basic encoded audio signal and bandwidth extension information that is used for extending a band of the basic encoded audio signal;
    a basic codec information analyzing unit (102) configured to analyze the basic encoded audio signal separated by said stream separating unit, to generate analysis information indicating a type of the basic encoded audio signal;
    a basic codec decoding unit (103) configured to decode the basic encoded audio signal in accordance with the analysis information generated by said basic codec information analyzing unit, to generate a decoded basic audio signal;
    a first bandwidth extension processing unit (106) configured to execute first processing which extends, by using the bandwidth extension information, a frequency band of the decoded basic audio signal generated by said basic codec decoding unit;
    a second bandwidth extension processing unit (107) configured to execute second processing which extends, by using the bandwidth extension information, the frequency band of the decoded basic audio signal generated by said basic codec decoding unit, the second processing being executed with an accuracy higher than an accuracy of the first processing; and
    a switching unit configured to switch between said first bandwidth extension processing unit and said second bandwidth extension processing unit based on the analysis information,
    wherein the bandwidth extension information is Spectral Band Replication (SBR) information generated according to the SBR scheme,
    said first bandwidth extension processing unit is configured to execute the first processing by using the Low-Power SBR scheme,
    said second bandwidth extension processing unit is configured to execute the second processing by using the High-Quality SBR scheme, and
    wherein said basic codec information analyzing unit is configured to analyze the basic encoded audio signal separated by said stream separating unit, to generate analysis information including at least one of channel information and sampling frequency information, the channel information indicating the number of channels of the basic encoded audio signal, the sampling frequency information indicating a sampling frequency of the basic encoded audio signal, and
    said switching unit is configured to determine at least one of (i) whether the number of channels indicated by the channel information is greater than a predetermined first threshold and (ii) whether the sampling frequency indicated by the sampling frequency information is greater than a predetermined second threshold, and select said first bandwidth extension processing unit when at least one of the following is determined: (i) the number of channels is greater than the predetermined first threshold and (ii) the sampling frequency is greater than the predetermined second threshold.
  2. The audio reproducing device according to Claim 1,
    wherein said stream separating unit (101) is configured to separate, on the frame basis, the stream into the basic encoded audio signal, the bandwidth extension information, and stereo extension information that is used for performing stereo processing on the basic encoded audio signal,
    said audio reproducing device (100) further comprises:
    a stereo extension processing unit (108) configured to perform, by using the stereo extension information, stereo processing on the decoded basic audio signal having the frequency band extended by said second bandwidth extension processing unit.
  3. The audio reproducing device according to Claim 1 or 2, further comprising
    a buffer which stores stereo extension information of a first frame,
    wherein said stereo extension processing unit is configured to perform stereo processing on a decoded basic audio signal of a second frame by using the stereo extension information stored in said buffer, the second frame being a frame after the first frame and being a frame in which the stereo extension information is missing.
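The concealment of Claim 3 can be sketched as a one-frame buffer: valid stereo extension information is remembered, and a later frame in which that information is missing reuses the buffered copy. All names here are hypothetical; the actual content of the stereo extension information is not modelled.

```python
class StereoExtensionConcealer:
    """Minimal sketch of the Claim 3 buffer, assuming a missing frame is
    signalled by passing None."""

    def __init__(self):
        self._buffer = None  # stereo extension info of the last valid frame

    def process_frame(self, stereo_info):
        """Return usable stereo extension info for the current frame."""
        if stereo_info is not None:
            self._buffer = stereo_info  # store for possible later reuse
            return stereo_info
        # Frame with missing stereo extension info: fall back to the buffer.
        return self._buffer
```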
  4. The audio reproducing device according to Claim 2 or 3,
    wherein said second bandwidth extension processing unit is configured to generate a high-frequency component signal from the decoded basic audio signal by using the bandwidth extension information,
    said stereo extension processing unit is configured to perform, by using the stereo extension information, stereo processing on the decoded basic audio signal and the high-frequency component signal generated by said second bandwidth extension processing unit, to generate a decoded basic audio signal and a high-frequency component signal for a first channel and a decoded basic audio signal and a high-frequency component signal for a second channel, and
    said second bandwidth extension processing unit further includes a band synthesis filter for synthesizing the high-frequency component signal and the decoded basic audio signal that have been generated, and is configured to synthesize bands of the second channel by using delay information that is stored in said band synthesis filter of the first channel, as delay information stored in said band synthesis filter of the second channel, when the stereo extension information is missing.
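The delay-state sharing of Claim 4 can be sketched as follows. The filter below is a placeholder, not the actual QMF synthesis bank; only the state-copying behaviour is illustrated, and all names and the trivial `synthesize` body are assumptions for the example.

```python
class BandSynthesisFilter:
    """Stand-in for a band synthesis filter holding delay information."""

    def __init__(self, taps: int = 8):
        self.delay_line = [0.0] * taps  # stored delay information (filter state)

    def synthesize(self, low_band, high_band):
        # Placeholder: a real decoder would run a QMF synthesis bank here.
        return [lo + hi for lo, hi in zip(low_band, high_band)]

def synthesize_stereo(filt_ch1, filt_ch2, ch1_bands, ch2_bands,
                      stereo_info_missing: bool):
    """Synthesize both channels; when the stereo extension information is
    missing, copy channel 1's delay information into channel 2's filter."""
    if stereo_info_missing:
        filt_ch2.delay_line = list(filt_ch1.delay_line)
    out1 = filt_ch1.synthesize(*ch1_bands)
    out2 = filt_ch2.synthesize(*ch2_bands)
    return out1, out2
```

The rationale suggested by the claim is continuity: reusing the first channel's filter state keeps the second channel's synthesis consistent when its own stereo information is unavailable.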
  5. An audio reproducing apparatus comprising the audio reproducing device according to one of Claims 1 to 4.
  6. An audio reproducing method of reproducing a stream including a basic encoded audio signal, said audio reproducing method comprising:
    separating, on a frame basis, the stream into the basic encoded audio signal and bandwidth extension information that is used for extending a band of the basic encoded audio signal;
    analyzing the basic encoded audio signal separated in said separating, to generate analysis information indicating a type of the basic encoded audio signal;
    decoding the basic encoded audio signal in accordance with the analysis information generated in said analyzing, to generate a decoded basic audio signal;
    switching between first processing and second processing based on the analysis information, the second processing being executed with an accuracy higher than an accuracy of the first processing;
    executing, when the first processing is selected in said switching, the first processing which extends, by using the bandwidth extension information, a frequency band of the decoded basic audio signal generated in said decoding; and
    executing, when the second processing is selected in said switching, the second processing which extends, by using the bandwidth extension information, the frequency band of the decoded basic audio signal generated in said decoding,
    wherein the bandwidth extension information is Spectral Band Replication (SBR) information generated according to the SBR scheme,
    real number operations are performed in the first processing by using the Low-Power SBR scheme, and
    complex arithmetic is performed in the second processing by using the High-Quality SBR scheme, and
    wherein said analyzing analyzes the basic encoded audio signal separated in said separating, to generate analysis information including at least one of channel information and sampling frequency information, the channel information indicating the number of channels of the basic encoded audio signal, the sampling frequency information indicating a sampling frequency of the basic encoded audio signal, and
    said switching determines at least one of (i) whether the number of channels indicated by the channel information is greater than a predetermined first threshold and (ii) whether the sampling frequency indicated by the sampling frequency information is greater than a predetermined second threshold, and selects the first processing when at least one of the following is determined: (i) the number of channels is greater than the predetermined first threshold and (ii) the sampling frequency is greater than the predetermined second threshold.
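The whole method of Claim 6 can be sketched as a per-frame pipeline. Every stage below is a stub with hypothetical names and data shapes; the two `extend_band_*` functions merely mark where real-valued Low-Power SBR and complex-valued High-Quality SBR processing would run, and the thresholds are assumed example values.

```python
def extend_band_real(decoded, sbr_info):
    # Stub for the first processing: Low-Power SBR, real-valued operations.
    return [x * sbr_info["gain"] for x in decoded]

def extend_band_complex(decoded, sbr_info):
    # Stub for the second processing: High-Quality SBR, complex arithmetic.
    return [abs(complex(x, 0.0) * sbr_info["gain"]) for x in decoded]

def reproduce_stream(frames):
    outputs = []
    for frame in frames:
        # 1. Separate, on a frame basis, the basic encoded signal and SBR info.
        basic, sbr_info = frame["basic"], frame["sbr"]
        # 2. Analyze the basic encoded signal (channel count, sampling rate).
        analysis = {"channels": basic["channels"], "fs": basic["fs"]}
        # 3. Decode the basic encoded signal (stubbed as pass-through).
        decoded = basic["samples"]
        # 4. Switch based on the analysis information (assumed thresholds).
        low_power = analysis["channels"] > 2 or analysis["fs"] > 48000
        # 5./6. Extend the frequency band with the selected processing.
        extended = (extend_band_real(decoded, sbr_info) if low_power
                    else extend_band_complex(decoded, sbr_info))
        outputs.append(extended)
    return outputs
```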
  7. An integrated circuit configured to reproduce a stream including a basic encoded audio signal, said integrated circuit comprising:
    an audio reproducing device according to claim 1.
  8. A computer program for reproducing a stream including a basic encoded audio signal, said computer program comprising: instructions which, upon execution on a computer, cause the computer to carry out the method as defined in claim 6.
EP13161700.3A 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method Active EP2610867B1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008298809A JP5629429B2 (en) 2008-11-21 2008-11-21 Audio playback apparatus and audio playback method
EP09827300.6A EP2360684B1 (en) 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method

Related Parent Applications (2)

Application Number Title Priority Date Filing Date
EP09827300.6A Division EP2360684B1 (en) 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method
EP09827300.6 Division 2009-10-13

Publications (2)

Publication Number Publication Date
EP2610867A1 EP2610867A1 (en) 2013-07-03
EP2610867B1 true EP2610867B1 (en) 2015-03-11

Family

ID=42197962

Family Applications (2)

Application Number Title Priority Date Filing Date
EP13161700.3A Active EP2610867B1 (en) 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method
EP09827300.6A Active EP2360684B1 (en) 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method

Family Applications After (1)

Application Number Title Priority Date Filing Date
EP09827300.6A Active EP2360684B1 (en) 2008-11-21 2009-10-13 Audio reproducing device and audio reproducing method

Country Status (4)

Country Link
EP (2) EP2610867B1 (en)
JP (1) JP5629429B2 (en)
BR (1) BRPI0921067B1 (en)
WO (1) WO2010058518A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5692255B2 (en) * 2010-12-03 2015-04-01 ヤマハ株式会社 Content reproduction apparatus and content processing method
CA3157717A1 (en) 2011-07-01 2013-01-10 Dolby Laboratories Licensing Corporation System and method for adaptive audio signal generation, coding and rendering
KR101798126B1 (en) 2013-01-29 2017-11-16 프라운호퍼 게젤샤프트 쭈르 푀르데룽 데어 안겐반텐 포르슝 에. 베. Decoder for generating a frequency enhanced audio signal, method of decoding, encoder for generating an encoded signal and method of encoding using compact selection side information
JPWO2022097244A1 (en) * 2020-11-05 2022-05-12
JPWO2022097241A1 (en) * 2020-11-05 2022-05-12
US20230402051A1 (en) * 2020-11-05 2023-12-14 Nippon Telegraph And Telephone Corporation Sound signal high frequency compensation method, sound signal post processing method, sound signal decode method, apparatus thereof, program, and storage medium

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3957589B2 (en) * 2001-08-23 2007-08-15 松下電器産業株式会社 Audio processing device
JP4281349B2 (en) * 2001-12-25 2009-06-17 パナソニック株式会社 Telephone equipment
KR100917464B1 (en) * 2003-03-07 2009-09-14 삼성전자주식회사 Method and apparatus for encoding/decoding digital data using bandwidth extension technology
JP2005114813A (en) * 2003-10-03 2005-04-28 Matsushita Electric Ind Co Ltd Audio signal reproducing device and reproducing method
JP2007538281A (en) * 2004-05-17 2007-12-27 ノキア コーポレイション Speech coding using different coding models.
JP2006065002A (en) * 2004-08-26 2006-03-09 Kenwood Corp Device and method for content reproduction
JP4567412B2 (en) * 2004-10-25 2010-10-20 アルパイン株式会社 Audio playback device and audio playback method
US8055500B2 (en) * 2005-10-12 2011-11-08 Samsung Electronics Co., Ltd. Method, medium, and apparatus encoding/decoding audio data with extension data

Also Published As

Publication number Publication date
EP2360684B1 (en) 2013-05-29
JP5629429B2 (en) 2014-11-19
BRPI0921067B1 (en) 2020-02-18
JP2010122640A (en) 2010-06-03
EP2360684A1 (en) 2011-08-24
EP2360684A4 (en) 2012-09-12
EP2610867A1 (en) 2013-07-03
BRPI0921067A2 (en) 2015-12-15
WO2010058518A1 (en) 2010-05-27

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

AC Divisional application: reference to earlier application

Ref document number: 2360684

Country of ref document: EP

Kind code of ref document: P

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

17P Request for examination filed

Effective date: 20131021

RBV Designated contracting states (corrected)

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

17Q First examination report despatched

Effective date: 20140304

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

GRAJ Information related to disapproval of communication of intention to grant by the applicant or resumption of examination proceedings by the epo deleted

Free format text: ORIGINAL CODE: EPIDOSDIGR1

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

INTG Intention to grant announced

Effective date: 20141105

RIC1 Information provided on ipc code assigned before grant

Ipc: G10L 19/005 20130101ALN20141024BHEP

Ipc: G10L 19/008 20130101ALN20141024BHEP

Ipc: G10L 21/038 20130101ALI20141024BHEP

Ipc: G10L 19/24 20130101AFI20141024BHEP

INTG Intention to grant announced

Effective date: 20141120

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

AC Divisional application: reference to earlier application

Ref document number: 2360684

Country of ref document: EP

Kind code of ref document: P

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO SE SI SK SM TR

REG Reference to a national code

Ref country code: GB

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: REF

Ref document number: 715707

Country of ref document: AT

Kind code of ref document: T

Effective date: 20150415

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602009029994

Country of ref document: DE

Effective date: 20150423

RAP2 Party data changed (patent owner data changed or rights of a patent transferred)

Owner name: SOCIONEXT INC.

REG Reference to a national code

Ref country code: NL

Ref legal event code: VDEP

Effective date: 20150311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: HR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: NO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150611

Ref country code: ES

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: FI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: LT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 715707

Country of ref document: AT

Kind code of ref document: T

Effective date: 20150311

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: GR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150612

Ref country code: LV

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: DE

Ref legal event code: R082

Ref document number: 602009029994

Country of ref document: DE

Representative=s name: NOVAGRAAF, FR

Ref country code: DE

Ref legal event code: R081

Ref document number: 602009029994

Country of ref document: DE

Owner name: SOCIONEXT INC., YOKOHAMA-SHI, JP

Free format text: FORMER OWNER: PANASONIC CORP., KADOMA-SHI, OSAKA, JP

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: NL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: PT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150713

Ref country code: CZ

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: EE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: RO

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: SK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IS

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150711

Ref country code: AT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: PL

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: DE

Ref legal event code: R097

Ref document number: 602009029994

Country of ref document: DE

PLBE No opposition filed within time limit

Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: DK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

26N No opposition filed

Effective date: 20151214

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: SI

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: GB

Ref legal event code: 732E

Free format text: REGISTERED BETWEEN 20160218 AND 20160224

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20151013

REG Reference to a national code

Ref country code: CH

Ref legal event code: PL

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MC

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: IE

Ref legal event code: MM4A

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: LI

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20151031

Ref country code: CH

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20151031

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: BE

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 8

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: IE

Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES

Effective date: 20151013

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: HU

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; INVALID AB INITIO

Effective date: 20091013

Ref country code: SM

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: BG

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: CY

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: MT

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 9

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]

Ref country code: TR

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

Ref country code: MK

Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT

Effective date: 20150311

REG Reference to a national code

Ref country code: FR

Ref legal event code: PLFP

Year of fee payment: 10

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: GB

Payment date: 20231020

Year of fee payment: 15

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]

Ref country code: IT

Payment date: 20231026

Year of fee payment: 15

Ref country code: FR

Payment date: 20231025

Year of fee payment: 15

Ref country code: DE

Payment date: 20231020

Year of fee payment: 15