US20190265944A1 - Loudness control for user interactivity in audio coding systems - Google Patents
- Publication number
- US20190265944A1 (application number US16/413,507)
- Authority
- US
- United States
- Prior art keywords
- loudness
- group
- metadata
- gain
- audio
- Prior art date
- Legal status
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/162—Interface to dedicated audio devices, e.g. audio drivers, interface to CODECs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L19/00—Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
- G10L19/008—Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/11—Application of ambisonics in stereophonic audio systems
Definitions
- the invention refers to an audio processor and to an audio encoder.
- the invention also refers to corresponding methods.
- Modern audio coding systems do not only provide means to efficiently transmit audio content in a loudspeaker channel-based representation that is simply played back at the decoder side. They additionally include more advanced features to allow users to interact with the content and, thus, to influence how the audio is reproduced and rendered at the decoder. This allows for new types of user experiences compared to legacy audio coding systems.
- “MPEG-H Audio - The New Standard for Universal Spatial/3D Audio Coding”, 137th AES Convention, 2014, Los Angeles. It allows a transmission of immersive audio content in three different formats: channel-based, object-based, and scene-based using higher order ambisonics (HOA). It has been designed to offer new capabilities such as user interaction for personalization and adaptation of the audio for different use scenarios.
- HOA: higher order ambisonics
- a method for loudness compensation in object-based audio coding systems including user interaction has been presented in EP 2 879 131 A1.
- a decoder receives an audio input signal comprising audio object signals and generates an audio output signal.
- a signal processor determines a loudness compensation value for the audio output signal based on loudness information associated with the audio input signal and based on rendering information. The rendering information indicates whether one or more of the audio object signals shall be amplified or attenuated and can be adjusted by a user's wish.
- an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating which group is to be used or is not to be used for determining the loudness compensation gain, and wherein the group includes one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to at least one preset, wherein the preset refers to a set of at least one group including one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group includes one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal with at least one group loudness missing in the metadata of a group included in the audio signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to a playback configuration for a reproduction of the signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- an audio encoder for generating an audio signal including metadata may have: a loudness determiner for determining a loudness value for at least one group having one or more audio elements; and a metadata writer for introducing the determined loudness value as a group loudness into the metadata.
- a method for processing an audio signal may have the steps of: modifying the audio signal in response to a user input; determining a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input, wherein the loudness compensation gain is determined based on metadata of the audio signal indicating whether a group included in the audio signal is to be used or is not to be used for determining the loudness compensation gain, wherein the group includes one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal referring to a preset, wherein the preset refers to a set of at least one group including one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group includes one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of
- a method for generating an audio signal including metadata may have the steps of: determining a loudness value for a group having one or more audio elements; and introducing the determined loudness value for the group as a group loudness into the metadata.
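The encoder-side steps (determine a loudness value per group, then write it into the metadata as a group loudness) can be sketched as follows. This is a minimal illustration, not the patent's implementation: the plain mean-square level in dB stands in for a real loudness measurement, and the function names and the `group_loudness_db` metadata key are hypothetical.

```python
import math


def determine_group_loudness_db(samples):
    """Loudness determiner (stand-in): mean-square level of the group in dB.

    Assumption: a real encoder would use a proper loudness measure here.
    """
    if not samples:
        raise ValueError("group has no samples")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")  # digital silence
    return 10 * math.log10(mean_square)


def write_group_loudness(metadata, group_id, loudness_db):
    """Metadata writer: store the value as the group loudness of `group_id`."""
    metadata.setdefault("group_loudness_db", {})[group_id] = loudness_db
    return metadata


def encode_metadata(groups):
    """groups: dict mapping a group id to the samples of its audio elements."""
    metadata = {}
    for group_id, samples in groups.items():
        loudness = determine_group_loudness_db(samples)
        write_group_loudness(metadata, group_id, loudness)
    return metadata
```

A decoder-side loudness controller could then read the per-group values from the same metadata structure.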
- a non-transitory digital storage medium may have a computer program stored thereon to perform any of the inventive methods, when said computer program is run by a computer.
- an audio processor for processing an audio signal comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating which group is to be used or is not to be used for determining the loudness compensation gain, and wherein the group comprises one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- the audio processor (or decoder, or apparatus for processing an audio signal) receives an audio signal and, in one embodiment, generates an output signal which comprises the audio objects, audio elements, etc. of the audio signal to be reproduced, for example, by loudspeakers or earphones, or to be stored on a medium, and so on.
- the audio processor reacts to a user input via an audio signal modifier that is configured to modify the audio signal in response to a user input.
- the user input refers in one embodiment to an amplification or an attenuation of a group and/or to switching off a group or to switching on a group.
- the groups comprise one or more audio elements, e.g. audio objects, channels or HOA components.
- the user input also refers, depending on the embodiment, to data concerning the playback configuration used for the reproduction of the signal.
- a further user input refers to a selection of a preset.
- a preset refers to a set of at least one group and specifies—depending on the embodiment—specifically measured group loudness values and/or gain values for the respective groups.
- the user input is used by the audio signal modifier for modifying appropriately the audio signal.
- the metadata comprises data belonging to a plurality of presets.
- the preset refers in one embodiment to a set of groups and defines in a different embodiment the groups that do not belong to the preset.
- the audio processor also comprises a loudness controller that is configured to determine a loudness compensation gain.
- the loudness compensation gain, here called C, allows the effect of the user input to be counterbalanced in order to provide a signal with an overall loudness as may be useful or as set by the user.
- the loudness compensation gain is determined based on a reference loudness or a reference gain on the one hand and on a modified loudness or a modified gain on the other hand.
- the modified loudness or the modified gain depends on the user input.
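As a rough illustration of how a reference loudness and a modified loudness can yield a compensation gain C, the sketch below power-sums per-group loudness values. The summation rule, the use of dB throughout, and all names are assumptions made for this example; the text above does not state the formula.

```python
import math


def overall_loudness_db(group_loudness_db, gains_db):
    """Power-sum the group loudnesses (dB) after applying per-group gains (dB)."""
    power = sum(
        10 ** ((group_loudness_db[name] + gains_db[name]) / 10)
        for name in group_loudness_db
    )
    return 10 * math.log10(power)


def compensation_gain_db(group_loudness_db, reference_gains_db, modified_gains_db):
    """C = reference loudness minus modified loudness (assumed definition)."""
    reference = overall_loudness_db(group_loudness_db, reference_gains_db)
    modified = overall_loudness_db(group_loudness_db, modified_gains_db)
    return reference - modified
```

With this definition, raising every group by 6 dB gives C of about -6 dB, so applying C restores the reference loudness.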
- the loudness controller is additionally configured to determine the loudness compensation gain based on metadata of the audio signal.
- the metadata that is associated with the audio signal carries information about the audio signal and the individual groups and is in one embodiment comprised by the audio signal itself.
- the information about the corresponding groups is either considered or neglected for determining the loudness compensation gain.
- whether a group or groups is/are considered or neglected depends additionally on the user input.
- considering or neglecting groups also includes considering or neglecting them partially, in the sense that the groups and their respective values are used for only a part of the determination of the loudness compensation gain, e.g. only for the calculation of the reference loudness or of the modified loudness.
- the loudness compensation gain is used by a loudness manipulator comprised by the audio processor.
- the loudness manipulator manipulates a loudness of a signal using the loudness compensation gain.
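A loudness manipulator of this kind can be sketched as a simple scaling step. The example assumes that C is given in dB and that the signal is a list of linear sample values; neither is specified in the text, and `apply_compensation_gain` is a hypothetical name.

```python
def apply_compensation_gain(samples, gain_db):
    """Scale linear samples by a gain given in dB (amplitude factor 10^(dB/20))."""
    factor = 10 ** (gain_db / 20)
    return [s * factor for s in samples]
```

For instance, a gain of -20 dB scales every sample by a factor of 0.1.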
- the applied loudness compensation gain is not only affected by the user input but is also the result of the data of the metadata associated with or even belonging to the audio signal.
- the signal manipulated by the loudness manipulator is according to an embodiment an output signal provided by the audio processor and based on the audio signal.
- the loudness manipulator in this embodiment provides the output signals and manipulates the loudness of the output signal using the loudness compensation gain.
- the loudness manipulator manipulates a loudness of a signal provided to the loudness manipulator and advantageously already modified according to the user input.
- a part of the audio processor provides or generates a signal that is fed to the loudness manipulator and is accordingly processed, i.e. modified with regard to its loudness by the loudness manipulator.
- the signal whose loudness is manipulated by the loudness manipulator is the audio signal.
- the loudness manipulator modifies the metadata of the audio signal by the modification.
- the audio processor provides a modified audio signal. The modified audio signal is modified according to the user input and according to the modification of the loudness. This modified audio signal is afterwards also a bitstream.
- the loudness controller is configured to determine the loudness compensation gain based on at least one flag comprised by the data of the metadata, wherein the flag is indicating whether or how a group is to be considered for determining the loudness compensation gain.
- the metadata comprises flags having, for example, either a “true” or “false” value indicating whether an associated group has to be considered for calculating the loudness compensation gain or not, respectively.
- the consideration of a group refers in one embodiment also to the question of which step of the calculation the group is to be used for. This refers e.g. to the calculation of the reference loudness and the modified loudness.
- the reference loudness and the modified loudness are the calculated overall loudnesses before and after the consideration of the user input, respectively.
- the flag indicates in a different embodiment that the corresponding group is present just during a short interval and, thus, can be neglected for determining the loudness compensation gain.
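The flag-based selection can be sketched as a filter in front of the loudness calculation. The field names (`use_for_compensation`, `loudness_db`, `ref_gain_db`, `mod_gain_db`), the power-sum rule and the fallback of 0 dB when no group is flagged are assumptions for illustration only.

```python
import math


def compensation_gain_with_flags(groups):
    """groups: list of dicts, one per group, each carrying a boolean flag.

    Only groups whose flag is True enter the reference and modified loudness.
    """
    used = [g for g in groups if g["use_for_compensation"]]
    if not used:
        return 0.0  # assumption: nothing to compensate if no group is flagged
    ref_power = sum(10 ** ((g["loudness_db"] + g["ref_gain_db"]) / 10) for g in used)
    mod_power = sum(10 ** ((g["loudness_db"] + g["mod_gain_db"]) / 10) for g in used)
    return 10 * math.log10(ref_power / mod_power)
```

A group flagged as false (for example, a short sound effect) then has no influence on C, however strongly the user changes its gain.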
- the loudness controller is configured to use, for determining the loudness compensation gain, only groups that belong to an anchor comprised by the metadata of the audio signal.
- the anchor refers in one embodiment, for example, to audio elements belonging to voices, dialogs or special sound effects.
- the loudness controller is configured to use only the groups belonging to the anchor for determining the loudness compensation gain when the modified gain of at least one group belonging to the anchor is greater than the corresponding reference gain.
- just the groups of the anchor are used for the calculation of the loudness compensation gain when the gain value of at least one group of these “anchor groups” is increased due to the user input, i.e. when the user amplified at least one of these groups.
- the loudness controller is configured to use groups belonging to the anchor and groups missing from the anchor for determining the loudness compensation gain when the modified gain of at least one group belonging to the anchor is lower than the corresponding reference gain.
- groups belonging to the anchor and groups missing from the anchor are used for the calculation, when the gain value of at least one anchor group is lowered due to the user input.
- the two foregoing embodiments are combined.
- the change of the gain of at least one group belonging to the anchor determines whether only anchor groups or anchor groups and non-anchor groups are used for determining the loudness compensation gain.
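The combined anchor rule can be sketched as a selection function. Two points are assumptions not covered by the text: amplification of any anchor group takes precedence when anchor groups are changed in both directions, and all groups are used when no anchor gain changes. The field names are hypothetical.

```python
def groups_for_compensation(groups):
    """Select the groups that enter the loudness compensation calculation.

    groups: list of dicts with keys "name", "is_anchor",
    "ref_gain_db" and "mod_gain_db".
    """
    anchor = [g for g in groups if g["is_anchor"]]
    # User amplified at least one anchor group: use anchor groups only.
    if any(g["mod_gain_db"] > g["ref_gain_db"] for g in anchor):
        return anchor
    # Anchor gain lowered (or, by assumption, unchanged): use all groups.
    return list(groups)
```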
- an audio processor for processing an audio signal comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to at least one preset, wherein the preset refers to a set of at least one group comprising one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- the loudness controller of the audio processor refers to data of the metadata associated with or belonging to the audio signal.
- the data refers to a preset, wherein the preset refers to a set of at least one group comprising one or more audio elements.
- the metadata comprises data for the groups depending on different presets or at least on a default preset. Therefore, the loudness controller uses the data which is associated with a preset chosen by the user or which is a default preset.
- the audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before.
- the loudness controller is configured to determine the loudness compensation gain based on group loudnesses and/or gain values of the at least one group of the set referred to by the preset.
- the preset refers to a specific set of groups of audio elements comprised by the audio signal.
- the metadata contains specific data—i.e. group loudnesses and/or gain values—to be used for the determination of the loudness compensation gain when the corresponding preset is chosen or set as a default preset.
- the loudness controller is configured to determine the reference loudness for the set referred to by the preset using the respective group loudnesses and the respective gain values.
- the loudness controller is also configured to determine the modified loudness for the set referred to by the preset using the respective group loudnesses and the respective modified gain values.
- the modified gain values are modified by the user input.
- the reference loudness and the modified loudness are determined based on the values associated with a preset and for the groups belonging to the preset. The determination also takes into account the indication of whether and how (e.g. for the determination of the reference or the modified loudness) the groups are to be used.
- the loudness controller is configured to determine the loudness compensation gain based on data comprised by the metadata of the audio signal referring to a selected preset and wherein the preset is selected by the user input.
- the preset is chosen by the user via the user input.
- the loudness controller is configured to determine the loudness compensation gain based on data comprised by the metadata of the audio signal referring to a default preset.
- the default preset is set prior to or independently of a user input. This embodiment handles the situation that a user does not choose a preset. For this, a default preset is used, e.g. prior to any user input, to ensure that even without an interaction by the user a set of data, here covering a default preset, is used for determining the loudness compensation gain.
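The choice between a user-selected preset and the default preset can be sketched as a lookup with a fallback. The metadata layout (`presets`, `default_preset`) and the decision to fall back silently when the requested preset is unknown are assumptions for this example.

```python
def select_preset(metadata, user_preset_id=None):
    """Return the preset data to use for the loudness compensation gain.

    Falls back to the default preset when the user has not chosen one
    (or, by assumption, has chosen one that the metadata does not contain).
    """
    presets = metadata["presets"]
    if user_preset_id is not None and user_preset_id in presets:
        return presets[user_preset_id]
    return presets[metadata["default_preset"]]
```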
- an audio processor for processing an audio signal comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group comprises one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- the loudness controller here is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on.
- the audio signal may comprise as audio objects different soundtracks belonging to different language versions of a movie.
- the presets also may refer to different language versions. Hence, in the different presets one soundtrack of one language will be switched on while the other versions will be switched off.
- This example also shows that the user may switch between the different language versions by switching on a desired and offered language version and, thus, switching off the soundtrack associated with a default preset. Nevertheless, switching on one group does not always imply switching off another group and vice versa.
- the audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before. This holds also the other way around as one audio processor discussed above is in at least one embodiment realized taking the following embodiments into account.
- the loudness controller determines the loudness compensation gain based on the user input, depending on whether a group is switched off or switched on by the user input.
- the user interaction affects the determination of the loudness compensation gain.
- the loudness controller is configured to discard a group for determining the modified loudness when the group is switched off in response to the user input. If the user switches off a group, in this embodiment, the group is not used for determining the modified loudness which results from the loudness values representing the user's wishes.
- the loudness controller is configured to discard a group for determining the reference loudness when the group is switched off in the metadata and to include the group for determining the modified loudness when the group is switched on by the user input.
- a group is switched off in the metadata and is not used for determining the reference loudness. If the user switches the group on, it is included for the evaluation of the modified loudness.
- the loudness controller is configured to include a group for determining the reference loudness when the group is switched on in the metadata and to exclude the group for determining the modified loudness when the group is switched off by the user input.
- the reverse case of the foregoing embodiment is taken care of.
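The on/off handling can be sketched by giving the reference loudness and the modified loudness separate on/off states: the state from the metadata for the former, the state after the user input for the latter. The power-sum rule and the data layout are assumptions, and for brevity the same gain is used on both sides; the sketch also assumes that at least one group is switched on in each state.

```python
import math


def loudness_db(groups, is_on):
    """Power-sum the loudness (dB) of all groups that are switched on."""
    power = sum(
        10 ** ((g["loudness_db"] + g["gain_db"]) / 10)
        for name, g in groups.items()
        if is_on[name]
    )
    if power == 0:
        raise ValueError("at least one group must be switched on")
    return 10 * math.log10(power)


def on_off_compensation_gain_db(groups, on_in_metadata, on_after_user_input):
    """Reference uses the metadata state; modified uses the user's state."""
    reference = loudness_db(groups, on_in_metadata)
    modified = loudness_db(groups, on_after_user_input)
    return reference - modified
```

Swapping one language track for another of equal loudness leaves C at 0 dB, whereas switching an additional track on yields a negative C.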
- an audio processor for processing an audio signal comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal with at least one group loudness missing in the metadata of a group comprised by the audio signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- the loudness controller takes care of the situation that for a group present within the audio signal the corresponding group loudness is missing.
- the group loudness may either be missing for a specific preset or playback configuration and so on, or the metadata may be completely void of any group loudness for this group.
- the audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before. This holds also the other way around as the audio processor discussed above is in at least one embodiment realized taking the following embodiments into account.
- the loudness controller is configured to calculate the missing group loudness using a loudness of a preset, the reference gain of the group with missing group loudness as well as the group loudnesses and the reference gains for the groups having a group loudness.
- the loudness of the preset is the overall loudness of the groups of the preset.
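One way to recover a single missing group loudness from these quantities is to subtract the power of the known groups from the power of the preset. The sketch assumes that the preset loudness is the power sum of all gain-weighted group loudnesses and that all values are in dB; the function and parameter names are hypothetical.

```python
import math


def missing_group_loudness_db(preset_loudness_db, known_groups, missing_ref_gain_db):
    """Estimate the loudness (dB) of the one group that lacks a group loudness.

    known_groups: list of (group_loudness_db, reference_gain_db) tuples.
    """
    preset_power = 10 ** (preset_loudness_db / 10)
    known_power = sum(10 ** ((loud + gain) / 10) for loud, gain in known_groups)
    residual = preset_power - known_power
    if residual <= 0:
        raise ValueError("preset loudness is not above the known groups' loudness")
    # Remove the reference gain to get back to the ungained group loudness.
    return 10 * math.log10(residual) - missing_ref_gain_db
```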
- the loudness controller is configured, in the case that the metadata of the audio signal is missing at least one group loudness, to determine the loudness compensation gain by a blind loudness compensation using only at least one reference gain and at least one modified gain.
- the case of at least one missing group loudness is handled identically to the case that all group loudnesses are missing.
- the loudness controller is configured, in the case that the metadata of the audio signal is void of group loudnesses, to determine the loudness compensation gain by a blind loudness compensation using only at least one reference gain and at least one modified gain.
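A blind loudness compensation that relies on gains alone might look like the sketch below. It assumes, purely for illustration, that all groups are equally loud, so that only the power sums of the reference gains and of the modified gains (both in dB) matter; the text does not specify the formula.

```python
import math


def blind_compensation_gain_db(reference_gains_db, modified_gains_db):
    """Compensation gain from gains only, treating all groups as equally loud."""
    if not reference_gains_db or len(reference_gains_db) != len(modified_gains_db):
        raise ValueError("need one reference and one modified gain per group")
    ref_power = sum(10 ** (g / 10) for g in reference_gains_db)
    mod_power = sum(10 ** (g / 10) for g in modified_gains_db)
    return 10 * math.log10(ref_power / mod_power)
```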
- an audio processor for processing an audio signal comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to a playback configuration for a reproduction of the signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- the audio processor determines the loudness compensation gain based on data referring to a specific playback configuration.
- the metadata associated with the audio signal, and in one embodiment comprised by it, therefore contains data specified for at least one playback configuration.
- for each playback configuration, the metadata contains data corresponding to the respective playback (or reproduction) configuration.
- the audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, this audio processor is in one embodiment combined with at least one of the foregoing embodiments.
- the loudness controller is configured to determine the loudness compensation gain based on the data of the metadata referring to a playback configuration and comprising associated group loudnesses and/or reference gain values.
- the different playback configurations are associated with different gain values and/or group loudnesses for the respective groups.
- the metadata comprises data for different presets and different playback configurations.
- the audio processor comprises a configuration converter for converting data comprised by the metadata and referring to the playback configuration to data referring to a current playback configuration, wherein the loudness controller is configured to determine the loudness compensation gain using data provided by the configuration converter.
- the audio processor takes care of the situation that the current playback configuration for reproduction of the signal differs from the playback configurations provided by the metadata.
- the data of the metadata are converted in order to fit the current playback configuration, and the converted data are used for the determination of the loudness compensation gain.
- the audio processor comprises a format converter for converting a signal to a predefined playback configuration.
- the loudness controller is configured to select the specific loudness value for the specific playback configuration used by the format converter.
- the audio signal comprises a bitstream with the metadata and the metadata comprises the reference gain for at least one group.
- the metadata of the audio signal comprises a group loudness for at least one group.
- the metadata comprises group loudnesses for a plurality of groups belonging to the audio signal.
- the loudness controller is configured to determine the reference loudness for at least one group using the group loudness and the gain value for the—at least one—group, wherein the loudness controller is configured to determine the modified loudness for the—at least one—group using the group loudness and the modified gain value, and wherein the modified gain value is modified by the user input.
- the loudness controller is configured to determine the reference loudness—named L ref —for a plurality of groups using the respective group loudnesses—named L i —and gain values—named g i —for the groups. Further, the loudness controller is configured to determine the modified loudness—named L mod —for a plurality of groups using the respective group loudness L i and modified gain values—named h i —for the groups.
- the two pluralities of groups are identical and in a different embodiment different. The pluralities also depend on the respective data of the metadata.
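- the relation between the group loudnesses L i , the gains g i and h i , and the resulting loudnesses L ref and L mod can be illustrated with a minimal sketch. It assumes that all values are given in dB and that the groups combine as a power sum, in the same form as the anchor-based reference loudness formula of this description; the function name `scene_loudness` and the example values are hypothetical.

```python
import math

def scene_loudness(group_loudness, gains, indexes):
    """Power-sum of the gain-weighted group loudnesses; all values in dB."""
    power = sum(10 ** (gains[i] / 10) * 10 ** (group_loudness[i] / 10)
                for i in indexes)
    return 10 * math.log10(power)

# Hypothetical values for illustration only (not taken from the source).
L = {"bed": -27.0, "dialog": -30.0}   # group loudnesses L_i
g = {"bed": 0.0, "dialog": 0.0}       # reference gains g_i from the metadata
h = {"bed": 0.0, "dialog": 6.0}       # modified gains h_i after user input

L_ref = scene_loudness(L, g, L.keys())  # reference loudness
L_mod = scene_loudness(L, h, L.keys())  # modified loudness
```

- in this sketch, raising the dialog gain makes L mod larger than L ref , which is the deviation the loudness compensation gain is meant to offset.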
- the loudness controller is configured to perform a limitation operation on the loudness compensation gain so that the loudness compensation gain is lower than an upper threshold and/or so that the loudness compensation gain is greater than a lower threshold.
- the loudness manipulator is configured to apply a corrected gain to a signal determined by the loudness compensation gain and by a normalization gain determined by a target loudness level set by user input and a metadata loudness level comprised by the metadata of the audio signal.
- the normalization gain is determined by using the ratio of the loudness level of the respective groups of the audio signal and the loudness level set by the user to be experienced by the user for the reproduction of the audio signal.
- the foregoing embodiments of audio processors allow a loudness compensation following a user input.
- the loudness compensation is improved by considering data describing groups of the audio signal and their relevance or kind of usage for the loudness compensation.
- the information about the groups refines the loudness compensation.
- an audio encoder for generating an audio signal comprising metadata.
- the audio encoder comprising: a loudness determiner for determining a loudness value for at least one group having one or more audio elements; and a metadata writer for introducing the determined loudness value as a group loudness into the metadata.
- the loudness determiner is configured to determine different loudness values and/or different gain values for different playback configurations.
- the metadata writer is configured to introduce the determined different loudness values and/or different gain values in association with the respective playback configuration into the metadata.
- the metadata contains different data for the concerned groups for different playback configurations, thus, improving the playback of the groups of the audio signal.
- the loudness determiner is configured to determine different loudness values and/or different gain values for different presets referring to sets of at least one group comprising one or more audio elements.
- the metadata writer is configured to introduce the determined different loudness values and/or different gain values in association with the respective preset into the metadata.
- the presets refer to specific sets of groups that are associated with specific group loudnesses and/or reference gain values.
- the audio encoder further comprises a controller, wherein the controller is configured to determine which group is to be used for determining a loudness compensation gain or is to be neglected, and wherein the metadata writer is configured for writing an indication into the metadata indicating which group is to be used or is to be neglected for determining the loudness compensation gain.
- the indication is in one embodiment a flag.
- the indication refers to presets, playback configurations, anchors and/or durations and, hence, relevance of a group.
- the metadata contains for at least one group of the audio signal different data (e.g. group loudness or reference gain) with different values.
- the audio encoder further comprises an estimator, wherein the estimator is configured to compute a group loudness value for a group, where the group loudness value for the group is undetermined by the loudness determiner.
- the metadata writer is configured for introducing the computed group loudness value into the metadata so that all groups of the audio signal have associated group loudnesses.
- the audio encoder compensates a missing group loudness by computing it based on available data.
- An advantage is also achieved by a method for processing an audio signal.
- the method comprises at least the following steps:
- An advantage is also achieved by a method for generating an audio signal comprising metadata.
- the method comprises determining a loudness value for a group having one or more audio elements and introducing the determined loudness value for the group as a group loudness into the metadata.
- An advantage is also achieved by a computer program for performing, when running on a computer or a processor, one of the preceding methods.
- FIG. 1 shows an overview of an audio decoder
- FIG. 2 shows an overview of an audio processor according to the invention
- FIG. 3 shows an overview of an inventive audio encoder.
- FIG. 1 shows an overview of an MPEG-H 3D Audio decoder as an example for an audio processor, illustrating all major building blocks of the system:
- HOA in the form of audio signals 507 as outputs of the format converter 504 , the object renderer 505 , and the HOA renderer 506 are then mixed together in the mixing stage. This is done by a mixer 508 providing a mixed audio signal 509 .
- the possible user interactivity can be divided into e.g. two different categories:
- a group refers to a specific collection of individual audio elements.
- the specific grouping information of the audio elements is included in the MPEG-H 3D Audio metadata that is transmitted together with the audio content in the audio stream.
- the elements of a group cannot be interactively changed on their own. Only the entire group can be manipulated, i.e. all included elements together.
- An example is given by a group that consists of the channels corresponding to a stereo or 5.1 channel loudspeaker configuration. In an extreme case, a group can consist of only a single element, e.g. the dialog object of a program. The user is then able to change e.g. the level of this dialog object within the audio scene.
- Presets define a combination of groups in an audio scene. Presets can be used to efficiently signal different presentation of the same audio program within the same audio stream.
- the preset definition also includes default or initial rendering information of the individual groups, which is used in case the user does not apply any modification. The most important example of this rendering information is the gain that is applied to a group when rendering the entire audio scene.
- the configuration information that defines a preset is determined at the encoder and it is part of the metadata, e.g. MPEG-H 3D Audio metadata.
- the main or default audio scene can be considered as a special type of preset that includes all audio elements without necessarily specifying grouping information. Nevertheless, default or initial rendering information (e.g. gain) for the individual audio elements is typically provided in the metadata also for the main audio scene.
- Loudness control is especially important in broadcast applications, where it represents an essential feature to fulfill applicable broadcast regulations and recommendations.
- the loudness control concept included in MPEG-H 3D Audio is based on metadata representing the measured loudness of the audio program.
- the metadata is transmitted in the audio stream as an embodiment of the audio signal to be processed by the audio processor together with the actual audio content.
- a loudness normalization gain is computed based on the transmitted loudness information and the target loudness level.
- the loudness normalization gain in one embodiment is then applied to the audio signal after the mixer 508 , as illustrated, for example, in FIG. 1 .
- additional loudness metadata is included, corresponding to the measured loudness of the different presets. Processing steps such as format conversion (downmixing) or dynamic range processing can potentially change the loudness of the audio. Thus, in one embodiment, additional loudness information is included to assure correct loudness normalization also in these cases.
- loudness information of individual groups or even single audio elements is transmitted.
- the information of group loudness is provided in one embodiment with respect to different loudspeaker configurations. For example, if a group consists of the channel signals, different group loudness information can be included for the case of a reproduction to a stereo or 5.1 loudspeaker configuration.
- the loudness information of groups will be used for the loudness control in interactive scenarios as proposed in this invention.
- the loudness information mentioned above refers to a large variety of configurations for a program (e.g. different presets or different loudspeaker reproduction layouts). Since these configurations are static, one embodiment envisages to measure their loudness at the encoder (or before the encoding process) and populate the corresponding metadata fields in the, for example, MPEG-H 3DA stream.
- an important feature of modern audio coding systems such as MPEG-H 3DA is the support of user interactivity at the decoder:
- the user can, e.g. adjust the volume of specific groups or even switch them on and off.
- An important use case is given by dialog enhancement, where the user can manipulate the level of the dialog object, or the group associated with the dialog.
- the user increases the level of an immersive sound bed, represented by an HOA-based group.
- the user wants to switch on specific groups, e.g. representing video description for the hearing impaired or voice-over tracks.
- Changing the level of groups also implies that the overall loudness of the rendered audio scene is changed compared to the unmodified case. Thus, consistent playback loudness cannot be assured anymore after gain interactivity. Since the user may also change the levels of different objects frequently, the loudness level of the audio output can vary over time even for the same program.
- the invention improves loudness control at the decoder in order to enable consistent loudness normalization also in case of user interaction on the levels of groups of audio elements.
- the loudness of a program or a preset is preserved when the user changes the level of certain audio elements or groups within the rendered audio scene.
- a loudness compensation gain is determined in one embodiment based on a reference loudness corresponding to the original audio scene and a modified loudness taking into account gain interactivity of the user. The loudness compensation gain is then applied to the rendered audio signal together with the regular loudness normalization gain to achieve the desired decoder target loudness.
- FIG. 2 schematically shows an example of an audio processor 1 (also called decoder or just apparatus for processing an audio signal) receiving an audio signal 100 and providing an output signal 101 .
- the output signal 101 in the shown example is an audio signal suitable to be fed to an amplifier (not shown) connected to loudspeakers of the playback situation, or to be fed directly to loudspeakers or a headphone.
- the audio signal 100 comprises a bitstream with the audio signals of individual audio objects and metadata providing information about the audio elements and how to handle them.
- the audio signal 100 is submitted to an audio signal modifier 2 which receives user input 200 .
- the user input 200 refers, in the shown example, at least to the selection of a certain preset. Presets refer to specific combinations of groups of audio elements with associated reference gains g i and/or group loudnesses L i for the corresponding groups of audio elements. If the user does not choose a preset, a default preset with default values will be used in the shown embodiment.
- the user sets via the input 200 the gain values of individual groups.
- the modified gain values h i imply that the corresponding group will be amplified or attenuated relative to the reference gain values g i comprised by the metadata.
- the user might prefer to listen to an amplified background choir and not—as usually—to the leading voice. Hence, the user will raise the gain value of the background choir and decrease the gain value of the lead voice or will switch off this voice.
- the user has also the possibility to switch a group off or on. Hence, if the user does not want to hear a group, the group can be switched off.
- the metadata comprises a flag implying that a group is switched off for a specific preset, the user can switch it on. This, for example, can be the case when the audio signal comprises different language versions of a spoken text and the presets refer to the different languages.
- switching a group on or off refers to whether the group is used in the playback or not.
- the signal modifier 2 modifies the audio signal 100 according to the user input 200 by amplifying or attenuating the groups of audio elements belonging to the audio signal 100 and according to the selected or a default preset covered by the respective data of the metadata.
- a configuration converter 3 which converts data to the current playback configuration by which the audio signal 100 is going to be reproduced. Which playback configuration is given and, thus, is the current situation is also covered by the user input 200 , e.g. via a selection from a list.
- the metadata may refer to a surround sound situation whereas the current playback situation allows a stereo playback.
- This conversion refers in one embodiment to the gain values as well as to the loudness values.
- the configuration converter 3 submits the converted data to the loudness controller 6 which also receives the user input 200 . Based on these data, the loudness controller 6 calculates the loudness compensation gain C which is submitted to the loudness manipulator 5 .
- the loudness manipulator 5 sets the overall loudness of the output signal 101 by using the loudness compensation gain C and the signal received from the mixer 4 .
- the mixer 4 receives in the shown embodiment via the configuration converter 3 the audio signal 100 after the modification by the audio signal modifier 2 and the conversion by the configuration converter 3 and combines the different groups of audio elements (compare FIG. 1 ).
- a specific audio scene is defined by a preset, i.e. a specific combination of groups.
- Each of the groups has an associated initial/default gain defined for the given preset. Additionally, the loudness of each group within the preset is assumed to be available.
- the preset may be either chosen by the user or set as a default preset. The following notation will be used:
- a group consists of the collection of channel signals corresponding to a specific loudspeaker configuration or, for example, to an HOA audio scene
- multiple group loudness values can be included in the metadata. These different loudness values are associated with different loudspeaker configurations used for playback. For example, if a group represents a channel bed with a 5.1 or 22.2 loudspeaker configuration, a different loudness may be measured for reproducing the group for the original 5.1 or 22.2 loudspeaker configuration compared to the case where the channel bed has to be mapped to a stereo reproduction system using the format converter. In this case, the group loudness associated with stereo reproduction is chosen in one embodiment if available in the transmitted metadata. Otherwise, the group loudness associated with the original loudspeaker configuration is used.
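- the selection rule described above (use the group loudness measured for the current reproduction layout if it was transmitted, otherwise fall back to the value for the original loudspeaker configuration) can be sketched as follows. The dictionary layout and the function name `select_group_loudness` are assumptions made for illustration and do not reflect the actual bitstream syntax.

```python
def select_group_loudness(loudness_by_layout, playback_layout, original_layout):
    """Pick the group loudness for the playback layout, else the original one.

    loudness_by_layout maps a layout label (e.g. "5.1", "stereo") to the
    group loudness in dB as found in the metadata (hypothetical structure).
    """
    if playback_layout in loudness_by_layout:
        return loudness_by_layout[playback_layout]
    return loudness_by_layout[original_layout]

# Hypothetical metadata for a 5.1 channel bed that is downmixed to stereo.
with_stereo = {"5.1": -24.0, "stereo": -23.0}
without_stereo = {"5.1": -24.0}

chosen = select_group_loudness(with_stereo, "stereo", "5.1")       # stereo value
fallback = select_group_loudness(without_stereo, "stereo", "5.1")  # 5.1 value
```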
- the loudness information is not provided for each group separately, but the same loudness value is referred to by an ensemble of groups.
- the loudness of the modified audio scene is computed as
- the loudness compensation gain C is obtained by relating the reference loudness L ref of the preset to the modified loudness L mod of the preset:
- $$C_{lim} = \begin{cases} C_{max}, & \text{if } C > C_{max} \\ C, & \text{if } C_{min} \le C \le C_{max} \\ C_{min}, & \text{if } C < C_{min} \end{cases}$$
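- the limitation of the compensation gain can be sketched in a few lines. The clamp mirrors the three cases of C lim ; the helper `compensation_gain_db`, which takes C as the dB difference between reference and modified loudness, is an assumption, since the exact expression for C is not reproduced in this text.

```python
def compensation_gain_db(l_ref, l_mod):
    # Assumption: C is the dB difference between reference and modified loudness.
    return l_ref - l_mod

def limit_gain_db(c, c_min, c_max):
    """Clamp C to the interval [c_min, c_max], following the cases of C_lim."""
    if c > c_max:
        return c_max
    if c < c_min:
        return c_min
    return c

# Hypothetical thresholds of -6 dB and +6 dB.
c = compensation_gain_db(-25.0, -15.0)   # -10 dB before limiting
c_lim = limit_gain_db(c, -6.0, 6.0)      # limited to the lower threshold
```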
- the loudness normalization gain G N used for loudness normalization according to the state of art (see e.g. the EP 2 879 131 A1) is then corrected according to
- the loudness normalization is done based on the original normalization gain G N and the loudness compensation is performed separately on the audio signals using the limited version of the compensation gain C lim .
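- how the normalization gain and the limited compensation gain might be combined is sketched below. It is assumed here that both gains are expressed in dB, that G N is the difference between the decoder target loudness and the loudness signalled in the metadata, and that the corrected gain is their sum; applying the two gains separately, as in the alternative above, yields the same overall factor under these assumptions. All function names are hypothetical.

```python
def corrected_gain_db(target_loudness, metadata_loudness, c_lim):
    """Corrected gain in dB: normalization gain G_N plus limited compensation."""
    g_n = target_loudness - metadata_loudness  # assumed form of G_N in dB
    return g_n + c_lim

def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10 ** (gain_db / 20)

def apply_gain(samples, gain_db):
    """Scale a list of audio samples by a gain given in dB."""
    factor = db_to_linear(gain_db)
    return [factor * s for s in samples]

# Hypothetical values: target -24, metadata loudness -23, C_lim of -3 dB.
gain = corrected_gain_db(-24.0, -23.0, -3.0)
```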
- a certain group can be active only during a very short period of time within the program and be completely silent for the remaining time. Due to the gating process during the loudness measurement, e.g. according to ITU-R BS.1770-3 (ITU-R being the Radiocommunication Sector of the International Telecommunication Union, ITU), such a group can still have a significant measured loudness. This group loudness will then influence the loudness compensation gain during the entire program duration, although the group is active only during a very short amount of time. On the other hand, such a sparse group signal contributes only little to the loudness measurement of the entire program/preset mix.
- the loudness compensation will lead to an attenuation of all remaining audio elements during the entire program duration.
- the loudness compensation process should ignore that particular sparse group.
- the metadata contains a corresponding flag for this group to be neglected for the calculation of the loudness compensation.
- information is added to the metadata included in the audio stream or audio signal that indicates whether a group should be excluded from the loudness compensation, i.e. from computing the reference and modified loudness of a preset or the global audio scene.
- This information is in one embodiment a simple flag for each group indicating whether it is included in the loudness compensation process or not.
- EBU-R128 involves measuring the loudness of the full program mix
- ATSC A/85 recommends measuring only the loudness of the anchor element of a program, which is typically represented by the dialog.
- the information which group is part of the program anchor is, in an embodiment, included in the metadata of the audio stream/audio signal.
- the reference loudness is obtained by
- $$L_{ref} = 10 \log_{10} \sum_{i \in A_{ref}} 10^{g_i/10} \cdot 10^{L_i/10}$$
- where A ref denotes the set of indexes referring to groups that are part of the anchor element of the default audio scene or preset.
- the modified loudness for anchor-based loudness compensation using the set of group indexes A mod (referring to groups that are part of the anchor element of the modified audio scene or preset) reads
- the anchor-based approach is used for the case that one or all of the anchor groups are amplified by the user, i.e. h i >g i .
- if the anchor groups are attenuated, i.e. for the case that h i < g i , the loudness compensation with respect to the loudness of the full mix is used.
- the information about the anchor groups is comprised by the metadata.
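- the choice between anchor-based compensation and compensation with respect to the full mix can be sketched as a selection of the group indexes that enter the loudness computation. The function name is hypothetical, and the handling of unchanged anchor gains (h i equal to g i , treated here like the full-mix case) is an assumption.

```python
def loudness_indexes(all_indexes, anchor_indexes, g, h):
    """Return the group indexes to use for the loudness compensation.

    Anchor-based compensation is chosen when at least one anchor group is
    amplified by the user (h_i > g_i); otherwise the full mix is used.
    """
    anchor_amplified = any(h[i] > g[i] for i in anchor_indexes)
    return set(anchor_indexes) if anchor_amplified else set(all_indexes)

# Hypothetical scene: the dialog is the anchor, music and effects are not.
groups = ["dialog", "music", "effects"]
anchor = ["dialog"]
g = {"dialog": 0.0, "music": 0.0, "effects": 0.0}

boosted = loudness_indexes(groups, anchor, g, {"dialog": 6.0, "music": 0.0, "effects": 0.0})
lowered = loudness_indexes(groups, anchor, g, {"dialog": -6.0, "music": 0.0, "effects": 0.0})
```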
- the loudness compensation approach presented in the foregoing involves using the information on the loudness of each group within a preset or the global audio scene.
- the loudness information may be available only for some groups and missing for others.
- missing group loudness information is calculated from the loudness of the preset (or the default audio scene) and the group loudness values that are available.
- Let L p denote the measured loudness of the considered preset of the audio program, i.e. the measured joint loudness of the audio objects belonging to the respective preset.
- Let B denote the set of indexes to groups for which the loudness information is available.
- a residual loudness L res of the preset is computed from the preset loudness, the available group loudness information, and the default/initial gains of these groups:
- the residual loudness can be expressed as
- $$L_{res} = L_A + 10 \log_{10}\left(\sum_{i \notin B} 10^{g_i/10}\right)$$
- $$L_A = L_{res} - 10 \log_{10}\left(\sum_{i \notin B} 10^{g_i/10}\right)$$
- the reference loudness and modified loudness that may be used for the loudness compensation can then be computed as already discussed, where any missing group loudness L i is replaced by a corresponding estimate L A .
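- the estimation of a missing group loudness can be sketched as follows. The residual loudness is computed here by subtracting, in the power domain, the gain-weighted loudness of the groups with known loudness from the preset loudness L p ; this particular expression is an assumption derived from the verbal description, and the function name is hypothetical. The estimate L A then follows from the residual loudness and the reference gains of the groups without loudness information.

```python
import math

def estimate_missing_group_loudness(preset_loudness, group_loudness, gains):
    """Estimate a common loudness L_A for all groups lacking loudness data.

    group_loudness holds L_i only for the groups in B; gains holds g_i for
    every group of the preset. All values are in dB. Returns None if nothing
    is missing or no residual power remains.
    """
    known = set(group_loudness)
    missing = [i for i in gains if i not in known]
    known_power = sum(10 ** ((gains[i] + group_loudness[i]) / 10) for i in known)
    residual_power = 10 ** (preset_loudness / 10) - known_power
    if not missing or residual_power <= 0:
        return None
    l_res = 10 * math.log10(residual_power)
    return l_res - 10 * math.log10(sum(10 ** (gains[i] / 10) for i in missing))

# Hypothetical preset: groups 0 and 1 were measured, group 2 was not.
gains = {0: 0.0, 1: 0.0, 2: -3.0}
measured = {0: -27.0, 1: -30.0}
# Preset loudness built from a "true" loudness of -25 for group 2.
L_p = 10 * math.log10(10 ** -2.7 + 10 ** -3.0 + 10 ** -2.8)
L_A = estimate_missing_group_loudness(L_p, measured, gains)
```

- with several missing groups, the same estimate L A is assigned to each of them, in line with the replacement of any missing L i by L A described above.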
- the estimation of missing group loudness information is done either at the encoder side or the decoder side of the audio coding system.
- the information on the group loudness within the transmitted metadata in the audio stream can be either measured, or a corresponding estimate as described above can be included instead.
- the loudness compensation stage at the decoder has all loudness information that may be used and can do the processing in the same way as in the case where all group loudnesses have been measured in advance by the encoder.
- the missing group loudness values in the metadata of the audio stream are estimated as described above, and then, the loudness compensation is based on the estimated group loudness values.
- a special use case is given if no information on the loudness of any group is provided in the metadata of the audio stream.
- the loudness compensation has to work only based on the relevant rendering information available, i.e. the default or initial gain of a group g i and its modified version h i after user interaction. This is referred to as blind loudness compensation, as no loudness information for the groups is known at the decoder.
- the blind loudness compensation is performed even if just one group loudness is missing in the metadata.
- the assumption is used that the loudness values of all groups within a preset are the same.
- the gain factor for blind loudness compensation may only use information on the group gains but no loudness related information.
- the blind loudness compensation is performed in case that at least one group loudness is missing. Hence, even one missing group loudness causes the blind loudness compensation.
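- a minimal sketch of a blind loudness compensation is given below. It assumes that all groups have the same (unknown) loudness, so that this common loudness cancels when the reference and the modified power sums are related and only the gains g i and h i remain. The resulting expression is derived from that assumption and is not necessarily the exact formula of the described embodiment; the function name is hypothetical.

```python
import math

def blind_compensation_gain_db(g, h, ref_indexes, mod_indexes):
    """Compensation gain in dB using only reference and modified gains."""
    ref_power = sum(10 ** (g[i] / 10) for i in ref_indexes)
    mod_power = sum(10 ** (h[i] / 10) for i in mod_indexes)
    return 10 * math.log10(ref_power / mod_power)

# Hypothetical example: two groups, the second one raised by 6 dB.
g = {0: 0.0, 1: 0.0}
h = {0: 0.0, 1: 6.0}
c_blind = blind_compensation_gain_db(g, h, g.keys(), h.keys())
```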
- a general set of indexes is specified referring to groups that should be included for the computation of the reference loudness of a preset or the default audio scene.
- This set is derived from information in the metadata of the audio stream whether a group should be included for performing loudness compensation for the default audio scene or a preset. This information is usually introduced in the metadata of the audio stream at the encoder.
- the loudness compensation process is controlled by appropriately defining these bitstream elements. For example, if a certain group should be excluded, the corresponding bitstream element is set to “false”.
- Anchor-based loudness compensation is realized in one embodiment by including only groups that are part of the anchor element of the default audio scene or of a defined preset, and setting the corresponding bitstream elements to “true”. Other ways to provide this information can be used in different implementations.
- groups are discarded for computing the reference loudness L ref if they are switched off in the default audio scene or in a preset.
- the resulting set of indexes is denoted as K ref .
- any group that is switched off in the modified scene is excluded from computing the modified loudness L mod . If a group is switched off in the default scene, but switched on by the user in the modified scene, the corresponding group loudness is excluded from the computation of the reference loudness L ref but included in the computation of the modified loudness L mod and vice versa.
- the set of group indexes for the modified loudness L mod is denoted with K mod .
- the loudness compensation gain is then computed analogously to the discussion above by replacing M ref by K ref and by replacing M mod by K mod .
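- the derivation of the index sets K ref and K mod can be sketched as follows. The three dictionaries stand in for the metadata flag and the on/off states of the groups; their names and layout are assumptions for illustration and do not represent actual bitstream elements.

```python
def select_indexes(include_flag, on_in_default, on_in_modified):
    """Build K_ref and K_mod from the inclusion flags and on/off states.

    include_flag[i]   : metadata flag, True if group i takes part in the
                        loudness compensation
    on_in_default[i]  : True if group i is switched on in the default scene
    on_in_modified[i] : True if group i is switched on after user input
    """
    k_ref = {i for i, inc in include_flag.items() if inc and on_in_default[i]}
    k_mod = {i for i, inc in include_flag.items() if inc and on_in_modified[i]}
    return k_ref, k_mod

# Hypothetical scene: "sparse" is flagged for exclusion, "voiceover" is off
# by default and switched on by the user.
include = {"bed": True, "dialog": True, "sparse": False, "voiceover": True}
default_on = {"bed": True, "dialog": True, "sparse": True, "voiceover": False}
modified_on = {"bed": True, "dialog": True, "sparse": True, "voiceover": True}

K_ref, K_mod = select_indexes(include, default_on, modified_on)
```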
- the blind loudness compensation is used as a fallback mode.
- the same approach with respect to selecting group indexes for the loudness compensation (K ref and K mod ) as described above is applied in the fallback mode.
- FIG. 3 shows an embodiment of an audio encoder 20 which generates a digital audio signal 100 based on different audio sources.
- the audio signal 100 comprises metadata to be used e.g. by the audio processor discussed above.
- the audio encoder 20 comprises a loudness determiner 21 for determining a loudness value for at least one group having one or more audio elements 50 .
- three audio sources X 1 , X 2 , and X 3 are present, each comprised by one group.
- the loudness values of two of them X 2 and X 3 are determined as L 2 and L 3 and are submitted to a metadata writer 22 .
- the metadata writer 22 introduces the determined loudness values for the two groups X 2 and X 3 as corresponding group reference loudness information L 2 and L 3 into the metadata of the audio signal 100 .
- Gain values as reference gains g 1 , g 2 , g 3 for the groups X 1 , X 2 , and X 3 are also inserted by the metadata writer 22 into the metadata of the audio signal 100 .
- the group loudnesses and reference gain values are determined for specific presets and/or different playback configurations. Also, the loudness for different presets as a respective loudness overall L p is measured.
- the loudness of the first audio element 50 labelled as X 1 is not measured by the loudness determiner 21 but is calculated or estimated by the estimator 24 (see the discussion above) and is given as a corresponding reference loudness L 1 to the metadata writer 22 to be written into the metadata.
- the controller 23 in the shown embodiment is connected to the loudness determiner 21 as well as to the metadata writer 22 .
- the controller 23 determines which group or which groups are to be considered or to be neglected for the determination of the loudness compensation gain C.
- For the data about the usage of the groups an indication is written by the metadata writer 22 into the metadata.
- the corresponding data e.g. in the form of flags, indicates which group is to be used or which group is to be neglected for the determination of the loudness compensation gain C by the audio processor or by a decoder.
- the resulting audio signal 100 comprises the actual signals received from the audio objects 50 and the metadata characterizing the actual signals and their intended treatment by the audio decoder 1 .
- the data of the metadata refers to groups of audio objects, whereas it is also possible that a group covers just one audio object/element.
- the metadata contains at least some of the following data:
- the metadata advantageously contains different sets of data for different presets and/or different playback configurations. Hence, different recording and different reproduction situations are considered leading to different data sets for the relevant groups.
- in the following, the invention is explained via different examples for implementing loudness compensation for user interactivity with an audio coding system.
- the encoder computes estimates of the missing group loudness values.
- the encoder may also apply different methods to estimate missing (not measured) group loudness information.
- the loudness compensation at the decoder is then performed as in the case that the loudness information has been measured for all groups.
- the audio stream includes loudness information only for a limited number of groups.
- the missing group loudness information is estimated at the decoder.
- the loudness compensation at the decoder is then performed as in the case that all loudness information that may be used has been included in the metadata of the audio stream.
- Another embodiment includes the blind loudness compensation as a fallback mode if any group loudness information that may be used is missing at the decoder to perform correct loudness compensation.
- the same mechanism for determining the set of indexes K ref and K mod for selecting the groups to be included in the computation of the reference and modified loudness as described above is used in the fallback mode.
- the selection of the set of group indexes K ref and K mod is still based on the corresponding information generated at the encoder side, which is provided with the metadata of the audio stream.
- a first embodiment refers to an audio processor for processing an audio signal, comprising: an audio signal modifier for modifying the audio signal in response to a user input; a loudness controller for determining a loudness compensation gain based on a reference loudness or a reference gain and a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input; and a loudness manipulator for manipulating a loudness of a signal using the loudness compensation gain.
- a second embodiment depending on the first embodiment refers to an apparatus, wherein the audio signal comprises a bitstream with metadata, the metadata comprising a group loudness for a group and a gain value for a group.
- a third embodiment depending on the first or second embodiment refers to an apparatus, wherein the loudness controller is configured to calculate the reference loudness for a group or a set of groups using the group loudness or the group loudnesses and the gain value or the gain values for the group or the set of groups, and to calculate the modified loudness for a group or a set of groups using the group loudness or the group loudnesses and the modified gain value or the modified gain values for the group or the set of groups, wherein the modified gain value or the modified gain values are modified by the user input.
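The computation named in the third embodiment can be sketched as follows. This is a minimal illustration, not the formula of the specification: it assumes that the group loudnesses L i are given on a dB-like scale, that the gains g i and h i are linear factors, and that the groups add on a power basis. The function names are hypothetical.

```python
import math

def set_loudness(group_loudness, gains):
    """Loudness of a set of groups, ASSUMING power-wise summation of the groups."""
    # Each group contributes gain^2 * 10^(L_i / 10); at least one gain must be nonzero.
    power = sum(g * g * 10.0 ** (l / 10.0) for l, g in zip(group_loudness, gains))
    return 10.0 * math.log10(power)

def compensation_gain(group_loudness, ref_gains, mod_gains):
    """Linear gain C that maps the modified loudness back to the reference loudness."""
    l_ref = set_loudness(group_loudness, ref_gains)   # uses the metadata gains g_i
    l_mod = set_loudness(group_loudness, mod_gains)   # uses the user-modified gains h_i
    return 10.0 ** ((l_ref - l_mod) / 20.0)

# Doubling every gain quadruples the power, so C is about 0.5.
c = compensation_gain([-23.0, -27.0], [1.0, 1.0], [2.0, 2.0])
```

Under these assumptions, a user who raises all groups by the same factor is compensated by the inverse of that factor, so the overall loudness stays at the reference level.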
- a fourth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to discard a group for determining the reference loudness when the group is discarded in metadata of the audio signal, or wherein the loudness controller is configured to discard a group when determining the reference loudness, when the group is switched off in response to the user input, or wherein the loudness controller is configured to exclude a group from the computation of the reference loudness, when the group is switched off in the metadata and is switched on by the user input, or vice versa.
- a fifth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to calculate the loudness compensation gain by relating the reference loudness to the loudness of a preset, wherein the preset comprises one or more groups, and wherein a group comprises one or more objects.
- a sixth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to perform a limitation operation on the loudness compensation gain so that the loudness compensation gain is lower than an upper threshold or so that the loudness compensation gain is greater than a lower threshold.
- a seventh embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness manipulator is configured to apply a gain to the signal determined by the loudness compensation gain and by an original normalization gain determined by a target level set by the audio processor and a metadata level indicated in the metadata of the audio signal.
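The limitation of the sixth embodiment and the combined gain of the seventh embodiment might look like the sketch below. The thresholds, the treatment of levels as dB values, and the choice to combine both gains as a product of linear factors are assumptions made for illustration; the section only states that both gains determine the applied gain.

```python
def limit_gain(c, c_min, c_max):
    """Clamp the loudness compensation gain C to the range [c_min, c_max]."""
    return max(c_min, min(c, c_max))

def total_gain(c, target_level_db, metadata_level_db, c_min=0.5, c_max=2.0):
    """Gain applied by the loudness manipulator (hypothetical combination rule)."""
    # Original normalization gain from the decoder target level and the metadata level.
    g_norm = 10.0 ** ((target_level_db - metadata_level_db) / 20.0)
    # ASSUMPTION: the limited compensation gain and the normalization gain multiply.
    return limit_gain(c, c_min, c_max) * g_norm

# A compensation gain of 4.0 is limited to the upper threshold 2.0.
g = total_gain(4.0, -24.0, -24.0)
```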
- An eighth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal comprises a compensation metadata information indicating which group is to be used for the determination of the loudness compensation gain or which group is not to be used for determining the loudness compensation gain, and wherein the loudness controller is configured to only use a group for determining the loudness compensation gain indicated to be used by the compensation metadata information or to not use a group for determining the loudness compensation gain indicated not to be used by the compensation metadata information.
- a ninth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal is indicated to have an anchor element, wherein the loudness controller is configured to only use information for an audio object or a group of audio objects of the anchor element for determining the loudness compensation gain.
- a tenth embodiment depending on one of the first to eighth embodiments refers to an apparatus, wherein the audio signal is indicated to have an anchor element, wherein the loudness controller is configured to only use the information for an audio object or a group of audio objects of the anchor element for determining the loudness compensation gain, when the one or more audio objects of the anchor element are amplified by the user input and to use information from one or more audio objects of the anchor element and information of one or more audio objects not included in the anchor element, when the one or more audio objects of the anchor element are attenuated by the user input.
- An eleventh embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to calculate a group loudness missing in the audio signal using a loudness of a preset comprising at least two groups and gain and loudness information not missing for the preset.
- a twelfth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to perform a blind loudness compensation using one or more gain values for one or more groups and one or more modified gain values for one or more groups.
- a thirteenth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to check, whether the audio signal comprises a reference loudness information, and if the audio signal does not comprise the reference loudness information, to perform a blind loudness compensation using one or more gain values for one or more groups and one or more modified gain values for one or more groups, or to check, whether a modified loudness information cannot be calculated and to perform a blind loudness compensation, when the modified loudness information cannot be calculated, wherein the blind loudness compensation comprises using one or more gain values for one or more groups and one or more modified gain values for one or more groups.
- a fourteenth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal comprises different reference loudness information values for different playback configurations, wherein the apparatus further comprises a format converter for converting a signal to a predefined playback configuration, and wherein the loudness controller is configured to select the specific loudness value for the specific playback configuration used by the format converter.
- a fifteenth embodiment refers to an audio encoder for generating an audio signal comprising metadata, comprising: a loudness determiner for determining a loudness for a group having one or more audio objects; and a metadata writer for introducing the loudness for the group as a reference loudness information into the metadata.
- a sixteenth embodiment depending on the fifteenth embodiment refers to an audio encoder, wherein the loudness determiner is configured to determine different loudness values for different playback configurations, and wherein the metadata writer is configured to introduce the different loudness values in association with the different playback configurations into the metadata.
- a seventeenth embodiment depending on the fifteenth or sixteenth embodiment refers to an audio encoder, further comprising a controller for determining, which group is to be used for a loudness compensation or not, and wherein the metadata writer is configured for writing an indication into the metadata indicating, which group is to be used or which group is not to be used for the loudness compensation.
- an eighteenth embodiment depending on one of the fifteenth to seventeenth embodiments refers to an audio encoder, wherein the loudness determiner is configured to compute a group loudness value for a group, where the group loudness value for the group is missing in the metadata, and wherein the metadata writer is configured for introducing the missing loudness value into the metadata so that all groups of the audio signal have associated reference loudness information.
- a nineteenth embodiment refers to a method for processing an audio signal, comprising: modifying the audio signal in response to a user input; determining a loudness compensation gain based on a reference loudness or a reference gain and a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input; and manipulating a loudness of a signal using the loudness compensation gain.
- a twentieth embodiment refers to a method for generating an audio signal comprising metadata, comprising: determining a loudness for a group having one or more audio objects; and introducing the loudness for the group as a reference loudness information into the metadata.
- a twenty-first embodiment refers to a computer program for performing, when running on a computer or a processor, the method according to the nineteenth embodiment or the method according to the twentieth embodiment.
- Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus.
- Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
- the inventive transmitted or encoded signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.
- embodiments of the invention can be implemented in hardware or in software.
- the implementation can be performed using a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
- Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
- embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer.
- the program code may, for example, be stored on a machine readable carrier.
- Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
- an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
- a further embodiment of the inventive method is, therefore, a data carrier (or a non-transitory storage medium such as a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein.
- the data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
- a further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein.
- the data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
- a further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or adapted to, perform one of the methods described herein.
- a further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
- a further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver.
- the receiver may, for example, be a computer, a mobile device, a memory device or the like.
- the apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
- a programmable logic device for example, a field programmable gate array
- a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein.
- the methods are advantageously performed by any hardware apparatus.
- an audio processor ( 1 ) for processing an audio signal ( 100 ) includes an audio signal modifier ( 2 ).
- the audio signal modifier ( 2 ) is configured to modify the audio signal ( 100 ) in response to a user input.
- the audio processor further includes a loudness controller.
- the loudness controller ( 6 ) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ).
- the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal ( 100 ) indicating which group is to be used or is not to be used for determining the loudness compensation gain (C).
- the group comprises one or more audio elements.
- the audio processor further includes a loudness manipulator ( 5 ).
- the loudness manipulator ( 5 ) is configured to manipulate a loudness of a signal using the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on at least one flag comprised by the data of the metadata, and the flag is indicating whether or how a group is to be considered for determining the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to use only groups for determining the loudness compensation gain (C) when the groups belong to an anchor comprised by the metadata of the audio signal ( 100 ). In another alternative, the loudness controller ( 6 ) is configured to use only the groups belonging to the anchor for determining the loudness compensation gain (C) when the modified gain (h i ) of at least one group belonging to the anchor is greater than the corresponding reference gain (g i ), and/or the loudness controller ( 6 ) is configured to use groups belonging to the anchor and groups missing from the anchor for determining the loudness compensation gain (C) when the modified gain (h i ) of at least one group belonging to the anchor is lower than the corresponding reference gain (g i ), and the modified gain (h i ) depends on the user input.
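The anchor-dependent selection described above can be illustrated with a short sketch. The dictionary layout and the function name are hypothetical. The text does not say what happens when the anchor is left unchanged or when anchor groups are both amplified and attenuated; the sketch assumes that amplification takes precedence and that all groups are used otherwise.

```python
def groups_for_compensation(ref_gains, mod_gains, anchor):
    """Select the groups that enter the loudness compensation (illustrative only).

    ref_gains / mod_gains map a group id to g_i and h_i; anchor lists the group
    ids that belong to the anchor signalled in the metadata.
    """
    # Anchor amplified: h_i > g_i for at least one anchor group.
    anchor_amplified = any(mod_gains[i] > ref_gains[i] for i in anchor)
    if anchor_amplified:
        return set(anchor)          # only the anchor groups are considered
    # ASSUMPTION: attenuated or unchanged anchor, so anchor and non-anchor groups are used.
    return set(ref_gains)

ref = {"dialog": 1.0, "music": 1.0}
louder = groups_for_compensation(ref, {"dialog": 2.0, "music": 1.0}, ["dialog"])
softer = groups_for_compensation(ref, {"dialog": 0.5, "music": 1.0}, ["dialog"])
```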
- an audio processor ( 1 ) for processing an audio signal ( 100 ) includes an audio signal modifier ( 2 ).
- the audio signal modifier ( 2 ) is configured to modify the audio signal ( 100 ) in response to a user input.
- the audio processor further includes a loudness controller ( 6 ).
- the loudness controller ( 6 ) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ).
- the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal ( 100 ) referring to at least one preset.
- the preset refers to a set of at least one group comprising one or more audio elements.
- the audio processor further includes a loudness manipulator ( 5 ).
- the loudness manipulator ( 5 ) is configured to manipulate a loudness of a signal using the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on group loudnesses (L i ) and/or gain values (g i ) of the at least one group of the set referred to by the preset.
- the loudness controller ( 6 ) is configured to determine the reference loudness (L ref ) for the set referred to by the preset using the respective group loudnesses (L i ) and the respective gain values (g i ).
- the loudness controller ( 6 ) is configured to determine the modified loudness (L mod ) for the set referred to by the preset using the respective group loudnesses (L i ) and the respective modified gain values (h i ).
- the modified gain values (h i ) are modified by the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a selected preset, and the preset is selected by the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a default preset and the default preset is set prior to or independently of a user input.
- an audio processor ( 1 ) for processing an audio signal ( 100 ) includes an audio signal modifier ( 2 ), the audio signal modifier ( 2 ) is configured to modify the audio signal ( 100 ) in response to a user input.
- the audio processor further includes a loudness controller ( 6 ), the loudness controller ( 6 ) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ), and the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal ( 100 ) indicating whether a group is switched off or switched on.
- the group comprises one or more audio elements.
- the audio processor further includes a loudness manipulator ( 5 ), the loudness manipulator ( 5 ) is configured to manipulate a loudness of a signal using the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to discard a group for determining the modified loudness (L mod ) when the group is switched off in response to the user input.
- the loudness controller ( 6 ) is configured to discard a group for determining the reference loudness (L ref ) when the group is switched off in the metadata and to include the group for determining the modified loudness (L mod ) when the group is switched on by the user input. In another alternative, the loudness controller ( 6 ) is configured to include a group for determining the reference loudness (L ref ) when the group is switched on in the metadata and to exclude the group for determining the modified loudness (L mod ) when the group is switched off by the user input.
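The on/off handling above amounts to using different index sets for the reference and the modified loudness. The following sketch assumes linear gains, dB-like group loudnesses and power-wise summation, none of which is stated in this section; all names are hypothetical.

```python
import math

def loudness(group_loudness, gains, indexes):
    """Loudness over the selected groups, ASSUMING power-wise summation."""
    power = sum(gains[i] ** 2 * 10.0 ** (group_loudness[i] / 10.0) for i in indexes)
    return 10.0 * math.log10(power)   # requires at least one active group

def on_off_compensation(group_loudness, gains, on_in_metadata, on_after_user):
    """Compensation gain when the user switches groups on or off."""
    # Reference set: groups switched on in the metadata.
    k_ref = [i for i, on in enumerate(on_in_metadata) if on]
    # Modified set: groups switched on after the user input.
    k_mod = [i for i, on in enumerate(on_after_user) if on]
    l_ref = loudness(group_loudness, gains, k_ref)
    l_mod = loudness(group_loudness, gains, k_mod)
    return 10.0 ** ((l_ref - l_mod) / 20.0)

# The user switches on a second, equally loud group: the power doubles,
# so the compensation gain is 1 / sqrt(2).
c = on_off_compensation([-20.0, -20.0], [1.0, 1.0], [True, False], [True, True])
```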
- an audio processor ( 1 ) for processing an audio signal ( 100 ) includes an audio signal modifier ( 2 ), the audio signal modifier ( 2 ) is configured to modify the audio signal ( 100 ) in response to a user input.
- the audio processor further includes a loudness controller ( 6 ), the loudness controller ( 6 ) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ), the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal ( 100 ) with at least one group loudness missing in the metadata of a group comprised by the audio signal ( 100 ).
- the audio processor further includes a loudness manipulator ( 5 ), the loudness manipulator ( 5 ) is configured to manipulate a loudness of a signal ( 101 ) using the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to calculate the missing group loudness (L A ) using a loudness of a preset (L p ), the reference gain (g i ) of the group with missing group loudness as well as the group loudnesses (L i ) and the reference gains (g i ) for the groups having a group loudness (L i ).
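One way to recover a single missing group loudness L A from the preset loudness L p is sketched below. It assumes that the preset loudness is the power sum of all groups weighted by their squared linear reference gains; this model and the function name are assumptions, not the formula of the specification.

```python
import math

def estimate_missing_loudness(preset_loudness, missing_gain, known_loudness, known_gains):
    """Estimate L_A from L_p and the known groups (power-sum model, an assumption)."""
    total_power = 10.0 ** (preset_loudness / 10.0)
    known_power = sum(g * g * 10.0 ** (l / 10.0)
                      for l, g in zip(known_loudness, known_gains))
    residual = total_power - known_power      # power attributed to the missing group
    if residual <= 0.0 or missing_gain <= 0.0:
        return None                           # no meaningful estimate is possible
    return 10.0 * math.log10(residual / (missing_gain * missing_gain))

# A preset built from two groups at -20 with unit gains; one loudness is missing.
l_p = 10.0 * math.log10(2.0 * 10.0 ** -2.0)
l_a = estimate_missing_loudness(l_p, 1.0, [-20.0], [1.0])
```

Returning `None` when the residual power is not positive lets a caller fall back to another strategy, for example the blind compensation.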
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) in the case that the metadata of the audio signal ( 100 ) is missing at least one group loudness for a blind loudness compensation using only at least one reference gain (g i ) and at least one modified gain (h i ).
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) in the case that the metadata of the audio signal ( 100 ) is void of group loudnesses for a blind loudness compensation using only at least one reference gain (g i ) and at least one modified gain (h i ).
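A blind compensation that uses only the gains might be sketched as follows. Treating every group as equally loud, so that the loudness terms cancel, is an assumption of this sketch, as are the function names and the use of `None` to mark a missing group loudness.

```python
import math

def blind_compensation_gain(ref_gains, mod_gains):
    """Gain-only compensation, ASSUMING all groups are equally loud."""
    ref_power = sum(g * g for g in ref_gains)
    mod_power = sum(h * h for h in mod_gains)
    return math.sqrt(ref_power / mod_power)

def compensation_with_fallback(group_loudness, ref_gains, mod_gains):
    """Use the group loudnesses when all are present, else fall back to blind mode."""
    if any(l is None for l in group_loudness):
        return blind_compensation_gain(ref_gains, mod_gains)
    # Power-sum model (also an assumption) when every group loudness is known.
    ref = sum(g * g * 10.0 ** (l / 10.0) for l, g in zip(group_loudness, ref_gains))
    mod = sum(h * h * 10.0 ** (l / 10.0) for l, h in zip(group_loudness, mod_gains))
    return math.sqrt(ref / mod)

blind = compensation_with_fallback([-20.0, None], [1.0, 1.0], [2.0, 1.0])
full = compensation_with_fallback([-20.0, -40.0], [1.0, 1.0], [2.0, 1.0])
```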
- an audio processor ( 1 ) for processing an audio signal ( 100 ) includes an audio signal modifier ( 2 ), the audio signal modifier ( 2 ) is configured to modify the audio signal ( 100 ) in response to a user input.
- the audio processor further includes a loudness controller ( 6 ), the loudness controller ( 6 ) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ).
- the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal ( 100 ) referring to a playback configuration for a reproduction of the signal ( 100 ).
- the audio processor further includes a loudness manipulator ( 5 ), the loudness manipulator ( 5 ) is configured to manipulate a loudness of a signal ( 101 ) using the loudness compensation gain (C).
- the loudness controller ( 6 ) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a playback configuration and comprising associated group loudnesses (L i ) and/or reference gain values (g i ).
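Selecting the data set that belongs to the active playback configuration can be illustrated with a simple lookup. The metadata layout, the key names and the configuration labels below are hypothetical and do not reflect any bitstream syntax.

```python
def select_config_metadata(metadata, playback_config):
    """Return (group loudnesses, reference gains) for one playback configuration."""
    entry = metadata["configs"].get(playback_config)
    if entry is None:
        return None   # no data set was transmitted for this configuration
    return entry["group_loudness"], entry["ref_gains"]

# Hypothetical metadata carrying separate values for two playback configurations.
metadata = {
    "configs": {
        "stereo": {"group_loudness": [-23.0, -27.0], "ref_gains": [1.0, 0.8]},
        "5.1": {"group_loudness": [-24.0, -26.0], "ref_gains": [1.0, 1.0]},
    }
}
stereo = select_config_metadata(metadata, "stereo")
```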
- the audio signal ( 100 ) comprises a bitstream with the metadata, and wherein the metadata comprises the reference gain (g i ) for at least one group.
- the metadata of the audio signal ( 100 ) comprises a group loudness (L i ) for at least one group.
- the loudness controller ( 6 ) is configured to determine the reference loudness (L ref ) for at least one group using the group loudness (L i ) and the gain value (g i ) for the group, the loudness controller ( 6 ) is configured to determine the modified loudness (L mod ) for the group using the group loudness (L i ) and the modified gain value (h i ), and the modified gain value (h i ) is modified by the user input.
- the loudness controller ( 6 ) is configured to determine the reference loudness (L ref ) for a plurality of groups using the respective group loudnesses (L i ) and gain values (g i ) for the groups.
- the loudness controller ( 6 ) is configured to determine the modified loudness (L mod ) for a plurality of groups using the respective group loudnesses (L i ) and modified gain values (h i ) for the groups.
- the loudness controller ( 6 ) is configured to perform a limitation operation on the loudness compensation gain (C) so that the loudness compensation gain (C) is lower than an upper threshold (C max ) and/or so that the loudness compensation gain (C) is greater than a lower threshold (C min ).
- an audio encoder ( 20 ) for generating an audio signal ( 100 ) includes a loudness determiner ( 21 ) for determining a loudness value for at least one group having one or more audio elements ( 50 ).
- the audio encoder further includes a metadata writer ( 22 ) for introducing the determined loudness value as a group loudness (L i ) into the metadata.
- the loudness determiner ( 21 ) is configured to determine different loudness values and/or different gain values for different playback configurations, and wherein the metadata writer ( 22 ) is configured to introduce the determined different loudness values and/or different gain values in association with the respective playback configuration into the metadata.
- the loudness determiner ( 21 ) is configured to determine different loudness values and/or different gain values for different presets referring to sets of at least one group comprising one or more audio elements.
- the metadata writer ( 22 ) is configured to introduce the determined different loudness values and/or different gain values in association with the respective preset into the metadata.
- the audio encoder further includes a controller ( 23 ), the controller ( 23 ) is configured to determine which group is to be used for determining a loudness compensation gain (C) or is to be neglected, and wherein the metadata writer ( 22 ) is configured for writing an indication into the metadata indicating which group is to be used or is to be neglected for determining the loudness compensation gain (C).
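The interplay of controller, metadata writer and decoder-side group selection can be sketched as below. The list-of-dictionaries layout and the field name `use_for_compensation` are hypothetical stand-ins for the indication written into the metadata.

```python
def write_compensation_flags(groups, used_groups):
    """Encoder side: attach a per-group flag for the loudness compensation.

    groups is a list of (name, group loudness) pairs; used_groups is the set of
    group names the controller selected for determining the compensation gain.
    """
    return [
        {"group": name, "loudness": loudness, "use_for_compensation": name in used_groups}
        for name, loudness in groups
    ]

def groups_used(metadata):
    """Decoder side: keep only the groups flagged for the compensation."""
    return [entry["group"] for entry in metadata if entry["use_for_compensation"]]

meta = write_compensation_flags([("dialog", -23.0), ("effects", -30.0)], {"dialog"})
selected = groups_used(meta)
```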
- the audio encoder further includes an estimator ( 24 ), the estimator ( 24 ) is configured to compute a group loudness value for a group, the group loudness value for the group is undetermined by the loudness determiner ( 21 ), and the metadata writer ( 22 ) is configured for introducing the computed group loudness value into the metadata so that all groups of the audio signal ( 100 ) have associated group loudnesses.
- a method for processing an audio signal ( 100 ) includes modifying the audio signal ( 100 ) in response to a user input. The method further includes determining a loudness compensation gain (C) based on the one hand on a reference loudness (L ref ) or a reference gain (g i ) and on the other hand on a modified loudness (L mod ) or a modified gain (h i ), where the modified loudness (L mod ) or the modified gain (h i ) depends on the user input.
- C loudness compensation gain
- the loudness compensation gain (C) is determined based on metadata of the audio signal ( 100 ) indicating whether a group comprised by the audio signal ( 100 ) is to be used or is not to be used for determining the loudness compensation gain (C), wherein the group comprises one or more audio elements. And/or the loudness compensation gain (C) is determined based on metadata of the audio signal ( 100 ) referring to a preset, wherein the preset refers to a set of at least one group comprising one or more audio elements. And/or the loudness compensation gain (C) is determined based on metadata of the audio signal ( 100 ) indicating whether a group is switched off or switched on, wherein the group comprises one or more audio elements.
- a method for generating an audio signal ( 100 ) comprising metadata includes determining a loudness value for a group having one or more audio elements; and introducing the determined loudness value for the group as a group loudness (L i ) into the metadata.
Description
- This application is a continuation of co-pending U.S. patent application Ser. No. 15/842,682 filed Dec. 14, 2017 and International Application No. PCT/EP2016/063205, filed Jun. 9, 2016, which is incorporated herein by reference in its entirety, and additionally claims priority from European Application No. EP 15172593.4, filed Jun. 17, 2015, which is incorporated herein by reference in its entirety.
- The invention refers to an audio processor and to an audio encoder. The invention also refers to corresponding methods.
- Modern audio coding systems do not only provide means to efficiently transmit audio content in a loudspeaker channel-based representation that is simply played back at the decoder side. They additionally include more advanced features to allow users to interact with the content and, thus, to influence how the audio is reproduced and rendered at the decoder. This allows for new types of user experiences compared to legacy audio coding systems.
- An example of an advanced audio coding system is the MPEG-H 3D Audio standard (J. Herre et al., “MPEG-H Audio—The New Standard for Universal Spatial/3D Audio Coding”, 137th AES Convention, 2014, Los Angeles). It allows the transmission of immersive audio content in three different formats: channel-based, object-based, and scene-based using higher order ambisonics (HOA). It has been designed to offer new capabilities such as user interaction for personalization and adaptation of the audio for different use scenarios.
- The three different categories for content formats can be described as follows:
- Channel-based: Traditionally, spatial audio content (starting from simple two channel stereo) has been delivered as a set of channel signals which are designated to be reproduced by loudspeakers in a precisely defined, fixed target location relative to the listener.
- Object-based: Audio objects are signals that are to be reproduced so as to originate from a specific target location that is specified by associated side information provided as metadata along with the audio. In contrast to channel signals, the actual placement of audio objects can vary over time and is not necessarily pre-defined during the sound production process; instead, it is determined by rendering the object to the target loudspeaker setup at the time of reproduction. This may also include user interactivity on the location or the level of an object or groups of objects.
- Scene-based: Higher Order Ambisonics (HOA) is an alternative approach that captures a 3D sound field by transmitting a number of ‘coefficient signals’ that have no direct relationship to channels or objects. The actual audio signals for reproduction are generated at the decoder, taking into account the given loudspeaker configuration.
- A method for loudness compensation in object-based audio coding systems including user interaction has been presented in EP 2 879 131 A1. A decoder receives an audio input signal comprising audio object signals and generates an audio output signal. A signal processor determines a loudness compensation value for the audio output signal based on loudness information associated with the audio input signal and based on rendering information. The rendering information indicates whether one or more of the audio object signals shall be amplified or attenuated and can be adjusted according to the user's wishes.
- According to an embodiment, an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating which group is to be used or is not to be used for determining the loudness compensation gain, and wherein the group includes one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- According to another embodiment, an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to at least one preset, wherein the preset refers to a set of at least one group including one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- According to another embodiment, an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group includes one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- According to another embodiment, an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal with at least one group loudness missing in the metadata of a group included in the audio signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- According to another embodiment, an audio processor for processing an audio signal may have: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to a playback configuration for a reproduction of the signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- According to another embodiment, an audio encoder for generating an audio signal including metadata may have: a loudness determiner for determining a loudness value for at least one group having one or more audio elements; and a metadata writer for introducing the determined loudness value as a group loudness into the metadata.
- According to another embodiment, a method for processing an audio signal may have the steps of: modifying the audio signal in response to a user input; determining a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input, wherein the loudness compensation gain is determined based on metadata of the audio signal indicating whether a group included in the audio signal is to be used or is not to be used for determining the loudness compensation gain, wherein the group includes one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal referring to a preset, wherein the preset refers to a set of at least one group including one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group includes one or more audio elements, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal with at least one group loudness missing in the metadata of a group included in the audio signal, and/or wherein the loudness compensation gain is determined based on metadata of the audio signal referring to a playback configuration for a reproduction of the signal; and manipulating a loudness of a signal using the loudness compensation gain.
- According to another embodiment, a method for generating an audio signal including metadata may have the steps of: determining a loudness value for a group having one or more audio elements; and introducing the determined loudness value for the group as a group loudness into the metadata.
- According to another embodiment, a non-transitory digital storage medium may have a computer program stored thereon to perform any of the inventive methods, when said computer program is run by a computer.
- An advantage is achieved by an audio processor for processing an audio signal, comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating which group is to be used or is not to be used for determining the loudness compensation gain, and wherein the group comprises one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- The audio processor—or decoder or apparatus for processing an audio signal—receives an audio signal and generates in one embodiment an output signal which comprises the audio objects and audio elements etc. of the audio signal to be reproduced, for example, by loudspeakers or earphones or to be stored at a medium and so on.
- The audio processor reacts to a user input via an audio signal modifier that is configured to modify the audio signal in response to the user input. The user input refers in one embodiment to an amplification or an attenuation of a group and/or to switching off a group or to switching on a group. The groups comprise one or more audio elements, e.g. audio objects, channels or HOA components. The user input also refers, depending on the embodiment, to data concerning the playback configuration used for the reproduction of the signal. A further user input refers to a selection of a preset. A preset refers to a set of at least one group and specifies, depending on the embodiment, specifically measured group loudness values and/or gain values for the respective groups. The user input is used by the audio signal modifier for appropriately modifying the audio signal. In one embodiment, the metadata comprises data belonging to a plurality of presets.
- The preset refers in one embodiment to a set of groups and defines in a different embodiment the groups that do not belong to the preset.
- The audio processor also comprises a loudness controller that is configured to determine a loudness compensation gain. The loudness compensation gain, here called C, makes it possible to counterbalance the effect of the user input in order to provide a signal with an overall loudness as may be useful or as set by the user. The loudness compensation gain is determined based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain. The modified loudness or the modified gain depends on the user input.
- The loudness controller is additionally configured to determine the loudness compensation gain based on metadata of the audio signal. The metadata that is associated with the audio signal carries information about the audio signal and the individual groups and is in one embodiment comprised by the audio signal itself.
- In the embodiment of the audio processor discussed here, the data of the metadata indicates whether a group, especially a group comprised by the audio signal, is to be used (e.g. is to be considered) or is not to be used (e.g. is to be neglected) for determining the loudness compensation gain. Hence, the information about the corresponding groups is either considered or neglected for determining the loudness compensation gain. In at least one embodiment, whether a group or groups is/are considered or neglected depends additionally on the user input.
- In one embodiment, considering or neglecting groups includes also considering or neglecting them partially in the sense, that the groups and their respective values are only used for a part of the determination of the loudness compensation gain, e.g. only for the calculation of the reference or the modified loudness.
- The loudness compensation gain is used by a loudness manipulator comprised by the audio processor. The loudness manipulator manipulates a loudness of a signal using the loudness compensation gain. The applied loudness compensation gain is not only affected by the user input but is also the result of the data of the metadata associated with or even belonging to the audio signal.
- The signal manipulated by the loudness manipulator is according to an embodiment an output signal provided by the audio processor and based on the audio signal. The loudness manipulator in this embodiment provides the output signals and manipulates the loudness of the output signal using the loudness compensation gain.
- In a different embodiment, the loudness manipulator manipulates a loudness of a signal provided to the loudness manipulator and advantageously already modified according to the user input. In this embodiment, a part of the audio processor provides or generates a signal that is fed to the loudness manipulator and is accordingly processed, i.e. modified with regard to its loudness by the loudness manipulator.
- In a further embodiment, the signal whose loudness is manipulated by the loudness manipulator is the audio signal. In this case, the loudness manipulator modifies the metadata of the audio signal by the modification. This embodiment is associated with a further embodiment, in which the audio processor provides a modified audio signal. The modified audio signal is modified according to the user input and according to the modification of the loudness. This modified audio signal is afterwards also a bitstream.
- According to an embodiment of the audio processor, the loudness controller is configured to determine the loudness compensation gain based on at least one flag comprised by the data of the metadata, wherein the flag indicates whether or how a group is to be considered for determining the loudness compensation gain. In this embodiment, the metadata comprises flags having, for example, either a “true” or “false” value indicating whether an associated group has to be considered for calculating the loudness compensation gain or not, respectively. The consideration of a group refers in one embodiment also to the question of which step of the calculation the group is to be used for. This refers e.g. to the calculation of the reference loudness and the modified loudness. The reference loudness and the modified loudness are the calculated overall loudnesses before and after the consideration of the user input, respectively. The flag indicates in a different embodiment that the corresponding group is present just during a short interval and, thus, can be neglected for determining the loudness compensation gain.
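- By way of illustration only, the flag-based selection described above may be sketched as follows. The metadata layout, the field name `use_for_compensation`, the power summation of group levels, the relation C = Lref - Lmod and the zero fallback are assumptions of this sketch and are not taken from the description; all values are in dB.

```python
import math

# Hypothetical metadata layout: one dict per group. The field name
# "use_for_compensation" is an assumption made for this sketch.
def power_sum_db(levels_db):
    """Combine levels given in dB by summing their powers (assumed rule)."""
    return 10.0 * math.log10(sum(10.0 ** (level / 10.0) for level in levels_db))

def compensation_gain_with_flags(groups):
    """Return C = Lref - Lmod in dB, using only groups whose flag is true."""
    used = [g for g in groups if g["use_for_compensation"]]
    if not used:
        return 0.0  # nothing to compensate for (assumed fallback)
    l_ref = power_sum_db([g["loudness"] + g["ref_gain"] for g in used])
    l_mod = power_sum_db([g["loudness"] + g["mod_gain"] for g in used])
    return l_ref - l_mod

groups = [
    {"loudness": -23.0, "ref_gain": 0.0, "mod_gain": 0.0, "use_for_compensation": True},
    # Short sound effect: amplified by the user but flagged to be neglected.
    {"loudness": -30.0, "ref_gain": 0.0, "mod_gain": 9.0, "use_for_compensation": False},
]
print(compensation_gain_with_flags(groups))  # 0.0
```

- In this sketch the amplified group does not influence C because its flag is false; setting the flag to true would yield a negative compensation gain.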
- According to an embodiment of the audio processor, the loudness controller is configured to use only groups for determining the loudness compensation gain when the groups belong to an anchor comprised by the metadata of the audio signal. The anchor refers in one embodiment, for example, to audio elements belonging to voices, dialogs or special sound effects.
- The handling of groups belonging to an anchor is further elaborated in the following embodiments.
- In one embodiment, the loudness controller is configured to use only the groups belonging to the anchor for determining the loudness compensation gain when the modified gain of at least one group belonging to the anchor is greater than the corresponding reference gain. Thus, just the groups of the anchor are used for the calculation of the loudness compensation gain when the gain value of at least one group of these “anchor groups” is increased due to the user input, i.e. when the user amplified at least one of these groups.
- In an alternative or supplemental embodiment, the loudness controller is configured to use groups belonging to the anchor and groups missing from the anchor for determining the loudness compensation gain when the modified gain of at least one group belonging to the anchor is lower than the corresponding reference gain. Thus, in this embodiment, not only groups belonging to the anchor but also groups that do not belong to the anchor are used for the calculation, when the gain value of at least one anchor group is lowered due to the user input.
- In one embodiment, the two foregoing embodiments are combined. Thus, the change of the gain of at least one group belonging to the anchor determines whether only anchor groups or anchor groups and non-anchor groups are used for determining the loudness compensation gain.
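- The combined anchor rule may, purely as a sketch, be expressed as a selection function. The field names `in_anchor`, `ref_gain` and `mod_gain` are hypothetical. The description does not state what happens when no anchor gain is changed, or when one anchor group is amplified while another is attenuated; the sketch assumes that all groups are used in the first case and that amplification takes precedence in the second.

```python
def groups_for_compensation(groups):
    """Pick the groups that enter the loudness compensation calculation."""
    anchor = [g for g in groups if g["in_anchor"]]
    amplified = any(g["mod_gain"] > g["ref_gain"] for g in anchor)
    if amplified:
        # At least one anchor group was amplified: use anchor groups only.
        return anchor
    # Anchor gain lowered: anchor and non-anchor groups are both used.
    # Assumption of this sketch: the same applies when the anchor is untouched.
    return list(groups)

dialog = {"name": "dialog", "in_anchor": True, "ref_gain": 0.0, "mod_gain": 3.0}
music = {"name": "music", "in_anchor": False, "ref_gain": 0.0, "mod_gain": 0.0}
print([g["name"] for g in groups_for_compensation([dialog, music])])  # ['dialog']
```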
- An advantage is also achieved by an audio processor for processing an audio signal, comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to at least one preset, wherein the preset refers to a set of at least one group comprising one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- For the general description of the audio processor see the discussion above.
- The loudness controller of the audio processor refers to data of the metadata associated with or belonging to the audio signal. The data refers to a preset, wherein the preset refers to a set of at least one group comprising one or more audio elements. This embodiment handles the case that combinations of groups are associated with specific loudness and/or gain values for a specific preset. Hence, the metadata comprises data for the groups depending on different presets or at least on a default preset. Therefore, the loudness controller uses the data which is associated with a preset chosen by the user or with a default preset.
- The audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before.
- According to an embodiment of the audio processor, the loudness controller is configured to determine the loudness compensation gain based on group loudnesses and/or gain values of the at least one group of the set referred to by the preset. The preset refers to a specific set of groups of audio elements comprised by the audio signal. For these groups, the metadata contains specific data—i.e. group loudnesses and/or gain values—to be used for the determination of the loudness compensation gain when the corresponding preset is chosen or set as a default preset.
- In a further embodiment, the loudness controller is configured to determine the reference loudness for the set referred to by the preset using the respective group loudnesses and the respective gain values. The loudness controller is also configured to determine the modified loudness for the set referred to by the preset using the respective group loudnesses and the respective modified gain values. The modified gain values are modified by the user input. In this embodiment, the reference loudness and the modified loudness are determined based on the values associated with a preset and for the groups belonging to the preset. The determination also takes into account the indication of whether and how (e.g. for the determination of the reference or the modified loudness) the groups are to be used.
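- A minimal sketch of a preset-based determination is given below, assuming a hypothetical metadata dictionary that stores, per preset and per group, a group loudness and a reference gain in dB. The power summation, the relation C = Lref - Lmod and the convention that the user supplies absolute modified gains are assumptions of the sketch.

```python
import math

# Hypothetical metadata: per preset and group, (group loudness, reference gain) in dB.
METADATA = {
    "default_preset": "main",
    "presets": {
        "main": {"dialog": (-24.0, 0.0), "music": (-27.0, 0.0)},
        "dialog_plus": {"dialog": (-24.0, 6.0), "music": (-27.0, -3.0)},
    },
}

def preset_compensation_gain(metadata, user_gains, preset=None):
    """C in dB for the chosen preset; falls back to the default preset."""
    groups = metadata["presets"][preset or metadata["default_preset"]]
    ref_power = sum(10.0 ** ((loud + gain) / 10.0) for loud, gain in groups.values())
    # Modified gain h_i: the user's value if given, else the reference gain.
    mod_power = sum(
        10.0 ** ((loud + user_gains.get(name, gain)) / 10.0)
        for name, (loud, gain) in groups.items()
    )
    return 10.0 * math.log10(ref_power) - 10.0 * math.log10(mod_power)

# User raises both groups of the default preset by 6 dB.
print(round(preset_compensation_gain(METADATA, {"dialog": 6.0, "music": 6.0}), 6))  # -6.0
```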
- In a further embodiment, the loudness controller is configured to determine the loudness compensation gain based on data comprised by the metadata of the audio signal referring to a selected preset and wherein the preset is selected by the user input. In this embodiment, the preset is chosen by the user via the user input.
- According to an embodiment of the audio processor, the loudness controller is configured to determine the loudness compensation gain based on data comprised by the metadata of the audio signal referring to a default preset. The default preset is set prior to or independently of a user input. This embodiment handles the situation that a user does not choose a preset. For this, a default preset is used, e.g. prior to any user input, ensuring that even without an interaction by the user a set of data (here the data covering a default preset) is used for determining the loudness compensation gain.
- An advantage is also achieved by an audio processor for processing an audio signal, comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on, wherein the group comprises one or more audio elements; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- For the general description of the audio processor of this embodiment see the discussion above.
- The loudness controller here is configured to determine the loudness compensation gain based on metadata of the audio signal indicating whether a group is switched off or switched on. In an example, the audio signal may comprise as audio objects different soundtracks belonging to different language versions of a movie. The presets also may refer to different language versions. Hence, in the different presets one soundtrack of one language will be switched on while the other versions will be switched off. This example also shows that the user may switch between the different language versions by switching on a desired and offered language version and, thus, switching off the soundtrack associated with a default preset. Nevertheless, switching on one group does not always imply switching off another group and vice versa.
- The audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before. This holds also the other way around as one audio processor discussed above is in at least one embodiment realized taking the following embodiments into account.
- According to an embodiment, the loudness controller determines the loudness compensation gain based on the user input, depending on whether a group is switched off or switched on by the user input. Here, the user interaction affects the determination of the loudness compensation gain.
- According to an embodiment of the audio processor, the loudness controller is configured to discard a group for determining the modified loudness when the group is switched off in response to the user input. If the user switches off a group, in this embodiment, the group is not used for determining the modified loudness which results from the loudness values representing the user's wishes.
- In a further embodiment, the loudness controller is configured to discard a group for determining the reference loudness when the group is switched off in the metadata and to include the group for determining the modified loudness when the group is switched on by the user input. In this embodiment, a group is switched off in the metadata and is not used for determining the reference loudness. If the user switches the group on, it is included for the evaluation of the modified loudness.
- According to an embodiment of the audio processor, the loudness controller is configured to include a group for determining the reference loudness when the group is switched on in the metadata and to exclude the group for determining the modified loudness when the group is switched off by the user input. This embodiment handles the reverse case of the foregoing embodiment.
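- The handling of switched-on and switched-off groups may be sketched as follows. The field names `on_in_metadata` and `on_by_user`, the power summation, the relation C = Lref - Lmod and the zero fallback for the case that no group is switched on are assumptions of this sketch.

```python
import math

def level_db(groups, key):
    """Power-sum (assumed rule) of the groups for which groups[i][key] is true."""
    power = sum(10.0 ** ((g["loudness"] + g["gain"]) / 10.0) for g in groups if g[key])
    return 10.0 * math.log10(power) if power > 0.0 else float("-inf")

def on_off_compensation_gain(groups):
    """C in dB: reference uses the metadata state, modified uses the user state."""
    l_ref = level_db(groups, "on_in_metadata")
    l_mod = level_db(groups, "on_by_user")
    if math.isinf(l_ref) or math.isinf(l_mod):
        return 0.0  # assumed fallback when nothing is switched on
    return l_ref - l_mod

# Two language versions: the user switches from the default to the other one.
languages = [
    {"loudness": -23.0, "gain": 0.0, "on_in_metadata": True, "on_by_user": False},
    {"loudness": -25.0, "gain": 0.0, "on_in_metadata": False, "on_by_user": True},
]
print(round(on_off_compensation_gain(languages), 6))  # 2.0
```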
- An advantage is also achieved by an audio processor for processing an audio signal, comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal with at least one group loudness missing in the metadata of a group comprised by the audio signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- For the general description of the audio processor of this embodiment see the discussion above.
- In this audio processor (or decoder), the loudness controller handles the situation that, for a group present within the audio signal, the corresponding group loudness is missing. The group loudness may either be missing for a specific preset or playback configuration and so on, or the metadata may be completely void of any group loudness for this group.
- The audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, the embodiments discussed above are at least partially also realized with the audio processor mentioned before. This holds also the other way around as the audio processor discussed above is in at least one embodiment realized taking the following embodiments into account.
- According to an embodiment of the audio processor, the loudness controller is configured to calculate the missing group loudness using a loudness of a preset, the reference gain of the group with missing group loudness as well as the group loudnesses and the reference gains for the groups having a group loudness. The loudness of the preset is the overall loudness of the groups of the preset.
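- One conceivable way of recovering a missing group loudness is sketched below. It assumes that the loudness of the preset equals the power sum of the gain-adjusted group loudnesses, so that the missing term can be obtained by subtraction; this combination rule and the function name are assumptions of the sketch. All values are in dB.

```python
import math

def missing_group_loudness(preset_loudness, missing_ref_gain, known_groups):
    """Estimate the loudness of the one group whose value is absent.

    known_groups: list of (group loudness, reference gain) pairs in dB.
    """
    known_power = sum(10.0 ** ((loud + gain) / 10.0) for loud, gain in known_groups)
    residual = 10.0 ** (preset_loudness / 10.0) - known_power
    if residual <= 0.0:
        raise ValueError("preset loudness is not above the known groups' contribution")
    # Remove the reference gain to get back to the group loudness itself.
    return 10.0 * math.log10(residual) - missing_ref_gain

# Preset built from a -23 dB group (gain 0) and a -26 dB group (gain -3).
preset = 10.0 * math.log10(10.0 ** -2.3 + 10.0 ** -2.9)
print(round(missing_group_loudness(preset, -3.0, [(-23.0, 0.0)]), 6))  # -26.0
```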
- In a further embodiment, the loudness controller is configured, in the case that the metadata of the audio signal is missing at least one group loudness, to determine the loudness compensation gain by a blind loudness compensation using only at least one reference gain and at least one modified gain. In this embodiment, the case of at least one missing group loudness is handled identically to the case that all group loudnesses are missing.
- According to an embodiment of the audio processor, the loudness controller is configured, in the case that the metadata of the audio signal is void of group loudnesses, to determine the loudness compensation gain by a blind loudness compensation using only at least one reference gain and at least one modified gain.
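- A blind loudness compensation may be sketched as follows, under the assumption (not stated in the description) that all groups are treated as equally loud, so that only the reference gains and the modified gains enter the calculation. Gains are finite values in dB.

```python
import math

def blind_compensation_gain(ref_gains_db, mod_gains_db):
    """C in dB from gains alone, treating every group as equally loud (assumption)."""
    ref_power = sum(10.0 ** (g / 10.0) for g in ref_gains_db)
    mod_power = sum(10.0 ** (h / 10.0) for h in mod_gains_db)
    return 10.0 * math.log10(ref_power / mod_power)

# Both groups raised by 6 dB: the compensation lowers the output by 6 dB.
print(round(blind_compensation_gain([0.0, 0.0], [6.0, 6.0]), 6))  # -6.0
```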
- An advantage is also achieved by an audio processor for processing an audio signal, comprising: an audio signal modifier, wherein the audio signal modifier is configured to modify the audio signal in response to a user input; a loudness controller, wherein the loudness controller is configured to determine a loudness compensation gain based on the one hand on a reference loudness or a reference gain and on the other hand on a modified loudness or a modified gain, wherein the modified loudness or the modified gain depends on the user input, wherein the loudness controller is configured to determine the loudness compensation gain based on metadata of the audio signal referring to a playback configuration for a reproduction of the signal; and a loudness manipulator, wherein the loudness manipulator is configured to manipulate a loudness of a signal using the loudness compensation gain.
- For the general description of the audio processor of this embodiment see the discussion above.
- The audio processor determines the loudness compensation gain based on data referring to a specific playback configuration. The metadata associated with and in one embodiment being comprised by the audio signal, therefore, contains data specified for at least one playback configuration. In one embodiment, for each playback configuration, the metadata contain data corresponding to the respective playback—or reproduction—configuration.
- The audio processor is in one embodiment configured according to at least one of the foregoing embodiments. Hence, this audio processor is in one embodiment combined with at least one of the foregoing embodiments.
- According to an embodiment of the audio processor, the loudness controller is configured to determine the loudness compensation gain based on the data of the metadata referring to a playback configuration and comprising associated group loudnesses and/or reference gain values. Hence, the different playback configurations are associated with different gain values and/or group loudnesses for the respective groups.
- In one embodiment, the metadata comprises data for different presets and different playback configurations.
- In a further embodiment, the audio processor comprises a configuration converter for converting data comprised by the metadata and referring to the playback configuration to data referring to a current playback configuration, wherein the loudness controller is configured to determine the loudness compensation gain using data provided by the configuration converter. In this embodiment, the audio processor handles the situation that the current playback configuration for reproduction of the signal differs from the playback configurations provided by the metadata. Hence, the data of the metadata are converted in order to fit the current playback configuration, and the converted data are used for the determination of the loudness compensation gain.
- In an embodiment, the audio processor comprises a format converter for converting a signal to a predefined playback configuration. In a further embodiment, the loudness controller is configured to select the specific loudness value for the specific playback configuration used by the format converter.
- The following embodiments can be realized with any of the foregoing embodiments.
- In an embodiment, the audio signal comprises a bitstream with the metadata and the metadata comprises the reference gain for at least one group.
- According to an embodiment of the audio processor, the metadata of the audio signal comprises a group loudness for at least one group. In a further embodiment, the metadata comprises group loudnesses for a plurality of groups belonging to the audio signal.
- In a further embodiment, the loudness controller is configured to determine the reference loudness for at least one group using the group loudness and the gain value for the—at least one—group, wherein the loudness controller is configured to determine the modified loudness for the—at least one—group using the group loudness and the modified gain value, and wherein the modified gain value is modified by the user input.
- In an embodiment, the loudness controller is configured to determine the reference loudness—named Lref—for a plurality of groups using the respective group loudnesses—named Li—and gain values—named gi—for the groups. Further, the loudness controller is configured to determine the modified loudness—named Lmod—for a plurality of groups using the respective group loudness Li and modified gain values—named hi—for the groups. In one embodiment, the two pluralities of groups are identical and in a different embodiment different. The pluralities also depend on the respective data of the metadata.
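- As a non-authoritative sketch, Lref, Lmod and C may be computed from the group loudnesses Li, the gain values gi and the modified gain values hi as shown below. The sketch assumes values in dB, a power summation over the groups and C = Lref - Lmod; none of these is specified in the passage above.

```python
import math

def overall_loudness(group_loudness_db, gains_db):
    """Power-sum the gain-adjusted group loudnesses (assumed combination rule)."""
    total_power = sum(
        10.0 ** ((loudness + gain) / 10.0)
        for loudness, gain in zip(group_loudness_db, gains_db)
    )
    return 10.0 * math.log10(total_power)

def loudness_compensation_gain(group_loudness_db, ref_gains_db, mod_gains_db):
    """C = Lref - Lmod in dB (assumed definition)."""
    l_ref = overall_loudness(group_loudness_db, ref_gains_db)  # uses g_i
    l_mod = overall_loudness(group_loudness_db, mod_gains_db)  # uses h_i
    return l_ref - l_mod

# Two groups at -23 dB each; the user raises both by 6 dB.
print(round(loudness_compensation_gain([-23.0, -23.0], [0.0, 0.0], [6.0, 6.0]), 6))  # -6.0
```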
- In a further embodiment, the loudness controller is configured to perform a limitation operation on the loudness compensation gain so that the loudness compensation gain is lower than an upper threshold and/or so that the loudness compensation gain is greater than a lower threshold.
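- The limitation operation may be sketched as a simple clamp. The threshold values used in the example are hypothetical, and either threshold may be omitted in line with the "and/or" wording above.

```python
from typing import Optional

def limit_compensation_gain(gain_db: float,
                            lower_db: Optional[float] = None,
                            upper_db: Optional[float] = None) -> float:
    """Clamp the loudness compensation gain to the given thresholds (in dB)."""
    if upper_db is not None:
        gain_db = min(gain_db, upper_db)
    if lower_db is not None:
        gain_db = max(gain_db, lower_db)
    return gain_db

print(limit_compensation_gain(15.0, lower_db=-12.0, upper_db=12.0))  # 12.0
```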
- According to an embodiment of the audio processor, the loudness manipulator is configured to apply a corrected gain to a signal determined by the loudness compensation gain and by a normalization gain determined by a target loudness level set by user input and a metadata loudness level comprised by the metadata of the audio signal. In one embodiment, the normalization gain is determined by using the ratio of the loudness level of the respective groups of the audio signal and the loudness level set by the user to be experienced by the user for the reproduction of the audio signal.
- The foregoing embodiments of audio processors allow a loudness compensation following a user input. The loudness compensation is improved by considering data describing groups of the audio signal and their relevance or kind of usage for the loudness compensation. The information about the groups refines the loudness compensation.
- The foregoing embodiments refer to an audio processor or to an audio decoder. In the following, an encoder will be discussed providing the audio signal with associated or even comprised metadata to be used by an audio processor.
- An advantage is achieved by an audio encoder for generating an audio signal comprising metadata. The audio encoder comprising: a loudness determiner for determining a loudness value for at least one group having one or more audio elements; and a metadata writer for introducing the determined loudness value as a group loudness into the metadata.
- According to an embodiment of the audio encoder, the loudness determiner is configured to determine different loudness values and/or different gain values for different playback configurations, wherein the metadata writer is configured to introduce the determined different loudness values and/or different gain values in association with the respective playback configuration into the metadata. In this embodiment, the metadata contains different data for the concerned groups for different playback configurations, thus, improving the playback of the groups of the audio signal.
- In an embodiment, the loudness determiner is configured to determine different loudness values and/or different gain values for different presets referring to sets of at least one group comprising one or more audio elements. Further, the metadata writer is configured to introduce the determined different loudness values and/or different gain values in association with the respective preset into the metadata. In this embodiment, the presets refer to specific sets of groups that are associated with specific group loudnesses and/or reference gain values.
- In a further embodiment, the audio encoder further comprises a controller, wherein the controller is configured to determine which group is to be used for determining a loudness compensation gain or is to be neglected, and wherein the metadata writer is configured for writing an indication into the metadata indicating which group is to be used or is to be neglected for determining the loudness compensation gain. The indication is in one embodiment a flag. In some embodiments, the indication refers to presets, playback configurations, anchors and/or durations and, hence, relevance of a group.
- In at least one embodiment, the metadata contains for at least one group of the audio signal different data (e.g. group loudness or reference gain) with different values.
- According to an embodiment of the audio encoder, the audio encoder further comprises an estimator, wherein the estimator is configured to compute a group loudness value for a group, where the group loudness value for the group is undetermined by the loudness determiner. The metadata writer is configured for introducing the computed group loudness value into the metadata so that all groups of the audio signal have associated group loudnesses. In this embodiment, the audio encoder compensates a missing group loudness by computing it based on available data.
- An advantage is also achieved by a method for processing an audio signal.
- The method comprises at least the following steps:
-
- Modifying the audio signal in response to a user input.
- Determining a loudness compensation gain based on the one hand on a reference loudness (as an overall loudness of associated individual groups before a modification by a user) or a reference gain and on the other hand on a modified loudness (as the counterpart of the reference loudness being the combined loudness of the relevant groups after the user input) or a modified gain, where the modified loudness or the modified gain depends on the user input.
- The determination of the loudness compensation gain—named C—is performed using at least one or a combination of the following embodiments in which the loudness compensation gain is determined based on data of the metadata associated with—or even comprised by—the audio signal. In the different embodiments, the data are as follows wherein the respective groups comprise one or more audio elements:
- The data are indicating whether a group comprised by the audio signal is to be considered or to be neglected for determining the loudness compensation gain.
- The data are referring to a preset, wherein the preset refers to a set of at least one group.
- The data are indicating whether a group is switched off or switched on.
- The data lack the group loudness of at least one group comprised by the audio signal.
- The data are referring to a playback configuration for a reproduction of the signal.
- Manipulating a loudness of an output signal associated with the audio signal using the loudness compensation gain.
- An advantage is also achieved by a method for generating an audio signal comprising metadata. The method comprises determining a loudness value for a group having one or more audio elements and introducing the determined loudness value for the group as a group loudness into the metadata.
- An advantage is also achieved by a computer program for performing, when running on a computer or a processor, one of the preceding methods.
- The embodiments of the apparatus (whether audio processor or audio encoder) can also be performed by steps of the method and corresponding embodiments of the method. Therefore, the explanations given for the embodiments of the apparatus also hold for the method.
- Embodiments of the present invention will be detailed subsequently referring to the appended drawings, in which:
- FIG. 1 shows an overview of an audio decoder,
- FIG. 2 shows an overview of an audio processor according to the invention and
- FIG. 3 shows an overview of an inventive audio encoder.
- FIG. 1 shows an overview of an MPEG-H 3D Audio decoder as an example for an audio processor, illustrating all major building blocks of the system:
- As a first step, the received audio stream 500 (including the transmitted audio signals, be they channels, objects or HOA components, together with associated metadata) is decoded by the decoder 501 providing audio content 502 and associated metadata 503.
- Channel signals are mapped to the target reproduction loudspeaker setup using a format converter 504 which serves as a channel renderer and format converter.
- Object signals are rendered to the target reproduction loudspeaker setup by the object renderer 505 using the associated object metadata.
- Higher Order Ambisonics content is rendered by a HOA renderer 506 to the target reproduction loudspeaker setup using the associated HOA metadata.
- The loudspeaker signals corresponding to the different components (channels, objects, HOA) in the form of audio signals 507 as outputs of the format converter 504, the object renderer 505, and the HOA renderer 506 are then mixed together in the mixing stage. This is done by a mixer 508 providing a mixed audio signal 509.
- The output 509 of the mixer 508 is then processed by the loudness control stage, where the audio is normalized to a desired target loudness level. The loudness controller 510 performs a normalization as well as the loudness compensation. For this purpose, the loudness controller 510 receives user input 511. The user input 511 as a result of a user interaction refers also to information about the loudspeaker configuration to be used for the playback and is also submitted to the format converter 504, the object renderer 505, and to the HOA renderer 506. Metadata 503, especially referring to rendering and/or loudness information extracted by the decoder 501 from the received audio stream 500, is fed to the loudness controller 510. The resulting signal 512 is in the shown embodiment submitted to the loudspeakers of the loudspeaker configuration available for the playback.
- The possible user interactivity can be divided into e.g. two different categories:
-
- Selection of presets of the transmitted audio program.
- Manipulation of the default rendering of groups of audio elements.
- The meaning of presets and groups in the context of MPEG-H 3D Audio and of this invention is presented in the following.
- The individual channels, objects and HOA scenes available for a transmitted audio program are referred to as audio elements. A group refers to a specific collection of individual audio elements. The specific grouping information of the audio elements is included in the MPEG-H 3D Audio metadata that is transmitted together with the audio content in the audio stream. The elements of a group cannot be interactively changed on their own. Only the entire group can be manipulated, i.e. all included elements together. An example is given by a group that consists of the channels corresponding to a stereo or 5.1 channel loudspeaker configuration. In an extreme case, a group can consist of only a single element, e.g. the dialog object of a program. The user is then able to change e.g. the level of this dialog object within the audio scene.
- Presets define a combination of groups in an audio scene. Presets can be used to efficiently signal different presentations of the same audio program within the same audio stream. The preset definition also includes default or initial rendering information of the individual groups, which is used in case the user does not apply any modification. The most important example of this rendering information is the gain that is applied to a group when rendering the entire audio scene. The configuration information that defines a preset is determined at the encoder and is part of the metadata, e.g. MPEG-H 3D Audio metadata.
- It should be noted that the main or default audio scene can be considered as a special type of preset that includes all audio elements without necessarily specifying grouping information. Nevertheless, default or initial rendering information (e.g. gain) for the individual audio elements is typically provided in the metadata also for the main audio scene.
- One of the most important features for a next generation audio delivery is advanced loudness control, i.e. proper signaling of loudness information and loudness normalization. Loudness control is especially important in broadcast applications, where it represents an essential feature to fulfill applicable broadcast regulations and recommendations.
- The loudness control concept included in MPEG-H 3D Audio is based on metadata representing the measured loudness of the audio program. The metadata is transmitted in the audio stream as an embodiment of the audio signal to be processed by the audio processor together with the actual audio content. At the decoder according to one embodiment, a loudness normalization gain is computed based on the transmitted loudness information and the target loudness level. The loudness normalization gain in one embodiment is then applied to the audio signal after the mixer 508, as illustrated, for example, in FIG. 1.
- In order to take into account the specific feature of offering multiple presets of the same audio program with the same audio stream, additional loudness metadata is included, corresponding to the measured loudness of the different presets. Processing steps such as format conversion (downmixing) or dynamic range processing can potentially change the loudness of the audio. Thus, in one embodiment, additional loudness information is included to assure correct loudness normalization also in these cases.
- In another embodiment, loudness information of individual groups or even single audio elements is transmitted. The information of group loudness is provided in one embodiment with respect to different loudspeaker configurations. For example, if a group consists of the channel signals, different group loudness information can be included for the case of a reproduction to a stereo or 5.1 loudspeaker configuration. The loudness information of groups will be used for the loudness control in interactive scenarios as proposed in this invention.
- The loudness information mentioned above refers to a large variety of configurations for a program (e.g. different presets or different loudspeaker reproduction layouts). Since these configurations are static, one embodiment envisages to measure their loudness at the encoder (or before the encoding process) and populate the corresponding metadata fields in the, for example, MPEG-H 3DA stream.
- However, as already mentioned above, an important feature of modern audio coding systems such as MPEG-H 3DA is the support of user interactivity at the decoder: The user can, e.g. adjust the volume of specific groups or even switch them on and off. An important use case is given by dialog enhancement, where the user can manipulate the level of the dialog object, or the group associated with the dialog. In another example, the user increases the level of an immersive sound bed, represented by an HOA-based group. In another example, the user wants to switch on specific groups, e.g. representing video description for the hearing impaired or voice-over tracks.
- Changing the level of groups also implies that the overall loudness of the rendered audio scene is changed compared to the unmodified case. Thus, consistent playback loudness can no longer be assured after gain interactivity. Since the user may change the levels of different objects repeatedly, the loudness level of the audio output can vary over time even for the same program.
- It is highly desirable to provide loudness control not only for static presentations of the audio program, but also to take into account user interactivity that changes the loudness of an audio scene. The invention improves loudness control at the decoder in order to enable consistent loudness normalization also in case of user interaction on the levels of groups of audio elements.
- The loudness of a program or a preset is preserved when the user changes the level of certain audio elements or groups within the rendered audio scene. A loudness compensation gain is determined in one embodiment based on a reference loudness corresponding to the original audio scene and a modified loudness taking into account gain interactivity of the user. The loudness compensation gain is then applied to the rendered audio signal together with the regular loudness normalization gain to achieve the desired decoder target loudness.
- FIG. 2 shows schematically an example of an audio processor 1 (also called decoder or just apparatus for processing an audio signal) receiving an audio signal 100 and providing an output signal 101. The output signal 101 in the shown example is an audio signal suitable to be fed to an amplifier (not shown) connected to loudspeakers of the playback situation or to be fed directly to loudspeakers or a headphone. The audio signal 100 comprises a bitstream with the audio signals of individual audio objects and metadata providing information about the audio elements and how to handle them.
- The audio signal 100 is submitted to an audio signal modifier 2 which receives user input 200. The user input 200 refers, in the shown example, at least to the selection of a certain preset. Presets refer to specific combinations of groups of audio elements with associated reference gains gi and/or group loudnesses Li for the corresponding groups of audio elements. If the user does not choose a preset, a default preset with default values will be used in the shown embodiment.
- Further, the user sets via the input 200 the gain values of individual groups. The modified gain values hi imply that the corresponding group will be amplified or attenuated relative to the reference gain values gi comprised by the metadata. For example, the user might prefer to listen to an amplified background choir and not, as usual, to the leading voice. Hence, the user will raise the gain value of the background choir and decrease the gain value of the lead voice or will switch off this voice.
- The user also has the possibility to switch a group off or on. Hence, if the user does not want to hear a group, the group can be switched off. Conversely, if the metadata comprises a flag implying that a group is switched off for a specific preset, the user can switch it on. This, for example, can be the case when the audio signal comprises different language versions of a spoken text and the presets refer to the different languages. Hence, switching a group on or off refers to whether the group is used in the playback or not.
- To sum up, the signal modifier 2 modifies the audio signal 100 according to the user input 200 by amplifying or attenuating the groups of audio elements belonging to the audio signal 100 and according to the selected or to a default preset covered by the respective data of the metadata.
- This is followed by a configuration converter 3 which converts data to the current playback configuration by which the audio signal 100 is going to be reproduced. The current playback configuration is also indicated by the user input 200, e.g. via a selection from a list. For example, the metadata may refer to a surround sound situation whereas the current playback situation allows a stereo playback. This conversion refers in one embodiment to the gain values as well as to the loudness values.
- The configuration converter 3 submits the converted data to the loudness controller 6 which also receives the user input 200. Based on these data, the loudness controller 6 calculates the loudness compensation gain C which is submitted to the loudness manipulator 5.
- The loudness manipulator 5 sets the overall loudness of the output signal 101 by using the loudness compensation gain C and the signal received from the mixer 4. The mixer 4 receives in the shown embodiment via the configuration converter 3 the audio signal 100 after the modification by the audio signal modifier 2 and the conversion by the configuration converter 3 and combines the different groups of audio elements (compare FIG. 1).
- For the explanation, in an illustrative example the case is considered where a specific audio scene is defined by a preset, i.e. a specific combination of groups. Each of the groups has an associated initial/default gain defined for the given preset. Additionally, the loudness of each group within the preset is assumed to be available. The preset may be either chosen by the user or set as a default preset. The following notation will be used:
-
- Li is the loudness of the i-th group of the preset.
- gi is the initial/default gain of the i-th group (given, for example, in dB scale).
- hi is the modified interactivity gain of the i-th group (given e.g. in dB scale)
- Mref denotes the set of indexes referring to groups that are included for the computation of the reference loudness of a preset (or the default audio scene).
- Mmod denotes the set of indexes referring to groups that are included for the computation of the modified loudness of a preset (or the modified audio scene).
- In case that a group consists of the collection of channel signals corresponding to a specific loudspeaker configuration or, for example, to an HOA audio scene, multiple group loudness values can be included in the metadata. These different loudness values are associated with different loudspeaker configurations used for playback. For example, if a group represents a channel bed with a 5.1 or 22.2 loudspeaker configuration, a different loudness may be measured for reproducing the group for the original 5.1 or 22.2 loudspeaker configuration compared to the case where the channel bed has to be mapped to a stereo reproduction system using the format converter. In this case, the group loudness associated with stereo reproduction is chosen in one embodiment if available in the transmitted metadata. Otherwise, the group loudness associated with the original loudspeaker configuration is used. An analogous strategy for selecting the appropriate group loudness is proposed in case that a group represents an HOA-based audio scene. In this case the group loudness associated with the present playback loudspeaker configuration should be used (if available in the metadata) instead of the group loudness associated with a reference loudspeaker layout.
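- The selection rule described above can be sketched as a lookup with a fallback. This is only an illustration: the function name, the dictionary layout and the layout labels ("stereo", "5.1") are assumptions made for the example and are not taken from the MPEG-H 3D Audio metadata syntax.

```python
def select_group_loudness(loudness_by_layout, playback_layout, reference_layout):
    """Pick the group loudness for the current playback layout.

    loudness_by_layout: hypothetical dict mapping a layout label to the
    group loudness measured for that layout.
    Falls back to the value of the original (reference) layout when no
    value for the playback layout was transmitted.
    """
    if playback_layout in loudness_by_layout:
        return loudness_by_layout[playback_layout]
    return loudness_by_layout[reference_layout]


# Channel bed authored for 5.1, with an extra value measured for stereo.
bed = {"5.1": -24.0, "stereo": -23.0}
print(select_group_loudness(bed, "stereo", "5.1"))

# Only the original layout is available, so its value is used.
print(select_group_loudness({"5.1": -24.0}, "stereo", "5.1"))
```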
- In some embodiments, the loudness information is not provided for each group separately, but the same loudness value is referred to by an ensemble of groups.
- In general, it is reasonable to assume that the audio signals in the different groups are uncorrelated. The reference loudness of the preset can then be computed as
-
- Analogously, the loudness of the modified audio scene is computed as
-
- In case that a group is switched off in the default setting of the preset, the group is discarded when computing the reference loudness Lref. Analogously, if a user switches off a group, that group is discarded when computing the modified loudness Lmod. If a group is switched off in the default preset, but switched on by the user in the modified scene, the corresponding group loudness Li is excluded from the computation of the reference loudness Lref but included in the computation of the modified loudness Lmod and vice versa. Note that discarding a group that is switched off can equivalently be interpreted as setting its gain (gi or hi) to −∞. In this case Mref=Mmod. Hence, both loudness values Lref and Lmod are calculated referring to the same sets of groups.
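- One way to read the computation of Lref and Lmod is sketched below. It is a sketch under stated assumptions, not the definitive formula of the embodiment: the gains gi and hi are taken to be in dB, the group loudnesses Li are taken to be on a logarithmic (dB-like) scale, and uncorrelated groups are combined by adding their powers. The function name and the example values are hypothetical.

```python
import math


def combined_loudness(loudness, gains_db, indexes):
    """Power-sum of gain-weighted group loudnesses over an index set.

    Assumption: gains are in dB, so group i contributes the power
    10 ** ((L_i + gain_i) / 10); uncorrelated groups add in power.
    """
    total = sum(10 ** ((loudness[i] + gains_db[i]) / 10) for i in indexes)
    return 10 * math.log10(total)


# Hypothetical preset: 0 = channel bed, 1 = dialog, 2 = commentary.
L = {0: -27.0, 1: -24.0, 2: -30.0}   # group loudness L_i
g = {0: 0.0, 1: 0.0, 2: 0.0}         # default gains g_i
h = {0: 0.0, 1: 6.0, 2: 0.0}         # modified gains h_i (dialog +6 dB)

M_ref = [0, 1]       # group 2 is switched off in the default preset
M_mod = [0, 1, 2]    # the user switches group 2 on

L_ref = combined_loudness(L, g, M_ref)
L_mod = combined_loudness(L, h, M_mod)
print(L_ref, L_mod)
```

- Under the same assumption, a gain of −∞ contributes a power of zero, so keeping a switched-off group in the index set with gain −∞ gives the same result as removing it from the set.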
- The loudness compensation gain C is obtained by relating the reference loudness Lref of the preset to the modified loudness Lmod of the preset:
-
- The loudness compensation gain C is limited in one embodiment within a range of allowed gains to avoid any undesired behavior for extreme cases:
-
- The loudness normalization gain GN used for loudness normalization according to the state of the art (see e.g. EP 2 879 131 A1) is then corrected according to
- $G_{\mathrm{corrected}} = G_N + C_{\mathrm{lim}}$
- assuring consistent loudness after gain interactivity by the user. Alternatively, the loudness normalization is done based on the original normalization gain GN and the loudness compensation is performed separately on the audio signals using the limited version of the compensation gain Clim.
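- The chain from Lref and Lmod to the corrected gain can be sketched as follows. The sketch assumes that all quantities are in dB and that C is the difference Lref − Lmod, which matches the statement that boosting a group leads to an attenuation of the overall scene. The threshold values of ±12 dB and the function names are hypothetical and only serve the example.

```python
def limited_compensation_gain(l_ref, l_mod, c_min=-12.0, c_max=12.0):
    """Compensation gain C = L_ref - L_mod, limited to [c_min, c_max].

    Assumption: values in dB; the default thresholds are illustrative only.
    """
    c = l_ref - l_mod
    return max(c_min, min(c_max, c))


def corrected_gain(g_n, c_lim):
    """G_corrected = G_N + C_lim, all values in dB."""
    return g_n + c_lim


# The user made the scene 3 dB louder, so the compensation attenuates by 3 dB.
c_lim = limited_compensation_gain(l_ref=-23.0, l_mod=-20.0)
print(c_lim, corrected_gain(g_n=-1.0, c_lim=c_lim))

# Extreme case: almost everything attenuated, the gain is capped at c_max.
print(limited_compensation_gain(l_ref=-23.0, l_mod=-50.0))
```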
- The above discussion has been based on a preset of the audio program. It should be mentioned that there are not always presets available for a program, but only a single global default scene is defined. This case is handled analogously to the preset case described above, where the set of indexes Mref and Mmod refer to the groups of the default scene and its modified version, respectively.
- There are situations where it is appropriate to intentionally exclude certain groups from the loudness compensation process. For example, a certain group can be active only during a very short period of time within the program and be completely silent for the remaining time. Due to the gating process during the loudness measurement, e.g. according to ITU-R BS.1770-3 (a recommendation of the ITU Radiocommunication Sector, one of the three sectors of the International Telecommunication Union), such a group can still have a significant measured loudness. This group loudness will then influence the loudness compensation gain during the entire program duration, although the group is active only during a very short amount of time. On the other hand, such a sparse group signal contributes only little to the loudness measurement of the entire program/preset mix.
- For example, if a user chooses to boost such a sparse group/object, the loudness compensation will lead to an attenuation of all remaining audio elements during the entire program duration. Such a behavior is undesired and the loudness compensation process should ignore that particular sparse group. Hence, the metadata contains a corresponding flag for this group to be neglected for the calculation of the loudness compensation.
- In order to provide the functionality described above, information is added to the metadata included in the audio stream or audio signal that indicates whether a group should be excluded from the loudness compensation, i.e. from computing the reference and modified loudness of a preset or the global audio scene. This information is in one embodiment a simple flag for each group indicating whether it is included in the loudness compensation process or not.
- Different broadcast regulations on loudness control use different approaches to define program loudness. While EBU-R128 involves measuring the loudness of the full program mix, ATSC A/85 recommends measuring only the loudness of the anchor element of a program, which is typically represented by the dialog.
- Such different approaches to measuring loudness for a program are also taken into account for the loudness compensation. The anchor based loudness compensation can be immediately concluded from the loudness compensation of the full mix as discussed above.
- For the anchor-based reference and the modified loudness of a preset (or the default mix of a program) only those groups are included which contribute to the program anchor. The information which group is part of the program anchor is, in an embodiment, included in the metadata of the audio stream/audio signal. The reference loudness is obtained by
-
- where Aref denotes the set of indexes referring to groups that are part of the anchor element of the default audio scene or preset.
- Analogously, the modified loudness for anchor-based loudness compensation using the set of group indexes Amod (referring to groups that are part of the anchor element of the modified audio scene or preset) reads
-
- It immediately follows that the compensation gain is obtained as
-
- The remaining steps to perform loudness compensation are not changed compared to the full program mix case (see the discussion above).
- In some cases, a mixture of both loudness compensation approaches (anchor-based and based on the full program mix) is beneficial for the user experience of the loudness compensation.
- In an embodiment, the anchor-based approach is used for the case that one or all of the anchor groups are amplified by the user, i.e. hi>gi. On the other hand, if the anchor groups are attenuated, the loudness compensation with respect to the loudness of the full mix is used, i.e. for the case that hi<gi. The information about the anchor groups is comprised by the metadata.
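- The mixed strategy can be sketched as a simple switch between the two index sets. The sketch assumes gains in dB, power-domain summation of uncorrelated groups, a compensation gain of the form Lref − Lmod, and that no group is switched on or off (so the same index set serves the reference and the modified scene). All names and values are hypothetical.

```python
import math


def power_sum_db(loudness, gains_db, indexes):
    """Combined loudness of the given groups (gains assumed in dB)."""
    return 10 * math.log10(
        sum(10 ** ((loudness[i] + gains_db[i]) / 10) for i in indexes)
    )


def hybrid_compensation_gain(loudness, g, h, all_groups, anchor_groups):
    """Anchor-based compensation if any anchor group is amplified (h_i > g_i),
    otherwise compensation with respect to the full mix."""
    anchor_boosted = any(h[i] > g[i] for i in anchor_groups)
    indexes = anchor_groups if anchor_boosted else all_groups
    return power_sum_db(loudness, g, indexes) - power_sum_db(loudness, h, indexes)


L = {0: -27.0, 1: -24.0}   # 0 = bed, 1 = dialog (the anchor)
g = {0: 0.0, 1: 0.0}

# Dialog boosted by 6 dB: the anchor alone decides, C is about -6 dB.
print(hybrid_compensation_gain(L, g, {0: 0.0, 1: 6.0}, [0, 1], [1]))

# Dialog attenuated by 6 dB: the full mix decides, C is smaller than +6 dB.
print(hybrid_compensation_gain(L, g, {0: 0.0, 1: -6.0}, [0, 1], [1]))
```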
- The loudness compensation approach presented in the foregoing involves using the information on the loudness of each group within a preset or the global audio scene. In some scenarios, the loudness information may be available only for some groups and missing for others. Hence in one embodiment, missing group loudness information is calculated from the loudness of the preset (or the default audio scene) and the group loudness values that are available.
- Let Lp denote the measured loudness of the considered preset of the audio program, i.e. the measured joint loudness of the audio objects belonging to the respective preset. Furthermore, let B denote the set of indexes to groups for which the loudness information is available. A residual loudness Lres of the preset is computed from the preset loudness, the available group loudness information, and the default/initial gains of these groups:
-
- An alternative representation of the residual loudness can be obtained by considering the group loudness values that are not available and the corresponding default/initial gains:
-
- In practice it is reasonable to assume that the loudness of each group for which the loudness information is missing is equal:
- $L_i = L_A, \quad \text{for } i \notin B$
- In this case, the residual loudness can be expressed as
-
- From this, an estimate for the missing groups loudness values is immediately obtained as
-
- The reference loudness and modified loudness that may be used for the loudness compensation can then be computed as already discussed, where any missing group loudness Li is replaced by a corresponding estimate LA.
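- The estimation steps can be sketched as follows, assuming gains in dB and power-domain summation of uncorrelated groups: the power of the known groups is subtracted from the preset power to obtain the residual loudness Lres, and the common estimate LA follows by removing the summed gain powers of the groups without loudness information. Function and variable names are hypothetical.

```python
import math


def estimate_missing_loudness(l_preset, known_loudness, gains_db):
    """Estimate the common loudness L_A of all groups without loudness data.

    l_preset:       measured preset loudness L_P
    known_loudness: dict index -> L_i for the groups in B
    gains_db:       dict index -> default gain g_i (dB) for all groups
    """
    missing = [i for i in gains_db if i not in known_loudness]
    if not missing:
        raise ValueError("no group loudness is missing")

    known_power = sum(
        10 ** ((known_loudness[i] + gains_db[i]) / 10) for i in known_loudness
    )
    residual_power = 10 ** (l_preset / 10) - known_power
    if residual_power <= 0:
        raise ValueError("known groups already account for the preset loudness")

    l_res = 10 * math.log10(residual_power)
    gain_power = sum(10 ** (gains_db[i] / 10) for i in missing)
    return l_res - 10 * math.log10(gain_power)


# Three groups, loudness transmitted only for group 0.
gains = {0: 0.0, 1: 2.0, 2: -1.0}
print(estimate_missing_loudness(-20.0, {0: -27.0}, gains))
```

- If the groups with missing information really do share one loudness value, this estimate recovers it; otherwise LA is only an approximation that distributes the residual loudness evenly.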
- The estimation of missing group loudness information is done either at the encoder side or the decoder side of the audio coding system.
- If the estimation is done at the encoder, the information on the group loudness within the transmitted metadata in the audio stream can be either measured, or a corresponding estimate as described above can be included instead. Then, the loudness compensation stage at the decoder has all loudness information that may be used and can do the processing as in the case where all group loudness values have been measured in advance by the encoder.
- If the estimation is done at the decoder, the missing group loudness values in the metadata of the audio stream are estimated as described above, and then, the loudness compensation is based on the estimated group loudness values.
- A special use case is given if no information on the loudness of any group is provided in the metadata of the audio stream. In this case, the loudness compensation has to work only based on the relevant rendering information available, i.e. the default or initial gain of a group gi and its modified version hi after user interaction. This is referred to as blind loudness compensation, as no loudness information for the groups is known at the decoder. In another embodiment, the blind loudness compensation is performed even if just one group loudness is missing in the metadata.
- For the compensation, the assumption is used that the loudness values of all groups within a preset are the same. In an embodiment of blind loudness compensation, the assumption is introduced that Li=LA for all groups included in Mref and Mmod, respectively. By this, a rule for computing the loudness compensation gain is obtained according to
-
- Note that the gain factor for blind loudness compensation uses only information on the group gains and no loudness-related information.
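- A sketch of the blind rule, assuming gains in dB and power-domain summation: with Li = LA for all groups, the common loudness LA cancels in the difference Lref − Lmod, so only the gains remain. The function name and the example gains are hypothetical.

```python
import math


def blind_compensation_gain(g, h, m_ref, m_mod):
    """Blind compensation gain from group gains only (gains assumed in dB).

    Derived from L_ref - L_mod with L_i = L_A for every group, where the
    common term L_A drops out of the ratio.
    """
    ref_power = sum(10 ** (g[i] / 10) for i in m_ref)
    mod_power = sum(10 ** (h[i] / 10) for i in m_mod)
    return 10 * math.log10(ref_power / mod_power)


g = {0: 0.0, 1: 0.0}
h = {0: 0.0, 1: 6.0}   # second group boosted by 6 dB
print(blind_compensation_gain(g, h, [0, 1], [0, 1]))
```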
- In a further embodiment, the blind loudness compensation is performed in case that at least one group loudness is missing. Hence, even one missing group loudness causes the blind loudness compensation.
- In this section the foregoing will be summarized:
- In one embodiment, a general set of indexes is specified referring to groups that should be included for the computation of the reference loudness of a preset or the default audio scene. This set is derived from information in the metadata of the audio stream whether a group should be included for performing loudness compensation for the default audio scene or a preset. This information is usually introduced in the metadata of the audio stream at the encoder.
- At the encoder, the loudness compensation process is controlled by appropriately defining these bitstream elements. For example, if a certain group should be excluded, the corresponding bitstream element is set to “false”. Anchor-based loudness compensation is realized in one embodiment by including only groups that are part of the anchor element of the default audio scene or of a defined preset, and setting the corresponding bitstream elements to “true”. Other ways to provide this information can be used in different implementations.
- As already mentioned in one embodiment, groups are discarded for computing the reference loudness Lref if they are switched off in the default audio scene or in a preset. The resulting set of indexes is denoted as Kref.
- Analogously, any group that is switched off in the modified scene is excluded from computing the modified loudness Lmod. If a group is switched off in the default scene, but switched on by the user in the modified scene, the corresponding group loudness is excluded from the computation of the reference loudness Lref but included in the computation of the modified loudness Lmod and vice versa. The set of group indexes for the modified loudness Lmod is denoted with Kmod.
- The loudness compensation gain is then computed analogously to the discussion above by replacing Mref by Kref and by replacing Mmod by Kmod.
- For the case that any of the group loudness information that may be used to compute either the reference or the modified loudness is missing at the decoder, the blind loudness compensation is used as a fallback mode. The same approach with respect to selecting group indexes for the loudness compensation (Kref and Kmod) as described above is applied in the fallback mode.
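- The selection of the index sets Kref and Kmod and the decision for the fallback mode can be sketched as follows. This is a minimal Python illustration; the list-based representation of the bitstream elements and of the on/off states, and all function and parameter names, are assumptions made for the example and are not defined by the bitstream syntax.

```python
def select_index_sets(include_flags, default_on, user_on):
    """Return (k_ref, k_mod) as lists of group indexes.

    include_flags[i]: encoder-side flag, True if group i takes part in
                      loudness compensation (hypothetical representation).
    default_on[i]:    True if group i is switched on in the default scene
                      or preset.
    user_on[i]:       True if group i is switched on after user interaction.
    """
    k_ref = [i for i, inc in enumerate(include_flags) if inc and default_on[i]]
    k_mod = [i for i, inc in enumerate(include_flags) if inc and user_on[i]]
    return k_ref, k_mod


def needs_blind_fallback(group_loudness, k_ref, k_mod):
    """True if any selected group has no loudness value (None) in the metadata."""
    return any(group_loudness[i] is None for i in set(k_ref) | set(k_mod))
```

A group that is off by default but switched on by the user therefore appears only in the second set, and a missing loudness value of any selected group triggers the fallback.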
- FIG. 3 shows an embodiment of an audio encoder 20 which generates a digital audio signal 100 based on different audio sources. The audio signal 100 comprises metadata to be used e.g. by the audio processor discussed above.
- The audio encoder 20 comprises a loudness determiner 21 for determining a loudness value for at least one group having one or more audio elements 50. In the shown example, three audio sources X1, X2, and X3 are present, each comprised by one group. The loudness values of two of them, X2 and X3, are determined as L2 and L3 and are submitted to a metadata writer 22. The metadata writer 22 introduces the determined loudness values for the two groups X2 and X3 as corresponding group reference loudness information L2 and L3 into the metadata of the audio signal 100.
- Gain values as reference gains g1, g2, g3 for the groups X1, X2, and X3 are also inserted by the metadata writer 22 into the metadata of the audio signal 100. According to a further embodiment, the group loudnesses and reference gain values are determined for specific presets and/or different playback configurations. Also, the loudness for different presets is measured as a respective overall loudness Lp.
- The loudness of the first audio element 50, labelled as X1, is not measured by the loudness determiner 21 but is calculated or estimated by the estimator 24 (see the discussion above) and is given as a corresponding reference loudness L1 to the metadata writer 22 to be written into the metadata.
- The controller 23 in the shown embodiment is connected to the loudness determiner 21 as well as to the metadata writer 22. The controller 23 determines which group or which groups are to be considered or to be neglected for the determination of the loudness compensation gain C. For the data about the usage of the groups, an indication is written by the metadata writer 22 into the metadata. The corresponding data, e.g. in the form of flags, indicates which group is to be used or which group is to be neglected for the determination of the loudness compensation gain C by the audio processor or by a decoder.
- The resulting audio signal 100 comprises the actual signals received from the audio objects 50 and the metadata characterizing the actual signals and their intended treatment by the audio decoder 1. The data of the metadata refers to groups of audio objects, whereas it is also possible that a group covers just one audio object/element.
- The metadata contains at least some of the following data:
-
- measured loudness values Li for the individual groups,
- reference gain values gi for the individual groups which describe the loudness or prominence of the groups in relation to the other concerned groups together,
- a reference loudness Lref as the resulting loudness of the combined groups for a given preset and/or a given playback configuration,
- an indicator whether (e.g. whether the group belongs to an anchor or whether the duration of the group is so short that it can be neglected etc.) or how (e.g. for the calculation of the reference and/or modified loudness) a group or its corresponding values are used for determining the loudness compensation gain C.
- For each group, the metadata advantageously contains different sets of data for different presets and/or different playback configurations. Hence, different recording and different reproduction situations are considered leading to different data sets for the relevant groups.
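- One possible in-memory representation of the metadata listed above is sketched below in Python. The class names, field names and the use of strings such as "stereo" as keys for playback configurations are hypothetical and chosen only for illustration; they do not reflect an actual bitstream syntax.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class GroupMetadata:
    """Per-group data (hypothetical layout)."""
    reference_gain: float                                       # g_i, linear
    loudness: Dict[str, float] = field(default_factory=dict)    # L_i per playback configuration
    use_for_compensation: bool = True                           # indicator written by the encoder

    def loudness_for(self, playback_config: str) -> Optional[float]:
        """Return L_i for the given playback configuration, or None if missing."""
        return self.loudness.get(playback_config)


@dataclass
class PresetMetadata:
    """Per-preset data (hypothetical layout)."""
    groups: Dict[int, GroupMetadata]
    reference_loudness: Dict[str, float] = field(default_factory=dict)  # L_ref per playback configuration
```

Returning None for a missing value lets a decoder distinguish between a group loudness that was transmitted and one that has to be estimated or handled by the fallback mode.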
- The invention is in the following explained via different examples for implementing loudness compensation for user interactivity with an audio coding system.
-
- At the encoder side, the loudness of each group included in the default audio scene and/or presets is determined. The loudness information is introduced into the metadata that forms part of the audio stream or the audio signal.
- Multiple loudness values are included for at least one group, where different values are associated with different loudspeaker playback configurations (e.g. stereo, 5.1 or others).
- On the encoder side, additional metadata is created that corresponds to the information whether a group should be included for performing loudness compensation, i.e. whether it should be considered for the computation of the reference loudness and the modified loudness, respectively. For example, anchor-based loudness compensation is realized by configuring the metadata to include only groups that are part of the anchor element of the default audio scene or of a defined preset.
- The decoder receives the audio stream, representing the audio signal and associated metadata. The decoder decodes the audio stream to generate decoded audio signals corresponding to channels and/or objects and/or Higher-Order Ambisonics formats.
- Based on the metadata, the decoder selects all group indexes that should be included for the loudness compensation for a given audio scene or preset.
- At the decoder, the reference loudness Lref of the audio scene or a preset is computed based on the default gains gi of each selected group and the corresponding loudness information. If multiple loudness values are transmitted for a group, the loudness value associated with the given playback loudspeaker configuration is chosen.
- Analogously, the modified loudness Lmod is computed from the loudness information of the selected groups and the modified gains hi after user interaction.
- The loudness compensation gain C for the default audio scene or a preset is computed based on the reference loudness Lref and the modified loudness Lmod.
- The loudness compensation gain C is applied to the audio signal before playback providing the output signal.
- In some embodiments, it is not feasible to measure the loudness information that may be used for all groups at the encoder. Then, the encoder computes estimates of the missing group loudness values. The encoder may also apply different methods to estimate missing (not measured) group loudness information. The loudness compensation at the decoder is then performed as in the case that the loudness information has been measured for all groups.
- In further embodiments, the audio stream includes loudness information only for a limited number of groups. In this case, the missing group loudness information is estimated at the decoder. The loudness compensation at the decoder is then performed as in the case that all loudness information that may be used has been included in the metadata of the audio stream.
- Another embodiment includes the blind loudness compensation as a fallback mode if any group loudness information that may be used is missing at the decoder to perform correct loudness compensation. The same mechanism for determining the set of indexes Kref and Kmod for selecting the groups to be included in the computation of the reference and modified loudness as described above is used in the fallback mode. In other words, the selection of the set of group indexes Kref and Kmod is still based on the corresponding information generated at the encoder side, which is provided with the metadata of the audio stream.
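- The decoder-side steps described above can be sketched in Python as follows. The sketch assumes a power-domain model in which the loudness of a scene is 10·log10 of the sum of gi²·10^(Li/10) over the selected groups, and in which the compensation gain C is the difference between reference and modified loudness in dB. This model, the function names and the handling of silent scenes are assumptions made for illustration and are not quoted from the text.

```python
import math


def scene_loudness_db(gains, loudness_db, indexes):
    """Loudness in dB of the selected groups under the assumed power-sum model."""
    power = sum(gains[i] ** 2 * 10.0 ** (loudness_db[i] / 10.0) for i in indexes)
    return 10.0 * math.log10(power) if power > 0.0 else float("-inf")


def compensation_gain_db(ref_gains, mod_gains, loudness_db, k_ref, k_mod):
    """C = L_ref - L_mod in dB; 0 dB if either scene is silent."""
    l_ref = scene_loudness_db(ref_gains, loudness_db, k_ref)
    l_mod = scene_loudness_db(mod_gains, loudness_db, k_mod)
    if math.isinf(l_ref) or math.isinf(l_mod):
        return 0.0
    return l_ref - l_mod


def estimate_missing_loudness_db(preset_loudness_db, gains, loudness_db, missing):
    """Estimate one missing group loudness from the preset loudness L_p.

    Subtracts the known groups' power from the preset power and divides by
    the squared gain of the missing group. Returns None if no estimate is
    possible under the assumed model.
    """
    known = sum(
        gains[i] ** 2 * 10.0 ** (loudness_db[i] / 10.0)
        for i in range(len(gains))
        if i != missing
    )
    residual = 10.0 ** (preset_loudness_db / 10.0) - known
    if residual <= 0.0 or gains[missing] == 0.0:
        return None
    return 10.0 * math.log10(residual / gains[missing] ** 2)
```

For example, switching off one of two equally loud groups lowers the modified loudness by about 3 dB in this model, so the sketch returns a compensation gain of about +3 dB.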
- Some embodiments of the invention will follow that can be combined with the foregoing:
- A first embodiment refers to an audio processor for processing an audio signal, comprising: an audio signal modifier for modifying the audio signal in response to a user input; a loudness controller for determining a loudness compensation gain based on a reference loudness or a reference gain and a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input; and a loudness manipulator for manipulating a loudness of a signal using the loudness compensation gain.
- A second embodiment depending on the first embodiment refers to an apparatus, wherein the audio signal comprises a bitstream with metadata, the metadata comprising a group loudness for a group and a gain value for a group.
- A third embodiment depending on the first or second embodiment refers to an apparatus, wherein the loudness controller is configured to calculate the reference loudness for a group or a set of groups using the group loudness or the group loudnesses and the gain value or the gain values for the group or the set of groups, and to calculate the modified loudness for a group or a set of groups using the group loudness or the group loudnesses and the modified gain value or the modified gain values for the group or the set of groups, wherein the modified gain value or the modified gain values are modified by the user input.
- A fourth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to discard a group for determining the reference loudness when the group is discarded in metadata of the audio signal, or wherein the loudness controller is configured to discard a group when determining the reference loudness, when the group is switched off in response to the user input, or wherein the loudness controller is configured to exclude a group from the computation of the reference loudness, when the group is switched off in the metadata and is switched on by the user input, or vice versa.
- A fifth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to calculate the loudness compensation gain by relating the reference loudness to the loudness of a preset, wherein the preset comprises one or more groups, and wherein a group comprises one or more objects.
- A sixth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to perform a limitation operation on the loudness compensation gain so that the loudness compensation gain is lower than an upper threshold or so that the loudness compensation gain is greater than a lower threshold.
- A seventh embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness manipulator is configured to apply a gain to the signal determined by the loudness compensation gain and by an original normalization gain determined by a target level set by the audio processor and a metadata level indicated in the metadata of the audio signal.
- An eighth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal comprises a compensation metadata information indicating which group is to be used for the determination of the loudness compensation gain or which group is not to be used for determining the loudness compensation gain, and wherein the loudness controller is configured to only use a group for determining the loudness compensation gain indicated to be used by the compensation metadata information or to not use a group for determining the loudness compensation gain indicated not to be used by the compensation metadata information.
- A ninth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal is indicated to have an anchor element, wherein the loudness controller is configured to only use information for an audio object or a group of audio objects of the anchor element for determining the loudness compensation gain.
- A tenth embodiment depending on one of the first to eighth embodiments refers to an apparatus, wherein the audio signal is indicated to have an anchor element, wherein the loudness controller is configured to only use the information for an audio object or a group of audio objects of the anchor element for determining the loudness compensation gain, when the one or more audio objects of the anchor element are amplified by the user input and to use information from one or more audio objects of the anchor element and information of one or more audio objects not included in the anchor element, when the one or more audio objects of the anchor element are attenuated by the user input.
- An eleventh embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to calculate a group loudness missing in the audio signal using a loudness of a preset comprising at least two groups and gain and loudness information not missing for the preset.
- A twelfth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to perform a blind loudness compensation using one or more gain values for one or more groups and one or more modified gain values for one or more groups.
- A thirteenth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the loudness controller is configured to check whether the audio signal comprises a reference loudness information, and if the audio signal does not comprise the reference loudness information, to perform a blind loudness compensation using one or more gain values for one or more groups and one or more modified gain values for one or more groups, or to check whether a modified loudness information cannot be calculated and to perform a blind loudness compensation when the modified loudness information cannot be calculated, wherein the blind loudness compensation comprises using one or more gain values for one or more groups and one or more modified gain values for one or more groups.
- A fourteenth embodiment depending on one of the preceding embodiments refers to an apparatus, wherein the audio signal comprises different reference loudness information values for different playback configurations, wherein the apparatus further comprises a format converter for converting a signal to a predefined playback configuration, and wherein the loudness controller is configured to select the specific loudness value for the specific playback configuration used by the format converter.
- A fifteenth embodiment refers to an audio encoder for generating an audio signal comprising metadata, comprising: a loudness determiner for determining a loudness for a group having one or more audio objects; and a metadata writer for introducing the loudness for the group as a reference loudness information into the metadata.
- A sixteenth embodiment depending on the fifteenth embodiment refers to an audio encoder, wherein the loudness determiner is configured to determine different loudness values for different playback configurations, and wherein the metadata writer is configured to introduce the different loudness values in association with the different playback configurations into the metadata.
- A seventeenth embodiment depending on the fifteenth or sixteenth embodiment refers to an audio encoder, further comprising a controller for determining, which group is to be used for a loudness compensation or not, and wherein the metadata writer is configured for writing an indication into the metadata indicating, which group is to be used or which group is not to be used for the loudness compensation.
- An eighteenth embodiment depending on one of the fifteenth to seventeenth embodiments refers to an audio encoder, wherein the loudness determiner is configured to compute a group loudness value for a group, where the group loudness value for the group is missing in the metadata, and wherein the metadata writer is configured for introducing the missing loudness value into the metadata so that all groups of the audio signal have associated reference loudness information.
- A nineteenth embodiment refers to a method for processing an audio signal, comprising: modifying the audio signal in response to a user input; determining a loudness compensation gain based on a reference loudness or a reference gain and a modified loudness or a modified gain, where the modified loudness or the modified gain depends on the user input; and manipulating a loudness of a signal using the loudness compensation gain.
- A twentieth embodiment refers to a method for generating an audio signal comprising metadata, comprising: determining a loudness for a group having one or more audio objects; and introducing the loudness for the group as a reference loudness information into the metadata.
- A twenty-first embodiment refers to a computer program for performing, when running on a computer or a processor, the method according to the nineteenth embodiment or the method according to the twentieth embodiment.
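- The limitation operation of the sixth embodiment and the combination with the normalization gain of the seventh embodiment can be sketched in Python as follows. The sketch assumes that all quantities are expressed in dB, that the normalization gain is the difference between the target level and the metadata level, and that the corrected gain is the sum of the normalization gain and the loudness compensation gain. These relations and the function names are illustrative assumptions.

```python
def limit_compensation_gain_db(c_db, c_min_db, c_max_db):
    """Clamp the loudness compensation gain C to [c_min_db, c_max_db]."""
    return max(c_min_db, min(c_db, c_max_db))


def corrected_gain_db(c_db, target_level_db, metadata_level_db):
    """Combine the normalization gain with C (assumed to add in dB)."""
    normalization_gain_db = target_level_db - metadata_level_db
    return normalization_gain_db + c_db


def db_to_linear(gain_db):
    """Convert a gain in dB to a linear amplitude factor."""
    return 10.0 ** (gain_db / 20.0)
```

Limiting C before it is combined with the normalization gain keeps a strong user interaction, such as attenuating nearly all groups, from producing an excessive boost of the output signal.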
- Although some aspects have been described in the context of an apparatus, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block or item or feature of a corresponding apparatus. Some or all of the method steps may be executed by (or using) a hardware apparatus, like for example, a microprocessor, a programmable computer or an electronic circuit. In some embodiments, one or more of the most important method steps may be executed by such an apparatus.
- The inventive transmitted or encoded signal can be stored on a digital storage medium or can be transmitted on a transmission medium such as a wireless transmission medium or a wired transmission medium such as the Internet.
- Depending on certain implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, for example a floppy disc, a DVD, a Blu-Ray, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, having electronically readable control signals stored thereon, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer readable.
- Some embodiments according to the invention comprise a data carrier having electronically readable control signals, which are capable of cooperating with a programmable computer system, such that one of the methods described herein is performed.
- Generally, embodiments of the present invention can be implemented as a computer program product with a program code, the program code being operative for performing one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine readable carrier.
- Other embodiments comprise the computer program for performing one of the methods described herein, stored on a machine readable carrier.
- In other words, an embodiment of the inventive method is, therefore, a computer program having a program code for performing one of the methods described herein, when the computer program runs on a computer.
- A further embodiment of the inventive method is, therefore, a data carrier (or a non-transitory storage medium such as a digital storage medium, or a computer-readable medium) comprising, recorded thereon, the computer program for performing one of the methods described herein. The data carrier, the digital storage medium or the recorded medium are typically tangible and/or non-transitory.
- A further embodiment of the inventive method is, therefore, a data stream or a sequence of signals representing the computer program for performing one of the methods described herein. The data stream or the sequence of signals may, for example, be configured to be transferred via a data communication connection, for example, via the internet.
- A further embodiment comprises a processing means, for example, a computer or a programmable logic device, configured to, or adapted to, perform one of the methods described herein.
- A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described herein.
- A further embodiment according to the invention comprises an apparatus or a system configured to transfer (for example, electronically or optically) a computer program for performing one of the methods described herein to a receiver. The receiver may, for example, be a computer, a mobile device, a memory device or the like. The apparatus or system may, for example, comprise a file server for transferring the computer program to the receiver.
- In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionalities of the methods described herein. In some embodiments, a field programmable gate array may cooperate with a microprocessor in order to perform one of the methods described herein. Generally, the methods are advantageously performed by any hardware apparatus.
- In one embodiment, an audio processor (1) for processing an audio signal (100) includes an audio signal modifier (2). The audio signal modifier (2) is configured to modify the audio signal (100) in response to a user input. The audio processor further includes a loudness controller (6). The loudness controller (6) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi). The modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness controller (6) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal (100) indicating which group is to be used or is not to be used for determining the loudness compensation gain (C). The group comprises one or more audio elements. The audio processor further includes a loudness manipulator (5). The loudness manipulator (5) is configured to manipulate a loudness of a signal using the loudness compensation gain (C). In one alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) based on at least one flag comprised by the data of the metadata, and the flag indicates whether or how a group is to be considered for determining the loudness compensation gain (C). In another alternative, the loudness controller (6) is configured to use only groups for determining the loudness compensation gain (C) when the groups belong to an anchor comprised by the metadata of the audio signal (100).
In another alternative, the loudness controller (6) is configured to use only the groups belonging to the anchor for determining the loudness compensation gain (C) when the modified gain (hi) of at least one group belonging to the anchor is greater than the corresponding reference gain (gi), and/or the loudness controller (6) is configured to use groups belonging to the anchor and groups missing from the anchor for determining the loudness compensation gain (C) when the modified gain (hi) of at least one group belonging to the anchor is lower than the corresponding reference gain (gi), and the modified gain (hi) depends on the user input.
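- The anchor-dependent choice of groups described in the last alternative can be sketched in Python as follows. The text leaves open what happens when some anchor groups are amplified while others are attenuated, or when the anchor is unchanged; in this sketch attenuation takes precedence and an unchanged anchor is treated like an amplified one. Both choices, and the function and parameter names, are assumptions made for illustration.

```python
def groups_for_compensation(anchor, ref_gains, mod_gains):
    """Return the group indexes used for the loudness compensation gain.

    anchor:    iterable of group indexes belonging to the anchor.
    ref_gains: reference gains g_i per group.
    mod_gains: modified gains h_i per group after user interaction.
    """
    anchor_groups = sorted(anchor)
    attenuated = any(mod_gains[i] < ref_gains[i] for i in anchor_groups)
    if attenuated:
        # Anchor attenuated: use anchor groups and groups missing from the anchor.
        return list(range(len(ref_gains)))
    # Anchor amplified (or unchanged, by assumption): use anchor groups only.
    return anchor_groups
```

With an anchor consisting of group 0, raising h0 above g0 restricts the computation to group 0, whereas lowering h0 below g0 brings all groups into the computation.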
- In one embodiment, an audio processor (1) for processing an audio signal (100) includes an audio signal modifier (2). The audio signal modifier (2) is configured to modify the audio signal (100) in response to a user input. The audio processor further includes a loudness controller (6). The loudness controller (6) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi). The modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness controller (6) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal (100) referring to at least one preset. The preset refers to a set of at least one group comprising one or more audio elements. The audio processor further includes a loudness manipulator (5). The loudness manipulator (5) is configured to manipulate a loudness of a signal using the loudness compensation gain (C). In one alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) based on group loudnesses (Li) and/or gain values (gi) of the at least one group of the set referred to by the preset. In another alternative, the loudness controller (6) is configured to determine the reference loudness (Lref) for the set referred to by the preset using the respective group loudnesses (Li) and the respective gain values (gi). The loudness controller (6) is configured to determine the modified loudness (Lmod) for the set referred to by the preset using the respective group loudnesses (Li) and the respective modified gain values (hi). The modified gain values (hi) are modified by the user input. 
In another alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a selected preset, and the preset is selected by the user input. In another alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a default preset and the default preset is set prior to or independently of a user input.
- In one embodiment, an audio processor (1) for processing an audio signal (100) includes an audio signal modifier (2), the audio signal modifier (2) is configured to modify the audio signal (100) in response to a user input. The audio processor further includes a loudness controller (6), the loudness controller (6) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi), and the modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness controller (6) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal (100) indicating whether a group is switched off or switched on. The group comprises one or more audio elements. The audio processor further includes a loudness manipulator (5), the loudness manipulator (5) is configured to manipulate a loudness of a signal using the loudness compensation gain (C). In one alternative, the loudness controller (6) is configured to discard a group for determining the modified loudness (Lmod) when the group is switched off in response to the user input. In another alternative, the loudness controller (6) is configured to discard a group for determining the reference loudness (Lref) when the group is switched off in the metadata and to include the group for determining the modified loudness (Lmod) when the group is switched on by the user input. In another alternative, the loudness controller (6) is configured to include a group for determining the reference loudness (Lref) when the group is switched on in the metadata and to exclude the group for determining the modified loudness (Lmod) when the group is switched off by the user input.
- In one embodiment, an audio processor (1) for processing an audio signal (100) includes an audio signal modifier (2), the audio signal modifier (2) is configured to modify the audio signal (100) in response to a user input. The audio processor further includes a loudness controller (6), the loudness controller (6) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi), the modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness controller (6) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal (100) with at least one group loudness missing in the metadata of a group comprised by the audio signal (100). The audio processor further includes a loudness manipulator (5), the loudness manipulator (5) is configured to manipulate a loudness of a signal (101) using the loudness compensation gain (C). In another alternative, the loudness controller (6) is configured to calculate the missing group loudness (LA) using a loudness of a preset (Lp), the reference gain (gi) of the group with missing group loudness as well as the group loudnesses (Li) and the reference gains (gi) for the groups having a group loudness (Li). In another alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) in the case that the metadata of the audio signal (100) is missing at least one group loudness for a blind loudness compensation using only at least one reference gain (gi) and at least one modified gain (hi). 
Alternatively, the loudness controller (6) is configured to determine the loudness compensation gain (C) in the case that the metadata of the audio signal (100) is void of group loudnesses for a blind loudness compensation using only at least one reference gain (gi) and at least one modified gain (hi).
- In one embodiment, an audio processor (1) for processing an audio signal (100) includes an audio signal modifier (2), the audio signal modifier (2) is configured to modify the audio signal (100) in response to a user input. The audio processor further includes a loudness controller (6), the loudness controller (6) is configured to determine a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi). The modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness controller (6) is configured to determine the loudness compensation gain (C) based on metadata of the audio signal (100) referring to a playback configuration for a reproduction of the signal (100). The audio processor further includes a loudness manipulator (5), the loudness manipulator (5) is configured to manipulate a loudness of a signal (101) using the loudness compensation gain (C). In one alternative, the loudness controller (6) is configured to determine the loudness compensation gain (C) based on the data of the metadata referring to a playback configuration and comprising associated group loudnesses (Li) and/or reference gain values (gi). In another alternative, the audio signal (100) comprises a bitstream with the metadata, and wherein the metadata comprises the reference gain (gi) for at least one group. Alternatively, the metadata of the audio signal (100) comprises a group loudness (Li) for at least one group. In another alternative, the loudness controller (6) is configured to determine the reference loudness (Lref) for at least one group using the group loudness (Li) and the gain value (gi) for the group, the loudness controller (6) is configured to determine the modified loudness (Lmod) for the group using the group loudness (Li) and the modified gain value (hi), and the modified gain value (hi) is modified by the user input. 
In one alternative, the loudness controller (6) is configured to determine the reference loudness (Lref) for a plurality of groups using the respective group loudnesses (Li) and gain values (gi) for the groups. In another alternative, the loudness controller (6) is configured to determine the modified loudness (Lmod) for a plurality of groups using the respective group loudnesses (Li) and modified gain values (hi) for the groups. Alternatively, the loudness controller (6) is configured to perform a limitation operation on the loudness compensation gain (C) so that the loudness compensation gain (C) is lower than an upper threshold (Cmax) and/or so that the loudness compensation gain (C) is greater than a lower threshold (Cmin). Alternatively, the loudness manipulator (5) is configured to apply to the signal a corrected gain (Gcorrected) determined by the loudness compensation gain (C) and by a normalization gain (GN), the normalization gain (GN) being determined by a target loudness level set by user input and a metadata loudness level comprised by the metadata of the audio signal (100). In another embodiment, an audio encoder (20) for generating an audio signal (100) includes a loudness determiner (21) for determining a loudness value for at least one group having one or more audio elements (50). The audio encoder further includes a metadata writer (22) for introducing the determined loudness value as a group loudness (Li) into the metadata. In one alternative, the loudness determiner (21) is configured to determine different loudness values and/or different gain values for different playback configurations, and the metadata writer (22) is configured to introduce the determined different loudness values and/or different gain values in association with the respective playback configuration into the metadata.
Alternatively, the loudness determiner (21) is configured to determine different loudness values and/or different gain values for different presets referring to sets of at least one group comprising one or more audio elements, and the metadata writer (22) is configured to introduce the determined different loudness values and/or different gain values in association with the respective preset into the metadata. In one alternative, the audio encoder further includes a controller (23), the controller (23) is configured to determine which group is to be used for determining a loudness compensation gain (C) or is to be neglected, and wherein the metadata writer (22) is configured for writing an indication into the metadata indicating which group is to be used or is to be neglected for determining the loudness compensation gain (C). In another alternative, the audio encoder further includes an estimator (24), the estimator (24) is configured to compute a group loudness value for a group, the group loudness value for the group is undetermined by the loudness determiner (21), and the metadata writer (22) is configured for introducing the computed group loudness value into the metadata so that all groups of the audio signal (100) have associated group loudnesses.
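The decoder-side quantities named in this embodiment (Lref, Lmod, C, Cmin, Cmax, GN, Gcorrected) can be tied together in a small sketch. It assumes that group loudnesses (Li) are given in dB, that the gains (gi, hi) are linear amplitude factors, that the groups add in power, and that the normalization gain is the difference between the target level and the metadata level. The function names and the default thresholds of ±12 dB are hypothetical values chosen for the example only.

```python
import math


def mix_loudness_db(group_loudness_db, gains):
    """Loudness (dB) of a mix of groups, assuming power addition."""
    if len(group_loudness_db) != len(gains):
        raise ValueError("one gain per group loudness")
    power = sum(g * g * 10.0 ** (l / 10.0)
                for l, g in zip(group_loudness_db, gains))
    return 10.0 * math.log10(power) if power > 0.0 else float("-inf")


def compensation_gain_db(group_loudness_db, reference_gains, modified_gains,
                         c_min_db=-12.0, c_max_db=12.0):
    """C = Lref - Lmod, limited to [c_min_db, c_max_db] (assumed thresholds)."""
    l_ref = mix_loudness_db(group_loudness_db, reference_gains)
    l_mod = mix_loudness_db(group_loudness_db, modified_gains)
    if math.isinf(l_ref) or math.isinf(l_mod):
        return 0.0  # a silent mix cannot be compensated meaningfully
    return max(c_min_db, min(c_max_db, l_ref - l_mod))


def corrected_gain_db(c_db, target_loudness_db, metadata_loudness_db):
    """Gcorrected = GN + C, with GN = target level - metadata level."""
    normalization_gain_db = target_loudness_db - metadata_loudness_db
    return normalization_gain_db + c_db
```

For two groups at -23 dB each, muting one of them in the modified mix gives a compensation gain of about +3 dB; attenuating both groups by 40 dB would ask for +40 dB, which the limitation caps at the assumed upper threshold.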
- In one embodiment, a method for processing an audio signal (100) includes modifying the audio signal (100) in response to a user input. The method further includes determining a loudness compensation gain (C) based on the one hand on a reference loudness (Lref) or a reference gain (gi) and on the other hand on a modified loudness (Lmod) or a modified gain (hi), where the modified loudness (Lmod) or the modified gain (hi) depends on the user input. The loudness compensation gain (C) is determined based on metadata of the audio signal (100) indicating whether a group comprised by the audio signal (100) is to be used or is not to be used for determining the loudness compensation gain (C), wherein the group comprises one or more audio elements. And/or the loudness compensation gain (C) is determined based on metadata of the audio signal (100) referring to a preset, wherein the preset refers to a set of at least one group comprising one or more audio elements. And/or the loudness compensation gain (C) is determined based on metadata of the audio signal (100) indicating whether a group is switched off or switched on, wherein the group comprises one or more audio elements. And/or the loudness compensation gain (C) is determined based on metadata of the audio signal (100) with at least one group loudness (Li) missing in the metadata of a group comprised by the audio signal (100). And/or the loudness compensation gain (C) is determined based on metadata of the audio signal (100) referring to a playback configuration for a reproduction of the signal (100). The method further includes manipulating a loudness of a signal using the loudness compensation gain (C). In one alternative, a method for generating an audio signal (100) comprising metadata includes determining a loudness value for a group having one or more audio elements; and introducing the determined loudness value for the group as a group loudness (Li) into the metadata.
In all cases herein, the various alternatives may be implemented in the various embodiments, and the embodiments described herein are not strictly limited by these descriptions.
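How the metadata indications of the method (group used or neglected, group switched on or off, group loudness missing) might enter the determination of the loudness compensation gain (C) is sketched below. The `Group` record, its field names, and the policy of falling back to a gains-only estimate as soon as one considered group lacks a loudness value are all assumptions made for illustration; the description does not prescribe this data layout or this fallback.

```python
import math
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Group:
    """Hypothetical per-group record assembled from metadata and user input."""
    reference_gain: float                # g_i, linear amplitude factor
    modified_gain: float                 # h_i, after user interaction
    loudness_db: Optional[float] = None  # L_i, None if missing in metadata
    use_for_compensation: bool = True    # metadata: use or neglect the group
    reference_on: bool = True            # on/off state in the reference scene
    modified_on: bool = True             # on/off state after user interaction


def compensation_gain_from_metadata_db(groups: Sequence[Group]) -> float:
    """Compensation gain C (dB) over the groups flagged for compensation."""
    used = [g for g in groups if g.use_for_compensation]
    # Assumed policy: if any considered group lacks L_i, ignore all L_i.
    blind = any(g.loudness_db is None for g in used)

    def power(gain: float, on: bool, loudness_db: Optional[float]) -> float:
        if not on:
            return 0.0  # a switched-off group contributes nothing
        weight = 1.0 if blind else 10.0 ** (loudness_db / 10.0)
        return gain * gain * weight

    ref = sum(power(g.reference_gain, g.reference_on, g.loudness_db) for g in used)
    mod = sum(power(g.modified_gain, g.modified_on, g.loudness_db) for g in used)
    if ref <= 0.0 or mod <= 0.0:
        return 0.0
    return 10.0 * math.log10(ref / mod)
```

In this sketch a group flagged as neglected never influences C, however strongly the user changes its gain, while switching off one of two equally loud groups yields a compensation of about +3 dB.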
- While this invention has been described in terms of several embodiments, there are alterations, permutations, and equivalents which fall within the scope of this invention. It should also be noted that there are many alternative ways of implementing the methods and compositions of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, permutations and equivalents as fall within the true spirit and scope of the present invention.
Claims (25)
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/413,507 US10838687B2 (en) | 2015-06-17 | 2019-05-15 | Loudness control for user interactivity in audio coding systems |
US17/028,777 US11379178B2 (en) | 2015-06-17 | 2020-09-22 | Loudness control for user interactivity in audio coding systems |
US17/805,260 US20220291896A1 (en) | 2015-06-17 | 2022-06-03 | Loudness control for user interactivity in audio coding systems |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15172593 | 2015-06-17 | ||
EP15172593.4 | 2015-06-17 | ||
PCT/EP2016/063205 WO2016202682A1 (en) | 2015-06-17 | 2016-06-09 | Loudness control for user interactivity in audio coding systems |
US15/842,682 US10394520B2 (en) | 2015-06-17 | 2017-12-14 | Loudness control for user interactivity in audio coding systems |
US16/413,507 US10838687B2 (en) | 2015-06-17 | 2019-05-15 | Loudness control for user interactivity in audio coding systems |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/842,682 Continuation US10394520B2 (en) | 2015-06-17 | 2017-12-14 | Loudness control for user interactivity in audio coding systems |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/028,777 Continuation US11379178B2 (en) | 2015-06-17 | 2020-09-22 | Loudness control for user interactivity in audio coding systems |
Publications (2)
Publication Number | Publication Date |
---|---|
US20190265944A1 true US20190265944A1 (en) | 2019-08-29 |
US10838687B2 US10838687B2 (en) | 2020-11-17 |
Family
ID=53442595
Family Applications (4)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/842,682 Active US10394520B2 (en) | 2015-06-17 | 2017-12-14 | Loudness control for user interactivity in audio coding systems |
US16/413,507 Active US10838687B2 (en) | 2015-06-17 | 2019-05-15 | Loudness control for user interactivity in audio coding systems |
US17/028,777 Active US11379178B2 (en) | 2015-06-17 | 2020-09-22 | Loudness control for user interactivity in audio coding systems |
US17/805,260 Pending US20220291896A1 (en) | 2015-06-17 | 2022-06-03 | Loudness control for user interactivity in audio coding systems |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/842,682 Active US10394520B2 (en) | 2015-06-17 | 2017-12-14 | Loudness control for user interactivity in audio coding systems |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/028,777 Active US11379178B2 (en) | 2015-06-17 | 2020-09-22 | Loudness control for user interactivity in audio coding systems |
US17/805,260 Pending US20220291896A1 (en) | 2015-06-17 | 2022-06-03 | Loudness control for user interactivity in audio coding systems |
Country Status (20)
Country | Link |
---|---|
US (4) | US10394520B2 (en) |
EP (2) | EP3311379B1 (en) |
JP (4) | JP6578383B2 (en) |
KR (1) | KR102122004B1 (en) |
CN (2) | CN107820711B (en) |
AR (6) | AR105028A1 (en) |
AU (4) | AU2016279775A1 (en) |
BR (1) | BR112017026915B1 (en) |
CA (2) | CA3131960A1 (en) |
ES (1) | ES2936089T3 (en) |
FI (1) | FI3311379T3 (en) |
HK (1) | HK1246962A1 (en) |
MX (1) | MX2017016333A (en) |
MY (1) | MY181475A (en) |
PL (1) | PL3311379T3 (en) |
PT (1) | PT3311379T (en) |
RU (1) | RU2685999C1 (en) |
TW (1) | TWI664623B (en) |
WO (1) | WO2016202682A1 (en) |
ZA (1) | ZA201708348B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11838578B2 (en) | 2019-11-20 | 2023-12-05 | Dolby International Ab | Methods and devices for personalizing audio content |
Families Citing this family (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2005299410B2 (en) | 2004-10-26 | 2011-04-07 | Dolby Laboratories Licensing Corporation | Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal |
TWI447709B (en) | 2010-02-11 | 2014-08-01 | Dolby Lab Licensing Corp | System and method for non-destructively normalizing loudness of audio signals within portable devices |
CN103325380B (en) | 2012-03-23 | 2017-09-12 | 杜比实验室特许公司 | Gain for signal enhancing is post-processed |
CN112185399A (en) | 2012-05-18 | 2021-01-05 | 杜比实验室特许公司 | System for maintaining reversible dynamic range control information associated with a parametric audio encoder |
US10844689B1 (en) | 2019-12-19 | 2020-11-24 | Saudi Arabian Oil Company | Downhole ultrasonic actuator system for mitigating lost circulation |
US9841941B2 (en) | 2013-01-21 | 2017-12-12 | Dolby Laboratories Licensing Corporation | System and method for optimizing loudness and dynamic range across different playback devices |
MX339611B (en) | 2013-01-21 | 2016-05-31 | Dolby Laboratories Licensing Corp | Audio encoder and decoder with program loudness and boundary metadata. |
US9715880B2 (en) | 2013-02-21 | 2017-07-25 | Dolby International Ab | Methods for parametric multi-channel encoding |
CN104080024B (en) | 2013-03-26 | 2019-02-19 | 杜比实验室特许公司 | Volume leveller controller and control method and audio classifiers |
CN105190618B (en) | 2013-04-05 | 2019-01-25 | 杜比实验室特许公司 | Acquisition, recovery and the matching to the peculiar information from media file-based for autofile detection |
TWM487509U (en) | 2013-06-19 | 2014-10-01 | 杜比實驗室特許公司 | Audio processing apparatus and electrical device |
CN105556837B (en) | 2013-09-12 | 2019-04-19 | 杜比实验室特许公司 | Dynamic range control for various playback environments |
EP4379714A2 (en) | 2013-09-12 | 2024-06-05 | Dolby Laboratories Licensing Corporation | Loudness adjustment for downmixed audio content |
CN105142067B (en) | 2014-05-26 | 2020-01-07 | 杜比实验室特许公司 | Audio signal loudness control |
EP4060661B1 (en) | 2014-10-10 | 2024-04-24 | Dolby Laboratories Licensing Corporation | Transmission-agnostic presentation-based program loudness |
ES2936089T3 (en) * | 2015-06-17 | 2023-03-14 | Fraunhofer Ges Forschung | Sound intensity control for user interaction in audio encoding systems |
CN112020827A (en) | 2018-01-07 | 2020-12-01 | 格雷斯诺特有限公司 | Method and apparatus for volume adjustment |
EP3617871A1 (en) * | 2018-08-28 | 2020-03-04 | Koninklijke Philips N.V. | Audio apparatus and method of audio processing |
CN111048108B (en) * | 2018-10-12 | 2022-06-24 | 北京微播视界科技有限公司 | Audio processing method and device |
CN111131860A (en) * | 2018-10-31 | 2020-05-08 | 北京猎户星空科技有限公司 | Audio and video playing method, device, equipment and medium |
US11304021B2 (en) * | 2018-11-29 | 2022-04-12 | Sony Interactive Entertainment Inc. | Deferred audio rendering |
CN110231087B (en) * | 2019-06-06 | 2021-07-23 | 江苏省广播电视集团有限公司 | High-definition television audio loudness analysis alarm and normalization manufacturing method and device |
US20210006976A1 (en) * | 2019-07-03 | 2021-01-07 | Qualcomm Incorporated | Privacy restrictions for audio rendering |
CN112584275B (en) * | 2019-09-29 | 2022-04-22 | 深圳Tcl新技术有限公司 | Sound field expansion method, computer equipment and computer readable storage medium |
KR20220071954A (en) * | 2020-11-24 | 2022-05-31 | 가우디오랩 주식회사 | Method for performing normalization of audio signal and apparatus therefor |
CN114449413B (en) * | 2022-02-16 | 2023-12-22 | 深圳万兴软件有限公司 | Method, device, equipment and storage medium for controlling loudness of audio signal |
CN116033314B (en) * | 2023-02-15 | 2023-05-30 | 南昌航天广信科技有限责任公司 | Audio automatic gain compensation method, system, computer and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110085025A1 (en) * | 2009-10-13 | 2011-04-14 | Vincent Pace | Stereographic Cinematography Metadata Recording |
US20140039890A1 (en) * | 2011-04-28 | 2014-02-06 | Dolby International Ab | Efficient content classification and loudness estimation |
US20150325243A1 (en) * | 2013-01-21 | 2015-11-12 | Dolby Laboratories Licensing Corporation | Audio encoder and decoder with program loudness and boundary metadata |
US20160196830A1 (en) * | 2013-06-19 | 2016-07-07 | Dolby Laboratories Licensing Corporation | Audio encoder and decoder with program information or substream structure metadata |
US20170013387A1 (en) * | 2014-04-02 | 2017-01-12 | Dolby International Ab | Exploiting metadata redundancy in immersive audio metadata |
Family Cites Families (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO1996015614A1 (en) * | 1994-11-09 | 1996-05-23 | Oki Telecom | Independent volume control for multi-system radio telephone |
DE69942521D1 (en) * | 1998-04-14 | 2010-08-05 | Hearing Enhancement Co Llc | USER ADJUSTABLE VOLUME CONTROL FOR HEARING |
SG185134A1 (en) * | 2003-05-28 | 2012-11-29 | Dolby Lab Licensing Corp | Method, apparatus and computer program for calculating and adjusting the perceived loudness of an audio signal |
RU2347282C2 (en) * | 2003-07-07 | 2009-02-20 | Конинклейке Филипс Электроникс Н.В. | System and method of sound signal processing |
AU2005299410B2 (en) * | 2004-10-26 | 2011-04-07 | Dolby Laboratories Licensing Corporation | Calculating and adjusting the perceived loudness and/or the perceived spectral balance of an audio signal |
DE602006010323D1 (en) * | 2006-04-13 | 2009-12-24 | Fraunhofer Ges Forschung | decorrelator |
JP5082327B2 (en) * | 2006-08-09 | 2012-11-28 | ソニー株式会社 | Audio signal processing apparatus, audio signal processing method, and audio signal processing program |
KR100868475B1 (en) | 2007-02-16 | 2008-11-12 | 한국전자통신연구원 | Method for creating, editing, and reproducing multi-object audio contents files for object-based audio service, and method for creating audio presets |
KR20080082916A (en) | 2007-03-09 | 2008-09-12 | 엘지전자 주식회사 | A method and an apparatus for processing an audio signal |
WO2009093867A2 (en) * | 2008-01-23 | 2009-07-30 | Lg Electronics Inc. | A method and an apparatus for processing audio signal |
US8315396B2 (en) * | 2008-07-17 | 2012-11-20 | Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. | Apparatus and method for generating audio output signals using object based metadata |
US8908874B2 (en) | 2010-09-08 | 2014-12-09 | Dts, Inc. | Spatial audio encoding and reproduction |
EP2541542A1 (en) * | 2011-06-27 | 2013-01-02 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Apparatus and method for determining a measure for a perceived level of reverberation, audio processor and method for processing a signal |
WO2012122397A1 (en) * | 2011-03-09 | 2012-09-13 | Srs Labs, Inc. | System for dynamically creating and rendering audio objects |
TWI603632B (en) | 2011-07-01 | 2017-10-21 | 杜比實驗室特許公司 | System and method for adaptive audio signal generation, coding and rendering |
US9312829B2 (en) * | 2012-04-12 | 2016-04-12 | Dts Llc | System for adjusting loudness of audio signals in real time |
KR101956161B1 (en) * | 2012-08-30 | 2019-03-08 | 삼성전자 주식회사 | Method and apparatus for controlling audio output |
WO2014114781A1 (en) | 2013-01-28 | 2014-07-31 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Method and apparatus for normalized audio playback of media with and without embedded loudness metadata on new media devices |
US9559651B2 (en) * | 2013-03-29 | 2017-01-31 | Apple Inc. | Metadata for loudness and dynamic range control |
TWI530941B (en) * | 2013-04-03 | 2016-04-21 | 杜比實驗室特許公司 | Methods and systems for interactive rendering of object based audio |
EP2833549B1 (en) * | 2013-08-01 | 2016-04-06 | EchoStar UK Holdings Limited | Loudness level control for audio reception and decoding equipment |
EP4379714A2 (en) | 2013-09-12 | 2024-06-05 | Dolby Laboratories Licensing Corporation | Loudness adjustment for downmixed audio content |
EP2879131A1 (en) * | 2013-11-27 | 2015-06-03 | Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. | Decoder, encoder and method for informed loudness estimation in object-based audio coding systems |
JP6518254B2 (en) * | 2014-01-09 | 2019-05-22 | ドルビー ラボラトリーズ ライセンシング コーポレイション | Spatial error metrics for audio content |
US10063207B2 (en) * | 2014-02-27 | 2018-08-28 | Dts, Inc. | Object-based audio loudness management |
ES2936089T3 (en) | 2015-06-17 | 2023-03-14 | Fraunhofer Ges Forschung | Sound intensity control for user interaction in audio encoding systems |
-
2016
- 2016-06-09 ES ES16730766T patent/ES2936089T3/en active Active
- 2016-06-09 CN CN201680034882.0A patent/CN107820711B/en active Active
- 2016-06-09 CN CN202010806373.3A patent/CN112291699B/en active Active
- 2016-06-09 KR KR1020187001349A patent/KR102122004B1/en active IP Right Grant
- 2016-06-09 PT PT167307669T patent/PT3311379T/en unknown
- 2016-06-09 EP EP16730766.9A patent/EP3311379B1/en active Active
- 2016-06-09 CA CA3131960A patent/CA3131960A1/en active Pending
- 2016-06-09 MY MYPI2017001875A patent/MY181475A/en unknown
- 2016-06-09 RU RU2018101440A patent/RU2685999C1/en active
- 2016-06-09 MX MX2017016333A patent/MX2017016333A/en unknown
- 2016-06-09 AU AU2016279775A patent/AU2016279775A1/en not_active Abandoned
- 2016-06-09 PL PL16730766.9T patent/PL3311379T3/en unknown
- 2016-06-09 EP EP22206207.7A patent/EP4156180A1/en active Pending
- 2016-06-09 BR BR112017026915-5A patent/BR112017026915B1/en active IP Right Grant
- 2016-06-09 WO PCT/EP2016/063205 patent/WO2016202682A1/en active Application Filing
- 2016-06-09 CA CA2988645A patent/CA2988645C/en active Active
- 2016-06-09 JP JP2017565686A patent/JP6578383B2/en active Active
- 2016-06-09 FI FIEP16730766.9T patent/FI3311379T3/en active
- 2016-06-16 AR ARP160101807A patent/AR105028A1/en active IP Right Grant
- 2016-06-16 TW TW105118958A patent/TWI664623B/en active
-
2017
- 2017-12-08 ZA ZA201708348A patent/ZA201708348B/en unknown
- 2017-12-14 US US15/842,682 patent/US10394520B2/en active Active
-
2018
- 2018-05-10 HK HK18106069.3A patent/HK1246962A1/en unknown
-
2019
- 2019-03-11 JP JP2019043353A patent/JP6838093B2/en active Active
- 2019-05-15 US US16/413,507 patent/US10838687B2/en active Active
- 2019-10-11 AU AU2019246882A patent/AU2019246882B2/en active Active
-
2020
- 2020-09-22 US US17/028,777 patent/US11379178B2/en active Active
-
2021
- 2021-02-10 JP JP2021019428A patent/JP7233458B2/en active Active
- 2021-08-03 AR ARP210102159A patent/AR123133A2/en unknown
- 2021-08-03 AR ARP210102160A patent/AR123134A2/en unknown
- 2021-08-03 AR ARP210102165A patent/AR123139A2/en unknown
- 2021-08-03 AR ARP210102161A patent/AR123135A2/en unknown
- 2021-08-03 AR ARP210102162A patent/AR123136A2/en unknown
- 2021-12-22 AU AU2021290313A patent/AU2021290313B2/en active Active
-
2022
- 2022-06-03 US US17/805,260 patent/US20220291896A1/en active Pending
-
2023
- 2023-02-21 JP JP2023025054A patent/JP2023062138A/en active Pending
-
2024
- 2024-04-04 AU AU2024202169A patent/AU2024202169A1/en active Pending
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11379178B2 (en) | Loudness control for user interactivity in audio coding systems | |
US11568881B2 (en) | Methods and systems for generating and rendering object based audio with conditional rendering metadata | |
US11950080B2 (en) | Method and device for processing audio signal, using metadata | |
US11929082B2 (en) | Audio encoder and an audio decoder |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| FEPP | Fee payment procedure | Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
| AS | Assignment | Owner name: FRAUNHOFER-GESELLSCHAFT ZUR FOERDERUNG DER ANGEWANDTEN FORSCHUNG E.V., GERMANY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KUCH, FABIAN;UHLE, CHRISTIAN;KRATSCHMER, MICHAEL;AND OTHERS;SIGNING DATES FROM 20180113 TO 20180131;REEL/FRAME:049324/0448 |
| STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED |
| STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
| STPP | Information on status: patent application and granting procedure in general | Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
| STPP | Information on status: patent application and granting procedure in general | Free format text: AWAITING TC RESP., ISSUE FEE NOT PAID |
| STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |