CN106023999B - Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters - Google Patents

Info

Publication number
CN106023999B
CN106023999B CN201610541939.8A CN201610541939A CN106023999B CN 106023999 B CN106023999 B CN 106023999B CN 201610541939 A CN201610541939 A CN 201610541939A CN 106023999 B CN106023999 B CN 106023999B
Authority
CN
China
Prior art keywords
spatial parameter
audio
coding
decoding
dimensional
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610541939.8A
Other languages
Chinese (zh)
Other versions
CN106023999A (en)
Inventor
胡瑞敏
杨乘
王晓晨
杜鹏慧
苏柳月
武庭照
陈玮
杨玉红
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University WHU
Original Assignee
Wuhan University WHU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University WHU filed Critical Wuhan University WHU
Priority to CN201610541939.8A priority Critical patent/CN106023999B/en
Publication of CN106023999A publication Critical patent/CN106023999A/en
Application granted granted Critical
Publication of CN106023999B publication Critical patent/CN106023999B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/008 Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention provides an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters. The encoder takes as input the three-dimensional audio signal, its spatial side information, and the number of the audio object to which each spatial parameter belongs, and successively applies clustering, quantization, intra-frame coding, and inter-frame differential coding to the spatial parameters. The decoder successively applies inter-frame differential decoding, intra-frame decoding, inverse quantization, and spatial parameter mapping. Because the spatial parameters of different sub-bands of the same sound source within the same frame are similar, clustering them yields a higher compression ratio for the spatial parameters of three-dimensional audio.

Description

Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters
Technical field
The present invention relates to the field of digital audio and, in response to the need for a higher compression ratio of three-dimensional audio spatial parameters, more particularly to an encoding and decoding method and system for improving that compression ratio.
Background
At the end of 2009, the 3D film Avatar topped the box office in more than 30 countries; by early September 2010 its cumulative worldwide box office exceeded 2.7 billion US dollars. Avatar achieved such a box-office result because the new 3D special-effects production technology it adopted had a striking effect on the audience's senses.
To give listeners a more immersive experience and a more realistic sound field in 3D space, spatial audio object coding (SAOC), directional audio coding (DirAC), and spatially squeezed surround audio coding (S3AC) have been proposed. As 3D spatial resolution increases and the number of channels or objects grows, the bit rate of the spatial parameters rises sharply. For example, in the spatial localization quantization point (SLQP) method of S3AC coding, the spatial parameters take 18 kbps per object, so 16 sound objects need 16 × 18 = 288 kbps for the spatial parameters alone. Reducing the bit rate of the spatial parameters in 3D audio coding is therefore an urgent problem.
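The 288 kbps figure above is a simple product of the per-object rate and the object count. A minimal Python sketch of that arithmetic follows; the function name and constant are illustrative and do not come from the patent.

```python
# Per-object spatial parameter rate quoted in the text for the S3AC SLQP method.
BITRATE_PER_OBJECT_KBPS = 18


def spatial_parameter_bitrate(num_objects, per_object_kbps=BITRATE_PER_OBJECT_KBPS):
    """Total spatial parameter bit rate when every object is coded independently."""
    return num_objects * per_object_kbps


total_kbps = spatial_parameter_bitrate(16)
print(total_kbps)  # 288
```

The linear growth with the number of objects is what motivates compressing the spatial parameters themselves.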
The spatial parameter compression methods of BCC, MPEG Surround, and S3AC exploit the correlation between adjacent frames, so the bit rate of the spatial parameters can be reduced by differential coding. These methods remove the inter-frame redundancy of spatial parameters in the same frequency band across adjacent frames, but intra-frame redundancy remains: within a single frame, the spatial parameters of different frequency bands of the same sound source are still redundant. If this intra-frame redundancy can be removed, the spatial parameter bit rate can be compressed further.
Summary of the invention
In view of the shortcomings of the above prior art in compressing 3D audio spatial parameters, the object of the invention is to provide a new object-based spatial parameter compression method for 3D audio. The method relies on the property that, within one frame, different frequency bands of the same sound source have the same spatial parameters. It can remove, to a large extent, the intra-frame redundancy of the spatial parameters that existing spatial parameter compression methods do not consider, and thereby further reduce the spatial parameter bit rate.
The technical solution of the invention provides an encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoding process and a decoding process. The encoding process comprises the following steps:
Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs; the time-domain signal of the three-dimensional audio is transformed to the frequency domain, as follows.
Let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ_1(n,f), φ_1(n,f), r_1(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); and let the number of the audio object to which a spatial parameter belongs be Index(n,f). Transforming s(t) to the frequency domain gives the frequency-domain signal S(n,f) of the three-dimensional audio, comprising S_1(n,f), S_2(n,f), …, S_k(n,f), …, S_K(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n is the frame index and f is the frequency index.
Step C2: perform intra-frame coding on the input spatial parameters, as follows: cluster the spatial parameters of the different frequency bands that belong to the same audio object within the same frame; quantize the clustered spatial parameters; perform intra-frame coding on the quantized spatial parameters.
Step C3: perform inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoding process comprises the following steps:
Step D1: perform inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
Step D2: perform intra-frame decoding on the spatial parameters, as follows: perform intra-frame decoding on the spatial parameters; inverse-quantize the intra-frame-decoded spatial parameters; restore the original spatial parameters.
Step D3: transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain its time-domain representation s'(t), where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original object numbers Index(n,f) together constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the number of the audio object to which each spatial parameter belongs.
Further, in step C2, the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters for which n is the same and Index(n,f) has the same value but f differs are clustered, generating the clustered spatial parameters.
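The clustering rule can be pictured as grouping, within one frame, all bands that share the same Index(n, f) value and keeping one parameter triple per group. The Python sketch below is only an illustration under stated assumptions: the text does not say how the representative of a cluster is computed, so a component-wise mean is assumed here, and the names `cluster_frame`, `theta`, `phi`, and `r` are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def cluster_frame(params, index):
    """Cluster one frame's per-band spatial parameters by audio object.

    params: list of (theta, phi, r) tuples, one per frequency band f.
    index:  list of object numbers, index[f] = Index(n, f) for this frame.
    Returns {object_number: (theta, phi, r)}, one representative per object.
    """
    groups = defaultdict(list)
    for band_params, obj in zip(params, index):
        groups[obj].append(band_params)
    # Assumption: the representative is the component-wise mean of the group.
    return {obj: tuple(mean(component) for component in zip(*members))
            for obj, members in groups.items()}


# Four bands, three of which belong to object 1 and one to object 2.
frame_params = [(30.0, 10.0, 1.0), (30.0, 10.0, 1.0), (-45.0, 0.0, 2.0), (30.0, 10.0, 1.0)]
frame_index = [1, 1, 2, 1]
clustered = cluster_frame(frame_params, frame_index)
print(clustered)  # {1: (30.0, 10.0, 1.0), 2: (-45.0, 0.0, 2.0)}
```

In this toy frame, four per-band triples collapse to two, which is the intra-frame redundancy the method aims to remove. A plain mean ignores angle wrap-around, so a real implementation would need a more careful choice.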
Further, in step D2, the clustered spatial parameters of the different frequency bands of the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters.
Further, in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
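As an illustration of the direct-quantization option, the following Python sketch applies uniform scalar quantization to one (θ, φ, r) triple and then inverts it. The step sizes are hypothetical values chosen for the example; the text does not specify them, and perceptual quantization is not shown.

```python
def quantize(value, step):
    """Direct (uniform) quantization: integer index of the nearest multiple of step."""
    return round(value / step)


def dequantize(level, step):
    """Inverse quantization: reconstruct the value at the centre of the cell."""
    return level * step


# Hypothetical step sizes: 5 degrees for both angles, 0.1 m for distance.
STEPS = (5.0, 5.0, 0.1)


def quantize_params(params, steps=STEPS):
    return tuple(quantize(v, s) for v, s in zip(params, steps))


def dequantize_params(levels, steps=STEPS):
    return tuple(dequantize(q, s) for q, s in zip(levels, steps))


levels = quantize_params((32.0, 11.0, 1.04))
restored = dequantize_params(levels)
print(levels)  # (6, 2, 10)
```

With uniform steps, the reconstruction error of each component is at most half a step, which is the price paid for transmitting small integers instead of real values.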
Further, in step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; the intra-frame-decoded spatial parameters are inverse-quantized, the inverse quantization being the inverse of either the perceptual quantization or the direct quantization.
An encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoder and a decoder.
The encoder comprises the following modules:
A time-frequency transform module, whose input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs, and which transforms the time-domain signal of the three-dimensional audio to the frequency domain. Specifically, let the time-domain signal of the three-dimensional audio be s(t), comprising s_1(t), s_2(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ_1(n,f), φ_1(n,f), r_1(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); and let the number of the audio object to which a spatial parameter belongs be Index(n,f). Transforming s(t) to the frequency domain gives the frequency-domain signal S(n,f) of the three-dimensional audio, comprising S_1(n,f), S_2(n,f), …, S_k(n,f), …, S_K(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, where θ is the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n is the frame index and f is the frequency index.
An intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object within the same frame, quantizing the clustered spatial parameters, and performing intra-frame coding on the quantized spatial parameters.
An inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, for performing inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters, inverse-quantizing the intra-frame-decoded spatial parameters, and restoring the original spatial parameters.
An inverse time-frequency transform module, for transforming the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain its time-domain representation s'(t), where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original object numbers Index(n,f) together constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the number of the audio object to which each spatial parameter belongs.
Further, the intra-frame coding module includes a clustering module, which clusters the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters for which n is the same and Index(n,f) has the same value but f differs are clustered, generating the clustered spatial parameters.
Further, the intra-frame decoding module includes a restoration module, which maps the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.
Further, the intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
Further, the intra-frame decoding module includes an inverse quantization module; the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding, and the inverse quantization module inverse-quantizes the intra-frame-decoded spatial parameters, the inverse quantization being the inverse of either the perceptual quantization or the direct quantization.
The beneficial effects of the invention are as follows. Based on the fact that different frequency bands of the same sound source in the same frame have the same spatial parameters, the encoder applies spatial parameter clustering, spatial parameter quantization, and spatial parameter intra-frame coding, and then applies inter-frame differential coding to the spatial parameters, which further compresses the three-dimensional audio spatial parameter bit rate and improves the spatial parameter compression ratio. The decoder decodes the three-dimensional audio bitstream: it applies inter-frame differential decoding and intra-frame decoding to the spatial parameters, inverse-quantizes the intra-frame-decoded spatial parameters, and maps the clustered spatial parameters back, obtaining the audio signal of the three-dimensional audio, the spatial parameters, and the number of the audio object to which each spatial parameter belongs. By adding intra-frame encoding and decoding, the invention therefore overcomes the defect that existing spatial parameter compression methods do not consider the intra-frame redundancy of the spatial parameters, further compresses the three-dimensional audio spatial parameter bit rate, and improves the spatial parameter compression ratio.
Brief description of the drawings
Fig. 1 is a flow chart of the encoder of an embodiment of the invention;
Fig. 2 is a flow chart of the decoder of an embodiment of the invention.
Detailed description of the embodiments
The technical solution of the invention is described in detail below with reference to the drawings and an embodiment (steps C1 to C3 form the encoding process and steps D1 to D3 the decoding process).
Referring to Fig. 1, the encoder of the embodiment performs the following procedure:
Step C1: transform the time-domain signal s(t) of the three-dimensional audio to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio.
The input of the encoder is the three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs. The time-domain representation of the audio signal is s(t), made up of s_1(t), s_2(t), …, s_K(t), where t denotes time. The spatial parameters of the three-dimensional audio, that is, the spatial parameters of each time-frequency point, (θ(n,f), φ(n,f), r(n,f)), are made up of (θ_1(n,f), φ_1(n,f), r_1(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)). The number of the audio object to which a spatial parameter belongs is denoted Index(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal, and (θ_k(n,f), φ_k(n,f), r_k(n,f)) is its corresponding spatial parameter, composed of the direction parameters (horizontal angle θ, elevation angle φ) and the distance parameter r. k takes the values 1, 2, …, K, where K is the total number of original directional audio signals.
To transform the time-domain signal to the frequency domain, the short-time Fourier transform (STFT) may be applied to s(t), giving the frequency-domain signal S(n,f) of the three-dimensional audio; S(n,f) is made up of S_1(n,f), S_2(n,f), …, S_K(n,f), where S_k(n,f) is the frequency-domain representation of the k-th directional audio signal, n is the frame index, and f is the frequency index. In a specific implementation, other transforms such as the MDCT or the Hilbert-Huang transform may be used instead.
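To make the indices n (frame) and f (frequency) concrete, here is a minimal pure-Python STFT sketch. The frame length, hop size, and Hann window are assumptions made for the example; the text names the STFT but gives none of its settings, and a practical codec would use an FFT library rather than this direct DFT.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return S[n][f]: the DFT of Hann-windowed frame n at frequency bin f."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        segment = [signal[start + i] * window[i] for i in range(frame_len)]
        # Keep only the non-negative frequency bins of a real-valued signal.
        spectrum = [
            sum(segment[i] * cmath.exp(-2j * math.pi * f * i / frame_len)
                for i in range(frame_len))
            for f in range(frame_len // 2 + 1)
        ]
        frames.append(spectrum)
    return frames


# A sinusoid with exactly 2 cycles per 8-sample frame has its peak in bin f = 2.
s_k = [math.sin(2 * math.pi * 2 * t / 8) for t in range(32)]
S_k = stft(s_k)
peak_bin = max(range(len(S_k[0])), key=lambda f: abs(S_k[0][f]))
print(len(S_k), len(S_k[0]), peak_bin)  # 7 5 2
```

Each entry S_k[n][f] plays the role of S_k(n, f) for one directional signal, and the spatial parameters and Index(n, f) are attached to the same (n, f) grid.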
In the embodiment, K = 8 and f = 1, 2, …, 40. The frequency-domain signals of the 8 directional audio signals s_1(t), s_2(t), …, s_8(t) are (S_1(n,f), S_2(n,f), …, S_8(n,f)), their corresponding spatial parameters are (θ_k(n,f), φ_k(n,f), r_k(n,f)) for k = 1, 2, …, 8, and the number of the object to which these spatial parameters belong is Index(n,f).
Step C2: perform intra-frame coding on the spatial parameters. In the embodiment, step C2 consists of the following sub-steps:
C21: cluster the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters for which n is the same and Index(n,f) has the same value but f differs are clustered, generating the clustered spatial parameters;
C22: quantize the clustered spatial parameters, using either perceptual quantization or direct quantization;
C23: perform intra-frame coding on the quantized spatial parameters, using either perceptual coding or direct coding.
Step C3: perform inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream. In the embodiment, the coding method used in step C3 is differential coding.
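Inter-frame differential coding and its inverse can be sketched in a few lines of Python. This is an illustrative sketch only: it assumes each frame carries the same number of quantized values in the same order, and it omits the entropy coding that would normally follow; the function names and sample values are made up.

```python
def diff_encode(frames):
    """Inter-frame differential encoding: first frame as-is, then frame-to-frame deltas."""
    encoded = [list(frames[0])]
    for prev, cur in zip(frames, frames[1:]):
        encoded.append([c - p for c, p in zip(cur, prev)])
    return encoded


def diff_decode(encoded):
    """Inverse operation: accumulate the deltas onto the previously decoded frame."""
    decoded = [list(encoded[0])]
    for delta in encoded[1:]:
        decoded.append([p + d for p, d in zip(decoded[-1], delta)])
    return decoded


# Quantized (theta, phi, r) levels of one object over four frames (made-up values).
levels = [[6, 2, 10], [6, 2, 10], [7, 2, 10], [7, 2, 11]]
residual = diff_encode(levels)
print(residual)  # [[6, 2, 10], [0, 0, 0], [1, 0, 0], [0, 0, 1]]
```

When a source moves slowly, most residuals are zero or close to it, so they can be represented with fewer bits than the absolute values.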
Referring to Fig. 2, the decoder of the embodiment performs the following procedure:
Step D1: perform inter-frame decoding on the spatial parameters. In the embodiment, the decoding method used in step D1 is differential decoding.
Step D2: perform intra-frame decoding on the spatial parameters. In the embodiment, step D2 consists of the following sub-steps:
D21: perform intra-frame decoding on the spatial parameters, using either perceptual decoding or direct decoding;
D22: inverse-quantize the intra-frame-decoded spatial parameters, using the inverse of either the perceptual quantization or the direct quantization;
D23: map the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.
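Sub-step D23 amounts to a lookup: every band receives the decoded parameters of the object named by Index(n, f). The Python sketch below shows this for one frame; the dictionary layout and the name `map_to_bands` are assumptions made for illustration.

```python
def map_to_bands(clustered, index):
    """Give every frequency band the decoded parameters of the object it belongs to.

    clustered: {object_number: (theta, phi, r)} for one frame, after inverse quantization.
    index:     index[f] = Index(n, f), the object number of band f in this frame.
    """
    return [clustered[obj] for obj in index]


clustered = {1: (30.0, 10.0, 1.0), 2: (-45.0, 0.0, 2.0)}
frame_index = [1, 1, 2, 1]
per_band = map_to_bands(clustered, frame_index)
print(per_band[2])  # (-45.0, 0.0, 2.0)
```

The mapping only works if the decoder has the same Index(n, f) values as the encoder, which is why the object numbers are part of the decoded output.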
Step D3: transform the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain its time-domain representation s'(t), where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original object numbers Index(n,f) together constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the number of the audio object to which each spatial parameter belongs. In a specific implementation, loudspeakers in different configurations or headphones can then be used to reconstruct the three-dimensional sound field and restore the original three-dimensional audio.
The embodiment transforms the 8 decoded directional audio signals (S'_1(n,f), S'_2(n,f), …, S'_8(n,f)) to the time domain, obtaining the 8 directional audio signals s'_1(t), s'_2(t), …, s'_8(t). Together with the decoded spatial parameters and the original object numbers Index(n,f), these constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the number of the audio object to which each spatial parameter belongs. The embodiment uses headphones to play back the three-dimensional audio signal with distance side information. Headphone reproduction of three-dimensional audio requires a head-related transfer function (HRTF) library. The PKU&IOA HRTF library was measured in both the far field and the near field, with the distance r ranging from 20 cm to 160 cm and with resolutions of 5° for the horizontal angle and 10° for the elevation angle, so the PKU&IOA HRTF library is selected to reconstruct the three-dimensional audio that has undergone intra-frame and inter-frame compression.
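Rendering a decoded (θ, φ, r) triple over headphones means picking an HRTF measured near that position. The Python sketch below snaps a position to a grid built only from the figures quoted above (5° horizontal steps, 10° elevation steps, distances from 20 cm to 160 cm). It is a hypothetical helper: the actual measurement positions of the PKU&IOA library are not listed in the text, so the grid, the clamping of distance, and the key layout are all assumptions.

```python
import math

AZIMUTH_STEP_DEG = 5      # horizontal-angle resolution quoted in the text
ELEVATION_STEP_DEG = 10   # elevation-angle resolution quoted in the text
MIN_DISTANCE_CM, MAX_DISTANCE_CM = 20, 160


def snap(value, step):
    """Round to the nearest multiple of step (ties round up)."""
    return int(math.floor(value / step + 0.5)) * step


def nearest_hrtf_key(theta_deg, phi_deg, r_cm):
    """Return an (azimuth, elevation, distance) key on the assumed measurement grid."""
    azimuth = snap(theta_deg, AZIMUTH_STEP_DEG) % 360
    elevation = snap(phi_deg, ELEVATION_STEP_DEG)
    # Assumption: distances outside the measured range are clamped to its ends.
    distance = min(max(r_cm, MIN_DISTANCE_CM), MAX_DISTANCE_CM)
    return azimuth, elevation, distance


print(nearest_hrtf_key(32.0, 11.0, 250.0))  # (30, 10, 160)
```

A real renderer would look up the left and right impulse responses stored under such a key and convolve them with the decoded signal s'_k(t).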
Experimental comparison shows that the three-dimensional audio compression method with added intra-frame coding compresses better than the original method with inter-frame coding only: the compression ratio is higher while the quality of the reconstructed audio is maintained. Because the added intra-frame coding removes intra-frame redundancy, the method improves the compression ratio of the three-dimensional spatial parameters and reduces the spatial parameter bit rate while preserving the quality of the reconstructed three-dimensional audio.
The method provided by the invention can be run automatically using software, and can also be implemented as a corresponding modular system. The invention provides an encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoder and a decoder. The encoder comprises the following modules:
A time-frequency transform module, whose input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number Index(n,f) of the audio object to which each spatial parameter belongs, and which transforms the time-domain signal s(t) of the three-dimensional audio to the frequency domain to obtain the frequency-domain signal S(n,f). The signals s_k(t) and S_k(n,f), the spatial parameters (θ_k(n,f), φ_k(n,f), r_k(n,f)), and the indices k, n, and f are defined as in step C1.
An intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object within the same frame, quantizing the clustered spatial parameters, and performing intra-frame coding on the quantized spatial parameters.
An inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream; the coding method is differential coding.
The decoder comprises the following modules:
An inter-frame decoding module, for performing inter-frame decoding on the spatial parameters; the decoding method is differential decoding.
An intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters, inverse-quantizing the intra-frame-decoded spatial parameters, and restoring the original spatial parameters.
An inverse time-frequency transform module, for transforming the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain its time-domain representation s'(t), where S'(n,f) is S(n,f) after encoding and decoding and s'(t) is s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters obtained in step D2, and the original object numbers Index(n,f) together constitute the decoded three-dimensional audio containing n objects: its audio signal, its spatial parameters, and the number of the audio object to which each spatial parameter belongs.
The intra-frame coding module includes a clustering module, which clusters the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters for which n is the same and Index(n,f) has the same value but f differs are clustered, generating the clustered spatial parameters.
The intra-frame decoding module includes a restoration module, which maps the clustered spatial parameters of the different frequency bands of the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters.
The intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.
The intra-frame decoding module includes an inverse quantization module; the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding, and the inverse quantization module inverse-quantizes the intra-frame-decoded spatial parameters, the inverse quantization being the inverse of either the perceptual quantization or the direct quantization.
Each module is implemented in the same way as the corresponding method step, so the details are not repeated here.
The specific embodiment described herein merely illustrates the invention by way of example. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute similar approaches, without departing from the content of the invention or going beyond the scope of the appended claims.

Claims (10)

1. a kind of for improving the decoding method of three-dimensional audio spatial parameter compression ratio, which is characterized in that including cataloged procedure And decoding process, the cataloged procedure the following steps are included:
Step C1: input a three-dimensional audio signal comprising n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong, and transform the three-dimensional audio time-domain signal to the frequency domain, specifically as follows:
Let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s1(t), s2(t), …, sk(t), …, sK(t); let the spatial parameters of the three-dimensional audio comprise one spatial parameter for each anisotropic audio signal; and let the number of the audio object to which a spatial parameter belongs be Index(n, f). The time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, where S(n, f) comprises S1(n, f), S2(n, f), …, Sk(n, f), …, SK(n, f). Here, sk(t) is the time-domain representation of the k-th anisotropic audio signal and t denotes time; Sk(n, f) is the frequency-domain representation of the k-th anisotropic audio signal; the spatial parameter corresponding to the k-th anisotropic audio signal consists of the horizontal angle θ, the elevation angle, and the distance side information r; k takes the values 1, 2, …, K, where K is the total number of original anisotropic audio signals; the value of Index(n, f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;
Step C2: perform intra-frame coding on the input spatial parameters, implemented as follows: cluster the spatial parameters of the different frequency bands that belong to the same audio object in the same frame; quantize the clustered spatial parameters; and perform intra-frame coding on the quantized spatial parameters;
Step C3: perform inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;
The decoding process comprises the following steps:
Step D1: perform inter-frame decoding on the spatial parameters, the decoding method being differential decoding;
Step D2: perform intra-frame decoding on the spatial parameters, implemented as follows: perform intra-frame decoding on the spatial parameters; perform inverse quantization on the intra-frame-decoded spatial parameters; and restore the original spatial parameters;
Step D3: transform the frequency-domain representation S′(n, f) of the audio signal to the time domain to obtain the time-domain representation s′(t) of the audio signal, where S′(n, f) is the signal S(n, f) after encoding and decoding and s′(t) is the signal s(t) after encoding and decoding; the time-domain representation s′(t) of the audio signal comprising n objects, the spatial parameters obtained in step D2, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio, namely the audio signal comprising n objects, the spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
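The inter-frame differential coding of step C3 and its inverse in step D1 can be illustrated with a minimal sketch. The claim states only that the method is differential coding, so the details here are assumptions: each frame is taken to be a list of already quantized integer parameter values in a fixed order, the first frame is sent unchanged, and the names `diff_encode` and `diff_decode` are hypothetical.

```python
from typing import List


def diff_encode(frames: List[List[int]]) -> List[List[int]]:
    """Keep the first frame as-is; store each later frame as the
    element-wise difference from the frame before it (assumed layout)."""
    encoded: List[List[int]] = []
    previous = None
    for frame in frames:
        if previous is None:
            encoded.append(list(frame))
        else:
            encoded.append([cur - prev for cur, prev in zip(frame, previous)])
        previous = frame
    return encoded


def diff_decode(encoded: List[List[int]]) -> List[List[int]]:
    """Invert diff_encode by adding each residual to the previously
    reconstructed frame."""
    decoded: List[List[int]] = []
    previous = None
    for residual in encoded:
        if previous is None:
            frame = list(residual)
        else:
            frame = [res + prev for res, prev in zip(residual, previous)]
        decoded.append(frame)
        previous = frame
    return decoded


# Three frames of two quantized parameter values each (illustrative data).
frames = [[10, 42], [11, 42], [11, 41]]
residuals = diff_encode(frames)          # [[10, 42], [1, 0], [0, -1]]
assert diff_decode(residuals) == frames  # lossless round trip
```

When parameters change slowly from frame to frame, the residuals are small integers, which is what makes them cheaper to code than the raw values.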
2. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step C2, the spatial parameters of the different frequency bands that belong to the same audio object in the same frame are clustered; that is, the spatial parameters that have the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters.
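The clustering in claim 2 can be sketched as a grouping of one frame's per-band parameters by object number. The claim does not say how the clustered parameter is formed from its members, so taking the component-wise mean is an assumption made only for illustration (a naive mean also ignores angle wrap-around). The names `Param` and `cluster_frame` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (horizontal angle, elevation angle, distance); layout is an assumption.
Param = Tuple[float, float, float]


def cluster_frame(params: List[Param], index: List[int]) -> Dict[int, Param]:
    """Cluster the parameters of one frame n.

    params[f] is the spatial parameter of frequency band f and index[f]
    is Index(n, f), the number of the audio object that band belongs to.
    Bands with the same object number are merged into one parameter,
    here by a component-wise mean (assumed; not stated in the claim).
    """
    groups: Dict[int, List[Param]] = defaultdict(list)
    for param, obj in zip(params, index):
        groups[obj].append(param)

    clustered: Dict[int, Param] = {}
    for obj, members in groups.items():
        count = len(members)
        clustered[obj] = tuple(
            sum(member[i] for member in members) / count for i in range(3)
        )
    return clustered


# Bands 0 and 1 belong to object 1, band 2 belongs to object 2.
params = [(30.0, 0.0, 1.0), (30.0, 0.0, 1.0), (-60.0, 10.0, 2.0)]
index = [1, 1, 2]
clustered = cluster_frame(params, index)  # one parameter per object
```

Three per-band parameters are reduced to two per-object parameters, which is where the reduction in data to be quantized and coded comes from.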
3. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step D2, the clustered spatial parameters of the different frequency bands of the same audio object in the same frame are mapped back to their corresponding frequency bands and thereby restored to the original spatial parameters.
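The restoration in claim 3 amounts to a lookup: every frequency band receives the clustered parameter of the object it belongs to, using Index(n, f) as the key. The sketch below assumes the clustered parameters are held in a dictionary keyed by object number; `Param` and `restore_frame` are hypothetical names.

```python
from typing import Dict, List, Tuple

# (horizontal angle, elevation angle, distance); layout is an assumption.
Param = Tuple[float, float, float]


def restore_frame(clustered: Dict[int, Param], index: List[int]) -> List[Param]:
    """Map each clustered parameter back to its frequency bands.

    index[f] is Index(n, f) for the current frame; band f receives the
    parameter of the object it belongs to.
    """
    return [clustered[obj] for obj in index]


clustered = {1: (30.0, 0.0, 1.0), 2: (-60.0, 10.0, 2.0)}
index = [1, 1, 2, 1]
per_band = restore_frame(clustered, index)  # four per-band parameters
```

This only works because the decoder also receives the original Index(n, f), which is why the claims carry it through unchanged.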
4. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; and intra-frame coding is performed on the quantized spatial parameters, the coding being perceptual coding or direct coding.
5. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:
in step D2, intra-frame decoding is performed on the spatial parameters, the decoding being perceptual decoding or direct decoding; and inverse quantization is performed on the intra-frame-decoded spatial parameters, the inverse quantization being the inverse of the perceptual quantization or the inverse of the direct quantization.
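As an illustration of the direct quantization option in claims 4 and 5, the following sketch uses uniform scalar quantization of a single angle. The claims do not define the quantizer, so the uniform design, the 5-degree step size, and the names `quantize` and `dequantize` are all assumptions; perceptual quantization would instead use non-uniform steps and is not shown.

```python
import math


def quantize(value: float, step: float) -> int:
    """Uniform scalar quantization: index of the nearest multiple of step."""
    return math.floor(value / step + 0.5)


def dequantize(index: int, step: float) -> float:
    """Inverse quantization: reconstruct the value at the chosen level."""
    return index * step


STEP_DEG = 5.0  # hypothetical step size for the horizontal angle

azimuth = 31.0
q = quantize(azimuth, STEP_DEG)      # 6
restored = dequantize(q, STEP_DEG)   # 30.0
# The reconstruction error never exceeds half a step.
assert abs(restored - azimuth) <= STEP_DEG / 2
```

The integer indices produced this way are what the intra-frame and inter-frame coding stages would operate on.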
6. A coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, characterized by comprising an encoder and a decoder,
the encoder comprising the following modules:
a time-frequency transform module, for inputting a three-dimensional audio signal comprising n objects, the three-dimensional audio spatial parameters, and the numbers of the audio objects to which the spatial parameters belong, and for transforming the three-dimensional audio time-domain signal to the frequency domain; specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s1(t), s2(t), …, sk(t), …, sK(t); let the spatial parameters of the three-dimensional audio comprise one spatial parameter for each anisotropic audio signal; and let the number of the audio object to which a spatial parameter belongs be Index(n, f); the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n, f) of the three-dimensional audio, where S(n, f) comprises S1(n, f), S2(n, f), …, Sk(n, f), …, SK(n, f); here, sk(t) is the time-domain representation of the k-th anisotropic audio signal and t denotes time; Sk(n, f) is the frequency-domain representation of the k-th anisotropic audio signal; the spatial parameter corresponding to the k-th anisotropic audio signal consists of the horizontal angle θ, the elevation angle, and the distance side information r; k takes the values 1, 2, …, K, where K is the total number of original anisotropic audio signals; the value of Index(n, f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;
an intra-frame coding module, for performing intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands that belong to the same audio object in the same frame, quantizing the clustered spatial parameters, and performing intra-frame coding on the quantized spatial parameters;
an inter-frame coding module, for performing inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;
the decoder comprising the following modules:
an inter-frame decoding module, for performing inter-frame decoding on the spatial parameters, the decoding method being differential decoding;
an intra-frame decoding module, for performing intra-frame decoding on the spatial parameters, including performing intra-frame decoding on the spatial parameters, performing inverse quantization on the intra-frame-decoded spatial parameters, and restoring the original spatial parameters;
an inverse time-frequency transform module, for transforming the frequency-domain representation S′(n, f) of the audio signal to the time domain to obtain the time-domain representation s′(t) of the audio signal, where S′(n, f) is the signal S(n, f) after encoding and decoding and s′(t) is the signal s(t) after encoding and decoding; the time-domain representation s′(t) of the audio signal comprising n objects, the spatial parameters obtained by the intra-frame decoding module, and the original numbers Index(n, f) of the audio objects to which the spatial parameters belong together constitute the decoded three-dimensional audio, namely the audio signal comprising n objects, the spatial parameters, and the numbers of the audio objects to which the spatial parameters belong.
7. The coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame coding module comprises a clustering module, the clustering module being used to cluster the spatial parameters of the different frequency bands that belong to the same audio object in the same frame; that is, the spatial parameters that have the same n and the same value of Index(n, f) but different f are clustered to generate the clustered spatial parameters.
8. The coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame decoding module comprises a restoration module, the restoration module being used to map the clustered spatial parameters of the different frequency bands of the same audio object in the same frame back to their corresponding frequency bands, thereby restoring the original spatial parameters.
9. The coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame coding module comprises a quantization module, the quantization module being used to quantize the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; intra-frame coding is performed on the quantized spatial parameters, the coding being perceptual coding or direct coding.
10. The coding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame decoding module comprises an inverse quantization module, the inverse quantization module being used to perform intra-frame decoding on the spatial parameters, the decoding being perceptual decoding or direct decoding; inverse quantization is performed on the intra-frame-decoded spatial parameters, the inverse quantization being the inverse of the perceptual quantization or the inverse of the direct quantization.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610541939.8A CN106023999B (en) 2016-07-11 2016-07-11 For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610541939.8A CN106023999B (en) 2016-07-11 2016-07-11 For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio

Publications (2)

Publication Number Publication Date
CN106023999A CN106023999A (en) 2016-10-12
CN106023999B true CN106023999B (en) 2019-06-11

Family

ID=57108555

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610541939.8A Active CN106023999B (en) 2016-07-11 2016-07-11 For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio

Country Status (1)

Country Link
CN (1) CN106023999B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108206022B (en) * 2016-12-16 2020-12-18 南京青衿信息科技有限公司 Codec for transmitting three-dimensional acoustic signals by using AES/EBU channel and coding and decoding method thereof
MX2020005045A (en) 2017-11-17 2020-08-20 Fraunhofer Ges Forschung Apparatus and method for encoding or decoding directional audio coding parameters using quantization and entropy coding.
GB2576769A (en) * 2018-08-31 2020-03-04 Nokia Technologies Oy Spatial parameter signalling
GB2578625A (en) * 2018-11-01 2020-05-20 Nokia Technologies Oy Apparatus, methods and computer programs for encoding spatial metadata
GB2586586A (en) * 2019-08-16 2021-03-03 Nokia Technologies Oy Quantization of spatial audio direction parameters
US20240046939A1 (en) * 2020-12-15 2024-02-08 Nokia Technologies Oy Quantizing spatial audio parameters
CN115662448B (en) * 2022-10-17 2023-10-20 深圳市超时代软件有限公司 Method and device for converting audio data coding format

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101521013A (en) * 2009-04-08 2009-09-02 武汉大学 Spatial audio parameter bidirectional interframe predictive coding and decoding devices
CN101609674A (en) * 2008-06-20 2009-12-23 华为技术有限公司 Decoding method, device and system
US7974287B2 (en) * 2006-02-23 2011-07-05 Lg Electronics Inc. Method and apparatus for processing an audio signal
CN102177542A (en) * 2008-10-10 2011-09-07 艾利森电话股份有限公司 Energy conservative multi-channel audio coding
CN103165134A (en) * 2013-04-02 2013-06-19 武汉大学 Coding and decoding device of audio signal high frequency parameter
CN103400582A (en) * 2013-08-13 2013-11-20 武汉大学 Encoding and decoding method and system for multi-channel three-dimensional voice frequency
CN103928030A (en) * 2014-04-30 2014-07-16 武汉大学 Gradable audio coding system and method based on sub-band space attention measure
CN104064194A (en) * 2014-06-30 2014-09-24 武汉大学 Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20070025907A (en) * 2005-08-30 2007-03-08 엘지전자 주식회사 Method of effective bitstream composition for the parameter band number of channel conversion module in multi-channel audio coding

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7974287B2 (en) * 2006-02-23 2011-07-05 Lg Electronics Inc. Method and apparatus for processing an audio signal
CN101609674A (en) * 2008-06-20 2009-12-23 华为技术有限公司 Decoding method, device and system
CN102177542A (en) * 2008-10-10 2011-09-07 艾利森电话股份有限公司 Energy conservative multi-channel audio coding
CN101521013A (en) * 2009-04-08 2009-09-02 武汉大学 Spatial audio parameter bidirectional interframe predictive coding and decoding devices
CN103165134A (en) * 2013-04-02 2013-06-19 武汉大学 Coding and decoding device of audio signal high frequency parameter
CN103400582A (en) * 2013-08-13 2013-11-20 武汉大学 Encoding and decoding method and system for multi-channel three-dimensional voice frequency
CN103928030A (en) * 2014-04-30 2014-07-16 武汉大学 Gradable audio coding system and method based on sub-band space attention measure
CN104064194A (en) * 2014-06-30 2014-09-24 武汉大学 Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
AVS-P10 mobile audio codec standard and key technologies; Hu Ruimin et al.; 《电视技术》 (Video Engineering); 2010-10-31; Vol. 34, No. 10; pp. 4-8

Also Published As

Publication number Publication date
CN106023999A (en) 2016-10-12

Similar Documents

Publication Publication Date Title
CN106023999B (en) For improving the decoding method and system of three-dimensional audio spatial parameter compression ratio
CN111226442B (en) Method of configuring transforms for video compression and computer-readable storage medium
KR101221918B1 (en) A method and an apparatus for processing a signal
CN102342105B (en) For carrying out the Apparatus and method for of Code And Decode to multi-layer video
KR20200100061A (en) Apparatus and method for encoding or decoding directional audio coding parameters using different time/frequency resolutions
CN106415714A (en) Coding independent frames of ambient higher-order ambisonic coefficients
TW200935403A (en) Technique for encoding/decoding of codebook indices for quantized MDCT spectrum in scalable speech and audio codecs
CN106463121A (en) Higher order ambisonics signal compression
CN109448741A (en) A kind of 3D audio coding, coding/decoding method and device
TWI702594B (en) Backward-compatible integration of high frequency reconstruction techniques for audio signals
BRPI0612218A2 (en) adaptive residual audio coding
JP2016529544A (en) Audio encoder, audio decoder, method, and computer program using joint encoded residual signal
JP7413334B2 (en) Backward-compatible integration of harmonic converters for high-frequency reconstruction of audio signals
CN104064194A (en) Parameter coding/decoding method and parameter coding/decoding system used for improving sense of space and sense of distance of three-dimensional audio frequency
TWI820123B (en) Integration of high frequency reconstruction techniques with reduced post-processing delay
CN109996073A (en) A kind of method for compressing image, system, readable storage medium storing program for executing and computer equipment
TWI463483B (en) Method and device of bitrate distribution/truncation for scalable audio coding
WO2015096789A1 (en) Method and device for use in vector quantization encoding/decoding of audio signal
JP2014513813A (en) Adaptive gain-shape rate sharing
CN110660401A (en) Audio object coding and decoding method based on high-low frequency domain resolution switching
CN103065634A (en) Three-dimensional audio space parameter quantification method based on perception characteristic
CN107112020A (en) The parametrization mixing of audio signal
JP6094322B2 (en) Orthogonal transformation device, orthogonal transformation method, computer program for orthogonal transformation, and audio decoding device
CN112365896B (en) Object-oriented encoding method based on stack type sparse self-encoder
CN104347077B (en) A kind of stereo coding/decoding method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant