CN106023999B

CN106023999B - Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters

Info

Publication number: CN106023999B
Application number: CN201610541939.8A
Authority: CN
Inventors: 胡瑞敏; 杨乘; 王晓晨; 杜鹏慧; 苏柳月; 武庭照; 陈玮; 杨玉红
Original assignee: Wuhan University WHU
Current assignee: Wuhan University WHU
Priority date: 2016-07-11
Filing date: 2016-07-11
Publication date: 2019-06-11
Anticipated expiration: 2036-07-11
Also published as: CN106023999A

Abstract

The present invention provides an encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters. The encoder takes as input the audio signal of the three-dimensional audio, its spatial side information, and the number of the audio object to which each spatial parameter belongs, and successively applies clustering, quantization, intra-frame coding, and inter-frame differential coding to the spatial parameters. The decoder successively applies inter-frame differential decoding, intra-frame decoding, inverse quantization, and spatial parameter mapping. The invention relies on the observation that the spatial parameters of different sub-bands of the same sound source within the same frame are similar, and uses spatial parameter clustering to obtain a higher compression ratio for three-dimensional audio spatial parameters.

Description

Encoding and decoding method and system for improving the compression ratio of three-dimensional audio spatial parameters

Technical field

The present invention relates to the field of digital audio and, in response to the need to improve the compression ratio of three-dimensional audio spatial parameters, more particularly to an encoding and decoding method and system for improving that compression ratio.

Background art

At the end of 2009, the three-dimensional film "Avatar" topped the box office in more than 30 countries worldwide; by early September 2010 its cumulative worldwide box office had exceeded 2.7 billion US dollars. "Avatar" achieved this result because the new three-dimensional special-effects technology it adopted delivered a striking sensory impact.

To give listeners a more immersive experience and a more realistic sound field in 3D space, Spatial Audio Object Coding (SAOC), Directional Audio Coding (DirAC), and Spatially Squeezed Surround Audio Coding (S3AC) have been proposed. As 3D spatial resolution rises and the number of channels or objects grows, the bit rate of the spatial parameters also rises sharply. For example, in the spatial localization quantization point (SLQP) method of S3AC, the bit rate of the spatial parameters is 18 kbps per object, so 16 sound objects require 288 kbps for spatial parameters. Reducing the bit rate of the spatial parameters in 3D audio coding is therefore an urgent need.
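The arithmetic behind the 288 kbps figure can be checked with a minimal sketch. The function name and constants below are illustrative only; the 18 kbps per-object figure and the 16-object count are the ones quoted above.

```python
# Side-information bit rate when every object carries its own spatial parameters.
PER_OBJECT_KBPS = 18  # SLQP figure quoted in the text
NUM_OBJECTS = 16      # object count used in the text's example


def spatial_param_bitrate_kbps(num_objects, per_object_kbps=PER_OBJECT_KBPS):
    """Total spatial-parameter bit rate in kbps (hypothetical helper)."""
    return num_objects * per_object_kbps


total_kbps = spatial_param_bitrate_kbps(NUM_OBJECTS)
print(total_kbps)  # 288
```

The cost grows linearly with the number of objects, which is why the side information becomes a bottleneck as scenes get denser.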

Existing spatial parameter compression methods such as BCC, MPEG Surround, and S3AC exploit the correlation between adjacent frames and reduce the spatial parameter bit rate through differential coding. These methods remove the inter-frame redundancy of spatial parameters in the same frequency band across adjacent frames, but the intra-frame redundancy among the spatial parameters of different frequency bands of the same sound source within one frame remains. If this intra-frame redundancy can be removed, the spatial parameter bit rate can be compressed further.

Summary of the invention

The object of the invention is to address the above deficiency of the prior art in compressing 3D audio spatial parameters by providing a new object-based spatial parameter compression method for 3D audio. The method relies on the property that different frequency bands of the same sound source within the same frame have the same spatial parameters, and can largely remove the intra-frame redundancy of spatial parameters that existing spatial parameter compression methods do not consider, thereby further reducing the spatial parameter bit rate.

The technical solution of the invention provides an encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, comprising an encoding process and a decoding process. The encoding process comprises the following steps:

Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs; the three-dimensional audio time-domain signal is transformed to the frequency domain, as follows:

Let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); let the number of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, θ being the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Step C2: intra-frame coding is performed on the input spatial parameters, as follows: the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; the clustered spatial parameters are quantized; and the quantized spatial parameters are intra-frame coded;

Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;

The decoding process comprises the following steps:

Step D1: inter-frame decoding is performed on the spatial parameters, the decoding method being differential decoding;

Step D2: intra-frame decoding is performed on the spatial parameters, as follows: the spatial parameters are intra-frame decoded; the intra-frame decoded spatial parameters are inverse quantized; and the original spatial parameters (θ(n,f), φ(n,f), r(n,f)) are restored;

Step D3: the frequency-domain representation S'(n,f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is the signal S(n,f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) obtained in step D2, and the number Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters.

Further, in step C2, the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters.

Further, in step D2, the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).

Further, in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.

Further, in step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.

An encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters comprises an encoder and a decoder;

The encoder comprises the following modules:

Time-frequency transform module, which receives an input comprising a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs, and transforms the three-dimensional audio time-domain signal to the frequency domain; specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); let the number of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, θ being the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Intra-frame coding module, which performs intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; quantizing the clustered spatial parameters; and intra-frame coding the quantized spatial parameters;

Inter-frame coding module, which performs inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;

The decoder comprises the following modules:

Inter-frame decoding module, which performs inter-frame decoding on the spatial parameters, the decoding method being differential decoding;

Intra-frame decoding module, which performs intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters; inverse quantizing the intra-frame decoded spatial parameters; and restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f));

Inverse time-frequency transform module, which transforms the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is the signal S(n,f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) obtained by the intra-frame decoding module, and the number Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters.

Further, the intra-frame coding module includes a clustering module, which clusters the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters.

Further, the intra-frame decoding module includes a recovery module, which maps the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).

Further, the intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.

Further, the intra-frame decoding module includes an inverse quantization module, which intra-frame decodes the spatial parameters, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.

The beneficial effects of the invention are as follows. Based on the property that different frequency bands of the same sound source within the same frame have the same spatial parameters, the encoder performs spatial parameter clustering, spatial parameter quantization, and spatial parameter intra-frame coding, followed by inter-frame differential coding of the spatial parameters, which further reduces the bit rate of the three-dimensional audio spatial parameters and improves their compression ratio. The decoder decodes the three-dimensional audio bitstream: it performs inter-frame differential decoding and intra-frame decoding of the spatial parameters, inverse quantizes the intra-frame decoded spatial parameters, and maps the clustered spatial parameters back to their frequency bands, yielding the audio signal of the three-dimensional audio, the spatial parameters, and the audio object numbers of the spatial parameters. By adding intra-frame encoding and decoding, the invention therefore overcomes the defect that earlier spatial parameter compression methods did not consider the intra-frame redundancy of the spatial parameters, and can further reduce the bit rate of three-dimensional audio spatial parameters and improve their compression ratio.

Brief description of the drawings

Fig. 1 is a flow chart of the encoder side of an embodiment of the invention;

Fig. 2 is a flow chart of the decoder side of an embodiment of the invention.

Detailed description of the embodiments

The technical solution of the invention is described in detail below with reference to the drawings and an embodiment (steps C1 to C3 form the encoding process and steps D1 to D3 form the decoding process).

Referring to Fig. 1, the encoder side of the embodiment executes the following flow:

Step C1: the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio.

The input of the encoder is: the three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs. The time-domain representation of the audio signal of the three-dimensional audio is s(t), which consists of s₁(t), s₂(t), …, s_K(t), where t denotes time. The spatial parameters of the three-dimensional audio, that is, the spatial parameter (θ(n,f), φ(n,f), r(n,f)) corresponding to each time-frequency point, consist of (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)). The number of the audio object to which a spatial parameter belongs is denoted Index(n,f). Here s_k(t) is the time-domain representation of the k-th directional audio signal and (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal; a spatial parameter consists of the direction parameters (horizontal angle θ and elevation angle φ) and the distance parameter r. k takes the values 1, 2, …, K, where K is the total number of original directional audio signals.

To transform the time-domain signal of the three-dimensional audio to the frequency domain, the time-domain signal s(t) can be transformed using the short-time Fourier transform (STFT), giving the frequency-domain signal S(n,f) of the three-dimensional audio; S(n,f) consists of S₁(n,f), S₂(n,f), …, S_K(n,f), where S_k(n,f) is the frequency-domain representation of the k-th directional audio signal, n denotes the frame index, and f denotes the frequency index. In a specific implementation, other transforms such as the MDCT or the Hilbert-Huang transform can also be used.
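As an illustration of step C1, the following is a minimal pure-Python sketch of an STFT applied to a single directional signal s_k(t). The frame length, hop size, and Hann window are assumptions chosen for readability; the text does not specify them, and a practical codec would use an FFT library rather than this direct DFT.

```python
import cmath
import math


def stft(signal, frame_len=8, hop=4):
    """Return S[n][f]: DFT of the Hann-windowed frame n at frequency bin f.

    frame_len and hop are illustrative values, not taken from the patent.
    """
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len)
              for i in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + i] * window[i] for i in range(frame_len)]
        spectrum = [
            sum(frame[i] * cmath.exp(-2j * math.pi * f * i / frame_len)
                for i in range(frame_len))
            for f in range(frame_len // 2 + 1)  # non-negative frequencies only
        ]
        frames.append(spectrum)
    return frames


# One directional signal s_k(t): a sinusoid with exactly 2 cycles per frame.
s_k = [math.sin(2 * math.pi * 2 * t / 8) for t in range(32)]
S_k = stft(s_k)

# Strongest frequency bin in every frame n.
peak_bins = [max(range(len(row)), key=lambda f: abs(row[f])) for row in S_k]
```

Each row of S_k corresponds to one frame index n and each column to one frequency index f, which is the (n, f) grid on which the spatial parameters and Index(n,f) are defined.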

In the embodiment, K = 8 and f = 1, 2, …, 40. The frequency-domain signals of the 8 directional audio signals s₁(t), s₂(t), …, s₈(t) are (S₁(n,f), S₂(n,f), …, S₈(n,f)), their corresponding spatial parameters are (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ₈(n,f), φ₈(n,f), r₈(n,f)), and the number of the object to which these spatial parameters belong is Index(n,f).

Step C2: intra-frame coding is performed on the spatial parameters. When carrying out step C2, the embodiment specifically performs the following sub-steps:

C21: the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters;

C22: the clustered spatial parameters are quantized, using either perceptual quantization or direct quantization;

C23: the quantized spatial parameters are intra-frame coded, using either perceptual coding or direct coding;
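Sub-steps C21 and C22 can be sketched as follows for a single frame. This is a sketch under stated assumptions, not the patented procedure: the text does not say how the cluster representative is chosen, so the arithmetic mean is used here (which ignores angle wrap-around), and the uniform step sizes stand in for "direct quantization". The names cluster_frame, quantize, and steps are hypothetical, and the intra-frame coding of C23 is not shown.

```python
from collections import defaultdict


def cluster_frame(params, index):
    """Cluster one frame's spatial parameters by audio object.

    params[f] is (theta, phi, r) for band f; index[f] is Index(n, f).
    Returns {object_number: representative (theta, phi, r)}.
    The mean is an assumed choice of representative.
    """
    groups = defaultdict(list)
    for f, obj in enumerate(index):
        groups[obj].append(params[f])
    clustered = {}
    for obj, members in groups.items():
        count = len(members)
        clustered[obj] = tuple(sum(m[i] for m in members) / count
                               for i in range(3))
    return clustered


def quantize(clustered, steps=(5.0, 10.0, 10.0)):
    """Uniform quantization to integer levels; the step sizes are assumptions."""
    return {obj: tuple(round(v / s) for v, s in zip(p, steps))
            for obj, p in clustered.items()}


# Six bands in one frame, shared by two objects with band-independent parameters.
index = [1, 1, 2, 1, 2, 2]
source = {1: (30.0, 10.0, 100.0), 2: (-45.0, 20.0, 150.0)}
params = [source[obj] for obj in index]

clustered = cluster_frame(params, index)
levels = quantize(clustered)
print(levels)  # {1: (6, 1, 10), 2: (-9, 2, 15)}
```

In this toy frame, 6 bands × 3 values collapse to 2 objects × 3 values, which is the intra-frame redundancy the clustering step is meant to remove; Index(n,f) is still needed to undo the clustering at the decoder.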

Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bitstream. When the embodiment carries out step C3, the coding method is differential coding.

Referring to Fig. 2, the decoder side of the embodiment executes the following flow:

Step D1: inter-frame decoding is performed on the spatial parameters. When the embodiment carries out step D1, the decoding method is differential decoding.
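The differential coding of step C3 and its inverse in step D1 can be sketched together as a round trip. The sketch assumes the simplest form of differential coding, in which the first frame is sent as-is and each later frame is sent as its difference from the previous frame; the text does not specify the predictor or any entropy coding, and the function names are hypothetical.

```python
def diff_encode(frames):
    """frames: list of equal-length tuples of quantized levels, one per frame.

    Returns the first frame followed by frame-to-frame differences.
    """
    residuals = []
    previous = None
    for frame in frames:
        if previous is None:
            residuals.append(tuple(frame))
        else:
            residuals.append(tuple(c - p for c, p in zip(frame, previous)))
        previous = frame
    return residuals


def diff_decode(residuals):
    """Inverse of diff_encode: accumulate the differences frame by frame."""
    frames = []
    previous = None
    for res in residuals:
        if previous is None:
            current = tuple(res)
        else:
            current = tuple(p + d for p, d in zip(previous, res))
        frames.append(current)
        previous = current
    return frames


# Quantized (theta, phi, r) levels of one object over three frames.
frames = [(6, 1, 10), (6, 1, 10), (7, 1, 10)]
residuals = diff_encode(frames)
print(residuals)  # [(6, 1, 10), (0, 0, 0), (1, 0, 0)]
restored = diff_decode(residuals)
```

When a source moves slowly, most residuals are zero or small, which is what makes them cheaper to code than the raw levels.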

Step D2: intra-frame decoding is performed on the spatial parameters. When carrying out step D2, the embodiment specifically performs the following sub-steps:

D21: the spatial parameters are intra-frame decoded, using either perceptual decoding or direct decoding;

D22: the intra-frame decoded spatial parameters are inverse quantized, using either the inverse of perceptual quantization or the inverse of direct quantization;

D23: the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).
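Sub-steps D22 and D23 can be sketched as the mirror image of clustering and quantization. The step sizes are the same assumed values as on the encoder side (they are not given in the text), and dequantize and map_to_bands are hypothetical names; the intra-frame decoding of D21 is not shown.

```python
def dequantize(quantized, steps=(5.0, 10.0, 10.0)):
    """Turn integer levels back into (theta, phi, r); steps are assumptions."""
    return {obj: tuple(q * s for q, s in zip(levels, steps))
            for obj, levels in quantized.items()}


def map_to_bands(clustered, index):
    """Give every band f the parameters of the object named by Index(n, f)."""
    return [clustered[obj] for obj in index]


# Decoded levels for two objects, and the object number of each of six bands.
quantized = {1: (6, 1, 10), 2: (-9, 2, 15)}
index = [1, 1, 2, 1, 2, 2]

per_band = map_to_bands(dequantize(quantized), index)
print(per_band[0])  # (30.0, 10.0, 100.0)
print(per_band[2])  # (-45.0, 20.0, 150.0)
```

The mapping is a plain lookup, so the decoder recovers a full per-band parameter set from one triple per object plus Index(n,f).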

Step D3: the frequency-domain representation S'(n,f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is the signal S(n,f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding. The time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) obtained in step D2, and the number Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters. Accordingly, in a specific implementation, different loudspeaker configurations or headphones can be used to reconstruct the three-dimensional audio sound field and restore the original three-dimensional audio.

The embodiment transforms the 8 directional audio signals (S'₁(n,f), S'₂(n,f), …, S'₈(n,f)) obtained after encoding and decoding to the time domain, giving the 8 directional audio signals s'₁(t), s'₂(t), …, s'₈(t); these, together with the decoded spatial parameters (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ₈(n,f), φ₈(n,f), r₈(n,f)) and the number Index(n,f) of the audio object to which each original spatial parameter belongs, constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters. The embodiment uses headphones to play back the three-dimensional audio signal with distance side information. Headphone reproduction of three-dimensional audio requires a head-related transfer function (HRTF) library. The PKU&IOA HRTF library contains measurements for both the far field and the near field: the distance r ranges from 20 cm to 160 cm, and the resolutions of the horizontal angle and the elevation angle are 5° and 10°, respectively. The PKU&IOA HRTF library is therefore selected to reconstruct the three-dimensional audio that has undergone intra-frame and inter-frame compression.
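One way to use decoded spatial parameters with a measured HRTF grid is to snap each parameter to the nearest measured direction. The sketch below is an assumption about how such a lookup might be prepared, not a description of the embodiment or of the library's API: it uses only the figures quoted above (5° horizontal resolution, 10° elevation resolution, distances between 20 cm and 160 cm), clamps the distance to that range instead of snapping it to specific measured distances, and the function name is hypothetical.

```python
def snap_to_hrtf_grid(theta, phi, r_cm,
                      az_step=5, el_step=10, r_min=20.0, r_max=160.0):
    """Snap (theta, phi) to the angular grid and clamp r to the measured range.

    theta and phi are in degrees and r_cm is in centimetres.
    """
    az = (round(theta / az_step) * az_step) % 360  # wrap into [0, 360)
    el = round(phi / el_step) * el_step
    r = min(max(r_cm, r_min), r_max)
    return az, el, r


print(snap_to_hrtf_grid(32.0, 14.0, 180.0))  # (30, 10, 160.0)
```

Because the grid is coarser than arbitrary decoded angles, quantization finer than the HRTF resolution would not be audible in this playback path, which is one argument for perceptually motivated step sizes.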

Experimental comparison shows that the three-dimensional audio compression method with added intra-frame coding compresses better than the original three-dimensional audio compression method that uses only inter-frame coding: the compression ratio is higher while the quality of the reconstructed audio is maintained. Because the added intra-frame coding eliminates intra-frame redundancy, the method improves the compression ratio of the three-dimensional spatial parameters and reduces the spatial parameter bit rate while preserving the quality of the reconstructed three-dimensional audio.

The method provided by the invention can be run automatically using software, or can be implemented as a corresponding modular system. The invention provides a parametric encoding and decoding system for improving the sense of space and distance of three-dimensional audio, comprising an encoder and a decoder. The encoder comprises the following modules,

The decoder comprises the following modules:

The intra-frame coding module includes a clustering module, which clusters the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters.

The intra-frame decoding module includes a recovery module, which maps the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).

The intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.

The intra-frame decoding module includes an inverse quantization module, which intra-frame decodes the spatial parameters, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.

The specific implementation of each module corresponds to the respective method step and is not repeated here.

The specific embodiment described herein merely illustrates the content of the invention. Those skilled in the art to which the invention belongs may make various modifications or additions to the described embodiment, or substitute similar approaches, without departing from the content of the invention or exceeding the scope defined by the appended claims.

Claims

1. An encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters, characterized by comprising an encoding process and a decoding process, the encoding process comprising the following steps:

Step C1: the input comprises a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs; the three-dimensional audio time-domain signal is transformed to the frequency domain, as follows:

let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); let the number of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, θ being the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

Step C2: intra-frame coding is performed on the input spatial parameters, as follows: the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; the clustered spatial parameters are quantized; and the quantized spatial parameters are intra-frame coded;

Step C3: inter-frame coding is performed on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding; the decoding process comprises the following steps:

Step D2: intra-frame decoding is performed on the spatial parameters, as follows: the spatial parameters are intra-frame decoded; the intra-frame decoded spatial parameters are inverse quantized; and the original spatial parameters (θ(n,f), φ(n,f), r(n,f)) are restored;

Step D3: the frequency-domain representation S'(n,f) of the audio signal is transformed to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is the signal S(n,f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) obtained in step D2, and the number Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters.

2. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:

in step C2, the spatial parameters of the different frequency bands belonging to the same audio object within the same frame are clustered; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters.

3. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:

in step D2, the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame are mapped back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).

4. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:

in step C2, the clustered spatial parameters are quantized, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.

5. The encoding and decoding method for improving the compression ratio of three-dimensional audio spatial parameters according to claim 1, characterized in that:

in step D2, the spatial parameters are intra-frame decoded, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.

6. An encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters, characterized by comprising an encoder and a decoder,

the encoder comprising the following modules:

a time-frequency transform module, which receives an input comprising a three-dimensional audio signal containing n objects, the three-dimensional audio spatial parameters, and the number of the audio object to which each spatial parameter belongs, and transforms the three-dimensional audio time-domain signal to the frequency domain; specifically, let the time-domain signal of the three-dimensional audio be s(t), where s(t) comprises s₁(t), s₂(t), …, s_k(t), …, s_K(t); let the spatial parameters of the three-dimensional audio be (θ(n,f), φ(n,f), r(n,f)), comprising (θ₁(n,f), φ₁(n,f), r₁(n,f)), …, (θ_K(n,f), φ_K(n,f), r_K(n,f)); let the number of the audio object to which a spatial parameter belongs be Index(n,f); the time-domain signal s(t) of the three-dimensional audio is transformed to the frequency domain to obtain the frequency-domain signal S(n,f) of the three-dimensional audio, where S(n,f) comprises S₁(n,f), S₂(n,f), …, S_k(n,f), …, S_K(n,f); wherein s_k(t) is the time-domain representation of the k-th directional audio signal and t denotes time; S_k(n,f) is the frequency-domain representation of the k-th directional audio signal; (θ_k(n,f), φ_k(n,f), r_k(n,f)) is the spatial parameter corresponding to the k-th directional audio signal, θ being the horizontal angle, φ the elevation angle, and r the distance side information; k takes the values 1, 2, …, K, where K is the total number of original directional audio signals; the value of Index(n,f) is the number of the audio object to which the spatial parameter belongs; n denotes the frame index and f denotes the frequency index;

an intra-frame coding module, which performs intra-frame coding on the input spatial parameters, including clustering the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; quantizing the clustered spatial parameters; and intra-frame coding the quantized spatial parameters;

an inter-frame coding module, which performs inter-frame coding on the spatial parameters to generate the three-dimensional audio coded bitstream, the coding method being differential coding;

The decoder comprises the following modules:

an intra-frame decoding module, which performs intra-frame decoding on the spatial parameters, including intra-frame decoding the spatial parameters; inverse quantizing the intra-frame decoded spatial parameters; and restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f));

an inverse time-frequency transform module, which transforms the frequency-domain representation S'(n,f) of the audio signal to the time domain to obtain the time-domain representation s'(t) of the audio signal, where S'(n,f) is the signal S(n,f) after encoding and decoding and s'(t) is the signal s(t) after encoding and decoding; the time-domain representation s'(t) of the audio signal containing n objects, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) obtained by the intra-frame decoding module, and the number Index(n,f) of the audio object to which each original spatial parameter belongs together constitute the decoded three-dimensional audio, that is, its audio signal containing n objects, its spatial parameters, and the audio object numbers of those spatial parameters.

7. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame coding module includes a clustering module, which clusters the spatial parameters of the different frequency bands belonging to the same audio object within the same frame; that is, the spatial parameters (θ(n,f), φ(n,f), r(n,f)) that share the same n and the same value of Index(n,f) but differ in f are clustered to generate the clustered spatial parameters.

8. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame decoding module includes a recovery module, which maps the clustered spatial parameters of the different frequency bands belonging to the same audio object within the same frame back to their corresponding frequency bands, restoring the original spatial parameters (θ(n,f), φ(n,f), r(n,f)).

9. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame coding module includes a quantization module, which quantizes the clustered spatial parameters, the quantization being perceptual quantization or direct quantization; the quantized spatial parameters are intra-frame coded, the coding being perceptual coding or direct coding.

10. The encoding and decoding system for improving the compression ratio of three-dimensional audio spatial parameters according to claim 6, characterized in that: the intra-frame decoding module includes an inverse quantization module, which intra-frame decodes the spatial parameters, the decoding being perceptual decoding or direct decoding; the intra-frame decoded spatial parameters are inverse quantized, the inverse quantization being the inverse of perceptual quantization or the inverse of direct quantization.