CN1150774C - Compatible surround sound algorithm

Info

Publication number: CN1150774C
Application number: CNB001174487A
Authority: CN
Inventors: 晓梁; 梁晓
Original assignee: Individual
Current assignee: Individual
Priority date: 2000-09-27
Filing date: 2000-09-27
Publication date: 2004-05-19
Anticipated expiration: 2020-09-27
Also published as: CN1286574A

Abstract

The present invention provides a multichannel surround sound scheme that is compatible with the digital two-channel coding standards (PCM, MPEG, etc.) used in CD, VCD, DVD, and DVB, and that carries hidden instruction codes. At least one channel remains compatible with the original format, and digital audio quantization uses trapezoidal linear PCM coding. The audio data of the left and right channels has two layers: the first is a prescribed layer and the second is a user-defined layer. The first layer is divided into an intensity compression mode, a waveform compression mode, a hidden-instruction MPEG mode, and a hidden-instruction PCM mode. Without a decoder for this algorithm, a product can still play back the audio of the left channel and, on some layers, the audio of the right channel; with the decoder, multichannel surround sound can be played back, and the dynamic range of the surround sound is higher than that of the original coding.

Description

A compatible surround sound method

The present invention relates to a compatible surround sound method.

The only existing compatible surround sound patent is patent No. 98117867.7, "Audio encoder compatible with Dolby Digital (i.e. AC-3, likewise below) and MPEG-II, and its algorithm". That scheme uses a digital signal processor (DSP) chip to actively recognize Dolby Digital and MPEG-II coding. However, the Dolby Digital and MPEG-II algorithms themselves merely coexist in the same product: Dolby Digital coding cannot be applied to MPEG-II coding, and MPEG-II coding cannot be applied to Dolby Digital coding.

The object of the present invention is to provide a compatible surround sound method that is compatible with at least two kinds of coding.

To achieve this object, in the compatible surround sound method of the present invention at least one channel keeps the original encoding and serves as the shared channel, and digital audio quantization uses trapezoidal linear PCM coding. The left- and right-channel audio data has two layers: the first is a prescribed layer and the second is a user-defined layer. The first layer is divided into an intensity compression mode, a waveform compression mode, a hidden-instruction MPEG mode, and a hidden-instruction PCM mode.

With this scheme, the method develops the original two digital channels, which carry no digital multichannel audio, into digital multichannel surround sound, while at least one channel stays compatible with the original format. During playback, even without the decompression device of this scheme, at least one channel of output is still available; with the decompression device, the left and right channels together provide surround sound or other multichannel output. This greatly improves the capability of the original medium while remaining convenient for existing users, and it also raises the dynamic range of the original medium from 16-bit to more than 22-bit PCM format. (Note: "channel" here includes the left and right channels but is not limited to them.)

Principle of implementation

The embodiments of this scheme are described in detail below with reference to the accompanying drawings.

Fig. 1 is the format diagram of trapezoidal linear PCM coding;

Fig. 2 is the instruction layout diagram;

Fig. 3 is the diagram of the channel layout and channel relation codes;

Fig. 4 is the format diagram of the MPEG-layer bit data.

In the method of the present invention at least one channel, for example the left channel, is the shared channel, and it may be represented in PCM or MPEG. Without the decoder of the present invention this channel can therefore still be played back, and together with the decompression device of the present invention surround sound or other multichannel output can be provided. Digital audio quantization changes from the original linear PCM coding to trapezoidal linear PCM coding: of the bits that express the audio signal value, one part represents the numerical value and the other part represents the weight, in the format shown in Fig. 1. Under the same weight, every step of the value bits has the same quantization size; across weights, the bits step up from low to high in trapezoidal fashion, and the highest weight is set to 1.5 times the value of the most significant bit of the corresponding PCM. A 16-bit trapezoidal linear PCM code can therefore express the dynamic range of 24-bit PCM, with finer quantization of weak sounds. Media that use trapezoidal linear PCM coding should be identified as such in the media type.

The method flow is as follows:

A1-7: analog signal input, 1 to 7 channels;

B1-7: the analog signals are quantized to 24-bit PCM (high-order quantization);

C1-7: the 24-bit PCM is converted digit by digit into 16-bit trapezoidal linear coding (low-order conversion), with a weight of 8;

C'1-7: the 23-bit PCM is converted digit by digit into 15-bit trapezoidal linear coding, with a weight of 8.
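The split of a code word into value bits and weight bits can be illustrated with a small sketch. Fig. 1 is not reproduced in this text, so the layout below is an assumption made purely for illustration: 1 sign bit, a 4-bit weight field used as a shift count, and an 11-bit value field. The names `VALUE_BITS`, `WEIGHT_BITS`, `encode`, and `decode` are hypothetical, and the rule that the highest weight is 1.5 times the PCM most significant bit is not modeled.

```python
# Illustrative sketch only: field widths and layout are assumptions,
# not the format of Fig. 1.
VALUE_BITS = 11   # assumed width of the value (magnitude) field
WEIGHT_BITS = 4   # assumed width of the weight field (used as a shift count)


def encode(sample: int) -> int:
    """Pack a signed 24-bit PCM sample into a 16-bit value/weight code."""
    if not -(1 << 23) <= sample < (1 << 23):
        raise ValueError("sample is outside the signed 24-bit range")
    sign = 1 if sample < 0 else 0
    magnitude = -sample if sign else sample
    # Small magnitudes keep weight 0 and are stored exactly;
    # larger ones drop low-order bits and record how many were dropped.
    weight = max(0, magnitude.bit_length() - VALUE_BITS)
    value = magnitude >> weight
    return (sign << 15) | (weight << VALUE_BITS) | value


def decode(code: int) -> int:
    """Expand a 16-bit value/weight code back to a 24-bit PCM sample."""
    sign = (code >> 15) & 1
    weight = (code >> VALUE_BITS) & ((1 << WEIGHT_BITS) - 1)
    value = code & ((1 << VALUE_BITS) - 1)
    magnitude = value << weight
    return -magnitude if sign else magnitude
```

Under these assumptions, samples with a magnitude below 2048 survive the round trip unchanged, while louder samples are reproduced with an error smaller than 2 to the power of their weight, which matches the idea that all value steps under one weight have the same size.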

When the left channel is represented in PCM, the compression of the right channel is divided into two layers: the first is the prescribed layer and the second is the user-defined layer. Depending on the properties of the medium and the intended effect, the first layer has three modes. The first is the PCM hidden-instruction mode, with layer code 1111, stored consecutively in the least significant bits of the PCM code and placed ahead of the data as a leading instruction. The second is the relative intensity compression mode, with layer code 100, stored consecutively in the least significant bits of the PCM code. The third is the waveform compression mode, with layer code 110, stored consecutively in the least significant bits of the PCM code. The layer code and data bit stream are shown in Fig. 2, where T is the time direction. In this bit stream the least significant bit carries the layer code; in the first mode the channel bits are placed directly in the PCM least significant bit, following the code and separated from it by one bit whose value is 0.

The channel layout and channel relation codes are shown in Fig. 3:

000: subwoofer, appearing in the first layer;

001: center and front left/right, where true-data bits of zero indicate the center channel;

010: front left and rear right;

011: front right and rear left;

100: front left and rear left;

101: front right and rear right;

110: rear center and rear left/right, where true-data bits of zero indicate the rear center channel.

The first layer follows these principles: (1) the left-channel data is kept, as far as possible, in the left channel position of the medium and in a fixed (positioned) channel state; (2) the compressed-data channel is kept, as far as possible, in the right channel position of the medium and in a moving-channel state. In the first mode of the first layer, the data of both the left and right channels drops the least significant bit and uses the PCM or trapezoidal linear PCM data format. In the bit streams of the second and third modes, the data is ordered as follows:

True-data initial value → compression value → gap bits → compressed length → gap bits → intensity or waveform form → end value → gap bits → repeat count.
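The fields in this ordering are separated by the four gap bits 0000, and the text states that when 0000 would occur inside the data a 1 is added after the third 0 and removed again on decoding. A minimal sketch of that rule follows. It assumes bit strings of '0' and '1' characters and assumes the 1 is inserted after every run of three zeros, so that the decoder can remove it without look-ahead; the text only says it is inserted when 0000 occurs. The names `GAP`, `stuff`, and `unstuff` are hypothetical.

```python
GAP = "0000"  # the four-bit field separator described in the text


def stuff(bits: str) -> str:
    """Insert a '1' after every run of three '0's so GAP never appears in data."""
    out = []
    zeros = 0
    for bit in bits:
        out.append(bit)
        zeros = zeros + 1 if bit == "0" else 0
        if zeros == 3:
            out.append("1")  # stuffed bit, removed again by unstuff()
            zeros = 0
    return "".join(out)


def unstuff(bits: str) -> str:
    """Remove the '1' that follows every run of three '0's."""
    out = []
    zeros = 0
    i = 0
    while i < len(bits):
        bit = bits[i]
        out.append(bit)
        zeros = zeros + 1 if bit == "0" else 0
        if zeros == 3:
            i += 1  # skip the stuffed '1'
            zeros = 0
        i += 1
    return "".join(out)
```

For example, `stuff("0000")` gives `"00010"`, and `unstuff` restores the original. The sketch covers a single field only; how the end of a field that finishes in zeros is told apart from the following gap bits is not specified in the text and is left out here.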

The bits are arranged in chronological order as a serial sequence, and the compressed data is represented by the notation above. The compressed record is expressed as follows. The first bit of the true-data initial value indicates whether a true data value begins: 1 indicates a beginning and 0 indicates no true value. The length of the true-data initial value is therefore the same as the left-channel length; three bits are taken from the second column of data bits and placed in the three high-order bits. In the first layer, under intensity compression the first true-data value represents the difference from the left-channel intensity; under waveform compression the true data represents the initial value at which the waveform begins. In intensity compression the compression value is the intensity trend value: 1 indicates a rise, 0 indicates a fall, and alternating 0, 1 or 1, 0 indicates a constant value. In waveform compression the compression value is the waveform approximation value: 1 indicates a rise, 0 indicates a fall, and alternating 0, 1 or 1, 0 indicates a constant value. The compressed length gives the length of the above data in the time domain; each increase of 1 in its value adds one sample in the time domain. In waveform compression the waveform form is sine, cosine, pulse, or other, encoded in two bits as 01, 10, 11, and 00 respectively, where "other" is a value to be specified. The end value is the final intensity or waveform value in the algorithm. The gap bits are the four bits 0000; if 0000 occurs in the data, a 1 is inserted after the third 0 and removed again on decoding, and the true-data initial value is of course also covered by this rule. The repeat count gives the number of times the above data repeats, as a measure to improve compression; each increase of 1 in its value indicates one more repetition.

The compressed format above is based on the masking effect of human hearing, that is, strong sounds mask weak ones, with pre-masking or post-masking, combined with mapping and comparison against the shared channel of the present invention. Because the shared left channel supplies a true waveform for comparison, whether PCM, trapezoidal linear PCM, or MPEG is used, there is no need to perform a discrete cosine transform or sub-band transform again; it is enough to take a segment of data and approximate its intensity or waveform. The result combines high fidelity with a high compression ratio. Interpolated values are solved from the sine and cosine trigonometric formulas using the initial angle, phase angle, amplitude, and constant, which also greatly improves the waveform-compressed data. The method flow is as follows:

D: all channel data is stored synchronously;

E: the PCM format is selected according to the medium;

F: the hidden-instruction PCM mode is selected if the user requires it; otherwise go to G';

G: two main channels, or one, are selected from the seven channels, the two channels are computed by the formulas Lp = L + 0.25jc + ajs and Rp = R + 0.25jc - ajs, and the hidden instruction is output;

H: the data is output synchronously, with the instruction placed ahead of it in the least significant bit;

G': if the seven channels of data are strongly correlated, the relative intensity compression mode is selected to compress the other channel data and the corresponding instruction is output; if they are weakly correlated, the waveform compression mode is selected, the data is approximated with trigonometric functions and the waveform trend, and the corresponding instruction is output;

H': the data is output synchronously, with the instruction placed in the low-order bits of the PCM format.

In the formulas, Lp is the left channel signal, L is the information located to the left among the two channels, and c is the information located in the center among the two channels (zero if absent); Rp is the right channel signal, R is the information located to the right among the two channels, and s is the information located at the rear among the two channels (zero if absent). a is a coefficient between 0.25 and 0.5, chosen as needed during production, and j is the imaginary unit, which in these formulas corresponds to 90°.
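The two formulas can be tried out numerically with a short sketch. It assumes each input is a complex phasor, that is, the amplitude and phase of a single-frequency component, so that multiplying by j is literally a 90° phase rotation; applying the matrix to real sampled audio would need a separate 90° phase shifter, which is not shown. The function name `matrix_encode` and its argument names are hypothetical.

```python
def matrix_encode(L: complex, R: complex, c: complex = 0j,
                  s: complex = 0j, a: float = 0.5) -> tuple[complex, complex]:
    """Compute Lp = L + 0.25jc + ajs and Rp = R + 0.25jc - ajs for phasors."""
    if not 0.25 <= a <= 0.5:
        raise ValueError("a must lie between 0.25 and 0.5")
    center = 0.25j * c    # center term, identical in both outputs
    rear = a * 1j * s     # rear term, opposite sign in the two outputs
    return L + center + rear, R + center - rear


# A rear-only signal ends up in Lp and Rp with opposite signs.
Lp, Rp = matrix_encode(0j, 0j, s=1 + 0j, a=0.5)
```

In this sketch the center term cancels in the difference Lp - Rp and the rear term cancels in the sum Lp + Rp, which is one way to see how a decoder could separate the two from a two-channel pair.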

When the audio data is recorded in another manner such as MPEG, the left channel is recorded in the customary way, while the right channel carries the hidden channel instruction in the additional data bit positions of the MPEG bit stream, at the position shown in Fig. 4. If the amount of information is large, the relative intensity compression mode or the waveform compression mode described above is used; the channel layout and channel codes are likewise as shown in Fig. 3. The flow is as follows:

I: the left channel uses the format appropriate to the medium, the right channel is compressed in the mode appropriate to the amount of information, and the channel instruction is output into the additional bits or the designated information bits;

K: synchronous output.

The method of the present invention may also be carried out as follows:

The left channel is the shared channel and stores the shared audio data, in the storage format used by the original customary medium. A channel can therefore contain the compressed data or hidden instruction data described above. The data exists in PCM, MPEG, or DVB format, and it is packaged either in the manner used by the original customary medium or in the manner provided by the present invention. With a decoder that implements the method of the present invention, multichannel or true surround sound can be restored from the two digital channels. The method recommends that the left- and right-channel audio data have two layers, the first a prescribed layer and the second a user-defined layer, to leave room for further development. In the analog sampling process, to further improve the subsequent digital compression ratio and effect, the first layer of analog sampling may use the following formulas:

Lp = L + 0.25jc + ajs;

Rp = R + 0.25jc - ajs

Combined with a choice of channel assignments, the two formulas above can build rich and varied multichannel surround sound effects. Note that these two formulas are not the analog formulas used by Dolby surround sound. The method flow is as follows:

A'1-7: original analog signal input;

A''1-7: A'1-7 is converted by Lp = L + 0.25jc + ajs and Rp = R + 0.25jc - ajs, with the channels selected and combined.

Coding/decoding method

Decompression in the present invention searches for the instructions according to the properties of the medium and carries out the inverse process. The flow is as follows:

L: the audio data is input synchronously and the media format is determined; if it is MPEG format, proceed with the next step, otherwise go to N;

M: decompress according to the MPEG format, search for the hidden channel instruction, and assign the decompressed data to the corresponding channels;

N: search for the compression code instruction in the PCM data; if it is mode 1, proceed with the next step, otherwise go to P;

O: locate the channel data of mode 1 and assign it to the corresponding channels;

P: if the code is mode 2, find in the data the intensity difference, the intensity trend values, the intensity time-domain length, the intensity repeat count, and the intensity end value, decompress them together with the left-channel data into new channel data, and assign the decompressed audio data to the specific channels named in the hidden channel instruction. If the code is mode 3, proceed with the next step;

Q: find the waveform initial value, waveform approximation values, waveform time length, waveform form, waveform repeat count, waveform end value, and trigonometric formula of mode 3, synthesize the specific waveform, and assign it to the channel indicated by the hidden channel value;

R: output the data synchronously with the left channel;

S: if the media format uses trapezoidal linear PCM coding, convert it to ordinary PCM coding; otherwise go directly to T;

T: digital-to-analog conversion and output of channels 1 to 7.

Claims

1. A compatible surround sound method, characterized in that: at least one channel keeps the original encoding and serves as the shared channel, and digital audio quantization uses trapezoidal linear PCM coding; the left- and right-channel audio data has two layers, the first a prescribed layer and the second a user-defined layer; the first layer is divided into an intensity compression mode, a waveform compression mode, a hidden-instruction MPEG mode, and a hidden-instruction PCM mode; in the trapezoidal linear PCM coding, of the bits that express the audio signal value, one part represents the numerical value and the other part represents the weight; under the same weight every step of the value bits has the same quantization size, and across weights the bits step up from low to high in trapezoidal fashion; the method flow is:

A1-7: analog signal input, 1 to 7 channels;

C'1-7: the 23-bit PCM is converted digit by digit into 15-bit trapezoidal linear coding, with a weight of 8;

The flow of the intensity compression mode, the waveform compression mode, and the hidden-instruction PCM mode is:

D: all channel data is stored synchronously;

E: the PCM format is selected according to the medium;

F: the hidden-instruction PCM mode is selected if the user requires it; otherwise go to G';

H: the data is output synchronously, with the instruction placed ahead of it in the least significant bit;

G': if the seven channels of data are strongly correlated, the relative intensity compression mode is selected to compress the other channels and the corresponding instruction is output; if they are weakly correlated, the waveform compression mode is selected, the data is approximated with trigonometric functions and the waveform trend, and the corresponding instruction is output;

H': the data is output synchronously, with the instruction placed in the low-order bits of the PCM format;

In the formulas, Lp is the left channel signal, L is the information located to the left among the two channels, and c is the information located in the center among the two channels (zero if absent); Rp is the right channel signal, R is the information located to the right among the two channels, and s is the information located at the rear among the two channels (zero if absent); a is a coefficient between 0.25 and 0.5, chosen as needed during production, and j is the imaginary unit, which in these formulas corresponds to 90°;

The method flow of the hidden-instruction MPEG mode is:

I: the left channel is compressed by the method appropriate to the medium, the right channel is compressed according to the amount of information, and the channel instruction is output into the additional bits or the designated information bits;

K: synchronous output.

2. The compatible surround sound method according to claim 1, characterized in that: the highest weight is 1.5 times the value of the most significant bit of the corresponding PCM.

3. The compatible surround sound method according to claim 1, characterized in that: in the bit streams of the relative intensity compression mode and the waveform compression mode, the data is ordered as follows:

True-data initial value → compression value → gap bits → compressed length → gap bits → intensity or waveform form → end value → gap bits → repeat count

4. The compatible surround sound method according to claim 3, characterized in that:

The compressed record is expressed as follows: the first bit of the true-data initial value indicates whether a true data value begins, 1 indicating a beginning and 0 indicating no true value; the length of the true-data initial value is the same as the left-channel length, and it is placed in the three high-order bits;

In the first layer, under intensity compression the first true-data value represents the difference from the left-channel intensity; under waveform compression the true data represents the initial value at which the waveform begins;

In intensity compression the compression value is the intensity trend value: 1 indicates a rise, 0 indicates a fall, and alternating 0, 1 or 1, 0 indicates a constant value;

In waveform compression the compression value is the waveform approximation value: 1 indicates a rise, 0 indicates a fall, and alternating 0, 1 or 1, 0 indicates a constant value;

The compressed length gives the length of the above data in the time domain; each increase of 1 in its value adds one sample in the time domain;

In waveform compression the waveform form is sine, cosine, pulse, or other, encoded in two bits as 01, 10, 11, and 00 respectively, where "other" is a value to be specified; the end value is the final intensity or waveform value in the algorithm;

The gap bits are the four bits 0000; if 0000 occurs in the data, a 1 is inserted after the third 0 and removed again on decoding;

The repeat count gives the number of times the above data repeats, as a measure to improve compression; each increase of 1 in its value indicates one more repetition.