CN101116135B - Sound synthesis - Google Patents

Sound synthesis

Info

Publication number
CN101116135B
CN101116135B CN2006800046437A CN200680004643A
Authority
CN
China
Prior art keywords
noise
parameter
sound
component
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2006800046437A
Other languages
Chinese (zh)
Other versions
CN101116135A (en)
Inventor
M. Szczerba
A. C. Den Brinker
A. J. Gerrits
A. W. J. Oomen
M. Klein Middelink
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Koninklijke Philips NV
Original Assignee
Koninklijke Philips Electronics NV
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Koninklijke Philips Electronics NV filed Critical Koninklijke Philips Electronics NV
Publication of CN101116135A publication Critical patent/CN101116135A/en
Application granted granted Critical
Publication of CN101116135B publication Critical patent/CN101116135B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 7/00 Instruments in which the tones are synthesised from a data store, e.g. computer organs
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 1/00 Details of electrophonic musical instruments
    • G10H 1/18 Selecting circuits
    • G10H 1/22 Selecting circuits for suppressing tones; Preference networks
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 2230/00 General physical, ergonomic or hardware implementation of electrophonic musical tools or instruments, e.g. shape or architecture
    • G10H 2230/025 Computing or signal processing architecture features
    • G10H 2230/041 Processor load management, i.e. adaptation or optimization of computational load or data throughput in computationally intensive musical processes to avoid overload artifacts, e.g. by deliberately suppressing less audible or less relevant tones or decreasing their complexity
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10H ELECTROPHONIC MUSICAL INSTRUMENTS; INSTRUMENTS IN WHICH THE TONES ARE GENERATED BY ELECTROMECHANICAL MEANS OR ELECTRONIC GENERATORS, OR IN WHICH THE TONES ARE SYNTHESISED FROM A DATA STORE
    • G10H 2250/00 Aspects of algorithms or signal processing methods without intrinsic musical character, yet specifically adapted for or used in electrophonic musical processing
    • G10H 2250/471 General musical sound synthesis principles, i.e. sound category-independent synthesis methods
    • G10H 2250/481 Formant synthesis, i.e. simulating the human speech production mechanism by exciting formant resonators, e.g. mimicking vocal tract filtering as in LPC synthesis vocoders, wherein musical instruments may be used as excitation signal to the time-varying filter estimated from a singer's speech
    • G10H 2250/495 Use of noise in formant synthesis

Abstract

A device (1) is arranged for synthesizing sound represented by sets of parameters, each set comprising noise parameters (NP) representing noise components of the sound and optionally also other parameters representing other components, such as transients and sinusoids. Each set of parameters may correspond with a sound channel, such as a MIDI voice. In order to reduce the computational load, the device comprises a selection unit (2) for selecting a limited number of sets from the total number of sets on the basis of a perceptual relevance value, such as the amplitude or energy. The device further comprises a synthesizing unit (3) for synthesizing the noise components using the noise parameters of the selected sets only.

Description

Sound synthesis
The present invention relates to the synthesis of sound. More specifically, the present invention relates to a device and a method for synthesizing sound, the sound being represented by sets of parameters, each set comprising noise parameters representing a noise component of the sound and possibly other parameters representing other components.
Representing sound by sets of parameters is well known. So-called parametric coding techniques are used to encode sound efficiently by representing the sound with a series of parameters. A suitable decoder can use this series of parameters to substantially reconstruct the original sound. The series of parameters may be divided into sets, each set corresponding to an individual sound source (sound channel), such as a (human) speaker or a musical instrument.
The popular MIDI (Musical Instrument Digital Interface) protocol allows music to be represented by sets of instructions for musical instruments. Each instruction is assigned to a particular instrument. Each instrument can use one or more channels (called "voices" in MIDI). The number of channels that can be used simultaneously is called the polyphony level or polyphony. MIDI instructions can be transmitted and/or stored efficiently.
Synthesizers typically contain sound definition data, for example a sound bank or patch (timbre) data. In a sound bank, samples of instrument sounds are stored as sound data, whereas patch data define control parameters for sound generators.
MIDI instructions cause the synthesizer to retrieve sound data from the sound bank and to synthesize the sound represented by these data. As in conventional wavetable synthesis, the sound data may be actual sound samples, that is, digitized sounds (waveforms). However, sound samples typically require large amounts of memory, which is not feasible in relatively small devices, in particular hand-held consumer devices such as mobile (cellular) phones.
Alternatively, sound samples may be represented by parameters, for example amplitude, frequency, phase and/or envelope shape parameters, which allow the sound samples to be reconstructed. The amount of memory required to store sound sample parameters is typically far smaller than that required to store the actual sound samples. However, the synthesis of the sound may involve a substantial amount of computation. This is particularly the case when many parameter sets, representing different sound channels ("voices" in MIDI), have to be synthesized simultaneously (high polyphony). The computational burden typically grows linearly with the number of channels ("voices") to be synthesized, that is, linearly with the degree of polyphony. This makes it very difficult to use this technique in hand-held devices.
The paper "Parametric Audio Coding Based Wavetable Synthesis" by M. Szczerba, W. Oomen and M. Klein Middelink, Audio Engineering Society Convention Paper No. 6063, Berlin (Germany), May 2004, discloses an SSC (Sinusoidal Coding) wavetable synthesizer. The SSC encoder decomposes an audio input into transient, sinusoidal and noise components and generates a parametric representation for each of these components. These parametric representations are stored in a sound bank. The SSC decoder (synthesizer) uses the parametric representations to reconstruct the original audio input. To reconstruct the noise component, the temporal envelopes of the individual channels are combined with their respective gains and added, after which white noise is mixed with the combined temporal envelope to produce a temporally shaped noise signal. Filter coefficients are derived from the spectral envelope parameters of the individual channels and are used to filter the temporally shaped noise signal, thus producing a noise signal that is shaped both in time and in frequency.
Although this known arrangement is very effective, determining the temporal and spectral envelopes for a large number of sound channels requires a considerable computational load. In many modern audio systems 64 sound channels can be used, and even more channels are envisaged. This makes the known arrangement unsuitable for relatively small devices having limited computational resources.
On the other hand, there is an increasing demand for sound synthesis in hand-held consumer devices such as mobile phones. Consumers now expect their hand-held devices to produce a wide range of sounds, such as different ring tones.
It is therefore an object of the present invention to overcome these and other problems of the prior art and to provide a device and a method for synthesizing the noise components of sound which are more efficient and reduce the computational load.
Accordingly, the present invention provides a device for synthesizing sound, the sound being represented by sets of parameters, each set comprising noise parameters representing a noise component of the sound, the device comprising:
- selection means for selecting a limited number of sets from the total number of sets on the basis of a perceptual relevance value, and
- synthesizing means for synthesizing the noise components using the noise parameters of the selected sets only.
By selecting a limited number of parameter sets and synthesizing using only these selected sets, effectively discarding the remaining sets, the computational load of the synthesis can be reduced considerably. By using a perceptual relevance value to make the selection, the perceptual effect of omitting some parameter sets is surprisingly small.
It might be expected that using, for example, only 5 out of 64 parameter sets would severely affect the perceived quality of the reconstructed (that is, synthesized) sound. However, the inventors have found that, by suitably choosing the five sets as in this example, the sound quality is hardly affected. When the number of sets is reduced further, the sound quality degrades, but this degradation is gradual, and selecting as few as three sets may still be acceptable.
In addition to the noise parameters representing the noise components of the sound, the parameter sets may also comprise other parameters representing other components of the sound. Accordingly, each parameter set may comprise both noise parameters and other parameters, such as sinusoidal and/or transient parameters. However, it is also possible for the sets to contain noise parameters only.
It is noted that the selection of the sets of noise parameters is preferably independent of any other parameters, such as sinusoidal and transient parameters. However, in some embodiments the selection means are also arranged to select the limited number of sets from the total number of sets on the basis of one or more other parameters representing other sound components. That is, any sinusoidal and/or transient parameters contained in a set may be taken into account and may thus influence the selection of the noise parameters of that set.
In a preferred embodiment, the device comprises a decision part for deciding which parameter sets are to be selected, and a selection part for selecting parameter sets on the basis of information provided by the decision part. However, embodiments can be envisaged in which the decision part and the selection part are combined into a single integral unit. Alternatively, the device may comprise a selection part for selecting parameter sets on the basis of perceptual relevance values contained in the parameter sets. If the parameter sets contain perceptual relevance values, or any other values that allow the selection to be made without a separate decision process, a decision part is no longer necessary.
The synthesis device of the present invention may comprise a single filter for spectrally shaping the noise of all selected sets, and a Levinson-Durbin unit for determining the filter coefficients of this filter, the single filter preferably being constituted by a Laguerre filter. In this way a very efficient synthesis can be achieved.
Advantageously, the device of the present invention may further comprise gain compensation means for compensating the gains of the selected noise components for any energy loss caused by any rejected noise components. As the energy of any rejected noise components is distributed over the selected noise components, this gain compensation allows the total energy of the noise to remain substantially unaffected by the selection process.
In addition, the present invention provides an encoding device for representing sound by sets of parameters, each set comprising noise parameters representing a noise component of the sound, the device comprising a relevance detector for providing a relevance value representing the perceptual relevance of the respective noise parameters. This relevance value is preferably added to each set and may be determined using a perceptual model. The resulting parameter sets can be converted back into sound by the synthesis device defined above.
The present invention also provides a user device comprising a synthesis device as defined above. This user device, which is preferably but not necessarily portable, and more preferably hand-held, may be constituted by a mobile (cellular) phone, a CD player, a DVD player, an MP3 player, a PDA (Personal Digital Assistant) or any other suitable apparatus.
The present invention further provides a method of synthesizing sound represented by sets of parameters, each set comprising noise parameters representing a noise component of the sound, the method comprising the steps of:
- selecting a limited number of sets from the total number of sets on the basis of a perceptual relevance value, and
- synthesizing the noise components using the noise parameters of the selected sets only.
In the method of the present invention, the perceptual relevance value may be indicative of the noise amplitude and/or the noise energy.
The parameter sets may contain noise parameters only, but may also contain other parameters representing other components of the sound, such as sinusoids and/or transients.
The method of the present invention may comprise the further step of compensating the gains of the selected noise components for any energy loss caused by any rejected noise components. By applying this step, the total noise energy is substantially unaffected by the selection process.
The present invention additionally provides a computer program product for carrying out the method defined above. The computer program product may comprise a set of computer-executable instructions stored on an optical or magnetic carrier, such as a CD or DVD, or downloadable from a remote server, for example via the Internet.
The present invention will further be explained below with reference to the exemplary embodiments illustrated in the accompanying drawings, in which:
Fig. 1 schematically shows a noise synthesis device according to the present invention.
Fig. 2 schematically shows sets of parameters representing sound as used in the present invention.
Fig. 3 schematically shows the selection part of the device of Fig. 1 in more detail.
Fig. 4 schematically shows the synthesis part of the device of Fig. 1 in more detail.
Fig. 5 schematically shows a sound synthesis device incorporating the device of the present invention.
Fig. 6 schematically shows an audio encoding device.
The noise synthesis device 1, shown merely by way of non-limiting example in Fig. 1, comprises a selection unit (selection means) 2 and a synthesis unit (synthesizing means) 3. In accordance with the present invention, the selection unit 2 receives noise parameters NP, selects a limited number of them, and passes the selected parameters NP' to the synthesis unit 3. The synthesis unit 3 synthesizes shaped noise, that is, noise whose temporal and/or spectral envelope is shaped, using only the selected noise parameters NP'. An exemplary embodiment of the synthesis unit 3 will be discussed in more detail below with reference to Fig. 4.
The noise parameters NP may be part of audio parameter sets S1, S2, ..., SN, as shown in Fig. 2. In the example shown, each parameter set Si (i = 1...N) comprises transient parameters TP representing transient sound components, sinusoidal parameters SP representing sinusoidal sound components, and noise parameters NP representing noise sound components. The sets Si may be produced using an SSC encoder as mentioned above, or any other suitable encoder. It will be understood that some encoders may not produce transient parameters (TP), while others may not produce sinusoidal parameters (SP). The parameters may, but need not, conform to the MIDI format.
Each set Si may represent an active sound channel (or "voice" in MIDI systems).
The selection of the noise parameters is illustrated in more detail in Fig. 3, which schematically shows an embodiment of the selection unit 2 of the device 1. The exemplary selection unit 2 of Fig. 3 comprises a decision part 21 and a selection part 22. Both the decision part 21 and the selection part 22 receive the noise parameters NP. The decision part 21 only requires those parameters on which a suitable selection decision can be based.
A suitable decision parameter is the gain gi. In a preferred embodiment, gi is the gain of the temporal envelope of the noise of the set Si (cf. Fig. 2). However, the amplitudes of the individual noise components may also be used, or an energy value may be derived from the parameters. It will be clear that amplitude and energy are indicative of the perceptibility of the noise and therefore constitute perceptual relevance values. Advantageously, a perceptual model (for example taking the acoustic and psycho-acoustic properties of the human ear into account) may be used to determine and, optionally, weight the appropriate parameters.
The decision part 21 decides which noise parameters will be used for the noise synthesis. The decision is made using an optimization criterion applied to the perceptual relevance values, for example finding the five highest gains among the set of gains gi. The indices of the corresponding sets (for example 2, 3, 12, 23 and 41) are fed to the selection part 22. In some embodiments, the selection parameters (that is, the relevance values) may already be contained in the noise parameters NP. In such embodiments, the decision part 21 may be omitted.
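By way of illustration only, the decision step may be expressed in a few lines of code. The sketch below assumes that each parameter set exposes a scalar noise gain gi that is used directly as the perceptual relevance value; the function and variable names (select_sets, M) are illustrative and do not appear in the patent.

```python
import numpy as np

def select_sets(noise_gains, M=5):
    """Return the indices of the M sets with the largest noise gains.

    noise_gains -- per-set temporal-envelope gains g_i, used here as
                   the perceptual relevance values.
    """
    gains = np.asarray(noise_gains, dtype=float)
    M = min(M, gains.size)
    # argpartition finds the M largest entries without a full sort
    selected = np.argpartition(gains, -M)[-M:]
    return np.sort(selected)

# Example: 64 channels ("voices"), keep the 5 perceptually strongest,
# yielding set indices such as [2, 3, 12, 23, 41] in the text above.
gains = np.random.rand(64)
print(select_sets(gains, M=5))
```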
The selection part 22 is arranged to select the noise parameters of the sets indicated by the decision part 21. The noise parameters of the remaining sets are discarded. As a result, only a limited number of noise parameters are passed on to the synthesis unit (3 in Fig. 1) and synthesized. The computational load of the synthesis unit is thereby reduced considerably.
The inventors have realized that the number of noise parameters used for the synthesis can be reduced significantly without any substantial loss of sound quality. The number of selected sets can be relatively small, for example 5 selected out of a total of 64 (7.8%). In general, the number of selected sets should be at least approximately 4.5% of the total number, and preferably at least 10%, to avoid any perceptible loss of sound quality. If the number of selected sets is reduced to below approximately 4.5%, the quality of the synthesized sound degrades gradually, but this may still be acceptable for some applications. It will be understood that higher percentages, such as 15%, 20%, 30% or 40%, may also be used, although this increases the computational load.
The decision as to which sets are included and which are not is made by the decision part 21 on the basis of perceptual relevance values, for example the amplitudes (levels) of the noise components, articulation data obtained from the sound bank (control envelope generators, LF oscillators, etc.), and information obtained from the MIDI data, for example note-on velocity and articulation-related controllers. Other perceptual relevance values may also be used. Typically, the M sets having the largest relevance values, for example the highest noise amplitudes (or gains), are selected.
In addition, or alternatively, the decision part 21 may use other parameters from each set. For example, the sinusoidal parameters may be used to reduce the number of noise parameters. Using the sinusoidal (and/or transient) parameters, a masking curve can be constructed, and noise parameters whose amplitudes lie below the masking curve can be ignored. The noise parameters of a set can thus be compared with the masking curve; if they fall below the curve, the noise parameters of that set are rejected.
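A rough sketch of such a masking-based rejection is given below. It is only illustrative: the masking threshold is approximated by a fixed offset below the strongest sinusoid in each frequency band, whereas an actual implementation would use a proper psycho-acoustic model; the names and the offset value are assumptions, not taken from the patent.

```python
import numpy as np

def reject_masked_sets(noise_levels_db, sine_levels_db, offset_db=20.0):
    """Keep only sets whose noise rises above a crude masking curve.

    noise_levels_db -- (n_sets, n_bands) noise level per set and band
    sine_levels_db  -- (n_sets, n_bands) sinusoid level per set and band
    offset_db       -- assumed masker-to-threshold offset (illustrative)
    """
    noise_levels_db = np.asarray(noise_levels_db, dtype=float)
    sine_levels_db = np.asarray(sine_levels_db, dtype=float)
    # Masking curve per band: strongest sinusoid over all sets, minus an offset
    masking_db = sine_levels_db.max(axis=0) - offset_db
    # A set is kept if its noise exceeds the masking curve in at least one band
    keep = (noise_levels_db > masking_db).any(axis=1)
    return np.flatnonzero(keep)
```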
It will be understood that the selection and synthesis of the noise of the sets Si (Fig. 2) is typically carried out per time unit, for example per time frame. The noise parameters and the other parameters may therefore relate to a certain time unit only. Time units such as time frames may overlap.
An exemplary embodiment of the synthesis unit 3 of Fig. 1 is shown in more detail in Fig. 4. In this embodiment, the noise is produced using a temporal (time-domain) envelope and a spectral (frequency-domain) envelope.
Temporal envelope generators 311, 312 and 313 receive envelope parameters bi (i = 1...M) corresponding to the respective selected sets Si. In accordance with the present invention, the number M of selected sets is smaller than the number N of available sets. The temporal envelope parameters bi define the temporal envelopes output by the generators 311-313. Multipliers 331, 332 and 333 multiply these temporal envelopes by the respective gains gi. The resulting gain-adjusted temporal envelopes are added by an adder 341 and fed to a further multiplier 339, where they are multiplied by (white) noise produced by a noise generator 305. The resulting noise signal, which is shaped in time but typically still has a substantially flat spectrum, is fed to an (optional) overlap-and-add circuit 360. In this circuit the noise segments of subsequent time frames are combined into a continuous signal, which is fed to a filter 390.
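The temporal-shaping path just described can be paraphrased as follows. The sketch assumes that the envelope parameters bi have already been expanded into per-sample envelopes, and the overlap-add uses a plain Hann window; neither assumption comes from the patent.

```python
import numpy as np

def shape_noise_frame(envelopes, gains, rng=None):
    """Temporally shape white noise for one frame (units 311-341, 305 and 339).

    envelopes -- (M, frame_len) temporal envelopes of the M selected sets
    gains     -- (M,) gains g_1..g_M (possibly already gain-compensated)
    """
    if rng is None:
        rng = np.random.default_rng()
    envelopes = np.asarray(envelopes, dtype=float)
    gains = np.asarray(gains, dtype=float)
    combined = (gains[:, None] * envelopes).sum(axis=0)  # multipliers 331-333, adder 341
    white = rng.standard_normal(envelopes.shape[1])      # noise generator 305
    return combined * white                              # multiplier 339

def overlap_add(frames, hop):
    """Combine the per-frame noise segments into a continuous signal (circuit 360)."""
    frames = np.asarray(frames, dtype=float)
    frame_len = frames.shape[1]
    window = np.hanning(frame_len)
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        out[k * hop:k * hop + frame_len] += window * frame
    return out
```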
As stated above, the gains g1 to gM correspond to the selected sets. As there are N available sets, the gains gM+1 to gN correspond to the rejected sets. In the preferred embodiment shown in Fig. 4, the gains gM+1 to gN are not discarded but are used to adjust the gains g1 to gM. This gain compensation serves to reduce, or even eliminate, the influence of the noise parameter selection on the level (that is, the amplitude) of the synthesized noise.
To this end, the embodiment of Fig. 4 additionally comprises an adder 343 and a scaling unit 349. The adder 343 adds the gains gM+1 to gN and feeds the resulting total gain to the scaling unit 349, which applies a scaling factor 1/M to produce a compensation gain gc, M being the number of selected sets as stated above. This compensation gain gc is then added to each of the gains g1 to gM by adders 334, 335, ..., the number of adders being equal to M. By distributing the total gain of the rejected components over the selected components, the noise energy remains substantially constant, since level changes caused by the selection of noise components are avoided.
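As a plain-scalar illustration (assuming the gains can be treated as simple scalars per frame, which the patent does not state explicitly), the compensation amounts to:

```python
import numpy as np

def compensate_gains(gains, selected_idx):
    """Distribute the total gain of the rejected sets over the selected ones.

    Computes g_c = (g_{M+1} + ... + g_N) / M and returns g_i + g_c for the
    selected sets, so that the overall noise level stays roughly constant.
    """
    gains = np.asarray(gains, dtype=float)
    selected = gains[selected_idx]
    rejected_sum = gains.sum() - selected.sum()   # adder 343
    g_c = rejected_sum / len(selected_idx)        # scaling unit 349 (factor 1/M)
    return selected + g_c                         # adders 334, 335, ...
```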
It will be understood that the adder 343, the scaling unit 349 and the adders 334, 335, ... are optional and may be absent in other embodiments. If desired, the scaling unit 349 may alternatively be arranged between the adder 341 and the multiplier 339.
The filter 390, which in a preferred embodiment is a Laguerre filter, serves to shape the spectrum of the noise signal. Spectral envelope parameters ai derived from the selected sets Si are fed to an autocorrelation unit 321, which computes the autocorrelation of these parameters. The resulting autocorrelations are added by an adder 342 and fed to a unit 370, which determines the filter coefficients of the spectral shaping filter 390. In a preferred embodiment, the unit 370 is arranged to determine the filter coefficients using the well-known Levinson-Durbin algorithm. The resulting linear filter coefficients are then converted into Laguerre filter coefficients by a conversion unit 380. The Laguerre filter 390 then shapes the spectral envelope of the (white) noise.
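The Levinson-Durbin recursion of unit 370 is standard and is sketched below for reference; it turns the summed autocorrelation produced by adder 342 into linear prediction coefficients. The subsequent conversion to Laguerre coefficients (unit 380) and the Laguerre filtering itself (filter 390) are not reproduced here.

```python
import numpy as np

def levinson_durbin(r, order):
    """Levinson-Durbin recursion: autocorrelation -> linear prediction coefficients.

    r     -- autocorrelation sequence, r[0] being the zero-lag value
    order -- prediction order
    Returns (a, err) with the convention A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                        # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]   # update earlier coefficients
        a[i] = k
        err *= (1.0 - k * k)                  # remaining prediction error
    return a, err
```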
As an alternative to determining the autocorrelation function of each set of parameters ai, a more efficient method may be used. The power spectra of the selected sets (that is, of the selected active channels or "voices") are computed and added, and the autocorrelation function is then obtained by applying an inverse Fourier transform to the summed power spectrum. The resulting autocorrelation function is fed to the Levinson-Durbin unit 370.
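This alternative follows from the Wiener-Khinchin relation between power spectrum and autocorrelation. A sketch, assuming the spectral envelope of each selected set is available as a sampled magnitude response from 0 Hz up to the Nyquist frequency (scaling constants are omitted, since the Levinson-Durbin coefficients are invariant to a common scale factor of the autocorrelation):

```python
import numpy as np

def autocorrelation_from_spectra(magnitudes, n_lags):
    """Sum the power spectra of the selected sets and inverse-transform.

    magnitudes -- (M, n_bins) sampled magnitude responses of the selected sets,
                  covering 0..Nyquist
    Returns the first n_lags values of the autocorrelation of the sum.
    """
    magnitudes = np.asarray(magnitudes, dtype=float)
    power = (np.abs(magnitudes) ** 2).sum(axis=0)      # summed power spectrum
    # Mirror to a full symmetric spectrum so that the inverse FFT is real
    full = np.concatenate([power, power[-2:0:-1]])
    r = np.fft.ifft(full).real
    return r[:n_lags]

# The result can be passed directly to the Levinson-Durbin sketch above.
```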
It will be understood that the parameters ai, bi, gi and λ are all part of the noise parameters denoted NP in Fig. 1 and Fig. 2. In the selection unit embodiment of Fig. 3, the decision part 21 uses only the gain parameters gi. However, embodiments can be envisaged in which some or all of the parameters ai, bi, gi and λ, and possibly also other parameters (for example relating to sinusoidal and/or transient components), are used by the decision part. It is noted that the parameter λ may be a constant and need not be part of the noise parameters NP.
Fig. 5 schematically shows a sound synthesizer in which the present invention may be used. The synthesizer 5 comprises a noise synthesizer 51, a sinusoid synthesizer 52 and a transient synthesizer 53. Their output signals (synthesized transients, sinusoids and noise) are added by an adder 54 to form a synthesized audio output signal. The noise synthesizer 51 advantageously comprises a device as defined above (1 in Fig. 1).
The synthesizer 5 may be part of an audio (sound) decoder (not shown). The audio decoder may comprise a demultiplexer for demultiplexing an incoming bit stream and separating out the sets of transient parameters (TP), sinusoidal parameters (SP) and noise parameters (NP).
The audio encoding device 6, shown merely by way of non-limiting example in Fig. 6, encodes an audio signal s(n) in three stages.
In the first stage, any transient signal components in the audio signal s(n) are encoded using a transient parameter extraction (TPE) unit 61. The parameters are supplied to a multiplexing (MUX) unit 68 and to a transient synthesis (TS) unit 62. While the multiplexing unit 68 suitably combines and multiplexes the parameters for transmission to a decoder, such as the device 5 of Fig. 5, the transient synthesis unit 62 reconstructs the encoded transients. These reconstructed transients are subtracted from the original audio signal s(n) in a first combination unit 63 to form an intermediate signal from which the transients have substantially been removed.
In the second stage, any sinusoidal signal components (that is, sines and cosines) in the intermediate signal are encoded by a sinusoid parameter extraction (SPE) unit 64. The resulting parameters are fed to the multiplexing unit 68 and to a sinusoid synthesis (SS) unit 65. The sinusoids reconstructed by the sinusoid synthesis unit 65 are subtracted from the intermediate signal in a second combination unit 66, resulting in a residual signal.
In the third stage, the residual signal is encoded using a time/frequency envelope data extraction (TFE) unit 67. It is noted that, since the transients and sinusoids have been removed in the first and second stages, the residual signal is assumed to be a noise signal. The time/frequency envelope data extraction (TFE) unit 67 therefore represents the residual noise by suitable noise parameters.
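The signal flow of this three-stage cascade can be summarized in a few lines. The sketch only shows the subtractions; the extraction and synthesis functions passed in stand for units 61/62, 64/65 and 67 and are placeholders, not the patent's algorithms.

```python
def encode_three_stage(s, extract_transients, synth_transients,
                       extract_sinusoids, synth_sinusoids, extract_noise):
    """Three-stage parametric encoding of an audio signal s(n), as in Fig. 6."""
    tp = extract_transients(s)                      # TPE unit 61
    intermediate = s - synth_transients(tp)         # TS unit 62 + combiner 63
    sp = extract_sinusoids(intermediate)            # SPE unit 64
    residual = intermediate - synth_sinusoids(sp)   # SS unit 65 + combiner 66
    noise_params = extract_noise(residual)          # TFE unit 67: residual treated as noise
    return tp, sp, noise_params
```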
An overview of prior-art noise modeling and coding techniques is given in chapter 5 of the thesis "Audio Representations for Data Compression and Compressed Domain Processing" by S. N. Levine, Stanford University (USA), 1999, the entire contents of which are herewith incorporated in this document.
The parameters resulting from all three stages are suitably combined and multiplexed by the multiplexing (MUX) unit 68, which may also apply additional coding to the parameters, for example Huffman coding or time-differential coding, so as to reduce the bandwidth required for transmission.
It is noted that the parameter extraction (that is, encoding) units 61, 64 and 67 may quantize the extracted parameters. Alternatively or additionally, quantization may be carried out in the multiplexing (MUX) unit 68. It is further noted that s(n) is a digital signal, n denoting the sample number, and that the sets Si(n) are transmitted as digital signals. However, the invention may also be applied to analog signals.
After being combined and multiplexed (and optionally encoded and/or quantized) in the MUX unit 68, the parameters are transmitted via a transmission medium, such as a satellite link, a glass fiber cable, a copper cable or any other suitable medium.
The audio encoding device 6 further comprises a relevance detector (RD) 69. This relevance detector 69 receives certain parameters, such as the noise gains gi, and determines their acoustic (perceptual) relevance (cf. Fig. 3). The resulting relevance values are fed back to the multiplexer 68, where they are inserted into the sets Si(n) that constitute the output bit stream. A decoder can then use the relevance values contained in the sets to select suitable noise parameters without having to determine their perceptual relevance itself. In this way the decoder can be simpler and faster.
Although the relevance detector (RD) 69 is shown in Fig. 6 as being connected to the multiplexer 68, the relevance detector 69 could instead be connected directly to the time/frequency envelope data extraction (TFE) unit 67. The operation of the relevance detector 69 may be similar to that of the decision part 21 shown in Fig. 3.
The audio encoding device 6 shown in Fig. 6 has three stages. However, the audio encoding device 6 may also have fewer than three stages, for example only two stages producing sinusoidal and noise parameters, or more than three stages producing additional parameters. Embodiments can therefore be envisaged in which the units 61, 62 and 63 are absent. The audio encoding device 6 of Fig. 6 may advantageously be arranged to produce audio parameters that can be decoded (synthesized) by the synthesis device shown in Fig. 1.
The synthesis device of the present invention may be used in portable devices, in particular hand-held consumer devices such as cellular phones, PDAs (Personal Digital Assistants), watches, gaming devices, solid-state audio players, electronic musical instruments, digital telephone answering machines, portable CD and/or DVD players, and so on.
It is clear from the above that the present invention also provides a method of synthesizing sound represented by sets of parameters, wherein each parameter set comprises noise parameters representing a noise component of the sound, and optionally also other parameters representing other components, such as transients and/or sinusoids. The method of the present invention essentially comprises the steps of:
- selecting a limited number of sets from the total number of sets on the basis of a perceptual relevance value, and
- synthesizing the noise components using the noise parameters of the selected sets only.
The method of the present invention may additionally comprise the optional step of compensating the gains of the selected noise components for any energy loss caused by rejected noise components. Further optional method steps may be derived from the description above.
Additionally, the present invention provides an encoding device for representing sound by sets of parameters, each set comprising noise parameters representing a noise component of the sound and preferably also transient and/or sinusoidal parameters, the device comprising a relevance detector for providing a relevance value representing the perceptual relevance of the respective noise parameters.
The present invention is based on the insight that, when synthesizing the noise components of sound, selecting only a limited number of sound channels does not substantially degrade the synthesized sound. The present invention benefits from the further insight that selecting the sound channels on the basis of a perceptual relevance value minimizes, or even eliminates, any distortion of the synthesized sound.
It is noted that any terms used in this document should not be construed so as to limit the scope of the present invention. In particular, the words "comprise(s)" and "comprising" are not meant to exclude any elements not specifically stated. Single (circuit) elements may be substituted by multiple (circuit) elements or by their equivalents.
It will be understood by those skilled in the art that the present invention is not limited to the embodiments illustrated above and that many modifications and additions may be made without departing from the scope of the invention as defined in the appended claims.

Claims (18)

1. A device (1) for synthesizing sound, the sound being represented by sets of parameters, each set comprising noise parameters (NP) representing a noise component of the sound, the device comprising:
- selection means (2) for selecting, on the basis of a perceptual relevance value, from the total number of sets a subset of sets having the largest perceptual relevance values, wherein said perceptual relevance value is indicative of the amplitude and/or the energy of the noise component, and
- synthesizing means (3) for synthesizing the noise components using the noise parameters of the selected sets only.
2. The device according to claim 1, wherein each parameter set further comprises other parameters (SP; TP) representing transient and/or sinusoidal components of the sound.
3. The device according to claim 2, wherein the selection means (2) are further arranged to select the limited number of sets from the total number of sets on the basis of one or more other parameters (SP; TP) representing other components of the sound.
4. The device according to claim 1, wherein the noise parameters (NP) define a temporal envelope and/or a spectral envelope of the noise.
5. The device according to claim 1, wherein each parameter set corresponds to a single sound channel.
6. The device according to claim 1, comprising a decision part (21) for deciding which parameter sets are to be selected, and a selection part (22) for selecting parameter sets on the basis of information provided by the decision part (21).
7. The device according to claim 1, comprising a selection part (22) for selecting parameter sets on the basis of perceptual relevance values contained in the parameter sets.
8. The device according to claim 1, wherein the synthesizing means (3) comprise a single filter (390) for spectrally shaping the noise of all selected sets, and a Levinson-Durbin unit (370) for determining the filter coefficients of the filter (390).
9. The device according to claim 1, further comprising gain compensation means (343, 349) for compensating the gains of the selected noise components for any energy loss caused by any rejected noise components.
10. An audio synthesizer (5) comprising a device (1) for synthesizing sound according to claim 1.
11. A user device comprising a device (1) for synthesizing sound according to claim 1.
12. A method of synthesizing sound, the sound being represented by sets of parameters, each set comprising noise parameters (NP) representing a noise component of the sound, the method comprising the steps of:
- selecting, on the basis of a perceptual relevance value, from the total number of sets a subset of sets having the largest perceptual relevance values, wherein said perceptual relevance value is indicative of the amplitude and/or the energy of the noise component, and
- synthesizing the noise components using the noise parameters of the selected sets only.
13. The method according to claim 12, wherein each parameter set further comprises other parameters (SP; TP) representing transient and/or sinusoidal components of the sound.
14. The method according to claim 13, wherein the step of selecting the limited number of sets from the total number of sets is also carried out on the basis of one or more other parameters (SP; TP) representing other components of the sound.
15. The method according to claim 12, wherein the noise parameters define a temporal envelope and/or a spectral envelope of the noise.
16. The method according to claim 12, wherein each parameter set corresponds to a single sound channel.
17. The method according to claim 12, further comprising the step of compensating the gains of the selected noise components for any energy loss caused by any rejected noise components.
18. The method according to claim 12, wherein each parameter set comprises a perceptual relevance value.
CN2006800046437A 2005-02-10 2006-02-01 Sound synthesis Expired - Fee Related CN101116135B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
EP05100948 2005-02-10
EP05100948.8 2005-02-10
PCT/IB2006/050338 WO2006085244A1 (en) 2005-02-10 2006-02-01 Sound synthesis

Publications (2)

Publication Number Publication Date
CN101116135A CN101116135A (en) 2008-01-30
CN101116135B true CN101116135B (en) 2012-11-14

Family

ID=36540169

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2006800046437A Expired - Fee Related CN101116135B (en) 2005-02-10 2006-02-01 Sound synthesis

Country Status (6)

Country Link
US (1) US7781665B2 (en)
EP (1) EP1851752B1 (en)
JP (1) JP5063364B2 (en)
KR (1) KR101207325B1 (en)
CN (1) CN101116135B (en)
WO (1) WO2006085244A1 (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5063363B2 * 2005-02-10 2012-10-31 Koninklijke Philips Electronics N.V. Speech synthesis method
JP2009543112A * 2006-06-29 2009-12-03 NXP B.V. Decoding speech parameters
US20080184872A1 (en) * 2006-06-30 2008-08-07 Aaron Andrew Hunt Microtonal tuner for a musical instrument using a digital interface
US9111525B1 (en) * 2008-02-14 2015-08-18 Foundation for Research and Technology—Hellas (FORTH) Institute of Computer Science (ICS) Apparatuses, methods and systems for audio processing and transmission
CN102057356A (en) * 2008-06-11 2011-05-11 高通股份有限公司 Method and system for measuring task load
JP6821970B2 * 2016-06-30 2021-01-27 Yamaha Corporation Speech synthesizer and speech synthesis method
CN113053353B * 2021-03-10 2022-10-04 Du Xiaoman Technology (Beijing) Co., Ltd. Training method and device of speech synthesis model
CN113470691A * 2021-07-08 2021-10-01 Zhejiang Dahua Technology Co., Ltd. Automatic gain control method of voice signal and related device thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004040553A1 (en) * 2002-10-31 2004-05-13 Nec Corporation Bandwidth expanding device and method
US6766293B1 (en) * 1997-07-14 2004-07-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Method for signalling a noise substitution during audio signal coding

Family Cites Families (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2581047B2 * 1986-10-24 1997-02-12 Yamaha Corporation Tone signal generation method
US5029509A (en) * 1989-05-10 1991-07-09 Board Of Trustees Of The Leland Stanford Junior University Musical synthesizer combining deterministic and stochastic waveforms
DE69028072T2 (en) * 1989-11-06 1997-01-09 Canon Kk Method and device for speech synthesis
FR2679689B1 (en) * 1991-07-26 1994-02-25 Etat Francais METHOD FOR SYNTHESIZING SOUNDS.
US5248845A (en) * 1992-03-20 1993-09-28 E-Mu Systems, Inc. Digital sampling instrument
US5763800A (en) * 1995-08-14 1998-06-09 Creative Labs, Inc. Method and apparatus for formatting digital audio data
JPH11513820A * 1995-10-23 1999-11-24 The Regents of the University of California Control structure for speech synthesis
US5686683A (en) * 1995-10-23 1997-11-11 The Regents Of The University Of California Inverse transform narrow band/broad band sound synthesis
WO1997017692A1 (en) * 1995-11-07 1997-05-15 Euphonics, Incorporated Parametric signal modeling musical synthesizer
JPH1091194A (en) * 1996-09-18 1998-04-10 Sony Corp Method of voice decoding and device therefor
US5886276A (en) * 1997-01-16 1999-03-23 The Board Of Trustees Of The Leland Stanford Junior University System and method for multiresolution scalable audio signal encoding
US5977469A (en) 1997-01-17 1999-11-02 Seer Systems, Inc. Real-time waveform substituting sound engine
EP0878790A1 (en) 1997-05-15 1998-11-18 Hewlett-Packard Company Voice coding system and method
US5920843A * 1997-06-23 1999-07-06 Microsoft Corporation Signal parameter track time slice control point, step duration, and staircase delta determination, for synthesizing audio by plural functional components
US7756892B2 (en) * 2000-05-02 2010-07-13 Digimarc Corporation Using embedded data with file sharing
US5900568A (en) * 1998-05-15 1999-05-04 International Business Machines Corporation Method for automatic sound synthesis
US6240386B1 (en) 1998-08-24 2001-05-29 Conexant Systems, Inc. Speech codec employing noise classification for noise compensation
WO2000011649A1 (en) 1998-08-24 2000-03-02 Conexant Systems, Inc. Speech encoder using a classifier for smoothing noise coding
US6493666B2 (en) 1998-09-29 2002-12-10 William M. Wiese, Jr. System and method for processing data from and for multiple channels
JP3707300B2 * 1999-06-02 2005-10-19 Yamaha Corporation Expansion board for musical sound generator
JP4220108B2 2000-06-26 2009-02-04 Dai Nippon Printing Co., Ltd. Acoustic signal coding system
JP2002140067A (en) * 2000-11-06 2002-05-17 Casio Comput Co Ltd Electronic musical instrument and registration method for electronic musical instrument
SG118122A1 (en) * 2001-03-27 2006-01-27 Yamaha Corp Waveform production method and apparatus
PL365018A1 (en) * 2001-04-18 2004-12-27 Koninklijke Philips Electronics N.V. Audio coding
WO2002087241A1 (en) * 2001-04-18 2002-10-31 Koninklijke Philips Electronics N.V. Audio coding with partial encryption
EP1451809A1 (en) * 2001-11-23 2004-09-01 Koninklijke Philips Electronics N.V. Perceptual noise substitution
ES2354427T3 (en) * 2003-06-30 2011-03-14 Koninklijke Philips Electronics N.V. IMPROVEMENT OF THE DECODED AUDIO QUALITY THROUGH THE ADDITION OF NOISE.
US7676362B2 (en) * 2004-12-31 2010-03-09 Motorola, Inc. Method and apparatus for enhancing loudness of a speech signal
JP2009543112A * 2006-06-29 2009-12-03 NXP B.V. Decoding speech parameters

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6766293B1 (en) * 1997-07-14 2004-07-20 Fraunhofer-Gesellschaft Zur Foerderung Der Angewandten Forschung E.V. Method for signalling a noise substitution during audio signal coding
WO2004040553A1 (en) * 2002-10-31 2004-05-13 Nec Corporation Bandwidth expanding device and method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Marek Szczerba, Werner Oomen and Marc Klein Middelink, "Parametric Audio Coding Based Wavetable Synthesis", AES 116th Convention, Berlin, Germany, 2004. *

Also Published As

Publication number Publication date
JP2008530608A (en) 2008-08-07
KR101207325B1 (en) 2012-12-03
CN101116135A (en) 2008-01-30
KR20070104465A (en) 2007-10-25
EP1851752A1 (en) 2007-11-07
EP1851752B1 (en) 2016-09-14
WO2006085244A1 (en) 2006-08-17
JP5063364B2 (en) 2012-10-31
US20080184871A1 (en) 2008-08-07
US7781665B2 (en) 2010-08-24

Similar Documents

Publication Publication Date Title
CN101116135B (en) Sound synthesis
KR101315075B1 (en) Sound synthesis
JP5934922B2 (en) Decoding device
EP2054875B1 (en) Enhanced coding and parameter representation of multichannel downmixed object coding
JP4705203B2 (en) Voice quality conversion device, pitch conversion device, and voice quality conversion method
US20060050898A1 (en) Audio signal processing apparatus and method
EP1701336B1 (en) Sound processing apparatus and method, and program therefor
JP2007187905A (en) Signal-encoding equipment and method, signal-decoding equipment and method, and program and recording medium
JP2003108197A (en) Audio signal decoding device and audio signal encoding device
CN101213592B (en) Device and method of parametric multi-channel decoding
JP2796408B2 (en) Audio information compression device
JP4403721B2 (en) Digital audio decoder
KR100264389B1 (en) Computer music cycle with key change function
D'Aguanno et al. MP3 window-switching pattern analysis for general purposes beat tracking on music with drums
JP2001083971A (en) Composing device for waveform signal, and compressing and extenting device for time axis

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121114

Termination date: 20180201

CF01 Termination of patent right due to non-payment of annual fee