CN102324235A - Sound mixing encoding method, device and system - Google Patents

Sound mixing encoding method, device and system

Info

Publication number
CN102324235A
CN102324235A CN201110205093A CN201110205093A CN102324235A CN 102324235 A CN102324235 A CN 102324235A CN 201110205093 A CN201110205093 A CN 201110205093A CN 201110205093 A CN201110205093 A CN 201110205093A CN 102324235 A CN102324235 A CN 102324235A
Authority
CN
China
Prior art keywords
audio
audio mixing
stream
codes
mixing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201110205093A
Other languages
Chinese (zh)
Inventor
张清
苗磊
李伟
许剑峰
许丽净
杜正中
胡晨
杨毅
齐峰岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201110205093A priority Critical patent/CN102324235A/en
Publication of CN102324235A publication Critical patent/CN102324235A/en
Pending legal-status Critical Current

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a terminal-side encoding method comprising the following steps: setting a sound mixing identifier for sound information according to a sound mixing strategy, and encoding the sound information according to the sound mixing identifier information to obtain core encoded data; if the sound mixing identifier information indicates that sound mixing is needed, calculating dynamic side information, and generating and outputting an audio encoded code stream containing the sound mixing identifier, the core encoded data and the dynamic side information; and if the sound mixing identifier information indicates that sound mixing is not needed, generating and outputting, by the terminal, an audio encoded code stream containing the sound mixing identifier and the core encoded data. The invention further discloses a corresponding network-side sound mixing encoding method, and a device and a system for sound mixing encoding. The scheme of the invention can solve the problems of signal overflow and of error introduced during sound mixing without reducing encoding efficiency.

Description

Audio mixing coding method, device and system
Technical field
The present invention relates to the field of multimedia communication technology, and in particular to an audio mixing coding method, device and system.
Background
Real-time multimedia communication services, such as multimedia conference systems, are being used more and more widely. To meet growing service demands, the technologies underlying multimedia conference systems have become very important.
In a multimedia conference, audio interaction is the most basic element. In a centralized conference, every terminal establishes a unicast-based connection with a multipoint control unit (Multi-point Controlling Unit, MCU), sends its audio code stream to the MCU in real time, and receives an audio code stream from the MCU. The inputs of the MCU are therefore audio code streams encoded with various coding schemes, and its outputs are audio code streams that have been mixed according to a synthesis strategy.
Fig. 1 is a schematic diagram of a multimedia conference system, in which the dashed box can be regarded as one MCU. The audio code streams input from terminal unit 1, terminal unit 2 and so on are decoded separately; the decoded signals are mixed by the audio mixing unit; the mixed signals are then encoded separately and output to the corresponding terminals. In the multimedia conference system shown in Fig. 1, M terminals take part in mixing. At a given moment t, each terminal sends audio data to the MCU; the MCU first decodes the audio data, computes the mixing parameters for each signal, and finally mixes the decoded signals. The most common mixing algorithm sums the decoded data of all channels, encodes the sum with an encoder, and sends the result to each terminal.
The time-domain superposition mixing scheme described above often introduces noise. The audio signal that each terminal sends to the MCU lies in a range [min, max], where min is the lower limit of the range and max is the upper limit. When the signals of all channels are summed directly, the result may fall outside [min, max]. Because a digital audio signal has upper and lower quantization limits, the summation may overflow. The usual remedy is to detect overflow and then apply saturation arithmetic: a result above the upper limit is set to the upper limit, and a result below the lower limit is set to the lower limit. This operation destroys the original time-domain characteristics of the speech signal and thus introduces noise, which is why popping sounds and speech discontinuities occur in some systems.
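The overflow behaviour described above can be illustrated with a minimal sketch. It assumes 16-bit PCM samples, so that [min, max] is [-32768, 32767]; the function name `mix_time_domain` and the sample values are hypothetical and do not come from the patent.

```python
# Assumed quantization range [min, max] for 16-bit PCM (not stated in the patent).
INT16_MIN, INT16_MAX = -32768, 32767

def mix_time_domain(frames):
    """Sum decoded frames sample by sample, then saturate to [min, max].

    Returns the mixed frame and the number of samples that overflowed.
    """
    mixed, overflowed = [], 0
    for samples in zip(*frames):
        total = sum(samples)
        if total > INT16_MAX or total < INT16_MIN:
            overflowed += 1
        # Saturation: clamp the sum to the upper or lower limit.
        mixed.append(max(INT16_MIN, min(INT16_MAX, total)))
    return mixed, overflowed

# Three decoded channels, three samples each (illustrative values).
frames = [
    [20000, -15000, 100],
    [20000, -15000, 200],
    [1000, -5000, 300],
]
mixed, overflowed = mix_time_domain(frames)
```

In this example the first two samples exceed the range and are clamped, which is the waveform distortion that the background section identifies as the source of noise.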
As the number of terminals taking part in mixing increases, overflow occurs more and more often. Time-domain superposition mixing therefore has an upper limit on the number of terminals, and this limit is very low. Experiments show that in many cases, when only 4 terminals take part in mixing, the result already contains so much noise and so many interruptions that the speech can no longer be made out.
Summary of the invention
In view of this, an embodiment of the invention proposes an audio mixing coding method that can overcome the noise problem of time-domain audio mixing coding in the prior art. The audio mixing coding method comprises the following steps:
An audio mixing flag is set for sound information according to an audio mixing strategy, and the sound information is encoded according to the flag information; the result of the encoding serves as the core encoded data;
If the audio mixing flag information indicates that mixing is needed, dynamic side information is calculated, and an audio code stream comprising the audio mixing flag, the core encoded data and the dynamic side information is generated and output; if the audio mixing flag information indicates that mixing is not needed, an audio code stream comprising the audio mixing flag and the core encoded data is generated and output;
The network side receives the audio code streams from the terminals and determines, from the audio mixing flag information in each audio code stream, whether that stream needs mixing; from the M' audio code streams that need mixing, it selects N audio code streams according to their dynamic side information, mixes the core encoded data of the selected N audio code streams, and outputs the mixed audio code stream, where N is less than or equal to M'.
An embodiment of the invention also proposes a terminal-side coding method, comprising the following steps:
An audio mixing flag is set for sound information according to an audio mixing strategy, and the sound information is encoded according to the audio mixing flag information to obtain core encoded data;
If the audio mixing flag information indicates that mixing is needed, dynamic side information is calculated, and an audio code stream comprising the audio mixing flag, the core encoded data and the dynamic side information is generated and output; if the audio mixing flag information indicates that mixing is not needed, the terminal generates and outputs an audio code stream comprising the audio mixing flag and the core encoded data.
An embodiment of the invention also proposes a network-side audio mixing coding method, comprising the following steps:
M audio code streams are received, and whether each audio code stream needs mixing is determined from the audio mixing flag information it carries; from the M' audio code streams that need mixing, N audio code streams are selected according to their dynamic side information; the core encoded data of the selected N audio code streams are mixed, and the mixed audio code stream is output, where M, M' and N are positive integers, N is less than or equal to M', and M' is less than or equal to M.
An embodiment of the invention proposes a multimedia conference system comprising M terminals and a multipoint control unit, wherein:
Each terminal is configured to set an audio mixing flag for the collected sound information according to a local audio mixing strategy and to encode the sound information according to the flag information, the result of the encoding serving as the core encoded data; and to generate and output either an audio code stream comprising the core encoded data, an audio mixing flag indicating that mixing is needed, and dynamic side information, or an audio code stream comprising the core encoded data and an audio mixing flag indicating that mixing is not needed;
The multipoint control unit is configured to receive the audio code streams from the terminals; to determine, from the value of the audio mixing flag in each audio code stream, whether that stream needs mixing; to select, from the M' audio code streams that need mixing, N audio code streams according to their dynamic side information; to mix the core encoded data of the selected N audio code streams; and to output the mixed audio code stream, where M, M' and N are positive integers, N is less than or equal to M', and M' is less than or equal to M.
An embodiment of the invention proposes a multimedia conference terminal, comprising:
A sound collection module, configured to collect sound information;
An audio mixing policy module, configured to set an audio mixing flag for the sound information collected by the sound collection module according to a preset audio mixing strategy;
A core encoding module, configured to encode the sound information and output core encoded data;
A framing module, configured to calculate dynamic side information according to the audio mixing flag set by the audio mixing policy module and, depending on the value of the audio mixing flag, to generate either a coded audio data frame comprising the core encoded data, the audio mixing flag and the dynamic side information, or a coded audio data frame comprising the core encoded data and the audio mixing flag;
An output module, configured to output the coded audio data frames generated by the framing module as an audio code stream.
An embodiment of the invention proposes a multipoint control unit, comprising:
A selection unit, configured to receive audio code streams from M terminals; to determine, from the value of the audio mixing flag in each audio code stream, whether that stream needs mixing; and to select, from the M' audio code streams that need mixing, N audio code streams according to their dynamic side information;
An audio mixing unit, configured to mix the core encoded data of the N audio code streams selected by the selection unit and obtain M' mixed audio code streams;
A sending unit, configured to send the audio code streams from the audio mixing unit to the corresponding destination terminals.
As the above technical scheme shows, the terminal side marks the coded stream with an audio mixing flag and adds the corresponding dynamic side information; the network side uses the audio mixing flag and the dynamic side information to select the audio code streams that need mixing and mixes them, which can solve the noise problem in audio mixing coding.
Description of drawings
Fig. 1 is a schematic diagram of a prior-art multimedia conference system;
Fig. 2 is a schematic diagram of the multimedia conference system of an embodiment of the invention;
Fig. 3 is a structural diagram of the coded data frame in the audio code stream output by the terminal encoder unit of an embodiment of the invention;
Fig. 4 is a flowchart of terminal-side coding in an embodiment of the invention;
Fig. 5 is a flowchart of MCU-side audio mixing coding in an embodiment of the invention;
Fig. 6 is a block diagram of a multimedia conference terminal proposed by an embodiment of the invention;
Fig. 7 is a block diagram of a multipoint control unit proposed by an embodiment of the invention.
Embodiment
Embodiments of the invention propose an audio mixing coding method based on an audio mixing flag. The data stream output by a terminal contains, in addition to the core encoded code stream that carries the speech, an audio mixing flag and dynamic side information, where the dynamic side information carries the information needed for audio mixing coding. If the audio mixing flag is set to indicate that mixing is needed, the dynamic side information is set; if the audio mixing flag is set to indicate that mixing is not needed, the dynamic side information is not set. The MCU uses the audio mixing flag to select the core encoded code streams that need mixing and mixes them.
To make the purpose, technical scheme and advantages of the invention clearer, the invention is described in further detail below with reference to the drawings.
Fig. 2 is a schematic diagram of the multimedia conference system of an embodiment of the invention. The system comprises M terminals, namely terminal 1, terminal 2, ..., terminal M, and one MCU.
Taking terminal 1 as an example, the terminal comprises encoder unit 201. Encoder unit 201 encodes the sound collected by the sound collection device of terminal 1, such as a microphone, and generates the core encoded code stream that carries the sound information. Encoder unit 201 also sets the audio mixing flag according to the locally configured audio mixing strategy. The audio mixing strategy determines whether the coded sound output by this terminal needs mixing. Different strategies can be configured as needed. For example, different priorities can be assigned to different terminals, so that audio code streams from high-priority terminals are mixed preferentially; or a sound energy threshold can be set, so that the audio code stream of a terminal is mixed when the sound energy collected by that terminal exceeds the threshold. Several audio mixing strategies can be used at the same time.
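The two example strategies above (terminal priority and a sound energy threshold) can be sketched as follows. This is only an illustration under stated assumptions: the class `MixingPolicy`, its fields, the rule that both conditions must hold when the strategies are combined, and the convention that flag value 1 means "mixing needed" are all hypothetical and are not specified by the patent.

```python
from dataclasses import dataclass

@dataclass
class MixingPolicy:
    """Hypothetical local audio mixing strategy held by a terminal."""
    min_priority: int = 0          # assumed: terminals below this priority are not mixed
    energy_threshold: float = 0.0  # assumed: frames at or below this energy are not mixed

    def mixing_flag(self, terminal_priority, frame_energy):
        """Return 1 if the frame should be mixed, 0 otherwise (assumed convention)."""
        high_enough = terminal_priority >= self.min_priority
        loud_enough = frame_energy > self.energy_threshold
        # Assumption: when both strategies are active, both must be satisfied.
        return 1 if (high_enough and loud_enough) else 0

policy = MixingPolicy(min_priority=2, energy_threshold=100.0)
flag_loud_high = policy.mixing_flag(terminal_priority=3, frame_energy=250.0)
flag_low_priority = policy.mixing_flag(terminal_priority=1, frame_energy=250.0)
flag_quiet = policy.mixing_flag(terminal_priority=3, frame_energy=50.0)
```

Other ways of combining strategies (for example, letting either condition trigger mixing) would fit the description equally well; the choice here only keeps the sketch short.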
If the audio mixing flag that has been set indicates that mixing is needed, encoder unit 201 also generates dynamic side information and writes it into the audio code stream; if the audio mixing flag indicates that mixing is not needed, the audio code stream output by encoder unit 201 contains only the core encoded data and the audio mixing flag.
Fig. 3 shows the structure of the coded data frame in the audio code stream output by the terminal encoder unit of an embodiment of the invention. Let the total length of one data frame be n bits. When the audio mixing flag indicates that mixing is needed, the coded data frame, shown in the upper part of Fig. 3, comprises the audio mixing flag of t bits, the dynamic side information of m bits, and the core encoded data of n-m-t bits. The audio mixing flag is placed at the frame header so that the MCU can identify it easily. When the audio mixing flag indicates that mixing is not needed, the coded data frame, shown in the lower part of Fig. 3, comprises the audio mixing flag of t bits and the core encoded data of n-t bits.
For G.711 narrowband enhancement layer (Low Band Enhance, LBE) coding, the parts in Fig. 3 may take the following values: t = 1, n = 80, m = 9.
The side information comprises the frame energy (Frame Energy) and the voicing score (Voicing score). If the side information is 9 bits long, 6 of those bits carry the quantized frame energy and 3 bits carry the quantized voicing score.
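The frame layout can be made concrete with a small sketch that uses the example values t = 1, n = 80 and m = 9. The bit-string representation, the function names, the flag value 1 for "mixing needed", and the order of the two side-information fields (energy first, then voicing score) are assumptions made for illustration; the source only fixes the field widths and states that the flag sits at the frame header.

```python
N_BITS, T_BITS, M_BITS = 80, 1, 9   # example values given for G.711 LBE coding
ENERGY_BITS, VOICING_BITS = 6, 3    # split of the 9 side-information bits

def pack_frame(needs_mixing, core_bits, energy_q=0, voicing_q=0):
    """Build one frame as a string of '0'/'1' characters, flag first."""
    if needs_mixing:
        if not (0 <= energy_q < 2 ** ENERGY_BITS and 0 <= voicing_q < 2 ** VOICING_BITS):
            raise ValueError("quantized side information out of range")
        header = ("1" + format(energy_q, f"0{ENERGY_BITS}b")
                  + format(voicing_q, f"0{VOICING_BITS}b"))
    else:
        header = "0"
    # Core data must fill the rest: n-m-t bits with mixing, n-t bits without.
    if len(header) + len(core_bits) != N_BITS:
        raise ValueError("core encoded data has the wrong length")
    return header + core_bits

def unpack_frame(frame):
    """Return (needs_mixing, energy_q, voicing_q, core_bits)."""
    if frame[0] == "1":
        side_end = T_BITS + M_BITS
        energy_q = int(frame[T_BITS:T_BITS + ENERGY_BITS], 2)
        voicing_q = int(frame[T_BITS + ENERGY_BITS:side_end], 2)
        return True, energy_q, voicing_q, frame[side_end:]
    return False, None, None, frame[T_BITS:]

mixed_frame = pack_frame(True, "1" * 70, energy_q=42, voicing_q=5)
plain_frame = pack_frame(False, "0" * 79)
```

With these values the core encoder has 70 bits per frame when mixing is requested and 79 bits otherwise, which is where the bit-allocation benefit described later in the text comes from.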
The frame energy is calculated with formula (1):
$$\mathrm{Frame\_Energy} = \frac{\sum_{i=0}^{\mathrm{Frame\_Length}-1} S^{2}(i)}{\mathrm{Frame\_Length}} \tag{1}$$
Frame_Length is the frame length, S(i) is the low-band signal after the quadrature mirror filter (Quadrature Mirror Filter, QMF), and i is the index of the sample within the frame.
The voicing score is calculated with formula (2):
$$\mathrm{Voicing\_score} = \frac{\mathrm{Zero\_Crossing\_Rate}}{\mathrm{Scale\_Factor}} \tag{2}$$
Here the zero-crossing rate (Zero_Crossing_Rate) is the number of times the time-domain waveform crosses zero within 10 ms. The scale factor (Scale_Factor) is a preset scaling constant that takes a value in [0, 1].
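A minimal sketch of the two side-information quantities follows. It assumes that formula (1) is the mean of the squared samples and that formula (2) is the quotient of the zero-crossing count and the scale factor; the extracted text of both formulas is garbled, so the quotient form in particular is an interpretation. The function names, the sign convention used to detect a zero crossing, and the sample values are hypothetical, and quantization to 6 and 3 bits is not shown because the source does not describe it.

```python
def frame_energy(samples):
    """Mean of the squared samples of one frame (assumed reading of formula (1))."""
    return sum(s * s for s in samples) / len(samples)

def zero_crossings(samples):
    """Count sign changes between consecutive samples (zero is treated as non-negative)."""
    count = 0
    for prev, cur in zip(samples, samples[1:]):
        if (prev >= 0) != (cur >= 0):
            count += 1
    return count

def voicing_score(samples, scale_factor):
    """Zero-crossing count divided by the scale factor (assumed reading of formula (2))."""
    if scale_factor <= 0:
        raise ValueError("scale_factor must be positive for the quotient to be defined")
    return zero_crossings(samples) / scale_factor

# Illustrative low-band samples for one short frame.
samples = [3, -1, -2, 4, 0, -5]
energy = frame_energy(samples)
crossings = zero_crossings(samples)
score = voicing_score(samples, scale_factor=0.5)
```

The guard against a non-positive scale factor is an addition of this sketch: the stated range [0, 1] includes 0, for which a quotient would be undefined.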
Depending on actual conditions, the dynamic side information can also be set to other quantities that can serve as a basis for mixing decisions, for example voice activity detection (VAD).
After the audio code stream output by a terminal is sent to the MCU, it first enters selection unit 202. Selection unit 202 first identifies the audio mixing flag in the received audio code stream and, from the value of the flag, determines whether this audio code stream needs mixing. If mixing is not needed, selection unit 202 outputs the audio code stream to the corresponding destination terminal. From all M' (M' less than or equal to M) audio code streams that need mixing, selection unit 202 selects N (N less than or equal to M') audio code streams according to their dynamic side information. These audio code streams are sent to their respective decoders; after decoding they are sent to audio mixing unit 203 for mixing, which yields M' mixed audio streams. These M' streams are then each encoded by an encoder and sent to the corresponding terminals.
The terminal-side coding process of an embodiment of the invention is shown in Fig. 4 and comprises the following steps:
Step 401: An audio mixing flag is set for the collected sound information according to the local audio mixing strategy, and the sound information is then encoded; the result of the encoding serves as the core encoded data.
Step 402: If the audio mixing flag indicates that mixing is needed, dynamic side information is calculated; the frame energy and the voicing score can be calculated with formulas (1) and (2) above and used as the dynamic side information.
Step 403: The audio code stream is generated and output. Generating the audio code stream specifically comprises: if the audio mixing flag that has been set is valid, generating a coded audio data frame comprising the audio mixing flag, the core encoded data and the dynamic side information; if the audio mixing flag that has been set is invalid, generating a coded audio data frame comprising the audio mixing flag and the core encoded data. The audio mixing flag is placed at the front of the data frame and is preferably 1 bit long.
The MCU-side audio mixing coding process of an embodiment of the invention is shown in Fig. 5 and comprises the following steps:
Step 501: The MCU receives an audio code stream from a terminal and determines, from the value of the audio mixing flag in it, whether this audio code stream needs mixing. If so, step 503 is executed; otherwise, step 502 is executed.
Step 502: The audio code stream is sent directly to the corresponding destination terminal, and processing of this audio code stream ends.
Step 503: For the audio code streams received at the same moment from M' terminals whose audio mixing flags indicate that mixing is needed, the MCU selects N audio code streams according to the dynamic side information in these streams and discards the remaining M'-N audio code streams, where N is less than or equal to M'.
The selection can be based on the energy value in the side information: a stream is mixed if its energy is greater than a threshold T and is not mixed if its energy is less than T.
Step 504: The core encoded data of the selected N audio code streams are decoded separately, and the decoded data are mixed to obtain M' mixed audio streams.
Step 505: The M' mixed audio streams are encoded separately, and the resulting M' encoded mixed audio code streams are sent to the M' destination terminals respectively.
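Steps 501 to 503 can be sketched as a routing function on the MCU. The dictionary representation of a stream, the field names, and the rule of keeping the N highest-energy streams among those above the threshold T are assumptions for illustration; the source states only that selection uses the dynamic side information and gives the energy threshold as one possible criterion.

```python
def route_streams(streams, threshold, max_mixed):
    """Split received streams into pass-through streams and streams selected for mixing.

    Each stream is a dict with hypothetical keys:
      "id"     - terminal identifier
      "flag"   - audio mixing flag (assumed 1 = mixing needed, 0 = not needed)
      "energy" - frame energy from the dynamic side information (None when flag is 0)
    """
    # Step 502: streams that do not need mixing are forwarded unchanged.
    passthrough = [s for s in streams if not s["flag"]]
    # Step 503: among the M' flagged streams, keep those above the energy threshold T ...
    candidates = [s for s in streams if s["flag"] and s["energy"] > threshold]
    # ... and, as an assumed tie-breaking rule, keep at most N of them, loudest first.
    candidates.sort(key=lambda s: s["energy"], reverse=True)
    selected = candidates[:max_mixed]
    return passthrough, selected

streams = [
    {"id": "A", "flag": 1, "energy": 40},
    {"id": "B", "flag": 0, "energy": None},
    {"id": "C", "flag": 1, "energy": 12},
    {"id": "D", "flag": 1, "energy": 55},
    {"id": "E", "flag": 1, "energy": 30},
]
passthrough, selected = route_streams(streams, threshold=20, max_mixed=2)
passthrough_ids = [s["id"] for s in passthrough]
selected_ids = [s["id"] for s in selected]
```

Because the selection reads only the flag and the side information in the frame header, the MCU does not have to decode the streams it discards, which is consistent with the reduced MCU complexity claimed later in the text.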
Fig. 6 shows a multimedia conference terminal proposed by an embodiment of the invention, comprising:
Sound collection module 601, configured to collect sound information;
Audio mixing policy module 602, configured to set an audio mixing flag for the sound information collected by sound collection module 601 according to a preset audio mixing strategy;
Core encoding module 603, configured to encode the sound information and output core encoded data. If audio mixing policy module 602 sets the audio mixing flag to indicate that mixing is not needed, core encoding module 603 does not need to reserve bits for dynamic side information when encoding; if the flag is set to indicate that mixing is needed, core encoding module 603 must reserve bits for the dynamic side information when encoding. For example, if a coded data frame has n bits in total, the audio mixing flag has t bits and the dynamic side information has m bits, then the core encoded data produced by core encoding module 603 is n-t bits long when no bits are reserved for dynamic side information, and n-m-t bits long when bits are reserved for it.
Framing module 604, configured to calculate dynamic side information according to the audio mixing flag set by audio mixing policy module 602 and, depending on the value of the audio mixing flag, to generate either an audio data frame comprising the core encoded data, the audio mixing flag and the dynamic side information, or an audio data frame comprising the core encoded data and the audio mixing flag;
Output module 605, configured to output the audio data frames generated by framing module 604 as an audio code stream.
Fig. 7 shows a multipoint control unit proposed by an embodiment of the invention, comprising:
Selection unit 701, configured to receive audio code streams from M terminals; to determine, from the value of the audio mixing flag in each audio code stream, whether that stream needs mixing; and to select, from the M' audio code streams that need mixing, N audio code streams according to their dynamic side information;
Audio mixing unit 702, configured to mix the core encoded data of the N audio code streams selected by the selection unit and obtain M' mixed audio streams;
Sending unit 703, configured to send the audio streams from the audio mixing unit to the corresponding destination terminals.
Selection unit 701 sends the audio code streams that do not need mixing to sending unit 703, and sending unit 703 then sends the audio code streams received from the selection unit to the corresponding destination terminals.
The multipoint control unit further comprises: decoder 704, configured to decode the core encoded data of the audio code streams selected by selection unit 701 and send the decoded data to audio mixing unit 702;
Encoder 705, configured to encode the mixed audio streams from audio mixing unit 702 and send the encoded audio code streams to sending unit 703.
The scheme of the embodiments marks the coded stream with an audio mixing flag, adds the corresponding dynamic side information, and allocates the side-information bits dynamically according to the audio mixing flag. The MCU uses the audio mixing flag and the dynamic side information to select the audio code streams that need mixing and mixes them, which can solve the problems of signal overflow and of the error introduced when large signals are mixed, and reduces the computational complexity of the MCU. When mixing is not performed, the bit allocation of the code stream can be used in full, which improves the quality of the core encoding. The scheme can be used in mixing systems and can also use the codecs of common coding and decoding systems; it facilitates intelligent control of the coded stream and enhances the interactivity of the MCU.
The above are merely preferred embodiments of the invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (13)

1. An audio mixing coding method, characterized by comprising the following steps:
An audio mixing flag is set for sound information according to an audio mixing strategy, and the sound information is encoded according to the flag information; the result of the encoding serves as the core encoded data;
If the audio mixing flag information indicates that mixing is needed, dynamic side information is calculated, and an audio code stream comprising the audio mixing flag, the core encoded data and the dynamic side information is generated and output; if the audio mixing flag information indicates that mixing is not needed, an audio code stream comprising the audio mixing flag and the core encoded data is generated and output;
The network side receives the audio code streams from the terminals and determines, from the audio mixing flag information in each audio code stream, whether that stream needs mixing; from the M' audio code streams that need mixing, it selects N audio code streams according to their dynamic side information, mixes the core encoded data of the selected N audio code streams, and outputs the mixed audio code stream, where N is less than or equal to M'.
2. The method according to claim 1, characterized in that the dynamic side information comprises a frame energy, a voicing score and/or voice activity detection.
3. The method according to claim 2, characterized in that calculating the dynamic side information comprises: calculating the frame energy according to the formula $\mathrm{Frame\_Energy} = \frac{\sum_{i=0}^{\mathrm{Frame\_Length}-1} S^{2}(i)}{\mathrm{Frame\_Length}}$, where Frame_Energy is the frame energy, S(i) is the low-band signal after the quadrature mirror filter (QMF), and i is the index of the sample within the frame.
4. The method according to claim 2, characterized in that calculating the dynamic side information comprises: calculating the voicing score according to the formula $\mathrm{Voicing\_Score} = \frac{\mathrm{Zero\_Crossing\_Rate}}{\mathrm{Scale\_Factor}}$, where Zero_Crossing_Rate is the number of times the time-domain waveform of the sound information crosses zero within a predetermined time, and Scale_Factor is a preset scaling constant that takes a value in [0, 1].
5. The method according to claim 1, characterized in that, when the determination made from the audio mixing flag information is that the audio code stream does not need mixing, the method further comprises: outputting the audio code stream to the destination terminal.
6. The method according to any one of claims 1 to 5, characterized in that mixing the core encoded data of the selected N audio code streams and outputting the mixed audio code stream comprises: decoding the core encoded data in the selected N audio code streams separately, and mixing the N decoded signals to obtain M' mixed audio streams; encoding the M' mixed audio streams separately, and sending the resulting M' encoded mixed audio code streams to the M' destination terminals respectively.
7. A terminal-side coding method, characterized by comprising the following steps:
An audio mixing flag is set for sound information according to an audio mixing strategy, and the sound information is encoded according to the audio mixing flag information to obtain core encoded data;
If the audio mixing flag information indicates that mixing is needed, dynamic side information is calculated, and an audio code stream comprising the audio mixing flag, the core encoded data and the dynamic side information is generated and output; if the audio mixing flag information indicates that mixing is not needed, the terminal generates and outputs an audio code stream comprising the audio mixing flag and the core encoded data.
8. A network-side audio mixing coding method, characterized by comprising the following steps:
M audio code streams are received, and whether each audio code stream needs mixing is determined from the audio mixing flag information it carries; from the M' audio code streams that need mixing, N audio code streams are selected according to their dynamic side information; the core encoded data of the selected N audio code streams are mixed, and the mixed audio code stream is output, where M, M' and N are positive integers, N is less than or equal to M', and M' is less than or equal to M.
9. A multimedia conference system comprising M terminals and a multipoint control unit, characterized in that:
each terminal is configured to set a mixing flag for collected sound information according to a local mixing strategy, to encode the sound information according to the flag information, and to take the encoding result as core encoded data; and, depending on the mixing flag set according to the local mixing strategy, to generate and output either an audio code stream containing the core encoded data, a mixing flag indicating that mixing is needed, and dynamic side information, or an audio code stream containing the core encoded data and a mixing flag indicating that mixing is not needed;
the multipoint control unit is configured to receive the audio code streams from the terminals; to determine, according to the value of the mixing flag in each audio code stream, whether that audio code stream needs mixing; to select, from the M' audio code streams that need mixing, N audio code streams according to the dynamic side information therein; and to mix the core encoded data of the selected N audio code streams and output the mixed audio code streams, wherein M, M' and N are positive integers, N is less than or equal to M', and M' is less than or equal to M.
10. A multimedia conference terminal, characterized in that it comprises:
a sound collecting module, configured to collect sound information;
a mixing strategy module, configured to set a mixing flag for the sound information collected by the sound collecting module according to a preset mixing strategy;
a core encoding module, configured to encode the sound information and output core encoded data;
a framing module, configured to calculate dynamic side information according to the mixing flag set by the mixing strategy module, and, according to the value of the mixing flag, to generate either an encoded audio data frame containing the core encoded data, the mixing flag and the dynamic side information, or an encoded audio data frame containing the core encoded data and the mixing flag;
an output module, configured to output the encoded audio data frames generated by the framing module as an audio code stream.
11. A multipoint control unit, characterized in that it comprises:
a selecting unit, configured to receive audio code streams from M terminals; to determine, according to the value of the mixing flag in each audio code stream, whether that audio code stream needs mixing; and to select, from the M' audio code streams that need mixing, N audio code streams according to the dynamic side information therein;
a mixing unit, configured to mix the core encoded data of the N audio code streams selected by the selecting unit to obtain M' mixed audio code streams;
a sending unit, configured to send the audio code streams from the mixing unit to the corresponding destination terminals.
12. The multipoint control unit according to claim 11, characterized in that the selecting unit sends the audio code streams that do not need mixing to the sending unit, and the sending unit sends the audio code streams from the selecting unit to the corresponding destination terminals.
13. The multipoint control unit according to claim 11 or 12, characterized in that the multipoint control unit further comprises: a decoder, configured to decode the core encoded data of the audio code streams selected by the selecting unit and send the decoded data to the mixing unit;
an encoder, configured to encode the mixed audio streams from the mixing unit and send the encoded audio code streams to the sending unit.
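The data path through the multipoint control unit of claims 11 to 13 can be sketched in Python as follows. This is an illustration under stated assumptions, not the application's implementation: the function names, the 16-bit little-endian toy codec, and the terminal identifiers are hypothetical; the selection of N streams is assumed to have happened already; and the rule that each destination's mix leaves out that terminal's own stream is an assumption about how the M' per-destination mixes differ. Hard clipping is used only to keep the toy codec in range.

```python
import struct
from typing import Callable, Dict, List, Sequence, Tuple

Pcm = List[int]


def toy_encode(pcm: Pcm) -> bytes:
    """Hypothetical core encoder: pack samples as 16-bit little-endian integers."""
    return struct.pack(f"<{len(pcm)}h", *pcm)


def toy_decode(data: bytes) -> Pcm:
    """Hypothetical core decoder: inverse of toy_encode."""
    return list(struct.unpack(f"<{len(data) // 2}h", data))


def route(streams: Dict[str, Tuple[bool, bytes]]) -> Tuple[Dict[str, bytes], Dict[str, bytes]]:
    """Selecting unit: split streams by mixing flag (True means needs mixing)."""
    to_mix = {tid: data for tid, (flag, data) in streams.items() if flag}
    bypass = {tid: data for tid, (flag, data) in streams.items() if not flag}
    return to_mix, bypass  # bypass goes straight to the sending unit


def mix_for_destinations(
    selected: Dict[str, bytes],
    destinations: Sequence[str],
    decode: Callable[[bytes], Pcm],
    encode: Callable[[Pcm], bytes],
) -> Dict[str, bytes]:
    """Decode the selected streams, build one mix per destination, re-encode it."""
    decoded = {tid: decode(data) for tid, data in selected.items()}  # decoder
    out: Dict[str, bytes] = {}
    for dest in destinations:
        # Assumption: a terminal does not receive its own contribution.
        sources = [pcm for tid, pcm in decoded.items() if tid != dest]
        length = min((len(p) for p in sources), default=0)
        mixed = [max(-32768, min(32767, sum(p[i] for p in sources)))
                 for i in range(length)]  # mixing unit
        out[dest] = encode(mixed)  # encoder, output handed to the sending unit
    return out
```

Passing the codec in as `decode` and `encode` callables keeps the sketch independent of any particular core codec, which matches the claims: they name a decoder and an encoder but do not fix the coding scheme.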
CN201110205093A 2007-10-19 2007-10-19 Sound mixing encoding method, device and system Pending CN102324235A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110205093A CN102324235A (en) 2007-10-19 2007-10-19 Sound mixing encoding method, device and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110205093A CN102324235A (en) 2007-10-19 2007-10-19 Sound mixing encoding method, device and system

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN2007101813767A Division CN101414463B (en) 2007-10-19 2007-10-19 Method, apparatus and system for encoding mixed sound

Publications (1)

Publication Number Publication Date
CN102324235A true CN102324235A (en) 2012-01-18

Family

ID=45451969

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110205093A Pending CN102324235A (en) 2007-10-19 2007-10-19 Sound mixing encoding method, device and system

Country Status (1)

Country Link
CN (1) CN102324235A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110070878A (en) * 2019-03-26 2019-07-30 苏州科达科技股份有限公司 The coding/decoding method and electronic equipment of audio code stream
CN110995946A (en) * 2019-12-25 2020-04-10 苏州科达科技股份有限公司 Sound mixing method, device, equipment, system and readable storage medium
CN110995946B (en) * 2019-12-25 2021-08-20 苏州科达科技股份有限公司 Sound mixing method, device, equipment, system and readable storage medium
CN111741177A (en) * 2020-06-12 2020-10-02 浙江齐聚科技有限公司 Audio mixing method, device, equipment and medium for online conference

Similar Documents

Publication Publication Date Title
CN101414463B (en) Method, apparatus and system for encoding mixed sound
CN103050124B (en) Sound mixing method, Apparatus and system
CN101536086B (en) A method and an apparatus for decoding an audio signal
CN103988486B (en) Method for selecting active channels in audio mixing for multi-party teleconferencing
CN1110145C (en) Scalable audio coding/decoding method and apparatus
CN101268351B (en) Robust decoder
CN1144180C (en) Method and apparatus for performing reduced rate variable rate vocoding
CN101308658A (en) Audio decoder based on system on chip and decoding method thereof
CN101320563B (en) Background noise encoding/decoding device, method and communication equipment
CN101656072A (en) Mixer, mixing method and session system using the mixer
CN104934036B (en) Audio coding apparatus, method and audio decoding apparatus, method
RU2009114741A (en) ENCODING AND DECODING OF AUDIO OBJECTS
CN102097098B (en) Digital steganography and digital extraction methods with compressed audio as masking carrier
CN101414462A (en) Audio encoding method and multi-point audio signal mixing control method and corresponding equipment
CN101819781A (en) Communicator and communication means
CN102741831A (en) Scalable audio in a multi-point environment
CN101632117A (en) Method and apparatus for decoding an audio signal
CN101989430A (en) Audio mixing processing system and audio mixing processing method
CN101466043A (en) Method, equipment and system for processing multipath audio signal
CN102216983A (en) Apparatus and method for encoding at least one parameter associated with a signal source
Yang et al. High-fidelity multichannel audio coding with Karhunen-Loeve transform
CN102324235A (en) Sound mixing encoding method, device and system
CN103680509B (en) A kind of voice signal discontinuous transmission and ground unrest generation method
CN101502043B (en) Method for carrying out a voice conference, and voice conference system
CN102395097A (en) Method and system for down-mixing multi-channel audio signals

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20120118