WO2010034254A1 - Video and audio processing method, multipoint control unit and video conference system - Google Patents


Info

Publication number
WO2010034254A1
WO2010034254A1 (PCT application PCT/CN2009/074228; CN application CN2009074228W)
Authority: WO — WIPO (PCT)
Prior art keywords: stream, channel, video, module, channel video
Application number: PCT/CN2009/074228
Other languages: English (en), French (fr)
Inventors: 王向炯, 龙彦波
Original Assignee: 华为终端有限公司
Application filed by 华为终端有限公司
Priority to EP09815623A (published as EP2334068A4)
Publication of WO2010034254A1
Priority to US13/073,068 (published as US20110261151A1)

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/152 Multipoint control units therefor
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1827 Network arrangements for conference optimisation or adaptation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/23608 Remultiplexing multiplex streams, e.g. involving modifying time stamps or remapping the packet identifiers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2365 Multiplexing of several video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2389 Multiplex stream processing, e.g. multiplex stream encrypting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H04N21/25808 Management of client data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434 Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4347 Demultiplexing of several video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/438 Interfacing the downstream path of the transmission network originating from a server, e.g. retrieving encoded video stream packets from an IP network
    • H04N21/4385 Multiplex stream processing, e.g. multiplex stream decrypting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems

Definitions

  • The present invention relates to audio and video technology, and more particularly to a video processing method, an audio processing method, a video processing apparatus, an audio processing apparatus, a multipoint control unit, and a video conferencing system. Background Art
  • In a conventional video conference, each site can send only one video image, usually the conference-room scene captured by a camera, providing participants with an approximately face-to-face effect.
  • A dual-stream standard has since emerged that allows a site to send two video images: the main stream carries the conference-room scene captured by the camera, while the auxiliary stream can carry, for example, a presentation image from a laptop, which improves data sharing among participants.
  • There is also a telepresence system that transmits multiple camera images at the same time; the images can be spliced together into a complete conference-room scene with a wider viewing angle, providing participants with a highly immersive video communication experience.
  • Although the dual-stream or multi-stream conference mode brings great convenience and a better experience to users, all sites in the same conference are required to support dual-stream or multi-stream at the same time, and such conferences are not compatible with existing single-stream sites. If a user at a single-stream site needs to join a dual-stream or multi-stream conference, the single-stream device must be replaced with a dual-stream or multi-stream device, which is comparatively expensive. A solution is therefore needed that supports hybrid networking among single-stream, dual-stream and multi-stream sites, so as to minimize overall construction costs.
  • In the prior art, one solution to a hybrid conference of single-stream and dual-stream sites is to forward the primary video stream of the dual-stream site to the single-stream site and discard the secondary video stream of the dual-stream site. Because the secondary video stream is discarded when a dual-stream site and a single-stream site are interconnected, the single-stream site can see only the main-stream image sent by the dual-stream site, degrading the meeting effect.
  • The prior art provides no hybrid-networking solution between a telepresence site and a single-stream or dual-stream site, or between telepresence sites with different numbers of channels. Summary of the Invention
  • The present invention provides a video processing method, an audio processing method, a video processing apparatus, an audio processing apparatus, a multipoint control unit, and a video conferencing system, which solve the problem of hybrid networking among sites that support different numbers of audio and video channels.
  • An embodiment of the invention provides a video processing method, including:
  • An embodiment of the present invention provides an audio processing method, including: acquiring an audio stream from each conference terminal, where the conference terminals include at least one telepresence-site terminal and at least one terminal whose audio stream has a different number of channels from that of the telepresence site; and sending the mixed audio stream to each conference terminal.
  • An embodiment of the invention provides a video processing apparatus, including:
  • a video acquisition module configured to acquire the N-channel video stream sent by a first conference terminal, where each first conference terminal supports N video channels;
  • a determining module configured to determine a second conference terminal that interacts with the first conference terminal, where the second conference terminal supports an L-channel video stream with L different from N;
  • a processing module configured to carry the video information of the N channels in the L-channel video stream; and
  • a transmission module configured to transmit the L-channel video stream to the second conference terminal.
  • An embodiment of the present invention provides an audio processing apparatus, including:
  • an audio acquiring module configured to acquire an audio stream from each conference terminal, where the conference terminals include at least one telepresence-site terminal and at least one terminal with a different number of audio channels;
  • a mixing module configured to mix the audio streams of the conference terminals; and
  • a sending module configured to send the mixed audio stream to each conference terminal.
  • An embodiment of the present invention provides a multipoint control unit, including:
  • a first access module configured to access a first conference terminal and transmit its first media stream, where the first media stream includes an N-channel video stream and an N-channel audio stream; and
  • a media switching module configured to transmit all the information in the first media stream to a second conference terminal, and to transmit the information in a second media stream to the first conference terminal.
  • An embodiment of the present invention provides a video conference system, including:
  • at least two conference terminals, each supporting at least two media stream channels; and
  • a multipoint control unit configured to exchange all the information carried in the media streams of the at least two conference terminals.
  • In the embodiments of the present invention, the accessed audio and video streams are processed so that the number of output audio and video channels matches that of the receiving site, realizing interworking between sites with different channel counts.
  • Intercommunication among telepresence, single-stream and dual-stream sites allows sites with different numbers of channels to be mixed in one network, reducing the cost of the entire network.
  • FIG. 1 is a schematic structural diagram of a video conference system according to an embodiment of the present invention;
  • FIG. 2 is a schematic flowchart of a video processing method according to Embodiment 1 of the present invention;
  • FIG. 3 is a schematic structural diagram of a multipoint control unit according to Embodiment 2 of the present invention;
  • FIG. 4 is a schematic flowchart of Embodiment 1 of a video processing method according to Embodiment 2 of the present invention;
  • FIG. 5 is a schematic flowchart of Embodiment 2 of a video processing method according to Embodiment 2 of the present invention;
  • FIG. 6 is a schematic structural diagram of a multipoint control unit according to Embodiment 3 of the present invention;
  • FIG. 7 is a schematic structural diagram of a multipoint control unit according to Embodiment 4 of the present invention;
  • FIG. 8 is a schematic flowchart of an audio processing method according to Embodiment 4 of the present invention;
  • FIG. 9 is a schematic structural diagram of an embodiment of a video processing apparatus according to the present invention;
  • FIG. 10 is a schematic structural diagram of an embodiment of an audio processing apparatus according to the present invention. Detailed Description
  • FIG. 1 is a schematic structural diagram of a video conference system according to an embodiment of the present invention, including a first conference terminal, a second conference terminal that supports a different number of channels from the first conference terminal, and a Multipoint Control Unit (MCU) 13 that exchanges media streams between the first conference terminal and the second conference terminal.
  • There is at least one first conference terminal and at least one second conference terminal. In FIG. 1 the first conference terminals are a first telepresence site 111 and a second telepresence site 112, each with three channels, transmitting media streams A, B, C and D, E, F respectively.
  • The second conference terminals are a first single-stream site 121, a second single-stream site 122 and a third single-stream site 123, transmitting media streams G, H and I respectively. The MCU 13 is responsible for core exchange, mixing, split-screen processing and so on among the various sites (single-stream, dual-stream and multi-stream).
  • The telepresence sites 111 and 112 and the single-stream sites 121, 122 and 123 are connected to the MCU 13 through a transport network such as E1, IP or ISDN. Their media streams (including video streams and audio streams) converge at the MCU 13, which controls and exchanges the accessed media streams in a unified manner to implement media stream exchange between sites.
  • For example, the second telepresence site 112 can receive the media information (G, H, I) of the first single-stream site 121, the second single-stream site 122 and the third single-stream site 123, and the first single-stream site 121 can receive the media information (D, E, F) of the second telepresence site 112. Interaction between a telepresence site and a single-stream site is thus realized, solving the prior-art problem that sites with different channel counts cannot be joined in one conference.
  • Telepresence sites can also interact with each other, and single-stream sites with each other: the first telepresence site 111 can receive the media information of the second telepresence site 112, and the third single-stream site 123 can receive the media information of the first single-stream site 121 and the second single-stream site 122, so the system remains compatible with the prior art.
  • This embodiment may further include a service management station 14 for predefining various system parameters and transmitting the predefined parameters to the MCU 13, so that the MCU 13 performs unified control and management according to them.
  • The structure and implementation of a specific MCU can be seen in the following embodiments.
  • Because the MCU handles the interaction between sites with different media streams, a hybrid network of sites with different channel counts can be built, configured according to the conditions of each user in the network: telepresence, single-stream and dual-stream site terminals coexist, and high-performance devices need not be deployed across the entire network. The construction cost of the whole network can therefore be reduced, and waste of previously deployed equipment avoided.
  • Step 21: The MCU acquires an N-channel video stream sent by a first conference terminal; for example, the MCU receives three video streams from a telepresence site.
  • Step 22: The MCU determines a second conference terminal that interacts with the first conference terminal; the second conference terminal supports an L-channel video stream, where L differs from N. For example, the second conference terminal is a single-stream site supporting one video stream.
  • Step 23: The MCU carries the video information of the N channels in the L-channel video stream. For example, the first single-stream site 121 supports one video channel, which differs from the channel count of the second telepresence site 112 accessed by the MCU; the MCU therefore processes the three video streams so that the information they carry (D, E, F) is carried in a single video stream.
  • Compared with the prior art, in which a conference TV system requires every site to support the same number of channels and a dual-stream site transmits only its main video stream to a single-stream site with loss of information, here all the information in the first media stream is retained in the second media stream obtained after processing, avoiding information loss.
  • The three channels of video information of the telepresence site may be synthesized into one picture, or sent to the single-stream site in a time-sharing manner; for details, refer to the following embodiments.
  • Step 24: The MCU transmits the L-channel video stream to the second conference terminal; for example, a video stream carrying the information of three video streams is sent to a single-stream site.
  • In this embodiment, the video stream from each site is processed so that the number of channels of the accessed video stream and of the output video stream can differ, realizing convergence between sites with different channel counts; the output video stream retains all the information of the input video stream, avoiding loss of information.
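Steps 21 to 24 can be sketched in miniature with channel labels standing in for encoded video. The function name `repack_channels` and the '+' notation marking a composed picture are hypothetical, introduced only for this illustration.

```python
# Minimal sketch of steps 21-24: the MCU takes the N video channels of a
# sending site and repacks all of their content into the L channels the
# receiving site supports, here by grouping ceil(N/L) sub-pictures into
# each output channel so that no channel's information is discarded.

def repack_channels(n_streams: list[str], l: int) -> list[str]:
    """Carry all N channels' information in L output channels."""
    per_out = -(-len(n_streams) // l)          # ceil(N / L) sub-pictures each
    out = []
    for i in range(l):
        chunk = n_streams[i * per_out:(i + 1) * per_out]
        out.append("+".join(chunk))            # '+' marks a composed picture
    return out

# A three-channel telepresence site sent to a single-stream (L = 1) site:
print(repack_channels(["D", "E", "F"], 1))     # ['D+E+F']
```

Sending the same three channels to a dual-stream site (L = 2) would yield `['D+E', 'F']`, matching the "keep some channels, combine the rest" idea detailed later as Manner 2.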
  • FIG. 3 is a schematic structural diagram of a multipoint control unit according to Embodiment 2 of the present invention. This embodiment is an MCU for video, including a first access module 31, a second access module 32, a video synthesis module 33, and a media exchange module 34.
  • The first access module 31 is connected to the first conference terminal and configured to access its N-channel video stream, for example the three video streams of the telepresence site in FIG. 1; the second access module 32 is connected to the second conference terminal and configured to access its L-channel video stream with L different from N, for example the one video stream of the single-stream site in FIG. 1.
  • The video synthesis module 33 is connected to the first access module 31 and combines the N video streams into L video streams; for example, the three video streams of the telepresence site of FIG. 1 are combined into one video stream. The media switching module 34 is connected to the video synthesis module 33 and forwards the L-channel video stream synthesized from the N-channel video stream to the second conference terminal; for example, the three video streams of FIG. 1 are combined into one video stream and sent to the single-stream site.
  • The video synthesis module 33 can also directly forward the accessed, uncombined N-channel video stream to the media switching module 34 for transmission to a corresponding multi-channel site; for example, the multiple video streams of the second telepresence site 112 of FIG. 1 are forwarded directly to the first telepresence site 111 through the media switching module 34.
  • The video synthesis module may be configured to synthesize the N channels of video information of several sites into L channels of video information, for example synthesizing the N channels of each of L sites into one channel each; or to synthesize one site's N channels into L channels, for example keeping (L-1) of the N channels unchanged and combining the remaining N-(L-1) channels into one channel of video information.
  • This embodiment may further include a protocol conversion/rate adaptation module 35, located between the video synthesis module and the media exchange module and between the second access module and the media exchange module.
  • This embodiment may further include a conference control module connected to each module in the MCU and configured to manage and control the MCU internals, including the access modules, the video synthesis module, the protocol conversion/rate adaptation module and the media switching module, according to the parameters input by the service management station 14, and to implement user management of the conference.
  • The conference control module controls whether an access module sends the accessed video stream to the protocol conversion/rate adaptation module or directly to the video synthesis module; controls whether the video synthesis module synthesizes the video streams or forwards them directly; controls to which site the media switching module sends the processed video stream; and can also coordinate the uniform operation of these modules.
  • In this embodiment, the video synthesis module synthesizes multiple video streams down to the number of channels a site with fewer channels can support. The video information of multiple sites, or of a site with more channels, can therefore be transmitted to a site with fewer channels without upgrading the latter to support more channels, saving equipment costs.
  • FIG. 4 is a schematic flowchart of Embodiment 1 of a video processing method according to Embodiment 2 of the present invention. In this flow the telepresence site is the input side, the single-stream site and another telepresence site are the output side, and the first access module accesses the multi-channel video stream input by the telepresence site. This embodiment includes:
  • Step 41: A media channel is established between the telepresence site and the first access module in the MCU through a standard-protocol (H.323, SIP or H.320) call and capability negotiation process, and the first access module in the MCU acquires the multiple video streams of the telepresence site.
  • Step 42: The first access module sends the multiple video streams to the video synthesis module. The video synthesis module decodes the received video streams; the decoded original images are scaled and composed into a new image, which is then encoded into one video stream. The video synthesis module can send the synthesized video stream to the media switching module, or directly forward the accessed multiple video streams to the media switching module for exchange between telepresence sites; whether to synthesize or forward directly can be controlled by the conference control module.
  • Step 43: The video synthesis module sends the synthesized video stream to the media switching module, which forwards video streams between sites according to the instructions of the conference control module.
  • Step 44: The video synthesis module forwards the multiple video streams directly to the media switching module.
  • Step 45: The media switching module sends the synthesized video stream to the single-stream site.
  • Because the multi-channel video stream has been combined into one video stream, after forwarding by the media switching module the single-stream site receives the multi-channel information of the telepresence site; for example, the first single-stream site 121 can view a video image of the second telepresence site 112 containing the three channels of video information (D, E, F).
  • Step 46: The media switching module sends the multiple video streams to the other telepresence site; for example, the information of the second telepresence site 112 is transmitted to the first telepresence site 111.
  • The above takes a three-channel site as an example.
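The decode, scale, compose, encode pipeline described in step 42 can be illustrated with plain 2-D lists standing in for decoded frames. The function names, frame sizes and nearest-neighbour scaling are illustrative assumptions, and the final encoding step is omitted since only the composition logic matters here.

```python
# Illustrative sketch of step 42's image processing: decoded frames are
# scaled to equal-width slots and tiled side by side into one new image.
# Real MCUs operate on codec frames (e.g. H.264) and re-encode the result.

def scale(frame, new_w, new_h):
    """Nearest-neighbour rescale of a 2-D list 'frame' to new_w x new_h."""
    h, w = len(frame), len(frame[0])
    return [[frame[r * h // new_h][c * w // new_w] for c in range(new_w)]
            for r in range(new_h)]

def compose_side_by_side(frames, out_w, out_h):
    """Scale each input frame to an equal-width slot and tile horizontally."""
    slot_w = out_w // len(frames)
    scaled = [scale(f, slot_w, out_h) for f in frames]
    return [sum((s[r] for s in scaled), []) for r in range(out_h)]

# Three 2x2 single-colour "frames" composed into one 6x2 picture:
a = [[1, 1], [1, 1]]
b = [[2, 2], [2, 2]]
c = [[3, 3], [3, 3]]
combined = compose_side_by_side([a, b, c], out_w=6, out_h=2)
print(combined)   # [[1, 1, 2, 2, 3, 3], [1, 1, 2, 2, 3, 3]]
```

A grid layout (e.g. 2x2 for four inputs) would follow the same pattern, scaling each frame to a cell and concatenating rows as well as columns.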
  • The same code-stream synthesis principle can be used to implement arbitrary networking between an N-stream site and an L-stream site. Specific practices include the following two manners:
  • Manner 1: Combine the N channels of video information of several sites into L channels of video information, that is, synthesize a number of N-stream sites to obtain an L-channel video stream. Specifically, the N video streams of each N-stream site are combined into one video stream containing N pictures, and that video stream is sent to one video channel of the L-stream site; the remaining L-1 video channels of the L-stream site can receive the video stream information of other sites. For example, two three-stream sites are processed, the three video streams of each being combined into one, finally forming two video streams that can be sent to a dual-stream site. In this manner the L-stream site can receive the combined pictures of L sites.
  • Manner 2: Combine one site's N channels of video information into L channels of video information, that is, synthesize one N-stream site to obtain an L-channel video stream. Specifically, L-1 of the N video streams of the N-stream site are sent unchanged to L-1 video channels of the L-stream site, and the remaining N-(L-1) video streams are combined into one video stream containing N-(L-1) pictures, which is sent to the remaining video channel of the L-stream site. For example, one channel of a three-stream site is kept unchanged and the other two are combined into one, finally forming two video streams that can be sent to a dual-stream site. In this way the L-stream site sees pictures that are as large as possible.
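The two manners can be sketched with channel labels standing in for code streams; the function names and the '+' notation marking a combined picture are illustrative assumptions.

```python
# Sketch of the two synthesis manners described above.

def manner_1(sites: list[list[str]]) -> list[str]:
    """Combine each N-stream site's channels into one picture:
    L sites yield L output channels, each a full combined picture."""
    return ["+".join(site) for site in sites]

def manner_2(site: list[str], l: int) -> list[str]:
    """Keep L-1 channels of one N-stream site unchanged and combine the
    remaining N-(L-1) channels into the last output channel."""
    return site[:l - 1] + ["+".join(site[l - 1:])]

# Two three-stream sites down to a dual-stream site (Manner 1):
print(manner_1([["A", "B", "C"], ["D", "E", "F"]]))   # ['A+B+C', 'D+E+F']
# One three-stream site down to two channels (Manner 2):
print(manner_2(["D", "E", "F"], 2))                   # ['D', 'E+F']
```

The trade-off the text describes is visible in the output: Manner 1 shows more sites at reduced size, while Manner 2 keeps as many channels as possible at full size.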
  • In this embodiment, the problem of video intercommunication between a site with many channels and sites with fewer channels is solved by synthesizing the multiple video streams of the many-channel site.
  • FIG. 5 is a schematic flowchart of Embodiment 2 of a video processing method according to Embodiment 2 of the present invention. Here a single-stream site is the input side and a telepresence site is the output side. The method includes the following steps:
  • Steps 51-53: Each single-stream site sends its single video stream to the media switching module through the second access module; for example, the first single-stream site 121, the second single-stream site 122 and the third single-stream site 123 send their respective video streams (G, H and I) to the media switching module.
  • Step 54: The media switching module combines the single video streams of the multiple single-stream sites into multiple video streams; for example, the above three single video streams are combined into a three-channel video stream to be sent to a telepresence site.
  • Step 55: The media switching module forwards the multiple video streams to the telepresence site; for example, the three video streams (G, H, I) are sent to the second telepresence site 112.
  • The above method of combining three single streams into a three-channel stream can be generalized to implement hybrid networking between any L-stream sites and an N-stream site: arbitrarily select a total of N video code streams from a plurality of L-stream sites and send them to the N channels of the N-stream site. For example, the video streams of two dual-stream sites are combined into four video streams and output to a telepresence site with four channels.
  • In this embodiment, the problem of video intercommunication between sites with fewer channels and a site with more channels is solved by combining the video streams of several fewer-channel sites.
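The selection just described, pooling the channels of several L-stream sites and taking N of them for the N-stream site, might be sketched as follows; the function name and label representation are assumptions for illustration, and the streams are forwarded rather than re-encoded.

```python
# Hypothetical sketch: pick a total of N video code streams from several
# L-stream sites to feed the N channels of an N-stream site. Plain labels
# stand in for code streams since no transcoding is involved.

def gather_streams(sites: list[list[str]], n: int) -> list[str]:
    """Select a total of N code streams from the given sites' channels."""
    pool = [ch for site in sites for ch in site]   # all candidate channels
    return pool[:n]                                # any N of them would do

# Two dual-stream sites feeding a four-channel telepresence site:
print(gather_streams([["G1", "G2"], ["H1", "H2"]], 4))
# ['G1', 'G2', 'H1', 'H2']
```

In a real MCU the choice of which N streams to take would be driven by the conference control module rather than by simple list order.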
  • FIG. 6 is a schematic structural diagram of a multipoint control unit according to Embodiment 3 of the present invention. This embodiment is an MCU for video, including a first access module 61, a second access module 62, and a media switching module 63.
  • The first access module 61 is configured to access the N-channel video stream of the first conference terminal, for example the video stream of the telepresence site; the second access module 62 is configured to access the L-channel video stream (L different from N) of the second conference terminal, for example the video stream of the single-stream site.
Taking N greater than L as an example, with the first conference terminal on the input side and the second conference terminal on the output side, this embodiment differs from the MCU of Embodiment 2 in that it contains no video synthesis unit. Instead, the media switching module 63 selects L video streams from the N video streams in a time-division manner according to preset conditions or conditions of the video streams, obtaining several time-shared sets of L video streams, which are then transmitted to the second conference terminal in time division. For example, at the first instant the module may select the channel of the second telepresence site 112 in FIG. 1 that carries information D (whether a stream is the one to select may be determined, e.g., from its source address), at the second instant the channel carrying information E, and at the third instant the channel carrying information F, each transmitted to the first single-stream site 121 in FIG. 1, so that the first single-stream site 121 sees, over time, all the content of the second telepresence site 112.
The selection of L video streams from the N video streams at a given instant may specifically be:

Manner 1: According to preset control rules (for example, a user may set the information of the video streams he needs as the corresponding control rule), select L video streams from the N video streams.

Manner 2: According to preset priorities, select L streams from the N video streams in descending order of priority and transmit them to the L-stream site.

Manner 3: The MCU analyzes the audio streams corresponding to the N accessed video streams and, in descending order of audio volume, selects the video streams corresponding to the L loudest audio streams for transmission to the L-stream site.

Manner 4: The N-stream site carries in each video stream a flag indicating its priority, and the MCU selects L video streams in descending order of priority for transmission to the L-stream site.
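Manners 2-4 all reduce to ranking the N incoming streams by some key and keeping the top L. A hedged Python sketch, with an assumed `Stream` record holding a preset priority and the volume of the corresponding audio stream (both names are illustrative, not from the patent):

```python
# Illustrative sketch of Manners 2-4: choose L of the N incoming video
# streams by a ranking key (preset priority, corresponding audio volume,
# or a priority flag carried in the stream itself).

from collections import namedtuple

Stream = namedtuple("Stream", "name priority volume")

def select_l_streams(streams, l, key="priority"):
    """Pick the l highest-ranked streams (descending by `key`) from the inputs."""
    ranked = sorted(streams, key=lambda s: getattr(s, key), reverse=True)
    return [s.name for s in ranked[:l]]

n_streams = [Stream("D", 2, 0.3), Stream("E", 3, 0.9), Stream("F", 1, 0.95)]
print(select_l_streams(n_streams, 1, key="priority"))  # ['E']
print(select_l_streams(n_streams, 1, key="volume"))    # ['F']
```

Note how the same mechanism yields different selections depending on which ranking key the conference is configured to use.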
This embodiment may further include a protocol conversion/rate adaptation module 64 and a conference control module, which function as in Embodiment 2: the protocol conversion/rate adaptation module 64 converts and adapts between protocols and rates, and the conference control module controls each module.
FIG. 7 is a schematic structural diagram of a multipoint control unit according to Embodiment 4 of the present invention. This embodiment is a schematic structural diagram of an MCU for audio, including a first access module 71, a second access module 72, an audio stream selection/synthesis module 73, a media switching module 74, and a mixing module 75. The first access module 71 is configured to access N audio streams. The second access module 72 is configured to access L audio streams, L being different from N. The audio stream selection/synthesis module 73 is connected to whichever access module accesses a non-single-channel audio stream; for example, if N is not 1 and L is 1, the module is connected to the first access module, and if neither N nor L is 1, there are two audio stream selection/synthesis modules, connected to the first and second access modules respectively.
The audio stream selection/synthesis module selects from, or synthesizes, the multiple audio streams accessed by the first access module and/or the second access module: it may select the loudest audio stream according to each stream's volume, or synthesize at least two audio streams into one.
The mixing module 75 is configured to perform centralized mixing of the sites' audio streams; the inputs to the centralized mixing are the selected or synthesized single audio stream of a telepresence site and the direct single audio stream of a single-stream site. Mixing may specifically decode the audio streams of the sites, select the voices of several sites according to volume for digital synthesis, re-encode the synthesized voice data, and send the encoded streams to each site through the media switching module. The encoding may be performed separately according to the specific protocols or rates of the different sites, to meet their protocol or rate requirements.
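The centralized mixing described above (decode, pick the loudest few sites, digitally sum, re-encode) can be sketched roughly as follows; PCM codecs are stubbed out as plain sample lists, and the peak-amplitude volume measure and the choice of two contributing sites are illustrative assumptions, not details from the patent:

```python
# Rough sketch of centralized mixing: keep only the loudest K sites, sum
# their samples per position, and clip to a 16-bit range before "re-encoding".

def mix(site_pcm, loudest_k=2):
    """Mix the loudest K sites' PCM sample lists into one output list."""
    loudness = lambda pcm: max(abs(x) for x in pcm)     # assumed volume measure
    chosen = sorted(site_pcm.values(), key=loudness, reverse=True)[:loudest_k]
    length = max(len(p) for p in chosen)
    clip = lambda v: max(-32768, min(32767, v))          # simple 16-bit clipping
    # Digital synthesis: per-sample sum across the chosen sites.
    return [clip(sum(p[i] for p in chosen if i < len(p))) for i in range(length)]

pcm = {"site_a": [100, -200, 300], "site_b": [10, 20, 30], "site_c": [5, 5, 5]}
print(mix(pcm))  # [110, -180, 330]
```

In a real MCU the final list would then be encoded once per receiving site, matching each site's codec and rate as the text notes.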
The media switching module 74 exchanges the centrally mixed audio streams among the sites. This embodiment may further include a conference control module, connected to the modules above (the first access module, the second access module, the mixing module, and the media switching module) to control them. In this embodiment, the mixing module mixes the sites' audio streams so that each site can hear the sound of the other sites, implementing audio intercommunication between different sites.
FIG. 8 is a schematic flowchart of an audio processing method according to Embodiment 4 of the present invention, including:

Step 81: A telepresence site establishes a media channel with the first access module through call and capability negotiation.

Step 82: The first access module sends the telepresence site's multi-channel audio streams to the audio stream selection/synthesis module. The audio stream selection/synthesis module selects a stream designated by the conference control module, or automatically selects one stream according to volume, or synthesizes the multiple streams into one audio stream containing multiple channels of voice information. Whether to select one stream or combine them into one can be set as needed.
Step 83: The audio stream selection/synthesis module sends the selected or synthesized audio stream to the media switching module.

Step 84: The media switching module sends the synthesized audio stream to the mixing module.

Steps 85-86: The mixing module sends the mixed audio stream to the single-stream site through the media switching module and the second access module, and to the telepresence site through the media switching module and the first access module. The second access module and the first access module at the receiving end are not shown in the figure.
In this embodiment, the sites' audio streams are gathered into the mixing module for mixing and then distributed to each site through the media switching module, so that every site can hear the conference audio, realizing audio intercommunication among the sites. Moreover, because the mixing module encodes according to different audio protocols during mixing, audio intercommunication between sites using different audio protocols can be achieved.
The MCU includes a first access module, a second access module, and a media switching module. The first access module is configured to access a first conference terminal and transmit a first media stream with it, the first media stream including N video streams and N audio streams. The second access module is configured to access a second conference terminal and transmit a second media stream with it, the second media stream including L video streams and L audio streams, L being different from N. The media switching module is configured to transmit all information in the first media stream to the second conference terminal and all information in the second media stream to the first conference terminal.
In one form, the MCU includes the first access module, second access module, and media switching module described above, and further includes a video synthesis module, an audio stream selection/synthesis module, and a mixing module. The video synthesis module, connected to the first access module, synthesizes the N video streams into L video streams, forwarded to the second conference terminal through the media switching module; the media switching module is further configured to merge multiple such sets of L video streams into N video streams and forward them to the first conference terminal. The audio stream selection/synthesis module, connected to the first access module and/or the second access module, is configured, when N is greater than 1, to synthesize the N audio streams into one audio stream or select one of them according to volume, obtaining a first audio stream, and, when L is greater than 1, to synthesize the L audio streams into one audio stream or select one of them according to volume, obtaining a second audio stream. The mixing module mixes the first audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the first access module) with the second audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the second access module), and sends the mixed audio stream to the first and second conference terminals through the media switching module. The video synthesis module is specifically configured to synthesize several sets of N-channel video information into L channels of video information, for example synthesizing L sets of N-channel video information into L channels, each set becoming one channel of video information; or to synthesize one set of N-channel video information into L channels, for example keeping (L-1) channels of the set unchanged and synthesizing the remaining N-(L-1) channels into one channel of video information.
In another form, the MCU includes the first access module, second access module, and media switching module described above, and further includes an audio stream selection/synthesis module and a mixing module. The media switching module is further configured to select L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams, and to transmit them to the second conference terminal in time division. The audio stream selection/synthesis module, connected to the first access module and/or the second access module, is configured, when N is greater than 1, to synthesize the N audio streams into one stream or select one according to volume, obtaining a first audio stream, and, when L is greater than 1, to synthesize the L audio streams into one stream or select one according to volume, obtaining a second audio stream. The mixing module mixes the first audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the first access module) with the second audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the second access module), and sends the mixed stream to both conference terminals through the media switching module. The media switching module is configured to select, from the N video streams, the L streams specified by a preset control rule; or to select L streams according to preset priorities; or to select L streams according to the volume of the audio streams corresponding to the video streams; or to select L streams according to priorities carried in the video streams.
The MCU may further include a protocol conversion/rate adaptation module, connected to the first access module and the second access module and configured to perform protocol conversion or rate adaptation on the N video streams and the L video streams.
FIG. 9 is a schematic structural diagram of an embodiment of a video processing apparatus according to the present invention, including a video acquisition module 91, a determining module 92, a processing module 93, and a transmission module 94. The video acquisition module 91 is configured to acquire the N video streams sent by a first conference terminal. The determining module 92 is configured to determine the second conference terminal that interacts with the first conference terminal accessed by the video acquisition module 91, the second conference terminal supporting L video streams, L being different from N. The processing module 93 is configured to carry the N channels of video information carried in the N video streams acquired by the video acquisition module 91 in the L video streams supported by the second conference terminal determined by the determining module 92. The transmission module 94 is configured to transmit the L video streams obtained by the processing module 93 to the second conference terminal.
If N is greater than L, the processing module is specifically configured to synthesize the N channels of video information into L channels of video information and carry them respectively in the L video streams. If N is less than L, the processing module is specifically configured to merge multiple sets of the N channels of video information into L channels of information and carry them respectively in the L video streams. If N is greater than L, the processing module may instead be configured to select L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams; transmitting the L video streams to the second conference terminal then includes transmitting these sets of L video streams to the second conference terminal in time division.
This embodiment may further include a protocol conversion/rate adaptation module configured to perform protocol conversion and/or rate adaptation on the N video streams and the L video streams. By synthesizing, merging, or selecting the video streams, video intercommunication between conference terminals with different numbers of channels can be realized.
FIG. 10 is a schematic structural diagram of an embodiment of an audio processing apparatus according to the present invention, including an audio acquisition module 101, a mixing module 102, and a sending module 103. The audio acquisition module 101 is configured to acquire the audio stream of each conference terminal, the conference terminals including at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site. The mixing module 102 is configured to mix the audio streams acquired by the audio acquisition module 101; the sending module 103 is configured to send the audio stream mixed by the mixing module 102 to each conference terminal.
This embodiment may further include an audio synthesis/selection module, connected to the audio acquisition module and configured to synthesize each conference terminal's audio streams into one audio stream, or select one according to volume, and to send the synthesized or selected stream to the mixing module. Through the mixing process, audio intercommunication between sites with different numbers of channels is realized.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.


Description

Video and Audio Processing Method, Multipoint Control Unit, and Videoconference System
Technical Field

The present invention relates to audio and video technologies, and in particular to a video processing method, an audio processing method, a video processing apparatus, an audio processing apparatus, a multipoint control unit, and a videoconference system.

Background

In early videoconference systems, each site could send only one video image, generally the conference-room scene captured by a camera, giving participants a face-to-face-like experience. As videoconference technology advanced, a dual-stream standard appeared, allowing participants to send two video images: the main stream carries the camera-captured conference-room scene, while the auxiliary stream can carry slides from a laptop, improving data sharing among participants. Later still, telepresence systems appeared; such a system can transmit multiple camera images simultaneously, and the images can be stitched together into a complete conference-room scene with a wider viewing angle, giving participants a highly immersive video communication experience.

Although the dual-stream or multi-stream conference mode brings users great convenience and a better experience, it requires all sites of the same conference to support dual or multiple streams simultaneously and is incompatible with existing single-stream equipment. If users at a single-stream site need to join a dual-stream or multi-stream conference, they must replace their single-stream equipment with dual-stream or multi-stream equipment, which is relatively expensive. A solution is therefore needed to support hybrid networking among single-stream, dual-stream, and multi-stream sites, to minimize overall construction cost. One prior-art solution handles mixed conferences of single-stream and dual-stream sites by forwarding the main video stream of the dual-stream site to the single-stream site and discarding the auxiliary video stream. Because the auxiliary stream is discarded when such hybrid networking is implemented, the single-stream site sees only the main image sent by the dual-stream site and misses the auxiliary image, degrading the conference. Moreover, the prior art provides no hybrid networking scheme among telepresence sites, single-stream sites, dual-stream sites, and telepresence sites with different numbers of channels.

Summary
The present invention provides a video processing method, an audio processing method, a video processing apparatus, an audio processing apparatus, a multipoint control unit, and a videoconference system, to solve the problem of hybrid networking among sites supporting different numbers of audio and video channels.

An embodiment of the present invention provides a video processing method, including:

acquiring N video streams sent by a first conference terminal, each first conference terminal supporting N video streams;

determining a second conference terminal that interacts with the first conference terminal, the second conference terminal supporting L video streams, L being different from N;

carrying the N channels of video information carried in the N video streams in L video streams; and transmitting the L video streams to the second conference terminal.

An embodiment of the present invention provides an audio processing method, including:

acquiring the audio stream of each conference terminal, the conference terminals including at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site;

mixing the audio streams of the conference terminals;

sending the mixed audio stream to each conference terminal.
An embodiment of the present invention provides a video processing apparatus, including:

a video acquisition module, configured to acquire N video streams sent by a first conference terminal, each first conference terminal supporting N video streams;

a determining module, configured to determine a second conference terminal that interacts with the first conference terminal, the second conference terminal supporting L video streams, L being different from N;

a processing module, configured to carry the N channels of video information carried in the N video streams in L video streams;

a transmission module, configured to transmit the L video streams to the second conference terminal.

An embodiment of the present invention provides an audio processing apparatus, including:

an audio acquisition module, configured to acquire the audio stream of each conference terminal, the conference terminals including at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site;

a mixing module, configured to mix the audio streams of the conference terminals;

a sending module, configured to send the mixed audio stream to each conference terminal.
An embodiment of the present invention provides a multipoint control unit, including:

a first access module, configured to access a first conference terminal and transmit a first media stream with it, the first media stream including N video streams and N audio streams;

a second access module, configured to access a second conference terminal and transmit a second media stream with it, the second media stream including L video streams and L audio streams, L being different from N;

a media switching module, configured to transmit all information in the first media stream to the second conference terminal and all information in the second media stream to the first conference terminal.

An embodiment of the present invention provides a videoconference system, including:

at least two conference terminals, supporting at least two different numbers of media stream channels; and

a multipoint control unit, configured to exchange all information carried in the media streams of the at least two conference terminals.

As can be seen from the above technical solutions, embodiments of the present invention process the accessed audio and video streams so that the number of processed channels matches that of the receiving site, realizing intercommunication among sites with different numbers of channels, that is, interworking and convergence among telepresence, single-stream, and dual-stream sites. Such sites with different channel counts can then be networked together, reducing overall network construction cost.

Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic structural diagram of a videoconference system according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a video processing method according to Embodiment 1 of the present invention;

FIG. 3 is a schematic structural diagram of a multipoint control unit according to Embodiment 2 of the present invention;

FIG. 4 is a schematic flowchart of a first implementation of the video processing method according to Embodiment 2 of the present invention;

FIG. 5 is a schematic flowchart of a second implementation of the video processing method according to Embodiment 2 of the present invention;

FIG. 6 is a schematic structural diagram of a multipoint control unit according to Embodiment 3 of the present invention;

FIG. 7 is a schematic structural diagram of a multipoint control unit according to Embodiment 4 of the present invention;

FIG. 8 is a schematic flowchart of an audio processing method according to Embodiment 4 of the present invention;

FIG. 9 is a schematic structural diagram of an embodiment of a video processing apparatus according to the present invention;

FIG. 10 is a schematic structural diagram of an embodiment of an audio processing apparatus according to the present invention.

Detailed Description
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
FIG. 1 is a schematic structural diagram of a videoconference system according to an embodiment of the present invention, including a first conference terminal, a second conference terminal with a different number of channels from the first, and a Multipoint Control Unit (MCU) 13 for media stream interaction between the first and second conference terminals. There is at least one first conference terminal and at least one second conference terminal. In FIG. 1, the first conference terminals are the three-channel first telepresence site 111 and second telepresence site 112, transmitting the three media streams A, B, C and D, E, F respectively; the second conference terminals are the first single-stream site 121, the second single-stream site 122, and the third single-stream site 123, transmitting the single media streams G, H, and I respectively. The MCU 13 is responsible for core switching, mixing, screen-splitting, and other processing among the sites (including single-stream, dual-stream, and multi-stream sites). The first telepresence site 111, the second telepresence site 112, the first single-stream site 121, the second single-stream site 122, and the third single-stream site 123 access the MCU 13 through a transmission network such as E1, IP, or ISDN; the media streams (including video and audio streams) converge at the MCU 13, which uniformly controls and switches the media streams accessed from the sites to implement media stream exchange among them. Referring to FIG. 1, the second telepresence site 112 can receive the media information (G, H, I) of the first single-stream site 121, the second single-stream site 122, and the third single-stream site 123, and the first single-stream site 121 can receive the media information (D, E, F) of the second telepresence site 112. Interaction between a telepresence site and single-stream sites is thus realized, solving the prior-art problem that sites with different numbers of channels cannot interwork. Meanwhile, as in the prior art, telepresence sites can still interact with each other, as can single-stream sites: for example, the first telepresence site 111 can receive the media information of the second telepresence site 112, and the second single-stream site 122 and the third single-stream site 123 can receive the media information of the first single-stream site 121 and the second single-stream site 122 respectively, so compatibility with the prior art is maintained.
This embodiment may further include a service management console 14, which predefines various system parameters and transmits them to the MCU 13, so that the MCU 13 performs unified control and management according to the predefined parameters. For the specific structure and implementation of the MCU, see the embodiments below.
In this embodiment, the MCU carries out interaction between sites with different numbers of media stream channels, enabling hybrid networking among them. Telepresence, single-stream, and dual-stream site terminals can be configured according to the circumstances of each user in the network, without deploying high-performance equipment across the entire network, which reduces overall network construction cost and avoids wasting previously deployed equipment.
FIG. 2 is a schematic flowchart of a video processing method according to Embodiment 1 of the present invention, including:

Step 21: The MCU acquires the N video streams sent by a first conference terminal. For example, the MCU receives the three video streams of a telepresence site.

Step 22: The MCU determines a second conference terminal that interacts with the first conference terminal, the second conference terminal supporting L video streams, L being different from N. For example, the second conference terminal is a single-stream site supporting one video stream.

Step 23: The MCU carries the N channels of video information carried in the N video streams in L video streams. For example, referring to FIG. 1, the first single-stream site 121 supports one video stream, which differs from the three video streams of the second telepresence site 112 accessed by the MCU; the MCU therefore processes the three video streams so that their information is carried in one video stream (e.g., the one stream contains the information D, E, F). Compared with existing conference television systems, which require every site to support the same number of channels, and with the information loss caused when an existing dual-stream site forwards only its main video stream to a single-stream site, the second media stream obtained after this processing retains all the information of the original first media stream, avoiding information loss. Besides synthesizing the three channels of video information of the telepresence site, the three channels may also be sent to the single-stream site as one video stream in a time-division manner; see the embodiments below.
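One way to picture Step 23's channel adaptation, for N > L, is as packing N input channels into L output streams so that no channel is discarded (unlike the prior-art scheme that drops the auxiliary stream). A toy Python sketch under that assumption, modeling a composite picture simply as a tuple of the source channels it contains (all names are illustrative, not from the patent):

```python
# Toy sketch of Step 23 for N > L: pass L-1 channels through unchanged and
# combine the remaining N-(L-1) channels into one composite "picture",
# so that every input channel survives in the output.

def pack_channels(n_info, l):
    """Distribute n input channels over l output streams, combining the rest."""
    keep = n_info[:l - 1]              # L-1 channels pass through unchanged
    composite = tuple(n_info[l - 1:])  # remaining N-(L-1) channels combined
    return [*keep, composite]

print(pack_channels(["D", "E", "F"], 1))  # [('D', 'E', 'F')]
print(pack_channels(["D", "E", "F"], 2))  # ['D', ('E', 'F')]
```

For L = 1 this reduces to the example in the text: the three channels D, E, F of the telepresence site all travel in the single stream sent to the single-stream site.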
Step 24: The MCU transmits the L video streams to the second conference terminal. For example, the one video stream carrying the three channels of video information is sent to the single-stream site.

By processing the video streams from the sites, this embodiment allows the number of accessed video streams to differ from the number of output video streams, realizing interworking among sites with different channel counts; moreover, the output video streams retain all the information of the input video streams, avoiding information loss.
FIG. 3 is a schematic structural diagram of a multipoint control unit according to Embodiment 2 of the present invention. This embodiment is a schematic structural diagram of an MCU for video, including a first access module 31, a second access module 32, a video synthesis module 33, and a media switching module 34. The first access module 31 is connected to a first conference terminal and accesses its N video streams, for example the three video streams of a telepresence site in FIG. 1. The second access module 32 is connected to a second conference terminal and accesses its L video streams, L being different from N, for example the one video stream of a single-stream site in FIG. 1. The video synthesis module 33, connected to the first access module 31, synthesizes the N video streams into L video streams, for example synthesizing the three video streams of the telepresence site in FIG. 1 into one. The media switching module 34, connected to the video synthesis module 33, forwards the L video streams obtained from the N video streams to the second conference terminal, for example sending the one stream synthesized from the three streams of FIG. 1 to the single-stream site. The video synthesis module 33 can also forward unsynthesized N-channel video streams directly to the media switching module 34, to be transmitted to sites that support multiple channels; for example, the multi-channel video streams of the second telepresence site 112 in FIG. 1 are forwarded through the media switching module 34 directly to the first telepresence site 111.

The video synthesis module is specifically configured to synthesize several sets of N-channel video information into L channels of video information, for example synthesizing L sets of N-channel video information into L channels, each set becoming one channel of video information; or to synthesize one set of N-channel video information into L channels, for example keeping (L-1) channels of the set unchanged and synthesizing the remaining N-(L-1) channels into one channel of video information. This embodiment may further include a protocol conversion/rate adaptation module 35, placed between the video synthesis module and the media switching module and between the second access module and the media switching module, for converting and adapting between different protocols and rates, that is, converting the source video format into the destination video format, or the source video bandwidth into the destination video bandwidth. If the protocols or rates of the sites need no conversion or adaptation, this module is bypassed. Further, this embodiment may include a conference control module connected to the modules inside the MCU and configured to manage and control them according to the parameters input by the service management console 14: it directs the access modules to send accessed video streams to the protocol conversion/rate adaptation module or directly to the video synthesis module; directs the video synthesis module to synthesize streams or forward them directly; directs the media switching module to deliver the processed streams to particular sites; and coordinates the joint operation of these modules, realizing user management of the conference.

By synthesizing multiple video streams down to a number of channels that a site with fewer channels can support, this embodiment transmits the video information of a site with more channels to a site with fewer channels, without upgrading the latter to support more channels, saving equipment cost.
FIG. 4 is a schematic flowchart of a first implementation of the video processing method according to Embodiment 2 of the present invention. This implementation takes as an example a telepresence site on the input side and a single-stream site and a telepresence site on the output side; the first access module accesses the multi-channel video streams input by the telepresence site. The implementation includes:

Step 41: Through the call and capability negotiation procedures of a standard protocol (H.323, SIP, or H.320), the telepresence site establishes a media channel with the first access module in the MCU, and the first access module acquires the telepresence site's multi-channel video streams.

Step 42: The first access module sends the multi-channel video streams to the video synthesis module. The video synthesis module decodes the images of the received streams, scales and combines the decoded original images into a new image, and encodes the new image. Learning from the conference control module that the result is to be transmitted to a single-stream site, the video synthesis module encodes the image into one video stream and sends that stream to the media switching module. Because telepresence sites exchange multi-channel video streams among themselves, the video synthesis module can also forward the accessed multi-channel streams directly to the media switching module for exchange between telepresence sites; whether to synthesize or forward directly can be controlled by the conference control module.

Step 43: The video synthesis module sends the synthesized video stream to the media switching module, which forwards video streams among the sites according to the conference control module's instructions.

Step 44: The video synthesis module forwards the multi-channel video streams directly to the media switching module.

Step 45: The media switching module sends the synthesized video stream to the single-stream site. Through the processing of the video synthesis module, the multi-channel streams are synthesized into one stream, and after forwarding by the media switching module the single-stream site can view the multi-channel information of the telepresence site. Referring to FIG. 1, after channel synthesis the first single-stream site 121 can view a video image of the second telepresence site 112 containing the three channels of video information (D, E, F).

Step 46: The media switching module sends the multi-channel video streams to a telepresence site. Referring to FIG. 1, the information of the second telepresence site 112 is transmitted to the first telepresence site 111.
This implementation takes converting multiple channels into one as an example. Applying this stream-synthesis principle, hybrid networking between any N-stream site and any L-stream site can be implemented, assuming N > L. Two specific manners are possible:

Manner 1: Synthesize several sets of N-channel video information into L channels of video information, that is, synthesize several N-stream sites separately to obtain L video streams. Specifically, the N video streams of an N-stream site may be combined into one video stream containing N pictures, and that stream sent into one video channel of the L-stream site; the remaining L-1 video channels of the L-stream site can receive video stream information from other sites. For example, two three-stream sites are processed so that each site's three video streams are synthesized into one, finally forming two video streams that can be sent to a dual-stream site. In this manner, the L-stream site receives combined pictures of L sites.

Manner 2: Synthesize one set of N-channel video information into L channels of video information, that is, synthesize a single N-stream site to obtain L video streams. Specifically, L-1 of the N-stream site's video streams may be sent into L-1 video channels of the L-stream site, while the remaining N-(L-1) streams are combined into one stream containing N-(L-1) pictures, which is sent into the L-stream site's remaining video channel. For example, one stream of a three-stream site is kept unchanged and the other two are synthesized into one, finally forming two video streams that can be sent to a dual-stream site. In this manner, the L-stream site sees as many full-size pictures as possible.
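Manner 1 above can be sketched as one composite stream per source site: each N-stream site's streams are composed into a single multi-picture stream, and L such sites fill the L channels of the receiver. In this illustrative Python model (function and stream names are assumptions), compositing is reduced to joining picture labels:

```python
# Hedged sketch of Manner 1: L N-stream sites each contribute one composite
# multi-picture stream, together filling the L channels of the receiving site.

def compose_site(streams):
    """Combine one site's N streams into a single multi-picture stream."""
    return "+".join(streams)

def manner_one(sites, l):
    """Build an L-channel output from L N-stream sites (one composite each)."""
    if len(sites) != l:
        raise ValueError("Manner 1 expects exactly L source sites")
    return [compose_site(s) for s in sites]

# Two three-stream sites feeding a dual-stream (L=2) site:
print(manner_one([["A", "B", "C"], ["D", "E", "F"]], 2))  # ['A+B+C', 'D+E+F']
```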
By synthesizing the video streams of a site with more channels, this implementation solves the video intercommunication problem of a site with fewer channels viewing a site with more channels.
FIG. 5 is a schematic flowchart of a second implementation of the video processing method according to Embodiment 2 of the present invention, taking as an example a single-stream site on the input side and a telepresence site on the output side, including:

Steps 51-53: Each single-stream site sends its single video stream to the media switching module through the second access module. For example, referring to FIG. 1, the first single-stream site 121, the second single-stream site 122, and the third single-stream site 123 send their respective video streams (G, H, and I) to the media switching module.

Step 54: The media switching module merges the single video streams of the multiple single-stream sites into a multi-channel video stream, for example merging the three single streams above into a three-channel video stream, and sends the merged multi-channel stream to a telepresence site.

Step 55: The media switching module forwards the multi-channel video stream to another telepresence site. For example, referring to FIG. 1, the three-channel video stream (G, H, I) is sent to the second telepresence site 112.

This implementation takes converting one channel into three as an example. Applying this site-combination principle, hybrid networking between any L-stream site and any N-stream site can be implemented, assuming N > L. Specifically, a total of N video streams may be arbitrarily selected from several L-stream sites and sent to the N-stream site. For example, the video streams of two dual-stream sites are merged into four video streams and output to a four-channel telepresence site.
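The scheme above can be sketched as pooling the streams of several L-stream sites and taking N of them for the N-channel telepresence site; taking them in order is just one arbitrary choice, since the text allows any selection. A minimal Python sketch (function and stream names are assumptions):

```python
# Illustrative sketch: pick a total of N video streams from several L-stream
# sites to feed an N-channel telepresence site.

def pick_n_streams(l_stream_sites, n):
    """Flatten the sites' streams and take the first n (an arbitrary choice)."""
    pool = [s for site in l_stream_sites for s in site]
    if len(pool) < n:
        raise ValueError("not enough source streams")
    return pool[:n]

# Two dual-stream sites feeding a four-channel telepresence site:
print(pick_n_streams([["G1", "G2"], ["H1", "H2"]], 4))  # ['G1', 'G2', 'H1', 'H2']
```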
By combining the video streams of several sites with fewer channels, this implementation solves the video intercommunication problem of a site with more channels viewing sites with fewer channels.
FIG. 6 is a schematic structural diagram of a multipoint control unit according to Embodiment 3 of the present invention. This embodiment is a schematic structural diagram of an MCU for video, including a first access module 61, a second access module 62, and a media switching module 63. The first access module 61 is configured to access the N video streams of a first conference terminal, for example the video streams of a telepresence site. The second access module 62 is configured to access the L video streams of a second conference terminal, L being different from N, for example the video stream of a single-stream site.

Taking N greater than L as an example, with the first conference terminal on the input side and the second on the output side, this embodiment differs from the MCU of Embodiment 2 in that it contains no video synthesis unit. Instead, the media switching module 63 selects L video streams from the N video streams in a time-division manner according to preset conditions or conditions of the video streams, obtaining several time-shared sets of L video streams, and then transmits them to the second conference terminal in time division. For example, at the first instant it may select the channel of the second telepresence site 112 in FIG. 1 that carries information D (whether a stream is the one to select may be determined, e.g., from its source address), at the second instant the channel carrying information E, and at the third instant the channel carrying information F, each transmitted to the first single-stream site 121 in FIG. 1, so that the first single-stream site 121 sees, over time, all the content of the second telepresence site 112. The selection of L video streams from the N video streams at a given instant may specifically be:

Manner 1: According to preset control rules (for example, a user may set the information of the video streams he needs as the corresponding control rule), select L video streams from the N video streams.

Manner 2: According to preset priorities, select L streams from the N video streams in descending order of priority and transmit them to the L-stream site.

Manner 3: The MCU analyzes the audio streams corresponding to the N accessed video streams and, in descending order of audio volume, selects the video streams corresponding to the L loudest audio streams for transmission to the L-stream site.

Manner 4: The N-stream site carries in each video stream a flag indicating its priority, and the MCU selects L video streams in descending order of priority for transmission to the L-stream site.

This embodiment may further include a protocol conversion/rate adaptation module 64 and a conference control module, which function as in Embodiment 2: the protocol conversion/rate adaptation module 64 converts and adapts between protocols and rates, and the conference control module controls each module.
FIG. 7 is a schematic structural diagram of a multipoint control unit according to Embodiment 4 of the present invention. This embodiment is a schematic structural diagram of an MCU for audio, including a first access module 71, a second access module 72, an audio stream selection/synthesis module 73, a media switching module 74, and a mixing module 75. The first access module 71 is configured to access N audio streams. The second access module 72 is configured to access L audio streams, L being different from N. The audio stream selection/synthesis module 73 is connected to whichever access module accesses a non-single-channel audio stream; for example, if N is not 1 and L is 1, the module is connected to the first access module, and if neither N nor L is 1, there are two audio stream selection/synthesis modules, connected to the first and second access modules respectively. The audio stream selection/synthesis module selects from, or synthesizes, the multiple audio streams accessed by the first access module and/or the second access module: it may select the loudest stream according to each stream's volume, or synthesize at least two streams into one. The mixing module 75 performs centralized mixing of the sites' audio streams; the inputs to the centralized mixing are the selected or synthesized single audio stream of a telepresence site and the direct single audio stream of a single-stream site. Mixing may specifically decode the audio streams of the sites, select the voices of several sites according to volume for digital synthesis, re-encode the synthesized voice data, and send the encoded streams to each site through the media switching module. The encoding may be performed separately according to the specific protocols or rates of the different sites, to meet their protocol or rate requirements. The media switching module 74 exchanges the centrally mixed audio streams among the sites.

This embodiment may further include a conference control module connected to the modules above (the first access module, the second access module, the mixing module, and the media switching module) to control them.

In this embodiment, the mixing module mixes the sites' audio streams so that each site can hear the sound of the other sites, implementing audio intercommunication between different sites.
FIG. 8 is a schematic flowchart of an audio processing method according to Embodiment 4 of the present invention, including:

Step 81: A telepresence site establishes a media channel with the first access module through call and capability negotiation.

Step 82: The first access module sends the telepresence site's multi-channel audio streams to the audio stream selection/synthesis module. The audio stream selection/synthesis module selects a stream designated by the conference control module, or automatically selects one stream according to volume, or synthesizes the multiple streams into one audio stream containing multiple channels of voice information. Whether to select one stream or combine them into one can be set as needed.

Step 83: The audio stream selection/synthesis module sends the selected or synthesized audio stream to the media switching module.

Step 84: The media switching module sends the synthesized audio stream to the mixing module.

Steps 85-86: The mixing module sends the mixed audio stream to the single-stream site through the media switching module and the second access module, and to the telepresence site through the media switching module and the first access module. The second access module and the first access module at the receiving end are not shown in the figure.

In this embodiment, the sites' audio streams are gathered into the mixing module for mixing and then distributed to each site through the media switching module, so that every site can hear the conference audio, realizing audio intercommunication among the sites. Moreover, because the mixing module encodes according to different audio protocols during mixing, audio intercommunication between sites using different audio protocols can be achieved.
The above embodiments describe the MCU separately for video and audio: FIGS. 3 and 6 concern video, and FIG. 7 concerns audio. Since the MCU must process both video and audio, it may combine FIG. 3 with FIG. 7, or FIG. 6 with FIG. 7. That is, the MCU includes a first access module, a second access module, and a media switching module: the first access module accesses a first conference terminal and transmits with it a first media stream including N video streams and N audio streams; the second access module accesses a second conference terminal and transmits with it a second media stream including L video streams and L audio streams, L being different from N; and the media switching module transmits all information in the first media stream to the second conference terminal and all information in the second media stream to the first conference terminal.

More specifically, taking N greater than L as an example, the MCU includes the first access module, second access module, and media switching module described above, and further includes a video synthesis module, an audio stream selection/synthesis module, and a mixing module. The video synthesis module, connected to the first access module, synthesizes the N video streams into L video streams, forwarded to the second conference terminal through the media switching module; the media switching module is further configured to merge multiple such sets of L video streams into N video streams and forward them to the first conference terminal. The audio stream selection/synthesis module, connected to the first access module and/or the second access module, is configured, when N is greater than 1, to synthesize the N audio streams into one stream or select one of them according to volume, obtaining a first audio stream, and, when L is greater than 1, to synthesize the L audio streams into one stream or select one of them according to volume, obtaining a second audio stream. The mixing module mixes the first audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the first access module) with the second audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the second access module), and sends the mixed stream to the first and second conference terminals through the media switching module. The video synthesis module is specifically configured to synthesize several sets of N-channel video information into L channels of video information, for example synthesizing L sets of N-channel video information into L channels, each set becoming one channel of video information; or to synthesize one set of N-channel video information into L channels, for example keeping (L-1) channels of the set unchanged and synthesizing the remaining N-(L-1) channels into one channel of video information.

Alternatively, the MCU includes the first access module, second access module, and media switching module described above, and further includes an audio stream selection/synthesis module and a mixing module. The media switching module is further configured to select L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams, and to transmit them to the second conference terminal in time division. The audio stream selection/synthesis module, connected to the first access module and/or the second access module, is configured, when N is greater than 1, to synthesize the N audio streams into one stream or select one according to volume, obtaining a first audio stream, and, when L is greater than 1, to synthesize the L audio streams into one stream or select one according to volume, obtaining a second audio stream. The mixing module mixes the first audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the first access module) with the second audio stream obtained by the audio stream selection/synthesis module (or the single audio stream accessed by the second access module), and sends the mixed stream to both conference terminals through the media switching module. The media switching module is configured to select, from the N video streams, the L streams specified by a preset control rule; or to select L streams according to preset priorities; or to select L streams according to the volume of the audio streams corresponding to the video streams; or to select L streams according to priorities carried in the video streams.

The MCU may further include a protocol conversion/rate adaptation module, connected to the first access module and the second access module and configured to perform protocol conversion or rate adaptation on the N video streams and the L video streams.
FIG. 9 is a schematic structural diagram of an embodiment of a video processing apparatus according to the present invention, including a video acquisition module 91, a determining module 92, a processing module 93, and a transmission module 94. The video acquisition module 91 acquires the N video streams sent by a first conference terminal; the determining module 92 determines the second conference terminal that interacts with the first conference terminal accessed by the video acquisition module 91, the second conference terminal supporting L video streams, L being different from N; the processing module 93 carries the N channels of video information carried in the N video streams acquired by the video acquisition module 91 in the L video streams supported by the second conference terminal determined by the determining module 92; the transmission module 94 transmits the L video streams obtained by the processing module 93 to the second conference terminal.

If N is greater than L, the processing module is specifically configured to synthesize the N channels of video information into L channels of video information and carry them respectively in the L video streams.

If N is less than L, the processing module is specifically configured to merge multiple sets of the N channels of video information into L channels of information and carry them respectively in the L video streams.

If N is greater than L, the processing module may also be specifically configured to select L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams; the transmission module then transmits these sets of L video streams to the second conference terminal in time division.

This embodiment may further include a protocol conversion/rate adaptation module configured to perform protocol conversion and/or rate adaptation on the N video streams and the L video streams.

By synthesizing, merging, or selecting the video streams, this embodiment realizes video intercommunication between conference terminals with different numbers of channels.
FIG. 10 is a schematic structural diagram of an embodiment of an audio processing apparatus according to the present invention, including an audio acquisition module 101, a mixing module 102, and a sending module 103. The audio acquisition module 101 acquires the audio stream of each conference terminal, the conference terminals including at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site; the mixing module 102 mixes the audio streams acquired by the audio acquisition module 101; the sending module 103 sends the audio stream mixed by the mixing module 102 to each conference terminal.

This embodiment may further include an audio synthesis/selection module, connected to the audio acquisition module and configured to synthesize each conference terminal's audio streams into one stream, or select one stream according to volume, and to send the synthesized or selected stream to the mixing module.

Through the mixing process, this embodiment realizes audio intercommunication between sites with different numbers of channels.
Those of ordinary skill in the art will understand that all or part of the steps of the above method embodiments may be implemented by hardware related to program instructions; the program may be stored in a computer-readable storage medium and, when executed, performs the steps of the method embodiments. The storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of the present invention may still be modified or equivalently replaced, without such modifications or replacements departing from the spirit and scope of the technical solutions of the present invention.

Claims

1. A video processing method, comprising:
acquiring N video streams sent by a first conference terminal, each first conference terminal supporting N video streams;
determining a second conference terminal that interacts with the first conference terminal, the second conference terminal supporting L video streams, L being different from N;
carrying the N channels of video information carried in the N video streams in L video streams; and
transmitting the L video streams to the second conference terminal.

2. The video processing method according to claim 1, wherein the carrying the N channels of video information carried in the N video streams in L video streams comprises:
if N is greater than L, synthesizing the N channels of video information into L channels of video information, and carrying the L channels of video information respectively in the L video streams;
or,
if N is less than L, merging multiple sets of the N channels of video information into L channels of video information, and carrying the L channels of video information respectively in the L video streams;
or,
if N is greater than L, selecting L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams;
wherein the transmitting the L video streams to the second conference terminal comprises: transmitting the several sets of L video streams to the second conference terminal in time division.

3. The video processing method according to claim 2, wherein the synthesizing the N channels of video information into L channels of video information comprises:
when the N channels of video information comprise two or more sets of N-channel video information, synthesizing the two or more sets of N-channel video information into L channels of video information; or
when the N channels of video information comprise one set of N-channel video information, synthesizing the one set of N-channel video information into L channels of video information.

4. The video processing method according to claim 3, wherein:
the synthesizing the two or more sets of N-channel video information into L channels of video information comprises: synthesizing L sets of N-channel video information into L channels of video information, each set of N-channel video information being synthesized into one channel of video information;
or,
the synthesizing one set of N-channel video information into L channels of video information comprises: keeping (L-1) channels of the N-channel video information unchanged, and synthesizing the remaining N-(L-1) channels of video information into one channel of video information.

5. The video processing method according to claim 2, wherein the selecting L video streams from the N video streams comprises:
selecting, according to a preset control rule, the L video streams specified by the preset control rule from the N video streams; or
selecting the L video streams from the N video streams according to preset priorities; or
selecting the L video streams according to the audio streams corresponding to the video streams, in order of audio stream volume; or
selecting the L video streams according to priorities carried in the video streams.

6. The video processing method according to claim 1, further comprising: performing protocol conversion and/or rate adaptation on the N video streams and the L video streams.

7. An audio processing method, comprising:
acquiring the audio stream of each conference terminal, the conference terminals comprising at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site;
mixing the audio streams of the conference terminals; and
sending the mixed audio stream to each conference terminal.

8. The audio processing method according to claim 7, wherein the mixing the audio streams of the conference terminals comprises: synthesizing the audio streams of each non-single-stream conference terminal into one audio stream, or selecting one audio stream from the audio streams of each non-single-stream conference terminal according to volume, and then performing the mixing.
9. A video processing apparatus, comprising:
a video acquisition module, configured to acquire N video streams sent by a first conference terminal, each first conference terminal supporting N video streams;
a determining module, configured to determine a second conference terminal that interacts with the first conference terminal, the second conference terminal supporting L video streams, L being different from N;
a processing module, configured to carry the N channels of video information carried in the N video streams in L video streams; and
a transmission module, configured to transmit the L video streams to the second conference terminal.

10. The video processing apparatus according to claim 9, wherein:
if N is greater than L, the processing module is specifically configured to synthesize the N channels of video information into L channels of video information, and carry the L channels of video information respectively in the L video streams;
or,
if N is less than L, the processing module is specifically configured to merge multiple sets of the N channels of video information into L channels of information, and carry the L channels of video information respectively in the L video streams;
or,
if N is greater than L, the processing module is specifically configured to select L video streams from the N video streams in a time-division manner, obtaining several time-shared sets of L video streams;
wherein the transmission module being specifically configured to transmit the L video streams to the second conference terminal comprises: transmitting the several sets of L video streams to the second conference terminal in time division.

11. The video processing apparatus according to claim 10, wherein the processing module is further specifically configured to synthesize several sets of N-channel video information into L channels of video information; or the processing module is further specifically configured to synthesize one set of N-channel video information into L channels of video information.

12. The video processing apparatus according to claim 11, wherein the processing module is further specifically configured to synthesize L sets of N-channel video information into L channels of video information, each set of N-channel video information being synthesized into one channel of video information; or the processing module is further specifically configured to keep (L-1) channels of one set of N-channel video information unchanged, and synthesize the remaining N-(L-1) channels of video information into one channel of video information.

13. The video processing apparatus according to claim 10, wherein:
the processing module is configured to select, according to a preset control rule, the L video streams specified by the preset control rule from the N video streams; or
the processing module is configured to select the L video streams from the N video streams according to preset priorities; or
the processing module is configured to select the L video streams according to the audio streams corresponding to the video streams, in order of audio stream volume; or
the processing module is configured to select the L video streams according to priorities carried in the video streams.

14. The video processing apparatus according to claim 9, further comprising:
a protocol conversion/rate adaptation module, configured to perform protocol conversion and/or rate adaptation on the N video streams and the L video streams.

15. An audio processing apparatus, comprising:
an audio acquisition module, configured to acquire the audio stream of each conference terminal, the conference terminals comprising at least a terminal of a telepresence site and a terminal whose audio stream has a number of channels different from that of the telepresence site;
a mixing module, configured to mix the audio streams of the conference terminals; and
a sending module, configured to send the mixed audio stream to each conference terminal.

16. The audio processing apparatus according to claim 15, further comprising:
an audio synthesis/selection module, connected to the audio acquisition module and configured to synthesize the audio streams of each conference terminal into one audio stream, or select one audio stream according to volume, and to send the synthesized or selected audio stream to the mixing module.
17. A multipoint control unit, comprising:
a first access module, configured to access a first conference terminal and transmit a first media stream with the first conference terminal, the first media stream comprising N channels of video streams and N channels of audio streams;
a second access module, configured to access a second conference terminal and transmit a second media stream with the second conference terminal, the second media stream comprising L channels of video streams and L channels of audio streams, where L differs from N; and
a media switching module, configured to transmit all information in the first media stream to the second conference terminal, and to transmit all information in the second media stream to the first conference terminal.
18. The multipoint control unit according to claim 17, wherein, if N is greater than L, the multipoint control unit further comprises:
a video synthesis module, connected to the first access module and configured to synthesize the N channels of video streams into L channels of video streams; wherein
the media switching module is configured to forward the synthesized L channels of video streams to the second conference terminal; and the media switching module is further configured to combine multiple sets of the L channels of video streams into N channels of video streams and forward them to the first conference terminal.
19. The multipoint control unit according to claim 18, wherein the video synthesis module is configured to synthesize several sets of N channels of video information into L channels of video information; or the video synthesis module is configured to synthesize one set of N channels of video information into L channels of video information.
20. The multipoint control unit according to claim 19, wherein the video synthesis module is further configured to synthesize L sets of N channels of video information into L channels of video information, each set of N channels of video information being synthesized into one channel of video information; or the video synthesis module is further configured to keep (L-1) channels of video information in one set of N channels of video information unchanged, and to synthesize the remaining N-(L-1) channels of video information into one channel of video information.
21. The multipoint control unit according to claim 17, wherein, if N is greater than L, the media switching module is further configured to select L channels of video streams from the N channels of video streams in a time-division manner, obtaining several time-division groups of L channels of video streams, and to transmit the several groups of L channels of video streams to the second conference terminal in a time-division manner.
22. The multipoint control unit according to claim 21, wherein:
the media switching module is configured to select, from the N channels of video streams, the L channels of video streams specified by a preset control rule; or
the media switching module is configured to select L channels of video streams from the N channels of video streams according to a preset priority; or
the media switching module is configured to select L channels of video streams according to the volume of the audio streams corresponding to the respective video streams; or
the media switching module is configured to select L channels of video streams according to priorities carried in the respective video streams.
23. The multipoint control unit according to claim 17, wherein, if N is greater than L, the multipoint control unit further comprises:
an audio stream selection/synthesis module, connected to the first access module and/or the second access module and configured to: when N is greater than 1, synthesize the N channels of audio streams into one audio stream, or select one audio stream from the N channels of audio streams according to volume, obtaining a single first audio stream; and, when L is greater than 1, synthesize the L channels of audio streams into one audio stream, or select one audio stream from the L channels of audio streams according to volume, obtaining a single second audio stream; and
a mixing module, configured to mix the single first audio stream obtained by the audio stream selection/synthesis module, or the single audio stream accessed by the first access module, with the single second audio stream obtained by the audio stream selection/synthesis module, or the single audio stream accessed by the second access module, and to send the mixed audio stream to the first conference terminal and the second conference terminal through the media switching module;
or,
an audio stream selection/synthesis module, connected to the first access module and the second access module and configured to synthesize the N channels of audio streams into one audio stream, or select one audio stream from the N channels of audio streams according to volume, obtaining a single first audio stream, and to synthesize the L channels of audio streams into one audio stream, or select one audio stream from the L channels of audio streams according to volume, obtaining a single second audio stream; and
a mixing module, configured to mix the first audio stream and the second audio stream, and to send the mixed audio stream to the first conference terminal and the second conference terminal through the media switching module.
24. The multipoint control unit according to any one of claims 17 to 23, further comprising: a protocol conversion/rate adaptation module, connected to the first access module and the second access module and configured to perform protocol conversion or rate adaptation on the N channels of video streams and the L channels of video streams.
25. A videoconference system, comprising:
at least two conference terminals, the conference terminals supporting at least two different numbers of media stream channels; and
a multipoint control unit, configured to switch all information carried in the media streams of the at least two conference terminals.
26. The videoconference system according to claim 25, wherein the multipoint control unit is the multipoint control unit according to any one of claims 17 to 24.
PCT/CN2009/074228 2008-09-28 2009-09-25 Video and audio processing method, multipoint control unit and videoconference system WO2010034254A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP09815623A EP2334068A4 (en) 2008-09-28 2009-09-25 VIDEO AND AUDIO PROCESSING METHOD, MULTI-POINT CONTROL UNIT AND VIDEO CONFERENCE SYSTEM
US13/073,068 US20110261151A1 (en) 2008-09-28 2011-03-28 Video and audio processing method, multipoint control unit and videoconference system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200810223810.8 2008-09-28
CN2008102238108A CN101370114B (zh) 2008-09-28 2008-09-28 Video and audio processing method, multipoint control unit and videoconference system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US13/073,068 Continuation US20110261151A1 (en) 2008-09-28 2011-03-28 Video and audio processing method, multipoint control unit and videoconference system

Publications (1)

Publication Number Publication Date
WO2010034254A1 (zh)

Family

ID=40413705

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2009/074228 WO2010034254A1 (zh) 2008-09-28 2009-09-25 Video and audio processing method, multipoint control unit and videoconference system

Country Status (4)

Country Link
US (1) US20110261151A1 (zh)
EP (1) EP2334068A4 (zh)
CN (1) CN101370114B (zh)
WO (1) WO2010034254A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102883131A (zh) * 2011-07-15 2013-01-16 ZTE Corporation Signaling interaction method and apparatus based on a telepresence system
EP2569673A1 (de) * 2010-05-11 2013-03-20 Stephan Overkott Holographic live presentation system and method for live transmission of a holographic presentation
US8830294B2 (en) 2009-05-27 2014-09-09 Huawei Device Co., Ltd. Method and system for video conference control, videoconferencing network equipment, and videoconferencing site

Families Citing this family (38)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101370114B (zh) 2008-09-28 2011-02-02 Huawei Device Co., Ltd. Video and audio processing method, multipoint control unit and videoconference system
CN101510990A (zh) * 2009-02-27 2009-08-19 Shenzhen Huawei Communication Technologies Co., Ltd. Method and system for processing user signals in a telepresence conference
NO332394B1 (no) * 2009-04-29 2012-09-10 Cisco Systems Int Sarl Method and device for setting up simultaneous incoming circuit-switched calls
US8520821B2 (en) * 2009-07-24 2013-08-27 Citrix Systems, Inc. Systems and methods for switching between computer and presenter audio transmission during conference call
CN102143346B (zh) * 2010-01-29 2013-02-13 广州市启天科技股份有限公司 Cruise-shooting storage method and system
CN101820524A (zh) * 2010-03-22 2010-09-01 ZTE Corporation Video playing method for videoconferencing
CN101931783A (zh) * 2010-09-21 2010-12-29 天地阳光通信科技(北京)有限公司 Dual-stream sending system and method for videoconferencing
CN102256099A (zh) * 2011-06-20 2011-11-23 ZTE Corporation Parameter control method and apparatus
CN102868880B (zh) * 2011-07-08 2017-09-05 ZTE Corporation Telepresence-based media transmission method and system
TWI451746B (zh) * 2011-11-04 2014-09-01 Quanta Comp Inc Videoconference system and videoconference method
CN103248882A (zh) 2012-02-02 2013-08-14 Tencent Technology (Shenzhen) Co., Ltd. Multimedia data transmission method, apparatus and system
CN103634562B (zh) * 2012-08-24 2017-08-29 China Telecom Corp., Ltd. Data transmission method and system for videoconferencing
CN103634697B (zh) * 2012-08-24 2017-09-26 ZTE Corporation Telepresence implementation method and telepresence device
CN102843542B (zh) * 2012-09-07 2015-12-02 Huawei Technologies Co., Ltd. Media negotiation method, device and system for a multi-stream conference
CN103051864B (zh) * 2012-12-26 2016-08-17 浙江元亨通信技术股份有限公司 Mobile videoconference method
CN103905776B (zh) * 2012-12-26 2018-01-16 Huawei Technologies Co., Ltd. Bitstream processing method and system, and multipoint control unit
CN104349117B (zh) * 2013-08-09 2019-01-25 Huawei Technologies Co., Ltd. Multi-content media communication method, apparatus and system
US10091461B2 (en) 2013-10-15 2018-10-02 Polycom, Inc. System and method for real-time adaptation of a conferencing system to current conditions of a conference session
CN103841462B (zh) * 2013-12-03 2018-01-26 深圳市九洲电器有限公司 Method and apparatus for playing programs on multiple screens with a digital set-top box
CN105227895B (zh) * 2014-06-30 2020-12-18 Polycom, Inc. Method for video layout and processing in an MCU stack
CN104469261B (zh) * 2014-12-26 2017-12-05 北京网视通联科技有限公司 CDN-based videoconference system and method
CN105141884A (zh) * 2015-08-26 2015-12-09 苏州科达科技股份有限公司 Method, apparatus and system for controlling broadcast audio/video streams in a hybrid conference
US9706171B1 (en) * 2016-03-15 2017-07-11 Microsoft Technology Licensing, Llc Polyptych view including three or more designated video streams
CN106791583A (zh) * 2017-01-23 2017-05-31 北京思特奇信息技术股份有限公司 Videoconference system and implementation method
CN108810443A (zh) * 2017-04-28 2018-11-13 南宁富桂精密工业有限公司 Video picture synthesis method and multipoint control unit
CN107241598B (zh) * 2017-06-29 2020-03-24 贵州电网有限责任公司 GPU decoding method for multi-channel H.264 videoconferencing
KR101861561B1 (ko) * 2017-07-24 2018-05-29 (주)유프리즘 Videoconference server capable of providing a multi-screen videoconference by using a plurality of videoconference terminals, and method therefor
CN107396032A (zh) * 2017-07-26 2017-11-24 安徽四创电子股份有限公司 x86-based multipoint control unit and operating method thereof
CN108881794B (zh) * 2017-12-08 2019-11-19 视联动力信息技术股份有限公司 Network conference communication method and apparatus based on a video networking terminal
CN108040218A (zh) * 2017-12-20 2018-05-15 苏州科达科技股份有限公司 Communication method and device for videoconferencing
CN110418099B (zh) * 2018-08-30 2021-08-31 Tencent Technology (Shenzhen) Co., Ltd. Audio and video processing method, apparatus and storage medium
CN111355917A (zh) * 2018-12-20 2020-06-30 ZTE Corporation Signaling server, media server, videoconference system and method
CN109688365B (zh) * 2018-12-27 2021-02-19 北京真视通科技股份有限公司 Videoconference processing method and computer-readable storage medium
CN109660751A (zh) * 2018-12-28 2019-04-19 ZTE Corporation Videoconference implementation method and apparatus, videoconference system, and storage medium
JP2020202531A (ja) * 2019-06-13 2020-12-17 Panasonic Intellectual Property Management Co., Ltd. Conference system, videoconference apparatus and video processing method
CN111182258B (zh) * 2020-02-11 2022-12-23 视联动力信息技术股份有限公司 Data transmission method and apparatus for network conferences
CN111787269B (zh) * 2020-07-20 2021-10-26 南京百家云科技有限公司 Multimedia information generation method and apparatus, electronic device and storage medium
JP2022089616A (ja) 2020-12-04 2022-06-16 Panasonic Intellectual Property Management Co., Ltd. Conference system, videoconference apparatus and video processing method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1450807A (zh) * 2002-04-09 2003-10-22 Huawei Technologies Co., Ltd. Dual-video transmission system for videoconference terminals
CN101068345A (zh) * 2007-05-24 2007-11-07 杭州华三通信技术有限公司 Video surveillance method and system, and network transmission device
CN101098244A (zh) * 2006-06-26 2008-01-02 Huawei Technologies Co., Ltd. Method and system for media processing in a multipoint conference
CN101370114A (zh) * 2008-09-28 2009-02-18 Shenzhen Huawei Communication Technologies Co., Ltd. Video and audio processing method, multipoint control unit and videoconference system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5446491A (en) * 1993-12-21 1995-08-29 Hitachi, Ltd. Multi-point video conference system wherein each terminal comprises a shared frame memory to store information from other terminals
US8081205B2 (en) * 2003-10-08 2011-12-20 Cisco Technology, Inc. Dynamically switched and static multiple video streams for a multimedia conference
US9065667B2 (en) * 2006-09-05 2015-06-23 Codian Limited Viewing data as part of a video conference
US8208004B2 (en) * 2007-05-08 2012-06-26 Radvision Ltd. Device, methods, and media for providing multi-point video conferencing unit functions


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU RUIFANG ET AL: "The design and implement of MCU in H.323 video conferencing, part 1,3", JOURNAL OF BEIJING UNIVERSITY OF POSTS AND TELECOMMUNICATIONS, vol. 26, no. 5, July 2003 (2003-07-01), XP008145580 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8830294B2 (en) 2009-05-27 2014-09-09 Huawei Device Co., Ltd. Method and system for video conference control, videoconferencing network equipment, and videoconferencing site
EP2569673A1 (de) * 2010-05-11 2013-03-20 Stephan Overkott Holographic live presentation system and method for live transmission of a holographic presentation
CN102883131A (zh) * 2011-07-15 2013-01-16 ZTE Corporation Signaling interaction method and apparatus based on a telepresence system
CN102883131B (zh) * 2011-07-15 2017-02-08 ZTE Corporation Signaling interaction method and apparatus based on a telepresence system

Also Published As

Publication number Publication date
CN101370114A (zh) 2009-02-18
EP2334068A1 (en) 2011-06-15
US20110261151A1 (en) 2011-10-27
EP2334068A4 (en) 2011-11-30
CN101370114B (zh) 2011-02-02

Similar Documents

Publication Publication Date Title
WO2010034254A1 (zh) Video and audio processing method, multipoint control unit and videoconference system
US8988486B2 (en) Adaptive video communication channel
KR100880150B1 (ko) Multipoint videoconference system and media processing method therefor
US9172912B2 (en) Telepresence method, terminal and system
US7257641B1 (en) Multipoint processing unit
EP1683356B1 (en) Distributed real-time media composer
JP5320406B2 (ja) Audio processing method, system, and control server
CN101483758B (zh) Converged system of a video surveillance system and a videoconference system
US8600530B2 (en) Method for determining an audio data spatial encoding mode
CN102868880B (zh) Telepresence-based media transmission method and system
WO2007082433A1 (fr) Apparatus, network device and method for transmitting audio and video signals
CN101198008A (zh) Method and system for implementing multiple screens and multiple pictures
WO2011140812A1 (zh) Multi-picture synthesis method and system, and media processing apparatus
WO2011057511A1 (zh) Method, apparatus and system for implementing audio mixing
WO2011015136A1 (zh) Method, apparatus and system for conference control
WO2014044059A1 (zh) Method, device and system for recording and playing back a videoconference
WO2012175025A1 (zh) Telepresence conference system, and method for recording and replaying a telepresence conference
WO2015003532A1 (zh) Method, apparatus and system for establishing a multimedia conference
US9013537B2 (en) Method, device, and network systems for controlling multiple auxiliary streams
CN101675626B (zh) Converting data from a first network format to a non-network format and from the non-network format to a second network format
JP2003223407A (ja) Content sharing support system, user terminal device, content sharing support server, method for sharing content among multiple users, program, and program recording medium
WO2014026478A1 (zh) Videoconference signal processing method, videoconference server and system
TWI531244B (zh) Method and system for processing videoconference data
CN115734028A (zh) Media stream pushing method and system based on cascade coding
JP2017092802A (ja) Conference call system and back-end system used therefor

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09815623

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 2009815623

Country of ref document: EP