WO2023071356A1 - Video conference processing method, processing device, conference system, and storage medium - Google Patents


Info

Publication number
WO2023071356A1
Authority
WO
WIPO (PCT)
Prior art keywords
target video
video
stream data
recording
participating
Prior art date
Application number
PCT/CN2022/109317
Other languages
English (en)
French (fr)
Inventor
朱玉荣
张轶君
宋向阳
张志广
Original Assignee
朱玉荣
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 朱玉荣
Publication of WO2023071356A1


Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/14: Systems for two-way working
    • H04N 7/15: Conference systems
    • H04N 7/152: Multipoint control units therefor
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L 65/1066: Session management
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 65/00: Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L 65/40: Support for services or applications
    • H04L 65/403: Arrangements for multi-party communication, e.g. for conferences
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/134: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N 19/146: Data rate or code amount at the encoder output
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00: Television systems
    • H04N 7/08: Systems for the simultaneous or sequential transmission of more than one television signal, e.g. additional information signals, the signals occupying wholly or partially the same frequency band, e.g. by time division
    • H04N 7/0806: Systems for the simultaneous or sequential transmission of more than one television signal, the signals being two or more video signals

Definitions

  • This application mainly relates to the field of video conferencing applications, and more specifically to a video conference processing method, processing device, conference system, and storage medium.
  • A multipoint control unit (MCU) may not be configured in the current environment, resulting in the inability to conduct video conferences.
  • the present application proposes a video conference processing method, the method comprising:
  • the participating terminals refer to terminals that establish a media session connection with the recording and broadcasting terminal;
  • the target video encoding parameter is smaller than the corresponding video encoding parameter of the participating terminal
  • the video stream data of the plurality of participating terminals are screen-combined, and the obtained target video stream data is sent to each of the participating terminals.
  • the sending the target video encoding parameters to the participating terminals includes:
  • acquiring the target video resolution configured according to the multi-point control performance of the recording and broadcasting terminal, wherein the target video resolution is smaller than the video resolution configured by the participating terminals;
  • sending, via the Session Initiation Protocol, a coding adjustment request carrying the target video resolution to the participating terminal, so that the participating terminal responds to the coding adjustment request and adjusts its default video resolution to the target video resolution, obtaining video stream data with the target video resolution.
  • the acquiring of the target video resolution configured according to the multi-point control performance of the recording and broadcasting terminal includes:
  • determining the target video resolution according to the layout format of the video interface and the video resolution configured by the participating terminals.
  • the performing screen-combining processing on the video stream data, and sending the obtained target video stream data to the participating terminal for playing includes:
  • the implementation method of establishing a media session between the participating terminal and the recording and broadcasting terminal includes:
  • the conference access request is generated according to the Session Initiation Protocol.
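  • As an illustrative sketch of how the target video resolution might be determined from the layout format of the video interface, one simple policy is to divide the composite output frame into a near-square grid and give each participating terminal the tile size as its target. The function name, the grid policy, and the 1920*1080 output default are assumptions for illustration, not taken from the application.

```python
# Illustrative sketch: derive a per-participant target resolution from the
# layout of the composite video interface. Names and the near-square grid
# policy are hypothetical, not taken from the application.
import math

def target_resolution(num_participants, output_width=1920, output_height=1080):
    """Split the composite frame into a near-square grid and return the
    tile size, which each participant would use as its target resolution."""
    cols = math.ceil(math.sqrt(num_participants))
    rows = math.ceil(num_participants / cols)
    return output_width // cols, output_height // rows

print(target_resolution(4))  # 2x2 grid -> (960, 540)
```

  • Because every participant then encodes exactly at tile size, the later screen-combining step needs no scaling.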
  • the present application also proposes a video conference processing method, the method comprising:
  • the target video coding parameter is smaller than the corresponding video coding parameter of the participating terminal;
  • the video stream data is screen-combined to obtain the target video stream data to be output;
  • the present application also proposes a video conference processing device, the device comprising:
  • a participating terminal number acquisition module, configured to obtain the number of participating terminals of the target video conference, wherein the participating terminals refer to terminals that establish a media session connection with the recording and broadcasting terminal;
  • a target video encoding parameter sending module, configured to, upon detecting that the number of participating terminals reaches the participation threshold, send target video encoding parameters to the participating terminals, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminals;
  • a video stream data receiving module, configured to receive the video stream data with the target video encoding parameters sent by the participating terminals;
  • a video stream screen-combining module, configured to perform screen-combining processing on the video stream data of the plurality of participating terminals and send the obtained target video stream data to the participating terminals.
  • the present application also proposes a video conference processing device, the device comprising:
  • a media session establishing module, configured to establish a media session connection with the recording and broadcasting terminal for the target video conference;
  • a target video encoding parameter receiving module, configured to receive the target video encoding parameters sent by the recording and broadcasting terminal, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminal;
  • a video encoding parameter adjustment module, configured to adjust the video encoding parameters of the participating terminal to the target video encoding parameters;
  • a video stream data sending module, configured to obtain video stream data having the target video encoding parameters and send it to the recording and broadcasting terminal, so that the recording and broadcasting terminal performs screen-combining processing on the video stream data sent by the multiple participating terminals of the target video conference, obtaining the target video stream data to be output;
  • a video stream data playing module, configured to receive the target video stream data sent by the recording and broadcasting terminal, decode it, and play the decoded video stream data.
  • the present application also proposes a video conferencing system, the system includes a recording and broadcasting terminal and a plurality of participating terminals, wherein:
  • the recording and broadcasting terminal includes a first communication interface, a first memory and a first processor, wherein:
  • the first memory is used to store a first program for implementing the above-mentioned video conference processing method executed on the recording and broadcasting terminal side;
  • the first processor is configured to load and execute the first program stored in the first memory, so as to realize the video conference processing method executed on the recording terminal side;
  • the participating terminal includes a display, an audio player, an audio collector, an image collector, a second communication interface, a second memory and a second processor, wherein:
  • the second memory is used to store a second program that implements the above video conference processing method executed on the participant terminal side;
  • the second processor is configured to load and execute the second program stored in the second memory, so as to realize the video conference processing method executed on the participant terminal side.
  • the present application also proposes a computer-readable storage medium, on which a computer program is stored, and the computer program is loaded and executed by a processor to implement the above-mentioned video conference processing method.
  • This application provides a video conference processing method, processing device, conference system, and storage medium. If the current environment is not equipped with a dedicated multi-point control device for video conferences, a recording and broadcasting terminal takes the place of the multi-point control device to support the video conference, meeting the application requirements of multi-point video conferencing.
  • When the recording and broadcasting terminal detects that the number of participating terminals of the target video conference reaches the participation threshold, it sends target video encoding parameters to each participating terminal, so that the participating terminals reduce their own video encoding parameters to the target values. This lowers the video bit rate of the stream data transmitted from the participating terminals to the recording and broadcasting terminal, reduces bandwidth occupation, and reduces the risk of packet loss. In addition, after receiving the video stream data of each participating terminal, the recording and broadcasting terminal does not need to perform scaling before screen-combining, which improves processing efficiency.
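  • A back-of-the-envelope illustration of the bandwidth effect described above; the per-stream bit rates are assumptions for illustration (the full-resolution figure is chosen within the 6~8 Mbps range cited later in the description), not measured values.

```python
# Back-of-the-envelope illustration of the bandwidth saving. The per-stream
# bit rates are illustrative assumptions, not measured values.
def total_inbound_mbps(num_streams, per_stream_mbps):
    """Aggregate bit rate arriving at the recording and broadcasting terminal."""
    return num_streams * per_stream_mbps

# Three participating terminals at full resolution (~7 Mbps each) versus at
# a reduced target resolution (~1.8 Mbps each):
print(total_inbound_mbps(3, 7.0))   # aggregate before adjustment, in Mbps
print(total_inbound_mbps(3, 1.8))   # aggregate after adjustment, in Mbps
```

  • Under these assumed rates the inbound traffic drops from about 21 Mbps to about 5.4 Mbps, which is the bandwidth and packet-loss benefit the application claims.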
  • FIG. 1 is a schematic structural diagram of a multi-point video conferencing system;
  • FIG. 2 is a schematic structural diagram of an optional example of a video conference system suitable for the video conference processing method proposed in this application;
  • FIG. 3 is a schematic diagram of the hardware structure of another optional example of a video conference system suitable for the video conference processing method proposed in this application;
  • FIG. 4 is a schematic flow diagram of an optional example of the video conference processing method implemented on the recording and broadcasting terminal side proposed in this application;
  • FIG. 5 is a schematic flow diagram of another optional example of the video conference processing method implemented on the recording and broadcasting terminal side proposed in this application;
  • FIG. 6 is a schematic flow diagram of an optional example of the video conference processing method implemented on the participant terminal side proposed in this application;
  • FIG. 7 is a schematic structural diagram of an optional example of a video conference processing device proposed in this application;
  • FIG. 8 is a schematic structural diagram of another optional example of a video conference processing device proposed in this application;
  • FIG. 9 is a schematic structural diagram of yet another optional example of a video conference processing device proposed in this application.
  • the recording and broadcasting terminal can be used as the participating terminal of the multipoint video conference system, and each participating terminal can access the multipoint control unit (Multipoint Control Unit, MCU) equipment to meet the video communication requirements between multiple participating terminals.
  • The multi-point video conferencing system may not be equipped with a separate MCU device (that is, a multi-point control device).
  • In this case, a recording and broadcasting terminal with a built-in MCU can serve as a temporary multi-point control device, handling the call access of each participating terminal and the processing and transmission of each participating terminal's audio and video streams, so that the entire system realizes the video conference in a "one hosting two" or "one hosting three" group-meeting manner.
  • For the recording and broadcasting terminal used as the temporary MCU device for this meeting, once its MCU function is activated and it switches to the MCU working mode (that is, the multi-point control working mode), it receives the video stream data sent by each participating terminal at the default video resolution (such as 1920*1080 or 1280*720). When the recording and broadcasting terminal in the MCU working mode receives 2, 3, or even more such channels, the bit rate may reach 6~8 Mbps (megabits per second, a transmission rate unit referring to the number of bits transmitted per second), which occupies a large amount of bandwidth, increases the probability of packet loss during video stream data transmission, and reduces the data transmission reliability of the multipoint video conference.
  • Moreover, after the recording and broadcasting terminal acting as the temporary MCU device receives the video stream data sent by each participating terminal, it also needs to decode the streams, zoom them, and synthesize them into the same picture, which consumes considerable CPU resources of the terminal, affecting the working performance of the recording and broadcasting terminal and reducing data processing efficiency.
  • To address this, the present application proposes that, after the group meeting succeeds, the target video resolution (that is, a video encoding parameter) required by the recording and broadcasting terminal (that is, a terminal that can switch to the MCU working mode and serve as a temporary MCU device) be determined according to the network performance parameters, working performance parameters, and default video resolutions of the participating terminals.
  • The temporary MCU device can then send the target video resolution to each participating terminal, so that each participating terminal adjusts the resolution of the video stream data it is about to send. This reduces the bit rate of video stream transmission during the meeting, reduces bandwidth occupation, and reduces the risk of packet loss during video stream data transmission.
  • After this uniform adjustment of the video resolution, when the temporary MCU device processes the received video stream data, it can directly perform screen-combining on the decoded streams, saving the resources and time that a scaling step would occupy and improving data processing efficiency.
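  • The saving from skipping the zoom step can be seen in a toy compositor: because every decoded frame already arrives at the tile resolution, combining is a pure copy with no interpolation. Frames are modeled here as 2-D lists of pixel values; real code would operate on YUV/RGB buffers, and the function name and layout are illustrative only.

```python
# Sketch of combining decoded frames into one composite without scaling,
# possible because all participants already encode at the tile resolution.
# Frames are 2-D lists of pixels; all names are illustrative.

def combine_frames(frames, cols, tile_w, tile_h):
    rows = -(-len(frames) // cols)  # ceiling division
    canvas = [[0] * (cols * tile_w) for _ in range(rows * tile_h)]
    for idx, frame in enumerate(frames):
        r, c = divmod(idx, cols)
        for y in range(tile_h):
            # Copy one tile row directly -- no zoom/interpolation step.
            canvas[r * tile_h + y][c * tile_w:(c + 1) * tile_w] = frame[y]
    return canvas

# Two 2x2 "frames" side by side in a one-row, two-column layout:
a = [[1, 1], [1, 1]]
b = [[2, 2], [2, 2]]
print(combine_frames([a, b], cols=2, tile_w=2, tile_h=2))
# -> [[1, 1, 2, 2], [1, 1, 2, 2]]
```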
  • a flow chart is used in the present application to describe the operations performed by the system according to the embodiment of the present application. It should be understood that the preceding or following operations are not necessarily performed in the exact order. Instead, various steps may be processed in reverse order or simultaneously. At the same time, other operations can be added to these procedures, or a certain step or steps can be removed from these procedures.
  • Referring to FIG. 2, which is a schematic structural diagram of an optional application scenario of a video conference system suitable for the video conference processing method proposed in this application, the system may include a recording and broadcasting terminal 10 and a plurality of participating terminals 20, wherein:
  • the recording and broadcasting terminal 10 can enter the multi-point control working mode to run, and it can be used as a temporary multi-point control device to support the realization of multi-point video conferencing.
  • the recording and broadcasting terminal 10 can have a built-in multi-point control unit MCU, and when it starts running, the recording and broadcasting terminal 10 can be operated in a multi-point control working mode, and the implementation process will not be described in detail in this application.
  • To enable the recording and broadcasting terminal 10 to support video conferencing, the recording and broadcasting terminal 10 may include, but is not limited to, a first communication interface 11, a first memory 12, and a first processor 13.
  • The first memory 12 can be used to store the first program of the video conference processing method implemented on the recording and broadcasting terminal side proposed in the present application; the first processor 13 can be used to load and execute the first program stored in the first memory 12, so as to implement the video conference processing method described on the recording and broadcasting terminal side in the following embodiments; the implementation process is not described in detail in this embodiment of the present application.
  • the first communication interface 11 , the first memory 12 and the first processor 13 can be deployed in the built-in MCU of the recording terminal 10 , and the deployment method will not be described in detail in this application.
  • the first processor 13 may be the above-mentioned MCU.
  • Of course, the first communication interface 11, the first memory 12 and the first processor 13 may also be directly deployed in the housing of the recording and broadcasting terminal 10; this application does not limit the deployment manner.
  • In the application scenario, the recording and broadcasting terminal 10 can serve as a temporary MCU device: it starts its built-in MCU function, enters the multi-point control working mode, and establishes media sessions with the participating terminals 20 of this video conference.
  • Through these media sessions, video stream data is exchanged between each participating terminal 20 and the recording and broadcasting terminal 10 and, via the recording and broadcasting terminal 10, between the participating terminals 20 themselves; the implementation can be determined with reference to the working principle of the multi-point control equipment in a video conferencing system.
  • The process by which the built-in MCU of the recording and broadcasting terminal 10 realizes the tandem connection, distribution, and interactive processing of the audio, video, data, and signaling of each participating terminal 20 is not described in detail in this application.
  • The first communication interface 11 may include, but is not limited to, the data interfaces of communication modules such as a WIFI module, a 4G/5G/6G (fourth/fifth/sixth generation mobile communication network) module, a GPRS module, or a GSM module, to realize data interaction with other terminals. As required, it may also include USB interfaces, serial/parallel ports, and various types of multimedia interfaces to realize wired connections with the corresponding interfaces of other terminals, as well as interfaces for data interaction among the components inside the recording and broadcasting terminal 10. This application does not limit the type and quantity of the interfaces included in the first communication interface 11, which depend on the situation.
  • The above-mentioned first memory 12 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device or other non-volatile solid-state storage device.
  • The first processor 13 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), or another programmable logic device.
  • Optionally, the above-mentioned first processor 13 may include, but is not limited to, an audio processor, a video processor, a data processor, a control processor, a multiplexer, etc., as determined by the processing requirements of the video conference processing method. This application does not limit the type and number of processors included in the first processor 13, which may be determined according to the actual situation.
  • The built-in MCU of the recording and broadcasting terminal is the core of the system; it provides management and control functions for multipoint video conferences and usually includes multipoint controllers and multipoint processors.
  • Appropriate conference control methods, such as chairman control, speaker control, and voice control, can be used to conduct conferences; the implementation process is not described in detail in this application. It should be noted that this application does not limit the product type of the above-mentioned recording and broadcasting terminal with a built-in MCU.
  • The participant terminal 20 may be an electronic device used by a user to participate in a video conference, which may include, but is not limited to, a smart phone, a tablet computer, a wearable device, a smart watch, augmented reality (AR) equipment, virtual reality (VR) equipment, vehicle-mounted equipment, a robot, a desktop computer, etc. Users can select appropriate electronic equipment according to the needs of the scene, request access to the target video conference, and establish a multimedia session with the above-mentioned recording and broadcasting terminal 10 (that is, the temporary MCU device) to realize video stream data interaction between the two.
  • In terms of hardware structure, the above-mentioned participant terminal 20 may include, but is not limited to, a display 21, an audio player 22, an audio collector 23, an image collector 24, a second communication interface 25, a second memory 26, a second processor 27, etc., as determined by the functional requirements of the participant terminal 20; this application does not list them one by one.
  • The display 21 may include a display panel, such as a touch display panel or a non-touch display panel; this application does not limit the content display principle and structure of the display 21.
  • The display 21 can present the target video stream data obtained by the MCU's screen-combining processing, that is, the conference interface of the video conference. This application does not impose restrictions on the layout and content of the conference interface, which can be determined according to the situation.
  • the audio player 22 may include a loudspeaker, etc., for outputting audio signals in the target video stream data; the audio collector 23 may include a pickup, etc., for collecting speech audio signals, etc.
  • the image collector 24 can include a camera etc., and is used to collect the image information of the meeting participants.
  • The category and working principle of these input/output devices of the participant terminal 20 are not described in detail in this application; they can be determined according to the application requirements of the video conference.
  • For the category of the second communication interface 25 of the participating terminal 20, refer to the above description of the first communication interface 11. It can be understood that the first communication interface 11 and the second communication interface 25 include at least one pair of matching communication interfaces to establish a multimedia session connection between the two terminals and realize multimedia data interaction between them.
  • The second memory 26 can be used to store the second program of the video conference processing method implemented on the participant terminal side proposed by this application; the second processor 27 can be used to load and execute the second program stored in the second memory 26 to implement the video conference processing method described on the participant terminal side in the following embodiments; the implementation process is not described in detail in this embodiment of the present application.
  • For the device types of the second memory 26 and the second processor 27, reference may be made to, but is not limited to, the above description of the device types of the first memory 12 and the first processor 13, which will not be repeated in this embodiment.
  • the conference participant terminals may also include sensor modules composed of various sensors, antennas, power management modules and other devices, which are not listed here in this application.
  • It can be understood that the structure of the video conferencing system shown in FIG. 2 and FIG. 3 does not constitute a limitation on the video conferencing system proposed in the embodiments of the present application. In practical applications, the video conferencing system may include more or fewer components, or a combination of certain components, as determined by the needs of the scene; this application does not limit this.
  • FIG. 4 is a schematic flowchart of an optional example of the video conference processing method proposed by this application.
  • This method can be applied to the above-mentioned recording and broadcasting terminal.
  • the embodiment of the present application does not limit the product type of the recording and broadcasting terminal.
  • the recording and broadcasting terminal can have a built-in MCU, which can be used as a temporary multi-point control device during the video conference.
  • the video conference processing method proposed in this embodiment may include:
  • Step S11: obtaining the number of participating terminals of the target video conference;
  • The target video conference can be a multi-point conference constructed for any business scenario. In the target video conference, the above-mentioned recording and broadcasting terminal with a built-in MCU serves as the temporary MCU device, which can also be regarded as a session server meeting the management and control requirements of this video conference. Therefore, a participating terminal of the target video conference may refer to a terminal that establishes a media session connection with the recording and broadcasting terminal with a built-in MCU, which may include, but is not limited to, the electronic devices listed above.
  • The recording and broadcasting terminal with a built-in MCU can activate the MCU and, after entering the multi-point control working mode, actively call several terminals to participate in the target video conference; alternatively, a terminal can actively send a conference access request for the target video conference to the recording and broadcasting terminal with a built-in MCU to request participation. The implementation can follow, but is not limited to, SIP (Session Initiation Protocol).
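  • Since the description names SIP as one possible signalling protocol, a coding-adjustment request could be sketched as a SIP message whose SDP body caps the video resolution. This is an assumption of this sketch: whether the application would use an SDP "a=imageattr" attribute (RFC 6236) or proprietary signalling is not specified, the SIP URI is a placeholder, and required headers such as Via/From/To/CSeq are omitted for brevity.

```python
# Illustrative sketch (not from the application): a coding-adjustment
# request over SIP, with the target resolution expressed as an SDP
# "a=imageattr" attribute (RFC 6236). The URI is a placeholder and
# mandatory SIP headers (Via/From/To/CSeq) are omitted for brevity.

def build_adjust_request(call_id: str, width: int, height: int) -> str:
    sdp = (
        "v=0\r\n"
        "m=video 9 RTP/AVP 96\r\n"
        "a=rtpmap:96 H264/90000\r\n"
        f"a=imageattr:96 recv [x={width},y={height}]\r\n"
    )
    return (
        "INVITE sip:participant@example.invalid SIP/2.0\r\n"
        f"Call-ID: {call_id}\r\n"
        "Content-Type: application/sdp\r\n"
        f"Content-Length: {len(sdp)}\r\n"
        "\r\n" + sdp
    )

print(build_adjust_request("abc123", 960, 540))
```

  • A participating terminal receiving such a request would answer with an adjusted SDP and reconfigure its encoder to the target resolution before sending further video frames.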
  • Step S12: detecting that the number of participating terminals reaches the participation threshold, and sending target video encoding parameters to the participating terminals;
  • In practical applications, the recording and broadcasting terminal with a built-in MCU used as an MCU device can support only a limited number of participating terminals; for example, a maximum of 4 participating terminals may participate in a video conference. As the number of terminals increases, the code rate gradually increases and the occupied bandwidth grows, which increases the risk of packet loss and occupies more CPU resources of the recording and broadcasting terminal, affecting its working performance. Therefore, when the number of participating terminals reaches a certain number, namely the participation threshold, the participating terminals can be notified to adjust their original video encoding parameters to unified target video encoding parameters, so as to reduce the bit rate, simplify the video processing steps, reduce the occupied CPU resources, and improve processing efficiency.
  • The above participation threshold is used to determine whether to trigger the recording and broadcasting terminal to send the target video encoding parameters to each participating terminal, so that the participating terminals perform video encoding according to the target parameters; it is the minimum number of participating terminals accessing the recording and broadcasting terminal required for the trigger. It can be understood that the participation threshold is less than the maximum number of participating terminals supported by the recording and broadcasting terminal, and this application does not limit its specific value.
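  • The threshold-triggered behaviour can be sketched as follows; the class name, the threshold value of 2 (one of the example values in the text), and the parameter dictionary are illustrative only.

```python
# Sketch of the participation-threshold trigger. All names and values are
# illustrative; real code would live in the recording terminal's MCU logic.

class TemporaryMcu:
    PARTICIPATION_THRESHOLD = 2                    # e.g. 2 participating terminals
    TARGET_PARAMS = {"width": 960, "height": 540}  # target video encoding params

    def __init__(self):
        self.participants = []   # terminals with an established media session
        self.sent = []           # terminals already asked to downscale

    def on_join(self, terminal_id):
        self.participants.append(terminal_id)
        # Only once the participation threshold is reached are the target
        # video encoding parameters pushed to every participating terminal.
        if len(self.participants) >= self.PARTICIPATION_THRESHOLD:
            for t in self.participants:
                if t not in self.sent:
                    self.sent.append(t)
                    self.send_target_params(t)

    def send_target_params(self, terminal_id):
        # Placeholder for the real signalling step (e.g. a SIP request
        # carrying the target resolution).
        print(f"-> {terminal_id}: {self.TARGET_PARAMS}")
```

  • Below the threshold, streams pass through untouched; at the threshold, every already-joined terminal is asked to downscale as well, so all streams arrive uniformly encoded.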
  • Based on this, when the recording and broadcasting terminal recognizes that the number of participating terminals in the target video conference reaches the participation threshold (such as 2), it can trigger the processing mechanism proposed in this application for adjusting the video encoding parameters of the participating terminals.
  • Obtaining a smaller bit rate requires reducing the video coding parameters of the participating terminals themselves; this application does not limit the magnitude of the reduction, which depends on the situation.
  • To unify the video coding parameters of the participating terminals, the recording and broadcasting terminal can send the target video encoding parameters required by its built-in MCU to each participating terminal. That is, each participating terminal needs to adjust its own video encoding parameters to the target value; this application does not limit the type and value of the target video encoding parameters.
  • Usually, the target video coding parameter is smaller than the participating terminal's own video coding parameter (such as the video stream coding parameter used by default in the video collection process). The video coding parameter in this application may include, but is not limited to, video resolution.
  • the transmission mode of the target video encoding parameters can be implemented according to the communication mode between the recording terminal and each participating terminal, and the embodiment of the present application does not describe it in detail.
  • Before the threshold is reached, the participating terminals obtain video stream data according to their original video encoding parameters and can send it directly to the recording and broadcasting terminal. That is to say, the recording and broadcasting terminal does not need to send the target video encoding parameters to each participating terminal; it simply forwards the received video stream data to the other participating terminals for output, meeting the communication requirements between the participants of the video conference. The implementation process will not be described in detail in this application.
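The threshold decision described above can be sketched as follows. This is a minimal illustration, not code from the application; the parameter values and the function name are assumptions chosen for the example.

```python
# Sketch of the threshold-triggered mechanism: below the participation
# threshold the MCU leaves terminals on their defaults and just forwards
# streams; at or above it, every participant is asked to re-encode with
# the MCU's target parameters. All names and values are illustrative.

DEFAULT_PARAMS = {"width": 1920, "height": 1080}   # assumed terminal default
TARGET_PARAMS = {"width": 960, "height": 540}      # assumed MCU requirement

def params_for_conference(num_participants, threshold=2):
    """Return the encoding parameters the MCU should push to participants,
    or None when terminals may keep their own defaults (below threshold)."""
    if num_participants >= threshold:
        return TARGET_PARAMS
    return None

# Below the threshold the MCU simply forwards streams unchanged.
assert params_for_conference(1) is None
# At or above the threshold every participant is asked to re-encode.
assert params_for_conference(2) == {"width": 960, "height": 540}
```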
  • Step S13 receiving the video stream data with target video encoding parameters sent by the participating terminals;
  • After each participating terminal in the target video conference receives the target video encoding parameters sent by the recording and broadcasting terminal with built-in MCU, it can adjust its own video encoding parameters accordingly, and then perform video stream data collection (such as video recording) according to the adjusted target video encoding parameters, to obtain video stream data with the target video encoding parameters and send it to the recording and broadcasting terminal with built-in MCU. This application does not describe the acquisition of the video stream data or its transmission timing in detail.
  • Step S14: performing screen-combination processing on the video stream data of the multiple participating terminals, and sending the obtained target video stream data to each participating terminal.
  • Each participating terminal reduces its own video encoding parameters and then performs video recording; the amount of the obtained video stream data (that is, the data obtained by encoding the directly collected video data with the adjusted video encoding parameters) is smaller than that of the video stream data obtained with the pre-adjustment video encoding parameters. In other words, the file size transmitted by the participating terminals is reduced, thereby reducing the bandwidth occupied by video stream data transmission and lowering the risk of packet loss.
  • After the recording and broadcasting terminal with built-in MCU receives the video stream data sent by each participating terminal, in order to display the conference window of each participant in the same video conference interface, it needs to combine the multiple video streams onto the same screen and feed the resulting target video stream data back to each participating terminal for output. In this way, each participant can, through the video conference interface output by its participating terminal, see the conference sub-windows corresponding to the multiple participating terminals, each sub-window presenting the video collected by the corresponding participating terminal. This application does not limit how the MCU built into the recording and broadcasting terminal implements step S14.
  • Because the received streams share the same encoding parameters, the recording and broadcasting terminal with built-in MCU can process the multiple video streams together directly, without any prior encoding adjustment, which improves processing efficiency.
  • Referring to FIG. 5, a schematic flow diagram of another optional example of the video conference processing method proposed by this application. This embodiment can be an optional detailed implementation of the video conference processing method described above, but is not limited to that detailed implementation; the method is still executed by the recording and broadcasting terminal with built-in MCU. As shown in FIG. 5, the method may include:
  • Step S21 responding to the multi-point control function trigger request for the target video conference, controlling the recording and broadcasting terminal to enter the multi-point control working mode;
  • this application can choose to use the SIP protocol to realize the video interaction protocol between each participating terminal participating in the target video conference and the recording and broadcasting terminal with a built-in MCU.
  • For example, the recording and broadcasting terminal can act in the teacher role: after the built-in MCU function of the recording and broadcasting terminal is enabled, the interaction protocol of its recording and broadcasting client (that is, the recording and broadcasting application program) is configured as the SIP protocol, and the IP address of the built-in MCU is entered to place a call and establish media sessions with each participating terminal.
  • In practice, relevant personnel can trigger the MCU function option by opening the configuration page of the recording and broadcasting terminal with built-in MCU, that is, triggering the multi-point control function of the recording and broadcasting terminal, or trigger it through a shortcut trigger mode of the multi-point control function, thereby generating a multi-point control function trigger request for the target video conference. After the recording and broadcasting terminal detects this trigger request, it responds to it and starts its built-in MCU function. The triggering implementation is not limited to the methods described in this embodiment.
  • Step S22 receiving a conference access request for a target video conference sent by a participating terminal, and establishing a media session connection between the recording terminal and the participating terminal;
  • In some embodiments, the recording and broadcasting terminal with built-in MCU can actively call each participating terminal to access the target video conference. In some other embodiments, as described in step S22, any terminal that wants to participate in the target video conference can actively send a conference access request to the recording and broadcasting terminal with built-in MCU; the conference access request can be generated according to the Session Initiation Protocol, such as a SIP INVITE request. This application does not limit the content and format of the conference access request. It can be understood that the conference access request may generally carry the conference identification number of the target video conference and the like.
  • After the recording and broadcasting terminal with the built-in MCU receives the conference access request sent by any terminal and determines that the terminal is allowed to access the target video conference, it can feed back a response message to the conference access request, informing the participating terminal that the recording and broadcasting terminal has received its request. After receiving the response message, the participating terminal can further feed back an acknowledgment to the recording and broadcasting terminal to confirm receipt of the response, such as an ACK (Acknowledgment) message.
  • Other terminals that want to participate in the target video conference can access it in the same way, constructing media session connections with the recording and broadcasting terminal (with built-in MCU) of the target video conference and becoming participating terminals of the video conference; this application does not describe each case in detail.
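The INVITE / response / ACK exchange described above can be sketched as a toy session setup. This is not a real SIP stack; the classes and method names are assumptions used only to make the message flow concrete.

```python
# Illustrative sketch of the conference access handshake: a terminal sends
# an INVITE-style access request, the MCU-equipped recording terminal
# answers with a response message, and the terminal then treats the media
# session as established (and would send an ACK). Names are assumed.

class RecordingTerminalMCU:
    def __init__(self):
        self.participants = []

    def handle_invite(self, terminal_id, conference_id):
        # Decide that the terminal may join the target conference,
        # then feed back a response message to the access request.
        self.participants.append(terminal_id)
        return "200 OK"

class ParticipantTerminal:
    def __init__(self, terminal_id):
        self.terminal_id = terminal_id
        self.session_established = False

    def join(self, mcu, conference_id):
        # Conference access request (SIP INVITE in the text's example).
        response = mcu.handle_invite(self.terminal_id, conference_id)
        if response == "200 OK":
            self.session_established = True  # terminal then sends ACK

mcu = RecordingTerminalMCU()
t1 = ParticipantTerminal("room-1")
t1.join(mcu, "conf-42")
assert t1.session_established and mcu.participants == ["room-1"]
```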
  • Step S23 obtaining the number of participating terminals of the target video conference
  • Step S24: detecting that the number of participating terminals has reached the participation threshold, and obtaining the target video resolution configured for the multi-point control performance of the recording and broadcasting terminal;
  • As the number of participating terminals increases, the bandwidth occupied by the video stream data exchanged between the participating terminals and the recording and broadcasting terminal with built-in MCU gradually increases, but network resources are limited. This affects data transmission performance and can even cause packet loss when some video stream data fails to be transmitted, making the video content output by the video conference interface unsmooth.
  • Therefore, when the built-in MCU of the recording and broadcasting terminal recognizes that the number of participating terminals accessing the target video conference reaches the participation threshold, it aims to reduce the video resolution of each participating terminal's video stream data so as to reduce the bit rate (that is, the number of data bits transmitted per unit time), thereby reducing the size of the video file (that is, the file where the video stream data resides) transmitted from each participating terminal to the recording and broadcasting terminal.
  • In some embodiments, the recording and broadcasting terminal can determine the target video resolution according to its multi-point control performance (such as network performance and available CPU resources), that is, the target video resolution required by the built-in MCU, such as 960*540. This is usually smaller than the default video resolution of each participating terminal, such as 1920*1080 or 1280*720. However, to avoid excessive distortion of the obtained video image content caused by too small a sampling rate, the target video resolution can be determined according to the default video resolution of each participating terminal. The present application does not limit the value of the target video resolution; it depends on the situation.
  • In some embodiments, the recording and broadcasting terminal with built-in MCU may support up to 4 participating terminals accessing the target video conference. The conference sub-windows contained in the video conference interface output by each participating terminal can be laid out as a left-right interface, a three-pane interface, or a four-grid interface. Therefore, the target video resolution obtained in this application may be 1/2 of the default video resolution of the participating terminals, but it is not limited thereto.
  • That is, the video interface layout format of the target video conference can be determined according to the number of participating terminals, and the target video resolution can then be determined according to that layout format, but the acquisition of the target video resolution is not limited to the method proposed in this application.
  • In still other embodiments, the target video resolution for the built-in MCU can also be obtained in response to a video resolution adjustment performed by operating a corresponding video resolution adjustment button of the recording and broadcasting terminal (such as a physical button or a virtual function button), or determined by voice or other input methods; this application does not give further examples or details here.
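The layout-driven choice of target resolution described above can be sketched briefly. The halving rule follows the text's example (sub-windows in left-right, three-pane, or four-grid layouts fit into at most half the screen in each dimension); the helper name is an assumption.

```python
# Sketch of deriving the target resolution from the conference layout:
# with 2 to 4 sub-windows, each window needs at most half the default
# width and height, so the MCU can ask participants for 1/2 resolution.

def target_resolution(default_w, default_h, num_participants):
    if num_participants <= 1:
        return default_w, default_h          # single window: keep default
    # left-right, three-pane, and four-grid layouts all fit sub-windows
    # into at most half the screen in each dimension
    return default_w // 2, default_h // 2

assert target_resolution(1920, 1080, 4) == (960, 540)
assert target_resolution(1280, 720, 2) == (640, 360)
```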
  • Step S25 according to the session initiation protocol, send a coding adjustment request carrying the target video resolution to the participating terminals;
  • After the target video resolution is determined, an encoding adjustment request containing the target video resolution can be generated, such as a SIP INFO request, and sent to each participating terminal accessing the target video conference. After each participating terminal receives it, a corresponding response message can be fed back, such as a 200 OK response, and at the same time the default video resolution can be adjusted to the target video resolution in response to the encoding adjustment request, so as to obtain video stream data with the target video resolution.
  • This application does not describe in detail how a participating terminal implements the adjustment of its own video resolution encoding, which can be determined according to the configuration method of the codec's encoding and decoding parameters.
  • When a participating terminal modifies its own encoding parameters according to the target video resolution, it reduces its own video resolution, for example from 1920*1080 to 960*540, which reduces the bit rate of the video stream it transmits to the recording and broadcasting terminal with built-in MCU; the adjustment is not limited to 1920*1080 to 960*540.
  • Otherwise, the built-in MCU may need to receive an aggregate video bit rate of 6 Mbps, resulting in a large amount of packet loss, which eventually leads to blurred and frozen video output on the participating terminals, reducing the user experience. After the adjustment, the video bit rate can be reduced by a factor of about 4 (that is, when there are 4 participating terminals), so that only a video bit rate of 1~1.5 Mbps needs to be received under the same network quality, reducing bandwidth occupation and improving video image output quality.
  • Step S26: receiving the video stream data with the target video resolution sent by the participating terminals;
  • Step S27: decoding the video stream data with the target video resolution sent by the multiple participating terminals;
  • Step S28: merging the same frame of video stream data corresponding to the decoded multiple participating terminals to obtain a corresponding frame of video stream data with the target video resolution;
  • Step S29: encoding the obtained multi-frame video stream data to obtain target video stream data to be output;
  • Step S210: sending the target video stream data to the multiple participating terminals.
  • After decoding, for the same frame of video stream data (that is, the same frame of video image data) from different participating terminals, the recording and broadcasting terminal can merge them into one corresponding frame of video image with the target video resolution; merging frame by frame in this way yields the merged video stream data, whose content includes the video stream data sent by the multiple conference participants.
  • Before sending the merged video stream data, the recording and broadcasting terminal with built-in MCU needs to encode it first, and then send the encoded video stream data to each participating terminal, so that the participating terminal can decode it with a corresponding decoding method and output it. This application does not limit the encoding and decoding implementation of the video stream data.
  • To sum up, the recording and broadcasting terminal with built-in MCU recognizes that the number of participating terminals accessing the target video conference reaches the participation threshold and sends the target video resolution required by the built-in MCU to each participating terminal, so that each participating terminal reduces its original video resolution to the target video resolution, collects video accordingly, and transmits the obtained video stream data to the recording and broadcasting terminal with built-in MCU. This reduces the bit rate and the occupied bandwidth, thereby achieving the effect of lowering the risk of packet loss.
  • Moreover, after the recording and broadcasting terminal with built-in MCU obtains the video stream data sent by the multiple participating terminals, since the video resolutions are all the same, no scaling is required; the screen-combination processing can be performed directly to obtain the target video stream data to be output, which improves processing efficiency.
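The frame-by-frame screen combination of steps S27 to S29 can be illustrated with a toy compositor. Frames are plain 2-D lists of pixel values here rather than decoded YUV/RGB planes, and the 2x2 grid and helper name are assumptions for illustration only.

```python
# Toy sketch of screen combination: decoded same-index frames from four
# participants are tiled into one 2x2 grid frame at the combined size.
# A real MCU would then re-encode the merged frames before sending.

def merge_2x2(frames):
    """frames: four equally sized 2-D pixel grids -> one combined grid."""
    top_left, top_right, bottom_left, bottom_right = frames
    top = [l + r for l, r in zip(top_left, top_right)]        # side by side
    bottom = [l + r for l, r in zip(bottom_left, bottom_right)]
    return top + bottom                                        # stacked rows

# Four 2x2 "frames" with a distinct pixel value per participant.
frames = [[[i, i], [i, i]] for i in range(4)]
combined = merge_2x2(frames)
assert combined == [[0, 0, 1, 1], [0, 0, 1, 1], [2, 2, 3, 3], [2, 2, 3, 3]]
```

Because every participant already sends the same target resolution, the compositor never needs a per-stream scaling step, which is the efficiency gain the text describes.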
  • FIG. 6 it is a schematic flow diagram of another optional example of the video conference processing method proposed by the present application.
  • The method of this embodiment of the present application is executed by any participating terminal, which cooperates with the recording and broadcasting terminal with built-in MCU to implement the video conference processing method proposed in this application; for the method steps performed by the recording and broadcasting terminal, reference may be made to the description of the corresponding parts of the above embodiments. This embodiment describes the implementation process of the video conference processing method from the participating terminal side. As shown in FIG. 6, the method may include:
  • Step S31 establishing a media session connection with the recording and broadcasting terminal for the target video conference
  • As described above, a terminal that wants to participate in the target video conference can initiate a SIP INVITE request to establish a media session connection with the recording and broadcasting terminal; after receiving the response message fed back by the recording and broadcasting terminal, it can send an ACK message to the recording and broadcasting terminal. For the implementation process, reference may be made to the description of the corresponding part above, which will not be repeated in this embodiment.
  • Any participating terminal that establishes a media session with the recording and broadcasting terminal with built-in MCU can exchange video stream data with the built-in MCU of the recording and broadcasting terminal; the resolution of the video initially sent may be a default video resolution, such as 1080P, but is not limited thereto.
  • Step S32 receiving the target video encoding parameters sent by the recording and broadcasting terminal
  • the built-in MCU of the recording and broadcasting terminal recognizes that the number of participating terminals reaches the participation threshold, and sends the determined target video encoding parameters to each participating terminal.
  • the target video coding parameter is smaller than the corresponding video coding parameter of the participating terminals, which can achieve the technical effect of reducing the bit rate.
  • the target video encoding parameters may include but not limited to target video resolution.
  • Step S33 adjusting the video coding parameters of the participating terminals to the target video coding parameters
  • Step S34 obtaining video stream data with target video encoding parameters, and sending the video stream data to the recording and broadcasting terminal;
  • After the participating terminals adjust their own video encoding parameters, they perform video capture and encoding according to the target video encoding parameters, which reduces the video bit rate transmitted to the recording and broadcasting terminal with built-in MCU, reduces bandwidth occupation, and lowers the risk of packet loss.
  • Step S35 receiving the target video stream data sent by the recording and broadcasting terminal, decoding the target video stream data, and playing the decoded video stream data.
  • To sum up, when the number of participating terminals connected to the built-in MCU of the recording and broadcasting terminal reaches the participation threshold, the participating terminals can adjust their own video encoding parameters according to the target video encoding parameters required by the built-in MCU, so as to reduce the video bit rate of their video stream data transmission and reduce bandwidth occupation. Moreover, each participating terminal sends video stream data with unified video encoding parameters to the built-in MCU of the recording and broadcasting terminal, which spares the built-in MCU the scaling processing and improves processing efficiency.
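The participant-side steps S31 to S35 can be condensed into the following sketch. The class and method names are assumptions made for illustration, not an API defined by the application.

```python
# Condensed sketch of the participant side: once the session is up, the
# terminal swaps its default encoder settings for the MCU-supplied target
# parameters (step S33) before capturing and sending streams (step S34).

class Participant:
    def __init__(self, width=1920, height=1080):
        self.width, self.height = width, height   # default encoding params

    def apply_target_params(self, width, height):
        # Step S33: adjust own encoding parameters to the MCU's targets.
        self.width, self.height = width, height

    def capture_frame_descriptor(self):
        # Step S34: streams are now produced at the adjusted resolution.
        return {"width": self.width, "height": self.height}

p = Participant()
p.apply_target_params(960, 540)                  # steps S32/S33
assert p.capture_frame_descriptor() == {"width": 960, "height": 540}
```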
  • FIG. 7 it is a schematic structural diagram of an optional example of a video conference processing device proposed in this application.
  • the device can be described from the side of the recording and broadcasting terminal with a built-in MCU.
  • the device may include:
  • The participating terminal quantity acquisition module 31 is used to obtain the number of participating terminals of the target video conference; wherein a participating terminal refers to a terminal that establishes a media session connection with the recording and broadcasting terminal;
  • The target video encoding parameter sending module 32 is used to send the target video encoding parameters to the participating terminals upon detecting that the number of participating terminals reaches the participation threshold; the target video encoding parameters are smaller than the corresponding video encoding parameters of the participating terminals;
  • a video stream data receiving module 33 configured to receive the video stream data with the target video coding parameters sent by the participating terminals;
  • the video stream combined screen processing module 34 is configured to perform combined screen processing on the video stream data of a plurality of participating terminals, and send the obtained target video stream data to the participating terminals.
  • the above-mentioned target video coding parameter sending module 32 may include:
  • the target video resolution acquisition unit 321 is used to acquire the target video resolution configured for the multi-point control performance of the recording and broadcasting terminal; the target video resolution is less than the video resolution configured by the participating terminals;
  • The encoding adjustment request sending unit 322 is configured to send an encoding adjustment request carrying the target video resolution to the participating terminal according to the Session Initiation Protocol, so that the participating terminal, in response to the encoding adjustment request, adjusts its default video resolution to the target video resolution and obtains video stream data having the target video resolution.
  • the above-mentioned target video resolution obtaining unit 321 may include:
  • a video interface layout format determining unit configured to determine the video interface layout format of the target video conference according to the number of participating terminals
  • the target video resolution determination unit is configured to determine the target video resolution configured for the multi-point control performance of the recording and broadcasting terminal according to the layout format of the video interface and the video resolution configured by the participating terminals.
  • the video stream combining screen processing module 34 may include:
  • a decoding unit 341, configured to decode the video stream data with the target video coding parameters sent by each of the plurality of participating terminals;
  • Merge processing unit 342 configured to merge the same frame of video stream data corresponding to multiple decoded participating terminals to obtain corresponding frame video stream data having the target video encoding parameters
  • An encoding unit 343, configured to encode the obtained multi-frame video stream data to obtain target video stream data to be output;
  • a target video stream data sending unit 344 configured to send the target video stream data to a plurality of participating terminals.
  • the above-mentioned device may also include:
  • the media session establishment module is used to establish the media session connection between the participating terminal and the recording and broadcasting terminal;
  • the media session establishment module may include:
  • a multi-point control working mode start unit used to respond to a multi-point control function trigger request for the target video conference, and control the recording and broadcasting terminal to enter the multi-point control working mode;
  • a meeting access unit configured to receive a meeting access request for the target video conference sent by a participating terminal, and establish a media session connection between the recording terminal and the participating terminal; the meeting access request Generated according to the Session Initiation Protocol.
  • FIG. 9 it is a schematic structural diagram of another optional example of a video conference processing device proposed in this application.
  • the device can be described from the participant terminal side.
  • the device can include:
  • a media session building module 41 is used to set up a media session connection with the recording and broadcasting terminal for the target video conference;
  • the target video encoding parameter receiving module 42 is used to receive the target video encoding parameter sent by the recording and broadcasting terminal; the target video encoding parameter is less than the corresponding video encoding parameter of the participating terminal;
  • a video encoding parameter adjustment module 43 configured to adjust the video encoding parameters of the participating terminals to the target video encoding parameters
  • The video stream data sending module 44 is used to obtain the video stream data with the target video encoding parameters and send it to the recording and broadcasting terminal, so that the recording and broadcasting terminal performs screen-combination processing on the video stream data sent by the multiple participating terminals of the target video conference to obtain the target video stream data to be output;
  • the video stream data playing module 45 is configured to receive the target video stream data sent by the recording and broadcasting terminal, decode the target video stream data, and play the decoded video stream data.
  • The modules and units in the above device embodiments can all be stored as program modules in the memory of the terminal on the corresponding side, with the processor of that terminal executing the program modules stored in the memory to realize the corresponding functions. Regarding the functions realized by each program module and their combinations, and the technical effects achieved, reference may be made to the descriptions of the corresponding parts of the above method embodiments, which will not be repeated in this embodiment.
  • the present application also provides a computer-readable storage medium, on which a computer program can be stored, and the computer program can be invoked and loaded by a processor, so as to implement each step of the video conference processing method described in the above-mentioned embodiments.


Abstract

This application provides a video conference processing method, a processing device, a conference system, and a storage medium. Where the current environment is not configured with a dedicated multi-point control device for video conferencing, this application has a recording and broadcasting terminal stand in for that multi-point control device to support the video conference, meeting the application requirements of multi-point video conferencing. When the recording and broadcasting terminal detects that the number of participating terminals of the target video conference reaches a participation threshold, it sends target video encoding parameters to each participating terminal, so that each participating terminal reduces its own video encoding parameters to the target video encoding parameters. This lowers the video bit rate at which the participating terminals transmit video stream data to the recording and broadcasting terminal, reduces bandwidth occupation, and lowers the risk of packet loss; moreover, after the recording and broadcasting terminal receives the video stream data of the participating terminals, it can perform screen-combination processing without scaling, which reduces processing cost and improves processing efficiency.

Description

Video conference processing method, processing device, conference system, and storage medium
Technical Field
This application mainly relates to the field of video conferencing applications, and more specifically to a video conference processing method, a processing device, a conference system, and a storage medium.
Background Art
With the continuous development of computer network technology and broadband construction, multi-point video conference systems have been widely applied in work, daily life, study, and other fields. In multi-point conference system applications, a Multipoint Control Unit (MCU) is usually required as a multimedia information switch to implement the calling and connection of the multiple terminals participating in the conference, process the audio and video streams sent by each terminal, and send the corresponding audio and video streams to each terminal, enabling the terminals to view and communicate with one another.
Technical Problem
However, in some application scenarios, no MCU device may be deployed, making it impossible to hold a video conference.
Technical Solution
In view of this, this application proposes a video conference processing method, the method including:
obtaining the number of participating terminals of a target video conference, where a participating terminal refers to a terminal that has established a media session connection with a recording and broadcasting terminal;
upon detecting that the number of participating terminals reaches a participation threshold, sending target video encoding parameters to the participating terminals, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminals;
receiving video stream data with the target video encoding parameters sent by the participating terminals; and
performing screen-combination processing on the video stream data of the multiple participating terminals, and sending the obtained target video stream data to each participating terminal.
In some embodiments, sending the target video encoding parameters to the participating terminals includes:
obtaining a target video resolution configured for the multi-point control performance of the recording and broadcasting terminal, the target video resolution being smaller than the video resolution configured by the participating terminals; and
sending, according to the Session Initiation Protocol, an encoding adjustment request carrying the target video resolution to the participating terminals, so that each participating terminal, in response to the encoding adjustment request, adjusts its default video resolution to the target video resolution and obtains video stream data with the target video resolution.
In some embodiments, obtaining the target video resolution configured for the multi-point control performance of the recording and broadcasting terminal includes:
determining a video interface layout format of the target video conference according to the number of participating terminals; and
determining the target video resolution configured for the multi-point control performance of the recording and broadcasting terminal according to the video interface layout format and the video resolution configured by the participating terminals.
In some embodiments, performing screen-combination processing on the video stream data and sending the obtained target video stream data to the participating terminals for playback includes:
decoding the video stream data with the target video encoding parameters sent by each of the multiple participating terminals;
merging the decoded same-frame video stream data corresponding to the multiple participating terminals to obtain a corresponding frame of video stream data with the target video encoding parameters;
encoding the obtained multi-frame video stream data to obtain target video stream data to be output; and
sending the target video stream data to the multiple participating terminals.
In some embodiments, the method for establishing a media session between a participating terminal and the recording and broadcasting terminal includes:
responding to a multi-point control function trigger request for the target video conference, and controlling the recording and broadcasting terminal to enter a multi-point control working mode; and
receiving a conference access request for the target video conference sent by a participating terminal, and establishing a media session connection between the recording and broadcasting terminal and the participating terminal, the conference access request being generated according to the Session Initiation Protocol.
In another aspect, this application also proposes a video conference processing method, the method including:
establishing a media session connection with a recording and broadcasting terminal for a target video conference;
receiving target video encoding parameters sent by the recording and broadcasting terminal, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminal;
adjusting the video encoding parameters of the participating terminal to the target video encoding parameters;
obtaining video stream data with the target video encoding parameters and sending the video stream data to the recording and broadcasting terminal, so that the recording and broadcasting terminal performs screen-combination processing on the video stream data sent by the multiple participating terminals of the target video conference to obtain target video stream data to be output; and
receiving the target video stream data sent by the recording and broadcasting terminal, decoding the target video stream data, and playing the decoded video stream data.
In another aspect, this application also proposes a video conference processing device, the device including:
a participating terminal quantity acquisition module, used to obtain the number of participating terminals of a target video conference, where a participating terminal refers to a terminal that has established a media session connection with a recording and broadcasting terminal;
a target video encoding parameter sending module, used to send target video encoding parameters to the participating terminals upon detecting that the number of participating terminals reaches a participation threshold, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminals;
a video stream data receiving module, used to receive the video stream data with the target video encoding parameters sent by the participating terminals; and
a video stream screen-combination processing module, used to perform screen-combination processing on the video stream data of the multiple participating terminals and send the obtained target video stream data to the participating terminals.
In another aspect, this application also proposes a video conference processing device, the device including:
a media session construction module, used to establish a media session connection with a recording and broadcasting terminal for a target video conference;
a target video encoding parameter receiving module, used to receive target video encoding parameters sent by the recording and broadcasting terminal, the target video encoding parameters being smaller than the corresponding video encoding parameters of the participating terminal;
a video encoding parameter adjustment module, used to adjust the video encoding parameters of the participating terminal to the target video encoding parameters;
a video stream data sending module, used to obtain video stream data with the target video encoding parameters and send the video stream data to the recording and broadcasting terminal, so that the recording and broadcasting terminal performs screen-combination processing on the video stream data sent by the multiple participating terminals of the target video conference to obtain target video stream data to be output; and
a video stream data playing module, used to receive the target video stream data sent by the recording and broadcasting terminal, decode the target video stream data, and play the decoded video stream data.
In another aspect, this application also proposes a video conference system, the system including a recording and broadcasting terminal and multiple participating terminals, wherein:
the recording and broadcasting terminal includes a first communication interface, a first memory, and a first processor, wherein:
the first memory is used to store a first program implementing the above video conference processing method executed on the recording and broadcasting terminal side;
the first processor is used to load and execute the first program stored in the first memory, implementing the video conference processing method executed on the recording and broadcasting terminal side;
each participating terminal includes a display, an audio player, an audio collector, an image collector, a second communication interface, a second memory, and a second processor, wherein:
the second memory is used to store a second program implementing the above video conference processing method executed on the participating terminal side;
the second processor is used to load and execute the second program stored in the second memory, implementing the video conference processing method executed on the participating terminal side.
In another aspect, this application also proposes a computer-readable storage medium on which a computer program is stored; the computer program, when loaded and executed by a processor, implements the above video conference processing method.
Beneficial Effects
It can thus be seen that this application provides a video conference processing method, a processing device, a conference system, and a storage medium. Where the current environment is not configured with a dedicated multi-point control device for video conferencing, this application has a recording and broadcasting terminal stand in for that multi-point control device to support the video conference, meeting the application requirements of multi-point video conferencing. When the recording and broadcasting terminal detects that the number of participating terminals of the target video conference reaches a participation threshold, it sends target video encoding parameters to each participating terminal, so that each participating terminal reduces its own video encoding parameters to the target video encoding parameters, thereby lowering the video bit rate at which the participating terminals transmit video stream data to the recording and broadcasting terminal, reducing bandwidth occupation, and lowering the risk of packet loss. Moreover, after the recording and broadcasting terminal receives the video stream data of the participating terminals, it can perform screen-combination processing without scaling, which improves processing efficiency.
Description of the Drawings
In order to explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.
FIG. 1 is a schematic structural diagram of a multi-point video conference system;
FIG. 2 is a schematic structural diagram of an optional example of a video conference system to which the video conference processing method proposed in this application is applicable;
FIG. 3 is a schematic hardware structural diagram of another optional example of a video conference system to which the video conference processing method proposed in this application is applicable;
FIG. 4 is a schematic flow diagram of an optional example of the video conference processing method implemented on the recording and broadcasting terminal side as proposed in this application;
FIG. 5 is a schematic flow diagram of another optional example of the video conference processing method implemented on the recording and broadcasting terminal side as proposed in this application;
FIG. 6 is a schematic flow diagram of an optional example of the video conference processing method implemented on the participating terminal side as proposed in this application;
FIG. 7 is a schematic structural diagram of an optional example of the video conference processing device proposed in this application;
FIG. 8 is a schematic structural diagram of another optional example of the video conference processing device proposed in this application;
FIG. 9 is a schematic structural diagram of yet another optional example of the video conference processing device proposed in this application.
Best Mode for Carrying Out the Invention
In connection with the technical solutions described in the Background section, to meet users' application requirements for high stability and automatic recovery in video conferencing, it has been proposed to merge a multi-point conference system with a recording and broadcasting system into a multi-point video conference system. As shown in FIG. 1, recording and broadcasting terminals can serve as the participating terminals of the multi-point video conference system, and each participating terminal can access the Multipoint Control Unit (MCU) device of the conference, meeting the video communication needs among multiple participating terminals.
However, in some business scenarios, the multi-point video conference system may not be configured with a standalone MCU device (that is, a multi-point control device). To ensure normal system operation, it is proposed to use a recording and broadcasting terminal with a built-in MCU as a temporary MCU device, implementing call access for the participating terminals of the conference and the processing and transmission of their audio and video streams, so that even without a dedicated MCU device, the whole system can still hold video conferences in this one-to-two or one-to-three grouping manner.
On this basis, during conference setup, the recording and broadcasting terminal to be used as the temporary MCU device for this conference is determined. After its MCU function is started and it switches to MCU working mode (that is, the multi-point control working mode), it can receive video stream data at the default video resolution (such as 1920*1080 or 1280*720) sent by the participating terminals. With the recording and broadcasting terminal in MCU working mode receiving 2, 3, or even more channels of such video stream data, and with business scenarios demanding high real-time performance, the bit rate may reach 6~8 Mbps (megabits per second, a transmission rate unit referring to the number of bits transmitted per second), occupying a large amount of bandwidth, increasing the risk of packet loss during video stream data transmission, and reducing the data transmission reliability of the multi-point video conference.
Moreover, after the recording and broadcasting terminal acting as the temporary MCU device receives the video stream data sent by each participating terminal, it also needs to decode, scale, and composite the video stream data into the same picture. This processing occupies a large share of the recording and broadcasting terminal's CPU resources, which affects its working performance and reduces data processing efficiency.
To further address the above problems, this application proposes that, after the conference is successfully set up, the target video resolution (that is, a kind of video encoding parameter) required by the recording and broadcasting terminal (that is, the terminal capable of switching to MCU working mode and serving as the temporary MCU device) is determined; it can be determined according to the recording and broadcasting terminal's network performance parameters, working performance parameters, the participating terminals' default video resolutions, and so on. The temporary MCU device can send the target video resolution to each participating terminal, so that each participating terminal can adjust the video stream resolution of the video stream data to be sent according to the target video resolution, lowering the bit rate of video stream data transmission during the conference, reducing bandwidth occupation, and lowering the risk of packet loss during video stream data transmission.
In still other embodiments, after this unified adjustment of video resolution, when the temporary MCU device processes the received video stream data, it can directly perform screen-combination processing on the decoded video stream data, saving the resources and time consumed by scaling and improving data processing efficiency.
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Where no conflict arises, the embodiments in this application and the features in the embodiments can be combined with each other; that is, all other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
In this application document, unless the context clearly indicates an exception, words such as "a", "an", "one", and/or "the" do not specifically refer to the singular and may also include the plural. Generally speaking, the terms "include" and "comprise" only indicate the inclusion of explicitly identified steps and elements; these steps and elements do not constitute an exclusive list, and a method or device may also include other steps or elements. An element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
In the description of the embodiments of this application, unless otherwise stated, "/" means "or"; for example, A/B can mean A or B. "And/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" can mean: A alone, both A and B, or B alone. In addition, in the description of the embodiments of this application, "multiple" means two or more than two. The terms "first" and "second" below are used for descriptive purposes only and shall not be understood as indicating or implying relative importance or implicitly indicating the number of the technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature.
In addition, flowcharts are used in this application to illustrate the operations performed by the system according to the embodiments of this application. It should be understood that the preceding or following operations are not necessarily performed precisely in order. Instead, the steps can be processed in reverse order or simultaneously; other operations can also be added to these processes, or one or more operations can be removed from them.
Referring to FIG. 2, which is a schematic structural diagram of an optional application scenario of a video conference system to which the video conference processing method proposed in this application is applicable, as shown in FIG. 2 the system may include a recording terminal 10 and multiple participating terminals 20, wherein:
With reference to the schematic hardware structural diagram of the system shown in FIG. 3, the recording terminal 10 can run in the multipoint control working mode and, serving as a temporary multipoint control device, supports multipoint video conferencing, thereby solving the technical problem that the current environment is not equipped with a dedicated multipoint control device for multipoint video conferences and ensuring that the video conference proceeds normally.
Accordingly, the recording terminal 10 may have a built-in multipoint control unit (MCU); when the MCU is started, the recording terminal 10 can enter the multipoint control working mode, the implementation of which is not detailed in this application. On this basis, in order to serve as a temporary multipoint control device supporting video conferencing, the recording terminal 10 may include, but is not limited to, a first communication interface 11, a first memory 12, and a first processor 13.
In the embodiments of this application, the first memory 12 may be used to store a first program implementing the video conference processing method performed on the recording terminal side as proposed in this application; the first processor 13 may be used to load and execute the first program stored in the first memory 12, so as to implement the video conference processing method described on the recording terminal side in the following embodiments, the implementation of which is not detailed here.
In some embodiments, the first communication interface 11, the first memory 12, and the first processor 13 may be deployed in the MCU built into the recording terminal 10, the deployment manner of which is not detailed in this application. Optionally, the first processor 13 may itself be the above MCU; in this case, the first communication interface 11, the first memory 12, and the first processor 13 may be deployed directly inside the housing of the recording terminal 10, without limitation on the implementation.
In combination with the above analysis, during a multipoint video conference the recording terminal 10 may be used as a temporary MCU device: it starts its built-in MCU function, enters the multipoint control working mode, and establishes media sessions with the participating terminals 20 of the video conference, thereby enabling video stream interaction between each participating terminal 20 and the recording terminal 10, as well as, through the recording terminal 10, among the participating terminals 20 themselves. The implementation may be determined with reference to the working principles of a multipoint control device in a video conference system; this application does not detail how the built-in MCU of the recording terminal 10 switches, distributes, and exchanges the audio, video, data, and signaling of the participating terminals 20.
The first communication interface 11 may include, but is not limited to, data interfaces of communication modules such as a WIFI module, a 4G/5G/6G (fourth/fifth/sixth generation mobile communication network) module, a GPRS module, and a GSM module, to enable data interaction with other terminals; as needed, it may also include interfaces such as a USB interface, serial/parallel ports, and various types of multimedia interfaces, to enable wired connection with corresponding interfaces of other terminals as well as data interaction among the internal components of the recording terminal. This application does not limit the types or number of first communication interfaces 11 included in the recording terminal 10, which may be determined as the case may be.
In the embodiments of this application, the first memory 12 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device or other non-volatile solid-state storage device. The first processor 13 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a field-programmable gate array (FPGA), or another programmable logic device.
From the perspective of data processing functions, the first processor 13 may include, but is not limited to, an audio processor, a video processor, a data processor, a control processor, a multiplexer, and the like, as determined by the processing requirements of the video conference processing method proposed in this application; this application does not limit the types or number of processors included in the first processor 13, which may be determined as the case may be.
It can be understood that in a multipoint video conference, the MCU built into the recording terminal, as the core of the system, can provide management and control functions for the multipoint video conference and usually contains a multipoint controller and a multipoint processor. In practical applications, a suitable conference control mode, such as chairperson control, speaker control, or voice-activated control, may be adopted to conduct the conference; the implementation is not detailed in this application. It should be noted that this application does not limit the product type of the recording terminal with a built-in MCU.
The participating terminal 20 may be an electronic device used by a user to participate in the video conference, which may include, but is not limited to, a smartphone, a tablet computer, a wearable device, a smartwatch, an augmented reality (AR) device, a virtual reality (VR) device, an in-vehicle device, a robot, a desktop computer, and the like. A user may select a suitable electronic device according to the scenario requirements, request access to the target video conference, and establish a multimedia session with the above recording terminal 10 (i.e., the temporary MCU device) for the target video conference, so as to exchange video stream data with it.
In combination with the above analysis, as shown in FIG. 3, the participating terminal 20 may include, but is not limited to, a display 21, an audio player 22, an audio collector 23, an image collector 24, a second communication interface 25, a second memory 26, and a second processor 27; its hardware structure may be determined according to the functional requirements of the participating terminal 20 and is not enumerated one by one in this application.
The display 21 may include a display panel, such as a touch display panel or a non-touch display panel; this application does not limit the content display principle or structure of the display 21. In the embodiments of this application, the display may present the target video stream data obtained by the MCU's screen compositing, i.e., present the conference interface of the video conference; this application does not limit the layout or content of the conference interface, which may be determined as the case may be.
The audio player 22 may include a loudspeaker or the like for outputting the audio signal in the target video stream data; the audio collector 23 may include a microphone or the like for collecting the speech audio signal of the user of the participating terminal 20 (i.e., a conference participant); the image collector 24 may include a camera or the like for collecting image information of the conference participant. The categories and working principles of these input/output devices of the participating terminal 20 are not detailed in this application and may be determined according to the application requirements of the video conference.
For the category of the second communication interface 25 of the participating terminal 20, reference may be made to the above description of the first communication interface 11. It can be understood that the first communication interface 11 and the second communication interface 25 may include at least one pair of matching communication interfaces for establishing a multimedia session connection between the two terminals and enabling multimedia data interaction between them.
The second memory 26 may be used to store a second program implementing the video conference processing method performed on the participating terminal side as proposed in this application; the second processor 27 may be used to load and execute the second program stored in the second memory 26, so as to implement the video conference processing method described on the participating terminal side in the following embodiments, the implementation of which is not detailed here. For the device types of the second memory 26 and the second processor 27, reference may be made to, but is not limited to, the above description of the device types of the first memory 12 and the first processor 13, which is not repeated in this embodiment.
In still other embodiments, according to application requirements, the participating terminal may further include other components such as a sensor module composed of various sensors, an antenna, and a power management module, which are not enumerated one by one here.
It should be understood that the video conference systems shown in FIG. 2 and FIG. 3 do not constitute a limitation on the video conference system proposed in the embodiments of this application. In practical applications, the video conference system may include more or fewer components than those shown in FIG. 2 or FIG. 3, or combine certain components, as determined by the scenario requirements, which is not limited here.
Referring to FIG. 4, which is a schematic flowchart of an optional example of the video conference processing method proposed in this application, the method is applicable to the above recording terminal; the embodiments of this application do not limit the product type of the recording terminal, which may be determined as the case may be. It should be noted that, to ensure that the recording terminal can replace a dedicated multipoint control device of the video conference, the recording terminal may have a built-in MCU, and during the video conference it may be used as a temporary multipoint control device. On this basis, as shown in FIG. 4, the video conference processing method proposed in this embodiment may include:
Step S11: obtaining the number of participating terminals of a target video conference;
In the embodiments of this application, the target video conference may be a multipoint conference constructed for any service scenario, in which the above recording terminal with the built-in MCU serves as the temporary MCU device, or in other words as the session server, meeting the management and control requirements of the video conference. Therefore, a participating terminal of the target video conference may refer to a terminal that has established a media session connection with the recording terminal with the built-in MCU, which may include, but is not limited to, the electronic devices listed above.
It should be noted that this application does not detail how the recording terminal with the built-in MCU constructs the media session connections with the participating terminals of the target video conference. The recording terminal with the built-in MCU may start its MCU, enter the multipoint control working mode, and then actively call several terminals to join the target video conference; alternatively, a terminal may actively send a conference access request for the target video conference to the recording terminal to request participation. The implementation may be based on, but is not limited to, SIP (Session Initiation Protocol).
In video conference applications, in combination with the above analysis of the technical solution of this application, in order to lower the bit rate during the conference, reduce bandwidth occupation, and thereby lower the risk of packet loss during video stream transmission, it is proposed to lower the bit rate by adjusting the video encoding parameters of the participating terminals once their number reaches a certain value; therefore, this application may detect the number of participating terminals that have joined the target video conference.
Step S12: upon detecting that the number of participating terminals reaches a participation threshold, sending a target video encoding parameter to the participating terminals;
In any video conference, the recording terminal with the built-in MCU used as the MCU device can support only a limited number of participating terminals, e.g., at most four participating terminals in the video conference. As the number of participating terminals connected to the recording terminal grows, the bit rate gradually increases and the occupied bandwidth grows, which increases the risk of packet loss and occupies more CPU resources of the recording terminal, affecting its working performance. In view of this, this application proposes that when the number of connected participating terminals reaches a certain value, namely the participation threshold, the participating terminals may be notified to adjust their original video encoding parameters to a unified target video encoding parameter, so as to lower the bit rate, simplify the video processing steps, reduce CPU resource usage, and improve processing efficiency.
It can be seen that the above participation threshold may be the minimum number of participating terminals connected to the recording terminal that triggers the recording terminal to send the target video encoding parameter to the participating terminals so that they perform video encoding with that parameter. It can be understood that the participation threshold is smaller than the maximum number of participating terminals the recording terminal can support; this application does not limit its specific value.
On this basis, when the recording terminal recognizes that the number of participating terminals of the target video conference has reached the participation threshold (e.g., 2), it may trigger the mechanism proposed in this application for adjusting the video encoding parameters of the participating terminals. To lower the bit rate, the participating terminals' own video encoding parameters need to be reduced; this application does not limit the reduction amount, which may be determined as the case may be.
Following the above analysis, in order to reduce the processing steps after the recording terminal receives the video stream data from the participating terminals and improve processing efficiency, it is proposed that the recording terminal send to each participating terminal the target video encoding parameter required by the built-in MCU, i.e., the target value to which each participating terminal needs to adjust its own video encoding parameter; this application does not limit the type or value of the target video encoding parameter.
It can be understood that, to lower the bit rate, the target video encoding parameter is smaller than the participating terminal's own video encoding parameter (e.g., the video stream encoding parameter used by default during video capture). In some embodiments, the video encoding parameter in this application may include, but is not limited to, the video resolution. The transmission of the target video encoding parameter may be implemented according to the communication mode between the recording terminal and the participating terminals, which is not detailed in the embodiments of this application.
If, according to the above detection, the number of participating terminals connected to the recording terminal has not reached the participation threshold, the participating terminals' video encoding parameters need not be adjusted. In this case, after a participating terminal obtains video stream data with its original video encoding parameters, it can send the data directly to the recording terminal; that is, the recording terminal need not send the target video encoding parameter to the participating terminals, and after receiving video stream data from any participating terminal it can forward the data to the other participating terminals for output, meeting the communication needs among the participating terminals of the video conference; the implementation is not detailed in this application.
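The admission-and-threshold behavior of steps S11–S12 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the class and constant names (`ConferenceMCU`, `PARTICIPATION_THRESHOLD`) and the concrete 960*540 target are chosen for the example.

```python
# Minimal sketch of the threshold-triggered parameter broadcast (steps S11-S12).
# All names and concrete values are illustrative assumptions.

PARTICIPATION_THRESHOLD = 2          # minimum terminal count that triggers adjustment
MAX_TERMINALS = 4                    # the built-in MCU supports at most four terminals

class ConferenceMCU:
    def __init__(self, target_params):
        self.terminals = []          # terminals with an established media session
        self.target_params = target_params
        self.params_sent = False
        self.sent = {}               # terminal -> parameters delivered to it

    def admit(self, terminal_id):
        if len(self.terminals) >= MAX_TERMINALS:
            return False             # refuse access beyond capacity
        self.terminals.append(terminal_id)
        # Once the threshold is reached, notify every terminal exactly once.
        if len(self.terminals) >= PARTICIPATION_THRESHOLD and not self.params_sent:
            self.broadcast(self.target_params)
            self.params_sent = True
        return True

    def broadcast(self, params):
        self.sent = {t: params for t in self.terminals}

mcu = ConferenceMCU({"resolution": (960, 540)})
assert mcu.admit("t1") and not mcu.params_sent   # below threshold: no adjustment yet
assert mcu.admit("t2") and mcu.params_sent       # threshold reached: params broadcast
```

Below the threshold the streams are simply forwarded unmodified, which matches the fall-through case described above.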
Step S13: receiving video stream data with the target video encoding parameter sent by the participating terminals;
After receiving the target video encoding parameter sent by the recording terminal with the built-in MCU, each participating terminal of the target video conference can adjust its own video encoding parameter accordingly and subsequently capture video stream data (e.g., record video) with the adjusted target video encoding parameter, obtaining video stream data with the target video encoding parameter and sending it to the recording terminal with the built-in MCU; this application does not detail the acquisition and transmission timing of the video stream data.
Step S14: performing screen compositing on the video stream data of the multiple participating terminals, and sending the resulting target video stream data to each participating terminal.
As described above, after each participating terminal reduces its video encoding parameter and records video, the data volume of the resulting video stream data (i.e., the data obtained by encoding the directly captured video with the adjusted video encoding parameter) is smaller than that of the video stream data obtained with the pre-adjustment video encoding parameter; that is, the size of the files the participating terminals transmit is reduced, which reduces the bandwidth occupied by transmitting the video stream data and thereby lowers the risk of packet loss.
After receiving the video stream data sent by each participating terminal, the recording terminal with the built-in MCU needs to composite these multiple video streams, i.e., merge the multiple videos onto the same screen for output, in order to present each participant's conference window within the same video conference interface. The resulting target video stream data is fed back to the participating terminals for output, so that the video conference interface output on each participating terminal's screen can present conference sub-windows corresponding to the respective participating terminals, each sub-window showing the video captured by the corresponding participating terminal. This application does not limit how the MCU built into the recording terminal implements step S14.
Since the video stream data sent by the participating terminals share the same video encoding parameter, the recording terminal with the built-in MCU can composite the multiple streams directly, without any prior encoding adjustment, which improves processing efficiency.
Referring to FIG. 5, which is a schematic flowchart of yet another optional example of the video conference processing method proposed in this application, this embodiment may be an optional refined implementation of the video conference processing method described above, though it is not limited to that refinement, and the method is still executed by the recording terminal with the built-in MCU. As shown in FIG. 5, the method may include:
Step S21: in response to a multipoint control function trigger request for the target video conference, controlling the recording terminal to enter the multipoint control working mode;
In multipoint conference applications, this application may choose the SIP protocol as the video interaction protocol between the participating terminals of the target video conference and the recording terminal with the built-in MCU. The recording terminal may act in the teacher role: after the built-in MCU function of the recording terminal is enabled, the interaction protocol of the recording client (i.e., the recording application) of the recording terminal is configured as SIP, and the IP address of the built-in MCU is entered to place calls and establish media sessions with the participating terminals.
On this basis, relevant personnel may open the configuration page of the recording terminal with the built-in MCU and trigger the MCU function option, i.e., trigger the start of the recording terminal's multipoint control function, or trigger the multipoint control function through a shortcut for that function, thereby generating a multipoint control function trigger request for the target video conference, so that after detecting the request the recording terminal responds to it by starting its built-in MCU function; the trigger-and-start implementation is not limited to that described in this embodiment.
Step S22: receiving a conference access request for the target video conference sent by a participating terminal, and establishing a media session connection between the recording terminal and the participating terminal;
In some embodiments, the recording terminal with the built-in MCU may actively call each participating terminal to join the target video conference; in still other embodiments, as described in step S22, any terminal wishing to join the target video conference may actively send a conference access request to the recording terminal with the built-in MCU, and the request may be generated according to the Session Initiation Protocol, such as a SIP INVITE request; this application does not limit the content or format of the conference access request. It can be understood that the conference access request usually carries the conference identification number of the target video conference and the like.
After the recording terminal with the built-in MCU receives a conference access request sent by a terminal and determines that the terminal is allowed to access the target video conference, it may return a response message to the conference access request to inform the participating terminal that the recording terminal has received the request. After receiving the response message, the participating terminal may further return an acknowledgement confirming receipt of that response, such as an ACK (Acknowledgement) message, thereby establishing a media session between the terminal (which may now be called a participating terminal) and the recording terminal with the built-in MCU; the implementation is not limited.
It can be understood that any terminal wishing to participate in the target video conference can join the target video conference according to the method described above and construct a media session connection with the recording terminal with the built-in MCU of the target video conference, becoming a participating terminal of the conference; this is not detailed one by one in this application.
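The three-way exchange just described (INVITE, a response, then ACK) can be sketched as below. This is a toy model, not a real SIP stack: the dicts stand in for SIP messages, and the message names follow the usual SIP conventions assumed from the text.

```python
# Sketch of the session setup described above (INVITE -> 200 OK -> ACK).
# A real implementation would use an actual SIP stack; the dict-based
# "messages" and function names here are illustrative assumptions.

def establish_media_session(mcu_sessions, terminal_id, conference_id, hosted_id):
    """Walk one terminal through INVITE / 200 OK / ACK with the built-in MCU."""
    invite = {"method": "INVITE", "from": terminal_id, "conference": conference_id}
    # The MCU admits the terminal only for the conference it is hosting.
    if invite["conference"] != hosted_id:
        return {"status": "rejected"}
    response = {"status": "200 OK", "to": terminal_id}   # MCU's answer to the INVITE
    ack = {"method": "ACK", "from": terminal_id}         # terminal confirms receipt
    mcu_sessions[terminal_id] = "established"            # media session is now up
    return {"status": response["status"], "ack": ack["method"]}

sessions = {}
result = establish_media_session(sessions, "term-1", "conf-42", "conf-42")
assert result == {"status": "200 OK", "ack": "ACK"}
assert sessions == {"term-1": "established"}
```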
Step S23: obtaining the number of participating terminals of the target video conference;
Step S24: upon detecting that the number of participating terminals reaches the participation threshold, obtaining the target video resolution configured for the multipoint control performance of the recording terminal;
According to the method described above, as the number of participating terminals joining the target video conference increases, the bandwidth occupied by the video stream data exchanged between the participating terminals and the recording terminal with the built-in MCU gradually increases, while network resources are limited; this affects data transmission performance and may even cause packet loss when part of the video stream data fails to be transmitted, which in turn makes the video content output in the video conference interface stutter.
In this regard, in the embodiments of this application, when the built-in MCU of the recording terminal recognizes that the number of participating terminals of the target video conference has reached the participation threshold, it seeks to lower the bit rate (i.e., the number of data bits transmitted per unit time) by lowering the video resolution of each participating terminal's video stream data, thereby reducing the size of the video files (i.e., the files containing the video stream data) the participating terminals transmit to the recording terminal.
Therefore, the recording terminal may determine the target video resolution, i.e., the target video resolution required by the built-in MCU, such as 960*540, according to its multipoint control performance (e.g., network performance and available CPU resources). This is usually smaller than the participating terminals' default video resolutions, such as 1920*1080 or 1280*720; however, to avoid excessive distortion of the resulting video image caused by too low a sampling rate, the target video resolution may be determined based on the participating terminals' default video resolutions. This application does not limit the value of the target video resolution, which may be determined as the case may be.
In practical applications, for the one-to-two and one-to-three service scenarios mentioned above that the recording terminal with the built-in MCU can support, at most four participating terminals can join the target video conference. In this case, the corresponding number of conference sub-windows contained in the video conference interface output by each participating terminal may be laid out side by side, in a pyramid layout, or in a four-grid layout; in any of these layouts, the width and height of a conference sub-window are each 1/2 of those of the entire video conference interface. Therefore, the target video resolution obtained in this application may be 1/2 of the participating terminal's default video resolution, though it is not limited to this.
It can thus be seen that, to obtain the target video resolution, the video interface layout format of the target video conference may be determined according to the number of participating terminals, and then the target video resolution for the built-in MCU may be determined according to the video interface layout format and the video resolutions configured on the participating terminals; however, the acquisition is not limited to this method of obtaining the target video resolution proposed in this application.
In still other embodiments, the target video resolution for the built-in MCU may be obtained by operating a corresponding video resolution adjustment button of the recording terminal (e.g., a physical button or a virtual function button) and responding to that video resolution adjustment; the target video resolution may also be determined by voice or other input means, which are not exemplified one by one here.
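The layout-based derivation of step S24 can be expressed as a short computation: pick a layout from the participant count, then halve the default resolution in both dimensions, since each layout above gives a sub-window 1/2 of the interface's width and height. The function and layout names are assumptions for the sketch, not terms from the patent.

```python
# Sketch of the target-resolution derivation (step S24): layout from the
# participant count, then half of the default resolution in each dimension.
# Names and the layout labels are illustrative assumptions.

def target_resolution(num_terminals, default_resolution=(1920, 1080)):
    if not 2 <= num_terminals <= 4:
        raise ValueError("temporary MCU supports 2-4 participating terminals")
    layout = {2: "side-by-side", 3: "pyramid", 4: "four-grid"}[num_terminals]
    w, h = default_resolution
    # Every layout here gives each sub-window 1/2 width and 1/2 height.
    return layout, (w // 2, h // 2)

assert target_resolution(4) == ("four-grid", (960, 540))
assert target_resolution(2, (1280, 720)) == ("side-by-side", (640, 360))
```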
Step S25: sending, according to the Session Initiation Protocol, an encoding adjustment request carrying the target video resolution to the participating terminals;
After the target video resolution is determined, an encoding adjustment request containing the target video resolution, such as a SIP INFO request, may be generated and sent to each participating terminal that has joined the target video conference, so that after receiving the encoding adjustment request each participating terminal can return a corresponding response message, such as a 200 OK reply, and at the same time respond to the encoding adjustment request by adjusting its default video resolution to the target video resolution, obtaining video stream data with the target video resolution. How a participating terminal adjusts its own video resolution encoding is not detailed in this application and may be determined according to the configuration of the codec's encoding and decoding parameters.
It can be understood that after a participating terminal modifies its encoding parameter according to the target video resolution, i.e., lowers its own video resolution, e.g., from 1920*1080 to 960*540, the bit rate at which it transmits video streams to the recording terminal with the built-in MCU is reduced.
In practical applications, experiments show that with the original video stream transmission method the built-in MCU might need to receive a video bit rate of 6 Mbps, causing massive packet loss and ultimately screen artifacts and stuttering in the video output by the participating terminals, degrading the user experience. With the resolution-lowering approach proposed in this application, the video bit rate can be reduced by a factor of 4 (in the case of four participating terminals), so that only a 1–1.5 Mbps video bit rate needs to be received; at the same network quality, this reduces bandwidth occupation and improves the video image output quality.
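The 4x figure quoted above follows from simple arithmetic: halving the resolution in each dimension divides the pixel count, and hence roughly the bit rate at a fixed quality, by four. The linear pixels-to-bits model below is a simplifying assumption used only to check the numbers.

```python
# Back-of-the-envelope check of the bit-rate reduction claimed above.
# Assumes bit rate scales linearly with pixel count at fixed quality.

def scaled_bitrate(bitrate_mbps, old_res, new_res):
    old_pixels = old_res[0] * old_res[1]
    new_pixels = new_res[0] * new_res[1]
    return bitrate_mbps * new_pixels / old_pixels

# 6 Mbps at 1920*1080 becomes 1.5 Mbps at 960*540 (a 4x reduction).
assert scaled_bitrate(6.0, (1920, 1080), (960, 540)) == 1.5
```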
Step S26: receiving video stream data with the target video resolution sent by the participating terminals;
Step S27: decoding the video stream data with the target video resolution sent by each of the multiple participating terminals;
Step S28: merging the decoded video stream data of the same frame corresponding to the multiple participating terminals to obtain the corresponding frame of video stream data with the target video resolution;
Step S29: encoding the resulting multiple frames of video stream data to obtain the target video stream data to be output;
Step S210: sending the target video stream data to the multiple participating terminals.
In compositing the video stream data sent by the multiple participating terminals, the video stream data from each participating terminal is first decoded and then composited in YUV (a color encoding method), i.e., multiple YUV images are merged into one YUV image, which may be implemented with reference to YUV image merging techniques, though it is not limited to this YUV compositing method.
During the compositing process, as described in the steps above, the video stream data of the same frame from different participating terminals (i.e., video image data of the same frame) may be merged, according to the preset layout of the video conference interface, into the corresponding frame of video image with the target video resolution; merging frame by frame in this manner yields the merged video stream data, which at this point contains the video stream content sent by the multiple participating terminals.
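The per-frame merging of steps S27–S28 can be illustrated on a single luma (Y) plane. This is a sketch under assumptions: real YUV compositing also merges the chroma planes, frames are shown as plain lists of rows rather than decoded buffers, and the 2x2 grid is one of the layouts mentioned above.

```python
# Composite four equally sized single-plane frames (e.g., the Y plane of
# YUV) into one 2x2 grid, row by row. A real implementation would operate
# on all three YUV planes of each decoded frame.

def composite_2x2(top_left, top_right, bottom_left, bottom_right):
    top = [l + r for l, r in zip(top_left, top_right)]          # upper half rows
    bottom = [l + r for l, r in zip(bottom_left, bottom_right)]  # lower half rows
    return top + bottom

# Four tiny 2x2 "frames", each filled with a constant sample value per terminal.
frames = [[[v, v], [v, v]] for v in (1, 2, 3, 4)]
merged = composite_2x2(*frames)
assert merged == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```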
According to the video stream transmission protocol requirements between different devices, the recording terminal with the built-in MCU needs to first encode the merged video stream data and then send the encoded video stream data to the participating terminals, so that each participating terminal decodes it in the corresponding manner and outputs it; this application does not limit the codec implementation for the video stream data.
In summary, in the embodiments of this application, when the recording terminal with the built-in MCU recognizes that the number of participating terminals of the target video conference has reached the participation threshold, it sends the target video resolution required by the built-in MCU to each participating terminal so that they lower their original video resolutions to the target video resolution. In this way, each participating terminal captures video accordingly, and when the resulting video stream data is transmitted to the recording terminal with the built-in MCU the bit rate is lowered and the occupied bandwidth is reduced, achieving the effect of lowering the risk of packet loss.
Moreover, after the recording terminal with the built-in MCU obtains the video stream data sent by the multiple participating terminals, since the video resolutions are the same no scaling is needed, and the streams can be composited directly to obtain the target video stream data to be output, improving processing efficiency.
Referring to FIG. 6, which is a schematic flowchart of yet another optional example of the video conference processing method proposed in this application, the method of this embodiment is executed by any participating terminal, which may cooperate with the recording terminal with the built-in MCU to implement the video conference processing method proposed in this application. For the method steps executed by the recording terminal, reference may be made to the corresponding descriptions in the above embodiments; this embodiment describes the implementation of the video conference processing method on the participating terminal side. As shown in FIG. 6, the method may include:
Step S31: establishing a media session connection with the recording terminal for the target video conference;
In combination with the corresponding descriptions in the above embodiments, after the IP address of the built-in MCU of the recording terminal of the target video conference is determined, a terminal wishing to participate in the target video conference may initiate a SIP INVITE request to establish a media session with the recording terminal, and after receiving the 200 OK response message returned by the recording terminal, it may send an ACK message to the recording terminal; the implementation may refer to the corresponding descriptions above and is not repeated in this embodiment.
Any participating terminal that has established a media session with the recording terminal with the built-in MCU can exchange video stream data with the built-in MCU of the recording terminal. When only one participating terminal is connected, the video resolution of the video stream data may be the default video resolution, such as 1080P, though it is not limited to this.
When other terminals wish to join the target video conference, they may establish media sessions with the recording terminal with the built-in MCU in the manner described above, which is not repeated in this application.
Step S32: receiving the target video encoding parameter sent by the recording terminal;
As can be seen from the video conference processing method described above on the side of the recording terminal with the built-in MCU, when the built-in MCU of the recording terminal recognizes that the number of participating terminals has reached the participation threshold, it sends the determined target video encoding parameter to each participating terminal; the target video encoding parameter is smaller than the participating terminal's corresponding video encoding parameter, which achieves the technical effect of lowering the bit rate. Optionally, the target video encoding parameter may include, but is not limited to, the target video resolution.
Step S33: adjusting the participating terminal's video encoding parameter to the target video encoding parameter;
Step S34: obtaining video stream data with the target video encoding parameter, and sending the video stream data to the recording terminal;
After adjusting its own video encoding parameter, the participating terminal captures and encodes video according to the target video encoding parameter, lowering the video bit rate transmitted to the recording terminal with the built-in MCU and reducing bandwidth occupation, thereby lowering the risk of packet loss. For the compositing process after the recording terminal receives the video stream data sent by the multiple participating terminals of the target video conference, reference may be made to the corresponding descriptions in the above embodiments, which are not repeated in this embodiment.
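The participating terminal's side of steps S32–S34 can be sketched as follows. This is an illustration under assumptions: the dict-based "messages" stand in for real SIP INFO signaling, and the class and method names are invented for the example rather than taken from the patent.

```python
# Sketch of the participating terminal's handling of an encoding adjustment
# (steps S32-S34): reconfigure the encoder resolution, answer 200 OK, and
# tag subsequently captured stream data with the new parameters.

class ParticipantTerminal:
    def __init__(self, default_resolution=(1920, 1080)):
        self.resolution = default_resolution

    def on_encoding_adjustment(self, request):
        # Apply the MCU-required target resolution, then acknowledge.
        self.resolution = request["target_resolution"]
        return {"status": "200 OK"}

    def capture_frame_header(self):
        # Stand-in for captured stream data carrying the current parameters.
        return {"resolution": self.resolution}

term = ParticipantTerminal()
reply = term.on_encoding_adjustment({"target_resolution": (960, 540)})
assert reply == {"status": "200 OK"}
assert term.capture_frame_header() == {"resolution": (960, 540)}
```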
Step S35: receiving the target video stream data sent by the recording terminal, decoding the target video stream data, and playing the decoded video stream data.
In summary, in a video conference, when the number of participating terminals connected to the built-in MCU of the recording terminal reaches the participation threshold, e.g., two participating terminals, a participating terminal can adjust its own video encoding parameter according to the target video encoding parameter required by the built-in MCU and sent by the recording terminal, so as to lower the video bit rate of its video stream transmission and reduce bandwidth occupation. Moreover, since each participating terminal sends video stream data with a unified video encoding parameter to the built-in MCU of the recording terminal, the built-in MCU's scaling step is saved and processing efficiency is improved.
Referring to FIG. 7, which is a schematic structural diagram of an optional example of the video conference processing apparatus proposed in this application, the apparatus may be described from the side of the recording terminal with the built-in MCU. As shown in FIG. 7, the apparatus may include:
a participating terminal number obtaining module 31, configured to obtain the number of participating terminals of a target video conference, wherein a participating terminal refers to a terminal that has established a media session connection with a recording terminal;
a target video encoding parameter sending module 32, configured to send a target video encoding parameter to the participating terminals upon detecting that the number of participating terminals reaches a participation threshold, the target video encoding parameter being smaller than the participating terminals' corresponding video encoding parameters;
a video stream data receiving module 33, configured to receive video stream data with the target video encoding parameter sent by the participating terminals;
a video stream compositing module 34, configured to composite the video stream data of the multiple participating terminals and send the resulting target video stream data to the participating terminals.
Optionally, as shown in FIG. 8, the target video encoding parameter sending module 32 may include:
a target video resolution obtaining unit 321, configured to obtain a target video resolution configured for the multipoint control performance of the recording terminal, the target video resolution being smaller than the video resolution configured on the participating terminals;
an encoding adjustment request sending unit 322, configured to send, according to the Session Initiation Protocol, an encoding adjustment request carrying the target video resolution to the participating terminals, so that the participating terminals respond to the encoding adjustment request by adjusting the default video resolution to the target video resolution and obtain video stream data with the target video resolution.
In a possible implementation, the target video resolution obtaining unit 321 may include:
a video interface layout format determining unit, configured to determine the video interface layout format of the target video conference according to the number of participating terminals;
a target video resolution determining unit, configured to determine the target video resolution configured for the multipoint control performance of the recording terminal according to the video interface layout format and the video resolutions configured on the participating terminals.
In still other embodiments, as shown in FIG. 8, the video stream compositing module 34 may include:
a decoding unit 341, configured to decode the video stream data with the target video encoding parameter sent by each of the multiple participating terminals;
a merging unit 342, configured to merge the decoded video stream data of the same frame corresponding to the multiple participating terminals to obtain the corresponding frame of video stream data with the target video encoding parameter;
an encoding unit 343, configured to encode the resulting multiple frames of video stream data to obtain the target video stream data to be output;
a target video stream data sending unit 344, configured to send the target video stream data to the multiple participating terminals.
Based on the descriptions of the above embodiments, the apparatus may further include:
a media session establishing module, configured to establish a media session connection between a participating terminal and the recording terminal;
Optionally, the media session establishing module may include:
a multipoint control working mode starting unit, configured to control the recording terminal to enter the multipoint control working mode in response to a multipoint control function trigger request for the target video conference;
a conference access unit, configured to receive a conference access request for the target video conference sent by a participating terminal and establish a media session connection between the recording terminal and the participating terminal, the conference access request being generated according to the Session Initiation Protocol.
Referring to FIG. 9, which is a schematic structural diagram of yet another optional example of the video conference processing apparatus proposed in this application, the apparatus may be described from the participating terminal side. As shown in FIG. 9, the apparatus may include:
a media session constructing module 41, configured to establish a media session connection with a recording terminal for a target video conference;
a target video encoding parameter receiving module 42, configured to receive a target video encoding parameter sent by the recording terminal, the target video encoding parameter being smaller than the participating terminal's corresponding video encoding parameter;
a video encoding parameter adjusting module 43, configured to adjust the participating terminal's video encoding parameter to the target video encoding parameter;
a video stream data sending module 44, configured to obtain video stream data with the target video encoding parameter and send the video stream data to the recording terminal, so that the recording terminal composites the video stream data sent by the multiple participating terminals of the target video conference to obtain target video stream data to be output;
a video stream data playing module 45, configured to receive the target video stream data sent by the recording terminal, decode the target video stream data, and play the decoded video stream data.
It should be noted that the various modules and units in the above apparatus embodiments may all be stored as program modules in the memory of the terminal on the corresponding side, and the processor of that terminal executes the program modules stored in the memory to implement the corresponding functions. For the functions implemented by the program modules and their combinations, as well as the technical effects achieved, reference may be made to the corresponding descriptions of the above method embodiments, which are not repeated in this embodiment.
This application further provides a computer-readable storage medium on which a computer program may be stored; the computer program may be invoked and loaded by a processor to implement the steps of the video conference processing method described in the above embodiments.
Finally, it should be noted that the embodiments in this specification are described in a progressive or parallel manner, each embodiment focusing on its differences from the other embodiments; for identical or similar parts, the embodiments may refer to one another. As for the apparatuses, systems, and terminals disclosed in the embodiments, since they correspond to the methods disclosed in the embodiments, their descriptions are relatively brief, and reference may be made to the method descriptions where relevant.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

  1. A video conference processing method, characterized in that the method comprises:
    obtaining the number of participating terminals of a target video conference, wherein a participating terminal refers to a terminal that has established a media session connection with a recording terminal;
    upon detecting that the number of participating terminals reaches a participation threshold, sending a target video encoding parameter to the participating terminals so that the participating terminals adjust their original video encoding parameters to the unified target video encoding parameter, the target video encoding parameter being smaller than the participating terminals' corresponding video encoding parameters; receiving video stream data with the target video encoding parameter sent by the participating terminals;
    performing screen compositing on the video stream data of the multiple participating terminals, and sending the resulting target video stream data to each of the participating terminals.
  2. The method according to claim 1, characterized in that sending the target video encoding parameter to the participating terminals comprises:
    obtaining a target video resolution configured for the multipoint control performance of the recording terminal, the target video resolution being smaller than the video resolution configured on the participating terminals;
    sending, according to the Session Initiation Protocol, an encoding adjustment request carrying the target video resolution to the participating terminals, so that the participating terminals respond to the encoding adjustment request by adjusting the default video resolution to the target video resolution and obtain video stream data with the target video resolution.
  3. The method according to claim 2, characterized in that obtaining the target video resolution configured for the multipoint control performance of the recording terminal comprises:
    determining a video interface layout format of the target video conference according to the number of participating terminals;
    determining the target video resolution configured for the multipoint control performance of the recording terminal according to the video interface layout format and the video resolution configured on the participating terminals.
  4. The method according to any one of claims 1 to 3, characterized in that performing screen compositing on the video stream data and sending the resulting target video stream data to the participating terminals for playback comprises:
    decoding the video stream data with the target video encoding parameter sent by each of the multiple participating terminals;
    merging the decoded video stream data of the same frame corresponding to the multiple participating terminals to obtain the corresponding frame of video stream data with the target video encoding parameter;
    encoding the resulting multiple frames of video stream data to obtain the target video stream data to be output;
    sending the target video stream data to the multiple participating terminals.
  5. The method according to claim 4, characterized in that establishing the media session between the participating terminal and the recording terminal comprises:
    in response to a multipoint control function trigger request for the target video conference, controlling the recording terminal to enter a multipoint control working mode;
    receiving a conference access request for the target video conference sent by the participating terminal, and establishing a media session connection between the recording terminal and the participating terminal, the conference access request being generated according to the Session Initiation Protocol.
  6. A video conference processing method, characterized in that the method comprises:
    establishing a media session connection with a recording terminal for a target video conference;
    receiving a target video encoding parameter sent by the recording terminal, the target video encoding parameter being smaller than the participating terminal's corresponding video encoding parameter;
    adjusting the participating terminal's video encoding parameter to the unified target video encoding parameter;
    obtaining video stream data with the target video encoding parameter, and sending the video stream data to the recording terminal, so that the recording terminal performs screen compositing on the video stream data sent by the multiple participating terminals of the target video conference to obtain target video stream data to be output;
    receiving the target video stream data sent by the recording terminal, decoding the target video stream data, and playing the decoded video stream data.
  7. A video conference processing apparatus, characterized in that the apparatus comprises:
    a participating terminal number obtaining module, configured to obtain the number of participating terminals of a target video conference, wherein a participating terminal refers to a terminal that has established a media session connection with a recording terminal;
    a target video encoding parameter sending module, configured to send a target video encoding parameter to the participating terminals upon detecting that the number of participating terminals reaches a participation threshold, so that the participating terminals adjust their original video encoding parameters to the unified target video encoding parameter, the target video encoding parameter being smaller than the participating terminals' corresponding video encoding parameters;
    a video stream data receiving module, configured to receive video stream data with the target video encoding parameter sent by the participating terminals;
    a video stream compositing module, configured to perform screen compositing on the video stream data of the multiple participating terminals and send the resulting target video stream data to the participating terminals.
  8. A video conference processing apparatus, characterized in that the apparatus comprises:
    a media session constructing module, configured to establish a media session connection with a recording terminal for a target video conference;
    a target video encoding parameter receiving module, configured to receive a target video encoding parameter sent by the recording terminal, the target video encoding parameter being smaller than the participating terminal's corresponding video encoding parameter;
    a video encoding parameter adjusting module, configured to adjust the participating terminal's video encoding parameter to the unified target video encoding parameter;
    a video stream data sending module, configured to obtain video stream data with the target video encoding parameter and send the video stream data to the recording terminal, so that the recording terminal performs screen compositing on the video stream data sent by the multiple participating terminals of the target video conference to obtain target video stream data to be output;
    a video stream data playing module, configured to receive the target video stream data sent by the recording terminal, decode the target video stream data, and play the decoded video stream data.
  9. A video conference system, characterized in that the system comprises a recording terminal and multiple participating terminals, wherein:
    the recording terminal comprises a first communication interface, a first memory, and a first processor, wherein:
    the first memory is configured to store a first program implementing the video conference processing method according to claim 1;
    the first processor is configured to load and execute the first program stored in the first memory to implement the video conference processing method according to claim 1;
    the participating terminal comprises a display, an audio player, an audio collector, an image collector, a second communication interface, a second memory, and a second processor, wherein:
    the second memory is configured to store a second program implementing the video conference processing method according to claim 6;
    the second processor is configured to load and execute the second program stored in the second memory to implement the video conference processing method according to claim 6.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when loaded and executed by a processor, implements the video conference processing method according to claim 1 or 6.
PCT/CN2022/109317 2021-10-29 2022-07-31 Video conference processing method, processing device, conference system, and storage medium WO2023071356A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111276971.5A CN113992883B (zh) 2021-10-29 2021-10-29 Video conference processing method, processing device, conference system, and storage medium
CN202111276971.5 2021-10-29

Publications (1)

Publication Number Publication Date
WO2023071356A1 true WO2023071356A1 (zh) 2023-05-04

Family

ID=79744859

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/109317 WO2023071356A1 (zh) 2021-10-29 2022-07-31 视频会议处理方法、处理设备、会议***以及存储介质

Country Status (2)

Country Link
CN (1) CN113992883B (zh)
WO (1) WO2023071356A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113992883B (zh) 2021-10-29 2022-07-29 安徽文香科技有限公司 Video conference processing method, processing device, conference system, and storage medium
CN117669783B (zh) * 2024-02-02 2024-04-05 深圳市汇丰智能系统有限公司 IoT-based conference scheduling and reservation system and method
CN117896552B (zh) * 2024-03-14 2024-07-12 浙江华创视讯科技有限公司 Video conference processing method, video conference system, and related apparatus

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090213206A1 (en) * 2008-02-21 2009-08-27 Microsoft Corporation Aggregation of Video Receiving Capabilities
US7627629B1 (en) * 2002-10-30 2009-12-01 Cisco Technology, Inc. Method and apparatus for multipoint conferencing
CN101594512A (zh) * 2009-06-30 2009-12-02 中兴通讯股份有限公司 Terminal, multipoint control unit, system, and method for realizing high-definition multi-picture
JP2011029868A (ja) * 2009-07-24 2011-02-10 Ricoh Co Ltd Terminal device, remote conference system, terminal device control method, control program, and computer-readable recording medium recording the control program
CN105635636A (zh) * 2015-12-30 2016-06-01 随锐科技股份有限公司 Video conference system and method for implementing video image transmission control therein
JP2016192610A (ja) * 2015-03-31 2016-11-10 ブラザー工業株式会社 Remote conference program, control device, and remote conference method
CN112511782A (zh) * 2019-09-16 2021-03-16 中兴通讯股份有限公司 Video conference method, first terminal, MCU, system, and storage medium
CN113194276A (zh) * 2021-03-12 2021-07-30 广州朗国电子科技有限公司 Method, system, and storage device for generating dynamic layouts in a video conference system
CN113992883A (zh) * 2021-10-29 2022-01-28 Video conference processing method, processing device, conference system, and storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI451746B (zh) * 2011-11-04 2014-09-01 Quanta Comp Inc Video conference system and video conference method
US9179100B2 (en) * 2012-10-30 2015-11-03 Polycom, Inc. Video conferencing method and device thereof
WO2015184415A1 (en) * 2014-05-30 2015-12-03 Highfive Technologies, Inc. Method and system for multiparty video conferencing
CN110602431A (zh) * 2019-08-15 2019-12-20 视联动力信息技术股份有限公司 Configuration parameter modification method and apparatus


Also Published As

Publication number Publication date
CN113992883A (zh) 2022-01-28
CN113992883B (zh) 2022-07-29

Similar Documents

Publication Publication Date Title
US9621854B2 (en) Recording a videoconference using separate video
US9485466B2 (en) Video processing in a multi-participant video conference
WO2023071356A1 (zh) 视频会议处理方法、处理设备、会议***以及存储介质
US9407867B2 (en) Distributed recording or streaming of a videoconference in multiple formats
US8780166B2 (en) Collaborative recording of a videoconference using a recording server
US9270941B1 (en) Smart video conferencing system
US9462227B2 (en) Automatic video layouts for multi-stream multi-site presence conferencing system
US9179100B2 (en) Video conferencing method and device thereof
US9035991B2 (en) Collaboration system and method
US9118808B2 (en) Dynamic allocation of encoders
EP2965508B1 (en) Video conference virtual endpoints
US9438857B2 (en) Video conferencing system and multi-way video conference switching method
US20230283888A1 (en) Processing method and electronic device
TWI636691B Intelligent video picture switching method and system therefor
CN116320268A Remote conference control method, apparatus, and device
US20140327729A1 (en) Systems and methods for using split endpoints in video communication systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22885256

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE