CN113473068A - Conference access method, device, server and storage medium - Google Patents

Conference access method, device, server and storage medium

Info

Publication number
CN113473068A
Authority
CN
China
Prior art keywords
audio
video
conference
terminal
video data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110795149.3A
Other languages
Chinese (zh)
Inventor
汪秀兵
闫振利
赵君
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiaovo Technology Co ltd
China United Network Communications Group Co Ltd
China Unicom Online Information Technology Co Ltd
Original Assignee
Xiaovo Technology Co ltd
China United Network Communications Group Co Ltd
China Unicom Online Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiaovo Technology Co ltd, China United Network Communications Group Co Ltd, China Unicom Online Information Technology Co Ltd filed Critical Xiaovo Technology Co ltd
Priority to CN202110795149.3A
Publication of CN113473068A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H04N 7/14 Systems for two-way working
    • H04N 7/15 Conference systems
    • H04N 7/155 Conference systems involving storage of or access to video conference sessions
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/233 Processing of audio elementary streams
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application provides a conference access method, apparatus, server, and storage medium. In the method, a server receives conference information uploaded by a terminal, where the conference information includes a target scene identifier and the audio and video data of a user in a video conference. The server determines the audio and video processing parameters of a target audio and video sample according to the target scene identifier, processes the audio and video data according to those parameters, and returns the processed audio and video data to the terminal, so that the terminal conducts the video conference based on the processed data. In other words, the embodiments of the application process the user's audio and video data before it enters the conference, which solves the security problem that arises when a user's raw audio and video data is fed directly into a video conference.

Description

Conference access method, device, server and storage medium
Technical Field
The present application relates to the field of communications technologies, and in particular, to a conference access method, an apparatus, a server, and a storage medium.
Background
With the development of communication technology, the video conference system has become a communication system capable of remotely transmitting sound, images, and file data, enabling real-time, interactive communication between two or more participants. Compared with a telephone conference system, a video conference system is more intuitive and carries far more information.
Currently, video conferencing is widely used. In some scenarios, however, such as working from home, users face security issues, such as privacy exposure, when accessing a video conference.
Yet as long as the user accesses the video conference, the data from the microphone and camera on the user's communication device is transmitted directly over the network into the conference, so the above security problem remains unsolved.
Disclosure of Invention
In order to solve the problems in the prior art, the application provides a conference access method, a conference access device, a server and a storage medium.
In a first aspect, an embodiment of the present application provides a conference access method, where the method is applied to a server, and the method includes the following steps:
receiving conference information uploaded by a terminal, wherein the conference information comprises a target scene identifier and audio and video data of users in a video conference;
determining audio and video processing parameters of a target audio and video sample according to the target scene identification;
processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample;
and returning the processed audio and video data to the terminal so that the terminal carries out the video conference according to the processed audio and video data.
In one possible implementation, the audio/video processing parameters include a noise reduction ratio and face data;
the processing of the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample comprises the following steps:
according to the noise reduction ratio, carrying out noise reduction processing on audio and video data of users in the video conference;
and according to the face data, carrying out image filtering on the audio and video data subjected to noise reduction processing.
In a possible implementation manner, the determining, according to the target scene identifier, an audio/video processing parameter of a target audio/video sample includes:
acquiring a corresponding relation between a pre-stored scene identification and audio and video processing parameters of an audio and video sample;
and determining audio and video processing parameters of the target audio and video sample corresponding to the target scene identification according to the corresponding relation.
In a possible implementation manner, before determining the audio/video processing parameter of the target audio/video sample according to the target scene identifier, the method further includes:
receiving preprocessing information uploaded by a terminal, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier;
decomposing the audio and video samples and determining audio and video processing parameters of the audio and video samples;
and storing the corresponding relation between the scene identification and the audio and video processing parameters of the audio and video samples.
In a second aspect, an embodiment of the present application provides another conference access method, where the method is applied to a terminal, and the method includes the following steps:
determining a target scene identifier;
uploading conference information to a server, wherein the conference information comprises the target scene identifier and audio and video data of users in the video conference; the conference information is used for instructing the server to determine audio and video processing parameters of a target audio and video sample according to the target scene identifier, process the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample, and return the processed audio and video data to the terminal;
and receiving the processed audio and video data sent by the server, and carrying out the video conference according to the processed audio and video data.
In one possible implementation, the method further includes:
uploading preprocessing information to the server, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier, and the preprocessing information is used for indicating the server to decompose the audio and video sample, determining audio and video processing parameters of the audio and video sample and storing the corresponding relation between the scene identifier and the audio and video processing parameters of the audio and video sample.
In a third aspect, an embodiment of the present application provides a conference access apparatus, where the apparatus is applied to a server, and the apparatus includes:
the first receiving module is used for receiving conference information uploaded by a terminal, wherein the conference information comprises a target scene identifier and audio and video data of users in a video conference;
the first determining module is used for determining audio and video processing parameters of a target audio and video sample according to the target scene identification;
the processing module is used for processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample;
and the return module is used for returning the processed audio and video data to the terminal so that the terminal carries out the video conference according to the processed audio and video data.
In one possible implementation, the audio-video processing parameters include a noise reduction ratio and face data.
The processing module is specifically configured to:
according to the noise reduction ratio, carrying out noise reduction processing on audio and video data of users in the video conference;
and according to the face data, carrying out image filtering on the audio and video data subjected to noise reduction processing.
In a possible implementation manner, the first determining module is specifically configured to:
acquiring a corresponding relation between a pre-stored scene identification and audio and video processing parameters of an audio and video sample;
and determining audio and video processing parameters of the target audio and video sample corresponding to the target scene identification according to the corresponding relation.
In a possible implementation manner, the first determining module is further configured to:
receiving preprocessing information uploaded by a terminal, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier;
decomposing the audio and video samples and determining audio and video processing parameters of the audio and video samples;
and storing the corresponding relation between the scene identification and the audio and video processing parameters of the audio and video samples.
In a fourth aspect, an embodiment of the present application provides another conference access apparatus, where the apparatus is applied to a terminal, and the apparatus includes:
the second determining module is used for determining the target scene identifier;
the uploading module is used for uploading conference information to a server, wherein the conference information comprises a target scene identifier and audio and video data of users in a video conference; the conference information is used for instructing the server to determine audio and video processing parameters of a target audio and video sample according to the target scene identifier, process the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample, and return the processed audio and video data to the terminal;
and the second receiving module is used for receiving the processed audio and video data sent by the server and carrying out the video conference according to the processed audio and video data.
In a possible implementation manner, the uploading module is further configured to:
uploading preprocessing information to the server, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier, and the preprocessing information is used for indicating the server to decompose the audio and video sample, determining audio and video processing parameters of the audio and video sample and storing the corresponding relation between the scene identifier and the audio and video processing parameters of the audio and video sample.
In a fifth aspect, an embodiment of the present application provides a server, including:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of the first aspect.
In a sixth aspect, an embodiment of the present application provides a terminal, including:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of the second aspect.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program causes a server to execute the method in the first aspect.
In an eighth aspect, the present application provides a computer-readable storage medium, which stores a computer program, where the computer program causes a terminal to execute the method of the second aspect.
In a ninth aspect, the present application provides a computer program product, which includes computer instructions for executing the method of the first aspect by a processor.
In a tenth aspect, an embodiment of the present application provides a computer program product, which includes computer instructions for executing the method of the second aspect by a processor.
With the conference access method, apparatus, server, and storage medium provided by the embodiments of the application, the server receives conference information uploaded by the terminal, where the conference information includes a target scene identifier and the audio and video data of a user in a video conference. The server then determines the audio and video processing parameters of a target audio and video sample according to the target scene identifier, processes the audio and video data according to those parameters, and returns the processed audio and video data to the terminal, so that the terminal conducts the video conference based on the processed data. In other words, the embodiments of the application process the user's audio and video data before it enters the conference, which solves the security problem that arises when a user's raw audio and video data is fed directly into a video conference. In addition, because the application determines the audio and video processing parameters corresponding to the target scene identifier, the user's audio and video data in the target scene is processed with parameters matched to that scene; this makes the subsequent processing results more accurate and well suited for practical use.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described here show only some embodiments of the present application; those skilled in the art can derive other drawings from them without inventive effort.
Fig. 1 is a schematic diagram of a conference access system architecture provided in an embodiment of the present application;
fig. 2 is a schematic flowchart of a conference access method according to an embodiment of the present application;
fig. 3 is a schematic flowchart of data processing before conference access according to an embodiment of the present application;
fig. 4 is a schematic flowchart of another conference access method according to an embodiment of the present application;
fig. 5 is a schematic flowchart of another conference access method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of a conference access apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of another conference access apparatus according to an embodiment of the present application;
FIG. 8A is a diagram illustrating a basic hardware architecture of a server according to the present application;
fig. 8B is a schematic diagram of a basic hardware architecture of a terminal provided in the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," if any, in the description and claims of this application and the above-described figures are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
In the related art, some scenarios, such as working from home, expose security issues, such as privacy exposure, when a user accesses a video conference. For example, when a user working remotely from home during an epidemic accesses a video conference, the data from the microphone and camera on the user's communication device is transmitted directly over the network without any processing. As a result, once the conference opens, the conversations of family members, the home environment, and the like are transmitted into the video conference, and the user's privacy is exposed.
To solve the above problems, an embodiment of the present application provides a conference access method in which a server processes the user's audio and video data and returns the processed data to the terminal, so that the terminal conducts the video conference based on the processed data. This solves the security problem, such as privacy exposure, that arises when a user's raw audio and video data is fed directly into a video conference. Moreover, the server determines the corresponding audio and video processing parameters according to the target scene identifier, so the user's audio and video data in the video conference is processed with parameters matched to the target scene, which makes the subsequent processing results more accurate and well suited for practical use.
Optionally, a conference access method provided by the present application may be applied to the architecture diagram of the conference access system shown in fig. 1, and as shown in fig. 1, the system may include a server 11 and a plurality of terminals, where the plurality of terminals take a first terminal 12, a second terminal 13, and a third terminal 14 as an example.
It is to be understood that the structure illustrated in the embodiment of the present application does not specifically limit the architecture of the conference access system. In other possible embodiments of the present application, the foregoing architecture may include more or fewer components than shown, combine some components, split some components, or arrange the components differently, as determined by the practical application scenario; no limitation is imposed here. The components shown in fig. 1 may be implemented in hardware, software, or a combination of software and hardware.
In a specific implementation of this embodiment of the application, the first terminal 12, the second terminal 13, and the third terminal 14 may be terminal devices of different users; for example, the first terminal 12 belongs to user 1, the second terminal 13 to user 2, and the third terminal 14 to user 3. Each terminal sends its user's audio and video data in the video conference to the server 11. In this application scenario, users (e.g., user 1, user 2, and user 3) may log in to the terminal APP, collect the audio and video data to be fed into the video conference through the microphone and camera, and send the corresponding data to the server 11. After receiving the audio and video data, the server 11 processes it and returns the processed data to the terminals (e.g., the first terminal 12, the second terminal 13, and the third terminal 14), so that the terminals conduct the video conference based on the processed data; this solves the security problem that arises when a user's raw audio and video data is fed directly into a video conference.
In this embodiment of the application, each terminal may be a handheld device, a vehicle-mounted device, a wearable device, a computing device, or user equipment (UE) in various other forms, provided it is equipped with a conference-access APP.
In addition, the network architecture and the service scenario described in the embodiment of the present application are for more clearly illustrating the technical solution of the embodiment of the present application, and do not constitute a limitation to the technical solution provided in the embodiment of the present application, and it can be known by a person skilled in the art that along with the evolution of the network architecture and the appearance of a new service scenario, the technical solution provided in the embodiment of the present application is also applicable to similar technical problems.
The technical solutions of the present application are described below with several embodiments as examples, and the same or similar concepts or processes may not be described in detail in some embodiments.
Fig. 2 is a schematic flow diagram of a conference access method provided in an embodiment of the present application, where an execution subject of the embodiment may be a server in the embodiment shown in fig. 1, and as shown in fig. 2, the method may include:
S201: And receiving conference information uploaded by the terminal, wherein the conference information comprises a target scene identifier and audio and video data of users in the video conference.
The target scene identifier may be an identifier of a scene that the user sets on the terminal according to the environment the user is in. For example, if the user is reporting on work in an office while nearby colleagues discuss other matters, the user may set the office scene identifier on the terminal. The terminal then uploads the target scene identifier to the server, so that the server can perform subsequent processing according to it. Here, the target scene identifier may be any information that identifies the target scene, such as the target scene's name or a label.
In this application scenario, the user may log in to the terminal APP, collect the audio and video data to be fed into the video conference through the microphone and camera, and send the collected data to the server. The audio and video data corresponds to the target scene identifier; that is, it is the data with which the user accesses the video conference in the target scene identified by that identifier.
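To make the data flow concrete, the sketch below models the conference information of S201 in Python. This is a minimal illustrative sketch, not code from the patent; the class and field names (ConferenceInfo, scene_id, audio_chunk, video_frame) are assumptions.

```python
from dataclasses import dataclass

@dataclass
class ConferenceInfo:
    """Conference information uploaded by the terminal in S201 (illustrative).

    scene_id is the target scene identifier (e.g. a scene name or label);
    audio_chunk and video_frame carry the user's raw audio and video data
    captured by the terminal's microphone and camera.
    """
    scene_id: str
    audio_chunk: bytes
    video_frame: bytes

# Example: conference information for a user reporting from an office scene.
info = ConferenceInfo(scene_id="office", audio_chunk=b"...", video_frame=b"...")
```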
S202: and determining audio and video processing parameters of the target audio and video sample according to the target scene identification.
Here, after receiving the conference information uploaded by the terminal, the server may obtain a correspondence between a pre-stored scene identifier and an audio/video processing parameter of an audio/video sample, and then determine an audio/video processing parameter of a target audio/video sample corresponding to the target scene identifier according to the correspondence.
The audio/video processing parameters may include a noise reduction ratio and face data.
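Continuing the sketch, the correspondence lookup of S202 could look as follows, assuming the pre-stored correspondence is an in-memory mapping from scene identifier to parameters that the preprocessing flow described next has populated; a production server would more likely persist it in a database. AVProcessingParams and PARAMS_BY_SCENE are illustrative names.

```python
from dataclasses import dataclass
from typing import Dict

@dataclass
class AVProcessingParams:
    """Audio/video processing parameters of an audio/video sample."""
    noise_reduction_ratio: float  # obtained by noise-reduction sampling (fig. 3)
    face_data: bytes              # reference face data used for image filtering

# Pre-stored correspondence between scene identifiers and parameters.
PARAMS_BY_SCENE: Dict[str, AVProcessingParams] = {}

def determine_params(target_scene_id: str) -> AVProcessingParams:
    """S202: determine the target sample's parameters from the correspondence."""
    if target_scene_id not in PARAMS_BY_SCENE:
        raise LookupError(f"no parameters stored for scene {target_scene_id!r}")
    return PARAMS_BY_SCENE[target_scene_id]
```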
For example, as shown in fig. 3, before determining the audio and video processing parameters of the target audio and video sample according to the target scene identifier, the server may receive preprocessing information uploaded by the terminal, where the preprocessing information includes a scene identifier and the audio and video sample corresponding to it. The server can then decompose the audio and video sample, determine the sample's audio and video processing parameters, and store the correspondence between the scene identifier and those parameters; that is, the server stores, for each scene identifier, the corresponding sample's noise reduction ratio and face data. The server can subsequently determine the audio and video processing parameters of the target audio and video sample corresponding to the target scene identifier from this correspondence.
Further, as shown in fig. 3, the terminal may offer common scene options, such as office, home office, high-speed rail, driving, mall, hotel, restaurant, and a default, for the user to select. After the user selects a scene, the terminal determines the corresponding scene identifier and collects audio and video samples of that scene through the microphone and camera; that is, the terminal collects different audio and video samples for different scenes. The terminal then uploads the scene identifier and the corresponding audio and video sample to the server. The server can decompose the audio and video sample, extract an audio file and an image file, and from these determine the sample's noise reduction ratio and face data. Taking determination of the noise reduction ratio as an example: the audio and video sample collected by the terminal is a short video file of pure noise, which is uploaded to the server; the server extracts the audio file from the video, runs the audio noise-reduction function, selects the noise sample, passes it to the noise-reduction processor, and performs noise-reduction sampling to obtain the noise reduction ratio. The server can store the correspondence between the scene identifier and the sample's noise reduction ratio and face data, and may also return the correspondence to the terminal. The terminal may then generate a corresponding prompt, such as processing success or processing failure: if the terminal receives the correspondence returned by the server, it generates a success prompt; otherwise, it generates a failure prompt, so that the user learns the processing status in time. This makes the scheme practical to apply.
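A sketch of the server side of this preprocessing flow is shown below, continuing the code above. The patent does not fix concrete decomposition or noise-sampling algorithms, so estimate_noise_reduction_ratio and extract_face_data are stubs standing in for those steps; their names and fixed return values are assumptions for illustration only.

```python
def estimate_noise_reduction_ratio(audio_file: bytes) -> float:
    # Stub: select the noise sample, pass it to the noise-reduction processor,
    # and perform noise-reduction sampling; a fixed ratio stands in here.
    return 0.8

def extract_face_data(image_file: bytes) -> bytes:
    # Stub: derive reference face data from the sample's image frames.
    return image_file

def preprocess(scene_id: str, audio_file: bytes, image_file: bytes) -> None:
    """Decompose an uploaded audio/video sample (already split into an audio
    file and an image file) and store the scene's processing parameters."""
    PARAMS_BY_SCENE[scene_id] = AVProcessingParams(
        noise_reduction_ratio=estimate_noise_reduction_ratio(audio_file),
        face_data=extract_face_data(image_file),
    )
```

With this in place, determine_params in S202 simply reads back what preprocess stored for the scene.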
S203: and processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample.
As noted above, the audio and video processing parameters include a noise reduction ratio and face data. Therefore, when processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample, the server can first perform noise reduction on the data according to the noise reduction ratio, ensuring that the audio is clean and low in noise, and then perform image filtering on the noise-reduced data according to the face data, so that only imagery matching the face is displayed and other, irrelevant imagery is filtered out. This solves the security problem that arises when a user's raw audio and video data is fed directly into a video conference.
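Continuing the sketch, S203 chains the two operations in order: noise reduction first, then image filtering on the noise-reduced data. denoise and filter_to_face are placeholder stubs, since the patent leaves the concrete signal-processing algorithms open.

```python
def denoise(audio: bytes, ratio: float) -> bytes:
    # Stub: attenuate noise in the audio according to the stored ratio.
    return audio

def filter_to_face(video: bytes, face_data: bytes) -> bytes:
    # Stub: keep only image regions matching the stored face data and
    # filter out other, irrelevant imagery (e.g. the home environment).
    return video

def process_av_data(info: ConferenceInfo) -> ConferenceInfo:
    """S202 + S203: look up the scene's parameters and apply them in order."""
    params = determine_params(info.scene_id)
    clean_audio = denoise(info.audio_chunk, params.noise_reduction_ratio)
    clean_video = filter_to_face(info.video_frame, params.face_data)
    return ConferenceInfo(info.scene_id, clean_audio, clean_video)
```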
S204: and returning the processed audio and video data to the terminal so that the terminal carries out the video conference according to the processed audio and video data.
The server returns the processed audio and video data to the terminal so that the terminal can feed it into the video conference. This reduces interference, cuts down the noise entering the conference, and filters out imagery unrelated to work, avoiding the security problem caused by feeding a user's raw audio and video data directly into a video conference.
After receiving the processed audio and video data returned by the server, the terminal may also generate a corresponding prompt, such as success or failure: if the terminal receives the processed data, it generates a success prompt; otherwise, it generates a failure prompt.
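On the terminal side, that prompt logic might reduce to something like the following, continuing the Python model above; feed_into_conference is a hypothetical hand-off to the conference client, not an API defined by the patent.

```python
from typing import Optional

def feed_into_conference(info: ConferenceInfo) -> None:
    # Stub: hand the processed audio/video over to the video-conference client.
    pass

def handle_server_response(processed: Optional[ConferenceInfo]) -> str:
    """Generate a success/failure prompt for the processed data (S204)."""
    if processed is None:
        return "processing failed"      # nothing came back from the server
    feed_into_conference(processed)     # conduct the conference with clean data
    return "processing succeeded"
```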
In the embodiment of the application, the server receives conference information uploaded by the terminal, where the conference information includes a target scene identifier and the audio and video data of a user in a video conference. The server then determines the audio and video processing parameters of a target audio and video sample according to the target scene identifier, processes the audio and video data according to those parameters, and returns the processed data to the terminal, so that the terminal conducts the video conference based on it; this solves the security problem of feeding raw audio and video data directly into the conference. In addition, because the application determines the audio and video processing parameters corresponding to the target scene identifier, the user's audio and video data in the target scene is processed with parameters matched to that scene, which makes the subsequent processing results more accurate and well suited for practical use.
The above embodiment describes the conference access method of the embodiments of the present application in detail from the server side; the following embodiments describe it from the terminal side. It should be understood that certain concepts and characteristics described on the server side correspond to those on the terminal side, and duplicated description is omitted where appropriate for brevity.
Fig. 4 is a schematic flowchart of another conference access method provided in an embodiment of the present application, where an execution subject of the embodiment may be the terminal in fig. 1, and as shown in fig. 4, the method may include the following steps:
S401: And determining the target scene identifier.
S402: And uploading conference information to a server, wherein the conference information comprises the target scene identifier and audio and video data of the users in the video conference; the conference information is used for instructing the server to determine audio and video processing parameters of a target audio and video sample according to the target scene identifier, process the audio and video data of the users in the video conference according to those parameters, and return the processed audio and video data to the terminal.
S403: and receiving the processed audio and video data sent by the server, and carrying out the video conference according to the processed audio and video data.
Here, after receiving the conference information uploaded by the terminal, the server may obtain a correspondence between a pre-stored scene identifier and an audio/video processing parameter of an audio/video sample, and then determine an audio/video processing parameter of a target audio/video sample corresponding to the target scene identifier according to the correspondence.
The terminal can also upload preprocessing information to the server, wherein the preprocessing information comprises a scene identifier and an audio/video sample corresponding to the scene identifier, and the preprocessing information is used for instructing the server to decompose the audio/video sample, determining audio/video processing parameters of the audio/video sample and storing the corresponding relation between the scene identifier and the audio/video processing parameters of the audio/video sample. Therefore, the server can determine the audio/video processing parameters of the target audio/video sample corresponding to the target scene identifier according to the corresponding relationship.
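Putting the terminal's two interactions together, a minimal end-to-end sketch could look like this, continuing the code above; server stands for an assumed RPC stub exposing the preprocess and process_av_data operations sketched earlier, and is not an interface defined by the patent.

```python
def terminal_flow(server, scene_id: str,
                  sample_audio: bytes, sample_images: bytes,
                  live_audio: bytes, live_video: bytes) -> None:
    """Terminal side of figs. 4 and 5: preprocess once, then join the conference."""
    # Upload preprocessing information (scene identifier + audio/video sample).
    server.preprocess(scene_id, sample_audio, sample_images)
    # S402: upload conference information for the selected target scene.
    processed = server.process_av_data(
        ConferenceInfo(scene_id, live_audio, live_video))
    # S403: conduct the video conference with the processed data.
    feed_into_conference(processed)
```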
In the embodiment of the application, the terminal uploads conference information to a server, where the conference information includes a target scene identifier and the audio and video data of a user in a video conference. The server then determines the audio and video processing parameters of a target audio and video sample according to the target scene identifier, processes the audio and video data according to those parameters, and returns the processed data to the terminal, so that the terminal conducts the video conference based on it. In addition, because the application determines the audio and video processing parameters corresponding to the target scene identifier, the user's audio and video data in the target scene is processed with parameters matched to that scene, which makes the subsequent processing results more accurate and well suited for practical use.
In addition, another conference access method is further provided in this embodiment of the present application, which is described in an interactive manner from a user, a server, and a terminal, and as shown in fig. 5, the method may include:
S501: And the terminal determines the target scene identifier.
The target scene identifier may be an identifier of a scene set on the terminal by the user according to the environment where the user is located. The user may first initiate a conference on the terminal and then set the target scene identifier on the terminal.
S502: And the terminal uploads conference information to the server, wherein the conference information comprises the target scene identifier and the audio and video data of the users in the video conference.
S503: and the server determines audio and video processing parameters of a target audio and video sample according to the target scene identification, and processes the audio and video data of the user in the video conference according to the audio and video processing parameters of the target audio and video sample.
S504: and the server returns the processed audio and video data to the terminal.
S505: and the terminal receives the processed audio and video data sent by the server and carries out the video conference according to the processed audio and video data.
Here, after receiving the processed audio and video data returned by the server, the terminal may also generate a corresponding prompt, such as success or failure: if the terminal receives the processed data, it generates a success prompt; otherwise, it generates a failure prompt.
In the embodiment of the application, the server can process the user's audio and video data and return the processed data to the terminal, so that the terminal conducts the video conference based on the processed data; this solves the security problem, such as privacy exposure, that arises when a user's raw audio and video data is fed directly into a video conference. Moreover, the server determines the corresponding audio and video processing parameters according to the target scene identifier, so the user's audio and video data in the target scene is processed with parameters matched to that scene, which makes the subsequent processing results more accurate and well suited for practical use.
Corresponding to the conference access method of the foregoing embodiments, fig. 6 is a schematic structural diagram of a conference access apparatus provided in an embodiment of the present application; for convenience of explanation, only the portions related to the embodiments of the present application are shown. The conference access apparatus 60 includes: a first receiving module 601, a first determining module 602, a processing module 603, and a returning module 604. The conference access apparatus may be the server itself, or a chip or an integrated circuit that implements the functions of the server. It should be noted that the division into the first receiving module, the first determining module, the processing module, and the returning module is only a division of logical functions; physically, these modules may be integrated together or kept separate.
The first receiving module 601 is configured to receive conference information uploaded by a terminal, where the conference information includes a target scene identifier and audio/video data of a user in a video conference.
A first determining module 602, configured to determine an audio/video processing parameter of a target audio/video sample according to the target scene identifier.
And the processing module 603 is configured to process the audio and video data of the user in the video conference according to the audio and video processing parameter of the target audio and video sample.
And a returning module 604, configured to return the processed audio and video data to the terminal, so that the terminal performs the video conference according to the processed audio and video data.
In one possible implementation, the audio-video processing parameters include a noise reduction ratio and face data.
The processing module 603 is specifically configured to:
according to the noise reduction ratio, carrying out noise reduction processing on audio and video data of users in the video conference;
and according to the face data, carrying out image filtering on the audio and video data subjected to noise reduction processing.
In a possible implementation manner, the first determining module 602 is specifically configured to:
acquiring a corresponding relation between a pre-stored scene identification and audio and video processing parameters of an audio and video sample;
and determining audio and video processing parameters of the target audio and video sample corresponding to the target scene identification according to the corresponding relation.
In a possible implementation manner, the first determining module 602 is further configured to:
receiving preprocessing information uploaded by a terminal, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier;
decomposing the audio and video samples and determining audio and video processing parameters of the audio and video samples;
and storing the corresponding relation between the scene identification and the audio and video processing parameters of the audio and video samples.
The apparatus provided in the embodiment of the present application may be used to implement the technical solution of the method embodiment in fig. 2, which has similar implementation principles and technical effects, and is not described herein again in the embodiment of the present application.
Fig. 7 is a schematic structural diagram of another conference access apparatus according to an embodiment of the present application. The conference access apparatus 70 includes: a second determining module 701, an uploading module 702, and a second receiving module 703. The conference access apparatus may be the terminal itself, or a chip or an integrated circuit that implements the functions of the terminal. It should be noted that the division into the second determining module, the uploading module, and the second receiving module is only a division of logical functions; physically, these modules may be integrated together or kept separate.
The second determining module 701 is configured to determine the target scene identifier.
The uploading module 702 is configured to upload conference information to a server, where the conference information includes the target scene identifier and audio/video data of a user in a video conference, and the conference information is used to instruct the server to determine audio/video processing parameters of a target audio/video sample according to the target scene identifier, process the audio/video data of the user in the video conference according to the audio/video processing parameters of the target audio/video sample, and return the processed audio/video data to the terminal.
The second receiving module 703 is configured to receive the processed audio and video data sent by the server, and perform the video conference according to the processed audio and video data.
In one possible design, the uploading module 702 is further configured to:
uploading preprocessing information to the server, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier, and the preprocessing information is used for indicating the server to decompose the audio and video sample, determining audio and video processing parameters of the audio and video sample and storing the corresponding relation between the scene identifier and the audio and video processing parameters of the audio and video sample.
The apparatus provided in the embodiment of the present application may be used to implement the technical solution of the method embodiment in fig. 4, which has similar implementation principles and technical effects, and is not described herein again in the embodiment of the present application.
Optionally, fig. 8A and fig. 8B schematically show one possible basic hardware architecture of the server and of the terminal described herein, respectively.
Referring to fig. 8A and 8B, the server and the terminal include at least one processor 801 and a communication interface 803. Further optionally, a memory 802 and a bus 804 may also be included.
Among them, the number of the processors 801 in the server and the terminal may be one or more, and fig. 8A and 8B only illustrate one of the processors 801. Alternatively, the processor 801 may be a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or a Digital Signal Processor (DSP). If the server and the terminal have multiple processors 801, the types of the multiple processors 801 may be different, or may be the same. Alternatively, the plurality of processors 801 of the server and the terminal may also be integrated into a multi-core processor.
The memory 802 stores the computer instructions and data needed to implement the conference access methods provided herein; for example, it stores the instructions for implementing the steps of the conference access methods described above. The memory 802 may be any one, or any combination, of the following storage media: nonvolatile memory (e.g., read-only memory (ROM), solid state disk (SSD), hard disk drive (HDD), or optical disc) or volatile memory.
The communication interface 803 may provide information input/output for the at least one processor. It may also include any one or any combination of the following devices with network access capability: a network interface (e.g., an Ethernet interface), a wireless network card, and the like.
Optionally, the communication interface 803 may also be used for data communication between servers and terminals and other computing devices or terminals.
Further alternatively, fig. 8A and 8B show the bus 804 by a thick line. A bus 804 may connect the processor 801 with the memory 802 and the communication interface 803. Thus, via bus 804, processor 801 may access memory 802 and may also interact with other computing devices or terminals using communication interface 803.
In the present application, the server and the terminal execute the computer instructions in the memory 802 and thereby implement the conference access methods provided by the present application, or deploy the conference access apparatuses.
From the perspective of logical function division, as shown in fig. 8A, the memory 802 may include a first receiving module 601, a first determining module 602, a processing module 603, and a returning module 604. Here, "include" merely means that executing the instructions stored in the memory can implement the functions of the first receiving module, the first determining module, the processing module, and the returning module, respectively; it does not imply any particular physical structure.
In one possible design, as shown in fig. 8B, the memory 802 includes a second determining module 701, an uploading module 702, and a second receiving module 703. Again, "includes" merely means that executing the instructions stored in the memory can implement the functions of the second determining module, the uploading module, and the second receiving module, respectively, and does not imply any particular physical structure.
In addition, the conference access apparatus may be implemented by software as shown in fig. 8A and 8B, or may be implemented by hardware as a hardware module or as a circuit unit.
The present application further provides a computer program product, which comprises computer instructions that instruct a computing device to perform the conference access methods provided herein.
The present application provides a chip comprising at least one processor and a communication interface providing information input and/or output for the at least one processor. Further, the chip may also include at least one memory for storing computer instructions. The at least one processor is used for calling and executing the computer instructions to execute the conference access method provided by the application.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.

Claims (10)

1. A conference access method is applied to a server and comprises the following steps:
receiving conference information uploaded by a terminal, wherein the conference information comprises a target scene identifier and audio and video data of users in a video conference;
determining audio and video processing parameters of a target audio and video sample according to the target scene identification;
processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample;
and returning the processed audio and video data to the terminal so that the terminal carries out the video conference according to the processed audio and video data.
2. The method of claim 1, wherein the audio-video processing parameters include a noise reduction ratio and face data;
the processing of the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample comprises the following steps:
according to the noise reduction ratio, carrying out noise reduction processing on audio and video data of users in the video conference;
and according to the face data, carrying out image filtering on the audio and video data subjected to noise reduction processing.
3. The method according to claim 1 or 2, wherein the determining audio/video processing parameters of a target audio/video sample according to the target scene identifier comprises:
acquiring a corresponding relation between a pre-stored scene identification and audio and video processing parameters of an audio and video sample;
and determining audio and video processing parameters of the target audio and video sample corresponding to the target scene identification according to the corresponding relation.
4. The method according to claim 3, before determining the audio-video processing parameters of the target audio-video sample according to the target scene identification, further comprising:
receiving preprocessing information uploaded by a terminal, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier;
decomposing the audio and video samples and determining audio and video processing parameters of the audio and video samples;
and storing the corresponding relation between the scene identification and the audio and video processing parameters of the audio and video samples.
5. A conference access method is applied to a terminal, and the method comprises the following steps:
determining a target scene identifier;
uploading conference information to a server, wherein the conference information comprises the target scene identifier and audio and video data of users in the video conference; the conference information is used for instructing the server to determine audio and video processing parameters of a target audio and video sample according to the target scene identifier, process the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample, and return the processed audio and video data to the terminal;
and receiving the processed audio and video data sent by the server, and carrying out the video conference according to the processed audio and video data.
6. The method of claim 5, further comprising:
uploading preprocessing information to the server, wherein the preprocessing information comprises a scene identifier and an audio and video sample corresponding to the scene identifier, and the preprocessing information is used for indicating the server to decompose the audio and video sample, determining audio and video processing parameters of the audio and video sample and storing the corresponding relation between the scene identifier and the audio and video processing parameters of the audio and video sample.
7. A conference access apparatus, wherein the apparatus is applied to a server, the apparatus comprising:
the first receiving module is used for receiving conference information uploaded by a terminal, wherein the conference information comprises a target scene identifier and audio and video data of users in a video conference;
the first determining module is used for determining audio and video processing parameters of a target audio and video sample according to the target scene identification;
the processing module is used for processing the audio and video data of the users in the video conference according to the audio and video processing parameters of the target audio and video sample;
and the return module is used for returning the processed audio and video data to the terminal so that the terminal carries out the video conference according to the processed audio and video data.
8. A server, comprising:
a processor;
a memory; and
a computer program;
wherein the computer program is stored in the memory and configured to be executed by the processor, the computer program comprising instructions for performing the method of any of claims 1-4.
9. A computer-readable storage medium, characterized in that it stores a computer program that causes a server to execute the method of any one of claims 1-4.
10. A computer program product comprising computer instructions for executing the method of any one of claims 1 to 4 by a processor.
CN202110795149.3A 2021-07-14 2021-07-14 Conference access method, device, server and storage medium Pending CN113473068A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110795149.3A CN113473068A (en) 2021-07-14 2021-07-14 Conference access method, device, server and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110795149.3A CN113473068A (en) 2021-07-14 2021-07-14 Conference access method, device, server and storage medium

Publications (1)

Publication Number Publication Date
CN113473068A true CN113473068A (en) 2021-10-01

Family

ID=77880191

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110795149.3A Pending CN113473068A (en) 2021-07-14 2021-07-14 Conference access method, device, server and storage medium

Country Status (1)

Country Link
CN (1) CN113473068A (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200186727A1 (en) * 2018-12-08 2020-06-11 Fuji Xerox Co., Ltd. Systems and methods for implementing personal camera that adapts to its surroundings, both co-located and remote
CN111294550A (en) * 2018-12-08 2020-06-16 富士施乐株式会社 System and method for implementing a personal camera adapted to its surroundings
CN110502974A (en) * 2019-07-05 2019-11-26 深圳壹账通智能科技有限公司 A kind of methods of exhibiting of video image, device, equipment and readable storage medium storing program for executing
CN111405234A (en) * 2020-04-17 2020-07-10 杭州大轶科技有限公司 Video conference information system and method with integration of cloud computing and edge computing
CN112672095A (en) * 2020-12-25 2021-04-16 联通在线信息科技有限公司 Teleconferencing system

Similar Documents

Publication Publication Date Title
CN110881144B (en) Data processing method based on live broadcast platform and related equipment
RU2642513C2 (en) Communication system
CN104639777A (en) Conference control method, conference control device and conference system
US20170171638A1 (en) Method and Terminal for Video Play
CN110536075B (en) Video generation method and device
CN105791962A (en) Method, apparatus and system for sharing screen-shot
CN103841466A (en) Screen projection method, computer end and mobile terminal
US20230017859A1 (en) Meeting control method and apparatus, device, and medium
CN108574878B (en) Data interaction method and device
CN113438442A (en) Conference data sharing method and device
CN112817671A (en) Image processing method, device, equipment and computer readable storage medium
CN110113298B (en) Data transmission method, device, signaling server and computer readable medium
CN113973103A (en) Audio processing method and device, electronic equipment and storage medium
CN111866440B (en) Method, device and equipment for pushing video data and storage medium
CN112751681B (en) Image processing method, device, equipment and computer readable storage medium
CN111541905B (en) Live broadcast method and device, computer equipment and storage medium
CN113179208A (en) Interaction method, interaction device and storage medium
CN112532913A (en) Video mixing method, video system and server
CN113473068A (en) Conference access method, device, server and storage medium
US9485458B2 (en) Data processing method and device
CN114566173A (en) Audio mixing method, device, equipment and storage medium
EP4289143A1 (en) System and method for sharing media resources for network based communication
US10904301B2 (en) Conference system and method for handling conference connection thereof
CN112559111B (en) Screen capturing method and device for sharing desktop
CN109561081B (en) Mobile terminal video conference method and device, storage medium and server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211001