CN112688965A - Conference audio sharing method and device, electronic equipment and storage medium - Google Patents

Conference audio sharing method and device, electronic equipment and storage medium

Info

Publication number
CN112688965A
Authority
CN
China
Prior art keywords
audio data
audio
conference
processing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110263824.8A
Other languages
Chinese (zh)
Other versions
CN112688965B (en)
Inventor
廖焕柱
杨国全
王克彦
曹亚曦
俞鸣园
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Huachuang Video Signal Technology Co Ltd
Original Assignee
Zhejiang Huachuang Video Signal Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Huachuang Video Signal Technology Co Ltd filed Critical Zhejiang Huachuang Video Signal Technology Co Ltd
Priority to CN202110263824.8A priority Critical patent/CN112688965B/en
Publication of CN112688965A publication Critical patent/CN112688965A/en
Application granted granted Critical
Publication of CN112688965B publication Critical patent/CN112688965B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The application discloses a conference audio sharing method and apparatus, an electronic device and a computer-readable storage medium. In the method, the audio data from the conference is removed, by a first audio processing, from the audio data that shares its source with the conference terminal's audio playing device, to obtain presentation-stream audio data; the audio data of that same source is removed, by a second audio processing, from the audio data captured by the conference terminal's audio collecting device, to obtain audio data of the participants' speech; the presentation-stream audio data and the participants' speech audio data are then merged and sent back over the network to be shared with the other participants. In this way, echo and similar poor experiences that arise when sharing presentation-stream audio in a call scenario are eliminated. In addition, even when no presentation-stream audio is being shared, the second audio processing removes other locally played sounds, so the participants' speech audio data is cleaner and the conference call quality is better.

Description

Conference audio sharing method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of information processing, and in particular, to a conference audio sharing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the development of communication and internet technologies, web conferencing has become an important way for multiple parties to communicate remotely.
However, most current web conferencing products only support sharing of presentation pictures and do not support sharing of presentation audio data; even those products that can share audio data support it only in non-call scenarios.
In a call scenario, the audio contains, besides the presentation audio data to be shared, audio from other unrelated software as well as the speech of the meeting room and the conference participants; sending all of this audio back over the network produces echo and other poor experiences that seriously degrade the conference.
Therefore, how to share presentation audio data in a call scenario remains a technical problem to be solved.
Disclosure of Invention
The applicant provides a conference audio sharing method, a conference audio sharing apparatus, an electronic device and a computer-readable storage medium.
According to a first aspect of embodiments of the present application, there is provided a conference audio sharing method, including: acquiring audio data, wherein the audio data comprises first audio data from a conference, second audio data from a conference terminal audio playing device and third audio data from a conference terminal audio collecting device; performing first audio processing on the second audio data to eliminate the first audio data to obtain fourth audio data; performing second audio processing on the third audio data to eliminate the second audio data to obtain fifth audio data; performing third audio processing on the fourth audio data and the fifth audio data to combine the fourth audio data and the fifth audio data to obtain sixth audio data; the sixth audio data is shared in the conference.
According to an embodiment of the present application, acquiring audio data includes: acquiring audio data in a conference playing thread as first audio data; acquiring audio data to be sent to audio playing equipment after being processed by a system audio driving module of the conference terminal as second audio data; and acquiring audio data in the conference acquisition thread as third audio data.
According to an embodiment of the present application, performing the first audio processing on the second audio data to eliminate the first audio data to obtain the fourth audio data includes: taking the first audio data as reference audio data, and performing a first cancellation processing on the second audio data to obtain the fourth audio data.
According to an embodiment of the present application, the first cancellation processing includes linear echo cancellation processing.
According to an embodiment of the present application, performing the second audio processing on the third audio data to eliminate the second audio data to obtain the fifth audio data includes: taking the second audio data as reference audio data, and performing a second cancellation processing on the third audio data to obtain the fifth audio data.
According to an embodiment of the present application, the second cancellation processing includes linear echo cancellation processing and nonlinear echo cancellation processing.
According to an embodiment of the present application, after performing the second audio processing on the third audio data to eliminate the second audio data to obtain the fifth audio data, the method further includes: performing noise removal processing on the fifth audio data to obtain seventh audio data; correspondingly, performing the third audio processing on the fourth audio data and the fifth audio data to obtain the sixth audio data includes: performing the third audio processing on the fourth audio data and the seventh audio data to obtain the sixth audio data.
According to an embodiment of the present application, after performing the noise removal processing on the fifth audio data to obtain the seventh audio data, the method further includes: performing gain compensation processing on the seventh audio data to obtain eighth audio data; correspondingly, performing the third audio processing on the fourth audio data and the fifth audio data to obtain the sixth audio data includes: performing the third audio processing on the fourth audio data and the eighth audio data to obtain the sixth audio data.
According to a second aspect of embodiments of the present application, a conference audio sharing apparatus includes: an audio data acquisition module, configured to acquire audio data, where the audio data includes first audio data from a conference, second audio data from a conference terminal audio playing device and third audio data from a conference terminal audio collecting device; a first audio processing module, configured to perform first audio processing on the second audio data to eliminate the first audio data and obtain fourth audio data; a second audio processing module, configured to perform second audio processing on the third audio data to eliminate the second audio data and obtain fifth audio data; a third audio processing module, configured to perform third audio processing on the fourth audio data and the fifth audio data to combine them into sixth audio data; and a conference audio sharing module, configured to share the sixth audio data in the conference.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; a memory for storing a computer program; and a processor for implementing the method steps of any one of the above conference audio sharing methods when executing the program stored in the memory.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored therein a computer program, which when executed by a processor, implements the method steps of any one of the above-mentioned conference audio sharing methods.
The embodiment of the application provides a conference audio sharing method and apparatus, an electronic device and a computer-readable storage medium. In the method, the audio data from the conference is removed, by a first audio processing, from the audio data that shares its source with the conference terminal's audio playing device, yielding presentation-stream audio data; the audio data of that same source is removed, by a second audio processing, from the audio data captured by the conference terminal's audio collecting device, yielding audio data of the participants' speech; the presentation-stream audio data and the participants' speech audio data are then merged and sent back over the network to be shared with the other participants. In this way, echo and similar poor experiences that arise when sharing presentation-stream audio in a call scenario are eliminated.
In addition, even when no presentation-stream audio is being shared, the second audio processing removes the conference audio and the sound of other applications played aloud by the local system, so the participants' speech audio data is purer and the conference call quality is further improved.
It is to be understood that the implementation of the present application does not require all of the above-described advantages to be achieved, but rather that certain technical solutions may achieve certain technical effects, and that other embodiments of the present application may also achieve other advantages not mentioned above.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present application will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present application are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
in the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Fig. 1 is a schematic view illustrating an implementation flow of an embodiment of a conference audio sharing method according to the present application;
fig. 2 is a schematic view illustrating a specific implementation flow of another embodiment of the conference audio sharing method according to the present application;
fig. 3 is a schematic structural diagram of an embodiment of a conference audio sharing device according to the present application.
Detailed Description
In order to make the objects, features and advantages of the present application more obvious and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is apparent that the described embodiments are only a part of the embodiments of the present application, and not all the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "a plurality" means two or more unless specifically limited otherwise.
Fig. 1 shows a schematic implementation flow diagram of an embodiment of a conference audio sharing method according to the present application. Referring to fig. 1, an embodiment of the present application provides a conference audio sharing method, where the method includes: operation 110, acquiring audio data, where the audio data includes first audio data from a conference, second audio data from the same source as the conference terminal audio playing device, and third audio data from a conference terminal audio collecting device; operation 120, performing a first audio processing on the second audio data to eliminate the first audio data to obtain fourth audio data; operation 130, performing a second audio processing on the third audio data to eliminate the second audio data to obtain fifth audio data; operation 140, performing third audio processing on the fourth audio data and the fifth audio data to combine the fourth audio data and the fifth audio data to obtain sixth audio data; and operation 150, sharing the sixth audio data in the conference.
In operation 110, audio data refers to digitized sound data. The audio data involved in the conference audio sharing method mainly refers to audio data that the conference terminal can receive and acquire, including: audio data from the conference, i.e., the first audio data; the second audio data from the same source as the conference terminal audio playing device; the third audio data from the conference terminal audio collecting device; and audio data from system audio playback programs.
The audio data from the conference refers to the audio data that the conference distributes to each conference terminal for sharing, and includes the speech of other meeting rooms and the like. The conference terminal refers to any terminal device that joins the conference, including a computer, a tablet, a smartphone, a wearable device, a smart television and the like. The audio playing device mainly refers to a loudspeaker, a sound box, a speaker and the like. The audio collecting device mainly refers to a microphone, a microphone array, an audio capture device and the like. A system audio playback program mainly refers to an application program with an audio playing function, a multimedia player, or playback software of the operating system; for the conference terminal, this mainly covers the presentation-stream audio data to be shared and the audio played by other applications on the system (i.e., the sounds made by other applications). The operating system installed on the conference terminal device may be any operating system supported by the conference system, including Windows, Linux, iOS, Android, and the like.
Generally, the second audio data from the same source as the conference terminal audio playing device refers to the audio data that is sent to the audio playing device to be played, or that is played through the audio playing device; it includes the first audio data from the conference and the audio data from the system audio playback programs. In a call scenario, the third audio data from the conference terminal audio collecting device refers to the audio data captured in the meeting room; it includes the first audio data from the conference played by the conference terminal, the audio played by the system audio playback programs, and the audio input by the participants through the audio collecting device (i.e., the participants' speech).
Accordingly, by performing the first audio processing in operation 120, the first audio data is eliminated from the second audio data from the same source as the conference terminal audio playing device, leaving the audio data from the system audio playback programs, i.e., the fourth audio data. When a conference presenter is giving a presentation, the fourth audio data is specifically the audio played by the local system playback program, that is, the presentation-stream audio data to be shared. Likewise, by the second audio processing of operation 130, the second audio data is eliminated from the third audio data (the audio captured in the meeting room), yielding the fifth audio data, which contains the speech of the conference participants.
In a call scenario, the fourth audio data and the fifth audio data are exactly the audio that is expected to be shared with the other conference participants; the audio data that would cause echo (i.e., the audio from the conference and the presentation-stream audio played by the audio playing device) has been removed through the first audio processing and the second audio processing of operations 120 and 130.
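To make the roles of the two processing stages concrete, the following toy signal model (an illustrative assumption, not taken from the patent text; Python with NumPy) mimics the relationships described above: the second audio data is the sum of the conference audio and the locally played presentation audio, and the third audio data is the microphone pickup of that playback plus the participants' speech and room noise.

    import numpy as np

    # Toy signal model (assumption for illustration only):
    #   second = first + presentation              (what the driver sends to the speaker)
    #   third  = echo(second) + speech + noise     (what the microphone picks up)
    rng = np.random.default_rng(0)
    n = 16000
    first = 0.1 * rng.standard_normal(n)          # first audio data: conference audio
    presentation = 0.1 * rng.standard_normal(n)   # locally played presentation audio
    speech = 0.1 * rng.standard_normal(n)         # participants' speech in the room
    noise = 0.01 * rng.standard_normal(n)         # meeting-room background noise

    second = first + presentation                 # second audio data (playback mix)
    echo_path = 0.6                               # toy loudspeaker-to-microphone coupling
    third = echo_path * second + speech + noise   # third audio data (microphone capture)

    # With ideal cancellation, the two stages would recover:
    fourth = second - first                       # first audio processing -> presentation
    fifth = third - echo_path * second            # second audio processing -> speech + noise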
It should be noted that the first audio processing and the second audio processing are two parallel audio processing procedures; their inputs and outputs are independent of each other and they can run in parallel through two independent processing channels. The first audio processing and the second audio processing therefore do not interfere with each other, and different processing methods can be adopted as needed to obtain output audio of the desired quality. In particular, an implementer may perform the first audio processing and the second audio processing separately through two audio processing modules (e.g., AEC modules).
When eliminating audio data in the first audio processing and the second audio processing, any applicable cancellation method may be adopted. For example, audio data containing only the audio to be eliminated may be acquired as reference audio data, and the component corresponding to the reference audio data removed from the source audio data; alternatively, audio features of the audio to be eliminated, for example timbre or spectral features, may be used to remove the matching audio data from the source audio data.
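The patent does not prescribe a particular cancellation algorithm. As one illustrative possibility, reference-based cancellation is often implemented with a normalized least-mean-squares (NLMS) adaptive filter; the sketch below (Python/NumPy, all names and parameters are assumptions) removes the component of a reference signal from a source signal.

    import numpy as np

    def nlms_cancel(source, reference, filter_len=256, mu=0.5, eps=1e-8):
        """Suppress the component of `reference` contained in `source` (linear cancellation).

        source    -- 1-D float array, e.g. the second audio data (playback mix)
        reference -- 1-D float array of the same length, e.g. the first audio data
        Returns the residual signal with the reference contribution attenuated.
        """
        assert len(source) == len(reference), "frames are assumed to be the same length"
        w = np.zeros(filter_len)                          # adaptive estimate of the reference path
        out = np.zeros_like(source, dtype=float)
        ref_pad = np.concatenate([np.zeros(filter_len - 1), reference])
        for i in range(len(source)):
            x = ref_pad[i:i + filter_len][::-1]           # most recent reference samples
            y = np.dot(w, x)                              # estimated reference contribution
            e = source[i] - y                             # residual = source minus estimate
            w += (mu / (np.dot(x, x) + eps)) * e * x      # NLMS weight update
            out[i] = e
        return out

With the toy model above, nlms_cancel(second, first) would approximate the presentation-stream audio and nlms_cancel(third, second) the participants' speech, which is the role the two cancellation stages play here.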
After the fourth audio data (the presentation-stream audio to be shared) and the fifth audio data (the speech of the conference participants) are obtained, the two need to be combined into audio data that can be shared with the other conference participants (i.e., the sixth audio data). The third audio processing in operation 140 combines the fourth audio data and the fifth audio data; it mainly consists of mixing and encoding.
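As a minimal sketch of the mixing part of the third audio processing (the encoding step and any codec choice are left to the conference system and are not specified in the patent), the two streams can simply be summed per frame and clipped; the function name and normalization are assumptions.

    import numpy as np

    def mix_frames(presentation, speech):
        """Combine presentation-stream audio and participant speech into one frame.

        Both inputs are float arrays scaled to [-1.0, 1.0]; the shorter one is zero-padded.
        The mixed frame would then be handed to the conference encoder.
        """
        n = max(len(presentation), len(speech))
        mixed = np.zeros(n)
        mixed[:len(presentation)] += presentation
        mixed[:len(speech)] += speech
        return np.clip(mixed, -1.0, 1.0)   # guard against clipping before encoding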
After the sixth audio data is obtained, the sixth audio data can be sent to other participants in the conference through the network, so that the sharing of the presentation streaming audio and the speaking voice of the participants in the conference is realized.
As can be seen from the above embodiment, in the conference audio sharing method of the present application, the audio data from the same source as the conference terminal audio playing device, the audio data from the conference, and the audio data from the conference terminal audio collecting device are obtained in operation 110; the audio data from the conference is eliminated from the audio data of the playing-device source through the first audio processing in operation 120, yielding the presentation-stream audio data; the audio data of the playing-device source is eliminated from the audio data from the conference terminal audio collecting device through the second audio processing in operation 130, yielding the audio data of the participants' speech; the presentation-stream audio data and the participants' speech audio data are then merged by the third audio processing in operation 140 and, in operation 150, sent back over the network to be shared with the other conference participants. In this way, echo and similar poor experiences that arise when sharing presentation-stream audio in a call scenario are eliminated.
In addition, even when no presentation-stream audio is being shared, the audio data from the same source as the conference terminal audio playing device (i.e., the sound played by the system) is eliminated during the second audio processing, so the participants' speech audio data is purer and the quality of the conference call is further improved.
It should be noted that the embodiment shown in fig. 1 is only one of the most basic embodiments of the conference audio sharing method of the present application, and further refinements and extensions can be made on this basis. For example:
according to an embodiment of the present application, acquiring audio data includes: acquiring audio data in a conference playing thread as first audio data; acquiring audio data to be sent to audio playing equipment after being processed by a system audio driving module of the conference terminal as second audio data; and acquiring audio data in the conference acquisition thread as third audio data.
The data in the conference playing thread has already undergone jitter removal, packet loss concealment, decoding and similar processing, so acquiring the audio data in the conference playing thread as the first audio data avoids repeating that processing and directly yields higher-quality audio data.
The audio data acquired after processing by the system audio driver module (e.g., the Windows audio driver module) and before it is sent to the audio playing device has not yet undergone digital-to-analog (DA) conversion and contains almost no distortion, so purer presentation-stream audio data can be obtained.
Taking the audio data in the conference capture thread as the third audio data means the analog-to-digital (AD) conversion step can be omitted, so higher-quality audio data can be obtained.
According to an embodiment of the present application, performing the first audio processing on the second audio data to eliminate the first audio data to obtain the fourth audio data includes: taking the first audio data as reference audio data, and performing the first cancellation processing on the second audio data to obtain the fourth audio data.
Because audio data containing only the audio from the conference, i.e., the first audio data, is available, it can be used directly as the reference audio data for the cancellation processing of the second audio data, yielding the fourth audio data played by the system, i.e., the presentation-stream audio data. The extracted presentation-stream audio data is therefore more accurate.
According to an embodiment of the present application, the first cancellation processing includes linear echo cancellation processing.
When the first audio data from the conference and the second audio data from the same source as the conference terminal audio playing device have high sound quality, little noise and little distortion, the first cancellation processing can use linear echo cancellation alone: pure presentation-stream audio is obtained without any nonlinear cancellation. This saves considerable computation and shortens the processing time.
According to an embodiment of the present application, performing the second audio processing on the third audio data to eliminate the second audio data to obtain the fifth audio data includes: taking the second audio data as reference audio data, and performing the second cancellation processing on the third audio data to obtain the fifth audio data.
Because audio data containing only the playback of the conference terminal audio playing device, i.e., the second audio data, is available, it can be used directly as the reference audio data for the cancellation processing of the third audio data. After this processing, what remains of the meeting-room capture once the audio played by the terminal's audio playing device is removed is the fifth audio data, i.e., the audio of the participants' speech. The extracted speech audio data of the conference participants is therefore more accurate.
According to an embodiment of the present application, the second cancellation processing includes linear echo cancellation processing and nonlinear echo cancellation processing.
The audio data collected by the conference terminal audio collecting device, i.e., the third audio data, typically passes through more AD or DA conversion steps than the data from the same source as the conference terminal audio playing device. If the second cancellation processing used only linear echo cancellation, residual echo or noise caused by this distortion would very likely remain. Some additional nonlinear echo cancellation is therefore desirable to obtain better-quality audio data.
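The patent does not name a specific nonlinear stage. One common follow-up to linear echo cancellation is a spectral post-filter that attenuates frequency bins still dominated by residual echo from the reference signal; the sketch below is such a post-filter under simplifying assumptions (a single FFT per frame, equal-length frames, a fixed assumed leakage factor), with all names and parameters illustrative.

    import numpy as np

    def residual_echo_suppress(frame, reference_frame, leak=0.2, floor=0.05):
        """Nonlinear post-filter: attenuate bins still dominated by reference (echo) energy.

        frame           -- output of the linear cancellation stage (float array)
        reference_frame -- corresponding reference/playback frame of the same length
        leak            -- assumed fraction of reference magnitude surviving the linear stage
        floor           -- minimum gain, which limits musical-noise artifacts
        """
        spec = np.fft.rfft(frame)
        ref = np.fft.rfft(reference_frame)
        residual = leak * np.abs(ref)                            # crude residual-echo estimate
        gain = np.abs(spec) / (np.abs(spec) + residual + 1e-12)  # Wiener-style suppression gain
        gain = np.maximum(gain, floor)
        return np.fft.irfft(gain * spec, n=len(frame))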
According to an embodiment of the present application, after performing the second audio processing on the third audio data to eliminate the second audio data to obtain the fifth audio data, the method further includes: performing noise removal processing on the fifth audio data to obtain seventh audio data; correspondingly, performing the third audio processing on the fourth audio data and the fifth audio data to obtain the sixth audio data includes: performing the third audio processing on the fourth audio data and the seventh audio data to obtain the sixth audio data.
As described above, the audio data collected by the audio collecting device of the conference terminal is likely to include background noise from the meeting room that is unrelated to the conference. An additional noise removal step, for example automatic noise suppression (ANS), may therefore be applied after the cancellation processing. In this way, purer speech of the conference participants can be obtained.
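ANS implementations vary; as one illustrative possibility, a very small spectral-subtraction suppressor can be sketched as follows, assuming the noise spectrum is estimated from frames known to contain only background noise. The function name, the over-subtraction factor and the spectral floor are assumptions.

    import numpy as np

    def simple_ans(frames, noise_frames, oversub=1.5, floor=0.02):
        """Tiny spectral-subtraction noise suppressor.

        frames       -- list of equal-length speech frames (float arrays)
        noise_frames -- frames assumed to contain only meeting-room background noise
        """
        noise_mag = np.mean([np.abs(np.fft.rfft(f)) for f in noise_frames], axis=0)
        cleaned = []
        for f in frames:
            spec = np.fft.rfft(f)
            mag = np.abs(spec)
            new_mag = np.maximum(mag - oversub * noise_mag, floor * mag)  # subtract the noise estimate
            cleaned.append(np.fft.irfft(new_mag * np.exp(1j * np.angle(spec)), n=len(f)))
        return cleaned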
According to an embodiment of the present application, after performing the noise removal processing on the fifth audio data to obtain the seventh audio data, the method further includes: performing gain compensation processing on the seventh audio data to obtain eighth audio data; correspondingly, performing the third audio processing on the fourth audio data and the fifth audio data to obtain the sixth audio data includes: performing the third audio processing on the fourth audio data and the eighth audio data to obtain the sixth audio data.
Generally, the loudness of a participant's speech may fluctuate with the distance between the participant and the audio capturing device of the conference terminal. A gain step may therefore be performed after the cancellation and noise removal processing, for example automatic gain control (AGC), which automatically adjusts the microphone volume so that the other participants hear a steady level even as the distance between the speaker and the microphone changes.
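A minimal AGC sketch, again an assumption rather than the patent's implementation, tracks the frame RMS and smooths a running gain toward a target level; all parameter values are illustrative.

    import numpy as np

    def simple_agc(frame, state, target_rms=0.1, attack=0.3, max_gain=8.0):
        """One AGC step: move the running gain toward the level that reaches target_rms.

        state -- dict carrying the running gain between frames, e.g. {"gain": 1.0}
        """
        rms = np.sqrt(np.mean(frame ** 2)) + 1e-12
        desired = min(target_rms / rms, max_gain)            # gain needed for this frame
        state["gain"] += attack * (desired - state["gain"])  # smooth the gain to avoid pumping
        return np.clip(frame * state["gain"], -1.0, 1.0)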
The above embodiments are exemplary illustrations of how to further refine and expand on the basis of the basic embodiment shown in fig. 1, and an implementer may combine various implementations in the above embodiments to form a new embodiment according to specific implementation conditions and needs, so as to achieve a more ideal implementation effect.
Fig. 2 shows another embodiment of the conference audio sharing method of the present application, which combines the implementations described above to achieve a better overall effect.
As shown in fig. 2, the process from receiving the conference audio signal over the network, to generating the audio data to be shared (the presentation-stream audio data together with the participants' speech audio data), to sharing that audio with the other participants over the network, mainly includes:
Step 2010, receiving the conference audio signal sent over the network, and processing it through a signal processing module (e.g., a NetEQ module) for jitter removal, packet loss concealment, decoding and the like, to obtain the audio data to be played in the conference playing thread (i.e., the first audio data);
Step 2020, the system audio driver module (e.g., the Windows audio driver module) collects the audio data played by the conference playing thread together with the presentation-stream audio data, and processes them to obtain audio data A (i.e., the second audio data) to be played by the audio playing device of the conference terminal;
Step 2030, acoustic echo cancellation (AEC) module 1 obtains audio data A from the audio driver module output interface as the source audio data, obtains the audio data from the conference out of the conference playing thread as the reference audio data, and performs cancellation on the source audio data to obtain the presentation-stream audio data (i.e., the fourth audio data);
the quality is high and distortion is hardly caused because the audio data is acquired from the conference playing thread and the audio data is acquired from the audio driving module output interface. Therefore, only linear noise reduction processing is performed in the linear AEC module 1 to obtain relatively pure audio data of the presentation stream, and then ANS and AGC processing are not performed.
Step 2040, AEC module 2 obtains the processed meeting-room sound captured by the microphone (i.e., the third audio data) from the conference capture thread as the source audio data, obtains audio data A from the audio driver module output interface as the reference audio data, and performs cancellation on the source audio data to obtain the participants' speech audio data (i.e., the fifth audio data);
AEC module 2 applies both linear and nonlinear echo cancellation when performing this cancellation.
In step 2040, because audio data A often also contains conference-unrelated audio played by other applications on the system, using audio data A as the reference audio data removes that unrelated audio as well, making the participants' speech audio data purer.
Step 2050, performing noise removal processing on the speech audio data output by AEC module 2 through an ANS module, removing noise such as meeting-room background noise, to obtain the seventh audio data;
Step 2060, performing gain compensation processing on the participants' speech audio data through an AGC module, making the speech level better and more stable, to obtain the eighth audio data;
Step 2070, mixing and encoding the processed presentation-stream audio data (the fourth audio data) and the processed participants' speech audio data (the eighth audio data) to obtain the sixth audio data to be shared, and then sending the sixth audio data to the other participants of the conference over the network, thereby realizing the sharing of presentation-stream audio data in call mode; a sketch tying these steps together follows below.
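As an illustrative composition only, the per-frame processing of steps 2030-2070 could be built from the helper functions sketched earlier (nlms_cancel, residual_echo_suppress, simple_ans, simple_agc, mix_frames; all assumed names, not the patent's implementation). It simplifies in several ways: all frames are assumed to be the same length, the adaptive filter state is not carried across frames, and the final encoding/transmission step is only indicated by a comment.

    def share_frame(first, second, third, agc_state, noise_frames):
        """One frame through the Fig. 2 pipeline (illustrative composition of the sketches above).

        first        -- conference audio from the playing thread        (step 2010)
        second       -- audio-driver output: conference + presentation  (step 2020)
        third        -- microphone capture from the capture thread      (step 2040)
        noise_frames -- frames assumed to contain only meeting-room background noise
        """
        # Step 2030: AEC module 1, linear cancellation only -> presentation-stream audio (fourth)
        fourth = nlms_cancel(second, first)
        # Step 2040: AEC module 2, linear stage with the driver output as reference (fifth)...
        fifth = nlms_cancel(third, second)
        # ...followed by the nonlinear residual-echo suppression stage
        fifth = residual_echo_suppress(fifth, second)
        # Step 2050: ANS noise removal (seventh)
        seventh = simple_ans([fifth], noise_frames)[0]
        # Step 2060: AGC gain compensation (eighth)
        eighth = simple_agc(seventh, agc_state)
        # Step 2070: mix presentation audio and speech; the result (sixth) would then be
        # encoded and sent to the other conference participants over the network
        return mix_frames(fourth, eighth)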
In this way, when a user gives a presentation in call mode, both the presentation-stream pictures and the presentation-stream audio can be shared.
To give the participants a good sharing experience, the invention uses two echo cancellation modules: AEC module 1 performs only linear echo cancellation and therefore introduces no distortion; AEC module 2 comprises both linear and nonlinear echo cancellation, which ensures that the processed sound contains no echo.
It should be noted that the application shown in fig. 2 is only an exemplary illustration of the conference audio sharing method of the present application and is not a limitation to the embodiment and application scenario of the conference audio sharing method of the present application. The implementer can adopt any applicable implementation mode and be applied to any applicable application scene according to specific implementation conditions.
Further, according to the embodiment of the present application, there is also provided a conference audio sharing apparatus, as shown in fig. 3, the apparatus 30 includes: the audio data acquisition module 301 is configured to acquire audio data, where the audio data includes first audio data from a conference, second audio data from a same source of a conference terminal audio playing device, and third audio data from a conference terminal audio collecting device; the first audio processing module 302 is configured to perform first audio processing on the second audio data to eliminate the first audio data to obtain fourth audio data; a second audio processing module 303, configured to perform second audio processing on the third audio data to eliminate the second audio data to obtain fifth audio data; a third audio processing module 304, configured to perform third audio processing on the fourth audio data and the fifth audio data to combine the fourth audio data and the fifth audio data to obtain sixth audio data; and a conference audio sharing module 305 for sharing the sixth audio data in the conference.
According to an embodiment of the present application, the audio data obtaining module 301 includes: the first audio data acquisition submodule is used for acquiring audio data in a conference playing thread as first audio data; the second audio data acquisition submodule is used for acquiring audio data which are processed by a system audio driving module of the conference terminal and are to be sent to the audio playing device as second audio data; and the third audio data acquisition submodule is used for acquiring the audio data in the conference acquisition thread as third audio data.
According to an embodiment of the present application, the first audio processing module 302 is specifically configured to take the first audio data as reference audio data and perform the first cancellation processing on the second audio data to obtain the fourth audio data.
According to an embodiment of the present application, the first audio processing module 302 is specifically configured to perform linear echo cancellation processing.
According to an embodiment of the present application, the second audio processing module 303 is specifically configured to take the second audio data as reference audio data and perform the second cancellation processing on the third audio data to obtain the fifth audio data.
According to an embodiment of the present application, the second audio processing module 303 is specifically configured to perform linear echo cancellation processing and nonlinear echo cancellation processing.
According to an embodiment of the present application, the apparatus 30 further includes: the noise removal processing module is used for performing noise removal processing on the fifth audio data to obtain seventh audio data; correspondingly, the third audio processing module 304 is specifically configured to perform third audio processing on the fourth audio data and the seventh audio data to obtain sixth audio data.
According to an embodiment of the present application, the apparatus 30 further includes: the gain compensation processing module is used for carrying out gain compensation processing on the seventh audio data to obtain eighth audio data; correspondingly, the third audio processing module 304 is specifically configured to perform third audio processing on the fourth audio data and the eighth audio data to obtain sixth audio data.
According to a third aspect of the embodiments of the present application, there is provided an electronic device, including a processor, a communication interface, a memory, and a communication bus, where the processor, the communication interface, and the memory complete communication with each other through the communication bus; a memory for storing a computer program; a processor for implementing the method steps of any of the above conference audio sharing methods when executing the program stored in the memory.
According to a fourth aspect of embodiments of the present application, there is provided a computer-readable storage medium having stored therein a computer program, which when executed by a processor, implements the method steps of any one of the above-mentioned conference audio sharing methods.
Here, it should be noted that: the above description on the embodiment of the conference audio sharing apparatus, the above description on the embodiment of the electronic device, and the above description on the embodiment of the computer readable storage medium are similar to the descriptions on the foregoing method embodiments, and have similar beneficial effects to the foregoing method embodiments, and therefore, the descriptions are not repeated. For technical details that have not been disclosed in the description of the embodiment of the conference audio sharing device, the description of the embodiment of the electronic device, and the description of the embodiment of the computer-readable storage medium, please refer to the description of the foregoing method embodiments of the present application for understanding, and therefore, for brevity, will not be described again.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of a unit is only one logical function division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another device, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; can be located in one place or distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may be separately regarded as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
Those of ordinary skill in the art will understand that: all or part of the steps for realizing the method embodiments can be completed by hardware related to program instructions, the program can be stored in a computer readable storage medium, and the program executes the steps comprising the method embodiments when executed; and the aforementioned storage medium includes: various media capable of storing program codes, such as a removable storage medium, a Read Only Memory (ROM), a magnetic disk, and an optical disk.
Alternatively, the integrated units described above in the present application may be stored in a computer-readable storage medium if they are implemented in the form of software functional modules and sold or used as independent products. Based on such understanding, the technical solutions of the embodiments of the present application may be essentially implemented or portions thereof that contribute to the prior art may be embodied in the form of a software product stored in a storage medium, and including several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the methods of the embodiments of the present application. And the aforementioned storage medium includes: a removable storage medium, a ROM, a magnetic disk, an optical disk, or the like, which can store the program code.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present application, and shall be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (11)

1. A conference audio sharing method, the method comprising:
acquiring audio data, wherein the audio data comprises first audio data from a conference, second audio data from a conference terminal audio playing device and third audio data from a conference terminal audio collecting device;
performing first audio processing on the second audio data to eliminate the first audio data to obtain fourth audio data;
performing second audio processing on the third audio data to eliminate the second audio data to obtain fifth audio data;
performing third audio processing on the fourth audio data and the fifth audio data to combine the fourth audio data and the fifth audio data to obtain sixth audio data;
sharing the sixth audio data in the conference.
2. The method of claim 1, wherein the obtaining audio data comprises:
acquiring audio data in a conference playing thread as the first audio data;
acquiring audio data to be sent to audio playing equipment after being processed by a system audio driving module of the conference terminal as second audio data;
and acquiring audio data in the conference acquisition thread as the third audio data.
3. The method of claim 1, wherein the first audio processing the second audio data to remove the first audio data to obtain fourth audio data comprises:
taking the first audio data as reference audio data, and performing a first cancellation processing on the second audio data to obtain the fourth audio data.
4. The method of claim 3, wherein the first cancellation processing comprises linear echo cancellation processing.
5. The method of claim 1, wherein the second audio processing the third audio data to remove the second audio data to obtain fifth audio data comprises:
taking the second audio data as reference audio data, and performing a second cancellation processing on the third audio data to obtain the fifth audio data.
6. The method of claim 5, wherein the second cancellation processing comprises linear echo cancellation processing and nonlinear echo cancellation processing.
7. The method of claim 1, wherein after the second audio processing of the third audio data to eliminate the second audio data to obtain fifth audio data, the method further comprises:
performing noise removal processing on the fifth audio data to obtain seventh audio data;
correspondingly, performing third audio processing on the fourth audio data and the fifth audio data to obtain sixth audio data, including:
and performing third audio processing on the fourth audio data and the seventh audio data to obtain sixth audio data.
8. The method of claim 7, wherein after said performing noise-removal processing on the fifth audio data to obtain seventh audio data, the method further comprises:
performing gain compensation processing on the seventh audio data to obtain eighth audio data;
correspondingly, performing third audio processing on the fourth audio data and the fifth audio data to obtain sixth audio data, including:
and performing third audio processing on the fourth audio data and the eighth audio data to obtain sixth audio data.
9. An apparatus for conference audio sharing, the apparatus comprising:
the audio data acquisition module is used for acquiring audio data, wherein the audio data comprises first audio data from a conference, second audio data from a conference terminal audio playing device and third audio data from a conference terminal audio acquisition device;
the first audio processing module is used for carrying out first audio processing on the second audio data so as to eliminate the first audio data and obtain fourth audio data;
the second audio processing module is used for performing second audio processing on the third audio data so as to eliminate the second audio data to obtain fifth audio data;
a third audio processing module, configured to perform third audio processing on the fourth audio data and the fifth audio data to combine the fourth audio data and the fifth audio data to obtain sixth audio data;
a conference audio sharing module that shares the sixth audio data in the conference.
10. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus; the memory is configured to store a computer program; and the processor is configured to implement the method steps of any one of claims 1 to 8 when executing the program stored in the memory.
11. A computer-readable storage medium, characterized in that a computer program is stored in the computer-readable storage medium, which computer program, when being executed by a processor, carries out the method steps of any one of claims 1 to 8.
CN202110263824.8A 2021-03-11 2021-03-11 Conference audio sharing method and device, electronic equipment and storage medium Active CN112688965B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110263824.8A CN112688965B (en) 2021-03-11 2021-03-11 Conference audio sharing method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112688965A true CN112688965A (en) 2021-04-20
CN112688965B CN112688965B (en) 2021-07-09

Family

ID=75458357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110263824.8A Active CN112688965B (en) 2021-03-11 2021-03-11 Conference audio sharing method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112688965B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1987999A (en) * 2006-07-17 2007-06-27 深圳市迪威新软件技术有限公司 Echo processing method for meeting TV system
CN102474424A (en) * 2009-07-24 2012-05-23 思杰***有限公司 Systems and methods for switching between computer and presenter audio transmission during conference call
US20120311090A1 (en) * 2011-05-31 2012-12-06 Lenovo (Singapore) Pte. Ltd. Systems and methods for aggregating audio information from multiple sources
CN105491447A (en) * 2014-10-10 2016-04-13 镇江鼎拓科技信息有限公司 Video technique based on streaming media technology
CN110324565A (en) * 2019-06-06 2019-10-11 浙江华创视讯科技有限公司 Audio-frequency inputting method, device, conference host, storage medium and electronic device

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112995541A (en) * 2021-04-26 2021-06-18 北京易真学思教育科技有限公司 Method for eliminating video echo and computer storage medium
CN117031399A (en) * 2023-10-10 2023-11-10 浙江华创视讯科技有限公司 Multi-agent cooperative sound source positioning method, equipment and storage medium
CN117031399B (en) * 2023-10-10 2024-02-20 浙江华创视讯科技有限公司 Multi-agent cooperative sound source positioning method, equipment and storage medium

Also Published As

Publication number Publication date
CN112688965B (en) 2021-07-09

Similar Documents

Publication Publication Date Title
CN112688965B (en) Conference audio sharing method and device, electronic equipment and storage medium
CN110265056B (en) Sound source control method, loudspeaker device and apparatus
US11782674B2 (en) Centrally controlling communication at a venue
CN111863011B (en) Audio processing method and electronic equipment
US20190221226A1 (en) Electronic apparatus and echo cancellation method applied to electronic apparatus
JP5130895B2 (en) Audio processing apparatus, audio processing system, audio processing program, and audio processing method
CN110299144B (en) Audio mixing method, server and client
US11741984B2 (en) Method and apparatus and telephonic system for acoustic scene conversion
CN113973103A (en) Audio processing method and device, electronic equipment and storage medium
CN112565668B (en) Method for sharing sound in network conference
CN114143668A (en) Audio signal processing, reverberation detection and conference method, apparatus and storage medium
CN114979344A (en) Echo cancellation method, device, equipment and storage medium
CN112307161B (en) Method and apparatus for playing audio
CN113470673A (en) Data processing method, device, equipment and storage medium
US20080266381A1 (en) Selectively privatizing data transmissions in a video conference
CN112543202B (en) Method, system and readable storage medium for transmitting shared sound in network conference
US11915710B2 (en) Conference terminal and embedding method of audio watermarks
CN112820307B (en) Voice message processing method, device, equipment and medium
CN216122672U (en) Conference system and remote conference platform
CN117118956B (en) Audio processing method, device, electronic equipment and computer readable storage medium
CN113556503B (en) Conference system, remote conference platform and audio processing method
CN114530159A (en) Multimedia resource integration scheduling method based on WebRTC technology
CN115700881A (en) Conference terminal and method for embedding voice watermark
CN116320267A (en) Conference integrated machine equipment, communication system, control method and related equipment
CN115798495A (en) Conference terminal and echo cancellation method for conference

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant