CN116633909B - Conference management method and system based on artificial intelligence - Google Patents
- Publication number
- CN116633909B CN116633909B CN202310875230.1A CN202310875230A CN116633909B CN 116633909 B CN116633909 B CN 116633909B CN 202310875230 A CN202310875230 A CN 202310875230A CN 116633909 B CN116633909 B CN 116633909B
- Authority
- CN
- China
- Prior art keywords
- speaking
- period
- voice
- users
- time
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/10—Office automation; Time management
- G06Q10/109—Time management, e.g. calendars, reminders, meetings or time accounting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/40—Support for services or applications
- H04L65/403—Arrangements for multi-party communication, e.g. for conferences
- H04L65/4038—Arrangements for multi-party communication, e.g. for conferences with floor control
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/80—Responding to QoS
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N7/00—Television systems
- H04N7/14—Systems for two-way working
- H04N7/15—Conference systems
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The artificial-intelligence-based conference management method and system provided by the invention determine, using a voice interaction degree model, the voice interaction degree between the spoken voice of a speaking user over a period of time and a plurality of non-speaking users; determine, using a video interaction degree model based on the speaking user's shared video over a period of time and the conference information, the video interaction degree between that shared video and the plurality of non-speaking users; determine the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; determine a plurality of users to be muted based on that association degree; and mute the users to be muted.
Description
Technical Field
The invention relates to the technical field of conference management, in particular to a conference management method and system based on artificial intelligence.
Background
With the development of technology, more and more users hold teleconferences through mobile applications. Because a teleconference has many participants, leaving every participant's microphone open introduces noise that disturbs the progress of the conference. An administrator therefore closes the microphones of participants who are not speaking, preventing ambient noise from being transmitted into the conference and interfering with it.
How to manage participants' speaking rights quickly while improving the user experience is therefore a problem to be solved.
Disclosure of Invention
The invention mainly solves the technical problem of how to rapidly manage participants' speaking rights and improve the user experience.
According to a first aspect, the present invention provides an artificial-intelligence-based conference management method, including: acquiring conference information; acquiring the spoken voice of a speaking user over a period of time and the shared video of the speaking user over that period; determining, using a voice interaction degree model based on the spoken voice and the conference information, the voice interaction degree between the spoken voice and a plurality of non-speaking users; determining, using a video interaction degree model based on the shared video and the conference information, the video interaction degree between the shared video and the plurality of non-speaking users; determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and determining a plurality of users to be muted based on that association degree, and muting the users to be muted.
Still further, the method further comprises: performing speech recognition on the spoken voice of the speaking user to obtain speech recognition text; judging whether the speech recognition text includes the name of a non-speaking user; and, if it does, unmuting the non-speaking user corresponding to that name.
Further, the voice interaction degree model is a long short-term memory (LSTM) neural network model whose input is the spoken voice of the speaking user over a period of time together with the conference information, and whose output is the voice interaction degree between that spoken voice and a plurality of non-speaking users; the video interaction degree model is likewise an LSTM neural network model whose input is the shared video of the speaking user over a period of time together with the conference information, and whose output is the video interaction degree between that shared video and the plurality of non-speaking users.
Still further, determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree includes: assigning different weights to the voice interaction degree and the video interaction degree and computing their weighted sum to obtain the association degree between the speaking user and each of the non-speaking users.
According to a second aspect, the present invention provides an artificial-intelligence-based conference management system comprising: a first acquisition module for acquiring conference information; a second acquisition module for acquiring the spoken voice of a speaking user over a period of time and the shared video of the speaking user over that period; a voice determination module for determining, using a voice interaction degree model based on the spoken voice and the conference information, the voice interaction degree between the spoken voice and a plurality of non-speaking users; a video determination module for determining, using a video interaction degree model based on the shared video and the conference information, the video interaction degree between the shared video and the plurality of non-speaking users; an association degree determination module for determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and a mute module for determining a plurality of users to be muted based on that association degree and muting them.
Still further, the system is further configured to: perform speech recognition on the spoken voice of the speaking user to obtain speech recognition text; judge whether the speech recognition text includes the name of a non-speaking user; and, if it does, unmute the non-speaking user corresponding to that name.
Further, the voice interaction degree model is a long short-term memory (LSTM) neural network model whose input is the spoken voice of the speaking user over a period of time together with the conference information, and whose output is the voice interaction degree between that spoken voice and a plurality of non-speaking users; the video interaction degree model is likewise an LSTM neural network model whose input is the shared video of the speaking user over a period of time together with the conference information, and whose output is the video interaction degree between that shared video and the plurality of non-speaking users.
Furthermore, the association degree determination module is further configured to assign different weights to the voice interaction degree and the video interaction degree and compute their weighted sum to obtain the association degree between the speaking user and the plurality of non-speaking users.
According to a third aspect, the present invention provides an electronic device comprising: a memory; a processor; a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method described above.
According to a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method as in any of the above aspects.
In summary, the artificial-intelligence-based conference management method and system provided by the invention determine, using a voice interaction degree model, the voice interaction degree between the spoken voice of a speaking user over a period of time and a plurality of non-speaking users; determine, using a video interaction degree model based on the speaking user's shared video over a period of time and the conference information, the video interaction degree between that shared video and the plurality of non-speaking users; determine the association degree between the speaking user and the plurality of non-speaking users based on the two interaction degrees; determine a plurality of users to be muted based on that association degree; and mute the users to be muted.
Drawings
Fig. 1 is a schematic flow chart of a conference management method based on artificial intelligence according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of managing user mute authorities according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an artificial intelligence-based conference management system according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In an embodiment of the present invention, there is provided an artificial intelligence based conference management method as shown in fig. 1, where the artificial intelligence based conference management method includes steps S1 to S6:
step S1, meeting information is obtained.
The conference information includes the start time of the conference, the conference topic, information on the participating users, a conference profile, the conference agenda, introductory conference materials, and so on. The information on participating users includes user identity, user age, and the like.
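The patent does not fix a concrete data layout for the conference information; as an illustration only, the fields listed above could be grouped in a small structure like the following (all field and class names here are assumptions invented for the sketch, not part of the disclosed method):

```python
from dataclasses import dataclass, field

@dataclass
class ParticipantInfo:
    # Per-user record: identity and age are the attributes named in the description.
    name: str
    identity: str
    age: int

@dataclass
class ConferenceInfo:
    # Conference-level record mirroring the fields named in the description.
    start_time: str                      # e.g. ISO-8601 start time
    topic: str
    participants: list                   # list of ParticipantInfo
    profile: str = ""                    # brief profile of the conference
    agenda: list = field(default_factory=list)
    materials: list = field(default_factory=list)

info = ConferenceInfo(
    start_time="2023-07-17T09:00:00",
    topic="Quarterly review",
    participants=[ParticipantInfo("Alice", "manager", 35)],
)
```

A structure like this would be the "conference information" input passed to the interaction degree models in steps S3 and S4.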
Step S2, the spoken voice of the speaking user in a period of time and the shared video of the speaking user in a period of time are obtained.
The spoken voice of the speaking user over a period of time is the voice data obtained by recording the speaking user's voice while speaking. The period may be, for example, 5 seconds, 10 seconds, or 30 seconds.
The shared video of the speaking user over a period of time is obtained by recording the video picture while the speaking user shares video. The period may likewise be 5, 10, or 30 seconds. The periods of the spoken voice and the shared video may be the same or different; typically, a speaking user speaks while sharing video.
For the speaking users in each conference, their voice and video must be continuously recorded and stored during the conference, so that the spoken voice and shared video over a period of time can be obtained.
Step S3, determining, using a voice interaction degree model based on the spoken voice of the speaking user over a period of time and the conference information, the voice interaction degree between that spoken voice and a plurality of non-speaking users.
The voice interaction degree model is a long short-term memory (LSTM) neural network model; LSTMs are one implementation of artificial intelligence. An LSTM can process sequences of arbitrary length, capturing the relationships between earlier and later elements of the sequence and producing output based on them. Using an LSTM to process the spoken voice over a continuous period lets the model take into account the relationships among the voice at each point in time, making the output features more accurate and comprehensive. The voice interaction degree model can be trained on training samples by gradient descent.
The input of the voice interaction degree model comprises the spoken voice of the speaking user over a period of time and the conference information; its output is the voice interaction degree between that spoken voice and a plurality of non-speaking users.
The voice interaction degree between the spoken voice of the speaking user over a period of time and the plurality of non-speaking users represents how strongly that spoken voice interacts with each of the other, non-speaking users. The greater the voice interaction degree, the more likely it is that there is interaction between the speaking user's voice and a non-speaking user, and the more likely that non-speaking user needs to turn on the microphone to communicate with the speaking user. For example, if the spoken voice is "Next I would like to invite Xiao Liu to explain the PPT material", the interaction degree with Xiao Liu is high, and Xiao Liu's microphone needs to be turned on for voice communication. As another example, if the spoken voice is "Each student in study group A, please talk about your own learning experience", the interaction degree with each student in study group A is high, and each of their microphones needs to be turned on. The voice interaction degree may be a value between 0 and 1; the larger the value, the more likely interaction exists between the speaking user and the non-speaking user.
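The description does not disclose the model's architecture details or weights. As an illustration only, the following toy sketch runs a single scalar LSTM cell over a short sequence of hypothetical acoustic feature values and squashes the final hidden state into a [0, 1] interaction degree; every weight and feature value here is invented for the example, whereas a real voice interaction degree model would be trained by gradient descent as described above:

```python
import math

def lstm_step(x, h, c, w):
    # One LSTM cell step on a scalar input x with scalar hidden/cell
    # state (h, c), using the standard gate equations.
    sig = lambda z: 1.0 / (1.0 + math.exp(-z))
    i = sig(w["wi"] * x + w["ui"] * h + w["bi"])        # input gate
    f = sig(w["wf"] * x + w["uf"] * h + w["bf"])        # forget gate
    o = sig(w["wo"] * x + w["uo"] * h + w["bo"])        # output gate
    g = math.tanh(w["wg"] * x + w["ug"] * h + w["bg"])  # candidate state
    c = f * c + i * g
    h = o * math.tanh(c)
    return h, c

def voice_interaction_degree(frames, w):
    # Run the LSTM over a sequence of toy acoustic features and map the
    # final hidden state into a [0, 1] interaction degree via a sigmoid.
    h = c = 0.0
    for x in frames:
        h, c = lstm_step(x, h, c, w)
    return 1.0 / (1.0 + math.exp(-h))

# Placeholder weights (all 0.5) purely for demonstration.
w = {k: 0.5 for k in ["wi", "ui", "bi", "wf", "uf", "bf",
                      "wo", "uo", "bo", "wg", "ug", "bg"]}
degree = voice_interaction_degree([0.2, 0.8, 0.5], w)
```

The sequential dependence on `h` and `c` across frames is what lets an LSTM capture the relationships among the voice at each point in time mentioned above.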
Step S4, determining, using a video interaction degree model based on the shared video of the speaking user over a period of time and the conference information, the video interaction degree between that shared video and the plurality of non-speaking users.
The video interaction degree model is likewise an LSTM neural network model; its input is the shared video of the speaking user over a period of time and the conference information, and its output is the video interaction degree between that shared video and the plurality of non-speaking users.
The video interaction degree between the shared video of the speaking user over a period of time and the plurality of non-speaking users represents how strongly the shared video interacts with each of the other, non-speaking users. The greater the video interaction degree, the more likely it is that there is interaction between the shared video and a non-speaking user, and the more likely that user needs to turn on the microphone to communicate with the speaking user. For example, if the shared video over a period of time shows a clip of a particular non-speaking user, the interaction degree with that user is high, and that user's microphone needs to be turned on for voice communication. As another example, if the shared video shows a clip of three students in study group A, the interaction degree with those three students is high, and their microphones need to be turned on. The video interaction degree may be a value between 0 and 1; the larger the value, the more likely interaction exists between the shared video and the non-speaking user.
Step S5, determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree determined above.
In some embodiments, the voice interaction degree and the video interaction degree may each be assigned a different weight and then summed to obtain the association degree between the speaking user and the plurality of non-speaking users. For example, both interaction degrees may be given a weight of 0.5 and their weighted sum taken as the association degree.
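The weighted summation above can be sketched in a few lines; the user names and the equal 0.5/0.5 weighting are just the example values from the text:

```python
def association_degree(voice_deg, video_deg, w_voice=0.5, w_video=0.5):
    # Per-user weighted sum of the voice and video interaction degrees,
    # yielding the association degree with the speaking user.
    return {u: w_voice * voice_deg[u] + w_video * video_deg[u]
            for u in voice_deg}

voice = {"user_a": 0.9, "user_b": 0.1}
video = {"user_a": 0.7, "user_b": 0.2}
assoc = association_degree(voice, video)   # user_a ≈ 0.8, user_b ≈ 0.15
```

With both weights at 0.5 this is a plain average; unequal weights let one modality (voice or video) dominate the association degree.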
In some embodiments, the association degree between the speaking user and the plurality of non-speaking users may instead be determined from a preset correspondence between the voice interaction degree and the video interaction degree.
Step S6, determining a plurality of users to be muted based on the association degree between the speaking user and the plurality of non-speaking users, and muting those users.
In some embodiments, an association degree threshold may be set. A non-speaking user whose association degree is below the threshold is considered unrelated to the topic under discussion, or to have little need to speak, and may be muted. Muting includes turning off the user's microphone and setting the user to a muted state.
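The thresholding step can be sketched as a simple filter; the threshold value 0.3 and the user names are invented for the example:

```python
def users_to_mute(association, threshold=0.3):
    # Non-speaking users whose association degree with the speaking user
    # falls below the threshold are selected for muting.
    return [u for u, deg in sorted(association.items()) if deg < threshold]

association = {"user_a": 0.8, "user_b": 0.15, "user_c": 0.25}
muted = users_to_mute(association)   # ["user_b", "user_c"]
```

The resulting list would then be handed to whatever conference backend actually turns microphones off and marks the users as muted.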
In some embodiments, the conference remaining time may also be determined by a graph neural network model.
The graph neural network model includes a graph neural network (Graph Neural Network, GNN) and a full connectivity layer. A graph neural network is a neural network that acts directly on a graph, which is a data structure made up of two parts, nodes and edges. The graph neural network is one implementation of artificial intelligence.
The input of the graph neural network model comprises a plurality of nodes and a plurality of edges. The nodes comprise a speaking-user node and non-speaking-user nodes, and the edges represent the working relationships among the nodes; working relationships may include superior/subordinate, same team, different teams, and so on. Each node carries several node features. The speaking-user node's features include the spoken voice of the speaking user over a period of time, the shared video of the speaking user over that period, the conference information, and the speaking user's position. A non-speaking-user node's features include that user's voice interaction degree with the spoken voice, that user's video interaction degree with the shared video, and that user's position. The output of the graph neural network model is the conference remaining time, for example 20 minutes. By estimating the remaining time of the conference, users can plan their subsequent work in advance, improving the user experience.
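The disclosed graph model's architecture and weights are not given; as an illustration under that caveat, the following toy sketch performs one round of mean-neighbour message passing over scalar node features (standing in for the real feature vectors) followed by a linear readout over the graph mean. All weights, feature values, and the readout scale are invented placeholders:

```python
def gnn_remaining_minutes(node_feats, edges, w_self=0.6, w_neigh=0.4, w_out=30.0):
    # One message-passing round: each node mixes its own feature with the
    # mean of its neighbours' features along the (undirected) working-
    # relationship edges, then a linear readout over the graph mean
    # produces a remaining-time estimate in minutes.
    neigh = {n: [] for n in node_feats}
    for a, b in edges:
        neigh[a].append(node_feats[b])
        neigh[b].append(node_feats[a])
    updated = {
        n: w_self * x + w_neigh * (sum(neigh[n]) / len(neigh[n]) if neigh[n] else 0.0)
        for n, x in node_feats.items()
    }
    graph_mean = sum(updated.values()) / len(updated)
    return w_out * graph_mean

feats = {"speaker": 0.9, "listener_1": 0.4, "listener_2": 0.2}
edges = [("speaker", "listener_1"), ("speaker", "listener_2")]
minutes = gnn_remaining_minutes(feats, edges)   # ≈ 17.4 with these toy values
```

A trained GNN would of course use vector features, learned weight matrices, and typically several message-passing rounds before the readout.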
In some embodiments, the user mute rights can also be managed by the method shown in fig. 2.
Fig. 2 is a schematic flow chart of managing user mute authorities according to an embodiment of the present invention, and fig. 2 includes steps S21 to S23:
and S21, performing voice recognition on the spoken voice of the speaking user to obtain voice recognition characters.
The speech recognition method may be based on template matching, on a hidden Markov model, or on deep learning.
The speech recognition text is text obtained by performing speech recognition on the spoken speech.
Step S22, judging whether the speech recognition text includes the name of a non-speaking user.
If the speech recognition text includes the name of a non-speaking user, the speaking user is addressing that non-speaking user, and interaction between them is likely. For example, if the speech recognition text invites a named non-speaking user to explain the PPT next, there may be interaction between the speaking user and the user of that name.
Step S23, if the speech recognition text is judged to include the name of a non-speaking user, the non-speaking user corresponding to that name is unmuted.
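Steps S21 to S23 can be sketched as a simple name lookup over the recognized text; the names and the example sentence are invented, and a real system would match against the participant names from the conference information:

```python
def users_to_unmute(recognized_text, muted_users):
    # If the speech recognition text mentions a muted (non-speaking)
    # user's name, that user is selected for unmuting so they can reply.
    return [name for name in muted_users if name in recognized_text]

muted = ["Alice", "Bob", "Carol"]
text = "Next, I would like to invite Bob to explain the slides."
to_unmute = users_to_unmute(text, muted)   # ["Bob"]
```

Plain substring matching is only a sketch; in practice the recognizer's output would need normalization (case, homophones, partial names) before comparing against participant names.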
Based on the same inventive concept, fig. 3 is a schematic diagram of an artificial intelligence-based conference management system according to an embodiment of the present invention, where the artificial intelligence-based conference management system includes:
a first obtaining module 31, configured to obtain conference information;
a second obtaining module 32, configured to obtain the spoken voice of the speaking user during a period of time and the shared video of the speaking user during a period of time;
a voice determination module 33, configured to determine, using a voice interaction degree model based on the spoken voice of the speaking user over a period of time and the conference information, the voice interaction degree between that spoken voice and a plurality of non-speaking users;
a video determination module 34, configured to determine, using a video interaction degree model based on the shared video of the speaking user over a period of time and the conference information, the video interaction degree between that shared video and the plurality of non-speaking users;
an association degree determination module 35, configured to determine the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree;
and the muting module 36 is configured to determine a plurality of users to be muted based on the association degree between the speaking user and the plurality of non-speaking users, and mute the plurality of users to be muted.
Based on the same inventive concept, an embodiment of the present invention provides an electronic device, as shown in fig. 4, including:
a processor 41; a memory 42; and a computer program, wherein the computer program is stored in the memory 42 and configured to be executed by the processor 41 to implement the artificial-intelligence-based conference management method provided above, the method comprising: acquiring conference information; acquiring the spoken voice of a speaking user over a period of time and the shared video of the speaking user over that period; determining, using a voice interaction degree model based on the spoken voice and the conference information, the voice interaction degree between the spoken voice and a plurality of non-speaking users; determining, using a video interaction degree model based on the shared video and the conference information, the video interaction degree between the shared video and the plurality of non-speaking users; determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and determining a plurality of users to be muted based on that association degree, and muting the users to be muted.
Based on the same inventive concept, the present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by the processor 41, implements the artificial intelligence based conference management method provided above, the method comprising: acquiring conference information; acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in the period of time; determining, based on the spoken voice and the conference information, the voice interaction degree between the spoken voice and a plurality of non-speaking users by using a voice interaction degree model; determining, based on the shared video and the conference information, the video interaction degree between the shared video and the plurality of non-speaking users by using a video interaction degree model; determining the degree of association between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and determining a plurality of users to be muted based on the degree of association, and muting the plurality of users to be muted.
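The method recited above can be sketched end to end as follows. This is a minimal illustration, not the patented implementation: the two interaction-degree models are stubbed with fixed placeholder scores, and the user names, weights (`w_voice`, `w_video`), and mute threshold are assumptions not taken from the patent.

```python
# Illustrative sketch of the claimed pipeline. The interaction-degree
# models are stubbed with placeholder per-user scores; the weights and
# the mute threshold below are assumptions, not values from the patent.

def voice_interaction_degree(spoken_voice, conference_info):
    # Stub for the voice interaction degree model: one score in [0, 1]
    # per non-speaking user.
    return {"alice": 0.9, "bob": 0.2, "carol": 0.1}

def video_interaction_degree(shared_video, conference_info):
    # Stub for the video interaction degree model.
    return {"alice": 0.8, "bob": 0.3, "carol": 0.2}

def association_degrees(spoken_voice, shared_video, conference_info,
                        w_voice=0.6, w_video=0.4):
    voice = voice_interaction_degree(spoken_voice, conference_info)
    video = video_interaction_degree(shared_video, conference_info)
    # Weighted summation of the two interaction degrees per non-speaking
    # user, as in claim 3.
    return {u: w_voice * voice[u] + w_video * video[u] for u in voice}

def users_to_mute(assoc, threshold=0.5):
    # Mute every non-speaking user whose association degree falls below
    # the (assumed) threshold.
    return sorted(u for u, d in assoc.items() if d < threshold)

assoc = association_degrees(b"pcm-audio", b"h264-frames", {"topic": "budget"})
print(users_to_mute(assoc))  # ['bob', 'carol']
```

With the stub scores above, alice's association degree (0.86) stays above the threshold while bob (0.24) and carol (0.14) are selected for muting.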
Claims (8)
1. An artificial intelligence based conference management method, comprising:
acquiring conference information;
acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in the period of time;
determining, by using a voice interaction degree model, the voice interaction degree between the spoken voice of the speaking user in the period of time and a plurality of non-speaking users based on the spoken voice of the speaking user in the period of time and the conference information, wherein the voice interaction degree model is a long short-term memory (LSTM) neural network model, the input of the voice interaction degree model is the spoken voice of the speaking user in the period of time and the conference information, and the output of the voice interaction degree model is the voice interaction degree between the spoken voice of the speaking user in the period of time and the plurality of non-speaking users;
determining, by using a video interaction degree model, the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users based on the shared video of the speaking user in the period of time and the conference information, wherein the video interaction degree model is a long short-term memory (LSTM) neural network model, the input of the video interaction degree model is the shared video of the speaking user in the period of time and the conference information, and the output of the video interaction degree model is the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users;
determining the degree of association between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree;
and determining a plurality of users to be muted based on the degree of association between the speaking user and the plurality of non-speaking users, and muting the plurality of users to be muted.
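Claim 1 only specifies that each interaction degree model is a long short-term memory (LSTM) neural network that takes the speech or video plus the conference information as input and outputs per-user interaction degrees. A generic single-layer LSTM forward pass of the kind the claim names can be sketched as below; the hidden size, the randomly initialized weights, and the sigmoid scoring head are illustrative assumptions — the patent's description mentions training by gradient descent, whereas this sketch is untrained.

```python
import numpy as np

# Minimal single-cell LSTM forward pass, illustrating the kind of long
# short-term memory model the claim names. Feature extraction from the
# audio/video is out of scope; `features` is assumed to be a (T, D)
# sequence already derived from the speech/video and conference info.

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    # W: (4H, D), U: (4H, H), b: (4H,) — gates stacked as [i, f, o, g].
    H = h.shape[0]
    z = W @ x + U @ h + b
    i = sigmoid(z[0:H])        # input gate
    f = sigmoid(z[H:2*H])      # forget gate
    o = sigmoid(z[2*H:3*H])    # output gate
    g = np.tanh(z[3*H:4*H])    # candidate cell state
    c_new = f * c + i * g
    h_new = o * np.tanh(c_new)
    return h_new, c_new

def interaction_degree(features, n_users, hidden=8, seed=0):
    # Run the feature sequence through the LSTM and map the final hidden
    # state to one score in (0, 1) per non-speaking user. Weights are
    # random here purely for illustration; a real model would be trained.
    rng = np.random.default_rng(seed)
    D = features.shape[1]
    W = rng.standard_normal((4 * hidden, D)) * 0.1
    U = rng.standard_normal((4 * hidden, hidden)) * 0.1
    b = np.zeros(4 * hidden)
    head = rng.standard_normal((n_users, hidden)) * 0.1
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in features:
        h, c = lstm_step(x, h, c, W, U, b)
    return sigmoid(head @ h)

scores = interaction_degree(
    np.random.default_rng(1).standard_normal((20, 5)), n_users=3)
print(scores.shape)  # (3,)
```

Because the forget/input gates carry state across the whole sequence, an LSTM can weigh interaction cues from the entire period of speaking time rather than a single frame, which is presumably why the patent chooses this architecture.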
2. The artificial intelligence based conference management method of claim 1, wherein the method further comprises:
performing voice recognition on the spoken voice of the speaking user to obtain recognized text;
determining whether the recognized text includes the name of a non-speaking user;
and if the recognized text includes the name of a non-speaking user, unmuting the non-speaking user corresponding to that name.
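The unmute-on-mention step of claim 2 reduces to matching non-speaking users' names against the recognized text. A minimal sketch, assuming the ASR output is already available as a string and that matching is plain case-insensitive substring search (the patent does not specify the matching rule):

```python
# Sketch of claim 2: scan the recognized text of the speaking user's
# voice for muted users' names and report any that are mentioned, so
# the conference system can unmute them. The ASR step itself is assumed
# to have already produced `recognized_text`.

def users_to_unmute(recognized_text, muted_users):
    # Case-insensitive substring match against each muted user's name.
    text = recognized_text.lower()
    return [u for u in muted_users if u.lower() in text]

muted = ["Alice", "Bob", "Carol"]
print(users_to_unmute("Bob, could you share your screen?", muted))  # ['Bob']
```

A production system would likely match against fuzzy or phonetic variants of the name, since ASR output rarely reproduces names exactly; substring matching is only the simplest possible realization of the claimed step.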
3. The artificial intelligence based conference management method of claim 1, wherein the determining the degree of association between the speaking user and the plurality of non-speaking users based on the voice interaction degree between the spoken voice of the speaking user in the period of time and the plurality of non-speaking users and the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users comprises: assigning respective weights to the voice interaction degree and to the video interaction degree, and performing a weighted summation to obtain the degree of association between the speaking user and the plurality of non-speaking users.
4. An artificial intelligence based conference management system, comprising:
the first acquisition module is used for acquiring conference information;
the second acquisition module is used for acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in the period of time;
the voice determining module is used for determining, by using a voice interaction degree model, the voice interaction degree between the spoken voice of the speaking user in the period of time and a plurality of non-speaking users based on the spoken voice of the speaking user in the period of time and the conference information, wherein the voice interaction degree model is a long short-term memory (LSTM) neural network model, the input of the voice interaction degree model is the spoken voice of the speaking user in the period of time and the conference information, and the output of the voice interaction degree model is the voice interaction degree between the spoken voice of the speaking user in the period of time and the plurality of non-speaking users;
the video determining module is used for determining, by using a video interaction degree model, the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users based on the shared video of the speaking user in the period of time and the conference information, wherein the video interaction degree model is a long short-term memory (LSTM) neural network model, the input of the video interaction degree model is the shared video of the speaking user in the period of time and the conference information, and the output of the video interaction degree model is the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users;
the relevancy determination module is used for determining the degree of association between the speaking user and the plurality of non-speaking users based on the voice interaction degree between the spoken voice of the speaking user in the period of time and the plurality of non-speaking users and the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users;
and the mute module is used for determining a plurality of users to be muted based on the degree of association between the speaking user and the plurality of non-speaking users, and muting the plurality of users to be muted.
5. The artificial intelligence based meeting management system of claim 4, wherein the system is further configured to:
performing voice recognition on the spoken voice of the speaking user to obtain recognized text;
determining whether the recognized text includes the name of a non-speaking user;
and if the recognized text includes the name of a non-speaking user, unmuting the non-speaking user corresponding to that name.
6. The artificial intelligence based meeting management system of claim 4, wherein the relevancy determination module is further configured to: assign respective weights to the voice interaction degree between the spoken voice of the speaking user in the period of time and the plurality of non-speaking users and to the video interaction degree between the shared video of the speaking user in the period of time and the plurality of non-speaking users, and perform a weighted summation to obtain the degree of association between the speaking user and the plurality of non-speaking users.
7. An electronic device, comprising: a memory; a processor; a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the artificial intelligence based conference management method of any one of claims 1 to 3.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the artificial intelligence based conference management method as claimed in any one of claims 1 to 3.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875230.1A CN116633909B (en) | 2023-07-17 | 2023-07-17 | Conference management method and system based on artificial intelligence |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310875230.1A CN116633909B (en) | 2023-07-17 | 2023-07-17 | Conference management method and system based on artificial intelligence |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116633909A CN116633909A (en) | 2023-08-22 |
CN116633909B true CN116633909B (en) | 2023-12-19 |
Family
ID=87602804
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310875230.1A Active CN116633909B (en) | 2023-07-17 | 2023-07-17 | Conference management method and system based on artificial intelligence |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116633909B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108346034A (en) * | 2018-02-02 | 2018-07-31 | Shenzhen Yingshuo Technology Co., Ltd. | Intelligent conference management method and system |
CN108399923A (en) * | 2018-02-01 | 2018-08-14 | Shenzhen Yingshuo Technology Co., Ltd. | Speaker recognition method and device for multi-speaker conversations |
CN111694479A (en) * | 2020-06-11 | 2020-09-22 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Mute processing method and device in teleconference, electronic device and storage medium |
CN113920986A (en) * | 2021-09-29 | 2022-01-11 | Ping An Life Insurance Company of China, Ltd. | Conference record generation method, device, equipment and storage medium |
CN114727047A (en) * | 2021-01-07 | 2022-07-08 | Meta Platforms, Inc. | System and method for resolving overlapping speech in a communication session |
CN115831155A (en) * | 2021-09-16 | 2023-03-21 | Tencent Technology (Shenzhen) Co., Ltd. | Audio signal processing method and device, electronic equipment and storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10776073B2 (en) * | 2018-10-08 | 2020-09-15 | Nuance Communications, Inc. | System and method for managing a mute button setting for a conference call |
US11838340B2 (en) * | 2021-09-20 | 2023-12-05 | International Business Machines Corporation | Dynamic mute control for web conferencing |
- 2023-07-17: application CN202310875230.1A filed (CN); granted as patent CN116633909B, status Active
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108399923A (en) * | 2018-02-01 | 2018-08-14 | Shenzhen Yingshuo Technology Co., Ltd. | Speaker recognition method and device for multi-speaker conversations |
CN108346034A (en) * | 2018-02-02 | 2018-07-31 | Shenzhen Yingshuo Technology Co., Ltd. | Intelligent conference management method and system |
CN111694479A (en) * | 2020-06-11 | 2020-09-22 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Mute processing method and device in teleconference, electronic device and storage medium |
CN114727047A (en) * | 2021-01-07 | 2022-07-08 | Meta Platforms, Inc. | System and method for resolving overlapping speech in a communication session |
CN115831155A (en) * | 2021-09-16 | 2023-03-21 | Tencent Technology (Shenzhen) Co., Ltd. | Audio signal processing method and device, electronic equipment and storage medium |
CN113920986A (en) * | 2021-09-29 | 2022-01-11 | Ping An Life Insurance Company of China, Ltd. | Conference record generation method, device, equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
3GPP, SP-030434, 3GPP tsg_sa\TSG_SA, 2003, (TSGS_21), full text. * |
Research and implementation of a real-time echo cancellation algorithm for conference calls; Chen Lin; China Master's Theses Full-Text Database; full text * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10915570B2 (en) | Personalized meeting summaries | |
US10431205B2 (en) | Dialog device with dialog support generated using a mixture of language models combined using a recurrent neural network | |
US12002464B2 (en) | Systems and methods for recognizing a speech of a speaker | |
US9495350B2 (en) | System and method for determining expertise through speech analytics | |
CN107818798A (en) | Customer service quality evaluating method, device, equipment and storage medium | |
US9210269B2 (en) | Active speaker indicator for conference participants | |
US10652655B1 (en) | Cognitive volume and speech frequency levels adjustment | |
US20150154960A1 (en) | System and associated methodology for selecting meeting users based on speech | |
US10699709B2 (en) | Conference call analysis and automated information exchange | |
CN113228074A (en) | Urgency and emotional state matching for automatic scheduling by artificial intelligence | |
CN111556279A (en) | Monitoring method and communication method of instant session | |
CN114727047A (en) | System and method for resolving overlapping speech in a communication session | |
CN116569197A (en) | User promotion in collaboration sessions | |
CN111696538A (en) | Voice processing method, apparatus and medium | |
Soofastaei | Introductory chapter: Virtual assistants | |
CN109961152B (en) | Personalized interaction method and system of virtual idol, terminal equipment and storage medium | |
CN116633909B (en) | Conference management method and system based on artificial intelligence | |
CN113539261A (en) | Man-machine voice interaction method and device, computer equipment and storage medium | |
CN111756939A (en) | Online voice control method and device and computer equipment | |
CN110865789A (en) | Method and system for intelligently starting microphone based on voice recognition | |
WO2023040456A1 (en) | Dynamic mute control for web conferencing | |
Palinko et al. | How should a robot interrupt a conversation between multiple humans | |
KR102408455B1 (en) | Voice data synthesis method for speech recognition learning, and computer program recorded on record-medium for executing method therefor | |
Bumbalek et al. | Cloud-based assistive speech-transcription services | |
CN115294987A (en) | Conference record generation method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right |
Effective date of registration: 20231127 Address after: 363000 Factory Building 3 Nanpu Road, Xiangcheng District, Zhangzhou City, Fujian Province (2nd Floor, Building C) Applicant after: Fujian Yizhaoguang Intelligent Equipment Co.,Ltd. Address before: No. 3001A, Floor 4, Building 1-2, No. 69, Junlong Street, Jinjiang District, Chengdu, Sichuan 610000 (self number) Applicant before: Chengdu Haojie Technology Co.,Ltd. |
GR01 | Patent grant | ||