CN116633909B - Conference management method and system based on artificial intelligence - Google Patents

Conference management method and system based on artificial intelligence

Info

Publication number
CN116633909B
CN116633909B (application CN202310875230.1A)
Authority
CN
China
Prior art keywords
speaking
period
voice
users
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310875230.1A
Other languages
Chinese (zh)
Other versions
CN116633909A (en)
Inventor
Li Yuan (李源)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujian Yizhaoguang Intelligent Equipment Co ltd
Original Assignee
Fujian Yizhaoguang Intelligent Equipment Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fujian Yizhaoguang Intelligent Equipment Co ltd
Priority to CN202310875230.1A
Publication of CN116633909A
Application granted
Publication of CN116633909B
Legal status: Active


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/10 Office automation; Time management
    • G06Q10/109 Time management, e.g. calendars, reminders, meetings or time accounting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • H04L65/4038 Arrangements for multi-party communication, e.g. for conferences with floor control
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 Responding to QoS
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Strategic Management (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Tourism & Hospitality (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • General Business, Economics & Management (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Operations Research (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The conference management method and system based on artificial intelligence provided by the invention determine, by using a voice interaction degree model, the voice interaction degree between a speaking user's spoken voice over a period of time and a plurality of non-speaking users; determine, by using a video interaction degree model and based on the speaking user's shared video over a period of time and the conference information, the video interaction degree between that shared video and the non-speaking users; determine the association degree between the speaking user and the non-speaking users from these voice and video interaction degrees; determine a plurality of users to be muted based on the association degrees; and mute the users to be muted.

Description

Conference management method and system based on artificial intelligence
Technical Field
The invention relates to the technical field of conference management, in particular to a conference management method and system based on artificial intelligence.
Background
With the development of technology, more and more users choose to hold teleconferences through mobile applications. Because a teleconference has many participants, leaving every participant's microphone open introduces considerable noise interference that disrupts the conference. An administrator therefore closes the microphones of participants who are not speaking, preventing environmental noise from being transmitted into the conference and interfering with it.
Therefore, how to quickly manage the speaking rights of participants and improve the user experience is the problem to be solved at present.
Disclosure of Invention
The invention mainly solves the technical problem of how to rapidly manage the speaking rights of participants and improve the user experience.
According to a first aspect, the present invention provides an artificial intelligence based conference management method, including: acquiring conference information; acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in a period of time; determining the voice interaction degree of the spoken voices of the speaking user and a plurality of non-speaking users in a period of time by using a voice interaction degree model based on the spoken voices of the speaking user in the period of time and the conference information; determining the video interaction degree of the shared video of the speaking user in a period of time and the plurality of non-speaking users by using a video interaction degree model based on the shared video of the speaking user in a period of time and the conference information; determining the association degree of the speaking user and a plurality of non-speaking users based on the voice interaction degree of the speaking voice and the plurality of non-speaking users in a period of time of the speaking user and the video interaction degree of the shared video and the plurality of non-speaking users in a period of time of the speaking user; and determining a plurality of users to be muted based on the association degree of the speaking user and the plurality of non-speaking users, and carrying out mute processing on the plurality of users to be muted.
Still further, the method further comprises: performing speech recognition on the spoken voice of the speaking user to obtain speech recognition text; judging whether the speech recognition text includes the name of a non-speaking user; and if the speech recognition text includes the name of a non-speaking user, unmuting the non-speaking user corresponding to that name.
Further, the voice interaction degree model is a long short-term memory (LSTM) neural network model; its input is the spoken voice of the speaking user over a period of time together with the conference information, and its output is the voice interaction degree between that spoken voice and a plurality of non-speaking users. The video interaction degree model is likewise an LSTM neural network model; its input is the shared video of the speaking user over a period of time together with the conference information, and its output is the video interaction degree between that shared video and the plurality of non-speaking users.
Still further, the determining the association degree of the speaking user with the plurality of non-speaking users based on the voice interaction degree of the speaking voice with the plurality of non-speaking users in the period of time of the speaking user and the video interaction degree of the shared video with the plurality of non-speaking users in the period of time of the speaking user includes: and respectively giving different weights to the voice interaction degree of the spoken voice of the speaking user and the plurality of non-speaking users in a period of time and the video interaction degree of the shared video of the speaking user and the plurality of non-speaking users in a period of time, and then carrying out weighted summation to obtain the association degree of the speaking user and the plurality of non-speaking users.
According to a second aspect, the present invention provides an artificial intelligence based conference management system comprising: the first acquisition module is used for acquiring conference information; the second acquisition module is used for acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in a period of time; the voice determining module is used for determining the voice interaction degree of the spoken voices of the speaking user and a plurality of non-speaking users in a period of time based on the spoken voices of the speaking user in the period of time and the conference information by using a voice interaction degree model; the video determining module is used for determining the video interaction degree of the shared video of the speaking user in a period of time and the plurality of non-speaking users by using a video interaction degree model based on the shared video of the speaking user in a period of time and the conference information; the relevancy determination module is used for determining relevancy between the speaking user and the plurality of non-speaking users based on voice interaction degrees of the speaking voice and the plurality of non-speaking users in a period of time of the speaking user and video interaction degrees of the shared video and the plurality of non-speaking users in a period of time of the speaking user; and the mute module is used for determining a plurality of users to be muted based on the association degree of the speaking user and the plurality of non-speaking users and carrying out mute processing on the plurality of users to be muted.
Still further, the system is further configured to: perform speech recognition on the spoken voice of the speaking user to obtain speech recognition text; judge whether the speech recognition text includes the name of a non-speaking user; and if the speech recognition text includes the name of a non-speaking user, unmute the non-speaking user corresponding to that name.
Further, the voice interaction degree model is a long short-term memory (LSTM) neural network model; its input is the spoken voice of the speaking user over a period of time together with the conference information, and its output is the voice interaction degree between that spoken voice and a plurality of non-speaking users. The video interaction degree model is likewise an LSTM neural network model; its input is the shared video of the speaking user over a period of time together with the conference information, and its output is the video interaction degree between that shared video and the plurality of non-speaking users.
Furthermore, the association degree determination module is further configured to assign different weights to the voice interaction degree between the speaking user's spoken voice over a period of time and the plurality of non-speaking users and to the video interaction degree between the speaking user's shared video over a period of time and the plurality of non-speaking users, and then perform a weighted summation to obtain the association degree between the speaking user and the plurality of non-speaking users.
According to a third aspect, the present invention provides an electronic device comprising: a memory; a processor; a computer program; wherein the computer program is stored in the memory and configured to be executed by the processor to implement the method described above.
According to a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a method as in any of the above aspects.
In summary, the conference management method and system based on artificial intelligence provided by the invention determine, by using a voice interaction degree model, the voice interaction degree between a speaking user's spoken voice over a period of time and a plurality of non-speaking users; determine, by using a video interaction degree model and based on the speaking user's shared video over a period of time and the conference information, the video interaction degree between that shared video and the non-speaking users; determine the association degree between the speaking user and the non-speaking users from these voice and video interaction degrees; determine a plurality of users to be muted based on the association degrees; and mute the users to be muted.
Drawings
Fig. 1 is a schematic flow chart of a conference management method based on artificial intelligence according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of managing user mute permissions according to an embodiment of the present invention;
fig. 3 is a schematic diagram of an artificial intelligence-based conference management system according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
In an embodiment of the present invention, there is provided an artificial intelligence based conference management method as shown in fig. 1, which includes steps S1 to S6:
step S1, meeting information is obtained.
Meeting information includes the start time of the meeting, the topic of the meeting, information of participating users, the profile of the meeting, the agenda of the meeting, the material of the introduction to the meeting, etc. The information of the participating users includes user identity, user age, etc.
Step S2, the spoken voice of the speaking user in a period of time and the shared video of the speaking user in a period of time are obtained.
The spoken voice of the speaking user over a period of time is the voice data obtained by recording the speaking user while speaking. The period may be, for example, 5 seconds, 10 seconds, or 30 seconds.
The shared video of the speaking user over a period of time is obtained by recording the speaking user's video frames while the video is being shared. This period may likewise be 5 seconds, 10 seconds, 30 seconds, and so on. The periods covered by the spoken voice and the shared video may be the same or different; for example, the speaking user typically speaks while sharing the video.
For the speaking users in each conference, their voice and video need to be recorded and stored continuously during the conference, so that the spoken voice and shared video of a speaking user over a period of time can be obtained.
And step S3, determining the voice interaction degree of the spoken voices of the speaking user and a plurality of non-speaking users in a period of time by using a voice interaction degree model based on the spoken voices of the speaking user in a period of time and the conference information.
The voice interaction degree model is a long short-term memory (LSTM) neural network model, one implementation of artificial intelligence. An LSTM network can process sequences of arbitrary length, capture sequential information, and produce outputs based on the relationships between earlier and later data in the sequence. Using an LSTM model to process the spoken voice over a continuous period therefore takes into account the relationships among the voice at each time point, making the output features more accurate and comprehensive. The voice interaction degree model can be trained on training samples by gradient descent.
The input of the voice interaction degree model comprises the spoken voices of the speaking users in a period of time and the conference information, and the output of the voice interaction degree model is the voice interaction degree of the spoken voices of the speaking users in a period of time and a plurality of non-speaking users.
The voice interaction degree between the speaking user's spoken voice over a period of time and the plurality of non-speaking users represents how strongly that spoken voice interacts with each of the other, non-speaking users. The greater the voice interaction degree, the more likely there is interaction between the speaking user and a non-speaking user, and the more likely that non-speaking user needs to turn on the microphone to communicate with the speaking user. For example, if the spoken voice is "Next, I would like to invite Xiao Liu to explain the PPT material", the interaction degree with Xiao Liu is large, and Xiao Liu's microphone needs to be turned on for voice communication. As another example, if the spoken voice is "Each student in learning group A, please talk about your own learning experience", the interaction degree with every student in learning group A is large, and each of their microphones needs to be turned on for voice communication. The voice interaction degree can be a value between 0 and 1; the larger the value, the more likely interaction exists between the speaking user's spoken voice and the non-speaking user.
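As a rough illustration of the kind of model step S3 describes, the sketch below runs a minimal LSTM cell (implemented directly in NumPy) over a sequence of feature frames and maps the final hidden state to one interaction degree per non-speaking user. The feature dimensions, random weights, and sigmoid read-out are illustrative placeholders, not the patent's trained model:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class TinyLSTM:
    """Single LSTM cell plus a sigmoid read-out that maps the final hidden
    state to one interaction degree per non-speaking user (values in (0, 1))."""

    def __init__(self, n_in, n_hidden, n_users):
        # One stacked weight matrix for the input, forget, cell and output gates.
        self.W = rng.standard_normal((4 * n_hidden, n_in + n_hidden)) * 0.1
        self.b = np.zeros(4 * n_hidden)
        self.W_out = rng.standard_normal((n_users, n_hidden)) * 0.1
        self.n_hidden = n_hidden

    def __call__(self, xs):
        h = np.zeros(self.n_hidden)
        c = np.zeros(self.n_hidden)
        for x in xs:  # xs: sequence of per-frame feature vectors
            z = self.W @ np.concatenate([x, h]) + self.b
            i, f, g, o = np.split(z, 4)
            c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
            h = sigmoid(o) * np.tanh(c)
        # Sigmoid keeps every interaction degree between 0 and 1, as in the text.
        return sigmoid(self.W_out @ h)

# 20 frames of 8-dim features (e.g. concatenated speech + conference-info features).
frames = rng.standard_normal((20, 8))
model = TinyLSTM(n_in=8, n_hidden=16, n_users=3)
degrees = model(frames)  # one interaction degree per non-speaking user
```

In a real system the frame features would come from an audio front end and the weights from gradient-descent training, as the description notes; the sequence loop above is what lets the model relate earlier and later data in the recording.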
And step S4, determining the video interaction degree between the shared video of the speaking user over a period of time and the plurality of non-speaking users by using a video interaction degree model, based on that shared video and the conference information.
The video interaction degree model is an LSTM neural network model; its input is the shared video of the speaking user over a period of time and the conference information, and its output is the video interaction degree between that shared video and the plurality of non-speaking users.
The video interaction degree between the speaking user's shared video over a period of time and the plurality of non-speaking users represents how strongly that shared video interacts with each of the other, non-speaking users. The greater the video interaction degree, the more likely there is interaction between the shared video and a non-speaking user, and the more likely that non-speaking user needs to turn on the microphone to communicate with the speaking user. For example, if the shared video is a video clip of Xiao Zhang, the interaction degree with Xiao Zhang is large, and Xiao Zhang's microphone needs to be turned on for voice communication. As another example, if the shared video is a video clip of three students in learning group A, the interaction degree with those three students is large, and their microphones need to be turned on for voice communication. The video interaction degree can be a value between 0 and 1; the larger the value, the more likely interaction exists between the speaking user's shared video and the non-speaking user.
And S5, determining the association degree of the speaking user and the plurality of non-speaking users based on the voice interaction degree of the speaking voice of the speaking user and the plurality of non-speaking users and the video interaction degree of the shared video of the speaking user and the plurality of non-speaking users.
In some embodiments, different weights may be respectively given to the voice interaction degree of the spoken voice and the plurality of non-speaking users in a period of time of the speaking user and the video interaction degree of the shared video and the plurality of non-speaking users in a period of time of the speaking user, and then the association degree of the speaking user and the plurality of non-speaking users is obtained after weighted summation. For example, the voice interaction degree and the video interaction degree may be respectively given a weight of 0.5, and then weighted and summed to obtain the association degree between the speaking user and the plurality of non-speaking users.
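The weighted summation described above can be sketched as follows; the 0.5/0.5 weights follow the example in the text, while the function name and user labels are illustrative:

```python
def association_degree(voice_deg, video_deg, w_voice=0.5, w_video=0.5):
    """Weighted sum of per-user voice and video interaction degrees.

    voice_deg / video_deg: dicts mapping non-speaking user -> degree in [0, 1].
    Returns a dict mapping each non-speaking user -> association degree.
    """
    return {u: w_voice * voice_deg[u] + w_video * video_deg.get(u, 0.0)
            for u in voice_deg}

voice = {"user_a": 0.9, "user_b": 0.2}
video = {"user_a": 0.7, "user_b": 0.1}
assoc = association_degree(voice, video)  # user_a -> 0.8, user_b -> 0.15
```

Different weights could favour the modality that is more reliable for a given conference, e.g. a higher voice weight when no video is being shared.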
In some embodiments, the association degree between the speaking user and the plurality of non-speaking users may instead be determined from a preset lookup relationship between the voice interaction degree and the video interaction degree.
And S6, determining a plurality of users to be muted based on the association degree of the speaking user and the plurality of users not speaking, and carrying out mute processing on the plurality of users to be muted.
In some embodiments, an association threshold may be set. If a non-speaking user's association degree is below the threshold, that user is unrelated to the topic under discussion or has a low speaking priority, and may be muted. Muting includes turning off the user's microphone and setting the user to a mute state.
In some embodiments, the conference remaining time may also be determined by a graph neural network model.
The graph neural network model includes a graph neural network (Graph Neural Network, GNN) and a fully connected layer. A graph neural network is a neural network that operates directly on a graph, a data structure made up of two parts: nodes and edges. The graph neural network is one implementation of artificial intelligence.
The input of the graph neural network model comprises a plurality of nodes and a plurality of edges. The nodes include a speaking user node and non-speaking user nodes, and the edges represent the job relationships among the nodes. Each node carries several node features: the speaking user node's features include the speaking user's spoken voice over a period of time, the shared video over that period, the conference information, and the speaking user's position; a non-speaking user node's features include that user's voice interaction degree with the spoken voice, the user's video interaction degree with the shared video, and the non-speaking user's position. The output of the graph neural network model is the conference remaining time. Job relationships may include superior and subordinate, same team, different teams, and so on. As an example, the conference remaining time may be 20 minutes. By estimating the remaining conference time, users can plan their follow-up work in advance, improving the user experience.
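As a rough sketch of how a graph neural network could map such a meeting graph to a remaining-time estimate, the toy example below performs one round of mean-neighbour message passing over the user nodes and a linear read-out over the pooled embeddings. The weights, dimensions, and single-round architecture are illustrative assumptions, not the patented model:

```python
import numpy as np

rng = np.random.default_rng(1)

def remaining_time_gnn(features, adjacency, W_msg, w_out):
    """One round of mean-neighbour message passing, then a fully connected
    read-out over the pooled graph embedding; returns a non-negative minutes estimate."""
    deg = adjacency.sum(axis=1, keepdims=True)
    neighbour_mean = adjacency @ features / np.maximum(deg, 1)
    # Update each node with its own features plus its neighbours' mean.
    hidden = np.tanh((features + neighbour_mean) @ W_msg)
    graph_embedding = hidden.mean(axis=0)  # pool all user nodes
    return float(np.maximum(graph_embedding @ w_out, 0.0))

n_nodes, n_feat, n_hidden = 4, 6, 8  # 1 speaking user + 3 non-speaking users
features = rng.standard_normal((n_nodes, n_feat))        # per-node features
adjacency = np.ones((n_nodes, n_nodes)) - np.eye(n_nodes)  # edges: job relationships
W_msg = rng.standard_normal((n_feat, n_hidden)) * 0.5
w_out = rng.standard_normal(n_hidden)

minutes_left = remaining_time_gnn(features, adjacency, W_msg, w_out)
```

In the patent's formulation the node features would be the interaction degrees, voice/video data, and positions listed above, and the edge weights could encode the kind of job relationship rather than a plain 0/1 adjacency.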
In some embodiments, user mute permissions can also be managed by the method shown in fig. 2.
Fig. 2 is a schematic flow chart of managing user mute permissions according to an embodiment of the present invention; fig. 2 includes steps S21 to S23:
and S21, performing voice recognition on the spoken voice of the speaking user to obtain voice recognition characters.
The speech recognition mode can comprise a speech recognition method based on template matching, a speech recognition method based on a hidden Markov model and a speech recognition method based on deep learning.
The speech recognition text is text obtained by performing speech recognition on the spoken speech.
Step S22, judging whether the speech recognition text includes the name of a non-speaking user.
If the speech recognition text includes a non-speaking user's name, it indicates that the speaking user is addressing that non-speaking user and that there may be interaction between them. For example, if the recognized text is "Next, I invite Zhang San to explain the PPT.", there may be interaction between the speaking user and Zhang San.
In step S23, if the speech recognition text is judged to include the name of a non-speaking user, the non-speaking user corresponding to that name is unmuted.
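Once speech recognition text is available, the unmute rule of steps S21 to S23 reduces to a membership check of the muted users' names against the recognized text; the names and helper below are illustrative:

```python
def users_to_unmute(recognized_text, muted_users):
    """Unmute any muted user whose name appears in the recognized speech text."""
    return [u for u in muted_users if u in recognized_text]

muted = ["Zhang San", "Li Si", "Xiao Liu"]
text = "Next, I invite Zhang San to explain the PPT."
unmute = users_to_unmute(text, muted)  # only Zhang San is addressed by name
```

A production system would want fuzzier matching (homophones, nicknames, recognition errors), but the control flow is the same: recognize, search for names, lift the mute.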
Based on the same inventive concept, fig. 3 is a schematic diagram of an artificial intelligence-based conference management system according to an embodiment of the present invention, where the artificial intelligence-based conference management system includes:
a first obtaining module 31, configured to obtain conference information;
a second obtaining module 32, configured to obtain the spoken voice of the speaking user during a period of time and the shared video of the speaking user during a period of time;
a voice determination module 33, configured to determine, by using a voice interaction degree model and based on the spoken voice of the speaking user over a period of time and the conference information, the voice interaction degree between that spoken voice and a plurality of non-speaking users;
a video determination module 34, configured to determine, by using a video interaction degree model and based on the shared video of the speaking user over a period of time and the conference information, the video interaction degree between that shared video and the plurality of non-speaking users;
an association degree determination module 35, configured to determine the association degree between the speaking user and a plurality of non-speaking users based on the voice interaction degree between the speaking user's spoken voice over a period of time and the non-speaking users and the video interaction degree between the speaking user's shared video over a period of time and the non-speaking users;
and the muting module 36 is configured to determine a plurality of users to be muted based on the association degree between the speaking user and the plurality of non-speaking users, and mute the plurality of users to be muted.
Based on the same inventive concept, an embodiment of the present invention provides an electronic device, as shown in fig. 4, comprising: a processor 41; a memory 42; and a computer program, wherein the computer program is stored in the memory 42 and configured to be executed by the processor 41 to implement the artificial intelligence based conference management method provided above, the method comprising: acquiring conference information; acquiring the spoken voice of the speaking user over a period of time and the shared video of the speaking user over a period of time; determining, by using a voice interaction degree model and based on the spoken voice and the conference information, the voice interaction degree between the spoken voice and a plurality of non-speaking users; determining, by using a video interaction degree model and based on the shared video and the conference information, the video interaction degree between the shared video and the plurality of non-speaking users; determining the association degree between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and determining a plurality of users to be muted based on the association degrees, and muting the users to be muted.
Based on the same inventive concept, the present embodiment provides a computer-readable storage medium having stored thereon a computer program which, when executed by the processor 41, implements the artificial intelligence based conference management method provided above, the method comprising: acquiring conference information; acquiring the speaking user's spoken voice over a period of time and the speaking user's shared video over the period of time; determining, using a voice interaction degree model, the degree of voice interaction between the speaking user's spoken voice over the period of time and a plurality of non-speaking users, based on that spoken voice and the conference information; determining, using a video interaction degree model, the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users, based on that shared video and the conference information; determining the degree of association between the speaking user and the plurality of non-speaking users based on the voice interaction degree and the video interaction degree; and determining a plurality of users to be muted based on the degree of association between the speaking user and the plurality of non-speaking users, and muting the users to be muted.
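The association and muting steps above (elsewhere specified as a weighted summation of the two per-user interaction degrees, followed by selecting the users to be muted) can be sketched as follows. The 0.6/0.4 weights and the 0.5 mute threshold are illustrative assumptions; the patent only requires that different weights be assigned and a weighted summation performed:

```python
VOICE_W, VIDEO_W = 0.6, 0.4     # assumed weights for the weighted summation
MUTE_THRESHOLD = 0.5            # assumed cut-off for "users to be muted"

def association_degrees(voice_deg, video_deg):
    """Weighted summation of per-user voice and video interaction degrees."""
    return {u: VOICE_W * voice_deg[u] + VIDEO_W * video_deg[u]
            for u in voice_deg}

def users_to_mute(assoc):
    """Non-speaking users whose association falls below the threshold."""
    return sorted(u for u, a in assoc.items() if a < MUTE_THRESHOLD)

voice = {"alice": 0.9, "bob": 0.2, "carol": 0.4}
video = {"alice": 0.8, "bob": 0.3, "carol": 0.9}
assoc = association_degrees(voice, video)
print(users_to_mute(assoc))  # → ['bob']
```

Here "bob" scores 0.6·0.2 + 0.4·0.3 = 0.24 and is muted, while "carol" is kept unmuted because her high video interaction compensates for low voice interaction, which is exactly the benefit of combining the two degrees.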

Claims (8)

1. An artificial intelligence based conference management method, comprising:
acquiring conference information;
acquiring the spoken voice of the speaking user in a period of time and the shared video of the speaking user in a period of time;
determining, based on the speaking user's spoken voice over a period of time and the conference information, the degree of voice interaction between the spoken voice and a plurality of non-speaking users by using a voice interaction degree model, wherein the voice interaction degree model is a long short-term memory (LSTM) neural network model, the input of the voice interaction degree model is the speaking user's spoken voice over the period of time and the conference information, and the output of the voice interaction degree model is the degree of voice interaction between the speaking user's spoken voice over the period of time and the plurality of non-speaking users;
determining, based on the speaking user's shared video over the period of time and the conference information, the degree of video interaction between the shared video and the plurality of non-speaking users by using a video interaction degree model, wherein the video interaction degree model is a long short-term memory (LSTM) neural network model, the input of the video interaction degree model is the speaking user's shared video over the period of time and the conference information, and the output of the video interaction degree model is the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users;
determining the degree of association between the speaking user and the plurality of non-speaking users based on the degree of voice interaction between the speaking user's spoken voice over the period of time and the plurality of non-speaking users and the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users;
and determining a plurality of users to be muted based on the degree of association between the speaking user and the plurality of non-speaking users, and muting the users to be muted.
2. The artificial intelligence based conference management method of claim 1, wherein the method further comprises:
performing speech recognition on the speaking user's spoken voice to obtain recognized text;
determining whether the recognized text includes the name of a non-speaking user;
and if the recognized text includes the name of a non-speaking user, unmuting the non-speaking user corresponding to that name.
3. The artificial intelligence based conference management method of claim 1, wherein the determining of the degree of association of the speaking user with the plurality of non-speaking users based on the degree of voice interaction and the degree of video interaction comprises: assigning different weights to the degree of voice interaction between the speaking user's spoken voice over the period of time and the plurality of non-speaking users and to the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users, and then performing a weighted summation to obtain the degree of association between the speaking user and the plurality of non-speaking users.
4. An artificial intelligence based conference management system, comprising:
a first acquisition module, configured to acquire conference information;
a second acquisition module, configured to acquire the speaking user's spoken voice over a period of time and the speaking user's shared video over the period of time;
a voice determining module, configured to determine, based on the speaking user's spoken voice over the period of time and the conference information, the degree of voice interaction between the spoken voice and a plurality of non-speaking users by using a voice interaction degree model, wherein the voice interaction degree model is a long short-term memory (LSTM) neural network model, the input of the voice interaction degree model is the speaking user's spoken voice over the period of time and the conference information, and the output of the voice interaction degree model is the degree of voice interaction between the spoken voice and the plurality of non-speaking users;
a video determining module, configured to determine, based on the speaking user's shared video over the period of time and the conference information, the degree of video interaction between the shared video and the plurality of non-speaking users by using a video interaction degree model, wherein the video interaction degree model is a long short-term memory (LSTM) neural network model, the input of the video interaction degree model is the speaking user's shared video over the period of time and the conference information, and the output of the video interaction degree model is the degree of video interaction between the shared video and the plurality of non-speaking users;
a relevancy determination module, configured to determine the degree of association between the speaking user and the plurality of non-speaking users based on the degree of voice interaction between the speaking user's spoken voice over the period of time and the plurality of non-speaking users and the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users;
and a mute module, configured to determine a plurality of users to be muted based on the degree of association between the speaking user and the plurality of non-speaking users, and to mute the users to be muted.
5. The artificial intelligence based meeting management system of claim 4, wherein the system is further configured to:
perform speech recognition on the speaking user's spoken voice to obtain recognized text;
determine whether the recognized text includes the name of a non-speaking user;
and if the recognized text includes the name of a non-speaking user, unmute the non-speaking user corresponding to that name.
6. The artificial intelligence based meeting management system of claim 4, wherein the relevancy determination module is further configured to: assign different weights to the degree of voice interaction between the speaking user's spoken voice over the period of time and the plurality of non-speaking users and to the degree of video interaction between the speaking user's shared video over the period of time and the plurality of non-speaking users, and then perform a weighted summation to obtain the degree of association between the speaking user and the plurality of non-speaking users.
7. An electronic device, comprising: a memory; a processor; and a computer program, wherein the computer program is stored in the memory and configured to be executed by the processor to implement the artificial intelligence based conference management method of any one of claims 1 to 3.
8. A computer readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the artificial intelligence based conference management method of any one of claims 1 to 3.
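The name-based unmute flow of claims 2 and 5 (recognize the speaking user's speech, check the recognized text for a non-speaking user's name, and unmute that user) can be sketched as below. The speech recognizer is stubbed out as plain text input, and the case-insensitive substring match is an illustrative assumption; a real system would call an ASR engine and a proper name matcher:

```python
def names_to_unmute(recognized_text, nonspeaking_users, muted):
    """Return the muted non-speaking users whose names appear in the
    speech-recognized text (simple case-insensitive match)."""
    return [u for u in nonspeaking_users
            if u in muted and u.lower() in recognized_text.lower()]

muted = {"Bob", "Carol"}
text = "I'd like to hear what Bob thinks about the schedule."
print(names_to_unmute(text, ["Alice", "Bob", "Carol"], muted))  # → ['Bob']
```

This complements the association-based muting of claim 1: a user muted for low association is immediately unmuted once the speaker addresses them by name.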
CN202310875230.1A 2023-07-17 2023-07-17 Conference management method and system based on artificial intelligence Active CN116633909B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310875230.1A CN116633909B (en) 2023-07-17 2023-07-17 Conference management method and system based on artificial intelligence

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310875230.1A CN116633909B (en) 2023-07-17 2023-07-17 Conference management method and system based on artificial intelligence

Publications (2)

Publication Number Publication Date
CN116633909A CN116633909A (en) 2023-08-22
CN116633909B true CN116633909B (en) 2023-12-19

Family

ID=87602804

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310875230.1A Active CN116633909B (en) 2023-07-17 2023-07-17 Conference management method and system based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116633909B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108346034A (en) * 2018-02-02 2018-07-31 Shenzhen Yingshuo Technology Co., Ltd. Intelligent conference management method and system
CN108399923A (en) * 2018-02-01 2018-08-14 Shenzhen Yingshuo Technology Co., Ltd. Method and device for identifying the speaker in multi-person speech
CN111694479A (en) * 2020-06-11 2020-09-22 Beijing Baidu Netcom Science and Technology Co., Ltd. Mute processing method and device in teleconference, electronic device and storage medium
CN113920986A (en) * 2021-09-29 2022-01-11 Ping An Life Insurance Company of China, Ltd. Conference record generation method, device, equipment and storage medium
CN114727047A (en) * 2021-01-07 2022-07-08 Meta Platforms, Inc. System and method for resolving overlapping speech in a communication session
CN115831155A (en) * 2021-09-16 2023-03-21 Tencent Technology (Shenzhen) Co., Ltd. Audio signal processing method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10776073B2 (en) * 2018-10-08 2020-09-15 Nuance Communications, Inc. System and method for managing a mute button setting for a conference call
US11838340B2 (en) * 2021-09-20 2023-12-05 International Business Machines Corporation Dynamic mute control for web conferencing


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
3GPP. SP-030434. 3GPP tsg_sa\TSG_SA. 2003, (TSGS_21), full text. *
Research and Implementation of Real-Time Echo Cancellation Algorithms in Conference Calls; Chen Lin; China Master's Theses Full-text Database; full text *

Also Published As

Publication number Publication date
CN116633909A (en) 2023-08-22

Similar Documents

Publication Publication Date Title
US10915570B2 (en) Personalized meeting summaries
US10431205B2 (en) Dialog device with dialog support generated using a mixture of language models combined using a recurrent neural network
US12002464B2 (en) Systems and methods for recognizing a speech of a speaker
US9495350B2 (en) System and method for determining expertise through speech analytics
CN107818798A (en) Customer service quality evaluating method, device, equipment and storage medium
US9210269B2 (en) Active speaker indicator for conference participants
US10652655B1 (en) Cognitive volume and speech frequency levels adjustment
US20150154960A1 (en) System and associated methodology for selecting meeting users based on speech
US10699709B2 (en) Conference call analysis and automated information exchange
CN113228074A (en) Urgency and emotional state matching for automatic scheduling by artificial intelligence
CN111556279A (en) Monitoring method and communication method of instant session
CN114727047A (en) System and method for resolving overlapping speech in a communication session
CN116569197A (en) User promotion in collaboration sessions
CN111696538A (en) Voice processing method, apparatus and medium
Soofastaei Introductory chapter: Virtual assistants
CN109961152B (en) Personalized interaction method and system of virtual idol, terminal equipment and storage medium
CN116633909B (en) Conference management method and system based on artificial intelligence
CN113539261A (en) Man-machine voice interaction method and device, computer equipment and storage medium
CN111756939A (en) Online voice control method and device and computer equipment
CN110865789A (en) Method and system for intelligently starting microphone based on voice recognition
WO2023040456A1 (en) Dynamic mute control for web conferencing
Palinko et al. How should a robot interrupt a conversation between multiple humans
KR102408455B1 (en) Voice data synthesis method for speech recognition learning, and computer program recorded on record-medium for executing method therefor
Bumbalek et al. Cloud-based assistive speech-transcription services
CN115294987A (en) Conference record generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20231127

Address after: 363000 Factory Building 3 Nanpu Road, Xiangcheng District, Zhangzhou City, Fujian Province (2nd Floor, Building C)

Applicant after: Fujian Yizhaoguang Intelligent Equipment Co.,Ltd.

Address before: No. 3001A, Floor 4, Building 1-2, No. 69, Junlong Street, Jinjiang District, Chengdu, Sichuan 610000 (self number)

Applicant before: Chengdu Haojie Technology Co.,Ltd.

GR01 Patent grant