CN111405232B - Video conference speaker picture switching processing method and device, equipment and medium - Google Patents

Info

Publication number
CN111405232B
Authority
CN
China
Prior art keywords
switching
picture
speaker
voice
video conference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010147449.6A
Other languages
Chinese (zh)
Other versions
CN111405232A (en)
Inventor
汪德召
吴闽华
姜坤
卫宣安
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Genew Technologies Co Ltd
Original Assignee
Shenzhen Genew Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Genew Technologies Co Ltd filed Critical Shenzhen Genew Technologies Co Ltd
Priority to CN202010147449.6A priority Critical patent/CN111405232B/en
Publication of CN111405232A publication Critical patent/CN111405232A/en
Application granted granted Critical
Publication of CN111405232B publication Critical patent/CN111405232B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/155 Conference systems involving storage of or access to video conference sessions
    • G PHYSICS
    • G08 SIGNALLING
    • G08C TRANSMISSION SYSTEMS FOR MEASURED VALUES, CONTROL OR SIMILAR SIGNALS
    • G08C23/00 Non-electrical signal transmission systems, e.g. optical systems
    • G08C23/02 Non-electrical signal transmission systems, e.g. optical systems using infrasonic, sonic or ultrasonic waves
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/268 Signal distribution or switching
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • General Physics & Mathematics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The application relates to a method, an apparatus, a device, and a medium for switching the main speaker picture in a video conference. The method comprises the following steps: establishing in advance a one-to-one correspondence between the users participating in a video conference and their conference terminals; establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals; when the video conference starts, having each participating user access the video conference through the corresponding conference terminal; acquiring a detected voice signal, and judging whether the detected voice signal contains a voice switching signal for switching the main speaker picture; and when such a voice switching signal is judged to exist, controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display. The invention thus provides a way to switch and display the main speaker picture quickly by voice in a video conference: the picture is switched automatically by voice, with high switching efficiency, simple operation, easy implementation, and high reliability.

Description

Video conference speaker picture switching processing method and device, equipment and medium
Technical Field
The present application relates to the field of video conference technologies, and in particular, to a method and an apparatus for processing video conference speaker picture switching, a computer device, and a readable storage medium.
Background
Video conferencing refers to a conference in which people at two or more locations have a face-to-face conversation via a communication device and a network. Video conferences can be divided into point-to-point conferences and multipoint conferences according to different numbers of participating places.
Currently, a video conference generally shows the main speaker picture in a relatively prominent position. However, switching the main speaker picture in the prior art is too cumbersome: some systems require someone to designate manually who is the current main speaker and to place that person's picture in the center, while others place the current speaker in the center automatically by means of "voice excitation". Both approaches are inconvenient and time-consuming, and the main speaker picture is not switched efficiently.
Therefore, the prior art is in need of improvement.
Disclosure of Invention
In view of the technical problems in the prior art, the invention provides a method and an apparatus for switching the main speaker picture in a video conference, a computer device, and a readable storage medium.
The technical scheme of the invention is as follows:
A video conference speaker picture switching processing method comprises the following steps:
establishing in advance a one-to-one correspondence between the users participating in a video conference and their conference terminals;
establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals;
when the video conference starts, each user participating in the video conference accesses it through the corresponding conference terminal;
acquiring a detected voice signal, and judging whether the detected voice signal contains a voice switching signal for switching the main speaker picture;
and when a voice switching signal for switching the main speaker picture is judged to exist, controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display.
In the video conference speaker picture switching processing method, the step of establishing in advance a one-to-one correspondence between the users participating in the video conference and their conference terminals comprises:
establishing a correspondence between user N and conference terminal N, indicating that user N uses conference terminal N to participate in the video conference, where N is a positive integer.
In the video conference speaker picture switching processing method, the step of establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals comprises:
establishing a correspondence between the voice switching signal that requests switching the main speaker picture to user N and the picture switching operation instruction of the corresponding conference terminal N.
In the video conference speaker picture switching processing method, the step of acquiring a detected voice signal and judging whether it contains a voice switching signal for switching the main speaker picture comprises:
acquiring the users' voice signals in real time through a voice assistant;
judging whether the detected voice signal contains a voice switching keyword;
and if a voice switching keyword is present, judging that a voice switching signal for switching the main speaker picture exists.
In the video conference speaker picture switching processing method, the step of controlling, when a voice switching signal for switching the main speaker picture is judged to exist, the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display comprises:
when a voice switching signal for switching the main speaker picture is judged to exist, analyzing the keywords of the voice switching signal and translating them into the corresponding conference terminal picture switching operation instruction predefined by the back end;
and controlling the corresponding conference terminal picture to be switched to the main speaker picture for display according to the conference terminal picture switching operation instruction.
In the video conference speaker picture switching processing method, the step of controlling, when a voice switching signal for switching the main speaker picture is judged to exist, the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display further comprises:
when the voice switching signal for switching the main speaker picture is judged to contain switching instructions for a plurality of users, queuing those users in the order specified in the voice switching signal;
and controlling the corresponding conference terminal pictures to be switched to the main speaker picture for display in the queued order.
The video conference speaker picture switching processing method further comprises the following step:
when a voice signal requesting the end of the video conference is acquired during the video conference, controlling the current video conference to end.
A video conference speaker picture switching processing apparatus, wherein the apparatus comprises:
a presetting module, used for establishing in advance a one-to-one correspondence between the users participating in the video conference and their conference terminals;
a switching instruction corresponding module, used for establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals;
a video conference access module, used for controlling each user participating in the video conference to access it through the corresponding conference terminal when the video conference starts;
a voice judging module, used for acquiring a detected voice signal and judging whether the detected voice signal contains a voice switching signal for switching the main speaker picture;
and a picture switching control module, used for controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display when a voice switching signal for switching the main speaker picture is judged to exist.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of any one of the video conference speaker picture switching processing methods when executing the computer program.
A computer-readable storage medium, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the video conference speaker picture switching processing methods.
Compared with the prior art, the embodiment of the invention has the following advantages:
The invention provides a method and an apparatus for switching the main speaker picture in a video conference, a computer device, and a readable storage medium. With them, the current main speaker can be set accurately, and that speaker's picture can be conveniently placed in the main speaker position ahead of time. In the invention, a voice assistant monitors what everyone says; once it hears a keyword, it analyzes the stretch of speech before and after the keyword, quickly translates it into an instruction predefined by the back end, and then executes the operation. The main speaker picture is thus switched automatically by voice, so that switching is fast, operation is simple, implementation is easy, and reliability is high, which provides convenience for users.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flow chart of a video conference speaker picture switching processing method according to an embodiment of the present invention.
Fig. 2 is a schematic view of a video conference speaker picture switching processing method according to an embodiment of the present invention.
Fig. 3 is a schematic structural diagram of a video conference speaker picture switching processing apparatus according to an embodiment of the present invention.
Fig. 4 is an internal structural diagram of a computer device in an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The inventor has found through research that, at present, a video conference generally shows the main speaker picture in a relatively prominent position. However, switching the main speaker picture in the prior art is too cumbersome: some systems require someone to designate manually who is the current main speaker and to place that person's picture in the center, while others place the current speaker in the center automatically by means of voice excitation. Both approaches are inconvenient and time-consuming, and the main speaker picture is not switched efficiently.
To solve the above problems, in the embodiment of the present invention a one-to-one correspondence is established in advance between the users participating in a video conference and their conference terminals; a correspondence is established between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals; when the video conference starts, each participating user accesses it through the corresponding conference terminal; a detected voice signal is acquired, and it is judged whether the detected voice signal contains a voice switching signal for switching the main speaker picture; and when such a voice switching signal is judged to exist, the picture of the conference terminal corresponding to the user named in the voice switching signal is switched to the main speaker picture for display. The invention thus provides a way to switch and display the main speaker picture quickly by voice in a video conference: the picture is switched automatically by voice, with high switching efficiency, simple operation, easy implementation, and high reliability.
Various non-limiting embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, Fig. 1 shows a video conference speaker picture switching processing method according to an embodiment of the present invention, where the method includes the following steps:
Step S1, establishing in advance a one-to-one correspondence between the users participating in the video conference and their conference terminals.
In a specific implementation of the invention, a one-to-one correspondence can be established in advance between the users participating in the video conference and their conference terminals. For example, a correspondence is established between user A and conference terminal A, indicating that user A uses conference terminal A to participate in the video conference; and so on, up to a correspondence between user N and conference terminal N, indicating that user N uses conference terminal N to participate in the video conference, where N is a positive integer. Presetting the correspondence in this way makes it easy, when the main speaker is switched, to go from the user named by voice to that user's terminal.
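The one-to-one correspondence of step S1 can be sketched as a pair of dictionaries. This is only an illustration under stated assumptions: the `Roster` class, its method names, and the `terminal-A` style identifiers are hypothetical and do not appear in the patent.

```python
from typing import Dict


class Roster:
    """Hypothetical helper: one-to-one mapping between users and terminals."""

    def __init__(self) -> None:
        self._terminal_of: Dict[str, str] = {}
        self._user_of: Dict[str, str] = {}

    def bind(self, user: str, terminal: str) -> None:
        # Reject any binding that would break the one-to-one property.
        if user in self._terminal_of:
            raise ValueError(f"user {user!r} is already bound")
        if terminal in self._user_of:
            raise ValueError(f"terminal {terminal!r} is already bound")
        self._terminal_of[user] = terminal
        self._user_of[terminal] = user

    def terminal_of(self, user: str) -> str:
        return self._terminal_of[user]

    def user_of(self, terminal: str) -> str:
        return self._user_of[terminal]


# User A uses terminal A, user B uses terminal B, and so on.
roster = Roster()
for name in ["A", "B", "C"]:
    roster.bind(name, f"terminal-{name}")
```

Keeping both directions of the mapping makes the lookup from a spoken user name to a terminal, and back, a constant-time operation.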
Step S2, establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals.
In the invention, a correspondence is established between the voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals. For example, the voice switching signal that requests switching the main speaker picture to user A corresponds to the picture switching operation instruction of conference terminal A, and so on, so that the voice switching signal that requests switching to user N corresponds to the picture switching operation instruction of conference terminal N. Suppose the conference members in a video conference are A, B, C, D, E, F, and so on. A voice switching signal such as "next, user A please speak" or "the next speaker is A" is then associated with the picture switching operation instruction of conference terminal A.
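A minimal sketch of the step S2 correspondence follows, assuming the voice signal has already been converted to text. The phrase templates, the `SwitchInstruction` type, and `build_signal_table` are hypothetical names chosen for illustration; the patent gives only example phrases and does not define an instruction format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SwitchInstruction:
    """Hypothetical instruction: show this terminal as the main speaker picture."""
    terminal: str


# Assumed phrase templates, modeled on the examples in the text.
TEMPLATES: List[str] = ["next, {user} please speak", "the next speaker is {user}"]


def build_signal_table(user_to_terminal: Dict[str, str]) -> Dict[str, SwitchInstruction]:
    """Map every (template, user) phrase to that user's terminal instruction."""
    table: Dict[str, SwitchInstruction] = {}
    for user, terminal in user_to_terminal.items():
        for template in TEMPLATES:
            phrase = template.format(user=user).lower()
            table[phrase] = SwitchInstruction(terminal)
    return table


table = build_signal_table({"A": "terminal-A", "B": "terminal-B"})
```

Phrases are stored in lower case so that a lookup can normalize the recognized text the same way before matching.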
Step S3, when the video conference starts, each user participating in the video conference accesses it through the corresponding conference terminal.
In the embodiment of the invention, when the video conference starts, each user participating in the video conference accesses it through the corresponding conference terminal. For example, in a video conference with members A, B, C, D, E, F, and so on, each member joins through his or her own conference terminal.
Step S4, acquiring a detected voice signal, and judging whether the detected voice signal contains a voice switching signal for switching the main speaker picture.
In the embodiment of the invention, once the video conference has started, the system detects and acquires voice signals in real time and judges whether a detected voice signal contains a voice switching signal for switching the main speaker picture.
For example, the users' voice signals are acquired in real time through a voice assistant; it is judged whether the detected voice signal contains a voice switching keyword; and if such a keyword is present, a voice switching signal for switching the main speaker picture is judged to exist. Suppose the conference members in a video conference are A, B, C, D, E, F, and so on, and A is currently the main speaker. If A says "next, B please speak", the voice assistant catches the keyword, understands that B is to speak next, and translates the utterance into an instruction; a voice switching signal for switching the main speaker picture is thereby judged to exist.
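The keyword check of step S4 might look like the sketch below. It assumes that speech recognition has already produced a text transcript, which the patent leaves to the voice assistant; the regular expressions and the `detect_switch` function are hypothetical and cover only the two example phrasings.

```python
import re
from typing import Optional, Sequence

# Assumed trigger patterns; a real system would need a far richer set.
PATTERNS = [
    re.compile(r"\bnext,?\s+(?P<user>\w+),?\s+please\s+speak\b", re.IGNORECASE),
    re.compile(r"\bthe\s+next\s+speaker\s+is\s+(?P<user>\w+)\b", re.IGNORECASE),
]


def detect_switch(transcript: str, participants: Sequence[str]) -> Optional[str]:
    """Return the participant named in a switching phrase, or None."""
    by_lower = {p.lower(): p for p in participants}
    for pattern in PATTERNS:
        match = pattern.search(transcript)
        if match:
            name = match.group("user").lower()
            # Only accept names that belong to the conference roster.
            if name in by_lower:
                return by_lower[name]
    return None


members = ["A", "B", "C", "D", "E", "F"]
target = detect_switch("That is all from me. Next, B please speak.", members)
```

Checking the captured name against the roster keeps ordinary sentences such as "let us look at the next slide" from triggering a switch.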
Step S5, when a voice switching signal for switching the main speaker picture is judged to exist, controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display.
In the embodiment of the invention, when a voice switching signal for switching the main speaker picture is judged to exist, the picture of the conference terminal corresponding to the user named in the voice switching signal is switched to the main speaker picture for display.
For example, when a voice switching signal for switching the main speaker picture is judged to exist, the keywords of the voice switching signal are analyzed and translated into the corresponding conference terminal picture switching operation instruction predefined by the back end, and the corresponding conference terminal picture is switched to the main speaker picture for display according to that instruction.
In a specific implementation, for example as shown in Fig. 2, the conference members in a given video conference are A, B, C, D, E, F, and so on, and A is currently the main speaker. If A says "next, B please speak", the voice assistant catches the keyword, understands that B is to speak next, and translates the utterance into an instruction. The back end then automatically places B's picture in the main speaker position, and A's picture is moved to B's former position or arranged alongside the pictures of the other participants.
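The picture swap described for Fig. 2 can be illustrated with a list-based layout. This is a sketch under assumptions: slot 0 standing for the main speaker position, and the previous main speaker taking the promoted user's former slot, are modeling choices made here, and `promote_to_main` is a hypothetical name.

```python
from typing import List


def promote_to_main(layout: List[str], user: str) -> List[str]:
    """Return a new layout with `user` in slot 0, the assumed main speaker slot.

    The previous main speaker takes the slot the promoted user came from.
    """
    if user not in layout:
        raise ValueError(f"{user!r} is not in the conference layout")
    new_layout = list(layout)
    index = new_layout.index(user)
    new_layout[0], new_layout[index] = new_layout[index], new_layout[0]
    return new_layout


layout = ["A", "B", "C", "D"]          # A is the current main speaker
layout = promote_to_main(layout, "B")  # A said: "next, B please speak"
```

Returning a new list rather than mutating the old one makes it straightforward to send the updated layout to every terminal as a single message.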
The voice keywords in the embodiment of the invention can be obtained by big-data analysis of the most common phrasings, such as "next, [name] will speak" or "[name], please go ahead", so that they can be recognized very accurately.
In other words, the voice implementation of the invention is as follows: the voice assistant monitors what everyone says at all times; once it hears a keyword, it analyzes the stretch of speech before and after the keyword, quickly translates it into an instruction predefined by the back end, and then executes the operation.
In a further implementation, the video conference speaker picture switching processing method of the invention also comprises the following steps:
when the voice switching signal for switching the main speaker picture is judged to contain switching instructions for a plurality of users, queuing those users in the order specified in the voice switching signal;
and controlling the corresponding conference terminal pictures to be switched to the main speaker picture for display in the queued order.
For example, as shown in Fig. 2, A is currently the main speaker. If A says "next, B please speak, and C please get ready to be the next main speaker", the voice assistant catches the keywords, understands that B is to speak next, and places C in the queue as the following main speaker. The utterance is translated into an instruction, the back end automatically places B's picture in the main speaker position, and A's picture is moved to B's former position or arranged alongside the pictures of the other participants. While B's picture is shown as the main speaker picture, C waits in the queue as the next main speaker.
When B has finished, C's picture is automatically switched to the main speaker position, and B's picture is moved back or arranged alongside the pictures of the other participants.
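The queuing behavior can be sketched with a first-in, first-out queue. The `SpeakerQueue` class is hypothetical, and how the system detects that the current main speaker has finished is not specified in the text, so the sketch simply exposes an `advance` method to be called at that point.

```python
from collections import deque
from typing import Deque, List, Optional


class SpeakerQueue:
    """Hypothetical FIFO of upcoming main speakers."""

    def __init__(self, current: str) -> None:
        self.current = current
        self._pending: Deque[str] = deque()

    def enqueue(self, users: List[str]) -> None:
        # Keep the order given in the voice switching signal; skip duplicates.
        for user in users:
            if user != self.current and user not in self._pending:
                self._pending.append(user)

    def advance(self) -> Optional[str]:
        """Promote the next queued user; return None if nobody is waiting."""
        if not self._pending:
            return None
        self.current = self._pending.popleft()
        return self.current


queue = SpeakerQueue(current="A")
queue.enqueue(["B", "C"])  # "next, B please speak, and C please get ready"
queue.advance()            # B becomes the main speaker; C keeps waiting
```

A later call to `queue.advance()`, made when B has finished, would promote C in the same way.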
Further, the video conference speaker picture switching processing method also comprises the following step:
when a voice signal requesting the end of the video conference is acquired during the video conference, controlling the current video conference to end.
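Ending the conference by voice can be sketched in the same style. The end phrases below are assumptions, since the text does not list any, and `is_end_signal` and `run_until_end` are hypothetical names; the input is again assumed to be already-transcribed text.

```python
from typing import Iterable, List

# Assumed end-of-conference phrases; the source does not enumerate them.
END_PHRASES = ("end the meeting", "the meeting is over")


def is_end_signal(transcript: str) -> bool:
    """Return True if the utterance contains an assumed end phrase."""
    text = transcript.lower()
    return any(phrase in text for phrase in END_PHRASES)


def run_until_end(transcripts: Iterable[str]) -> List[str]:
    """Collect utterances until an end signal is heard (that utterance excluded)."""
    handled: List[str] = []
    for utterance in transcripts:
        if is_end_signal(utterance):
            break  # here the system would close the current video conference
        handled.append(utterance)
    return handled


handled = run_until_end(["Hello everyone", "OK, the meeting is over", "ignored"])
```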
The invention therefore provides a video conference speaker picture switching processing method that brings a voice assistant into the video conference and relies on voice control: a participant simply says who is to speak next, the system recognizes keywords of this kind in the speech, automatically finds that person's picture, and places it in the main speaker position. In this way the current main speaker can be set accurately, and that speaker's picture can be conveniently placed in the main speaker position ahead of time. In the invention, a voice assistant monitors what everyone says; once it hears a keyword, it analyzes the stretch of speech before and after the keyword, quickly translates it into an instruction predefined by the back end, and then executes the operation. The main speaker picture is thus switched automatically by voice, so that switching is fast, operation is simple, implementation is easy, and reliability is high, which provides convenience for users.
In one embodiment, the present invention provides a video conference speaker picture switching processing apparatus. As shown in Fig. 3, the apparatus includes:
a presetting module 41, configured to establish in advance a one-to-one correspondence between the users participating in a video conference and their conference terminals;
a switching instruction corresponding module 42, configured to establish a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals;
a video conference access module 43, configured to control each user participating in the video conference to access it through the corresponding conference terminal when the video conference starts;
a voice judging module 44, configured to acquire a detected voice signal and judge whether the detected voice signal contains a voice switching signal for switching the main speaker picture;
and a picture switching control module 45, configured to control the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display when a voice switching signal for switching the main speaker picture is judged to exist. The details are as described above.
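How modules 41 to 45 might cooperate is sketched below as a single class with one method per module. Everything here is illustrative: the class name, the method names, the single recognized phrase, and the list-based layout with slot 0 as the main speaker position are assumptions, not the structure of the apparatus in Fig. 3.

```python
import re
from typing import Dict, List, Optional


class SwitchingApparatus:
    """Illustrative wiring of modules 41-45; all names are hypothetical."""

    def __init__(self) -> None:
        self.terminal_of: Dict[str, str] = {}     # filled by module 41
        self.instruction_of: Dict[str, str] = {}  # filled by module 42
        self.layout: List[str] = []               # slot 0 = main speaker picture

    def preset(self, user: str, terminal: str) -> None:
        """Module 41: record which terminal each user joins from."""
        self.terminal_of[user] = terminal

    def map_instruction(self, user: str) -> None:
        """Module 42: tie a spoken user name to that user's terminal."""
        self.instruction_of[user.lower()] = self.terminal_of[user]

    def access(self, user: str) -> None:
        """Module 43: the user joins through his or her own terminal."""
        self.layout.append(self.terminal_of[user])

    def judge(self, transcript: str) -> Optional[str]:
        """Module 44: return the terminal to promote, or None."""
        match = re.search(r"next,?\s+(\w+),?\s+please\s+speak", transcript, re.IGNORECASE)
        if match is None:
            return None
        return self.instruction_of.get(match.group(1).lower())

    def switch(self, terminal: str) -> None:
        """Module 45: swap the terminal's picture into slot 0."""
        index = self.layout.index(terminal)
        self.layout[0], self.layout[index] = self.layout[index], self.layout[0]

    def handle(self, transcript: str) -> None:
        terminal = self.judge(transcript)
        if terminal is not None:
            self.switch(terminal)


apparatus = SwitchingApparatus()
for user in ["A", "B", "C"]:
    apparatus.preset(user, f"terminal-{user}")
    apparatus.map_instruction(user)
    apparatus.access(user)
apparatus.handle("Next, B please speak")
```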
In one embodiment, the present invention provides a computer device, which may be a terminal, with an internal structure as shown in Fig. 4. The computer device includes a processor, a memory, a network interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program, and the internal memory provides an environment for running the operating system and the computer program stored in the non-volatile storage medium. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by the processor, implements the video conference speaker picture switching processing method. The display screen of the computer device can be a liquid crystal display screen or an electronic ink display screen, and the input device can be a touch layer covering the display screen, a key, a trackball, or a touch pad arranged on the housing of the computer device, or an external keyboard, touch pad, or mouse.
Those skilled in the art will appreciate that the illustration in fig. 4 is merely a block diagram of a portion of the structure associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
The embodiment of the invention provides computer equipment, which comprises a memory and a processor, wherein the memory stores a computer program, and the processor executes the computer program to realize the following steps:
establishing in advance a one-to-one correspondence between the users participating in a video conference and their conference terminals;
establishing a correspondence between voice switching signals for switching the main speaker picture and the picture switching operation instructions of the corresponding conference terminals;
when the video conference starts, each user participating in the video conference accesses it through the corresponding conference terminal;
acquiring a detected voice signal, and judging whether the detected voice signal contains a voice switching signal for switching the main speaker picture;
and when a voice switching signal for switching the main speaker picture is judged to exist, controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the main speaker picture for display.
The step of establishing a one-to-one correspondence relationship between users participating in the video conference and corresponding conference participating terminals in advance comprises the following steps:
establishing a corresponding relation between the user N and the conference terminal N to represent that the user N uses the conference terminal N to participate in the video conference; wherein N is a positive integer.
The step of establishing the corresponding relation between the voice switching signal for switching the images of the speaker and the corresponding conference terminal image switching operation instruction comprises the following steps:
and establishing a corresponding relation between the voice switching signal containing the picture switching of the main speaker of the user N and the picture switching operation instruction of the corresponding conference terminal N.
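The two correspondence relations described above can be pictured as a pair of lookup tables. The following is a minimal sketch only, not the patented implementation: the names `SwitchInstruction`, `build_user_terminal_map`, and `build_instruction_map`, and the spoken phrase format "switch to user N", are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class SwitchInstruction:
    """Hypothetical picture switching operation instruction for one terminal."""
    terminal_id: int


def build_user_terminal_map(user_count: int) -> Dict[int, int]:
    """Bind user N to conference terminal N for N = 1..user_count."""
    if user_count < 1:
        raise ValueError("user_count must be a positive integer")
    return {n: n for n in range(1, user_count + 1)}


def build_instruction_map(user_terminal: Dict[int, int]) -> Dict[str, SwitchInstruction]:
    """Map an assumed spoken phrase naming user N to the instruction for terminal N."""
    return {
        f"switch to user {user}": SwitchInstruction(terminal_id=terminal)
        for user, terminal in user_terminal.items()
    }


user_terminal = build_user_terminal_map(3)
instructions = build_instruction_map(user_terminal)
```

Keeping the user-to-terminal table separate from the phrase-to-instruction table mirrors the two establishing steps, so either table can be rebuilt without touching the other.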
The step of acquiring the detection voice signal and judging whether the voice switching signal for switching the images of the speaker exists in the detection voice signal comprises the following steps:
acquiring a voice signal of a user in real time through a voice assistant;
judging whether the detected voice signal contains a voice switching keyword;
if a voice switching keyword is present, judging that a voice switching signal for switching the picture of the speaker exists.
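The keyword judgment can be sketched as a simple substring test. This sketch assumes the voice assistant has already converted the voice signal into text; the specification does not say how recognition is performed, and the keyword list and the function name `contains_switch_signal` are hypothetical.

```python
from typing import Iterable

# Hypothetical keywords; the specification does not list the actual ones.
SWITCH_KEYWORDS = ("switch to",)


def contains_switch_signal(transcript: str,
                           keywords: Iterable[str] = SWITCH_KEYWORDS) -> bool:
    """Return True if the recognized text contains any voice switching keyword."""
    text = transcript.lower()
    return any(keyword in text for keyword in keywords)
```

For example, `contains_switch_signal("Please SWITCH TO user 2")` is true, while ordinary conference speech without a keyword is ignored.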
When the voice switching signal for switching the picture of the main speaker is judged to exist, the step of controlling the picture of the conference terminal corresponding to the user in the voice switching signal to be switched to the picture of the main speaker for displaying comprises the following steps:
when it is judged that a voice switching signal for switching the picture of the speaker exists, analyzing the keywords of the voice switching signal and translating them into the corresponding conference terminal picture switching operation instruction defined by the background system;
and controlling the corresponding conference terminal picture to be switched to the speaker picture for display according to the conference terminal picture switching operation instruction.
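The parse, translate, and control steps above might look like the following. This is a sketch under stated assumptions: the phrase pattern "switch to user N", the dictionary shape of the instruction, and the `Conference` class are all hypothetical stand-ins for whatever the background system actually defines.

```python
import re
from typing import Dict, Optional

# Assumed phrase format; the real keywords are not given in the specification.
SWITCH_PATTERN = re.compile(r"switch to user (\d+)", re.IGNORECASE)


def to_instruction(transcript: str, user_terminal: Dict[int, int]) -> Optional[dict]:
    """Translate a recognized phrase into a hypothetical switching instruction."""
    match = SWITCH_PATTERN.search(transcript)
    if match is None:
        return None
    user = int(match.group(1))
    if user not in user_terminal:
        return None  # unknown user: no instruction is produced
    return {"action": "set_speaker_picture", "terminal": user_terminal[user]}


class Conference:
    """Minimal stand-in for the conference state that holds the speaker picture."""

    def __init__(self) -> None:
        self.speaker_terminal: Optional[int] = None

    def apply(self, instruction: Optional[dict]) -> None:
        """Switch the speaker picture; ignore a missing instruction."""
        if instruction is not None:
            self.speaker_terminal = instruction["terminal"]


conference = Conference()
conference.apply(to_instruction("Please switch to user 2", {1: 1, 2: 2}))
```

Returning `None` for an unrecognized phrase or unknown user leaves the current speaker picture unchanged, which is one reasonable choice; the specification does not state what happens in that case.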
When the voice switching signal for switching the picture of the main speaker is judged to exist, the step of controlling the picture of the conference terminal corresponding to the user in the voice switching signal to be switched to the picture of the main speaker for displaying comprises the following steps:
when it is judged that the voice switching signal for switching the images of the speaker contains voice switching operation instructions for a plurality of users, controlling those users to queue according to the sequence specified in the voice switching signal;
and controlling the corresponding conference terminal pictures to be switched to the speaker picture for display according to the sequence of queuing.
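The queuing behavior for multiple users can be illustrated with a first-in, first-out queue. This is a sketch only: the phrase format "user N" is assumed, and the specification does not say what event advances the queue from one speaker to the next, so the sketch simply drains the queue in order.

```python
import re
from collections import deque
from typing import Deque, List

# Assumed way of naming users in the voice switching signal.
USER_PATTERN = re.compile(r"user (\d+)", re.IGNORECASE)


def queue_speakers(transcript: str) -> Deque[int]:
    """Queue the named users in the order they appear in the recognized text."""
    return deque(int(n) for n in USER_PATTERN.findall(transcript))


def switch_in_order(queue: Deque[int]) -> List[int]:
    """Pop users front to back and return the order in which pictures are shown."""
    shown = []
    while queue:
        shown.append(queue.popleft())
    return shown


order = switch_in_order(queue_speakers("switch to user 3, then user 1, then user 2"))
```

A `deque` keeps the order given in the voice signal rather than sorting by user number, matching the requirement that queuing follows the sequence specified in the signal.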
The video conference speaker picture switching processing method further comprises the following step:
when a voice signal for ending the video conference is acquired during the video conference, controlling the current video conference to end.
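Ending the conference by voice can be sketched as a small state holder. The class name `ConferenceSession` and the ending phrases are hypothetical, and the sketch again assumes the voice signal has already been converted to text.

```python
class ConferenceSession:
    """Tracks whether the video conference is still running."""

    # Hypothetical ending phrases; the specification does not list them.
    END_PHRASES = ("end the conference", "end the meeting")

    def __init__(self) -> None:
        self.active = True

    def on_transcript(self, transcript: str) -> bool:
        """Process recognized text and return whether the conference is still active."""
        text = transcript.lower()
        if self.active and any(phrase in text for phrase in self.END_PHRASES):
            self.active = False
        return self.active


session = ConferenceSession()
still_running = session.on_transcript("switch to user 2")
after_end = session.on_transcript("Please end the conference")
```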
In summary, compared with the prior art, the embodiment of the invention has the following advantages:
A video conference speaker picture switching processing method and device, a computer device, and a readable storage medium are provided. The method comprises: establishing a one-to-one correspondence relationship between users participating in a video conference and corresponding conference participating terminals in advance; establishing a corresponding relation between a voice switching signal for switching the images of the speaker and a corresponding image switching operation instruction of the conference terminal; when the video conference starts, having all users participating in the video conference access the video conference through the corresponding conference participating terminals; acquiring a detection voice signal, and judging whether the detection voice signal contains a voice switching signal for switching the images of the speaker; and when it is judged that such a voice switching signal exists, controlling the picture of the conference terminal corresponding to the user named in the voice switching signal to be switched to the picture of the main speaker for display. The invention thus provides a way to switch and display the speaker picture quickly by voice during a video conference: the speaker picture is switched automatically in response to voice, with high switching efficiency, simple operation, easy implementation, and high reliability.
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of these technical features are described; however, any combination of them should be considered within the scope of this specification as long as the combined features do not contradict one another.
The above embodiments express only several implementations of the present application. Their description is relatively specific and detailed, but it should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (9)

1. A video conference speaker picture switching processing method is characterized by comprising the following steps:
establishing a one-to-one correspondence relationship between users participating in a video conference and corresponding conference participating terminals in advance;
establishing a corresponding relation between the voice switching signal for switching the images of the speaker and the corresponding image switching operation instruction of the conference terminal;
when the video conference starts, all users participating in the video conference access the video conference through the corresponding conference participating terminals;
acquiring a detection voice signal, and judging whether the detection voice signal contains a voice switching signal for switching the images of the speaker or not;
when the voice switching signal for switching the picture of the main speaker is judged, controlling the picture of the conference terminal corresponding to the user in the voice switching signal to be switched to the picture of the main speaker for displaying;
when the voice switching signal for switching the picture of the main speaker is judged to exist, the step of controlling the picture of the conference terminal corresponding to the user in the voice switching signal to be switched to the picture of the main speaker for displaying comprises the following steps:
when the voice switching signal for switching the images of the speaker is judged to have voice switching operation instructions of a plurality of users, controlling the queue to wait according to the sequence specified in the voice switching signal;
and controlling the corresponding conference terminal pictures to be switched to the speaker picture for display according to the sequence of queuing.
2. The method for switching between the video conference speaker pictures according to claim 1, wherein the step of establishing a one-to-one correspondence relationship between the users participating in the video conference and the corresponding conference terminals in advance comprises:
establishing a corresponding relation between the user N and the conference terminal N to show that the user N uses the conference terminal N to participate in the video conference; wherein N is a positive integer.
3. The video conference speaker picture switching processing method according to claim 1, wherein the step of establishing a correspondence between the voice switching signal for switching the speaker picture and the corresponding conference terminal picture switching operation instruction includes:
and establishing a corresponding relation between the voice switching signal containing the picture switching of the main speaker of the user N and the picture switching operation instruction of the corresponding conference terminal N.
4. The method for switching between the images of the speaker in the video conference according to claim 1, wherein the step of acquiring the detection voice signal and determining whether the voice switching signal for switching between the images of the speaker exists in the detection voice signal comprises:
acquiring a voice signal of a user in real time through a voice assistant;
judging whether the detected voice signal contains a keyword containing voice switching;
if the keyword containing the voice switching exists, the voice switching signal for switching the picture of the speaker is judged to exist.
5. The video conference speaker picture switching processing method according to claim 1, wherein the step of controlling the conference terminal picture corresponding to the user in the voice switching signal to switch to the speaker picture for display when it is judged that there is the voice switching signal for switching the speaker picture comprises:
when judging that a voice switching signal for switching the picture of the speaker exists;
analyzing keywords of the voice switching signal, and translating the keywords into corresponding conference terminal picture switching operation instructions formulated by a background;
and controlling the corresponding conference terminal picture to be switched to the speaker picture for display according to the conference terminal picture switching operation instruction.
6. The video conference speaker picture switching processing method according to claim 1, further comprising the steps of:
and when the voice signal of finishing the video conference is acquired in the video conference process, controlling to finish the current video conference.
7. A video conference speaker screen switching processing apparatus, characterized in that the apparatus comprises:
the presetting module is used for establishing a one-to-one correspondence relationship between users participating in the video conference and corresponding conference participating terminals in advance;
the switching instruction corresponding module is used for establishing a corresponding relation between the voice switching signal for switching the images of the speaker and the corresponding image switching operation instruction of the conference terminal;
the video conference access module is used for controlling each user participating in the video conference to access the video conference through the corresponding conference participating terminal when the video conference starts;
the voice judging module is used for acquiring the detection voice signal and judging whether the voice switching signal for switching the images of the speaker exists in the detection voice signal;
the picture switching control module is used for controlling the conference terminal picture corresponding to the user in the voice switching signal to be switched to the picture of the speaker for display when the voice switching signal for switching the picture of the speaker is judged;
when the voice switching signal for switching the picture of the main speaker is judged to exist, the step of controlling the picture of the conference terminal corresponding to the user in the voice switching signal to be switched to the picture of the main speaker for displaying comprises the following steps:
when the voice switching signal for switching the images of the speaker is judged to have voice switching operation instructions of a plurality of users, controlling the queue to wait according to the sequence specified in the voice switching signal;
and controlling the corresponding conference terminal pictures to be switched to the speaker picture for display according to the sequence of queuing.
8. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor when executing the computer program implements the steps of the video conference speaker picture switching processing method according to any one of claims 1 to 6.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the video conference presenter screen switching processing method according to any one of claims 1 to 6.
CN202010147449.6A 2020-03-05 2020-03-05 Video conference speaker picture switching processing method and device, equipment and medium Active CN111405232B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010147449.6A CN111405232B (en) 2020-03-05 2020-03-05 Video conference speaker picture switching processing method and device, equipment and medium

Publications (2)

Publication Number Publication Date
CN111405232A CN111405232A (en) 2020-07-10
CN111405232B true CN111405232B (en) 2021-08-06

Family

ID=71428663

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010147449.6A Active CN111405232B (en) 2020-03-05 2020-03-05 Video conference speaker picture switching processing method and device, equipment and medium

Country Status (1)

Country Link
CN (1) CN111405232B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112235529A (en) * 2020-10-13 2021-01-15 武汉吉迅信息技术有限公司 Implementation mode of cloud video multi-point control rapid conference
CN113206974B (en) * 2021-04-21 2022-10-21 随锐科技集团股份有限公司 Video picture switching method and system
CN113225521B (en) * 2021-05-08 2022-11-04 维沃移动通信(杭州)有限公司 Video conference control method and device and electronic equipment
CN113596349B (en) * 2021-07-26 2024-06-04 世邦通信股份有限公司 Conference method, system, device and storage medium for automatic linkage video of speaking position
CN113810653A (en) * 2021-09-17 2021-12-17 广州科天视畅信息科技有限公司 Audio and video based method and system for talkback tracking of multi-party network conference
CN114531563A (en) * 2022-02-16 2022-05-24 广州市哲闻信息科技有限公司 Video conference control method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107396036A (en) * 2017-09-07 2017-11-24 北京小米移动软件有限公司 Method for processing video frequency and terminal in video conference
CN110556112A (en) * 2018-05-30 2019-12-10 夏普株式会社 operation support device, operation support system, and operation support method
CN110600036A (en) * 2019-09-24 2019-12-20 随锐科技集团股份有限公司 Conference picture switching device and method based on voice recognition
CN110730324A (en) * 2019-09-12 2020-01-24 视联动力信息技术股份有限公司 Video picture display control method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100588274C (en) * 2006-12-19 2010-02-03 展讯通信(上海)有限公司 Method for switching booking of multipartite call right to speak in numeral intercommunication system
CN104580995B (en) * 2015-01-28 2018-01-12 苏州科达科技股份有限公司 A kind of means of communication and device for video conference
CN109547735B (en) * 2019-01-18 2024-04-16 海南科先电子科技有限公司 Conference integration system

Also Published As

Publication number Publication date
CN111405232A (en) 2020-07-10


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant