CN112839195A - Method and device for consulting meeting record, computer equipment and storage medium - Google Patents


Info

Publication number
CN112839195A
Authority
CN
China
Prior art keywords
text
participant
paragraph
video
region
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011608242.0A
Other languages
Chinese (zh)
Other versions
CN112839195B (en)
Inventor
凌斌
廖明章
李可圣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Horion Intelligent Technology Co ltd
Original Assignee
Shenzhen Horion Intelligent Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Horion Intelligent Technology Co ltd
Priority to CN202011608242.0A
Publication of CN112839195A
Application granted
Publication of CN112839195B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/15 Conference systems
    • H04N7/155 Conference systems involving storage of or access to video conference sessions
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L12/00 Data switching networks
    • H04L12/02 Details
    • H04L12/16 Arrangements for providing special services to substations
    • H04L12/18 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast
    • H04L12/1813 Arrangements for providing special services to substations for broadcast or conference, e.g. multicast for computer conferences, e.g. chat rooms
    • H04L12/1831 Tracking arrangements for later retrieval, e.g. recording contents, participants activities or behavior, network status
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H04N21/44213 Monitoring of end-user related data
    • H04N21/44222 Analytics of user selections, e.g. selection of programs or purchase activity
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47217 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/278 Subtitling

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a method and a device for consulting a conference record, a computer device, and a storage medium. The method comprises: obtaining the voice data of each participant in a conference video, analyzing the voice data to obtain the text paragraphs and timestamps corresponding to each participant, and storing the text paragraphs and timestamps in a first text area; when a user inputs text information to be queried, searching the first text area to identify the text paragraph containing the text information, and jumping to the video playing node corresponding to that text paragraph; and when the user selects a target participant, acquiring all text paragraphs of the target participant, storing them in a second text region, and jumping to the video playing node corresponding to the text paragraph that the user selects in the second text region. The invention makes it possible to quickly locate the corresponding node of the conference record and to consult the video and audio files.

Description

Method and device for consulting meeting record, computer equipment and storage medium
Technical Field
The present invention relates to the field of multimedia technologies, and in particular, to a method and an apparatus for consulting a conference record, a computer device, and a storage medium.
Background
With the rapid development of computer and network technology, the way meetings are held has changed dramatically. Participants no longer need to gather in a single conference room as before; instead, cross-regional meetings can be held through new modes such as conference video, which greatly enriches the available meeting formats and brings convenience to people.
In the related art, to make it easier for participants to consult the conference record after the conference ends, the video and audio are stored, then analyzed and converted into text information with timestamps, and the text information is displayed on the conference video. However, when the conference video is long and a large amount of text information is generated, the record is inconvenient for participants to consult, and they cannot quickly locate the corresponding node of the conference or consult the video and audio files.
Disclosure of Invention
The invention aims to provide a method and a device for consulting a conference record, a computer device, and a storage medium, so as to solve the problem in the prior art that, when a stored conference video is consulted, the corresponding node of the conference cannot be located quickly and the video and audio files cannot be consulted.
In a first aspect, an embodiment of the present invention provides a method for consulting a meeting record, where the method includes:
confirming the identities of all participants in the conference video, and binding identity tags for each participant;
acquiring and analyzing voice data of each participant in the conference video to obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the conference video;
scrolling and playing a current text paragraph in a first text area according to the current playing progress of the conference video, and marking a corresponding identity tag and a timestamp on the current text paragraph;
according to text information input by a user, scrolling and inquiring in the first text area, confirming a text paragraph where the text information is located, and executing jumping to a video playing node corresponding to the text paragraph;
and acquiring all text paragraphs corresponding to the target participant according to the target participant selected by the user, storing the text paragraphs into a second text region, displaying the second text region in the conference video, and executing jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region.
In a second aspect, an embodiment of the present invention provides a device for consulting a meeting record, including:
a confirming unit, used for confirming the identities of all participants in a conference video and binding an identity tag for each participant;
the analysis unit is used for acquiring and analyzing voice data of each participant in the conference video to obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the conference video;
the playing unit is used for rolling and playing a current text paragraph in a first text area according to the current playing progress of the conference video and marking a corresponding identity tag and a corresponding time stamp on the current text paragraph;
the first jumping unit is used for rolling and inquiring in the first text area according to the text information input by the user, confirming a text paragraph where the text information is located, and executing jumping to a video playing node corresponding to the text paragraph;
and the second jumping unit is used for acquiring all text paragraphs corresponding to the target participant according to the target participant selected by the user, storing the text paragraphs into a second text region, displaying the second text region in the conference video, and jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region.
In a third aspect, an embodiment of the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the method for consulting a meeting record according to the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present invention further provides a computer-readable storage medium, where the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, causes the processor to execute the method for consulting a meeting record according to the first aspect.
The embodiment of the invention discloses a method and a device for consulting a conference record, a computer device, and a storage medium. The method comprises: obtaining the voice data of each participant in a conference video, analyzing the voice data to obtain the text paragraphs and timestamps corresponding to each participant, and storing the text paragraphs and timestamps in a first text area; when a user inputs text information to be queried, searching the first text area to identify the text paragraph containing the text information, and jumping to the video playing node corresponding to that text paragraph; and when the user selects a target participant, acquiring all text paragraphs of the target participant, storing them in a second text region, and jumping to the video playing node corresponding to the text paragraph that the user selects in the second text region. The embodiment of the invention makes it possible to quickly locate the corresponding node of the conference record and to consult the video and audio files.
Drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a method for consulting a meeting record according to an embodiment of the present invention;
Fig. 2 is a schematic sub-flowchart of a method for consulting a meeting record according to an embodiment of the present invention;
Fig. 3 is a schematic sub-flowchart of a method for consulting a meeting record according to an embodiment of the present invention;
Fig. 4 is a schematic sub-flowchart of a method for consulting a meeting record according to an embodiment of the present invention;
Fig. 5 is a schematic block diagram of a device for consulting a meeting record according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in the specification of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of a method for consulting a meeting record according to an embodiment of the present invention.
As shown in Fig. 1, the method includes steps S101 to S105.
S101, confirming the identities of all participants in the conference video, and binding identity tags for each participant.
In this embodiment, the conference video refers to a multi-user video chat initiated by an initiator in conference software, together with the video file recorded over the whole session. It can replace a meeting in an offline office scene, so that participants in different places can conveniently meet across regions at any time. After the conference video is initiated, participants entering the conference are registered, their identities are determined, and an identity tag is bound to each participant, which makes it convenient to record each participant's conference record in the conference video.
In one embodiment, the step S101 includes:
identifying each participant through a face recognition system, confirming the identity of each participant, and binding an identity tag to each participant, wherein the identity tag is an identity name or a preset number.
In this embodiment, when a participant enters the conference, the conference software automatically acquires a face picture of the participant, and the face recognition system recognizes the picture, confirms the participant's identity, and binds an identity tag to the participant. The identity tag is an identity name or a preset number: the identity name may be an ID nickname on the application platform, and the preset number may be assigned automatically by the conference software, with numbers ordered by the sequence in which participants enter the conference. It should be noted that when the identity tag is an identity name, the purpose of face recognition is to determine whether the participant matches the corresponding identity name; when the identity tag is a preset number, the purpose of face recognition is to bind the participant to that number.
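The tag-binding rule in S101 can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Participant` record, the `face_id` key (standing in for whatever a face recognition system returns), the `use_names` switch, and the `No. N` label format are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Participant:
    face_id: str                    # hypothetical key produced by face recognition
    nickname: Optional[str] = None  # platform ID nickname, if one is known


def bind_identity_tags(joined: List[Participant], use_names: bool) -> Dict[str, str]:
    """Map each face_id to an identity tag.

    `joined` is assumed to be ordered by the time each participant entered
    the conference, so preset numbers follow the order of entry.
    """
    tags: Dict[str, str] = {}
    for order, person in enumerate(joined, start=1):
        if use_names and person.nickname:
            tags[person.face_id] = person.nickname   # identity name
        else:
            tags[person.face_id] = f"No. {order}"    # preset number
    return tags
```

Falling back to a number when a nickname is missing is a design choice of this sketch; the source only states that the tag is either a name or a preset number.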
S102, acquiring and analyzing voice data of each participant in the conference video to obtain all text paragraphs corresponding to each participant and timestamps corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the conference video.
In this embodiment, a voiceprint recognition module distinguishes the voices of the participants to obtain the voice data corresponding to each participant. The voice data of each participant is then analyzed to obtain all text paragraphs of that participant and the timestamp corresponding to each text paragraph, where a text paragraph is the text parsed from a voice segment of the participant in a certain time period. The text paragraphs and timestamps obtained for all participants are stored in a first text region, the first text region is displayed in the conference video, and a user can consult the conference record directly in the first text region.
In another embodiment, the voice data of each participant can be acquired accurately by confirming which participant is currently speaking; whether a participant is speaking can be determined quickly by monitoring the mouth area of each participant.
Specifically, the judgment can be made through the camera of the participant's mobile terminal or through an external camera used by the participant. One way to make the judgment is to measure the participant's mouth area: the mouth area when the participant is not speaking is measured first; when the participant speaks and the mouth opens, the ratio of the mouth area after opening to the mouth area when not speaking is computed, and if the ratio is greater than a preset ratio threshold, the participant is judged to be speaking. In this way the current speaker can be determined accurately, and the voice data of each participant can be acquired accurately.
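The mouth-area test can be illustrated with a short sketch. The source does not give a value for the preset ratio threshold or say how the mouth area is measured, so the default of 1.3 and the pixel-area inputs below are assumptions made only for illustration.

```python
from typing import Dict, List


def is_speaking(mouth_area: float, resting_area: float,
                ratio_threshold: float = 1.3) -> bool:
    """Return True when the open-mouth area exceeds the resting area by the
    preset ratio threshold (1.3 is an assumed value, not from the source)."""
    if resting_area <= 0:
        raise ValueError("resting_area must be positive")
    return mouth_area / resting_area > ratio_threshold


def current_speakers(mouth_areas: Dict[str, float],
                     resting_areas: Dict[str, float],
                     ratio_threshold: float = 1.3) -> List[str]:
    """Identity tags of participants judged to be speaking in this frame."""
    return [tag for tag, area in mouth_areas.items()
            if is_speaking(area, resting_areas[tag], ratio_threshold)]
```

For example, with resting areas of 100 for both "A" and "B" and current areas of 150 and 100, only "A" is reported as speaking.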
In one embodiment, as shown in fig. 2, the step S102 includes:
S201, recording voice data of each participant in the whole conference video, and dividing the voice data of each participant into a plurality of voice packets according to the audio fluctuation frequency of the voice data;
S202, analyzing each voice packet through a voice recognition module to obtain a corresponding text paragraph and a start time and an end time corresponding to the text paragraph, and taking the start time as a timestamp of the text paragraph;
S203, after the text paragraphs of all the participants are analyzed and obtained, sorting the text paragraphs according to the timestamp of each text paragraph and storing the sorted text paragraphs into a first text area;
S204, displaying the first text area in the conference video.
In this embodiment, the voice data of each participant may contain multiple speeches in multiple time periods; each speech can be recorded as one voice packet, and the voice packets together constitute the complete voice data. Specifically, in the voice data of one participant, an audio fluctuation frequency can be detected during speaking periods, while during non-speaking periods the audio fluctuation frequency is almost zero, so the voice data of each participant can be divided according to the audio fluctuation frequency to obtain a plurality of voice packets. Each voice packet is then analyzed by a voice recognition module to obtain the corresponding text paragraph together with its start time and end time, which yields all text paragraphs of each participant in the conference video and the time period (start time and end time) of each speech. For any text paragraph, its start time is taken as its timestamp, and all text paragraphs in the conference video are sorted by timestamp and stored in the first text region. Finally, the first text region is displayed in the conference video, where the user can query the corresponding conference record directly.
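Steps S202 and S203 can be sketched as below. The `TextParagraph` record and the packet tuple layout are assumptions of this sketch, and `recognize` is a hypothetical callable standing in for the voice recognition module, which the source does not specify.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Assumed packet layout: (start_seconds, end_seconds, audio_bytes).
Packet = Tuple[float, float, bytes]


@dataclass
class TextParagraph:
    timestamp: float  # start time of the voice packet, used as the timestamp
    end: float        # end time of the voice packet
    speaker: str      # identity tag of the participant
    text: str         # text recognized from the packet


def build_first_text_area(packets_by_speaker: Dict[str, List[Packet]],
                          recognize: Callable[[bytes], str]) -> List[TextParagraph]:
    """Convert every voice packet to a text paragraph, then sort all
    participants' paragraphs by timestamp (S202 and S203)."""
    paragraphs = [
        TextParagraph(start, end, speaker, recognize(audio))
        for speaker, packets in packets_by_speaker.items()
        for start, end, audio in packets
    ]
    paragraphs.sort(key=lambda p: p.timestamp)
    return paragraphs
```

Passing a trivial stand-in such as `lambda audio: audio.decode()` for `recognize` is enough to exercise the ordering logic without a real recognizer.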
In one embodiment, the step S201 includes:
analyzing the voice data of each participant to obtain the audio fluctuation frequency of the voice data, intercepting the section of the audio fluctuation frequency which is greater than the preset frequency, and taking each section as a voice packet.
In this embodiment, an audio detection module analyzes the voice data of each participant to obtain the audio fluctuation frequency of the voice data over the whole time period, and this frequency is compared with a preset frequency in the range of 80 to 300 Hz. For example, when the preset frequency is 120 Hz, each time period in which the audio fluctuation frequency exceeds 120 Hz is treated as a speaking period, and each speaking period is taken as one voice packet, so that the voice data is divided into a plurality of voice packets.
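The thresholding in S201 can be sketched as follows. The sketch assumes the audio detection module has already produced one fluctuation-frequency value per fixed-length frame; the per-frame representation, the `frame_seconds` parameter, and the function name are assumptions, while the 120 Hz default comes from the example in the text.

```python
from typing import List, Tuple


def split_into_voice_packets(frame_freqs_hz: List[float],
                             frame_seconds: float,
                             preset_hz: float = 120.0) -> List[Tuple[float, float]]:
    """Return (start, end) times of each run of consecutive frames whose
    fluctuation frequency is greater than the preset frequency."""
    packets: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the current speaking run
    for i, freq in enumerate(frame_freqs_hz):
        if freq > preset_hz and start is None:
            start = i                                   # a speaking run begins
        elif freq <= preset_hz and start is not None:
            packets.append((start * frame_seconds, i * frame_seconds))
            start = None                                # the run has ended
    if start is not None:                               # audio ended mid-speech
        packets.append((start * frame_seconds, len(frame_freqs_hz) * frame_seconds))
    return packets
```

With frames of 0.5 s and values `[0, 0, 150, 160, 0, 0, 200, 0]`, the two speaking runs become the packets (1.0, 2.0) and (3.0, 3.5).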
S103, according to the current playing progress of the conference video, a current text paragraph is played in a rolling mode in the first text area, and a corresponding identity tag and a corresponding time stamp are marked on the current text paragraph.
In this embodiment, when the user plays the conference video normally, the timestamp closest to the time node of the current playing progress is determined in the first text region, the text paragraph corresponding to that timestamp is scrolled into view in the first text region in real time, and the corresponding identity tag and timestamp are marked at the beginning of the displayed text paragraph. The displayed text paragraph thus corresponds to the conference video currently being played, and the user can follow the conference record more intuitively.
Furthermore, while the conference video is playing, in addition to scrolling the text paragraphs in the first text region, the text currently being spoken can be displayed on the avatar of the participant who is currently speaking.
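The lookup of the closest timestamp in S103 can be sketched with a binary search over the sorted timestamps. The function names and the `[tag mm:ss]` label format are assumptions; the source only says that the identity tag and timestamp are marked at the beginning of the paragraph.

```python
import bisect
from typing import List


def closest_paragraph_index(timestamps: List[float], position: float) -> int:
    """Index of the timestamp closest to the playback position.

    `timestamps` must be sorted in ascending order; ties go to the earlier one.
    """
    if not timestamps:
        raise ValueError("no text paragraphs available")
    i = bisect.bisect_left(timestamps, position)
    if i == 0:
        return 0
    if i == len(timestamps):
        return len(timestamps) - 1
    before, after = timestamps[i - 1], timestamps[i]
    return i if after - position < position - before else i - 1


def format_line(tag: str, timestamp: float, text: str) -> str:
    """Prefix a paragraph with its identity tag and mm:ss timestamp."""
    minutes, seconds = divmod(int(timestamp), 60)
    return f"[{tag} {minutes:02d}:{seconds:02d}] {text}"
```

A player might prefer the latest timestamp not after the playback position, so that a paragraph is not shown before it is spoken; the sketch follows the "closest timestamp" wording of the text instead.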
S104, according to the text information input by the user, scrolling and inquiring in the first text area, confirming the text paragraph where the text information is located, and executing jumping to the video playing node corresponding to the text paragraph.
In this embodiment, the user may scroll through the first text region to find the desired text content directly, or may use the search module for a quick query. Specifically, text information input by the user is received, the text paragraph associated with the text information is looked up in the first text region, and the display jumps to that text paragraph and to the video playing node corresponding to it, which achieves fast positioning.
In one embodiment, as shown in fig. 3, the step S104 includes:
S301, according to text information input by a user in a query box of the first text area, querying and confirming all text paragraphs containing the text information in the first text area, wherein the text information is a keyword;
S302, if a plurality of text paragraphs containing the text information exist in the first text region, continuing to screen out the exact text paragraph according to newly added keywords in the query box;
S303, if only one text paragraph in the first text region contains the text information, directly determining that text paragraph;
S304, receiving a jump instruction, and executing a jump to the video playing node corresponding to the text paragraph.
In this embodiment, the query box in the first text region is used to locate the corresponding node quickly. The user inputs the text information to be queried in the query box, the search module of the first text region quickly finds the text paragraph corresponding to the text information and jumps to it, and the user can then read that text paragraph directly.
If the first text region contains a plurality of text paragraphs containing the text information, all of these text paragraphs are selected and a selection area pops up at one side of the first text region; when the user selects a text paragraph in the selection area, the first text region jumps synchronously to the same text paragraph, which makes the query fast and accurate. Alternatively, the exact text paragraph is screened out according to keywords that the user adds in the query box. If only one text paragraph in the first text region contains the text information, the first text region jumps to that text paragraph directly.
After the jump to the corresponding text paragraph, it is determined whether the user issues a jump instruction; if so, playback jumps to the video playing node corresponding to the text paragraph, so that the text paragraph and the video content can be consulted together.
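The keyword query and refinement of S301 to S304 can be sketched as below. The `TextParagraph` record and the function names are assumptions, and plain substring matching is used as the simplest reading of "containing the text information"; the source does not say how matching is performed.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextParagraph:
    timestamp: float  # video playing node, in seconds
    speaker: str      # identity tag
    text: str


def search(first_text_area: List[TextParagraph],
           keywords: List[str]) -> List[TextParagraph]:
    """Paragraphs that contain every keyword typed in the query box."""
    return [p for p in first_text_area if all(k in p.text for k in keywords)]


def resolve_jump(first_text_area: List[TextParagraph],
                 keywords: List[str]) -> Optional[float]:
    """Timestamp to jump to once exactly one paragraph matches, else None.

    None means either no match or several matches; in the latter case the
    user adds keywords (S302) or picks from a selection list.
    """
    matches = search(first_text_area, keywords)
    return matches[0].timestamp if len(matches) == 1 else None
```

Each keyword added to the list narrows the match set, which mirrors the screening step in S302; the returned timestamp is what a player would seek to when the jump instruction arrives.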
And S105, according to the target participant selected by the user, acquiring all text paragraphs corresponding to the target participant, storing the text paragraphs into a second text region, displaying the second text region in the conference video, and executing jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region.
In this embodiment, when the user does not input accurate text information in the query box, the query can instead be performed through a target participant selected by the user. All text paragraphs of any participant can be obtained separately and stored in a second text region, and the second text region is displayed in the conference video. The user then queries directly in the second text region to obtain the corresponding information, confirms the selected text paragraph there, and jumps to the video playing node corresponding to that text paragraph. This provides multiple ways to query.
In one embodiment, as shown in fig. 4, the step S105 includes:
S401, acquiring the avatar of the participant that the user selects with a long press, and determining that participant as the target participant to be queried;
S402, selecting all text paragraphs of the target participant from the first text area, storing the text paragraphs in a second text area, and displaying the second text area in the conference video;
S403, confirming the text paragraph that the user finds by scrolling and selects in the second text area;
S404, receiving a jump instruction, and executing a jump to the video playing node corresponding to the text paragraph.
In this embodiment, in the conference video, the user can long-press a participant's avatar on the display interface with a selection device such as a mouse, which determines the selected participant as the target participant to be queried, so that the query targets that participant. Specifically, after the target participant is determined, all text paragraphs belonging to the target participant are selected from the first text region according to the target participant's identity tag and stored in the second text region. This narrows the query range, so the conference record can be found quickly by scrolling through the second text region. Then, based on the text paragraph that the user selects in the second text region, and once the jump instruction is confirmed, playback jumps directly to the video playing node corresponding to that text paragraph.
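Steps S402 to S404 amount to filtering the first text region by identity tag and seeking to the chosen paragraph's timestamp. The sketch below assumes a simple `TextParagraph` record and hypothetical function names; how the selection and jump instruction reach the code is left to the user interface.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextParagraph:
    timestamp: float  # video playing node, in seconds
    speaker: str      # identity tag
    text: str


def build_second_text_area(first_text_area: List[TextParagraph],
                           target_tag: str) -> List[TextParagraph]:
    """All paragraphs of the target participant, in their original order."""
    return [p for p in first_text_area if p.speaker == target_tag]


def jump_target(second_text_area: List[TextParagraph],
                selected_index: int) -> float:
    """Video playing node for the paragraph the user selected."""
    return second_text_area[selected_index].timestamp
```

Because the first text region is already sorted by timestamp, the filtered list keeps chronological order without a second sort.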
In an embodiment, the first text region and the second text region are displayed in a display interface of the conference video in a semi-transparent state, and the characters in the first text region and the second text region are displayed in the first text region and the second text region in a semi-transparent state respectively.
In this embodiment, the first text region and the second text region are set to be semi-transparent and displayed on the display interface of the conference video, so as to prevent the first text region and the second text region from blocking the video picture in the display interface. It can be understood that displaying the characters in the first text region and the second text region in a semi-transparent state likewise prevents the characters from blocking the video picture in the display interface. To avoid blocking the video picture, the display interface may alternatively be divided so that the video picture, the first text region and the second text region occupy separate areas, or the first text region and the second text region may be set as hideable regions that can be opened or hidden at any time.
Furthermore, it should be noted that the fast positioning query method provided by the embodiment of the present invention is not only used for querying the conference record after the conference video is finished, but also used for querying in the real-time conference process, and the same query method can be used for fast querying the conference record stored in real time in the conference process.
The embodiment of the invention also provides a device for consulting the conference record, which is used for executing any embodiment of the method for consulting the conference record. Specifically, referring to fig. 5, fig. 5 is a schematic block diagram of a device for referring to a meeting record according to an embodiment of the present invention.
As shown in fig. 5, the device 500 for referring to a meeting record includes: a confirming unit 501, an analyzing unit 502, a playing unit 503, a first jumping unit 504, and a second jumping unit 505.
A confirming unit 501, configured to confirm identities of all participants in the conference video, and bind an identity tag to each participant;
an analyzing unit 502, configured to obtain and analyze voice data of each participant in the conference video, obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, store the text paragraphs in a first text region, and display the first text region in the conference video;
a playing unit 503, configured to scroll and play a current text paragraph in a first text region according to a current playing progress of the conference video, and mark a corresponding identity tag and a timestamp on the current text paragraph;
a first jumping unit 504, configured to, according to text information input by a user, perform a scrolling query in the first text region, determine the text paragraph where the text information is located, and execute a jump to the video playing node corresponding to the text paragraph;
a second jumping unit 505, configured to obtain all text paragraphs corresponding to a target participant according to the target participant selected by the user, store the text paragraphs in a second text region, display the second text region in the conference video, and execute jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region;
According to the text information provided by the user or the target participant selected by the user, the device quickly queries the required text paragraph in the first text region or the second text region respectively, and jumps to the video playing node corresponding to that text paragraph, and thus has the advantage of quickly locating the corresponding node of the conference record and reviewing the video and audio files.
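The keyword query handled by the first jumping unit 504 can be sketched as follows. This is a hedged illustration under stated assumptions rather than the disclosed implementation: the `Paragraph` type and the functions `query_paragraphs` and `resolve_jump` are hypothetical names, matching is assumed to be a case-insensitive substring test, and timestamps are assumed to be in seconds. The sketch follows the narrowing behaviour of the method: when several paragraphs match, further keywords are added until exactly one paragraph remains.

```python
from typing import List, NamedTuple, Optional


class Paragraph(NamedTuple):
    identity_tag: str  # identity tag of the speaking participant
    timestamp: float   # start time in seconds (assumed unit)
    text: str


def query_paragraphs(region: List[Paragraph], keywords: List[str]) -> List[Paragraph]:
    """Return the paragraphs whose text contains every keyword (case-insensitive)."""
    lowered = [k.lower() for k in keywords]
    return [p for p in region if all(k in p.text.lower() for k in lowered)]


def resolve_jump(region: List[Paragraph], keywords: List[str]) -> Optional[float]:
    """Return a jump target only when exactly one paragraph matches the keywords."""
    matches = query_paragraphs(region, keywords)
    return matches[0].timestamp if len(matches) == 1 else None


# Hypothetical sample data standing in for the first text region.
region = [
    Paragraph("Alice", 0.0, "Let's begin with the schedule."),
    Paragraph("Bob", 12.5, "The schedule slipped by a week."),
    Paragraph("Alice", 30.0, "Then we move the review to Friday."),
]

ambiguous = resolve_jump(region, ["schedule"])           # two paragraphs match
precise = resolve_jump(region, ["schedule", "slipped"])  # narrowed to one paragraph
```

Returning `None` for an ambiguous query leaves it to the caller to prompt for another keyword, which mirrors the step of continuing to screen with newly added keywords.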
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
The above-mentioned means for referring to the meeting record may be implemented in the form of a computer program which can be run on a computer device as shown in fig. 6.
Referring to fig. 6, fig. 6 is a schematic block diagram of a computer device according to an embodiment of the present invention. The computer device 600 is a server, and the server may be an independent server or a server cluster composed of a plurality of servers.
Referring to fig. 6, the computer device 600 includes a processor 602, memory, and a network interface 605 connected by a system bus 601, where the memory may include a non-volatile storage medium 603 and an internal memory 604.
The non-volatile storage medium 603 may store an operating system 6031 and computer programs 6032. The computer program 6032, when executed, may cause the processor 602 to perform a review method for meeting records.
The processor 602 is used to provide computing and control capabilities that support the operation of the overall computer device 600.
The internal memory 604 provides an environment for the execution of a computer program 6032 in the non-volatile storage medium 603, which computer program 6032, when executed by the processor 602, causes the processor 602 to perform a method of reviewing a meeting record.
The network interface 605 is used for network communication, such as the transmission of data information. Those skilled in the art will appreciate that the configuration shown in fig. 6 is a block diagram of only the part of the configuration relevant to aspects of the present invention and does not limit the computer device 600 to which aspects of the present invention may be applied; a particular computer device 600 may include more or fewer components than those shown, may combine certain components, or may have a different arrangement of components.
Those skilled in the art will appreciate that the embodiment of a computer device illustrated in fig. 6 does not constitute a limitation on the specific construction of the computer device, and that in other embodiments a computer device may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. For example, in some embodiments, the computer device may only include a memory and a processor, and in such embodiments, the structures and functions of the memory and the processor are consistent with those of the embodiment shown in fig. 6, and are not described herein again.
It should be understood that, in the embodiment of the present invention, the processor 602 may be a Central Processing Unit (CPU), or may be another general-purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
In another embodiment of the invention, a computer-readable storage medium is provided. The computer readable storage medium may be a non-volatile computer readable storage medium. The computer-readable storage medium stores a computer program, wherein the computer program, when executed by a processor, implements a method for referring to a meeting record according to an embodiment of the present invention.
The storage medium is a physical, non-transitory storage medium, and may be any physical storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, devices and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
While the invention has been described with reference to specific embodiments, the invention is not limited thereto, and various equivalent modifications and substitutions can be easily made by those skilled in the art within the technical scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for consulting a meeting record, comprising:
confirming the identities of all participants in the conference video, and binding identity tags for each participant;
acquiring and analyzing voice data of each participant in the conference video to obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the conference video;
scrolling and playing a current text paragraph in a first text area according to the current playing progress of the conference video, and marking a corresponding identity tag and a timestamp on the current text paragraph;
according to text information input by a user, scrolling and inquiring in the first text area, confirming a text paragraph where the text information is located, and executing jumping to a video playing node corresponding to the text paragraph;
and acquiring all text paragraphs corresponding to the target participant according to the target participant selected by the user, storing the text paragraphs into a second text region, displaying the second text region in the conference video, and executing jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region.
2. The method for consulting a conference record according to claim 1, wherein the confirming identities of all participants in the conference video and binding an identity tag to each participant includes:
and identifying each participant through a face recognition system, confirming the identity of each participant, and binding an identity tag to each participant, wherein the identity tag is an identity name or a preset number.
3. The method for referring to a meeting record of claim 1, wherein the obtaining and analyzing the voice data of each participant in the meeting video to obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the meeting video comprises:
recording the voice data of each participant in the whole conference video, and dividing the voice data of each participant into a plurality of voice packets according to the audio fluctuation frequency of the voice data;
analyzing each voice packet through a voice recognition module to obtain a corresponding text paragraph and a start time and an end time corresponding to the text paragraph, and taking the start time as a timestamp of the text paragraph;
after the text paragraphs of all participants are analyzed and obtained, text sequencing is carried out according to the timestamp of each text paragraph and the text is stored in a first text area;
and displaying the first text region in the conference video.
4. The method for reviewing a conference record, as claimed in claim 3, wherein said recording voice data of each participant in the whole conference video, and dividing the voice data of each participant into a plurality of voice packets according to the audio fluctuation frequency of the voice data comprises:
analyzing the voice data of each participant to obtain the audio fluctuation frequency of the voice data, intercepting the sections in which the audio fluctuation frequency is greater than a preset frequency, and taking each section as a voice packet.
5. The method for referring to a meeting record of claim 1, wherein the scrolling, querying and confirming a text paragraph in which the text information is located in the first text region according to the text information input by the user, and performing a jump to a video playing node corresponding to the text paragraph comprises:
according to text information input by a user in the query box of the first text area, querying and confirming all text paragraphs containing the text information in the first text area, wherein the text information is a keyword;
if a plurality of text paragraphs containing the text information are in the first text area, continuously screening out accurate text paragraphs according to newly added keywords in the query box;
if only one text paragraph in the first text region contains the text information, directly determining that text paragraph;
and receiving a jump instruction, and executing jump to a video playing node corresponding to the text paragraph.
6. The method for referring to meeting records of claim 1, wherein the obtaining, according to a target participant selected by a user, all text paragraphs corresponding to the target participant and storing the text paragraphs to a second text region, displaying the second text region in the meeting video, and performing a jump to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region comprises:
acquiring the avatar of the participant that the user selects by a long press, and determining that participant as the target participant to be queried;
selecting all text paragraphs of the target participant from the first text region, storing the text paragraphs in a second text region, and displaying the second text region in the conference video;
confirming a text paragraph that the user scrolls the query and selects in the second text region;
and receiving a jump instruction, and executing jump to a video playing node corresponding to the text paragraph.
7. The method for viewing a conference record according to claim 1, wherein the first text region and the second text region are displayed in a display interface of the conference video in a semi-transparent manner, and characters in the first text region and the second text region are displayed in the first text region and the second text region in a semi-transparent manner, respectively.
8. A device for referring to a conference record, comprising:
the confirming unit is used for confirming the identities of all participants in a conference video and binding an identity tag for each participant;
the analysis unit is used for acquiring and analyzing voice data of each participant in the conference video to obtain all text paragraphs corresponding to each participant and a timestamp corresponding to each text paragraph, storing the text paragraphs and the timestamps into a first text area, and displaying the first text area in the conference video;
the playing unit is used for scrolling and playing a current text paragraph in a first text area according to the current playing progress of the conference video and marking a corresponding identity tag and a corresponding timestamp on the current text paragraph;
the first jumping unit is used for scrolling and querying in the first text area according to the text information input by the user, confirming the text paragraph where the text information is located, and executing a jump to the video playing node corresponding to the text paragraph;
and the second jumping unit is used for acquiring all text paragraphs corresponding to the target participant according to the target participant selected by the user, storing the text paragraphs into a second text region, displaying the second text region in the conference video, and jumping to a video playing node corresponding to the text paragraph according to the text paragraph selected by the user in the second text region.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the method of consulting a meeting record according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a processor, causes the processor to execute the method of referring to a meeting record of any one of claims 1 to 7.
CN202011608242.0A 2020-12-30 2020-12-30 Conference record consulting method and device, computer equipment and storage medium Active CN112839195B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011608242.0A CN112839195B (en) 2020-12-30 2020-12-30 Conference record consulting method and device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011608242.0A CN112839195B (en) 2020-12-30 2020-12-30 Conference record consulting method and device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112839195A true CN112839195A (en) 2021-05-25
CN112839195B CN112839195B (en) 2023-10-10

Family

ID=75925422

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011608242.0A Active CN112839195B (en) 2020-12-30 2020-12-30 Conference record consulting method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112839195B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326387A (en) * 2021-05-31 2021-08-31 引智科技(深圳)有限公司 Intelligent conference information retrieval method
CN113411532A (en) * 2021-06-24 2021-09-17 Oppo广东移动通信有限公司 Method, device, terminal and storage medium for recording content
CN114282621A (en) * 2021-12-29 2022-04-05 湖北微模式科技发展有限公司 Multi-mode fused speaker role distinguishing method and system
CN114661942A (en) * 2022-03-31 2022-06-24 医渡云(北京)技术有限公司 Method and device for processing streaming tone data, electronic equipment and computer readable medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170126667A (en) * 2016-05-10 2017-11-20 삼성에스디에스 주식회사 Method for generating conference record automatically and apparatus thereof
JP2019105740A (en) * 2017-12-12 2019-06-27 キヤノン株式会社 Conference system, summary device, control method of conference system, control method of summary device, and program
CN110335612A (en) * 2019-07-11 2019-10-15 招商局金融科技有限公司 Minutes generation method, device and storage medium based on speech recognition
CN111193890A (en) * 2018-11-14 2020-05-22 株式会社理光 Conference record analyzing device and method and conference record playing system
CN111563182A (en) * 2020-04-28 2020-08-21 深圳震有科技股份有限公司 Voice conference record storage processing method and device
CN111708912A (en) * 2020-05-06 2020-09-25 深圳震有科技股份有限公司 Video conference record query processing method and device
WO2020218664A1 (en) * 2019-04-25 2020-10-29 이봉규 Smart conference system based on 5g communication and conference support method using robotic processing automation
CN111953852A (en) * 2020-07-30 2020-11-17 北京声智科技有限公司 Call record generation method, device, terminal and storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113326387A (en) * 2021-05-31 2021-08-31 引智科技(深圳)有限公司 Intelligent conference information retrieval method
CN113326387B (en) * 2021-05-31 2022-12-13 引智科技(深圳)有限公司 Intelligent conference information retrieval method
CN113411532A (en) * 2021-06-24 2021-09-17 Oppo广东移动通信有限公司 Method, device, terminal and storage medium for recording content
CN113411532B (en) * 2021-06-24 2023-08-08 Oppo广东移动通信有限公司 Method, device, terminal and storage medium for recording content
CN114282621A (en) * 2021-12-29 2022-04-05 湖北微模式科技发展有限公司 Multi-mode fused speaker role distinguishing method and system
CN114661942A (en) * 2022-03-31 2022-06-24 医渡云(北京)技术有限公司 Method and device for processing streaming tone data, electronic equipment and computer readable medium

Also Published As

Publication number Publication date
CN112839195B (en) 2023-10-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant