CN113918114A - Document control method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN113918114A
CN113918114A (application CN202111203894.0A)
Authority
CN
China
Prior art keywords
page
document
content
voice
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111203894.0A
Other languages
Chinese (zh)
Other versions
CN113918114B (en)
Inventor
邢起源 (Xing Qiyuan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111203894.0A
Publication of CN113918114A
Application granted
Publication of CN113918114B
Legal status: Active
Anticipated expiration


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16 — Sound input; Sound output
    • G06F 3/167 — Audio in a user interface, e.g. using voice commands for navigating, audio feedback
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 — Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0481 — Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/048 — Interaction techniques based on graphical user interfaces [GUI]
    • G06F 3/0487 — Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
    • G06F 3/0488 — Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, using a touch-screen or digitiser, e.g. input of commands through traced gestures
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 — Handling natural language data
    • G06F 40/30 — Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application relates to a document control method and apparatus, a computer device, and a storage medium. The method comprises: while a document is being presented, displaying a voice-following page-turn trigger control in a presentation interface of the document; entering a voice-following page-turn mode in response to a trigger operation on that control; and, in the voice-following page-turn mode, turning to a target page of the document following the presenter's speech, where the text content of the target page semantically matches the presenter's speech. The method can be applied to online collaborative documents and lets a document turn its pages automatically as the presenter speaks: the presenter needs neither a separate presentation pen nor explicit control commands outside the presentation content. Page turning is therefore convenient, the presenter's train of thought stays coherent throughout the presentation, and the user experience of document presentation is improved.

Description

Document control method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technology, and in particular to a document control method and apparatus, a computer device, and a storage medium.
Background
Document presentation is widely used in work reports, corporate publicity, product introductions, education and training, and speeches of all kinds, and helps a presenter convey a document's content and theme directly and vividly. At present, however, when a page-turn operation is needed during a presentation, the presenter usually clicks "previous page" or "next page" manually, or controls page turning with a separate presentation pen. Such operations are not only cumbersome but also repeatedly interrupt the presenter's train of thought, making the whole presentation incoherent.
Disclosure of Invention
In view of the above, it is necessary to provide a document control method, apparatus, computer device, and storage medium that solve the above technical problems.
A document control method, the method comprising:
while a document is being presented, displaying a voice-following page-turn trigger control in a presentation interface of the document;
entering a voice-following page-turn mode in response to a trigger operation on the voice-following page-turn trigger control;
and in the voice-following page-turn mode, turning to a target page of the document following the presenter's speech, wherein text content in the target page semantically matches the presenter's speech.
A document control apparatus, the apparatus comprising:
a presentation interface display module, configured to display the voice-following page-turn trigger control in a presentation interface of a document while the document is being presented;
a response module, configured to enter a voice-following page-turn mode in response to a trigger operation on the voice-following page-turn trigger control;
and a following page-turn module, configured to turn, in the voice-following page-turn mode, to a target page of the document following the presenter's speech, text content in the target page semantically matching the presenter's speech.
In one embodiment, the document control apparatus further comprises:
an editing interface display module, configured to display a voice-following page-turn initialization control in an editing interface of the document while the document is being edited; to display a following page-turn prompt area in the editing interface in response to a trigger operation on the initialization control; and to display, in the prompt area, the text content corresponding to each page of the document.
In one embodiment, the editing interface display module is further configured to enter a document editing mode for the document in response to a trigger operation for editing the document; to display an editing interface of the document after entering the document editing mode; and to display the voice-following page-turn initialization control in the editing interface.
In one embodiment, the document control apparatus further comprises:
a text content editing module, configured to display edited text content in the following page-turn prompt area in response to a text editing operation in that area, and to update the text content corresponding to each page of the document according to the edited text content.
In one embodiment, the presentation interface display module is further configured to hide the voice-following page-turn initialization control and the following page-turn prompt area in the presentation interface while the document is being presented.
In one embodiment, the presentation interface display module is further configured to enter a document presentation mode for the document in response to a trigger operation for presenting the document; to display a presentation interface of the document after entering the document presentation mode; and to display the voice-following page-turn trigger control in the presentation interface.
In one embodiment, the response module is further configured to exit the voice-following page-turn mode in response to a trigger operation on the voice-following page-turn trigger control while in that mode.
In one embodiment, the apparatus further comprises:
a module configured to extract, before the voice-following page-turn mode is entered, the text content corresponding to each page of the document and store each page number with its text content, when the document is in the document editing mode and no trigger operation has occurred on the voice-following page-turn initialization control displayed in the editing interface.
In one embodiment, the document control apparatus further comprises:
a voice-following page-turn initialization module, configured to extract the text content corresponding to each page of the document, the text content comprising at least one of page remark content and page title content, and to store each page number with its text content.
In one embodiment, the voice-following page-turn initialization module is further configured to extract, for each page of the document, the original text content of that page; when the original text content is shorter than a preset number of characters, to keep it whole as the text extracted for that page; and when it is longer, to read the preset number of characters from the first character onward and keep reading until a separator is first encountered, keeping what was read as the text extracted for that page.
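The per-page extraction rule above can be sketched as follows. The preset length and separator set here are illustrative values not fixed by the application, and whether the separator itself is kept is an interpretation choice.

```python
PRESET_LEN = 20                      # illustrative; the "preset number" is not specified
SEPARATORS = set("。，；！？,.;!?")    # illustrative separator set

def extract_page_text(original: str, preset_len: int = PRESET_LEN) -> str:
    """Keep short page text whole; otherwise read preset_len characters
    and keep reading until a separator is first encountered."""
    if len(original) < preset_len:
        return original              # short text is kept as-is
    end = preset_len
    while end < len(original) and original[end] not in SEPARATORS:
        end += 1
    return original[:end]            # content read before the first separator
```

If a page's text never reaches a separator, the whole text is kept, which matches the intent of retaining a meaningful, sentence-bounded snippet per page.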
In one embodiment, the document control apparatus further comprises:
an acquisition module, configured to acquire the presenter's voice data in the voice-following page-turn mode;
a sentence conversion module, configured to convert the presenter's voice data into sentences;
and a matching module, configured to semantically match the latest sentence against the text content of each page of the document one by one and to determine the target page from the first successfully matched text content.
In one embodiment, the matching module is further configured to semantically match the latest sentence against the page remark content of each page one by one, taking the page of the first successfully matched remark content as the target page; when the latest sentence matches no page remark content, to semantically match it against the page title content of each page one by one, taking the page of the first successfully matched title content as the target page; and when it matches no page title content either, to instruct the acquisition module to keep acquiring the presenter's voice data in the voice-following page-turn mode.
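The two-stage lookup described above (page remarks first, then page titles) can be sketched as below. The page fields and the `is_match` predicate are stand-ins for the stored per-page text and the semantic-matching step described later.

```python
from typing import Callable, Optional

def find_target_page(sentence: str,
                     pages: list,
                     is_match: Callable[[str, str], bool]) -> Optional[int]:
    """Return the page number of the first page whose remark content matches
    the latest sentence; fall back to page titles; return None (keep
    acquiring voice data) when nothing matches."""
    for page in pages:                                    # stage 1: remarks
        if page.get("remark") and is_match(sentence, page["remark"]):
            return page["number"]
    for page in pages:                                    # stage 2: titles
        if page.get("title") and is_match(sentence, page["title"]):
            return page["number"]
    return None
```

Trying remarks before titles reflects the assumption that a presenter's speaker notes track the spoken narration more closely than slide titles do.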
In one embodiment, the matching module is further configured, while successively matching the latest sentence against the page remark content of each page of the document, to compare the length of the latest sentence with that of the remark content of the page being matched; when the sentence is longer, to truncate it from its last character toward its first to the length of the remark content and semantically match the truncated sentence against the remark content; and when the sentence is shorter, to truncate the remark content from its last character toward its first to the length of the sentence and semantically match the sentence against the truncated content.
In one embodiment, the matching module applies the same length alignment when successively matching the latest sentence against the page title content of each page of the document: the longer of the sentence and the title content is truncated from its last character toward its first to the length of the shorter before semantic matching.
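The length alignment used in both embodiments above amounts to keeping the tail end of the longer string, since truncation runs from the last character toward the first. A minimal sketch:

```python
def align_tail(sentence: str, page_text: str):
    """Truncate the longer of the two strings from its last character
    toward its first so both have the length of the shorter; the aligned
    pair is then handed to semantic matching."""
    n = min(len(sentence), len(page_text))
    return sentence[-n:], page_text[-n:]
```

Keeping the tail favors the most recently spoken words of the sentence, which is the part most likely to announce the upcoming page.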
In one embodiment, the matching module is further configured, while successively matching the latest sentence against the text content of each page of the document, to segment the latest sentence into a word sequence and the text content of the page being matched into another word sequence; to build a sentence vector from the word vectors of the words in the sentence's word sequence and a text vector from the word vectors of the words in the text content's word sequence; and to judge the latest sentence and the text content successfully matched when the similarity between the sentence vector and the text vector exceeds a preset threshold.
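A minimal sketch of this vector-based matching step. Averaging word vectors is one common way to build a sentence vector (the application does not fix the aggregation), and the tokenization, word vectors, and threshold below are all illustrative.

```python
import math

def sentence_vector(tokens: list, word_vectors: dict) -> list:
    """Average the word vectors of the tokens (unknown tokens contribute zero)."""
    dims = len(next(iter(word_vectors.values())))
    total = [0.0] * dims
    for tok in tokens:
        for i, v in enumerate(word_vectors.get(tok, [0.0] * dims)):
            total[i] += v
    return [v / max(len(tokens), 1) for v in total]

def cosine_similarity(u: list, v: list) -> float:
    """Cosine similarity, the usual choice for comparing such vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def is_semantic_match(sent_tokens, text_tokens, word_vectors, threshold=0.8):
    """Match succeeds when sentence-vector / text-vector similarity
    exceeds the preset threshold."""
    return cosine_similarity(sentence_vector(sent_tokens, word_vectors),
                             sentence_vector(text_tokens, word_vectors)) > threshold
```

In practice the word vectors would come from a pretrained embedding model and the segmentation from a Chinese word-segmentation tool; both are outside the scope of this sketch.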
In one embodiment, the document control apparatus further comprises:
a page-turn module, configured to generate a page-turn instruction carrying the page number of the target page and, after the document has turned from the current page to the target page according to that instruction, to instruct the acquisition module to keep acquiring the presenter's voice data in the voice-following page-turn mode.
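The page-turn step can be sketched as: build an instruction carrying the target page number, apply it, then resume voice acquisition. The instruction fields and the listening flag here are illustrative, not part of the application.

```python
def make_turn_instruction(page_number: int) -> dict:
    """Page-turn instruction carrying the target page number."""
    return {"action": "turn_page", "page": page_number}

class PresentationState:
    """Minimal model of the presenting terminal's page-turn handling."""

    def __init__(self, total_pages: int):
        self.current_page = 1
        self.total_pages = total_pages
        self.acquiring_voice = False

    def apply(self, instruction: dict) -> None:
        """Turn from the current page to the target page, then resume
        acquiring the presenter's voice data."""
        target = instruction["page"]
        if 1 <= target <= self.total_pages:   # ignore out-of-range targets
            self.current_page = target
        self.acquiring_voice = True
```

Resuming acquisition after every turn is what makes the mode continuous: the document keeps following the speech until the mode is exited.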
In one embodiment, the document is an online collaborative document, and the document control apparatus further comprises:
an invitation module, configured to initiate a document collaboration invitation with the access address of the online collaborative document;
and a synchronization module, configured to synchronize document control information generated while the online collaborative document is presented to the terminals that have accepted the invitation, so that those terminals synchronize their control state of the document accordingly.
A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:
while a document is being presented, displaying a voice-following page-turn trigger control in a presentation interface of the document;
entering a voice-following page-turn mode in response to a trigger operation on the voice-following page-turn trigger control;
and in the voice-following page-turn mode, turning to a target page of the document following the presenter's speech, wherein text content in the target page semantically matches the presenter's speech.
A computer-readable storage medium on which a computer program is stored which, when executed by a processor, carries out the steps of:
while a document is being presented, displaying a voice-following page-turn trigger control in a presentation interface of the document;
entering a voice-following page-turn mode in response to a trigger operation on the voice-following page-turn trigger control;
and in the voice-following page-turn mode, turning to a target page of the document following the presenter's speech, wherein text content in the target page semantically matches the presenter's speech.
A computer program comprising computer instructions stored in a computer-readable storage medium; a processor of a computer device reads the computer instructions from the storage medium and executes them, causing the computer device to perform the steps of the document control method described above.
A computer program product comprising a computer program which, when executed by a processor, carries out the steps of the document control method described above.
According to the document control method and apparatus, computer device, and storage medium above, a voice-following page-turn trigger control is displayed in the presentation interface while a document is presented, and triggering it starts a voice-following page-turn mode for the document. In this mode the document turns its pages following the presenter's speech, the text content of each target page semantically matching what the presenter is saying, so the document turns its pages automatically as the presenter speaks. The presenter needs neither a separate presentation pen nor control commands outside the presentation content; page turning is very convenient, the presenter's train of thought stays coherent throughout the presentation, and the user experience of document presentation is improved.
Drawings
FIG. 1 is a diagram showing an application environment of a document control method in one embodiment;
FIG. 2 is a flowchart illustrating a document control method according to an embodiment;
FIG. 3 is an interface diagram of a presentation interface in one embodiment;
FIG. 4 is an interface diagram of a document automatically turning pages following the presenter's speech in one embodiment;
FIG. 5 is a flowchart of the steps of performing voice-following page-turn initialization on a document in one embodiment;
FIG. 6 is an interface diagram of an editing interface in one embodiment;
FIG. 7 is an interface diagram showing a following page-turn prompt area in an editing interface in one embodiment;
FIG. 8 is a simplified flowchart illustration of a document control method in one embodiment;
FIG. 9 is a flowchart illustrating a document control method in accordance with an exemplary embodiment;
FIG. 10 is a block diagram showing the construction of a document control apparatus according to an embodiment;
FIG. 11 is a block diagram showing the construction of a document control apparatus in another embodiment;
FIG. 12 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The document control method provided by the application can be applied to the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 over a wired or wireless network. The method can be used in many scenarios, such as conferences, work reports, corporate publicity, product introductions, education and training, and lectures of all kinds.
While presenting a document, the terminal 102 may display the voice-following page-turn trigger control in the document's presentation interface and enter the voice-following page-turn mode in response to a trigger operation on that control. In this mode the terminal 102 turns to a target page of the document following the presenter's speech, the text content of the target page semantically matching the speech.
While the document is being edited, the terminal 102 may display the voice-following page-turn initialization control in the editing interface, extract the text content corresponding to each page of the document in response to a trigger operation on that control, and store each page number with its text content on the server 104.
Correspondingly, in the voice-following page-turn mode, the terminal 102 may send the presenter's speech to the server 104; the server 104 semantically matches the speech against the stored text content of each page of the document, generates a page-turn instruction from the page number of the successfully matched target page, and returns it to the terminal 102, which turns to the target page according to the page number in the instruction.
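The server side of this split can be sketched as follows. `recognize` and `is_match` stand in for the speech-recognition and semantic-matching services (e.g. servers 1042 and 1043 in FIG. 1), and the instruction format is illustrative.

```python
from typing import Callable, Optional

def handle_presenter_audio(audio: bytes,
                           recognize: Callable[[bytes], str],
                           page_texts: dict,
                           is_match: Callable[[str, str], bool]) -> Optional[dict]:
    """Recognize the presenter's speech, match it against the stored
    per-page text, and return a page-turn instruction for the first
    matching page, or None so the terminal keeps listening."""
    sentence = recognize(audio)
    for number, text in sorted(page_texts.items()):   # pages in order
        if is_match(sentence, text):
            return {"action": "turn_page", "page": number}
    return None
```

Doing recognition and matching server-side keeps the terminal light: it only streams audio up and applies the page-turn instructions that come back.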
The terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet, portable wearable device, smart voice interaction device, smart appliance, or vehicle-mounted terminal. In one embodiment, the terminal 102 runs an instant messaging application that supports browsing, presenting, and editing displayed documents. In another embodiment, it runs an online collaborative document application that supports online browsing, presentation, and editing of documents, including simultaneous browsing and editing by multiple users. The server 104 may be implemented as a stand-alone server or as a cluster of servers; it may be a dedicated application server providing computing and storage services for the instant messaging or online collaborative document application running on the terminal 102. As shown in FIG. 1, the server 104 may also be a cluster of intercommunicating servers, such as a document server 1041, a speech recognition server 1042, and a semantic matching server 1043.
In the related art, two page-turning approaches are used during document presentation. In the first, the presenter controls page turning with a presentation pen or keyboard. This requires extra control equipment, adding cost, and requires manual operation, so during the presentation the presenter's train of thought is interrupted by their own actions, harming the presentation experience. In the second, voice-controlled page turning, the presenter issues a page-turn command to a voice assistant, such as "turn to page 2" or "turn to chapter 3"; the assistant parses the page number from the command and instructs the presentation device to turn to that page. The presenter must therefore utter control commands that are not part of the speech itself; such extra commands are stilted and redundant and likewise interrupt the presenter's train of thought.
The document control method provided by the embodiments of the application instead displays a voice-following page-turn trigger control in the presentation interface while a document is presented, and triggering it starts a voice-following page-turn mode for the document. In this mode the document turns its pages following the presenter's speech, the text content of each target page semantically matching what the presenter is saying; the presenter needs neither a separate presentation pen nor control commands outside the presentation content. Page turning is very convenient, the presenter's train of thought stays coherent throughout the presentation, and the user experience of document presentation is improved.
In one embodiment, as shown in fig. 2, a document control method is provided, which is described by taking the method as an example applied to the terminal 102 in fig. 1, and includes the following steps:
step 202, when the document is demonstrated, the following voice page turning triggering control is shown in a demonstration interface of the document.
The document may be a local document, such as a PPT document, a Word document, or an Excel document, among others. The document may also be an online collaboration document that supports multi-person collaboration, the online collaboration document may support multi-person browsing and editing simultaneously, the online collaboration document may be an online slide, an online document, an online table, or the like. The online collaboration document can be stored in the cloud, and the terminal opens and accesses the content of the online collaboration document stored in the cloud server through the access address. The terminal can display the document page of the online collaboration document through the browser, and can also display the document page of the online collaboration document in the instant communication application.
And the presentation interface is an interface presented when the document is in a document presentation mode. Accordingly, the editing interface is an interface presented when the document is in the document editing mode. The terminal may switch the document between the document presentation mode and the document editing mode in response to a switching instruction between the document presentation mode and the document editing mode. Specifically, when the document is in the document editing mode, the terminal can enter the document demonstration mode from the document editing mode in response to a trigger operation for demonstrating the document, and a demonstration interface of the document is shown. When the document is in the document demonstration mode, the terminal can respond to the triggering operation of exiting the document demonstration mode, close the demonstration of the document, switch from the document demonstration mode to the document editing mode, and display the editing interface of the document.
And the following voice page turning trigger control is an icon or a control for triggering to enter a following voice page turning mode.
In one embodiment, step 202 comprises: in response to a triggering operation of a presentation document, entering a document presentation mode with respect to the document; after entering a document demonstration mode, displaying a demonstration interface of the document; and displaying the trigger control along with the voice page turning in the demonstration interface.
Specifically, when a document needs to be demonstrated, the terminal can respond to the triggering operation of the demonstration document, enter a document demonstration mode, and display the following voice page turning triggering control in a demonstration interface of the document. That is to say, the following voice page turning trigger control is displayed in the presentation interface of the document, and when the document is in the document presentation mode, the terminal can further respond to the trigger operation of the following voice page turning trigger control displayed in the presentation interface and enter the following voice page turning mode.
In one embodiment, the document presentation mode may further include a following voice page turning mode and a general presentation mode. In the following voice page turning mode, the terminal automatically turns pages of the document according to the voice content of the presenter. In one embodiment, in the general presentation mode, the terminal turns pages of the document according to page turning instructions manually triggered by the presenter through a presentation pen or a keyboard. It can be understood that the following voice page turning mode is a specific document presentation mode.
In one embodiment, when the document is in the document editing mode, the terminal may display the following voice page turning trigger control in the editing interface of the document. When the document needs to be presented with following voice page turning, the terminal may enter the following voice page turning mode directly from the document editing mode in response to a trigger operation on the control displayed in the editing interface. In other words, when the following voice page turning trigger control is displayed in the editing interface and the document is in the document editing mode, the terminal can enter the following voice page turning mode directly in response to a trigger operation on that control.
Optionally, in this case, after the following voice page turning mode is entered, the document is in the document presentation mode and the terminal displays the presentation interface of the document. The terminal may therefore display a control for exiting the following voice page turning mode in the presentation interface; for example, the terminal may continue to display the following voice page turning trigger control in the presentation interface, so that in the following voice page turning mode the mode can be exited in response to a trigger operation on that control.
In the embodiments of this application, a trigger operation is a preset operation acting on a control or icon; the preset operation may be, for example, a touch operation, a click operation, or a key operation. The touch operation may be, for example, a touch on a control or icon displayed by the terminal through a screen; the click operation may be, for example, a cursor click on the control or icon through a mouse or a presentation pen; and the key operation may be, for example, an operation triggered by a designated key on the presentation pen or keyboard.
Step 204: entering the following voice page turning mode in response to a trigger operation on the following voice page turning trigger control.
The following voice page turning mode may be used to trigger the terminal to start executing the logic related to following voice page turning, including step 206. As the name suggests, in this mode the terminal controls the document to turn pages following the voice content of the presenter, so that the document turns pages automatically as the presenter speaks. The presenter neither needs to control page turning through an additional presentation pen, nor needs to issue control instructions outside the presentation content, such as "turn to page 2" or "open chapter 3". For example, when the presenter says "chapter 3 describes the appearance of the product …", the terminal directly turns the document to the page where chapter 3 is located, following the presenter's speech. Page turning is thus very convenient, the presenter's train of thought remains coherent throughout the presentation, and the user experience of document presentation is improved.
The trigger operation on the following voice page turning trigger control is a preset operation acting on the control in the presentation interface; the preset operation may be a touch operation, a click operation, a key operation, or the like.
In one embodiment, the method further comprises: in the following voice page turning mode, exiting the following voice page turning mode in response to a trigger operation on the following voice page turning trigger control.
Specifically, during document presentation, when the document is in the following voice page turning mode, the terminal may exit the mode in response to a trigger operation on the following voice page turning trigger control displayed in the presentation interface; the terminal then stops executing the logic of automatically turning pages following the presenter's speech.
Furthermore, when the terminal controls the document to exit the following voice page turning mode, the document may enter the general presentation mode, or it may return directly to the document editing mode. It can be understood that in the former case the document is still in the document presentation mode, and the presenter can turn pages through page turning instructions manually triggered by a presentation pen or a keyboard; in the latter case the document is no longer in the document presentation mode and is in an editable state.
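The mode transitions described above can be sketched as a small state machine. This is an illustrative sketch only, not part of the claimed method; all class and method names are hypothetical, and exiting the follow-voice mode is shown landing in the general presentation mode (the text also allows returning directly to editing).

```python
from enum import Enum, auto

class DocMode(Enum):
    EDITING = auto()       # document editing mode
    PRESENTING = auto()    # general presentation mode
    FOLLOW_VOICE = auto()  # following voice page turning mode

class DocumentController:
    """Hypothetical controller tracking the document's current mode."""
    def __init__(self):
        self.mode = DocMode.EDITING

    def toggle_follow_voice(self):
        """Trigger operation on the following voice page turning control."""
        if self.mode == DocMode.FOLLOW_VOICE:
            # One of the two exit options described in the text: stay in
            # the presentation mode (the other option returns to editing).
            self.mode = DocMode.PRESENTING
        else:
            # Enter directly from editing or from general presentation.
            self.mode = DocMode.FOLLOW_VOICE

    def exit_presentation(self):
        self.mode = DocMode.EDITING

ctrl = DocumentController()
ctrl.toggle_follow_voice()  # enter follow-voice mode directly from editing
assert ctrl.mode == DocMode.FOLLOW_VOICE
ctrl.toggle_follow_voice()  # exit: document stays in presentation mode here
assert ctrl.mode == DocMode.PRESENTING
```

The sketch shows why the same control can serve as both entry and exit: the transition only depends on whether the document is already in the follow-voice mode.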
FIG. 3 is a schematic diagram of a presentation interface in one embodiment. Referring to FIG. 3, during document presentation, a following voice page turning trigger control 302, labeled "follow voice page turning" in FIG. 3, is provided in a presentation interface 300. The presenter can turn the following voice page turning function on or off through this control. After the function is turned on, the document automatically switches to the page corresponding to the presenter's speech; the switched-to page may be the next page, the previous page, or a page reached across multiple pages, which reduces the user's manual interaction and improves the presentation experience. Optionally, after the function is turned on, the terminal may turn it off in response to a trigger operation by the presenter on the "follow voice page turning" control, thereby exiting the following voice page turning mode.
Step 206: in the following voice page turning mode, turning pages of the document to a target page following the voice content of the presenter, the text content in the target page semantically matching the voice content of the presenter.
The presenter is the person whose voice content is captured while the document is presented; this is usually the person presenting the document, who explains its specific content during the presentation. The voice content of the presenter is what the presenter says during the presentation, for example, "Hello everyone, thank you for attending my graduation defense presentation", "Next is my table of contents", "Chapter 3 next describes the appearance of the product …", and so on. The voice content of the presenter is typically used to describe the specific content of the document and does not interrupt the presenter's train of thought during the presentation.
The target page is a page in the document that matches the presenter's voice content. It can be understood that as the presentation proceeds, the voice content of the presenter changes continuously; the target page is the page matched with the latest voice content of the presenter, and the text content in the target page is semantically associated with that latest voice content. Therefore, as the presenter's latest voice content keeps changing, the target page is updated correspondingly, achieving the effect of automatically turning pages following the presenter's voice content.
It can be understood that in the following voice page turning mode, the target page changes with the presenter's voice content, so the target page may be the page before the current page, the page after it, or a page several pages away.
That the text content in the target page semantically matches the presenter's voice content indicates that the voice content describes or introduces the specific content of the target page. Using the text content in a page as the basis for page turning yields higher page turning accuracy during document presentation.
The text content in a page may be at least one of the page remark content and the page title content of that page, or a part of either. Page remark content is the presenter's remark information for the document, which can be used to guide the presenter through the presentation. During the presentation, the terminal can display the document in a split-screen manner; that is, in the document presentation mode, the page remark content can be set to be displayed on the terminal used by the presenter but not on the terminal watched by the audience. Page title content is the title of each page and may include a main title, a subtitle, and the like. It can be understood that some pages in the document may have no page remark content or page title content.
In some cases, a document page shows only a title together with a picture, animation, or video describing the presentation content, without detailed text. Therefore, instead of treating the document page as an image and matching the presenter's voice content against each image, the page title content or page remark content of the page is used as the basis for page turning, and the presenter's voice content is matched against it. Because the page remark content and page title content guide the presenter's presentation and the content the presenter speaks, the presenter's voice content can, with a high probability, be matched with the text content of a page and thus with the page itself; that is, the voice content is associated with the text content of the document page, which improves the page turning accuracy. It can be understood that if the document page were instead treated as an image and the presenter's voice content were matched against each image, in many cases the voice content could not be matched with the corresponding page, and the page could not be turned.
FIG. 4 is a schematic diagram of an interface that automatically turns pages of a document following the voice content of a presenter, according to one embodiment. Referring to FIG. 4, page 1 is the page presented when the presenter says "Hello everyone, thank you for attending my graduation defense presentation". When the presenter says "Fortunately, under the teacher's guidance, I spent the full study period … …", the document remains on page 1. When the presenter says "Next is my table of contents", the document turns from page 1 to page 2. When the presenter says "Chapter 4 next gives the relevant experimental data … …", the document turns across pages to page 5. When the presenter says "The table of contents shown earlier also introduced … …", the document turns back across pages from page 5 to page 2.
In the above document control method, the following voice page turning trigger control is displayed in the presentation interface when the document is presented, and when the control is triggered, the following voice page turning mode is started for the document. In this mode, the document turns pages following the voice content of the presenter, and the text content in the target page to which the document is turned semantically matches that voice content. The document can thus turn pages automatically as the presenter speaks; the presenter neither needs to control page turning through an additional presentation pen nor needs to issue control instructions outside the presentation content. Page turning is very convenient, the presenter's train of thought remains coherent throughout the presentation, and the user experience of document presentation is improved.
In one embodiment, as shown in FIG. 5, before the document is presented, the method further comprises a step of initializing following voice page turning for the document, comprising:
Step 502: when the document is edited, displaying a following voice page turning initialization control in the editing interface of the document.
The following voice page turning initialization control is an icon or control used to trigger initialization of following voice page turning for the document. Initialization of following voice page turning refers to the process of extracting the text content corresponding to each page in the document.
In one embodiment, displaying the following voice page turning initialization control in the editing interface of the document when the document is edited comprises: entering a document editing mode for the document in response to a trigger operation for editing the document; displaying the editing interface of the document after entering the document editing mode; and displaying the following voice page turning initialization control in the editing interface.
Specifically, when the document needs to be edited, the terminal may enter the document editing mode in response to a trigger operation for editing the document, and display the following voice page turning initialization control in the editing interface of the document. That is, the following voice page turning initialization control is displayed in the editing interface of the document.
Step 504: displaying a following page turning prompt area in the editing interface in response to a trigger operation on the following voice page turning initialization control.
The following page turning prompt area is an area in the editing interface for displaying the extracted text content corresponding to each page. It can be displayed in the editing interface in the form of a pop-up window, a floating window, or the like; it can be displayed at a fixed position in the editing interface, or it can move within the editing interface in response to a moving, dragging, or sliding operation.
The trigger operation on the following voice page turning initialization control is a preset operation acting on the control in the editing interface; the preset operation may be a touch operation, a click operation, a key operation, or the like.
Specifically, when the following voice page turning initialization control displayed in the editing interface is triggered, a following page turning prompt area can pop up in the editing interface, and the text content corresponding to each page is displayed in this area.
Step 506, displaying the text content corresponding to each page in the document in the following page turning prompt area.
In this embodiment, when the document is edited, a following voice page turning initialization control is provided in the editing interface. This control serves as the entry point for extracting the text content corresponding to each page, which is required for following voice page turning, and triggers that extraction; the extracted text content serves as the key description information for automatic page turning and is displayed to the user in a small window. The text content supports browsing and editing by the presenter, helping the presenter confirm or change it according to the presentation needs, for example changing it to the content to be spoken during the presentation, thereby improving the page matching accuracy and page turning accuracy.
In one embodiment, the text content corresponding to each page displayed in the following page turning prompt area is limited; for example, the terminal may limit the text content corresponding to each page to one sentence, or limit it to within 10 characters, and so on.
FIG. 6 is a schematic diagram of an editing interface in one embodiment. Referring to FIG. 6, when the document is edited, a following voice page turning initialization control 602 is provided in the editing interface 600, and the presenter can trigger initialization of the following voice page turning function with one click, triggering extraction of the text content corresponding to each page in the document. The editing interface 600 further includes a menu bar area 601, a document page area 603, a page navigation area 604, and a document remark area 605; the following voice page turning initialization control 602 is located in the menu bar area 601, the title, subtitle, and body text are located in the document page area 603, and the page remark content is located in the document remark area 605.
FIG. 7 is a schematic diagram of the following page turning prompt area in the editing interface in one embodiment. Referring to part (a) of FIG. 7, the presenter may perform a trigger operation 704 on the following voice page turning initialization control 702 provided in the editing interface 700, causing a following page turning prompt area 706 to pop up in the editing interface 700, as shown in part (b) of FIG. 7, in which the page remark content and page title content corresponding to each page are presented. For example, 1-1 represents the remark on page 1: "Hello everyone, thank you for attending my graduation defense presentation", and 1-2 represents the title on page 1: "Graduation Defense"; similarly, 2-1 represents the remark on page 2, 2-2 the title on page 2, 3-1 the remark on page 3, 3-2 the title on page 3, and so on.
In one embodiment, the method further comprises:
displaying the edited text content in the following page turning prompt area in response to a text editing operation in the area; and updating the text content corresponding to each page in the document according to the edited text content.
In this embodiment, the following page turning prompt area supports text editing by the presenter. The content automatically extracted and displayed in the area by the terminal can be confirmed by the user, and the user can edit and modify it as needed, which improves the probability that the presenter's voice content is successfully matched with the document pages during subsequent presentation.
In one embodiment, the method further comprises: when no trigger operation on the following voice page turning initialization control displayed in the editing interface occurred while the document was in the document editing mode, extracting the text content corresponding to each page in the document during document presentation, and storing the page number of each page in correspondence with the corresponding text content.

Specifically, when the document was edited without being initialized through the following voice page turning initialization control to obtain the text content corresponding to each page, the text content corresponding to each page in the document is extracted in response to the trigger operation on the following voice page turning trigger control during document presentation, and the page number of each page is stored in correspondence with the corresponding text content.

That is, when the document is presented, after entering the document presentation mode the terminal may determine whether initialization of the following voice page turning function has been completed for the document. For example, the terminal may determine whether text content corresponding to each page in the document exists, and thereby determine whether a trigger operation on the following voice page turning initialization control displayed in the editing interface occurred while the document was in the document editing mode. If so, the following voice page turning mode can be entered according to the method described above, and pages can be turned following the presenter's voice content.

If not, the terminal can, in response to the trigger operation on the following voice page turning trigger control displayed in the presentation interface, perform one initialization of the following voice page turning function and extract the text content corresponding to each page in the document, and then execute the logic of entering the following voice page turning mode and turning pages following the presenter's voice content. It can be understood that at this time the document is in the document presentation mode; after the text content corresponding to each page is extracted, the terminal only caches it and does not need to present it in a following page turning prompt area in the presentation interface.
In one embodiment, in the document editing mode, even after the terminal has extracted the text content corresponding to each page based on a trigger operation on the following voice page turning initialization control displayed in the editing interface, the user may edit the document again, for example modifying a title or the page remark content of the document. To ensure page turning accuracy, the terminal can automatically re-extract and update the text content corresponding to each page based on the user's editing operations, and can also update the text content displayed in the following page turning prompt area according to the automatically updated text content. Alternatively, the text content corresponding to each page may not be automatically updated as the user edits the document again; instead, the content displayed in the following page turning prompt area may be taken as the basis.
In one embodiment, the method further comprises:
when the document is demonstrated, the display of the following voice page turning initialization control and the following page turning prompt area is cancelled in a demonstration interface of the document.
It should be noted that after entering the document presentation mode, the presentation interface does not need to display the following voice page turning initialization control or the following page turning prompt area. As described above, if the document has not been initialized, initialization can be performed once through the following voice page turning trigger control displayed in the presentation interface; if the document has been initialized, there is no need to display either element in the presentation interface.
In one embodiment, the method further comprises: extracting the text content corresponding to each page in the document, the text content comprising at least one of page remark content and page title content; and storing the page number of each page in correspondence with the corresponding text content.
The extracted text content may be the original text content extracted from each page of the document. However, when the original text content is long, a problem arises: the voice content of each utterance of the presenter is not necessarily long, so if the original text content is matched directly against the presenter's voice content, a large length difference between the two affects the accuracy of the matching result. For this reason, the extracted text content may instead be a part retained from the original text content, for example the first sentence, the first 10 characters, and so on.
Specifically, the terminal can extract and store the text content corresponding to each page of the document by page number, such as the page remark content and the page title content. For example, the terminal records the page remark content extracted from each page as Rn with length L(Rn), and the page title content extracted from each page as Tn with length L(Tn), where n = 1, 2, 3, … is the page number. The terminal may store the page numbers and contents in the cache in a key-value manner. For example, the page remark content of page 1 is R1 and that of page 2 is R2; the page title content of page 1 is T1 and that of page 2 is T2.
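The key-value cache just described can be sketched as a plain dictionary keyed by page number. The Rn/Tn naming follows the notation above; the function name and dictionary layout are illustrative assumptions, not part of the claimed method.

```python
def build_page_cache(pages):
    """pages: list of (remark, title) tuples in page order; page numbers
    start at 1. Returns a key-value cache mapping 'R<n>'/'T<n>' to the
    stored text and its length L(Rn)/L(Tn)."""
    cache = {}
    for n, (remark, title) in enumerate(pages, start=1):
        cache[f"R{n}"] = {"text": remark, "length": len(remark)}
        cache[f"T{n}"] = {"text": title, "length": len(title)}
    return cache

cache = build_page_cache([
    ("Hello everyone, thank you for attending", "Graduation defense"),
    ("Next is my table of contents", "Contents"),
])
assert cache["R1"]["text"].startswith("Hello")
assert cache["T2"] == {"text": "Contents", "length": 8}
```

Keying by page number makes the later matching step cheap: once a text entry matches the latest sentence, its key directly yields the page to turn to.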
In one embodiment, extracting the text content corresponding to each page in the document comprises: extracting, for each page in the document, the original text content corresponding to that page; when the number of characters of the original text content is less than a preset number, retaining the original text content as the text content extracted from the page; and when the number of characters of the original text content is more than the preset number, reading the preset number of characters from the first character of the original text content and then continuing to read until a separator is read for the first time, and retaining the content read so far as the text content extracted from the page.
The original text content may be the original page remark content or the original page title content. Specifically, for each page of the document, the terminal determines whether the original text content of the page exceeds a preset number of characters, for example 10 characters; if not, the original text content is retained directly as the text content corresponding to the page. If so, the terminal further determines whether the character following the first preset number of characters is a separator, such as a comma, enumeration comma, period, or exclamation mark. If it is, the first preset number of characters are retained as the text content corresponding to the page; if not, reading continues until the first separator is encountered, and the characters read so far are taken as the text content corresponding to the page.

Taking the page remark content as an example, the original page remark content of page 1 is "Hello everyone, thank you for attending my graduation defense presentation, I am very pleased … …". In the original sentence, the character after the first 10 is not a separator, so reading continues until the separator at the end of the clause, and the extracted text content corresponding to page 1 is "Hello everyone, thank you for attending my graduation defense presentation". The original page remark content of page 2 is "Next is my table of contents"; its total length does not exceed the preset number in the original, so the entire page remark content is retained. The original page remark content of page 3 is "First, the graduation design summary is introduced, this part mainly … …"; in the original, the character after the first 10 is a comma, so reading does not continue, and the first clause is directly taken as the text content corresponding to page 3.
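The extraction rule in this embodiment (keep short content whole; otherwise read a preset number of characters and continue to the first separator) might be sketched as follows. The separator set and the preset number of 10 are taken from the examples above; everything else is an illustrative assumption.

```python
# Assumed separator set; the embodiment names comma, enumeration comma,
# period, and exclamation mark as examples.
SEPARATORS = set(",.;!?，。；！？、")

def extract_page_text(original, preset=10):
    """Keep the original text if it does not exceed `preset` characters;
    otherwise read `preset` characters and keep reading until the first
    separator is reached (the separator itself is not kept)."""
    if len(original) <= preset:
        return original
    i = preset
    while i < len(original) and original[i] not in SEPARATORS:
        i += 1
    return original[:i]

# Short content is retained whole (no separator is ever reached):
assert extract_page_text("Below is the outline") == "Below is the outline"
# Long content is cut at the first separator after the preset length:
long_text = "First introduce the design summary, this part mainly covers tests."
assert extract_page_text(long_text) == "First introduce the design summary"
```

Note the boundary choice (`<=` versus `<` at exactly `preset` characters) is an assumption; the text only specifies the behavior for "less than" and "more than" the preset number.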
It should be noted that when the extracted text content is a part retained from the original text content, the text content displayed in the following page turning prompt area may be the original text content, which serves as the key description information for following voice page turning; alternatively, the text content displayed in the area may be consistent with the extracted text content, that is, a part of the original text content.
In this embodiment, retaining a part of the original text content as the text content corresponding to each page reduces the amount of cached data and improves the matching accuracy when the retained content is subsequently matched against the presenter's voice content.
In one embodiment, the method further comprises: collecting voice data of the presenter in the following voice page turning mode; converting the voice data of the presenter into sentences of the presenter; and semantically matching the latest sentence one by one against the text content corresponding to each page in the document, and determining the target page according to the text content that is successfully matched.
After entering the following voice page turning mode, the terminal can collect the voice content of the presenter in real time. Alternatively, the voice content may be collected in real time by a designated recording device and then transmitted to the terminal for matching. The terminal continuously obtains the captured voice content, which serves as the basis for following voice page turning.
The terminal can use a neural-network-based speech recognition network to convert the input voice content into corresponding sentences, that is, text, and can also break continuous voice content into sentences so as to transcribe the presenter's voice content in real time. The speech recognition network may be constructed based on an acoustic model, a language model, and a dictionary.
In addition, the terminal may sequentially generate and cache the transcribed latest sentences based on the continuously input voice content. For example, if the presenter's voice content is "Respected review teachers, hello everyone" and the sentence-broken result is "Respected review teachers / hello everyone", then the latest sentence at the first transcription is "Respected review teachers" and the latest sentence at the second transcription is "hello everyone". As the presenter's voice content continues to be input, each newly transcribed sentence becomes the latest sentence in turn.
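The incremental "latest sentence" buffering described above might look like the following toy sketch, where newly recognized text is accumulated and split at separators, and each completed sentence becomes the latest sentence in turn. A real system would receive text from a streaming speech recognizer; the class and its behavior here are illustrative assumptions.

```python
import re

class SentenceBuffer:
    """Accumulates recognized text and yields complete sentences in order."""
    def __init__(self):
        self.pending = ""   # text after the last separator, not yet a sentence
        self.latest = None  # most recently completed sentence

    def feed(self, text):
        """Feed newly recognized text; returns the sentences completed by it."""
        self.pending += text
        parts = re.split(r"[,.!?，。！？]", self.pending)
        done, self.pending = parts[:-1], parts[-1]
        done = [s.strip() for s in done if s.strip()]
        if done:
            self.latest = done[-1]
        return done

buf = SentenceBuffer()
assert buf.feed("Respected review teachers, ") == ["Respected review teachers"]
assert buf.feed("hello everyone.") == ["hello everyone"]
assert buf.latest == "hello everyone"
```

Keeping only the latest completed sentence matches the embodiment's design: each matching round compares one short, recent utterance against the per-page text, rather than the whole transcript.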
When the latest sentence is semantically matched against the text content corresponding to each page in the document, the terminal can use an unsupervised Euclidean distance method: the terminal vectorizes the latest sentence and the text content corresponding to a page with a word vector model, and then measures the distance between the two vectors, that is, the content similarity, as their degree of semantic matching.
Optionally, when the latest sentence is semantically matched against the text content corresponding to each page one by one, the terminal may start matching from page 1. That is, regardless of which page is displayed in the current presentation interface, the terminal matches the latest sentence against the text content of each page starting from page 1 until the first successful match, which determines the target page. For example, if matching starts from page 1 and first succeeds at page 3, page 3 is determined as the target page.
Optionally, the terminal may instead match the latest sentence one by one against the text content of every page in the document and select the page with the highest matching degree as the target page. For example, if a document has 5 pages in total and, after all pages have been matched, page 2 has the highest matching degree, page 2 is selected as the target page.
Optionally, since in most cases a presentation proceeds in the order of the pages in the document, that is, the presenter speaks to each page in sequence, the terminal can reduce the number of times the latest sentence is matched against page text content as follows: determine the page number of the document page displayed on the current presentation interface, start matching from the page after it, and take the first successfully matched page as the target page. For example, if a document has 5 pages and the page currently displayed is page 2, matching starts from page 3; if the first successful match occurs at page 4, page 4 is taken as the target page. The target page is thus determined after matching only 2 pages instead of 4 starting from page 1, which improves the terminal's response efficiency and thereby the presenter's presentation experience.
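The scan-from-the-current-page strategy can be sketched as follows. This is a minimal illustration with invented names; the `match` predicate stands in for the semantic matching described elsewhere.

```python
def find_target_page(latest_sentence, page_texts, current_page, match):
    """Return the 1-based number of the first page after `current_page`
    whose text content matches `latest_sentence`, or None if none matches.

    page_texts: per-page text content, index 0 == page 1.
    match: predicate(sentence, page_text) -> bool.
    """
    for idx in range(current_page, len(page_texts)):  # start at the next page
        if match(latest_sentence, page_texts[idx]):
            return idx + 1
    return None

# 5-page document, currently showing page 2; only page 4 mentions the topic.
pages = ["intro", "directory", "background", "key technology", "summary"]
target = find_target_page("key technology", pages, current_page=2,
                          match=lambda s, t: s == t)
```

With the start-from-page-1 variant, `range(current_page, ...)` would simply become `range(0, ...)`.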
In one embodiment, semantically matching the latest sentence with the text content corresponding to each page in the document one by one includes: during the page-by-page matching process, converting the latest sentence into a corresponding word sequence, and converting the text content corresponding to the page being matched into a corresponding word sequence; generating a sentence vector corresponding to the latest sentence based on the word vector of each token in the latest sentence's word sequence, and generating a text vector corresponding to the text content based on the word vector of each token in the text content's word sequence; and when the similarity between the sentence vector and the text vector is greater than a preset threshold, determining that the latest sentence successfully matches the text content.
Specifically, the computer device segments the latest sentence and the text content corresponding to each page in the document into words, thereby converting the latest sentence into a corresponding word sequence and converting the text content of the page being matched into a corresponding word sequence.
In one embodiment, the word vector corresponding to each token in the word sequence may be obtained from a word vector model. A word vector model is trained on a large corpus so as to map each word to a high-dimensional vector; the similarity between two words can then be judged by computing the distance between their vectors. The word vector model may be a neural network model based on the CBOW or Skip-Gram algorithm.
After the word vector of each token in the word sequence is obtained, the sentence vector of the sentence represented by the word sequence can be computed, for example as the plain average of the word vectors, a TF-IDF weighted average, or an SIF weighted average.
The similarity between the sentence vector and the text vector can be represented by the cosine distance between the two vectors: the smaller the cosine distance, the higher the similarity. When the similarity is greater than a preset threshold, the latest sentence is judged to match the text content successfully; the preset threshold may be, for example, 80%. It should be noted that when the similarity exactly equals the threshold, the match may be judged either successful or failed; this application does not limit it.
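The vector comparison can be sketched with plain averaging and cosine similarity. The toy 2-dimensional vectors below are purely illustrative; a real system would use vectors from a trained word vector model.

```python
import math

def sentence_vector(tokens, word_vectors):
    """Average the word vectors of a tokenized sentence."""
    dims = len(next(iter(word_vectors.values())))
    total = [0.0] * dims
    for token in tokens:
        for i, component in enumerate(word_vectors[token]):
            total[i] += component
    return [t / len(tokens) for t in total]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

vectors = {"key": [1.0, 0.0], "technology": [0.9, 0.1], "directory": [0.0, 1.0]}
similarity = cosine_similarity(sentence_vector(["key"], vectors),
                               sentence_vector(["technology"], vectors))
matched = similarity > 0.8  # preset threshold of 80%
```

Here "key" and "technology" have nearly parallel vectors, so their similarity clears the threshold, while "directory" would not.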
In one embodiment, semantically matching the latest sentence with the text content corresponding to each page in the document one by one includes: semantically matching the latest sentence with the page remark content of each page one by one, and taking the page corresponding to the first successfully matched page remark content as the target page; when the latest sentence matches none of the page remark contents, semantically matching it with the page title content of each page one by one, and taking the page corresponding to the first successfully matched page title content as the target page; and when the latest sentence matches none of the page title contents either, continuing to collect the presenter's voice data in the following voice page turning mode.
In this embodiment, when determining the target page matching the presenter's current voice content, the transcribed latest sentence may first be matched, page by page, against the page remark content of each page, with the first successfully matched page taken as the target page. If no match succeeds, the page title content of each page is matched in turn, again taking the first successfully matched page as the target page. If that also fails, the terminal stays on the currently displayed document page and continues to collect the presenter's voice data.
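The remark-first, title-second fallback can be sketched as follows (illustrative names; `match` again stands in for the semantic-match predicate):

```python
def choose_target_page(latest_sentence, remarks, titles, match):
    """Prefer a page whose remark content matches; otherwise fall back to
    page titles; return the 1-based page number, or None to keep listening."""
    for tier in (remarks, titles):
        for idx, text in enumerate(tier):
            if text and match(latest_sentence, text):
                return idx + 1
    return None  # stay on the current page and keep capturing voice data

remarks = ["", "following is my directory structure", ""]
titles = ["graduation defense", "directory", "key technology"]
page = choose_target_page("key technology", remarks, titles,
                          match=lambda s, t: s == t)
```

Here the remark tier produces no match, so the title tier is tried and page 3 wins.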
It should be noted that the target page matched with the voice content of the current presenter may also be a document page currently being presented.
FIG. 8 is a simplified flowchart of a document control method in one embodiment. Referring to fig. 8, the method mainly includes the following steps:
step 802, initialize a document having the voice page turning function: extract the text content corresponding to each page, including the page remark content and the page title content.
Step 804, turn on the following voice page turning function.
Step 806, capture the real-time voice content of the presenter.
Step 808, transcribe the presenter's speech into sentences.
Step 810, performing semantic matching on the transcribed latest sentences and page remark contents corresponding to each page in the document one by one;
step 812, determine whether the matching with the page remark content corresponding to a certain page is successful. If yes, go to step 818; if not, go to step 814;
step 814, performing semantic matching on the transcribed latest sentences and page title contents corresponding to each page in the document one by one;
step 816, determine whether the page title content corresponding to a certain page is successfully matched. If yes, go to step 818; if not, returning to the step 806;
step 818, page turning from the current page to the successfully matched target page; and returns to perform step 806.
Since the presenter's voice content and the text content corresponding to a page (the original text content or a part of it) may differ in length, directly performing semantic matching may reduce the accuracy of the matching result. Therefore, when semantically matching the latest sentence with the text content of each page in the document, the terminal can first truncate the latest sentence or the page's text content and then perform semantic matching, thereby improving the accuracy of matching the presenter's voice content against the page's text content.
In one embodiment, semantically matching the latest sentence with the page remark content corresponding to each page in the document one by one includes: during the page-by-page matching process, comparing the lengths of the latest sentence and the page remark content of the page being matched; when the latest sentence is longer than the page remark content, truncating the latest sentence from its last character toward its first character to the length of the page remark content to obtain a truncated sentence, and semantically matching the truncated sentence with the page remark content; and when the latest sentence is shorter than the page remark content, truncating the page remark content from its last character toward its first character to the length of the latest sentence to obtain truncated content, and semantically matching the latest sentence with the truncated content.
Specifically, when matching the transcribed latest sentence with the page remark content, the terminal can align the two texts by length, which improves the matching accuracy.
The terminal may denote the transcribed latest sentence as S and its length as L(S), and successively match S semantically against the page remark content Rn of page n, whose length is denoted L(Rn). In typical Chinese expression, key content tends to be placed in the latter half of a sentence, so the terminal can semantically match the last x characters of S against the page remark content Rn of each page in the document; that is, the terminal truncates S and Rn according to their lengths, obtaining the truncated sentence S' and the truncated content R'n.
Specifically, the terminal compares the length L(S) of the transcribed latest sentence S with the length L(Rn) of the page remark content Rn. When L(S) ≤ L(Rn), all of S is retained, i.e., x = L(S), and Rn is truncated to its last x characters to obtain the truncated content R'n; S is then semantically matched with R'n. When L(S) > L(Rn), all of Rn is retained and the last L(Rn) characters of S are kept, i.e., x = L(Rn), yielding the truncated sentence S'; S' is then semantically matched with Rn. The terminal takes the page corresponding to the first successfully matched page remark content as the target page. The specific matching manner is the same as that described above for matching a transcribed sentence against a page's text content, and is not repeated.
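The length alignment amounts to keeping the trailing characters of the longer text; a minimal sketch:

```python
def align_tails(sentence, remark):
    """Keep the last x characters of each text, where x is the shorter
    length, reflecting the heuristic that key content comes at the end."""
    x = min(len(sentence), len(remark))
    return sentence[-x:], remark[-x:]

s_cut, r_cut = align_tails("see the following is my directory structure",
                           "my directory structure")
```

After alignment the two equally long strings are handed to the semantic matcher; here both tails come out identical, so the match would succeed.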
For example, the transcribed latest sentence S is: "see, the following is my directory structure", with length 10. Starting from the page remark content R1 of page 1, R1 is: "Hello everyone, thank you all for attending my graduation defense speech", with length 18, so R1 must be truncated to the length of S; the truncated R'1 is: "attending my graduation defense speech", with length 10. The similarity between the two texts is computed to be about 10%, below the preset threshold of 80%, so page 1 fails to match. Page 2, page 3, and so on must then be compared. When page 2 is compared, its page remark content R2 is: "the following is my directory structure", with length 9, so S must be truncated to the length of R2; the truncated S' is: "is my directory structure", with length 9. The similarity between the two texts is computed to be about 98%, above the preset threshold of 80%, so page 2 is judged to match successfully; the document turns directly to page 2 and the presenter's voice content continues to be captured.
If, after the transcribed latest sentence S has been matched against the page remark content of every page in the document, all similarities remain below the preset threshold, S continues to be matched against the page title content of each page in the document.
In one embodiment, semantically matching the latest sentence with the page title content corresponding to each page in the document one by one includes: during the page-by-page matching process, comparing the lengths of the latest sentence and the page title content of the page being matched; when the latest sentence is longer than the page title content, truncating the latest sentence from its last character toward its first character to the length of the page title content to obtain a truncated sentence, and semantically matching the truncated sentence with the page title content; and when the latest sentence is shorter than the page title content, truncating the page title content from its last character toward its first character to the length of the latest sentence to obtain truncated content, and semantically matching the latest sentence with the truncated content.
The terminal may denote the transcribed latest sentence as S and its length as L(S), and successively match S semantically against the page title content Tn of page n, whose length is denoted L(Tn). Similarly, the terminal can semantically match the last y characters of S against the page title content Tn of each page in the document; that is, the terminal truncates S and Tn according to their lengths, obtaining the truncated sentence S'' and the truncated content T'n.
Specifically, the terminal compares the length L(S) of the transcribed latest sentence S with the length L(Tn) of the page title content Tn. When L(S) ≤ L(Tn), all of S is retained, i.e., y = L(S), and Tn is truncated to its last y characters to obtain the truncated content T'n; S is then semantically matched with T'n. When L(S) > L(Tn), all of Tn is retained and the last L(Tn) characters of S are kept, i.e., y = L(Tn), yielding the truncated sentence S''; S'' is then semantically matched with Tn. The terminal takes the page corresponding to the first successfully matched page title content as the target page, turns to it, and then continues to capture the presenter's voice content.
For example, the transcribed latest sentence S is: "next I will explain the key technology of this subject", with length 13. If it matches none of the page remark contents Rn in the document, comparison continues with the page title contents Tn. First, the page title content T1 of page 1 is considered; T1 is: "graduation defense", with length 4, so S must be truncated to the length of T1, and the truncated S'' is: "key technology", with length 4. The similarity between the two texts is computed to be about 0.5%, below the preset threshold of 80%, so page 1 fails to match. Next, the page title content T2 of page 2 is: "directory", with length 2, so S is truncated to the length of T2, and the truncated S'' is: "technology", with length 2. The similarity between the two texts is about 1.5%, also below the preset threshold of 80%, so page 2 fails to match. Matching continues with the page title contents of page 3, page 4, and so on. When page 5 is reached, T5 is: "key technology", with length 4, so S is truncated to the length of T5, and the truncated S'' is: "key technology". The similarity between the two texts is 100%, above the preset threshold of 80%, so page 5 is judged to match successfully; the document turns directly to page 5 and the presenter's voice content continues to be captured.
If, after the transcribed latest sentence S has been matched against the page title content of every page in the document, all similarities remain below the preset threshold, the terminal simply continues to capture the presenter's voice content. For example, when the transcribed latest sentence S is: "feel free to interrupt me at any time with questions", it matches neither the page remark content Rn nor the page title content Tn of any page, so the terminal continues to capture the presenter's voice content.
It should be noted that in some cases the presenter speaks quickly; if a new sentence is produced before the current latest sentence has finished matching against the text content of each page in the document, the unfinished matching process is discarded and matching of the newly transcribed latest sentence against the pages' text content starts directly.
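The discard-and-restart behaviour can be sketched with a generation counter (illustrative only; a real implementation would run the matching on a worker thread and check staleness between pages):

```python
class LatestSentenceMatcher:
    """Only the most recently submitted sentence is worth matching."""

    def __init__(self):
        self.generation = 0

    def submit(self, sentence):
        """A new sentence arrives; any in-flight match becomes stale."""
        self.generation += 1
        return self.generation  # token held by the matching task

    def is_stale(self, token):
        """A matching task calls this between pages and aborts if True."""
        return token != self.generation

matcher = LatestSentenceMatcher()
first = matcher.submit("sentence spoken while talking fast")
second = matcher.submit("the newest sentence")
```

After the second submission, a task still holding `first` sees itself as stale and abandons its unfinished matching.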
In one embodiment, the method further comprises: generating a page turning instruction carrying a page number according to the page number of the target page; and based on the page turning instruction, after the document is turned from the current page to the target page, continuously acquiring the voice data of the presenter in a following voice page turning mode.
Specifically, after determining the target page matching the transcribed latest sentence, the terminal generates a page turning instruction according to the target page's page number and jumps directly to that page of the document, for example by directly displaying the target page. After the target page is displayed, the terminal continues to collect the presenter's voice data. It can be understood that when the transcribed latest sentence fails to match any page, the terminal stays on the currently displayed page.
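Generating and applying a page turning instruction that carries the page number might look like the following sketch (all names are illustrative):

```python
def page_turn_instruction(target_page):
    """Build the instruction carrying the target page number."""
    return {"action": "page_turn", "page": target_page}

class DocumentView:
    """Minimal stand-in for the displayed document."""

    def __init__(self, total_pages, current=1):
        self.total_pages = total_pages
        self.current = current

    def apply(self, instruction):
        if instruction["action"] == "page_turn":
            self.current = instruction["page"]

view = DocumentView(total_pages=5, current=2)
view.apply(page_turn_instruction(4))  # jump straight to the target page
```

Keeping the instruction as a small self-describing message is also what makes it easy to forward to other terminals for synchronized page turning.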
In one embodiment, the document is an online collaboration document, the method further comprising: initiating a document cooperation invitation according to the access address of the online cooperation document; synchronizing document control information generated in the demonstration process of the online collaboration document to a terminal responding to the document collaboration invitation, so that the terminal synchronizes the control state of the online collaboration document according to the document control information.
In one embodiment, the document control method is applied to an online collaboration document application, the document is an online collaboration document, a terminal can start the online collaboration document application, a document directory is opened in the online collaboration document application, and after a target document is selected from the document directory, an access address of the selected target document is obtained. The terminal can send the access address to a cooperative terminal which needs to edit and modify the target document at the same time. The collaboration terminal can open the target document according to the access address through the online collaboration document application or the browser, and can also open the target document according to the access address through other instant messaging applications.
In addition, when the terminal and the collaboration terminal have the target document open at the same time, document control information generated while the terminal presents the target document, such as page turns and trigger operations on the document, can be synchronized to the collaboration terminal, which controls the target document accordingly, so that the users at both ends can see each other's operations on the document, realizing multi-user online collaboration on the target document. Of course, the collaboration terminal may likewise synchronize the document control information it generates for the target document back to the terminal that initiated the document collaboration invitation.
FIG. 9 is a flowchart illustrating a document control method according to an exemplary embodiment. Referring to fig. 9, the following steps are included:
step 902, in response to a trigger operation for editing a document, entering a document editing mode with respect to the document;
step 904, after entering the document editing mode, displaying an editing interface of the document;
step 906, displaying an initialization control following the voice page turning in an editing interface;
step 908, in response to the triggering operation of the initialization control for page turning following voice, for each page in the document, respectively extracting the original text content corresponding to each page, where the text content includes at least one of the page remark content and the page title content;
step 910, when the number of words in the original text content is less than the preset number, the original text content is retained as the text content extracted from the corresponding page;
step 912, when the number of words of the original text content is more than the preset number, reading the preset number of characters from the first character of the original text content and then continuing reading until the separator is read for the first time, and keeping the read content as the text content extracted from the corresponding page;
step 914, storing the page number of each page and the corresponding text content correspondingly;
step 916, displaying a following page turning prompt area in an editing interface;
step 918, in the following page turning prompt area, displaying the text content corresponding to each page in the document.
Step 920, in response to the text editing operation in the following page turning prompting area, displaying the edited text content in the following page turning prompting area;
step 922, updating the text content corresponding to each page in the document according to the edited text content;
step 924, initiating a document collaboration invitation according to the access address of the document;
step 926, synchronizing document control information generated during the document demonstration process to a terminal responding to the document cooperation invitation, so that the terminal synchronizes the control state of the document according to the document control information;
step 928, responding to the triggering operation of the presentation document, and entering a document presentation mode about the document;
step 930, after entering the document demonstration mode, displaying a demonstration interface of the document;
step 932, displaying the trigger control following the voice page turning in a demonstration interface;
step 934, responding to the triggering operation of the following voice page turning triggering control, and entering a following voice page turning mode;
step 936, collecting voice data of the presenter in a following voice page turning mode;
step 940, the voice data of the presenter is converted into a sentence of the presenter;
step 942, comparing the length of the latest sentence and the page remark content corresponding to the page being matched in the process of semantic matching the latest sentence with the page remark content corresponding to each page in the document;
step 944, when the length of the latest sentence is greater than the length of the page remark content, intercepting the latest sentence from the last character to the first character according to the length of the page remark content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page remark content;
step 946, when the length of the latest sentence is smaller than the length of the page remark content, intercepting the page remark content from the last character to the direction of the first character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content;
step 948, judging whether the page remark content of a certain page is successfully matched; if yes, executing step 958; if not, returning to step 936;
step 950, comparing the length of the latest sentence with the page title content corresponding to the page being matched in the process of semantic matching the latest sentence with the page title content corresponding to each page in the document;
step 952, when the length of the latest sentence is greater than the length of the page title content, intercepting from the last character of the latest sentence to the direction of the first character according to the length of the page title content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page title content;
step 954, when the length of the latest sentence is smaller than the length of the page title content, intercepting the page title content from the last character to the direction of the first character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content;
step 956, judging whether the page title content of a certain page is successfully matched; if yes, executing step 958; if not, returning to step 936;
step 958, controlling the document to page to a matched target page;
and step 960, in the following voice page turning mode, responding to the triggering operation of the following voice page turning triggering control, and exiting the following voice page turning mode.
According to the above document control method, when a document is presented, a following voice page turning trigger control is displayed in the document's presentation interface, and when that control is triggered, a following voice page turning mode is started for the document. In this mode, the document turns pages along with the presenter's voice content, and the text content of the target page to which the document turns semantically matches that voice content. The document thus turns pages automatically: the presenter neither needs an extra presentation pen to control page turning nor needs to issue control instructions outside the presentation content. Page turning becomes very convenient, the presenter's train of thought stays coherent throughout the presentation, and the user experience of document presentation is improved.
In addition, the page title content or the page remark content, rather than the body of the document page, serves as the page-turning basis: the presenter's voice content is matched against the page remark content or the page title content of each page.
Moreover, the following page turning prompt area supports text editing by the presenter: the user can confirm the content that the terminal automatically extracted and displayed there, and can edit or modify it in the area as presentation needs require, which raises the probability that the presenter's voice content successfully matches a document page during the subsequent presentation.
When the latest sentence is subjected to semantic matching with the text content corresponding to each page in the document, the terminal can intercept the latest sentence or the text content corresponding to the page and then perform semantic matching so as to improve the accuracy of matching the voice content of the presenter with the text content corresponding to the page.
The application also provides an application scene, and the application scene applies the document control method. Specifically, the document control method is applied to the application scenario as follows:
User A starts an online collaboration document application through user terminal 1 and creates and edits an online collaboration document in it; the document is then in document editing mode. After finishing editing, user A clicks the following voice page turning initialization control displayed in the document's editing interface, triggering user terminal 1 to pop up a following page turning prompt area in the editing interface, in which the text content corresponding to each page of the document is displayed, including the retained page remark content and page title content (or the retained parts thereof).
When the terminal extracts the text content, two cases are distinguished: for the page remark content and the page title content of each page, if the number of characters is less than N (for example, N = 10), all of the content is retained; otherwise, the terminal checks whether the character following the Nth character is a delimiter such as a comma or a period; if so, the first N characters are retained; if not, reading continues until the first delimiter is encountered.
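The two extraction cases can be sketched as follows. N = 10 mirrors the example above; the delimiter set is illustrative, and this sketch drops the delimiter itself, which the text leaves unspecified.

```python
DELIMITERS = set(",.;!?，。；！？")

def extract_text(original, n=10):
    """Keep everything if shorter than n characters; otherwise keep the
    first n characters, extended up to (not including) the next delimiter."""
    if len(original) < n:
        return original
    end = n
    while end < len(original) and original[end] not in DELIMITERS:
        end += 1
    return original[:end]

kept = extract_text("my graduation defense covers three key technologies, as follows")
```

If the character right after the first n is already a delimiter, exactly the first n characters are kept, matching the case described above.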
For the text content displayed in the following page turning prompt area, after confirming that the content displayed in the following page turning prompt area is correct, the user A stores the document, generates an access address of the document through the online cooperation document application, and then sends the access address to the user terminal 2 needing to perform online cooperation on the document through the user terminal 1.
When the user a opens the document through the user terminal 1 and the user B opens the document through the user terminal 2, the user a can start presenting the document, and the document is in the document presentation mode, and the control state of the document will be synchronized to the user terminal 2 by the user terminal 1 during the whole presentation process.
User A clicks the following voice page turning trigger control displayed in the presentation interface of the document to start the following voice page turning function. During the presentation, the user terminal 1 continuously captures user A's speech, converts it into sentences one by one, performs semantic matching between the latest sentence and the text content corresponding to each page in the document, and determines the target page to turn to according to the text content that is successfully matched. The user terminal 1 then controls the document to turn to the target page, generates a page turning instruction carrying the page number of the target page, and transmits the instruction to the user terminal 2, so that the user terminal 2 turns the page of the document synchronously.
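The capture-convert-match-turn loop above can be sketched roughly as follows; `recognize_sentence`, `semantic_match`, and `broadcast` are hypothetical stand-ins for the speech-recognition, semantic-matching, and terminal-synchronization steps, not APIs from the patent:

```python
# Illustrative control loop for the follow-voice page-turning flow.
def follow_voice_loop(pages, recognize_sentence, semantic_match, broadcast):
    """pages: list of (page_number, text_content) pairs in document order."""
    while True:
        sentence = recognize_sentence()          # latest recognized sentence
        if sentence is None:                     # presentation ended
            break
        for page_number, text in pages:          # match page by page
            if semantic_match(sentence, text):   # semantics agree -> target page
                broadcast({"type": "page_turn", "page": page_number})
                break                            # turn once per sentence
```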
It should be understood that, although the steps in the above flowcharts are shown in an order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in the above flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with sub-steps or stages of other steps.
In one embodiment, as shown in fig. 10, there is provided a document control apparatus 1000, which may be a part of a computer device using a software module or a hardware module, or a combination of the two, and specifically includes: a presentation interface display module 1002, a response module 1004, and a follow flip module 1006, wherein:
the demonstration interface display module 1002 is configured to display a trigger control following the voice page turning in a demonstration interface of a document when the document is demonstrated;
the response module 1004 is configured to enter a following voice page turning mode in response to a triggering operation on the following voice page turning triggering control;
and the following page turning module 1006 is configured to, in a following voice page turning mode, follow the voice content of the presenter to turn a page to a target page of the document, where the text content in the target page matches the semantics of the voice content of the presenter.
In one embodiment, as shown in FIG. 11, the document control apparatus 1000 further includes:
an editing interface display module 1001, configured to display an initialization control following voice page turning in an editing interface of a document when the document is edited; responding to the triggering operation of the following voice page turning initialization control, and displaying a following page turning prompt area in an editing interface; and displaying the text content corresponding to each page in the document in the following page turning prompt area.
In one embodiment, the editing interface presentation module 1001 is further configured to enter a document editing mode with respect to a document in response to a trigger operation for editing the document; after entering a document editing mode, displaying an editing interface of the document; and displaying the initialization control following the voice page turning in the editing interface.
In one embodiment, referring to fig. 11, the document control apparatus 1000 further includes:
a text content editing module 1008, configured to display the edited text content in the following page turning prompt region in response to a text editing operation in the following page turning prompt region; and updating the text content corresponding to each page in the document according to the edited text content.
In one embodiment, the presentation interface presentation module 1002 is further configured to cancel presentation of the following voice page turning initialization control and the following page turning prompt region in the presentation interface of the document when the document is presented.
In one embodiment, the presentation interface presentation module 1002 is further configured to enter a document presentation mode with respect to a document in response to a trigger operation of a presentation document; after entering a document demonstration mode, displaying a demonstration interface of the document; and displaying the trigger control along with the voice page turning in the demonstration interface.
In one embodiment, the response module 1004 is further configured to exit the following voice page turning mode in response to a triggering operation of the following voice page turning trigger control in the following voice page turning mode.
In one embodiment, the document control apparatus 1000 further comprises:
The following voice page turning initialization module 1010 is configured to, before the following voice page turning mode is entered, when the document is in the document editing mode and no triggering operation on the following voice page turning initialization control displayed in the editing interface has occurred, extract the text content corresponding to each page in the document and correspondingly store the page number of each page and the corresponding text content.
In one embodiment, as shown in FIG. 11, the document control apparatus 1000 further includes:
the following voice page turning initialization module 1010 is configured to extract text content corresponding to each page in the document, where the text content includes at least one of page remark content and page title content; and correspondingly storing the page number of each page and the corresponding text content.
In one embodiment, the following voice page turning initialization module 1010 is further configured to, for each page in the document, respectively extract original text content corresponding to each page; when the number of words of the original text content is less than the preset number, the original text content is reserved and used as the text content extracted from the corresponding page; and when the number of words of the original text content is more than the preset number, reading the characters of the preset number from the first character of the original text content and then continuing to read until the separator is read for the first time, and keeping the read content as the text content extracted from the corresponding page.
In one embodiment, the document control apparatus 1000 further comprises:
the acquisition module is used for acquiring voice data of a demonstrator in a following voice page turning mode;
the sentence conversion module is used for converting the voice data of the presenter into a sentence of the presenter;
and the matching module is used for carrying out semantic matching on the latest sentences and the text contents corresponding to the pages in the document one by one and determining the target page according to the successfully matched text contents.
In one embodiment, the matching module is further configured to perform semantic matching on the latest sentence and the page remark contents corresponding to each page in the document one by one, and use the page corresponding to the first successfully matched page remark contents as the target page; when the latest sentence is not successfully matched with the page remark content corresponding to any page in the document, perform semantic matching on the latest sentence and the page title content corresponding to each page in the document one by one, and use the page corresponding to the first successfully matched page title content as the target page; and when the latest sentence is not successfully matched with the page title content corresponding to any page in the document, instruct the acquisition module to continue to collect the voice data of the presenter in the following voice page turning mode.
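The two-stage matching priority described for the matching module (page remark content first, page title content only as a fallback) can be sketched as follows; the data layout and the `semantic_match` predicate are illustrative assumptions:

```python
# Sketch of the two-stage matching priority: remarks first, then titles.
def find_target_page(sentence, pages, semantic_match):
    """pages: list of dicts with 'number', 'remark', 'title' keys."""
    for page in pages:                      # stage 1: page remark content
        if page["remark"] and semantic_match(sentence, page["remark"]):
            return page["number"]           # first successful remark match
    for page in pages:                      # stage 2: page title content
        if page["title"] and semantic_match(sentence, page["title"]):
            return page["number"]           # first successful title match
    return None                             # no match: keep listening
```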
In one embodiment, the matching module is further configured to compare lengths of the latest sentence and the page remark content corresponding to the page being matched in a successive matching process in which the latest sentence is successively semantically matched with the page remark content corresponding to each page in the document; when the length of the latest sentence is larger than that of the page remark content, intercepting from the last character of the latest sentence to the direction of the first character according to the length of the page remark content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page remark content; and when the length of the latest sentence is less than that of the page remark content, intercepting the latest sentence from the last character of the page remark content to the direction of the first character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content.
In one embodiment, the matching module is further configured to compare lengths of the latest sentence and the page title content corresponding to the page being matched in a successive matching process in which the latest sentence is successively semantically matched with the page title content corresponding to each page in the document; when the length of the latest sentence is larger than that of the page title content, intercepting from the tail character of the latest sentence to the direction of the first character according to the length of the page title content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page title content; and when the length of the latest sentence is less than that of the page title content, intercepting the latest sentence from the last character of the page title content to the direction of the first character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content.
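The length alignment described in the two embodiments above (truncating the longer of the latest sentence and the page text from its last character toward its first, to the length of the shorter one, before semantic matching) can be sketched as:

```python
# Sketch of the pre-match length alignment: the longer string is cut to a
# tail substring whose length equals the shorter string.
def align_for_matching(sentence: str, page_text: str):
    """Return the (sentence, text) pair actually compared."""
    if len(sentence) > len(page_text):
        # keep the last len(page_text) characters of the sentence
        return sentence[-len(page_text):], page_text
    if len(sentence) < len(page_text):
        # keep the last len(sentence) characters of the page text
        return sentence, page_text[-len(sentence):]
    return sentence, page_text               # equal lengths: compare as-is
```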
In one embodiment, the matching module is further configured to, in a successive matching process in which semantic matching is performed between the latest sentence and text content corresponding to each page in the document successively, convert the latest sentence into a corresponding word sequence, and convert text content corresponding to the page being matched into a corresponding word sequence; generating a sentence vector corresponding to the latest sentence based on a word vector corresponding to each participle in a word sequence corresponding to the latest sentence, and generating a text vector corresponding to the text content based on a word vector corresponding to each participle in a word sequence corresponding to the text content; and when the similarity between the sentence vector and the text vector is greater than a preset threshold, judging that the latest sentence is successfully matched with the text content.
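A minimal sketch of the vector-based matching step above, assuming whitespace tokenization and a small lookup table of word vectors; a production system would use a proper word segmenter (especially for Chinese text) and trained embeddings, and the 0.8 threshold here is only an illustrative value for the "preset threshold":

```python
import math

def sentence_vector(text, word_vectors, dim):
    """Mean of the word vectors of the segmented text (whitespace-split here)."""
    words = text.split()
    vec = [0.0] * dim
    for w in words:
        for i, v in enumerate(word_vectors.get(w, [0.0] * dim)):
            vec[i] += v
    n = len(words) or 1
    return [v / n for v in vec]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def is_match(sentence, text, word_vectors, dim, threshold=0.8):
    """Match succeeds when sentence/text vector similarity exceeds the threshold."""
    sv = sentence_vector(sentence, word_vectors, dim)
    tv = sentence_vector(text, word_vectors, dim)
    return cosine(sv, tv) > threshold
```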
In one embodiment, the document control apparatus 1000 further comprises:
the page turning module is used for generating a page turning instruction carrying a page number according to the page number of the target page; and based on the page turning instruction, after the document is turned from the current page to the target page, continuing to instruct the acquisition module to collect the voice data of the presenter in the following voice page turning mode.
In one embodiment, the document is an online collaboration document, and the document control apparatus 1000 further includes:
the invitation module is used for initiating a document cooperation invitation according to the access address of the online cooperation document;
and the synchronization module is used for synchronizing the document control information generated in the demonstration process of the online collaboration document to the terminal responding to the document collaboration invitation so that the terminal synchronizes the control state of the online collaboration document according to the document control information.
When the document is presented, the document control apparatus 1000 displays the following voice page turning trigger control in the presentation interface of the document, and starts the following voice page turning mode for the document when a triggering operation on that control occurs. In this mode, the document turns pages following the presenter's voice content, and the text content in the target page to which the document turns matches the semantics of the presenter's voice content. Because the document turns pages automatically following the presenter's voice, the presenter neither needs a separate presentation pen to control page turning nor needs to issue control instructions outside the presentation content; page turning is very convenient, the presenter's train of thought stays coherent throughout the presentation, and the user experience of document presentation is improved.
For specific limitations of the document control apparatus 1000, reference may be made to the above limitations of the document control method, which are not described herein again. The respective modules in the document control apparatus 1000 described above may be wholly or partially implemented by software, hardware, and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be the terminal 102 shown in fig. 1 and whose internal structure may be as shown in fig. 12. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program; the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless communication may be realized through Wi-Fi, an operator network, NFC (Near Field Communication), or other technologies. The computer program, when executed by the processor, implements a document control method. The display screen of the computer device may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, a key, trackball, or touchpad arranged on the housing of the computer device, or an external keyboard, touchpad, or mouse.
Those skilled in the art will appreciate that the architecture shown in fig. 12 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than those shown, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is further provided, which includes a memory and a processor, the memory stores a computer program, and the processor implements the steps of the above method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, in which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
In one embodiment, a computer program product or computer program is provided that includes computer instructions stored in a computer-readable storage medium. The computer instructions are read by a processor of a computer device from a computer-readable storage medium, and the computer instructions are executed by the processor to cause the computer device to perform the steps in the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing relevant hardware; the program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (20)

1. A document control method, characterized in that the method comprises:
when a document is demonstrated, a triggering control following voice page turning is shown in a demonstration interface of the document;
responding to the triggering operation of the following voice page turning triggering control, and entering a following voice page turning mode;
and in the following voice page turning mode, following the voice content of a presenter to turn a page to a target page of the document, wherein the text content in the target page is matched with the semantics of the voice content of the presenter.
2. The method of claim 1, wherein prior to presenting the document, the method further comprises:
when a document is edited, displaying an initialization control following voice page turning in an editing interface of the document;
responding to the triggering operation of the following voice page turning initialization control, and displaying a following page turning prompt area in the editing interface;
and displaying the text content corresponding to each page in the document in the following page turning prompt area.
3. The method according to claim 2, wherein when a document is edited, a follow voice page-turning initialization control is displayed in an editing interface of the document, and the method comprises the following steps:
entering a document editing mode with respect to the document in response to a triggering operation for editing the document;
after entering the document editing mode, displaying an editing interface of the document;
and displaying the following voice page turning initialization control in the editing interface.
4. The method of claim 2, further comprising:
displaying the edited text content in the following page turning prompt region in response to the text editing operation in the following page turning prompt region;
and updating the text content corresponding to each page in the document according to the edited text content.
5. The method of claim 1, wherein presenting a follow-up voice paging trigger control in a presentation interface of a document while presenting the document comprises:
in response to a triggering operation for presenting the document, entering a document presentation mode with respect to the document;
after entering the document demonstration mode, displaying a demonstration interface of the document;
and displaying the following voice page turning trigger control in the demonstration interface.
6. The method of claim 1, further comprising:
and in the following voice page turning mode, responding to the triggering operation of the following voice page turning triggering control, and exiting the following voice page turning mode.
7. The method of claim 1, wherein prior to entering the following voice page turning mode, the method further comprises:
when the document is in a document editing mode and triggering operation of a following voice page turning initialization control displayed in an editing interface does not occur, extracting text content corresponding to each page in the document, and correspondingly storing the page number of each page and the corresponding text content.
8. The method of claim 1, further comprising:
extracting text content corresponding to each page in the document, wherein the text content comprises at least one of page remark content and page title content;
and correspondingly storing the page number of each page and the corresponding text content.
9. The method according to claim 8, wherein the extracting the text content corresponding to each page in the document comprises:
for each page in the document, respectively extracting the original text content corresponding to each page;
when the number of words of the original text content is less than the preset number, the original text content is reserved and used as the text content extracted from the corresponding page;
and when the number of words of the original text content is more than the preset number, reading the characters of the preset number from the first character of the original text content and then continuing to read the characters until the separator is read for the first time, and keeping the read content as the text content extracted from the corresponding page.
10. The method of claim 1, further comprising:
collecting voice data of a presenter in the following voice page turning mode;
converting the voice data of the presenter into a sentence of the presenter;
and carrying out semantic matching on the latest sentence and the text content corresponding to each page in the document one by one, and determining the target page according to the successfully matched text content.
11. The method according to claim 10, wherein the semantic matching of the latest sentence with the text content corresponding to each page in the document one by one comprises:
carrying out semantic matching on the latest sentence and the page remark contents corresponding to each page in the document one by one, and taking the page corresponding to the first successfully matched page remark contents as a target page;
when the latest sentence is not successfully matched with the page remark content corresponding to each page in the document, performing semantic matching on the latest sentence and the page title content corresponding to each page in the document one by one, and taking the page corresponding to the first successfully matched page title content as a target page;
and when the latest sentence is not successfully matched with the page title content corresponding to each page in the document, continuously acquiring the voice data of the presenter in the following voice page turning mode.
12. The method according to claim 11, wherein the semantic matching of the latest sentence with the page remark content corresponding to each page in the document one by one comprises:
in the successive matching process of carrying out semantic matching on the latest sentence and the page remark content corresponding to each page in the document successively, comparing the length of the latest sentence and the page remark content corresponding to the page being matched;
when the length of the latest sentence is greater than the length of the page remark content, intercepting the latest sentence from the last character of the latest sentence to the direction of the first character according to the length of the page remark content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page remark content;
and when the length of the latest sentence is smaller than that of the page remark content, intercepting from the last character of the page remark content to the direction of the first character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content.
13. The method of claim 11, wherein the semantic matching of the latest sentence with the page title content corresponding to each page in the document one by one comprises:
in the successive matching process of carrying out semantic matching on the latest sentence and the page title content corresponding to each page in the document successively, comparing the lengths of the latest sentence and the page title content corresponding to the page being matched;
when the length of the latest sentence is greater than that of the page title content, intercepting from the last character of the latest sentence to the direction of the first character according to the length of the page title content to obtain an intercepted sentence, and performing semantic matching on the intercepted sentence and the page title content;
and when the length of the latest sentence is smaller than that of the page title content, intercepting from the tail character of the page title content to the direction of the head character according to the length of the latest sentence to obtain the intercepted content, and performing semantic matching on the latest sentence and the intercepted content.
14. The method according to claim 10, wherein the semantic matching of the latest sentence with the text content corresponding to each page in the document one by one comprises:
in the successive matching process of carrying out semantic matching on the latest sentence and the text content corresponding to each page in the document successively, converting the latest sentence into a corresponding word sequence, and converting the text content corresponding to the page being matched into a corresponding word sequence;
generating a sentence vector corresponding to the latest sentence based on a word vector corresponding to each participle in the word sequence corresponding to the latest sentence, and generating a text vector corresponding to the text content based on a word vector corresponding to each participle in the word sequence corresponding to the text content;
and when the similarity between the sentence vector and the text vector is greater than a preset threshold, judging that the latest sentence is successfully matched with the text content.
15. The method of claim 10, further comprising:
generating a page turning instruction carrying the page number according to the page number of the target page;
and based on the page turning instruction, after the document is turned from the current page to the target page, continuously acquiring voice data of a presenter in the following voice page turning mode.
16. The method according to any one of claims 1 to 15, wherein the document is an online collaboration document, the method further comprising:
initiating a document cooperation invitation according to the access address of the online cooperation document;
synchronizing the document control information generated in the demonstration process of the online collaboration document to a terminal responding to the document collaboration invitation, so that the terminal synchronizes the control state of the online collaboration document according to the document control information.
17. A document control apparatus, characterized in that the apparatus comprises:
the demonstration interface display module is used for displaying the trigger control following the voice page turning in the demonstration interface of the document when the document is demonstrated;
the response module is used for responding to the triggering operation of the following voice page turning triggering control and entering a following voice page turning mode;
and the following page turning module is used for following the voice content of the presenter to turn pages to a target page of the document in the following voice page turning mode, and the text content in the target page is matched with the semantics of the voice content of the presenter.
18. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 16.
19. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 16.
20. A computer program product comprising a computer program, characterized in that the computer program realizes the steps of the method of any one of claims 1 to 16 when executed by a processor.
CN202111203894.0A 2021-10-15 2021-10-15 Document control method, device, computer equipment and storage medium Active CN113918114B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111203894.0A CN113918114B (en) 2021-10-15 2021-10-15 Document control method, device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113918114A true CN113918114A (en) 2022-01-11
CN113918114B CN113918114B (en) 2024-02-13

Family

ID=79240961

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111203894.0A Active CN113918114B (en) 2021-10-15 2021-10-15 Document control method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113918114B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102339193A (en) * 2010-07-21 2012-02-01 Tcl集团股份有限公司 Voice control conference speed method and system
CN109976617A (en) * 2019-04-03 2019-07-05 腾讯科技(深圳)有限公司 Document display method and apparatus
US20200004493A1 (en) * 2017-02-22 2020-01-02 Samsung Electronics Co., Ltd. Electronic apparatus, document displaying method thereof and non-transitory computer readable recording medium
CN111954864A (en) * 2018-04-11 2020-11-17 微软技术许可有限责任公司 Automated presentation control
CN113035189A (en) * 2021-02-24 2021-06-25 北京小米移动软件有限公司 Document demonstration control method, device and equipment


Also Published As

Publication number Publication date
CN113918114B (en) 2024-02-13

Similar Documents

Publication Publication Date Title
US20180130496A1 (en) Method and system for auto-generation of sketch notes-based visual summary of multimedia content
WO2019100350A1 (en) Providing a summary of a multimedia document in a session
US20190068527A1 (en) Method and system for conducting an automated conversation with a virtual agent system
CN110517689B (en) Voice data processing method, device and storage medium
CN114401438B (en) Video generation method and device for virtual digital person, storage medium and terminal
US11640503B2 (en) Input method, input device and apparatus for input
US11126794B2 (en) Targeted rewrites
CN105934791A (en) Voice input command
CN107291704B (en) Processing method and device for processing
CN111898388A (en) Video subtitle translation editing method and device, electronic equipment and storage medium
US12041313B2 (en) Data processing method and apparatus, device, and medium
US20130332859A1 (en) Method and user interface for creating an animated communication
US20230177878A1 (en) Systems and methods for learning videos and assessments in different languages
CN104182381A (en) Character input method and system
CN114154459A (en) Speech recognition text processing method and device, electronic equipment and storage medium
KR20190074508A (en) Method for crowdsourcing data of chat model for chatbot
CN110992960A (en) Control method, control device, electronic equipment and storage medium
US20210233427A1 (en) System and methods for facilitating language learning
CN110992958B (en) Content recording method, content recording apparatus, electronic device, and storage medium
US20230326369A1 (en) Method and apparatus for generating sign language video, computer device, and storage medium
CN113918114B (en) Document control method, device, computer equipment and storage medium
CN116430999A (en) Method and system for realizing fingertip visual interaction technology by voice assistant
CN109979435B (en) Data processing method and device for data processing
CN112017487A (en) Flat Flash learning system based on artificial intelligence
EP4379598A1 (en) A sign language translation method and system thereof

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant