CN116260995A - Method for generating media directory file and video presentation method - Google Patents

Method for generating media directory file and video presentation method

Info

Publication number
CN116260995A
CN116260995A (Application No. CN202111500492.7A)
Authority
CN
China
Prior art keywords
node
character
media
directory
language
Prior art date
Legal status
Pending
Application number
CN202111500492.7A
Other languages
Chinese (zh)
Inventor
鄢彪
张怡
Current Assignee
Shanghai Hode Information Technology Co Ltd
Original Assignee
Shanghai Hode Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Hode Information Technology Co Ltd filed Critical Shanghai Hode Information Technology Co Ltd
Priority to CN202111500492.7A priority Critical patent/CN116260995A/en
Publication of CN116260995A publication Critical patent/CN116260995A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H04N21/26258Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists for generating a list of items to be played back in a given order, e.g. playlist, or scheduling item distribution according to such list
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71Indexing; Data structures therefor; Storage structures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74Browsing; Visualisation therefor
    • G06F16/745Browsing; Visualisation therefor the internal structure of a single video sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7844Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using original textual content or text extracted from visual content or transcript of audio data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/7867Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title and artist information, manually generated time, location and usage information, user ratings
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/432Content retrieval operation from a local storage medium, e.g. hard-disk
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/472End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H04N21/47202End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for requesting content on demand, e.g. video on demand
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/482End-user interface for program selection
    • H04N21/4825End-user interface for program selection using a list of items to be played back in a given order, e.g. playlists
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Software Systems (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a method for generating a media directory file, which is used in a server and comprises the following steps: acquiring text content of target media, wherein the text content corresponds to audio content in the target media; determining a plurality of jump nodes of the target media according to the text content; and setting a directory index for each of the plurality of jump nodes to generate a media directory file of the target media; wherein the directory index comprises the directory name of the jump node and the time information of the jump node in the target media. The application also provides a generating system, a video presentation method, a computer device, and a computer readable storage medium. According to the technical scheme, a user can conveniently and accurately locate and jump to any content paragraph in the target media, and the user experience is optimized.

Description

Method for generating media directory file and video presentation method
Technical Field
Embodiments of the present disclosure relate to the field of video, and in particular, to a method and system for generating a media directory file, a video presentation method, a computer device, and a computer readable storage medium.
Background
With the rapid popularization of internet technology, video has become an important part of people's daily life, learning, and entertainment. For example, science and education videos are becoming a dominant way for people to acquire knowledge. However, a single science and education video may involve multiple knowledge points or multiple solutions, and typically has a duration of several tens of minutes.
Some users may only be interested in a certain knowledge point in the video. When such a user wants to watch a knowledge point that appears later in the video, he or she has to search for it by dragging the progress bar, which is tedious, inefficient, and results in a poor experience.
The foregoing should not be construed as limiting the application scenario and the patent protection scope of the embodiments of the present application.
Disclosure of Invention
In view of the foregoing, it is an object of embodiments of the present application to provide a method and system for generating a media directory file, a video presentation method, a computer device, and a computer-readable storage medium, which can be used to solve the above-mentioned problems.
An aspect of an embodiment of the present application provides a method for generating a media directory file, which is used in a server, and includes:
acquiring text content of target media, wherein the text content corresponds to audio content in the target media;
determining a plurality of jump nodes of the target media according to the text content; and
setting a directory index for each of the plurality of jump nodes to generate a media directory file of the target media;
wherein the directory index comprises the directory name of the jump node and the time information of the jump node in the target media.
Optionally, the determining a plurality of jump nodes of the target media according to the text content includes:
determining a transition sentence in the text content, wherein the transition sentence is used for separating different content paragraphs; and
determining the plurality of jump nodes according to the time information of each transition sentence in the target media.
Optionally, the determining the transition sentence in the text content includes:
matching each of a plurality of preset language tags with the text content; and
determining a sentence in the text content that matches any one of the plurality of language tags as the transition sentence.
Optionally, each language tag includes a plurality of characters;
the matching the plurality of language tags with the text content respectively includes:
matching the text content with each node of a tree structure, wherein the tree structure comprises a path for each of the language tags, and the nodes in each path correspond one-to-one, in order, to the characters in the corresponding language tag.
Optionally, the matching the text content with each node of the tree structure includes:
matching the characters sequentially according to their order in the text content, and when matching the mth character:
determining whether the (m-1)th character in the text content matches a node in the tree structure;
if the (m-1)th character matches a node that is not a leaf node in the tree structure, matching the mth character with each child node under the node matched by the (m-1)th character to obtain the node matched by the mth character; and if the node matched by the mth character is a leaf node, determining that the target language tag corresponding to the path in which that node is located has been matched;
if the (m-1)th character does not match any node in the tree structure, or matches a leaf node in the tree structure, matching the mth character with the first node of each path; wherein the first node of each path corresponds to the first character of the corresponding language tag, and identical first-character nodes are shared.
Optionally, the configuration operation of the tree structure is further included:
splitting each language label by taking characters as units;
configuring a root node of the tree structure;
respectively configuring the first characters of each language label as each child node of the root node, wherein the same first character nodes are shared;
configuring an ith path in the tree structure according to the character sequence in the ith language tag; wherein the ith path comprises each node from the ith child node of the root node to the ith leaf node under the ith child node; the ith child node of the root node corresponds to the first character of the ith language tag, the ith leaf node under the ith child node of the root node corresponds to the last character of the ith language tag, and each pair of adjacent characters in the ith language tag corresponds to a parent node and a child node in the tree structure; the ith language tag is any one of the plurality of language tags;
if the first plurality of characters of the ith language tag and the first plurality of characters of the other language tags are in one-to-one correspondence and the same, the plurality of nodes corresponding to the first plurality of characters of the ith language tag and the plurality of nodes corresponding to the first plurality of characters of the other language tags are configured as common nodes in one-to-one correspondence in the tree structure.
Optionally, the setting a directory index for each of the plurality of jump nodes includes:
acquiring key content associated with each transition sentence; and
setting the directory name of the jump node corresponding to each transition sentence according to the key content associated with that transition sentence.
Optionally, the acquiring key content associated with each transition sentence includes:
extracting a plurality of nouns from the text content;
determining, according to the occurrence frequency of each of the nouns, one or more nouns with the highest occurrence frequency;
determining a media type of the target media according to the one or more nouns;
determining a target noun lexicon associated with the target media according to the media type;
matching each noun in the target noun lexicon with each noun in the whole sentence in which the transition sentence is located; and
acquiring the key content associated with the transition sentence according to the matching result;
wherein the key content comprises nouns present both in the whole sentence in which the transition sentence is located and in the target noun lexicon.
An aspect of an embodiment of the present application further provides a system for generating a media directory file, where the system is used in a server, and includes:
an acquisition module, configured to acquire text content of target media, wherein the text content corresponds to audio content in the target media;
a determining module, configured to determine a plurality of jump nodes of the target media according to the text content; and
a generating module, configured to set a directory index for each of the plurality of jump nodes, so as to generate a media directory file of the target media;
wherein the directory index comprises the directory name of the jump node and the time information of the jump node in the target media.
An aspect of an embodiment of the present application further provides a video presentation method, which is used in a user device, and includes:
presenting at least one of a plurality of directory names while presenting the video; wherein each directory name corresponds to one piece of time information of the video and is used for representing the content of the video at the corresponding time information;
detecting a selection of any one of the plurality of directory names; and
in response to the selection of any one of the plurality of directory names, jumping the playing progress of the video to target time information; wherein the target time information corresponds to the selected directory name.
Optionally, the plurality of directory names are presented in the form of a list in a display window.
Optionally, the plurality of directory names are marked on a progress bar of the video;
wherein the time information of each directory name corresponds one-to-one to the marked position of that directory name on the progress bar.
An aspect of the embodiments of the present application further provides a computer device, where the computer device includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor executes the computer program to implement the steps of the method for generating a media directory file or the method for presenting a video described above.
An aspect of the embodiments of the present application further provides a computer readable storage medium having a computer program stored therein, where the computer program is executable by at least one processor to cause the at least one processor to perform the steps of the method for generating a media directory file or the method for presenting video described above.
The method and system for generating a media directory file, the video presentation method, the computer device, and the computer readable storage medium provided by the embodiments of the present application have the following technical advantages:
according to the text content of the target media, a plurality of jump nodes and their directory indexes corresponding to different content paragraphs are configured, so that a viewer can conveniently and accurately locate and jump to any content paragraph in the target media, and the user experience is optimized.
Drawings
FIG. 1 schematically illustrates an application environment diagram of a method of generating a media directory file and a method of video presentation;
FIG. 2 schematically illustrates a flowchart of a method of generating a media directory file according to an embodiment of the present application;
fig. 3 schematically shows a flow chart of step S202 in fig. 2;
fig. 4 schematically shows a flow chart of step S300 in fig. 3;
fig. 5 schematically shows a flow chart of step S400 in fig. 4;
fig. 6 schematically shows a flow chart of step S500 in fig. 5;
fig. 7 schematically shows a tree structure;
FIG. 8 schematically illustrates additional steps of a method for generating a media directory file according to an embodiment of the present application;
fig. 9 schematically shows a flow chart of step S204 in fig. 2;
fig. 10 schematically shows a flowchart of step S900 in fig. 9;
fig. 11 schematically illustrates a flowchart of a video presentation method according to a second embodiment of the present application;
FIG. 12 schematically illustrates a display interface in a user device;
FIG. 13 schematically illustrates a block diagram of a media directory file generation system according to a third embodiment of the present application;
fig. 14 schematically illustrates a block diagram of a video presentation system according to a fourth embodiment of the present application; and
Fig. 15 schematically shows a hardware architecture diagram of a computer device according to a fifth embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
It should be noted that the descriptions of "first," "second," etc. in the embodiments of the present application are for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments may be combined with each other, but such combinations must be realizable by those skilled in the art; when a combination of technical solutions is contradictory or cannot be realized, the combination should be regarded as not existing and as falling outside the protection scope of the present application.
The inventors have found that a user may only be interested in a certain knowledge point in a video. When the user wants to watch a knowledge point that appears later in the video, he or she has to search for it by dragging the progress bar, which is tedious, inefficient, and results in a poor experience.
The embodiments of the present application aim to identify specific tags in a video through recognition technology and to generate a video directory according to the corresponding time points and tags, so that a viewer can conveniently and accurately locate a certain knowledge point, and the user experience is optimized. That is, the user may select a knowledge point based on the video directory and jump directly to it, thereby optimizing the user experience.
In the description of the present application, it should be understood that the numerical references before the steps do not indicate the order in which the steps are performed, but are only used for convenience of description and for distinguishing the steps, and thus should not be construed as limiting the present application.
Fig. 1 schematically shows an application environment diagram of a method for generating a media directory file and a method for video presentation in the present application.
The server 2 may push media (e.g. video, audio) and/or media directory files to the user device 4 via a network.
Server 2 may be composed of a single computing device or multiple computing devices, such as a rack server, a blade server, a tower server, or a cabinet server (including an independent server, or a server cluster composed of multiple servers), among others. The one or more computing devices may include virtualized compute instances. A computing device may load a virtual machine based on a virtual image and/or other data defining particular software (e.g., operating system, specialized applications, server software) for emulation. As the demand for different types of processing services changes, different virtual machines may be loaded and/or terminated on the one or more computing devices.
The network may include various network devices such as routers, switches, multiplexers, hubs, modems, bridges, repeaters, firewalls, proxy devices, and/or the like. The network may include physical links such as coaxial cable links, twisted pair cable links, fiber optic links, combinations thereof, and/or the like. The network may include wireless links, such as cellular links, satellite links, wi-Fi links, and/or the like.
The user equipment 4 may be configured to access the server 2. The user device 4 may comprise any type of computer device, such as a smart phone, a tablet, a smart tv, a projector, a personal computer, etc. The user device 4 may have a built-in browser or a dedicated program, through which audio and video are received and content is output to the user. The content may include video, audio, comments, text data, and the like.
The user device 4 may comprise an interface, which may comprise an input element. For example, the input element may be configured to receive user instructions that may cause the user device 4 to perform various types of operations, such as dragging a progress bar, and the like.
The generation scheme and the video presentation scheme of the media directory file provided by the embodiments of the present application will be described below by way of embodiments.
Embodiment One
Fig. 2 schematically shows a flowchart of a method for generating a media directory file according to an embodiment of the present application. The method embodiment may be performed in a server. The server 2 is taken as the execution body of this flow by way of example.
As shown in fig. 2, the method for generating a media directory file may include steps S200 to S204, wherein:
step S200, obtaining text content of target media, wherein the text content corresponds to audio content in the target media.
The target media can be a video file carrying audio data or a pure audio file.
The text content is composed of a plurality of sentences.
In the above steps, the text content may be acquired in various ways, such as:
(1) Performing speech recognition on the target media through speech recognition technology, so as to convert the speech content in the target media into corresponding text, thereby obtaining the text content. Speech recognition technology may also be called automatic speech recognition (Automatic Speech Recognition, ASR), computer speech recognition (Computer Speech Recognition), or speech-to-text (STT); it uses a computer to automatically convert human speech content into corresponding text.
(2) Reading subtitle text matched with the target media, for example subtitle files in formats such as srt, ass, ssa, and idx (a parsing sketch is given below).
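As an illustrative aid (not part of the claimed embodiments), the following minimal Python sketch shows how approach (2) might be realized: it parses an SRT-format subtitle file into (start time, text) entries so that each sentence of the text content keeps its time information in the target media. The function names and the SRT assumption are illustrative only.

```python
import re
from datetime import timedelta

# A minimal sketch (not part of the claimed embodiments): parse an SRT-format
# subtitle file into (start_time, text) entries so that each sentence of the
# text content keeps the time information it has in the target media.

_TIME = re.compile(r"(\d{2}):(\d{2}):(\d{2})[,.](\d{3})")

def _parse_time(stamp: str) -> timedelta:
    h, m, s, ms = map(int, _TIME.match(stamp).groups())
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)

def load_srt(path: str):
    """Return a list of (start_time, text) entries from an SRT subtitle file."""
    with open(path, encoding="utf-8") as f:
        blocks = f.read().strip().split("\n\n")
    entries = []
    for block in blocks:
        lines = block.strip().splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue                      # skip malformed blocks
        start = _parse_time(lines[1].split("-->")[0].strip())
        text = " ".join(lines[2:])
        entries.append((start, text))
    return entries
```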
Step S202, determining a plurality of jump nodes of the target media according to the text content.
Transition sentences for carrying over, turning, emphasizing, and the like are usually spoken between different content paragraphs or before key content.
For example, if the target media is composed of multiple video segments with different content, there will typically be a transition sentence between video segments. Taking an educational video (video A) as an example, in the same video, the teacher recording the video first gives a segment of opening remarks and then a segment of knowledge point explanation. In the transition from the "opening remarks" to the "knowledge point explanation", the teacher will generally say "next", "focus on", and so on.
Thus, the text content may be analyzed, for example based on the narrative structure, to determine whether the target media includes a plurality of content paragraphs, and the start time of each of the plurality of content paragraphs in the target media.
As an alternative embodiment, to determine the plurality of jump nodes more precisely, as shown in fig. 3, the step S202 may include steps S300 to S302, wherein: step S300, determining a transition sentence in the text content, where the transition sentence is used to separate different content paragraphs; and step S302, determining the plurality of jump nodes according to the time information of each transition sentence in the target media. In this alternative embodiment, the transition sentences in the text content are identified, and each identified transition sentence serves as an effective criterion for distinguishing adjacent content paragraphs in the same media content (e.g. video). Continuing with video A as an example, the teacher gives a segment of opening remarks and a segment of knowledge point explanation. If the text content corresponding to video A matches a transition sentence (for example, "the key point comes"), it is determined that the transition sentence is the demarcation point between the "opening remarks" video paragraph and the "knowledge point explanation" video paragraph. The demarcation point is also called a "jump node", i.e. a point that can be jumped to directly during video presentation, so that a user can skip the "opening remarks" video paragraph and directly watch the "knowledge point explanation" video paragraph.
Determination/identification of the transition sentence: the text content may be compared with preset tags (preset sentences), and the sentences that match a preset tag are taken as transition sentences according to the comparison result, for example as sketched below. In other embodiments, the transition sentence may be identified by machine learning, a deep neural network, or the like.
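Purely as an illustration of the comparison approach just described, and under the assumption that the text content has already been split into timed sentences (e.g. by the subtitle sketch above), a naive version can simply scan every sentence for every preset tag; the helper name below is an assumption.

```python
# A naive matching sketch (helper name is an assumption): mark a sentence as a
# transition sentence if it contains any of the preset language tags.  The
# trie-based matching described further below avoids scanning every tag for
# every sentence.

def find_transition_sentences(sentences, language_tags):
    """sentences: iterable of (start_time, text); returns the matched entries."""
    matches = []
    for start_time, text in sentences:
        if any(tag in text for tag in language_tags):
            matches.append((start_time, text))
    return matches
```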
Step S204, setting a directory index for each of the plurality of jump nodes, so as to generate a media directory file of the target media.
The directory index comprises the directory name of the jump node and the time information of the jump node in the target media.
Continuing with video A as an example, the teacher gives a segment of opening remarks and a segment of knowledge point explanation. If a transition sentence (for example, "the key point comes") is matched in the text content corresponding to video A, the playing time of the speech signal corresponding to the transition sentence in video A is 12 minutes, 0 seconds, 44 milliseconds, and the content explained after the transition sentence is the "Pythagorean theorem", then the following directory index may be configured for the transition sentence:
the directory name of the directory index is: Pythagorean theorem;
the time information of the directory index is: 12 minutes, 0 seconds, 44 milliseconds.
It should be noted that the directory name may be extracted from the whole sentence in which the transition sentence is located, or from the text content corresponding to the "knowledge point explanation" video paragraph. The directory name may be extracted in a variety of ways, such as by part of speech, by part of speech and frequency of occurrence, or by artificial intelligence.
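As a purely illustrative sketch, a generated media directory file for video A could be serialized roughly as follows; the field names and the JSON layout are assumptions, not a format defined by the embodiments.

```python
import json

# Illustrative only: one possible JSON layout for the media directory file of
# video A.  The field names ("name", "time_ms") are assumptions.
media_directory = {
    "media_id": "video_A",   # hypothetical identifier
    "entries": [
        # 12 minutes, 0 seconds, 44 milliseconds expressed in milliseconds
        {"name": "Pythagorean theorem", "time_ms": 12 * 60_000 + 44},
    ],
}

print(json.dumps(media_directory, ensure_ascii=False, indent=2))
```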
Video A may be provided to the user device 4, and when the user watches video A through the user device 4 and wants to watch the explanation of the Pythagorean theorem directly, he or she may perform the following exemplary operations based on the directory index:
(1) The progress bar is dragged to 12 minutes, 0 seconds, 44 milliseconds or thereabouts.
(2) The directory index is triggered, and the user device automatically jumps to 12 minutes, 0 seconds, 44 milliseconds.
It should be noted that, to further improve the user experience, when the progress bar is dragged to within a few seconds (e.g., 3 seconds) of 12 minutes, 0 seconds, 44 milliseconds, the drag point of the progress bar may automatically snap to 12 minutes, 0 seconds, 44 milliseconds.
Based on the foregoing, the method for generating a media directory file according to the embodiments of the present application has the following advantages: according to the text content of the target media, a plurality of jump nodes and their directory indexes corresponding to different content paragraphs are configured, so that a viewer can conveniently and accurately locate and jump to any content paragraph in the target media, and the user experience is optimized.
Taking educational videos as an example, viewers can jump to knowledge points of their own interest conveniently and accurately according to the catalog index.
Taking interview-like video as an example, viewers can jump to their own topic segments of interest conveniently and accurately according to the catalog index.
Of course, embodiments of the present application are not limited to educational videos and interview videos, and may be used in other scenarios as well.
Several alternative embodiments are provided below.
As an alternative embodiment, to improve the accuracy of determining/identifying the transition sentence, as shown in fig. 4, the step S300 of determining the transition sentence in the text content may include: step S400, matching each of a plurality of preset language tags with the text content; and step S402, determining a sentence in the text content that matches any one of the language tags as the transition sentence.
The plurality of language tags may be obtained through big data analysis and collection, or user settings.
The plurality of language tags includes, for example, "next", "focus", and the like.
The language tags are expressions commonly used as transitions between different content paragraphs (video paragraphs). Therefore, by matching the plurality of language tags with the text content, the transition sentences can be found efficiently and accurately.
However, due to the flexibility and the massive number of combinations of human language, the number of such language tags is at least in the tens of thousands. For a media file (video) of a few tens of minutes, matching tens of thousands of language tags individually against the text content of the media file is very computationally intensive, i.e. it consumes a great deal of the performance of the server 2. Further, if such matching needs to be performed on thousands of such media files, the server 2 may be unable to bear the load and may even crash. In view of this, the embodiments of the present application provide the following alternative embodiments.
As an alternative embodiment, each language tag includes a plurality of characters; for example, the language tag "next" is itself composed of a plurality of individual characters.
As shown in fig. 5, the step S400 of matching the plurality of language tags with the text content may include: step S500, matching the text content with each node of a tree structure, wherein the tree structure comprises a path for each of the language tags, and the nodes in each path correspond one-to-one, in order, to the characters in the corresponding language tag. Each language tag corresponds to one path, and the paths together form a tree structure. Traversal and matching are performed over the nodes of the tree structure, so that a matching result can be obtained quickly and with low overhead.
As shown in fig. 6, the step S500 may include: step S600, matching the characters sequentially according to their order in the text content, and when matching the mth character: step S600A, determining whether the (m-1)th character in the text content matches a node in the tree structure; step S600B, if the (m-1)th character matches a node that is not a leaf node in the tree structure, matching the mth character with each child node under the node matched by the (m-1)th character to obtain the node matched by the mth character, and if the node matched by the mth character is a leaf node, determining that the target language tag corresponding to the path in which that node is located has been matched; step S600C, if the (m-1)th character does not match any node in the tree structure, or matches a leaf node in the tree structure, matching the mth character with the first node of each path, wherein the first node of each path corresponds to the first character of the corresponding language tag, and identical first-character nodes are shared. m is an integer greater than or equal to 2.
For ease of understanding, an example application is provided below in connection with fig. 7.
Fig. 7 provides a tree structure in which there are 5 paths below the root node; each node corresponds to one character of a language tag (the example tags are Chinese phrases, whose characters are glossed word by word in English below):
"know" → "recognize" → "point" → "come";
"know" → "recognize" → "come";
"know" → "recognize" → "heavy" → "point";
"mark" → "heavy" → "point";
"open" → "start" → "have been".
Each of the above characters corresponds to one node in the tree structure, and the two nodes "know" and "recognize" are shared by the first three paths.
The text content includes the following passage: "… classmates, now the knowledge point comes …".
(1) Suppose that the mth character is the character glossed "present" in the above passage.
Correspondingly, the (m-1)th character is "people" (punctuation marks are skipped); it has no matched node, so the mth character is matched against each child node of the root node ("know", "mark", "open"), and the matching fails.
m is incremented by 1, and the flow proceeds to (2).
(2) The mth character is now "at".
Correspondingly, the (m-1)th character is "present"; it has no matched node, so the mth character "at" is matched against each child node of the root node ("know", "mark", "open"), and the matching fails.
m is incremented by 1, and the flow proceeds to (3).
(3) The mth character is now "know".
Correspondingly, the (m-1)th character is "at"; it has no matched node, so the mth character "know" is matched against each child node of the root node ("know", "mark", "open") and successfully matches the node "know".
m is incremented by 1, and the flow proceeds to (4).
(4) The mth character is now "recognize".
Correspondingly, the (m-1)th character is "know", which matched the node "know", so the mth character "recognize" is matched against each child node under the node "know" and successfully matches the node "recognize".
m is incremented by 1, and the flow proceeds to (5).
(5) The mth character is now "point".
Correspondingly, the (m-1)th character is "recognize", which matched the node "recognize", so the mth character "point" is matched against each child node under the node "recognize" ("point", "come", "heavy") and successfully matches the node "point".
m is incremented by 1, and the flow proceeds to (6).
(6) The mth character is now "come".
Correspondingly, the (m-1)th character is "point", which matched the node "point", so the mth character "come" is matched against each child node under the node "point" and successfully matches the node "come".
Since the node "come" is a leaf node, this round of matching ends. In this round of matching, the passage "… classmates, now the knowledge point comes …" successfully matches the language tag "knowledge point come".
m is incremented by 1, and the flow proceeds to (7).
(7) The mth character is now "have been".
Correspondingly, the (m-1)th character is "come", which matched the node "come"; since the node "come" is a leaf node (a leaf node is a node without child nodes), the mth character is matched against each child node of the root node ("know", "mark", "open"), and a matching result is obtained.
m is incremented by 1, and the flow proceeds to (8).
The matching process described in this embodiment is exemplarily illustrated above through the processes (1) to (7) and is not described further here. It can be understood that, by starting the search of the text content at the root node of the tree structure and gradually narrowing the search (matching) range in each round, the amount of computation in the matching operation can be greatly reduced and matching time can be saved.
It should be noted that the first character of the text content is matched directly against each child node under the root node. In addition, each character used for matching in the text content may be a Chinese character, a word, or the like, but punctuation marks are not matched.
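To make the above traversal concrete, the following Python sketch implements steps S600A to S600C under stated assumptions: the tree structure is stored as nested dicts whose keys are single characters, a leaf carries a "$tag" entry naming its language tag, and when a child lookup fails the search falls back to the root's children (a detail not spelled out above). A construction sketch for such a trie is given after the configuration steps below.

```python
# A sketch of steps S600A-S600C.  Assumptions: nested-dict trie, "$tag" marks
# a leaf, and a failed child lookup falls back to the root's children.

PUNCTUATION = set("，。！？、；：,.!?;: ")

def match_language_tags(text: str, root: dict):
    """Return (index, tag) pairs for every language tag matched in `text`."""
    matches = []
    prev = None                                   # node matched by the (m-1)th character
    for i, ch in enumerate(text):
        if ch in PUNCTUATION:                     # punctuation marks are not matched
            continue
        if prev is not None and "$tag" not in prev:
            node = prev.get(ch)                   # step S600B: search the child nodes
            if node is None:
                node = root.get(ch)               # assumed fallback to the first nodes
        else:
            node = root.get(ch)                   # step S600C: search the first nodes
        if node is not None and "$tag" in node:
            matches.append((i, node["$tag"]))     # a leaf node was reached: tag matched
        prev = node
    return matches

# Tiny illustrative trie for a single two-character tag "ab":
demo_root = {"a": {"b": {"$tag": "ab"}}}
print(match_language_tags("xx, ab yy", demo_root))   # -> [(5, 'ab')]
```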
As an alternative embodiment, as shown in fig. 8, the method further includes a configuration operation of the tree structure, and specifically includes the following steps:
step S800, splitting each language tag into individual characters;
step S802, configuring a root node of the tree structure;
step S804, configuring the first character of each language tag as a child node of the root node, wherein identical first-character nodes are shared;
step S806, configuring an ith path in the tree structure according to the character sequence in the ith language tag; wherein the ith path comprises each node from the ith child node of the root node to the ith leaf node under the ith child node; the ith child node of the root node corresponds to the first character of the ith language tag, the ith leaf node under the ith child node of the root node corresponds to the last character of the ith language tag, and each pair of adjacent characters in the ith language tag corresponds to a parent node and a child node in the tree structure; the ith language tag is any one of the plurality of language tags;
if the first several characters of the ith language tag and the first several characters of another language tag are identical in one-to-one correspondence, the nodes corresponding to the first several characters of the ith language tag and the nodes corresponding to the first several characters of that other language tag are configured as shared nodes, in one-to-one correspondence, in the tree structure.
With continued reference to fig. 7, the present embodiment provides the following 5 language tags:
language tag 1: "knowledge point come";
language tag 2: "knowledge come";
language tag 3: "knowledge emphasis";
language tag 4: "key points";
language tag 5: "Start".
Each language tag is split character by character as follows (the characters are glossed in English as in fig. 7):
language tag 1 is split into: "know", "recognize", "point", "come";
language tag 2 is split into: "know", "recognize", "come";
language tag 3 is split into: "know", "recognize", "heavy", "point";
language tag 4 is split into: "mark", "heavy", "point";
language tag 5 is split into: "open", "start", "have been".
The first character of each language tag is taken as a child node of the root node, namely "know", "mark", "open" (identical first characters share one node).
A path starting from the node "know" is set for language tag 1, and the subsequent nodes of the path are, in order, "recognize", "point", "come".
A path starting from the node "know" is set for language tag 2, and the subsequent nodes of the path are, in order, "recognize", "come".
A path starting from the node "know" is set for language tag 3, and the subsequent nodes of the path are, in order, "recognize", "heavy", "point".
A path starting from the node "mark" is set for language tag 4, and the subsequent nodes of the path are, in order, "heavy", "point".
A path starting from the node "open" is set for language tag 5, and the subsequent nodes of the path are, in order, "start", "have been".
With continued reference to fig. 7, the first two characters ("know", "recognize") of language tags 1, 2 and 3 are identical, and the corresponding nodes and path prefixes are therefore shared.
Based on the tree structure configured by the above-described alternative embodiment, matching with the text content can be achieved efficiently, as sketched below.
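The configuration operation can be sketched as follows; this is illustrative only, the gloss words stand in for single characters of the Chinese example tags, and the nested-dict representation matches the matching sketch given earlier.

```python
# A sketch of the configuration operation (steps S800-S806).  The gloss words
# below stand in for single characters of the Chinese example tags.

def build_trie(language_tags):
    root = {}                               # step S802: configure the root node
    for tag in language_tags:
        node = root
        for ch in tag:                      # step S800: the tag split into characters
            node = node.setdefault(ch, {})  # identical leading characters share nodes
        node["$tag"] = tag                  # the leaf marks the end of the tag
    return root

# One tuple element plays the role of one character.
tags = [
    ("know", "recognize", "point", "come"),
    ("know", "recognize", "come"),
    ("know", "recognize", "heavy", "point"),
    ("mark", "heavy", "point"),
    ("open", "start", "have been"),
]
trie = build_trie(tags)
# The root has only three children ("know", "mark", "open") because the first
# two characters of tags 1, 2 and 3 share the same nodes.
print(list(trie.keys()))                    # -> ['know', 'mark', 'open']
```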
As an alternative embodiment, as shown in fig. 9, step S204 of setting a directory index for each of the plurality of jump nodes may include: step S900, acquiring key content associated with each transition sentence; and step S902, setting the directory name of the jump node corresponding to each transition sentence according to the key content associated with that transition sentence. In this alternative embodiment, the key content is used as the directory name, so that a user who sees the directory name can understand the theme or main content of the corresponding content paragraph, which improves the user experience.
As an alternative embodiment, in order to extract the key content accurately, as shown in fig. 10, the step S900 may be implemented as follows: step S1000, extracting a plurality of nouns from the text content; step S1002, determining, according to the occurrence frequency of each of the nouns, one or more nouns with the highest occurrence frequency; step S1004, determining the media type of the target media according to the one or more nouns; step S1006, determining a target noun lexicon associated with the target media according to the media type; step S1008, matching each noun in the target noun lexicon with each noun in the whole sentence in which the transition sentence is located; step S1010, acquiring the key content associated with the transition sentence according to the matching result; wherein the key content comprises nouns present both in the whole sentence in which the transition sentence is located and in the target noun lexicon.
Taking another educational video (hereinafter, video B) as an example.
The structure of video B is: a segment of opening remarks, followed by a segment explaining the knowledge point "mean inequality". At the transition into the explanation segment, the teacher says something like: "… classmates, the knowledge point comes; below we look at the mean inequality …".
(1) The transition sentence in video B that matches the language tag "knowledge point come" is acquired.
(2) A plurality of nouns, such as "mean inequality", "Pythagorean theorem", and "vector", are extracted from the text content of video B.
(3) If the nouns "mean inequality", "Pythagorean theorem", and "vector" occur most frequently, the media type of video B, namely educational video, is determined from these nouns.
(4) A mathematical noun lexicon associated with educational videos is determined according to the media type.
The mathematical noun lexicon includes a plurality of mathematical nouns.
(5) Each noun in the mathematical noun lexicon is matched against each noun in the whole sentence in which the transition sentence is located.
If the whole sentence in which the transition sentence is located contains no matched noun, matching may continue in the next sentence.
In this example, the noun "mean inequality" is successfully matched and is taken as the key content.
Finally, "mean inequality" is taken as the directory name and is associated with the time information of the above transition sentence.
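The following Python sketch ties steps S1000 to S1010 together under stated assumptions: extract_nouns is a placeholder for a real part-of-speech tagger, and both the noun-to-media-type mapping and the per-type noun lexicons are assumed, externally supplied data rather than anything defined by the embodiments.

```python
from collections import Counter

# A sketch of steps S1000-S1010.  Assumptions: extract_nouns() stands in for a
# real POS tagger; media_type_of and noun_lexicons are externally supplied.

def extract_nouns(text: str):
    """Placeholder noun extraction; a real system would use POS tagging."""
    return [w.strip(",.?!") for w in text.split() if w and w[0].isupper()]

def key_content(sentence: str, full_text: str,
                media_type_of: dict, noun_lexicons: dict):
    nouns = extract_nouns(full_text)                            # step S1000
    top_nouns = [n for n, _ in Counter(nouns).most_common(3)]   # step S1002
    media_type = next((media_type_of[n] for n in top_nouns
                       if n in media_type_of), None)            # step S1004
    lexicon = noun_lexicons.get(media_type, set())              # step S1006
    # steps S1008-S1010: keep nouns present both in the sentence and the lexicon
    return [n for n in extract_nouns(sentence) if n in lexicon]
```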
Embodiment Two
Fig. 11 schematically shows a flowchart of a video presentation method according to a second embodiment of the present application. The method embodiment may be performed in a user device. The user device 4 is taken as the execution body of this flow by way of example.
As shown in fig. 11, the video presentation method may include steps S1100 to S1104, in which:
step S1100, presenting at least one of a plurality of directory names while presenting the video; wherein each directory name corresponds to one piece of time information of the video, and is used for representing the content of the video in the corresponding time information.
Step S1102 detects a selection of any one of the plurality of directory names.
Step S1104, in response to the selection of any one of the plurality of directory names, jumping the playing progress of the video to target time information; wherein the target time information corresponds to the selected directory name.
Continuing with the above video A as an example, during the presentation of video A by the user device 4, the user can learn the theme or main content of the different video paragraphs in video A from the directory names. Further, if the user wants to watch the explanation of the "Pythagorean theorem" directly, the following exemplary operations may be performed based on the corresponding directory name:
(1) The progress bar is dragged to 12 minutes, 0 seconds, 44 milliseconds or thereabouts.
(2) The corresponding directory name is clicked, and the user device 4 automatically jumps to 12 minutes, 0 seconds, 44 milliseconds.
It should be noted that, to further improve the user experience, when the progress bar is dragged to within a few seconds (e.g., 3 seconds) of 12 minutes, 0 seconds, 44 milliseconds, the drag point of the progress bar may automatically snap to 12 minutes, 0 seconds, 44 milliseconds.
Based on the above, with the video presentation method provided by the embodiments of the present application, the user can, according to a directory name and its associated time information, conveniently and accurately locate and jump to the start point of the corresponding video paragraph, and the user experience is optimized.
As an alternative embodiment, the plurality of directory names are presented in a list form in one display window.
As shown in fig. 12, the following 3 directory names are listed: mean inequality, Pythagorean theorem, vector.
When the user wants to start watching the "Pythagorean theorem" knowledge point explanation video paragraph directly, the user can click on the area where "Pythagorean theorem" is displayed.
If a click event by the user on the "Pythagorean theorem" area is detected, the user device 4 automatically jumps the progress bar to the playback start time of the "Pythagorean theorem" knowledge point explanation video paragraph in video A.
It should be noted that the list contents in fig. 12 may be presented permanently, or may be presented only after the user triggers them.
As an alternative embodiment, the plurality of directory names are marked on a progress bar of the video;
the time information of each directory name corresponds one-to-one to the marked position of that directory name on the progress bar.
For example, the user device 4 reads the video directory file provided by the server 2 and generates an intelligent progress bar. Specifically, the user device 4 reads the video directory file while playing the video, and marks each directory name in the video directory file on the progress bar according to its associated time information, so that the user can conveniently view the directory names and jump.
The progress bar may be displayed in a variety of forms (a marker-position sketch follows this list), for example:
on the progress bar, each directory name may be displayed at all times;
on the progress bar, each directory name may be displayed when the mouse hovers over the progress bar;
the marker on the progress bar may be a red dot or the like, and when the mouse hovers over the red dot, the corresponding directory name is displayed.
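As a small illustration, not tied to any particular player API, the positions of such markers can be derived from the directory entries by converting each entry's time information into a fraction of the video duration; the field names follow the illustrative directory file sketched earlier.

```python
# A small illustration (not tied to any particular player API): convert the
# time information of each directory entry into a fraction of the video
# duration, so the client can draw one marker (e.g. a red dot) per entry.

def progress_bar_markers(entries, duration_ms: int):
    """entries: list of {"name": str, "time_ms": int}; returns (fraction, name) pairs."""
    markers = []
    for entry in entries:
        fraction = min(max(entry["time_ms"] / duration_ms, 0.0), 1.0)
        markers.append((fraction, entry["name"]))
    return markers

entries = [{"name": "Pythagorean theorem", "time_ms": 720_044}]
print(progress_bar_markers(entries, duration_ms=2_400_000))   # assuming a 40-minute video
```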
Embodiment Three
Fig. 13 schematically shows a block diagram of a system for generating a media directory file according to the third embodiment of the present application, which may be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete the embodiments of the present application. Program modules in the embodiments of the present application refer to a series of computer program instruction segments capable of implementing specific functions, and the following description specifically describes the functions of each program module in the embodiment.
As shown in fig. 13, the generation system 1300 of the media directory file may include:
an obtaining module 1310, configured to obtain text content of a target media, where the text content corresponds to audio content in the target media;
a determining module 1320, configured to determine a plurality of skip nodes of the target media according to the text content;
a generating module 1330, configured to set a directory index for each of the plurality of jumping nodes, so as to generate a media directory file of the target media;
the directory index comprises the directory name of the skip node and the time information of the skip node in the target media.
As an optional embodiment, the determining module 1320 is further configured to:
determine a transition sentence in the text content, wherein the transition sentence is used for separating different content paragraphs; and
determine the plurality of jump nodes according to the time information of each transition sentence in the target media.
As an optional embodiment, the determining module 1320 is further configured to:
match each of a plurality of preset language tags with the text content; and
determine a sentence in the text content that matches any one of the plurality of language tags as the transition sentence.
As an alternative embodiment, each language tag includes a plurality of characters;
the determining module 1320 is further configured to:
and matching the text content with each node of a tree structure, wherein the tree structure comprises paths of the language tags, and each node in the paths corresponds to each character in the language tag one by one in sequence.
As an optional embodiment, the determining module 1320 is further configured to:
match the characters sequentially according to their order in the text content, and when matching the mth character:
determine whether the (m-1)th character in the text content matches a node in the tree structure;
if the (m-1)th character matches a node that is not a leaf node in the tree structure, match the mth character with each child node under the node matched by the (m-1)th character to obtain the node matched by the mth character, and if the node matched by the mth character is a leaf node, determine that the target language tag corresponding to the path in which that node is located has been matched; and
if the (m-1)th character does not match any node in the tree structure, or matches a leaf node in the tree structure, match the mth character with the first node of each path; wherein the first node of each path corresponds to the first character of the corresponding language tag, and identical first-character nodes are shared.
As an alternative embodiment, the device further comprises a configuration module for:
splitting each language tag into individual characters;
configuring a root node of the tree structure;
configuring the first character of each language tag as a child node of the root node, wherein nodes for identical first characters are shared;
configuring the i-th path in the tree structure according to the character sequence of the i-th language tag; wherein the i-th path comprises every node from the i-th child node of the root node down to the i-th leaf node under that child node; the i-th child node of the root node corresponds to the first character of the i-th language tag, the i-th leaf node under that child node corresponds to the last character of the i-th language tag, and adjacent characters in the i-th language tag correspond to parent and child nodes in the tree structure; the i-th language tag is any one of the plurality of language tags;
and if the first several characters of the i-th language tag are identical, one to one, to the first several characters of another language tag, the nodes corresponding to those characters of the i-th language tag and the nodes corresponding to those characters of the other language tag are configured as shared nodes, in one-to-one correspondence, in the tree structure.
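The configuration just described can be sketched as a trie built character by character, one path per language tag, with identical prefixes sharing nodes. This is only an illustration under the same assumptions as the matching sketch above; like the description, it assumes no tag is a strict prefix of another, so every tag ends at a leaf node.

def new_node():
    return {"children": {}, "tag": None}

def build_tag_trie(language_tags):
    root = new_node()                                # configure the root node
    for tag in language_tags:
        node = root
        for ch in tag:                               # split the tag into characters
            node = node["children"].setdefault(ch, new_node())   # shared prefixes reuse nodes
        node["tag"] = tag                            # last character: leaf node carries the tag
    return root

# "next we" and "next up" share one common prefix path for "next ".
root = build_tag_trie(["next we", "next up", "now let's"])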
As an alternative embodiment, the generating module 1330 is further configured to:
acquiring key content associated with each transitional sentence;
and setting the directory name of the jump node corresponding to each transitional sentence according to the key content associated with that transitional sentence.
As an alternative embodiment, the generating module 1330 is further configured to:
extracting a plurality of nouns from the text content;
determining, according to the occurrence frequency of each of the nouns, one or more nouns with the highest occurrence frequency;
determining a media type of the target media according to the one or more nouns;
determining a target noun lexicon associated with the target media according to the media type;
matching each noun in the target noun lexicon against each noun in the full sentence of the transitional sentence;
and acquiring, according to the matching result, the key content associated with the transitional sentence;
wherein the key content comprises the nouns that appear both in the full sentence of the transitional sentence and in the target noun lexicon.
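For illustration, the sketch below chains these steps together: the most frequent noun selects a media type, the media type selects a noun lexicon, and the nouns that a transitional sentence shares with that lexicon become its key content and its directory name. The media-type table, the per-type lexicons and the use of a single top noun are simplifications assumed for the example; noun extraction itself is taken as already done upstream.

from collections import Counter

# Placeholder mappings; real deployments would use much larger tables.
MEDIA_TYPE_BY_NOUN = {"recipe": "cooking", "goal": "football", "lecture": "course"}
NOUN_LEXICONS = {
    "cooking": {"ingredient", "sauce", "oven", "recipe"},
    "football": {"goal", "penalty", "half"},
    "course": {"lecture", "chapter", "exercise"},
}

def key_content(all_nouns, sentence_nouns):
    if not all_nouns:
        return []
    top_noun, _ = Counter(all_nouns).most_common(1)[0]        # most frequent noun overall
    media_type = MEDIA_TYPE_BY_NOUN.get(top_noun, "general")  # infer the media type
    lexicon = NOUN_LEXICONS.get(media_type, set())            # target noun lexicon
    return [noun for noun in sentence_nouns if noun in lexicon]   # nouns present in both

def directory_name(all_nouns, sentence_nouns, fallback="Chapter"):
    keys = key_content(all_nouns, sentence_nouns)
    return " ".join(keys) if keys else fallback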
Example IV
Fig. 14 schematically shows a block diagram of a video presentation system according to a fourth embodiment of the present application. The video presentation system may be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to implement the embodiments of the present application. A program module in the embodiments of the present application refers to a series of computer program instruction segments capable of implementing a specific function; the following description details the function of each program module in this embodiment.
As shown in fig. 14, the video presentation system 1400 may include:
a presentation module, configured to present at least one of a plurality of directory names while presenting the video; wherein each directory name corresponds to one piece of time information of the video and represents the content of the video at the corresponding time;
a detection module, configured to detect a selection of any one of the plurality of directory names;
and a response module, configured to jump the playing progress of the video to target time information in response to the selection of any one of the plurality of directory names; wherein the target time information corresponds to the selected directory name.
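The response behaviour can be sketched as follows; the player object and its seek method are assumptions made for the illustration and do not refer to any particular player API.

def on_directory_selected(player, directory, selected_name):
    """Jump playback to the time information of the selected directory name."""
    for entry in directory:                    # entries from the media directory file
        if entry["name"] == selected_name:
            player.seek(entry["time_sec"])     # move the playing progress to the target time
            return True
    return False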
As an alternative embodiment, the plurality of directory names are presented in a list form in one display window.
As an alternative embodiment, the plurality of directory names are marked on a progress bar of the video;
and the time information of each directory name corresponds one to one to the marked position of that directory name on the progress bar.
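For the progress-bar variant, the one-to-one mapping can be sketched as a simple proportion: each marker sits at the entry's time divided by the total duration, scaled to the bar width. The pixel-based rendering model is an assumption of the sketch.

def marker_positions(directory, duration_sec, bar_width_px):
    """Return (directory_name, x_pixel) marker positions on the progress bar."""
    return [
        (entry["name"], round(bar_width_px * entry["time_sec"] / duration_sec))
        for entry in directory
        if 0 <= entry["time_sec"] <= duration_sec
    ]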
Example V
Fig. 15 schematically shows a hardware architecture diagram of a computer device according to a fifth embodiment of the present application. The computer device 10000 may be provided as the server 2 or a component thereof, or as the user device 4 or a component thereof. In this embodiment, the computer device 10000 is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions; it may be, for example, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster formed by a plurality of servers), or an electronic device with audio and video processing capability such as a smart phone, a tablet computer, or a notebook computer. As shown in fig. 15, the computer device 10000 includes, at least but not limited to, a memory 10010, a processor 10020, and a network interface 10030 that may be communicatively connected to each other via a system bus. Wherein:
The memory 10010 includes at least one type of computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, or an optical disk. In some embodiments, the memory 10010 may be an internal storage module of the computer device 10000, such as a hard disk or an internal memory of the computer device 10000. In other embodiments, the memory 10010 may also be an external storage device of the computer device 10000, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 10000. Of course, the memory 10010 may also include both an internal storage module of the computer device 10000 and an external storage device thereof. In this embodiment, the memory 10010 is typically used to store the operating system installed on the computer device 10000 and various types of application software, such as the program code of the method for generating a media directory file or of the video presentation method. In addition, the memory 10010 may be used to temporarily store various types of data that have been output or are to be output.
The processor 10020 may, in some embodiments, be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 10020 is typically configured to control the overall operation of the computer device 10000, for example, to perform control and processing related to data interaction or communication of the computer device 10000. In this embodiment, the processor 10020 is configured to run the program code stored in the memory 10010 or to process data.
The network interface 10030 may include a wireless network interface or a wired network interface, and is typically used to establish a communication link between the computer device 10000 and other computer devices. For example, the network interface 10030 is used to connect the computer device 10000 to an external terminal through a network and to establish a data transmission channel and a communication link between the computer device 10000 and the external terminal. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
It should be noted that fig. 15 only shows a computer device having components 10010-10030, but it should be understood that not all of the illustrated components are required to be implemented, and more or fewer components may be implemented instead.
In this embodiment, the method for generating a media directory file or the video presentation method stored in the memory 10010 may be further divided into one or more program modules and executed by one or more processors (the processor 10020 in this embodiment) to complete the present application.
Example VI
The present embodiment also provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method for generating a media directory file or the steps of the video presentation method in the embodiments.
In this embodiment, the computer-readable storage medium includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the computer-readable storage medium may be an internal storage unit of a computer device, such as a hard disk or an internal memory of the computer device. In other embodiments, the computer-readable storage medium may also be an external storage device of a computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device. Of course, the computer-readable storage medium may also include both an internal storage unit of a computer device and an external storage device thereof. In this embodiment, the computer-readable storage medium is typically used to store the operating system and various types of application software installed on the computer device, such as the program code of the method for generating a media directory file or of the video presentation method in the embodiments. Furthermore, the computer-readable storage medium may also be used to temporarily store various types of data that have been output or are to be output.
It will be apparent to those skilled in the art that the modules or steps of the embodiments of the application described above may be implemented by a general-purpose computing device; they may be centralized on a single computing device or distributed across a network of computing devices. They may alternatively be implemented in program code executable by computing devices, so that they may be stored in a storage device and executed by the computing devices; in some cases, the steps shown or described may be performed in an order different from that shown or described here. Alternatively, they may be fabricated separately as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
The foregoing description is only of the preferred embodiments of the present application, and is not intended to limit the scope of the claims, and all equivalent structures or equivalent processes using the descriptions and drawings of the present application, or direct or indirect application in other related technical fields are included in the scope of the claims of the present application.

Claims (14)

1. A method for generating a media directory file, for use in a server, comprising:
acquiring text content of target media, wherein the text content corresponds to audio content in the target media;
determining a plurality of jump nodes of the target media according to the text content; and
setting a directory index for each of the plurality of jump nodes to generate a media directory file of the target media;
wherein the directory index comprises the directory name of the jump node and the time information of the jump node in the target media.
2. The method of generating a media directory file as claimed in claim 1, wherein,
the determining a plurality of jump nodes of the target media according to the text content comprises:
determining transitional sentences in the text content, wherein a transitional sentence is used for separating different content paragraphs;
and determining the plurality of jump nodes according to the time information of each transitional sentence in the target media.
3. The method of generating a media directory file as claimed in claim 2, wherein,
the determining transitional sentences in the text content comprises:
matching each of a plurality of preset language tags against the text content;
and determining a sentence in the text content that matches any one of the plurality of language tags as a transitional sentence.
4. A method of generating a media directory file as claimed in claim 3, wherein each language tag comprises a plurality of characters;
the matching each of the plurality of language tags against the text content comprises:
matching the text content against the nodes of a tree structure, wherein the tree structure comprises one path per language tag, and the nodes on a path correspond one by one, in order, to the characters of that language tag.
5. The method of generating a media directory file as claimed in claim 3, wherein,
the matching the text content against the nodes of the tree structure comprises:
matching the characters of the text content one by one in their order of appearance, wherein, when matching the m-th character:
determining whether the (m-1)-th character in the text content matches a node in the tree structure;
if the (m-1)-th character matches a node that is not a leaf node of the tree structure, matching the m-th character against the child nodes of that node to obtain the node matching the m-th character; and if a matching node is obtained and it is a leaf node, determining that the text matches the target language tag corresponding to the path on which that leaf node lies;
if the (m-1)-th character does not match any node of the tree structure, or matches a leaf node of the tree structure, matching the m-th character against the first node of each path; wherein the first node of each path corresponds to the first character of the respective language tag, and nodes for identical first characters are shared.
6. The method of generating a media directory file as claimed in claim 4, further comprising a configuration operation of the tree structure:
splitting each language tag into individual characters;
configuring a root node of the tree structure;
configuring the first character of each language tag as a child node of the root node, wherein nodes for identical first characters are shared;
configuring the i-th path in the tree structure according to the character sequence of the i-th language tag; wherein the i-th path comprises every node from the i-th child node of the root node down to the i-th leaf node under that child node; the i-th child node of the root node corresponds to the first character of the i-th language tag, the i-th leaf node under that child node corresponds to the last character of the i-th language tag, and adjacent characters in the i-th language tag correspond to parent and child nodes in the tree structure; the i-th language tag is any one of the plurality of language tags;
and if the first several characters of the i-th language tag are identical, one to one, to the first several characters of another language tag, the nodes corresponding to those characters of the i-th language tag and the nodes corresponding to those characters of the other language tag are configured as shared nodes, in one-to-one correspondence, in the tree structure.
7. The method for generating a media directory file as claimed in any one of claims 2 to 6, wherein,
the setting a directory index for each of the plurality of jump nodes includes:
acquiring key content associated with each transitional sentence;
and setting the directory name of the jump node corresponding to each transitional sentence according to the key content associated with that transitional sentence.
8. The method of generating a media directory file as claimed in claim 7, wherein,
the acquiring key content associated with each transitional sentence comprises:
extracting a plurality of nouns from the text content;
determining, according to the occurrence frequency of each of the nouns, one or more nouns with the highest occurrence frequency;
determining a media type of the target media according to the one or more nouns;
determining a target noun lexicon associated with the target media according to the media type;
matching each noun in the target noun lexicon against each noun in the full sentence of the transitional sentence;
and acquiring, according to the matching result, the key content associated with the transitional sentence;
wherein the key content comprises the nouns that appear both in the full sentence of the transitional sentence and in the target noun lexicon.
9. A system for generating a media directory file for use in a server, comprising:
an acquisition module, configured to acquire text content of target media, wherein the text content corresponds to audio content in the target media;
a determining module, configured to determine a plurality of jump nodes of the target media according to the text content;
and a generating module, configured to set a directory index for each of the plurality of jump nodes so as to generate a media directory file of the target media;
wherein the directory index comprises the directory name of the jump node and the time information of the jump node in the target media.
10. A video presentation method for use in a user device, comprising:
presenting at least one of a plurality of directory names while presenting the video; wherein each directory name corresponds to one piece of time information of the video and represents the content of the video at the corresponding time;
detecting a selection of any one of the plurality of directory names; and
in response to the selection of any one of the plurality of directory names, jumping the playing progress of the video to target time information; wherein the target time information corresponds to the selected directory name.
11. The method of claim 10, wherein the plurality of directory names are presented in a list in a display window.
12. The method of claim 10, wherein the plurality of directory names are marked on a progress bar of the video;
and the time information of each directory name corresponds one to one to the marked position of that directory name on the progress bar.
13. A computer device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the computer program when executed by the processor is configured to implement:
the method of generating a media directory file as claimed in any one of claims 1 to 8; or
the video presentation method as claimed in any one of claims 10 to 12.
14. A computer-readable storage medium having a computer program stored therein, the computer program being executable by at least one processor to cause the at least one processor to perform:
the method of generating a media directory file as claimed in any one of claims 1 to 8; or
the video presentation method as claimed in any one of claims 10 to 12.
CN202111500492.7A 2021-12-09 2021-12-09 Method for generating media directory file and video presentation method Pending CN116260995A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111500492.7A CN116260995A (en) 2021-12-09 2021-12-09 Method for generating media directory file and video presentation method


Publications (1)

Publication Number Publication Date
CN116260995A true CN116260995A (en) 2023-06-13

Family

ID=86686688

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111500492.7A Pending CN116260995A (en) 2021-12-09 2021-12-09 Method for generating media directory file and video presentation method

Country Status (1)

Country Link
CN (1) CN116260995A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030187632A1 (en) * 2002-04-02 2003-10-02 Menich Barry J. Multimedia conferencing system
US20130158992A1 (en) * 2011-12-17 2013-06-20 Hon Hai Precision Industry Co., Ltd. Speech processing system and method
CN103956166A (en) * 2014-05-27 2014-07-30 华东理工大学 Multimedia courseware retrieval system based on voice keyword recognition
CN105812941A (en) * 2016-03-31 2016-07-27 北京金山安全软件有限公司 Video playing method and device and electronic equipment
CN109947993A (en) * 2019-03-14 2019-06-28 百度国际科技(深圳)有限公司 Plot jump method, device and computer equipment based on speech recognition
CN111212317A (en) * 2020-01-15 2020-05-29 清华大学 Skip navigation method for video playing
CN112040321A (en) * 2020-08-05 2020-12-04 西安猫兜灵智能科技有限公司 Method and system for integrally previewing and accurately skipping video content and electronic equipment
CN112100361A (en) * 2020-11-12 2020-12-18 南京中孚信息技术有限公司 Character string multimode fuzzy matching method based on AC automaton



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination