CN110493661B - Video file processing method and server - Google Patents


Info

Publication number
CN110493661B
CN110493661B (application CN201910562051.6A)
Authority
CN
China
Prior art keywords
server
file
information
video file
introduction information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910562051.6A
Other languages
Chinese (zh)
Other versions
CN110493661A (en)
Inventor
高志栋
陶晓龙
张志涛
金博
杨洋
郭军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Telecom Wanwei Information Technology Co Ltd
Original Assignee
China Telecom Wanwei Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Telecom Wanwei Information Technology Co Ltd filed Critical China Telecom Wanwei Information Technology Co Ltd
Priority to CN201910562051.6A priority Critical patent/CN110493661B/en
Publication of CN110493661A publication Critical patent/CN110493661A/en
Application granted granted Critical
Publication of CN110493661B publication Critical patent/CN110493661B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H04N 21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/233: Processing of audio elementary streams
    • H04N 21/23418: Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N 21/235: Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • H04N 21/4431: OS processes, e.g. booting an STB, characterized by the use of Application Program Interface [API] libraries
    • H04N 21/4884: Data services, e.g. news ticker, for displaying subtitles
    • H04N 21/8133: Monomedia components involving additional data specifically related to the content, e.g. biography of the actors in a movie, detailed information about an article seen in a video program
    • H04N 21/8456: Structuring of content by decomposing the content in the time domain, e.g. in time segments

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Library & Information Science (AREA)
  • Software Systems (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application provides a video file processing method and a server that allow a user to conveniently learn background content related to video content. The video file processing method comprises the following steps: a server obtains an original video file, where the original video file is used to provide a video playing service for UE; the server decomposes the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track; the server extracts feature information from content items of the video clip files through image recognition processing, where the feature information comprises character features, animal and plant features and non-living object features; the server acquires first introduction information corresponding to the feature information, where the first introduction information comprises content items for introducing the feature information to the user; and the server packages the first introduction information into a first subtitle file, where the first subtitle file is to be loaded at the times at which the feature information appears in the original video file.

Description

Video file processing method and server
Technical Field
The present application relates to the field of videos, and in particular, to a method for processing a video file and a server.
Background
With the implementation of the network-service policy of increasing speed and reducing fees, video services can be pushed to a large number of users more effectively; higher-quality video content brings higher user traffic and, in turn, greater traffic revenue to video producers and video platforms.
When a user watches a video through a video Application (APP) on User Equipment (UE) and becomes interested in the background content of the video, the user has to search for that background content through a search engine in a browser APP to meet his or her information needs.
However, in this practical scenario, owing to the working mechanism of search engines, the search results are cluttered and often contain many advertisements, so the user typically expends considerable labor and time before obtaining genuinely useful background content.
Disclosure of Invention
The application provides a video file processing method and a server, which allow a user to conveniently learn background content related to video content.
In a first aspect, the present application provides a method for processing a video file, where the method includes:
a server obtains an original video file, where the original video file is used to provide a video playing service for UE;
the server decomposes the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track;
the server extracts feature information from content items of the video clip files through image recognition processing, where the feature information comprises character features, animal and plant features and non-living object features;
the server acquires first introduction information corresponding to the feature information, where the first introduction information comprises content items for introducing the feature information to the user;
the server packages the first introduction information into a first subtitle file, where the first subtitle file is to be loaded at the times at which the feature information appears in the original video file.
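By way of a non-limiting illustration, the packaging step above, which loads the introduction information at the times at which the feature information appears, can be sketched in the common SubRip (SRT) subtitle format. This is a minimal Python sketch, not part of the claimed method; the function names and cue contents are assumptions:

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def package_subtitle(cues):
    """Package (start_s, end_s, text) cues into the body of an SRT subtitle file.

    Each cue displays one piece of introduction information during the time
    period in which the corresponding feature information appears on screen.
    """
    blocks = []
    for i, (start, end, text) in enumerate(cues, start=1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

A player that supports external SRT tracks would then load this file alongside the original video file, so the introduction text surfaces exactly when the recognized feature is visible.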
With reference to the first aspect of the present application, in a first possible implementation manner of the first aspect of the present application, the method further includes:
the server extracts keywords from content items of the audio clip files through speech recognition processing;
the server acquires second introduction information corresponding to the keywords, where the second introduction information comprises content items for introducing the keywords to the user;
the server packages the second introduction information into a second subtitle file, where the second subtitle file is to be loaded at the times at which the keywords appear in the original video file.
With reference to the first possible implementation manner of the first aspect of the present application, in a second possible implementation manner of the first aspect of the present application, the obtaining, by the server, of the first introduction information corresponding to the feature information includes:
the server identifies a feature identifier (ID) corresponding to the feature information;
the server locates a first encyclopedia entry corresponding to the feature ID in a locally preset encyclopedia entry list, where the encyclopedia entry list comprises correspondences between different encyclopedia entries and different feature IDs;
the server acquires first text content of the first encyclopedia entry from a locally preset encyclopedia information base and determines the first text content as the first introduction information; alternatively,
the obtaining, by the server, of the second introduction information corresponding to the keywords includes:
the server locates a second encyclopedia entry corresponding to the keywords in the encyclopedia entry list;
the server extracts second text content of the second encyclopedia entry from the encyclopedia information base and determines the second text content as the second introduction information.
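The entry-list lookup described above can be sketched with simple in-memory mappings: one from feature IDs to encyclopedia entries (the entry list), and one from entries to their text content (the information base). A minimal Python sketch; the IDs, entry names and text are hypothetical placeholders, not data from the application:

```python
# Hypothetical stand-ins for the locally preset encyclopedia entry list
# (feature ID -> entry) and encyclopedia information base (entry -> text).
ENTRY_LIST = {
    "feat-0001": "Giant panda",
    "feat-0002": "Bamboo",
}
INFO_BASE = {
    "Giant panda": "The giant panda is a bear species endemic to China.",
    "Bamboo": "Bamboos are evergreen perennial flowering plants.",
}

def lookup_introduction(feature_id: str):
    """Locate the encyclopedia entry for a feature ID, then fetch its
    text content to serve as the introduction information."""
    entry = ENTRY_LIST.get(feature_id)
    if entry is None:
        return None  # no entry corresponds to this feature ID
    return INFO_BASE.get(entry)
```

The keyword path works identically, with keywords rather than feature IDs as the lookup key into the same entry list.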
With reference to the first possible implementation manner of the first aspect of the present application, in a third possible implementation manner of the first aspect of the present application, the obtaining, by the server, of the first introduction information corresponding to the feature information includes:
the server acquires the first introduction information from a third-party public database through a web crawler; alternatively,
the server acquires the first introduction information from a third-party encyclopedia information base through a third-party Application Programming Interface (API); alternatively,
the obtaining, by the server, of the second introduction information corresponding to the keywords includes:
the server acquires the second introduction information from the third-party public database through the web crawler; alternatively,
the server acquires the second introduction information from the third-party encyclopedia information base through the third-party API.
With reference to the first possible implementation manner of the first aspect of the present application, in a fourth possible implementation manner of the first aspect of the present application, the encapsulating, by the server, of the first introduction information into the first subtitle file includes:
the server identifies N first time periods in which the feature information appears in the original video file, where N is greater than 2;
when N is an even number, the server packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st, (N/2)th and Nth first time periods;
when N is an odd number, the server packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st, (N/2, rounded up)th and Nth first time periods; alternatively,
the encapsulating, by the server, of the second introduction information into the second subtitle file includes:
the server identifies M second time periods in which the keywords appear in the original video file, where M is greater than 2;
when M is an even number, the server packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st, (M/2)th and Mth second time periods;
when M is an odd number, the server packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st, (M/2, rounded up)th and Mth second time periods.
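Note that for even N the (N/2)th period already equals the ceiling of N/2, so the even and odd branches above reduce to a single ceiling operation. A minimal Python sketch of the period selection, assuming the time periods are given as a list in temporal order (not part of the claimed method):

```python
import math

def select_periods(periods):
    """Pick the 1st, ceil(N/2)-th and N-th of N time periods (N > 2).

    These are the periods in which the introduction information is
    packaged into the subtitle file; for even N, ceil(N/2) == N/2,
    so one formula covers both the even and odd cases.
    """
    n = len(periods)
    if n <= 2:
        raise ValueError("the selection rule applies when N is greater than 2")
    mid = math.ceil(n / 2)
    return [periods[0], periods[mid - 1], periods[n - 1]]
```

Showing the introduction only at the first, middle and last occurrences avoids repeating the same caption every time the feature reappears.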
With reference to the first possible implementation manner of the first aspect of the present application, in a fifth possible implementation manner of the first aspect of the present application, the method further includes:
the server sets a configuration strategy;
when the server receives a video playing request initiated by the UE based on a target video file, the server issues the original video file, the configuration strategy and the first subtitle file to the UE; alternatively,
when the server receives a video playing request initiated by the UE based on a target video file, the server issues the original video file, the configuration strategy, the first subtitle file and the second subtitle file to the UE;
wherein the configuration strategy comprises:
a strategy of displaying the subtitle file as a transparent floating window when the text content of the subtitle file is within 100 characters;
a strategy of displaying the subtitle file in a scrolling manner when the text content of the subtitle file is more than 100 characters and within 300 characters; and
a strategy of letting the user select, through a subtitle control, whether to display the subtitle file when the text content of the subtitle file is above 300 characters.
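The three display strategies above amount to a threshold mapping on subtitle length. A minimal Python sketch; the strategy labels returned here are hypothetical names, not terms from the application:

```python
def display_strategy(text: str) -> str:
    """Map the character count of subtitle text to a display strategy:
    within 100 chars -> transparent floating window;
    101..300 chars   -> scrolling display;
    above 300 chars  -> shown only if the user opts in via a subtitle control.
    """
    n = len(text)
    if n <= 100:
        return "transparent-floating-window"
    if n <= 300:
        return "scrolling-display"
    return "user-selectable"
```

The UE-side player would consult this strategy when deciding how to render each cue of the downloaded subtitle file.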
With reference to the first possible implementation manner of the first aspect of the present application, in a sixth possible implementation manner of the first aspect of the present application, the configuration strategy further includes:
a strategy of deleting a target subtitle file through a subtitle control upon a selection operation of the user;
the method further comprises the following steps:
the server counts the number of times a plurality of UEs locally delete the target subtitle file;
when the count within a preset statistical period is greater than a preset number of times, the server deletes the target subtitle file from the locally stored first subtitle file and/or second subtitle file.
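The deletion statistics above can be sketched as a per-subtitle counter with a threshold. A minimal Python illustration that, for brevity, omits the preset statistical time window; the class and method names are assumptions, not part of the application:

```python
from collections import Counter

class DeletionTracker:
    """Count UE-reported local deletions of each subtitle file and flag a
    file for server-side removal once the count exceeds a preset number."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.counts = Counter()

    def report_deletion(self, subtitle_id: str) -> bool:
        """Record one UE-side deletion; return True once the server should
        delete this subtitle file from its locally stored subtitle files."""
        self.counts[subtitle_id] += 1
        return self.counts[subtitle_id] > self.threshold
```

In effect, subtitles that many users dismiss are treated as unwanted and pruned from the platform's stored subtitle files.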
With reference to the first possible implementation manner of the first aspect of the present application, in a seventh possible implementation manner of the first aspect of the present application, the method further includes:
and the server embeds the first subtitle file and/or the second subtitle file into the original video file to obtain a target video file, and the target video file is used for providing video playing services for the UE.
With reference to the first aspect of the present application, in an eighth possible implementation manner of the first aspect of the present application, the acquiring, by the server, of the original video file includes:
when the server detects that the original video file has been uploaded to a local video database, the server acquires the original video file; alternatively,
when the server receives a video playing request initiated by the UE based on the original video file, the server acquires the original video file from the video database according to the video file ID carried in the video playing request.
In a second aspect, the present application provides a server, comprising:
an acquisition unit, configured to acquire an original video file, where the original video file is used to provide a video playing service for UE;
a decomposition unit, configured to decompose the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track;
an extraction unit, configured to extract feature information from content items of the video clip files through image recognition processing, where the feature information comprises character features, animal and plant features and non-living object features;
the acquisition unit is further configured to acquire first introduction information corresponding to the feature information, where the first introduction information comprises content items for introducing the feature information to the user; and
an encapsulation unit, configured to package the first introduction information into a first subtitle file, where the first subtitle file is to be loaded at the times at which the feature information appears in the original video file.
With reference to the second aspect of the present application, in a first possible implementation manner of the second aspect of the present application, the extracting unit is further configured to extract keywords of content items of the audio clip file through a speech recognition process;
the acquisition unit is further used for acquiring second introduction information corresponding to the keywords, and the second introduction information comprises content items used for introducing the keywords to the user;
and the packaging unit is also used for packaging the second introduction information into a second subtitle file, and the second subtitle file is used for loading at the time when the keywords appear in the original video file.
With reference to the first possible implementation manner of the second aspect of the present application, in a second possible implementation manner of the second aspect of the present application, the obtaining unit is specifically configured to:
identify a feature ID corresponding to the feature information;
locate a first encyclopedia entry corresponding to the feature ID in a locally preset encyclopedia entry list, where the encyclopedia entry list comprises correspondences between different encyclopedia entries and different feature IDs; and
acquire first text content of the first encyclopedia entry from a locally preset encyclopedia information base and determine the first text content as the first introduction information; alternatively,
the acquiring of the second introduction information corresponding to the keywords includes:
locate a second encyclopedia entry corresponding to the keywords in the encyclopedia entry list; and
extract second text content of the second encyclopedia entry from the encyclopedia information base and determine the second text content as the second introduction information.
With reference to the second possible implementation manner of the second aspect of the present application, in a third possible implementation manner of the second aspect of the present application, the obtaining unit is specifically configured to:
acquire the first introduction information from a third-party public database through a web crawler; alternatively,
acquire the first introduction information from a third-party encyclopedia information base through a third-party API; alternatively,
the obtaining unit is specifically configured to:
acquire the second introduction information from the third-party public database through the web crawler; alternatively,
acquire the second introduction information from the third-party encyclopedia information base through the third-party API.
With reference to the second possible implementation manner of the second aspect of the present application, in a fourth possible implementation manner of the second aspect of the present application, the encapsulation unit is specifically configured to:
identify N first time periods in which the feature information appears in the original video file, where N is greater than 2;
when N is an even number, package the first introduction information into the first subtitle file corresponding to the original video file in the 1st, (N/2)th and Nth first time periods;
when N is an odd number, package the first introduction information into the first subtitle file corresponding to the original video file in the 1st, (N/2, rounded up)th and Nth first time periods; alternatively,
the encapsulation unit is specifically configured to:
identify M second time periods in which the keywords appear in the original video file, where M is greater than 2;
when M is an even number, package the second introduction information into the second subtitle file corresponding to the original video file in the 1st, (M/2)th and Mth second time periods;
when M is an odd number, package the second introduction information into the second subtitle file corresponding to the original video file in the 1st, (M/2, rounded up)th and Mth second time periods.
With reference to the second possible implementation manner of the second aspect of the present application, in a fifth possible implementation manner of the second aspect of the present application, the server further includes:
a setting unit, configured to set a configuration strategy; and
an issuing unit, configured to issue the original video file, the configuration strategy and the first subtitle file to the UE when the server receives a video playing request initiated by the UE based on a target video file; alternatively,
the issuing unit is configured to issue the original video file, the configuration strategy, the first subtitle file and the second subtitle file to the UE when the server receives a video playing request initiated by the UE based on a target video file;
wherein the configuration strategy comprises:
a strategy of displaying the subtitle file as a transparent floating window when the text content of the subtitle file is within 100 characters;
a strategy of displaying the subtitle file in a scrolling manner when the text content of the subtitle file is more than 100 characters and within 300 characters; and
a strategy of letting the user select, through a subtitle control, whether to display the subtitle file when the text content of the subtitle file is above 300 characters.
With reference to the second possible implementation manner of the second aspect of the present application, in a sixth possible implementation manner of the second aspect of the present application, the configuration strategy further includes:
a strategy of deleting a target subtitle file through a subtitle control upon a selection operation of the user;
the server further comprises:
a statistical unit, configured to count the number of times a plurality of UEs locally delete the target subtitle file; and
a deleting unit, configured to delete the target subtitle file from the locally stored first subtitle file and/or second subtitle file when the count within a preset statistical period is greater than a preset number of times.
With reference to the second possible implementation manner of the second aspect of the present application, in a seventh possible implementation manner of the second aspect of the present application, the server further includes:
and the embedded unit is used for embedding the first subtitle file and/or the second subtitle file into the original video file to obtain a target video file, and the target video file is used for providing a video playing service for the UE.
With reference to the second aspect of the present application, in an eighth possible implementation manner of the second aspect of the present application, the obtaining unit is specifically configured to:
acquire the original video file when the server detects that the original video file has been uploaded to a local video database; alternatively,
acquire the original video file from the video database according to the video file ID carried in the video playing request when the server receives a video playing request initiated by the UE based on the original video file.
In a third aspect, the present application provides a server comprising a processor for implementing any of the steps of the video file processing method according to the first aspect as described above when executing a computer program stored in a memory.
In a fourth aspect, the present application provides a readable storage medium having stored thereon a computer program which, when executed by a processor, implements any of the steps of the method for processing a video file as described above in the first aspect.
According to the technical scheme, the method has the following advantages:
On the video platform side, the server acquires an original video file, extracts feature information of content items from the video clip files obtained by audio/video decomposition of the original video file, then acquires the corresponding introduction information on the basis of the feature information and packages it into a subtitle file. When the original video file is played, the subtitle file is loaded at the times at which the feature information appears, so the introduction information corresponding to the feature information is presented visually to the user. The user can thus easily obtain the background content corresponding to the feature information from the introduction information, without manual retrieval, and through this arrangement the video platform can provide the user with a video playing service offering high-quality, content-rich video.
Drawings
Fig. 1 is a schematic flow chart illustrating a video file processing method provided in the present application;
fig. 2 is a schematic flow chart illustrating a video file processing method provided by the present application;
FIG. 3 is a schematic flow chart illustrating the process of obtaining feature information through encyclopedia entries provided in the present application;
fig. 4 is a schematic flow chart illustrating the encapsulation of introductory information into a subtitle file provided by the present application;
FIG. 5 is a schematic diagram of a server provided in the present application;
fig. 6 shows a schematic diagram of another structure of the server provided in the present application.
Detailed Description
The application provides a video file processing method and a server, which allow a user to conveniently learn background content related to video content.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," and the like in the description and in the claims of the present application and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It will be appreciated that the data so used may be interchanged under appropriate circumstances such that the embodiments described herein may be practiced otherwise than as specifically illustrated or described herein. Moreover, the terms "comprises," "comprising," and any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not expressly listed or inherent to such process, method, article, or apparatus. The naming or numbering of the steps appearing in the present application does not mean that the steps in the method flow have to be executed in the chronological/logical order indicated by the naming or numbering, and the named or numbered process steps may be executed in a modified order depending on the technical purpose to be achieved, as long as the same or similar technical effects are achieved.
The division of the modules presented in this application is a logical division, and in practical applications, there may be another division, for example, multiple modules may be combined or integrated into another system, or some features may be omitted, or not executed, and in addition, the shown or discussed coupling or direct coupling or communication connection between each other may be through some interfaces, and the indirect coupling or communication connection between the modules may be in an electrical or other similar form, which is not limited in this application. The modules or sub-modules described as separate components may or may not be physically separated, may or may not be physical modules, or may be distributed in a plurality of circuit modules, and some or all of the modules may be selected according to actual needs to achieve the purpose of the present disclosure.
First, before the present application is introduced, the video platform, the server, and the UE related to the present application will be described.
The video platform is used for providing video playing service for users in a network online platform mode, and the users can access the video platform through the UE to watch video programs.
The server is the hardware device bearing the video platform; it may specifically be one or more servers. The video file processing method of the present application can be applied to some of these servers, so that higher-quality, content-rich video playing services can be provided to users.
The UE may specifically be a terminal device capable of accessing the video platform and playing video programs, such as a smart phone, tablet computer, notebook computer, desktop computer, or personal digital assistant (PDA); it may also be a television or a set-top box that can access the video platform, which is not limited herein.
Next, based on the description of the above scenario, a detailed description of a video file processing method provided in the present application is started.
Referring to fig. 1, fig. 1 shows a schematic flow chart of the video file processing method provided by the present application. Specifically, the method may include the following steps:
step S101, a server acquires an original video file;
the original video file is used for providing video playing services for the UE.
It can be understood that the video platform is provided with a video database, and a server applying the video file processing method provided by the application can acquire an original video file from a local server or other servers deploying the video database.
The original video file can be understood as a video file which is not processed by the video file processing method provided by the application, and can also be directly understood as a video file to be processed by the video file processing method provided by the application.
Step S102, a server carries out decomposition processing on an original video file to obtain a plurality of video clip files corresponding to a video track and a plurality of audio clip files corresponding to an audio track;
after determining the original video file to be processed, the server carries out audio and video decomposition processing on the original video file, and obtains a plurality of corresponding video clip files and a plurality of corresponding audio clip files according to the division of the video track and the audio track on a time axis.
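The patent does not name a decomposition tool; as a minimal sketch, assuming the widely used ffmpeg CLI is available on the server, the per-track, per-segment split of step S102 could be driven by commands built as follows (file names and the 60-second segment length are illustrative):

```python
# Sketch of step S102: split the original file into per-track clip files
# along the time axis. Assumes the ffmpeg CLI (not named in the patent);
# output file names and the segment length are illustrative.
def build_split_commands(src, duration_s, segment_s=60):
    """Build ffmpeg commands producing video-only and audio-only clips."""
    cmds = []
    start, idx = 0, 0
    while start < duration_s:
        end = min(start + segment_s, duration_s)
        # video track clip: drop audio (-an), copy the video stream
        cmds.append(["ffmpeg", "-i", src, "-ss", str(start), "-to", str(end),
                     "-an", "-c:v", "copy", f"video_{idx:04d}.mp4"])
        # audio track clip: drop video (-vn), copy the audio stream
        cmds.append(["ffmpeg", "-i", src, "-ss", str(start), "-to", str(end),
                     "-vn", "-c:a", "copy", f"audio_{idx:04d}.aac"])
        start, idx = end, idx + 1
    return cmds

cmds = build_split_commands("original.mp4", duration_s=150, segment_s=60)
# 3 segments, each yielding one video-only and one audio-only command
```

Each command list would then be executed, e.g. with `subprocess.run`; stream copy (`-c copy`) avoids re-encoding during the split.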
Step S103, the server extracts the characteristic information of the content item of the video clip file through image identification processing;
the feature information comprises character features, animal and plant features, and non-living object features;
each video clip file has a corresponding video picture, and therefore, the server can extract the feature information of the content item from the video picture through image recognition processing.
For example, the facial features of a person X, the leaf features of a background plant Y, the toe features of an animal Z, or the contour, color, or three-dimensional-structure features of statues, paintings, and apparel may be extracted from the video frame.
Step S104, the server acquires first introduction information corresponding to the characteristic information;
wherein the first introduction information includes a content item for introducing the characteristic information to the user;
after identifying the feature information, the server may further perform a search process to obtain introduction information corresponding to the feature information.
For example, corresponding to the above, basic information about person X, such as biographical experience or works, is determined from the facial features of person X; information such as the scientific name, growing environment, or acquisition channel of plant Y is determined from the leaf features of the background plant Y; and information such as the scientific name, habitat, or places where it can be seen is determined from the toe features of animal Z. Similarly, the work title, creator, acquisition channel, and other information about statues, paintings, and apparel are determined from their contour, color, or three-dimensional-structure features.
Step S105, the server packages the first introduction information into a first subtitle file;
the first subtitle file is used for loading at the time when the feature information appears in the original video file.
After the server obtains the introduction information corresponding to the feature information, the introduction information can be encapsulated into the subtitle file, so that when the UE plays the original video file, the introduction information is loaded at the time when the feature information appears in the original video file and is presented to the user, and the user can know the background content corresponding to the feature information.
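Since srt is among the external-subtitle formats mentioned later in this application, step S105 can be sketched as writing the introduction information into an SRT cue timed to the moment the feature information appears (the timestamp helper and the cue text are illustrative):

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def make_cue(index, start_s, end_s, text):
    """One SRT cue carrying the introduction information, loaded at the
    time the feature information appears in the original video file."""
    return (f"{index}\n{srt_timestamp(start_s)} --> {srt_timestamp(end_s)}\n"
            f"{text}\n")

cue = make_cue(1, 12.5, 18.0, "Plant Y: scientific name ..., grows in ...")
# → "1\n00:00:12,500 --> 00:00:18,000\nPlant Y: ...\n"
```

Concatenating such cues (separated by blank lines) yields the first subtitle file of step S105.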
It can be seen from the above that, on the video platform side, the server obtains the original video file and extracts the feature information of content items from the video clip files obtained by audio/video decomposition. On the basis of the feature information, it obtains the corresponding introduction information and encapsulates it into a subtitle file. When the original video file is played, the subtitle file is loaded at the moments when the feature information appears, so the introduction information corresponding to the feature information can be visually presented to the user, who can easily learn the related background content without manual retrieval. Through this arrangement, the video platform can provide users with a higher-quality, content-rich video playing service.
With continued reference to fig. 2, fig. 2 shows another flow chart of the video file processing method provided by the present application, specifically, the video file processing method provided by the present application may include the following steps:
step S201, a server acquires an original video file;
the original video file is used for providing video playing services for the UE.
Step S202, the server carries out decomposition processing on the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track;
it is understood that steps S201 and S202 are the same as steps S101 and S102 in the embodiment corresponding to fig. 1, and detailed description thereof is omitted here.
Step 203, the server extracts keywords of the content items of the audio clip files through voice recognition processing;
it should be understood that the video file processing method provided by the application can extract keywords from the audio clip files on the basis of extracting feature information from the video clip files.
The keywords can be preset as names of people, animals, plants, and non-living objects, and can even be preset as adjectives, phrases, or sentences.
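As a hedged sketch of step S203's matching stage, assuming the speech recognizer yields (word, time) pairs (the recognizer itself is not specified in this application, and the preset keywords below are illustrative), matching against the preset list might look like:

```python
# Sketch of step S203: match recognized words against preset keywords.
# The speech-recognition step is assumed to have already produced
# (word, start_time) pairs; the keyword set is a hypothetical preset.
PRESET_KEYWORDS = {"terracotta army", "panda", "porcelain"}

def extract_keywords(transcript):
    """Return (keyword, time) hits for preset keywords in a transcript."""
    hits = []
    for word, t in transcript:
        if word.lower() in PRESET_KEYWORDS:
            hits.append((word.lower(), t))
    return hits

hits = extract_keywords([("The", 0.0), ("panda", 1.2),
                         ("eats", 1.8), ("bamboo", 2.1)])
# → [("panda", 1.2)]
```

The recorded times are what later allow the second subtitle file to be loaded at the moments the keywords appear.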
Step 204, the server acquires second introduction information corresponding to the keyword;
wherein the second introduction information includes a content item for introducing the keyword to the user.
Similar to step S104 in the embodiment corresponding to fig. 1, after the keyword is identified, the server may further perform a search process to obtain introduction information corresponding to the keyword.
In step 205, the server encapsulates the second introduction information into the second subtitle file.
And the second subtitle file is used for loading at the time when the keywords appear in the original video file.
After the server obtains the introduction information corresponding to the keywords, the introduction information can be encapsulated into the subtitle file, so that when the UE plays the original video file, the introduction information is loaded at the time when the keywords appear in the original video file and is displayed to the user, and the user can learn the background content corresponding to the keywords.
It should be understood that, for convenience of description, the descriptions of extracting the feature information from the video clip file, obtaining the first introduction information corresponding to the feature information, and packaging the first introduction information into the first subtitle file are omitted in the description of the corresponding embodiment of fig. 2.
The above explains the video file processing method provided by the present application based on the feature information corresponding to the video clip files and the keywords of the audio clip files.
In some embodiments, the introduction information corresponding to the feature information and the keywords may be obtained through encyclopedia entries.
Referring to fig. 3, fig. 3 is a schematic flow chart illustrating a process of obtaining feature information through an encyclopedia entry according to the present application, which specifically includes the following steps:
step S301, the server identifies a feature ID corresponding to the feature information;
it can be understood that different feature information and corresponding feature IDs can be preset in the server. When the server identifies the feature information of a video clip file through image recognition processing, it can identify the corresponding feature ID and then acquire the introduction information through that ID, greatly reducing the amount of information the server must process to acquire the introduction information and lowering the server's load.
Step S302, the server positions a first encyclopedia entry corresponding to the characteristic ID from a locally preset encyclopedia entry list;
the encyclopedic entry list comprises corresponding relations between different encyclopedic entries and different feature IDs.
After identifying the feature ID corresponding to the feature information, the server can locate the corresponding encyclopedia entry from the local encyclopedia entry list.
Step S303, the server obtains the first text content in the first encyclopedia entry from a locally preset encyclopedia information base, and determines the first text content as the first introduction information.
It should be understood that the encyclopedia information base is a server-preset set of information (the text content of encyclopedia entries) for introducing the background content of the feature information to the user, with the encyclopedia entry list serving as its index directory. In practical applications, this makes it convenient for the video platform to manage the content quality of the local encyclopedia entries; for example, staff can easily back up, restore, delete, add, update, or otherwise maintain the entry content.
Similar to the above feature information, after identifying the keywords of the audio clip file, the server may directly locate the second encyclopedic entry corresponding to the keyword from the encyclopedic entry list, extract the second text content in the second encyclopedic entry from the encyclopedic information base, and determine the second text content as the second introduction information.
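The entry-list-as-index arrangement of fig. 3 can be sketched with two in-memory maps; all IDs, entry names, and texts below are illustrative placeholders, not part of this application:

```python
# Sketch of the fig. 3 lookup: the entry list maps feature IDs (or
# keywords) to encyclopedia entries, and the information base maps
# entries to their text content. Contents are illustrative.
ENTRY_LIST = {"feat_plant_Y": "entry_plant_Y", "kw_panda": "entry_panda"}
INFO_BASE = {
    "entry_plant_Y": "Plant Y: scientific name ..., growing environment ...",
    "entry_panda": "Giant panda: habitat ..., where to visit ...",
}

def lookup_introduction(key):
    """Locate the entry for a feature ID or keyword, then fetch its text."""
    entry = ENTRY_LIST.get(key)
    if entry is None:
        return None  # no local entry; the server may fall back to a third party
    return INFO_BASE.get(entry)

info = lookup_introduction("feat_plant_Y")
```

Keeping the entry list as a separate index is what lets staff back up, delete, or update entry contents without touching the identification pipeline.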
In some embodiments, the introduction information corresponding to the feature information and the keywords can be obtained from outside the video platform.
For example, the server may obtain the introduction information corresponding to the feature information and the keywords from a third party's public database through a web crawler.
That is, the server can search the third party's public database through the web crawler, locate the text information related to the feature information and the keywords, and extract that text information as the introduction information.
Alternatively, the server may obtain the introduction information corresponding to the feature information and the keywords from a third party's encyclopedia information base through a third-party API.
The server and the third party's encyclopedia information base can negotiate in advance and preset a third-party API, together with its access permissions, between them, so that the server can obtain the introduction information corresponding to the feature information and the keywords from the third party's encyclopedia information base.
Similar to the embodiment corresponding to fig. 3, the feature information and the introduction information corresponding to the keyword may be obtained from the encyclopedic information base of the third party through an encyclopedic entry list preset locally in the server, which is not described herein again in detail.
In some embodiments, for the process of packaging the introduction information into the subtitle file, refer to fig. 4, which shows a schematic flowchart of packaging the introduction information into the subtitle file as provided by the present application; the process may specifically include the following steps:
step S401, a server identifies that characteristic information appears in N first time periods of an original video file;
where N is greater than 2; if N is even, step S402 is triggered, and if N is odd, step S403 is triggered.
It is understood that the time when the feature information appears in the original video file, or the time when the feature information appears in the video clip file, may be divided into different time periods.
The subtitle file with the correspondingly encapsulated introduction information can be set according to different time periods when the characteristic information appears.
Step S402, the server packages the first introduction information to a first subtitle file corresponding to the original video file in the 1 st first time period, the (N/2) th first time period and the Nth first time period;
in step S403, the server packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2, rounded up)th first time period, and the Nth first time period.
When feature information A appears in more than 3 time periods, the subtitle file encapsulating its introduction information C can be played in 3 selected time periods, so that feature information A is presented under different conditions, such as different scenes, viewing angles, or time points, and the user can learn the background content of feature information A in more detail.
In practical applications, after obtaining the number N of time periods in which feature information A appears, 3 of those time periods are selected according to whether N is odd or even, as in steps S402 and S403, and the subtitle file is packaged accordingly.
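The period selection of steps S402/S403 reduces to one formula, since ceiling division covers both the even case (N/2 exactly) and the odd case (N/2 rounded up); a minimal sketch:

```python
import math

def select_periods(n):
    """Select the 3 time periods (1-based) in which the subtitle file is
    packaged, per steps S402/S403: the 1st, the (N/2)th (rounded up when
    N is odd), and the Nth. The method states N must exceed 2."""
    if n <= 2:
        raise ValueError("N must be greater than 2")
    return [1, math.ceil(n / 2), n]

select_periods(6)  # even N → [1, 3, 6]
select_periods(5)  # odd N  → [1, 3, 5]
```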
Similar to the embodiment corresponding to fig. 4, 3 time periods are selected for encapsulating the corresponding subtitle file according to the M second time periods when the keywords appear in the original video file, which is not described herein again.
Subtitles can be classified into embedded subtitles and external subtitles.
On one hand, in some embodiments, embedded subtitles may be encapsulated directly in the original video file, for example:
and the server embeds the first subtitle file and/or the second subtitle file into the original video file to obtain a target video file, and the target video file is used for providing video playing services for the UE.
When the UE initiates a video playing request corresponding to the original video file or the target video file to the server, the server can issue the target video file to the UE; the UE then loads the first subtitle file and/or the second subtitle file while playing the target video file, so that the user obtains the background content corresponding to the feature information and/or the keywords.
On the other hand, the subtitle file of an external subtitle can adopt a file format such as srt, ass, smi, ssa, or sub; the UE can load the subtitle file synchronously when loading the original video file and output the background content corresponding to the feature information and/or the keywords to the user.
In some embodiments, for the subtitle file of an external subtitle, the server may further set a configuration policy for configuring how the UE outputs the subtitle file.
Therefore, when the server receives a video playing request initiated by the UE based on the target video file, the server issues the original video file, the configuration policy, and the first subtitle file to the UE; or,
when the server receives a video playing request initiated by the UE based on the target video file, the server issues the original video file, the configuration policy, the first subtitle file, and the second subtitle file to the UE;
the configuration policy may include the following:
when the text content of the subtitle file is within 100 characters, the subtitle file is floated and displayed in a transparent floating window;
when the text content of the subtitle file is more than 100 characters and within 300 characters, the subtitle file is displayed in scrolling mode; and,
when the text content of the subtitle file exceeds 300 characters, a subtitle control lets the user choose, through a selection operation, whether to display the subtitle file.
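The three-tier configuration policy above maps subtitle length to an output mode; a minimal sketch using the stated 100- and 300-character thresholds (the mode names are illustrative):

```python
def display_mode(text):
    """Map subtitle text length to the configured output mode, per the
    configuration policy's 100- and 300-character thresholds."""
    n = len(text)
    if n <= 100:
        return "transparent_floating_window"
    if n <= 300:
        return "scrolling_display"
    return "user_toggle"  # subtitle control lets the user decide

display_mode("short intro")  # → "transparent_floating_window"
display_mode("x" * 200)      # → "scrolling_display"
display_mode("x" * 301)      # → "user_toggle"
```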
In some embodiments, to facilitate optimizing the content quality of the subtitle files, the configuration policy may further include:
a policy under which the subtitle control deletes the target subtitle file upon the user's selection operation.
Therefore, the video file processing method provided by the present application may further include:
the server counts the number of times that a target subtitle file is deleted locally by multiple UEs;
and when that count exceeds a preset number within a preset statistical period, the server deletes the target subtitle file from the locally stored first subtitle file and/or second subtitle file.
It can be understood that by tracking how multiple UEs delete the same subtitle file, specific subtitle files with poor content quality can be effectively screened out and deleted on the server side, improving the overall content quality of the subtitle files delivered by the server.
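The server-side screening described above can be sketched as counting deletion events per subtitle file within one statistical window (the threshold value is a hypothetical preset):

```python
from collections import Counter

DELETE_THRESHOLD = 50  # hypothetical preset number of deletions

def subtitles_to_purge(deletion_events, threshold=DELETE_THRESHOLD):
    """Given (ue_id, subtitle_id) deletion events reported within one
    statistical window, return the subtitle files deleted more times
    than the threshold, i.e. candidates for server-side removal."""
    counts = Counter(sub_id for _ue, sub_id in deletion_events)
    return {sub_id for sub_id, n in counts.items() if n > threshold}

events = [("ue1", "subA"), ("ue2", "subA"), ("ue3", "subB")]
subtitles_to_purge(events, threshold=1)  # → {"subA"}
```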
In some embodiments, the uploading of the original video file or the corresponding playing request may be used as a trigger condition for the server to acquire the original video file.
It can be understood that when the server applies the video file processing method provided by the present application, it may monitor the video database local to the server or to the video platform; if a new video file is detected being uploaded to the video database, the new video file may be identified as an original video file, triggering the video file processing method provided by the present application.
Therefore, when the server detects that the original video file is uploaded to a local video database, the server acquires the original video file.
Alternatively, when the server applies the video file processing method provided by the present application, it may wait for the UE to initiate a video playing request, identify the video file corresponding to the video file ID carried in the request as the original video file, and then apply the method, thereby reducing the amount of video files the method must process.
It should be understood that a video platform contains video files with small click and play counts; if the processing method provided by the present application were applied to all video files on the platform in advance, a large amount of unnecessary resource cost (time, computing resources, and so on) would obviously have to be invested in a short time.
Therefore, when the server receives a video playing request initiated by the UE based on the original video file, the server acquires the original video file from the video database according to the video file ID carried by the video playing request.
The above is an introduction of the video file processing method provided by the present application, and the following is a start of an introduction of the server provided by the present application.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a server provided by the present application, and in particular, the server may include the following structure:
an obtaining unit 501, configured to obtain an original video file, where the original video file is used to provide a video playing service to a UE;
a decomposition unit 502, configured to decompose an original video file to obtain multiple video clip files corresponding to a video track and multiple audio clip files corresponding to an audio track;
an extracting unit 503 configured to extract feature information of the content item of the video clip file through image recognition processing, the feature information including a character feature, an animal feature, a plant feature, and a non-living object feature;
the obtaining unit 501 is further configured to obtain first introduction information corresponding to the feature information, where the first introduction information includes a content item for introducing the feature information to a user;
an encapsulating unit 504, configured to encapsulate the first introduction information into a first subtitle file, where the first subtitle file is used to load at a time when the feature information appears in the original video file.
In some embodiments, the extracting unit 503 is further configured to extract keywords of the content item of the audio clip file through a speech recognition process;
the obtaining unit 501 is further configured to obtain second introduction information corresponding to the keyword, where the second introduction information includes a content item for introducing the keyword to the user;
the packaging unit 504 is further configured to package the second introduction information into a second subtitle file, where the second subtitle file is used to be loaded at a time when the keyword appears in the original video file.
In some embodiments, the obtaining unit 501 is specifically configured to:
identifying a feature ID corresponding to the feature information;
positioning a first encyclopedia entry corresponding to the characteristic ID from a locally preset encyclopedia entry list, wherein the encyclopedia entry list comprises corresponding relations between different encyclopedia entries and different characteristic IDs;
acquiring first text content in the first encyclopedia entry from a locally preset encyclopedia information base, and determining the first text content as the first introduction information; or,
the acquiring of the second introduction information corresponding to the keyword comprises:
positioning a second encyclopedia entry corresponding to the keyword from the encyclopedia entry list;
and extracting second text content in the second encyclopedic entry from the encyclopedic information base and determining the second text content as second introduction information.
In some embodiments, the obtaining unit 501 is specifically configured to:
the server acquires the first introduction information from a third party's public database through a web crawler; or,
the server acquires the first introduction information from a third party's encyclopedia information base through a third-party API; or,
the obtaining unit 501 is specifically configured to:
the server acquires the second introduction information from the third party's public database through the web crawler; or,
the server acquires the second introduction information from the third party's encyclopedia information base through the third-party API.
In some embodiments, the encapsulation unit 504 is specifically configured to:
identifying that the characteristic information appears in N first time periods of an original video file, wherein N is more than 2;
when N is an even number, packaging the first introduction information into a first subtitle file corresponding to the original video file in the 1 st first time period, the (N/2) th first time period and the Nth first time period;
when N is an odd number, packaging the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2, rounded up)th first time period, and the Nth first time period; or,
the encapsulation unit 504 is specifically configured to:
identifying M second time periods when the keywords appear in the original video file, wherein M is larger than 2;
when M is an even number, packaging the second introduction information into a second subtitle file corresponding to the original video file in the 1 st second time period, the (M/2) th second time period and the Mth second time period;
and when M is an odd number, packaging the second introduction information into the second subtitle file corresponding to the original video file in the 1st second time period, the (M/2, rounded up)th second time period, and the Mth second time period.
In some embodiments, the server further comprises:
a setting unit 505 for setting a configuration policy;
the issuing unit 506 is configured to issue the original video file, the configuration policy, and the first subtitle file to the UE when the server receives a video playing request initiated by the UE based on the target video file; or,
to issue the original video file, the configuration policy, the first subtitle file, and the second subtitle file to the UE when the server receives a video playing request initiated by the UE based on the target video file;
wherein the configuration policy comprises:
when the text content of the subtitle file is within 100 characters, the subtitle file is floated and displayed in a transparent floating window;
when the text content of the subtitle file is more than 100 characters and within 300 characters, the subtitle file is displayed in scrolling mode; and,
when the text content of the subtitle file exceeds 300 characters, a subtitle control lets the user choose, through a selection operation, whether to display the subtitle file.
In some embodiments, the configuration policy further comprises:
a policy under which the subtitle control deletes the target subtitle file upon the user's selection operation;
the server further comprises:
a counting unit 507, configured to count the number of times that the target subtitle file is deleted locally by the multiple UEs;
a deleting unit 508, configured to delete the target subtitle file from the locally stored first subtitle file and/or second subtitle file when the number of times within the preset statistical time is greater than the preset number of times.
In some embodiments, the server further comprises:
an embedding unit 509, configured to embed the first subtitle file and/or the second subtitle file into the original video file to obtain a target video file, where the target video file is used to provide a video playing service for the UE.
In some embodiments, the obtaining unit 501 is specifically configured to:
when the server detects that an original video file is uploaded to the local video database, the original video file is obtained; or,
when the server receives a video playing request initiated by the UE based on an original video file, the original video file is obtained from the video database according to the video file ID carried by the video playing request.
Referring to fig. 6, fig. 6 is a schematic diagram illustrating another structure of a server provided in the present application, specifically, the server provided in the present application includes a processor 601, where the processor 601 is configured to implement the steps of the video file processing method in any embodiment corresponding to fig. 1 to 4 when executing the computer program stored in the memory 602; alternatively, the processor 601 is configured to implement the functions of the units in the corresponding embodiment of fig. 5 when executing the computer program stored in the memory 602.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in the memory 602 and executed by the processor 601 to accomplish the present application. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The server may include, but is not limited to, a processor 601 and a memory 602. Those skilled in the art will appreciate that the illustration is merely an example of a computer device and does not constitute a limitation of the server; the server may include more or fewer components than those illustrated, combine some components, or use different components. For example, the server may further include input/output devices, network access devices, and a bus, with the processor 601, the memory 602, the input/output devices, and the network access devices connected via the bus.
The Processor 601 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general purpose processor may be a microprocessor, or the processor may be any conventional processor. The processor is the control center of the computer device, connecting the various parts of the overall device via various interfaces and lines.
The memory 602 may be used to store computer programs and/or modules, and the processor 601 implements the various functions of the computer device by running or executing the computer programs and/or modules stored in the memory 602 and calling the data stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area: the program storage area may store an operating system and the application programs required by at least one function (such as a sound playing function or an image playing function), while the data storage area may store data created according to the use of the device (such as audio data and video data). In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as a hard disk, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a Flash memory card (Flash Card), at least one magnetic disk storage device, a flash memory device, or other non-volatile solid state storage device.
The present application further provides a readable storage medium, on which a computer program is stored, and the computer program, when executed by a processor, implements a method for processing a video file as in any of the embodiments corresponding to fig. 1 to 4.
It will be appreciated that the integrated unit, if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in whole or in part in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the server and the units thereof described above may refer to the descriptions of the video file processing methods in the embodiments corresponding to fig. 1 to fig. 4, and details are not described herein again.
In summary, according to the video file processing method and server provided by the present application, on the video platform side the server acquires the original video file, extracts feature information of content items from the video clip files obtained by audio and video decomposition of the original video file, and, on the basis of the feature information, acquires the corresponding introduction information and encapsulates it into a subtitle file. When the original video file is played, the subtitle file is loaded at the time the feature information appears, so that the introduction information corresponding to the feature information is visually presented to the user. The user can thus easily learn the background content corresponding to the feature information from the introduction information without manual retrieval, and through this arrangement the video platform can provide the user with a video playing service of higher quality and richer video content.
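The summarized flow (decompose the file, recognize features in the clips, look up introduction information, and emit subtitle entries timed to each feature's appearance) can be sketched as below; the recognizer and encyclopedia lookup are stand-in callables, and all names are illustrative assumptions rather than the patent's actual implementation.

```python
# Illustrative pipeline only: real image/speech recognition and encyclopedia
# lookup are passed in as callables and stubbed out by the caller.

def decompose(video):
    """Split the original file into video-track and audio-track clip lists."""
    return video["video_clips"], video["audio_clips"]


def build_subtitles(video, recognize, lookup):
    """For each feature recognized in a video clip, emit a timed subtitle
    entry carrying the introduction information, to be loaded at the time
    the feature appears in the original video file."""
    video_clips, _audio_clips = decompose(video)
    subtitles = []
    for clip in video_clips:
        for feature, appear_time in recognize(clip):
            subtitles.append({"time": appear_time, "text": lookup(feature)})
    return sorted(subtitles, key=lambda s: s["time"])
```

A caller would supply `recognize` (e.g. an image-recognition stage returning `(feature, time)` pairs) and `lookup` (e.g. an encyclopedia-entry fetch returning introduction text).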
In the embodiments provided in the present application, it should be understood that the disclosed server and its units may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the units is only one logical division, and other divisions may be realized in practice, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The above embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (6)

1. A method for processing a video file, the method comprising:
a server acquires an original video file, wherein the original video file is used for providing a video playing service for UE;
the server carries out decomposition processing on the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track;
the server extracts feature information of content items of the video clip files through image recognition processing, wherein the feature information comprises character features, animal and plant features and non-living object features;
the server acquires first introduction information corresponding to the characteristic information, wherein the first introduction information comprises content items used for introducing the characteristic information to a user;
the server packages the first introduction information into a first subtitle file, wherein the first subtitle file is used for loading when the characteristic information appears in the original video file;
the server extracts keywords of the content items of the audio clip files through voice recognition processing;
the server acquires second introduction information corresponding to the keyword, wherein the second introduction information comprises content items used for introducing the keyword to a user;
the server packages the second introduction information into a second subtitle file, wherein the second subtitle file is used for loading at the time when the keyword appears in the original video file, and the step of acquiring the first introduction information corresponding to the characteristic information by the server comprises the following steps:
the server identifies a characteristic ID corresponding to the characteristic information;
the server locates a first encyclopedia entry corresponding to the feature ID from a locally preset encyclopedia entry list, wherein the encyclopedia entry list comprises corresponding relations between different encyclopedia entries and different feature IDs;
the server acquires first text content in the first encyclopedia entry from a locally preset encyclopedia information base and determines the first text content as first introduction information; alternatively,
the server acquiring second introduction information corresponding to the keyword comprises the following steps:
the server locates a second encyclopedia entry corresponding to the keyword from the encyclopedia entry list;
the server extracts second text content in the second encyclopedic entry from the encyclopedic information base and determines the second text content as second introduction information, wherein the step of acquiring the first introduction information corresponding to the characteristic information by the server comprises the following steps:
the server acquires the first introduction information from a public database of a third party through a web crawler; alternatively,
the server acquires the first introduction information from an encyclopedia information base of a third party through a third party API; alternatively,
the server acquiring second introduction information corresponding to the keyword comprises the following steps:
the server acquires the second introduction information from the public database of the third party through the web crawler; alternatively,
the server acquires the second introduction information from an encyclopedia information base of the third party through the third party API, and the step of packaging the first introduction information into the first subtitle file by the server comprises the following steps:
the server identifies that the characteristic information appears in N first time periods of the original video file, wherein N is greater than 2;
when N is an even number, the server packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2)th first time period and the Nth first time period;
when N is an odd number, the server packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2, rounded up)th first time period and the Nth first time period; alternatively,
the server packaging the second introduction information into a second subtitle file comprises:
the server identifies that the keyword appears in M second time periods of the original video file, wherein M is greater than 2;
when M is an even number, the server packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st second time period, the (M/2)th second time period and the Mth second time period;
when M is an odd number, the server packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st second time period, the (M/2, rounded up)th second time period and the Mth second time period.
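The period-selection rule at the end of claim 1 packages the introduction information into the 1st, the N/2-th (rounded up when N is odd), and the Nth time periods. Since the ceiling of N/2 equals N/2 when N is even, one expression covers both branches; a minimal sketch (the function name is assumed, not from the patent):

```python
def selected_periods(n):
    """Return the 1-based time-period indices into which the introduction
    information is packaged: the 1st, the ceil(n/2)-th, and the n-th.
    (n + 1) // 2 computes ceil(n/2) for both even and odd n, so one
    expression covers both branches of the claim, which requires n > 2."""
    if n <= 2:
        raise ValueError("the claim requires more than 2 occurrences")
    return sorted({1, (n + 1) // 2, n})
```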
2. The method of claim 1, further comprising:
the server sets a configuration policy;
when the server receives a video playing request initiated by the UE based on a target video file, the server issues the original video file, the configuration strategy and the first subtitle file to the UE; alternatively,
when the server receives a video playing request initiated by the UE based on the target video file, the server issues the original video file, the configuration strategy, the first subtitle file and the second subtitle file to the UE;
wherein the configuration policy comprises:
when the text content of the subtitle file is within 100 characters, a configuration strategy of displaying the subtitle file in a transparent floating window;
when the text content of the subtitle file is more than 100 characters and within 300 characters, a configuration strategy of displaying the subtitle file in a scrolling display mode; and
when the text content of the subtitle file is above 300 characters, a configuration strategy of letting the user select, via a subtitle control, whether to display the subtitle file.
3. The method of claim 2, wherein the configuration policy further comprises:
a configuration strategy of deleting the target subtitle file upon the user's selection operation via the subtitle control;
the method further comprises the following steps:
the server counts the number of times the target subtitle file has been locally deleted by a plurality of UEs;
and when the count exceeds a preset number of times within a preset statistical period, the server deletes the target subtitle file from the locally stored first subtitle file and/or second subtitle file.
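Claim 3's pruning behavior (the server counts local deletions reported by UEs and removes the target subtitle file once the count exceeds a preset threshold within the statistical window) might look like the following sketch; the class name, the sliding-window handling, and the injectable `now` parameter are all assumptions for illustration:

```python
import time


class SubtitlePruner:
    """Count per-UE deletions of a target subtitle file; once the count
    exceeds the preset threshold within the statistical window, the server
    removes that file from its local subtitle store."""

    def __init__(self, store, threshold, window_seconds):
        self.store = store            # subtitle_id -> subtitle data
        self.threshold = threshold    # preset number of deletions
        self.window = window_seconds  # preset statistical period
        self.events = {}              # subtitle_id -> deletion timestamps

    def report_deletion(self, subtitle_id, now=None):
        now = time.time() if now is None else now
        timestamps = self.events.setdefault(subtitle_id, [])
        timestamps.append(now)
        # keep only deletions that fall inside the statistical window
        timestamps[:] = [t for t in timestamps if now - t <= self.window]
        if len(timestamps) > self.threshold:
            self.store.pop(subtitle_id, None)
```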
4. The method of claim 1, further comprising:
and the server embeds the first subtitle file and/or the second subtitle file into the original video file to obtain a target video file, and the target video file is used for providing a video playing service for the UE.
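One conventional way to realize claim 4's embedding step is to mux the subtitle file into the video container as a subtitle track, for example with ffmpeg. The sketch below only builds the argument list (it does not invoke ffmpeg); the paths are illustrative, and an ffmpeg build with the `mov_text` subtitle encoder is assumed for MP4 output.

```python
def build_embed_command(video_path, subtitle_path, output_path):
    """Construct an ffmpeg command that copies the audio/video streams
    unchanged and muxes the subtitle file in as a mov_text subtitle track
    (the text subtitle codec used in MP4 containers)."""
    return [
        "ffmpeg", "-i", video_path, "-i", subtitle_path,
        "-c:v", "copy", "-c:a", "copy",  # keep the original A/V untouched
        "-c:s", "mov_text",              # re-encode subtitles for MP4
        output_path,
    ]
```

The resulting target video file then carries the introduction subtitles as a selectable track when served to the UE.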
5. The method of claim 1, wherein the server obtaining the original video file comprises:
when the server detects that the original video file is uploaded to a local video database, the server acquires the original video file; alternatively, the first and second electrodes may be,
and when the server receives a video playing request initiated by the UE based on the original video file, the server acquires the original video file from the video database according to the video file ID carried by the video playing request.
6. A server, comprising:
the device comprises an acquisition unit, a processing unit and a processing unit, wherein the acquisition unit is used for acquiring an original video file which is used for providing a video playing service for UE;
the decomposition unit is used for decomposing the original video file to obtain a plurality of video clip files corresponding to the video track and a plurality of audio clip files corresponding to the audio track;
an extraction unit configured to extract feature information of a content item of the video clip file by image recognition processing, the feature information including character features, animal and plant features, and non-living object features;
an obtaining unit configured to obtain first introduction information corresponding to the feature information, the first introduction information including a content item for introducing the feature information to a user;
a packaging unit, configured to package the first introduction information into a first subtitle file, wherein the first subtitle file is used for loading at the time when the feature information appears in the original video file;
the extraction unit extracts keywords of the content items of the audio clip files through voice recognition processing;
the acquisition unit acquires second introduction information corresponding to the keyword, wherein the second introduction information comprises content items for introducing the keyword to a user;
the packaging unit packages the second introduction information into a second subtitle file, and the second subtitle file is used for loading at the time when the keyword appears in the original video file;
the acquiring unit acquiring the first introduction information corresponding to the feature information comprises:
the server identifies a characteristic ID corresponding to the characteristic information;
the acquiring unit is used for positioning a first encyclopedia entry corresponding to the feature ID from a locally preset encyclopedia entry list, wherein the encyclopedia entry list comprises corresponding relations between different encyclopedia entries and different feature IDs;
the obtaining unit obtains first text content in the first encyclopedia entry from a locally preset encyclopedia information base and determines the first text content as the first introduction information; alternatively,
the acquiring unit acquiring the second introduction information corresponding to the keyword comprises:
the obtaining unit is used for positioning a second encyclopedia entry corresponding to the keyword from the encyclopedia entry list;
the acquisition unit extracts second text content in the second encyclopedic entry from the encyclopedic information base and determines the second text content as second introduction information;
the first introduction information corresponding to the characteristic information of the obtaining unit comprises:
the acquisition unit acquires the first introduction information from a public database of a third party through a web crawler; alternatively,
the obtaining unit obtains the first introduction information from an encyclopedia information base of a third party through a third party API; alternatively,
the acquiring unit acquiring the second introduction information corresponding to the keyword comprises:
the acquisition unit acquires the second introduction information from the public database of the third party through the web crawler; alternatively,
the obtaining unit obtains the second introduction information from an encyclopedia information base of the third party through the third party API;
the packaging unit packaging the first introduction information into a first subtitle file comprises:
the packaging unit identifies that the characteristic information appears in N first time periods of the original video file, wherein N is greater than 2;
when N is an even number, the packaging unit packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2)th first time period and the Nth first time period;
when N is an odd number, the packaging unit packages the first introduction information into the first subtitle file corresponding to the original video file in the 1st first time period, the (N/2, rounded up)th first time period and the Nth first time period; alternatively,
the packaging unit packaging the second introduction information into a second subtitle file includes:
the packaging unit identifies that the keyword appears in M second time periods of the original video file, wherein M is greater than 2;
when M is an even number, the packaging unit packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st second time period, the (M/2)th second time period and the Mth second time period;
when M is an odd number, the packaging unit packages the second introduction information into the second subtitle file corresponding to the original video file in the 1st second time period, the (M/2, rounded up)th second time period and the Mth second time period.
CN201910562051.6A 2019-06-26 2019-06-26 Video file processing method and server Active CN110493661B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910562051.6A CN110493661B (en) 2019-06-26 2019-06-26 Video file processing method and server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910562051.6A CN110493661B (en) 2019-06-26 2019-06-26 Video file processing method and server

Publications (2)

Publication Number Publication Date
CN110493661A CN110493661A (en) 2019-11-22
CN110493661B true CN110493661B (en) 2021-11-16

Family

ID=68546387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910562051.6A Active CN110493661B (en) 2019-06-26 2019-06-26 Video file processing method and server

Country Status (1)

Country Link
CN (1) CN110493661B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114257596A (en) * 2021-12-24 2022-03-29 河北烽联信息技术有限公司 Data processing method and device based on edge calculation and electronic equipment
CN114329254A (en) * 2021-12-31 2022-04-12 北京字节跳动网络技术有限公司 Search result display method and device, computer equipment and storage medium
CN115022705A (en) * 2022-05-24 2022-09-06 咪咕文化科技有限公司 Video playing method, device and equipment

Citations (4)

Publication number Priority date Publication date Assignee Title
CN101634987A (en) * 2008-07-21 2010-01-27 上海天统电子科技有限公司 Multimedia player
CN104113768A (en) * 2014-06-26 2014-10-22 小米科技有限责任公司 Associated information generation method and device
CN104184923A (en) * 2014-08-27 2014-12-03 天津三星电子有限公司 System and method used for retrieving figure information in video
CN105052155A (en) * 2013-03-20 2015-11-11 谷歌公司 Interpolated video tagging

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US10770113B2 (en) * 2016-07-22 2020-09-08 Zeality Inc. Methods and system for customizing immersive media content

Patent Citations (4)

Publication number Priority date Publication date Assignee Title
CN101634987A (en) * 2008-07-21 2010-01-27 上海天统电子科技有限公司 Multimedia player
CN105052155A (en) * 2013-03-20 2015-11-11 谷歌公司 Interpolated video tagging
CN104113768A (en) * 2014-06-26 2014-10-22 小米科技有限责任公司 Associated information generation method and device
CN104184923A (en) * 2014-08-27 2014-12-03 天津三星电子有限公司 System and method used for retrieving figure information in video

Also Published As

Publication number Publication date
CN110493661A (en) 2019-11-22

Similar Documents

Publication Publication Date Title
CN110740387B (en) Barrage editing method, intelligent terminal and storage medium
CN110493661B (en) Video file processing method and server
CN109819284B (en) Short video recommendation method and device, computer equipment and storage medium
CN110139162B (en) Media content sharing method and device, storage medium and electronic device
US20170289643A1 (en) Method of displaying advertising during a video pause
US20130179172A1 (en) Image reproducing device, image reproducing method
CN108521608A (en) Processing method, device, terminal and the storage medium of video file
US10929460B2 (en) Method and apparatus for storing resource and electronic device
US20150117837A1 (en) Systems and methods for supplementing content at a user device
CN103428525A (en) Online inquiry and play control method and system for network videos and television programs
CN104967908A (en) Video hot spot marking method and apparatus
CN105791976B (en) Electronic device and method for playing video
CN104361075A (en) Image website system and realizing method
CN114329298B (en) Page presentation method and device, electronic equipment and storage medium
CN103546774A (en) Method and system for realizing seamless access to media file
CN104881774A (en) Method and apparatus for automatically establishing schedule
CN109040773A (en) A kind of video improvement method, apparatus, equipment and medium
CN111368141A (en) Video tag expansion method and device, computer equipment and storage medium
CN114679621A (en) Video display method and device and terminal equipment
CN109168043B (en) Method, equipment and system for displaying recommendation information
CN109116718B (en) Method and device for setting alarm clock
US9762703B2 (en) Method and apparatus for assembling data, and resource propagation system
CN106951405B (en) Data processing method and device based on typesetting engine
CN107241618A (en) Recording method and collection device
CN112307823A (en) Method and device for labeling objects in video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant