CN109803173B - Video transcoding method and device and storage device - Google Patents


Info

Publication number
CN109803173B
CN109803173B (application CN201711139159.1A)
Authority
CN
China
Prior art keywords
subtitle
audio track
transcoding
track
description information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711139159.1A
Other languages
Chinese (zh)
Other versions
CN109803173A (en
Inventor
王俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN201711139159.1A priority Critical patent/CN109803173B/en
Publication of CN109803173A publication Critical patent/CN109803173A/en
Application granted granted Critical
Publication of CN109803173B publication Critical patent/CN109803173B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The embodiment of the invention discloses a video transcoding method, apparatus, and storage device, wherein the video transcoding method comprises the following steps: parsing a source video file to obtain video metadata; determining an audio track matching pre-configured language priority information according to the audio track description information of each audio track included in the video metadata, and setting the matched audio track as a transcoding audio track; determining a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoding subtitle; and transcoding the source video file according to these settings. The embodiment of the invention can realize video transcoding automatically, improve transcoding efficiency, and save labor cost.

Description

Video transcoding method and device and storage device
Technical Field
The embodiment of the invention relates to the technical field of video processing, and in particular to a video transcoding method, a video transcoding device, and a storage device.
Background
For various reasons, such as differing user requirements, playback terminals, and network bandwidths, a source video file must be transcoded to obtain a video file that meets the requirements. The existing video transcoding scheme is mainly implemented manually. For example, for a multi-track, multi-subtitle source video file, the file must first be played in a video player to confirm the language identified by each audio track and each subtitle, after which the transcoding audio track and transcoding subtitle are set by hand. For a large-batch film source transcoding scenario, manual processing requires setting the transcoding audio track and transcoding subtitle for each source video file one by one; the workload is huge, errors occur easily, transcoding efficiency is low, and cost is high.
Disclosure of Invention
In view of this, embodiments of the present invention provide a video transcoding method, an apparatus, and a storage device, which can automatically implement video transcoding, improve transcoding efficiency, and save labor cost.
The video transcoding method provided by the embodiment of the invention comprises the following steps:
analyzing a source video file to obtain video metadata;
determining an audio track matched with pre-configured language priority information according to audio track description information of each audio track included in the video metadata, and setting the matched audio track as a transcoding audio track;
determining a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoding subtitle;
and transcoding the source video file according to the setting.
The video transcoding device provided by the embodiment of the invention comprises:
the analysis unit is used for analyzing the source video file to obtain video metadata;
the audio track setting unit is used for determining an audio track matching the pre-configured language priority information according to the audio track description information of each audio track included in the video metadata, and setting the matched audio track as a transcoding audio track;
the subtitle setting unit is used for determining a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoding subtitle;
and the transcoding unit is used for transcoding the source video file according to the setting.
The embodiment of the invention also provides a storage device, wherein the storage device is used for storing a plurality of instructions, and the instructions are suitable for being loaded by a processor and executing the video transcoding method.
In the embodiment of the invention, video metadata is obtained by parsing a source video file; an audio track matching pre-configured language priority information is determined according to the audio track description information of each audio track included in the video metadata, and the matched audio track is set as a transcoding audio track; a subtitle matching the pre-configured language priority information is determined according to the subtitle description information of each subtitle included in the video metadata, and the matched subtitle is set as a transcoding subtitle; and the source video file is transcoded according to these settings. In the embodiment of the invention, a matching method can thus be adopted to automatically set the transcoding audio track and transcoding subtitle from the video metadata and the pre-configured language priority information, thereby realizing automatic transcoding.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained based on these drawings without creative efforts.
Fig. 1 is a schematic view of a scene of a video transcoding method according to an embodiment of the present invention;
fig. 2 is a schematic flowchart of a video transcoding method according to an embodiment of the present invention;
fig. 3a is another schematic flow chart of a video transcoding method according to an embodiment of the present invention;
fig. 3b is a schematic flowchart of a method for setting a transcoded soundtrack according to an embodiment of the present invention;
fig. 3c is a schematic flowchart of a method for setting transcoded subtitles according to an embodiment of the present invention;
fig. 3d is a schematic flowchart of a watermark setting method according to an embodiment of the present invention;
fig. 4 is another schematic flowchart of a video transcoding method according to an embodiment of the present invention;
fig. 5a is a schematic structural diagram of a video transcoding apparatus according to an embodiment of the present invention;
fig. 5b is a schematic structural diagram of a video transcoding apparatus according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of another video transcoding apparatus according to an embodiment of the present invention.
Detailed Description
Referring to the drawings, wherein like reference numbers refer to like elements, the principles of the present application are illustrated as being implemented in a suitable computing environment. The following description is based on illustrated embodiments of the application and should not be taken as limiting the application with respect to other embodiments that are not detailed herein.
In the description that follows, specific embodiments of the present application will be described with reference to steps and symbols executed by one or more computers, unless otherwise indicated. Accordingly, these steps and operations will be referred to, at several points, as being performed by a computer, the computer performing operations involving a processing unit of the computer on electronic signals representing data in a structured form. These operations transform the data or maintain it at locations in the computer's memory system, which may be reconfigured or otherwise altered in a manner well known to those skilled in the art. The data is maintained in a data structure, a physical location in memory with particular characteristics defined by the data format. However, while the principles of the application are described in the foregoing text, this is not meant to be limiting; those of ordinary skill in the art will appreciate that various steps and operations described below may also be implemented in hardware.
The term module, as used herein, may be considered a software object executing on the computing system. The various components, modules, engines, and services described herein may be viewed as objects implemented on the computing system. The apparatus and method described herein may be implemented in software, but may also be implemented in hardware, and are within the scope of the present application.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or modules is not limited to only those steps or modules recited, but rather, some embodiments include additional steps or modules not recited, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein may be combined with other embodiments.
An execution main body of the video transcoding method can be the video transcoding device provided by the embodiment of the application or an electronic device integrated with the video transcoding device, wherein the video transcoding device can be realized in a hardware or software mode; the electronic device may be a server, a computer, or the like.
Referring to fig. 1, fig. 1 is a schematic view of an application scenario of the video transcoding method provided in an embodiment of the present application, where the video transcoding device is taken to be a server. The server can parse a source video file to be transcoded to obtain video metadata; determine an audio track matching pre-configured language priority information according to the audio track description information of each audio track included in the video metadata, and set the matched audio track as a transcoding audio track; determine a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and set the matched subtitle as a transcoding subtitle; and transcode the source video file according to these settings. Thereafter, when a client wants to obtain the video file, it may send a video acquisition request to the server, and after receiving the request, the server may send the transcoded video to the client.
The following are detailed descriptions. It should be noted that the following description of the embodiments is not intended to limit the preferred order of the embodiments.
In this embodiment, the video transcoding device will be described in terms of a specific flow of the video transcoding method, as shown in fig. 2, which may be as follows:
step 201, analyzing a source video file to obtain video metadata.
The source video files refer to video files to be transcoded, and each source video file can comprise audio tracks of multiple languages and subtitles of multiple languages besides video data. Each source video file itself has video metadata, which is mainly data describing the attributes of the source video file, such as: author, name, copyright, version, format, authoring time, language, etc. of a video file, and video metadata is generally stored in a key-value format.
Audio tracks and subtitles are stored in the source video file in the abstract format of a stream: an audio track in one language is one stream, and a subtitle in one language is another. Information on each stream can be described in the video metadata; the detailed description information of each stream is generally stored in the title field of the video metadata, and in practice the title field typically describes the audio track or subtitle together with its language information.
In a specific implementation, the source video file may be parsed as follows to obtain the video metadata:
open a parse handle, perform stream parsing, and obtain the number of streams contained in the source video file;
read a stream handle, obtain the stream type, and extract the content of the title field of the video metadata;
repeat the above steps until all streams have been parsed.
After all streams are parsed, the video metadata is obtained; it includes the audio track description information of each audio track and the subtitle description information of each subtitle, where each piece of audio track description information includes the language information of that audio track, and each piece of subtitle description information includes the language information of that subtitle.
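The parse flow above can be sketched in Python. This is a minimal illustration only: the stream dictionaries below are a hypothetical in-memory representation of the extracted stream information (in practice a tool such as FFmpeg/ffprobe would supply it), not the patent's actual data format.

```python
# Sketch: split parsed streams into audio track description information and
# subtitle description information. The stream structure is an assumption.
def extract_metadata(streams):
    """Return (audio track descriptions, subtitle descriptions)."""
    tracks, subtitles = [], []
    for stream in streams:
        info = {"index": stream["index"], "title": stream.get("title", "")}
        if stream["type"] == "audio":
            tracks.append(info)       # title field carries the language info
        elif stream["type"] == "subtitle":
            subtitles.append(info)
    return tracks, subtitles

# Hypothetical parse result for a multi-track, multi-subtitle source file.
streams = [
    {"index": 1, "type": "audio", "title": "Cantonese 64K"},
    {"index": 2, "type": "audio", "title": "English 64K"},
    {"index": 3, "type": "subtitle", "title": "Simplified Chinese"},
]
tracks, subs = extract_metadata(streams)
```

Each entry keeps only the stream index and title field, which is all the matching steps below need.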
Step 202, determining the audio track matched with the pre-configured language priority information according to the audio track description information of each audio track included in the video metadata, and setting the matched audio track as a transcoding audio track.
The language priority information may be configured in advance according to the transcoding requirement and indicates a preferred language; for example, it may be configured as: Cantonese first, English first, etc.
During configuration, language priority information can be configured for each source video file separately, for example: configure "Cantonese first" for source video file 1. Alternatively, one global language priority configuration can be applied to all source video files. For a large-batch film source transcoding scenario, global configuration of the language priority information further improves transcoding efficiency.
In a specific implementation, each language priority information is further configured with a corresponding keyword, and the configured keywords may include one or more keywords, for example:
"Cantonese first": the corresponding keywords may include at least one of several common names for Cantonese (e.g., Cantonese, Yue, Guangdong dialect, Baihua);
"English first": the corresponding keywords may include at least one of: English, Eng.
After a source video file is parsed to obtain video metadata, an audio track matching the pre-configured language priority information can be determined according to the audio track description information of each audio track included in the video metadata, and the matched audio track is set as the transcoding audio track. A specific setting method for the transcoding audio track may be as follows:
acquiring a keyword corresponding to the language priority information;
detecting audio track description information matched with the keywords;
and setting the audio track corresponding to the audio track description information matched with the key words as the transcoding audio track.
Specifically, when detecting track description information matching the keywords, it may be detected whether each track description information includes the keywords, and if so (i.e., includes at least one keyword), the track description information is considered to match the keywords, otherwise (i.e., none of the keywords is included), the track description information is considered to not match the keywords.
In addition, to improve the matching success rate and further improve transcoding efficiency, a priority order of languages may also be indicated in the language priority information, for example: Cantonese > English, i.e., English has the lower priority, with corresponding keywords configured for the language at each priority level. When setting the transcoding audio track, audio track description information matching the highest-priority language is searched for first; if none exists, audio track description information matching the next-priority language is searched for, and so on until matching audio track description information is found.
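The priority-ordered keyword matching just described can be sketched as follows; the keyword lists are illustrative assumptions (see the hedged examples above), not the patent's actual configuration.

```python
# Sketch: walk the languages in priority order and return the first stream
# description whose title contains any keyword of that language.
PRIORITY = [
    ("cantonese", ["cantonese", "yue"]),   # highest priority
    ("english", ["english", "eng"]),       # lowest priority
]

def pick_stream(descriptions, priority=PRIORITY):
    """Return the first description matching the highest-priority language."""
    for _lang, keywords in priority:
        for desc in descriptions:
            if any(k in desc.lower() for k in keywords):
                return desc
    return None  # no match: fall back to a default or manual selection

best = pick_stream(["English 64K", "Cantonese 64K"])
```

Because languages are tried in priority order, "Cantonese 64K" is chosen even though the English track appears first in the list; the same routine works for subtitle description information.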
Step 203, determining a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoding subtitle.
The specific setting method of the transcoded subtitles can be as follows:
acquiring a keyword corresponding to the language priority information;
detecting caption description information matched with the keywords;
and setting the subtitle corresponding to the subtitle description information matched with the keyword as a transcoding subtitle.
Specifically, when detecting the subtitle description information that matches the keyword, it may be detected whether each piece of subtitle description information includes the keyword, and if so (i.e., includes at least one keyword), the piece of subtitle description information is considered to match the keyword, otherwise (i.e., none of the keywords is included), the piece of subtitle description information is considered to not match the keyword.
The above describes the setting of the transcoding subtitle when the source video file carries subtitles; subtitles generated this way are soft subtitles. In addition, if the video metadata contains no subtitle description information, i.e., the source video file has no subtitles, a subtitle file with the same name as the source video file can be searched for; the found same-name subtitle file is loaded together with the source video file and burned in during transcoding to generate hard subtitles. That is, when the source video file has no subtitles, subtitles can still be obtained automatically during transcoding without manual intervention.
Mainstream subtitle files generally come in two formats: SRT (SubRip Text) and ASS (Advanced SubStation Alpha), while hard-subtitle generation supports only the ASS format. Therefore, if the same-name subtitle file found is in SRT format, it needs to be converted into an ASS subtitle file, and the ASS subtitle file is then loaded together with the source video file.
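The SRT-to-ASS conversion and hard-subtitle burn-in can be sketched with the ffmpeg CLI (ffmpeg can both convert SRT to ASS and burn ASS subtitles in via its `ass` video filter). The commands are only assembled here, not executed, and all file names are hypothetical:

```python
# Sketch: build the ffmpeg commands for the hard-subtitle path.
from pathlib import Path

def hard_subtitle_commands(video_path, subtitle_path):
    """Return the ffmpeg command lines (as argument lists) to run in order."""
    cmds = []
    sub = Path(subtitle_path)
    if sub.suffix.lower() == ".srt":
        # Convert SRT to ASS first, since burn-in below assumes ASS input.
        ass = sub.with_suffix(".ass")
        cmds.append(["ffmpeg", "-i", str(sub), str(ass)])
        sub = ass
    # Burn the ASS subtitle into the video stream (hard subtitles).
    out = str(Path(video_path).with_suffix("")) + "_hardsub.mp4"
    cmds.append(["ffmpeg", "-i", video_path, "-vf", f"ass={sub}", out])
    return cmds

cmds = hard_subtitle_commands("movie.mkv", "movie.srt")
```

In a real pipeline each argument list would be passed to `subprocess.run`; here the conversion step is emitted only when the same-name file is SRT, matching the text above.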
In addition, to further meet user requirements, the name of the source video file can be set as a watermark and added to the transcoded video. A specific setting method for the watermark may be as follows:
generate watermark text according to the file name description information included in the video metadata; the watermark text may be the name of the source video file;
determine a preset position, which can be set according to actual requirements, for example the top left or top right of the video;
set the generated watermark text at the preset position of the video.
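The watermark step can be sketched using ffmpeg's `drawtext` filter, which overlays text on the video. The position expressions and font size below are illustrative assumptions, not values from the patent:

```python
# Sketch: build a drawtext filter that places the file name as a watermark.
from pathlib import Path

POSITIONS = {                        # x:y expressions understood by drawtext
    "top_left": "x=10:y=10",
    "top_right": "x=w-tw-10:y=10",   # w/tw = video width / text width
}

def watermark_filter(video_path, position="top_right"):
    name = Path(video_path).stem     # watermark text = source file name
    return f"drawtext=text='{name}':{POSITIONS[position]}:fontsize=24"

f = watermark_filter("movie.mkv")
```

The returned string would be passed to ffmpeg via `-vf`, alongside any subtitle filter.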
And step 204, transcoding the source video file according to the setting.
Video transcoding converts a compressed, encoded video stream into another video stream, so as to adapt to different network bandwidths, different terminal processing capabilities, and different user requirements. Specifically, in this embodiment, a multi-track, multi-subtitle source video file is transcoded to output an audio track in a fixed language and a subtitle in a fixed language.
In addition, after the source video file is analyzed to obtain the video metadata, the audio track list and the subtitle list can be generated and displayed according to the video metadata. That is, a track list may be generated and displayed according to the track description information of each track included in the video metadata, and the track list may include language information of each track; and generating and displaying a subtitle list according to the subtitle description information of each subtitle included in the video metadata, wherein the subtitle list includes language information of each subtitle. The track list and the subtitle list may be displayed in a drop-down list manner.
In the previous step 202, if no track description information matching the keyword is detected, the first track in the generated track list may be set as the transcoding track by default, and likewise, in the step 203, if no subtitle description information matching the keyword is detected, the first subtitle in the generated subtitle list may be set as the transcoding subtitle by default.
Alternatively, when no audio track description information matching the keywords is detected, prompt information may be generated to prompt the user to manually select an audio track from the generated audio track list, and the transcoding audio track is then set according to the user's selection; likewise, when no subtitle description information matching the keywords is detected, prompt information may be generated to prompt the user to manually select a subtitle from the generated subtitle list, and the transcoding subtitle is then set according to the user's selection.
When the track list and the subtitle list are generated, the audio track selected by the user from the track list can be directly obtained without configuring language priority information, and the audio track selected by the user is set as a transcoding audio track; and acquiring the subtitle selected by the user from the subtitle list, and setting the subtitle selected by the user as the transcoding subtitle.
According to the embodiment of the application, the audio track list and the subtitle list are generated from the video metadata; the audio track list includes the language information of each audio track, and the subtitle list includes the language information of each subtitle. A user can therefore see at a glance every audio track and subtitle included in the source video file and accurately select those in the required language. This avoids the problem in the existing manual transcoding scheme where, because audio track and subtitle identifiers are not intuitive and do not display the language, the user must first play the source video file in a video player to confirm the languages of the audio tracks and subtitles; user operation is thereby simplified.
In this embodiment, a matching method is adopted: the transcoding audio track and transcoding subtitle are automatically set using the video metadata and the pre-configured language priority information, and the watermark is automatically set according to the video metadata, thereby realizing automatic transcoding.
As shown in fig. 3a, a specific flow of the video transcoding method of the present embodiment may be as follows:
and 301, configuring language priority information according to the operation of the user.
In this embodiment, the configured language priority information is taken to be global language priority information as an example. Specifically, the language priority information may be configured according to the transcoding requirement, and a preferred language may be specified in it, for example: Cantonese first or English first. Each language priority configuration has corresponding keywords, which may include one or more keywords; for example, the configured language priority information and corresponding keywords may be as shown in Table 1 below:
Language priority information | Keywords
Cantonese first               | Cantonese, Yue, Guangdong dialect, Baihua
English first                 | English, Eng
TABLE 1
Step 302, parsing the source video file to obtain video metadata.
The source video files refer to video files to be transcoded, and each source video file can comprise audio tracks of multiple languages and subtitles of multiple languages besides video data. Each source video file itself has video metadata, which is mainly data describing the attributes of the source video file, such as: author, name, copyright, version, format, authoring time, language, etc. of a video file, video metadata is generally stored in a key-value format.
Audio tracks and subtitles are stored in the source video file in the abstract format of a stream: an audio track in one language is one stream, and a subtitle in one language is another. Information on each stream can be described in the video metadata; the detailed description information of each stream is generally stored in the title field of the video metadata, and in practice the title field typically describes the audio track or subtitle together with its language information.
In a specific implementation, the source video file may be parsed as follows to obtain the video metadata:
open a parse handle, perform stream parsing, and obtain the number of streams contained in the source video file;
read a stream handle, obtain the stream type, and extract the content of the title field of the video metadata;
repeat the above steps until all streams have been parsed.
The parse results are shown in Table 2 below:
Stream index | Stream type | title field content
1            | Audio track | Cantonese 64K
2            | Audio track | English 64K
3            | Subtitle    | Simplified Chinese
4            | Subtitle    | English
5            | Subtitle    | Chinese-English dual display
TABLE 2
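With the parse results of Table 2 and keywords like those of Table 1, the matching in steps 303 and 305 reduces to a lookup over the title fields. A sketch (keyword spellings are illustrative assumptions):

```python
# Sketch: select the transcoding stream index from Table 2 by keyword match.
TABLE_2 = [
    (1, "audio", "Cantonese 64K"),
    (2, "audio", "English 64K"),
    (3, "subtitle", "Simplified Chinese"),
    (4, "subtitle", "English"),
    (5, "subtitle", "Chinese-English dual display"),
]

def match_index(stream_type, keywords):
    """Return the index of the first stream whose title contains a keyword."""
    for index, stype, title in TABLE_2:
        if stype == stream_type and any(k in title.lower() for k in keywords):
            return index
    return None  # no match: fall back to the first entry of the list

track = match_index("audio", ["cantonese", "yue"])        # expect stream 1
subtitle = match_index("subtitle", ["chinese"])           # expect stream 3
```

Under "Cantonese first", stream 1 becomes the transcoding audio track; a Chinese keyword selects stream 3 as the transcoding subtitle.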
After all streams are analyzed, the video metadata can be obtained, wherein the video metadata comprises audio track description information of each audio track and subtitle description information of each subtitle, each audio track description information comprises language information of the audio track, and each subtitle description information comprises language information of the subtitle.
After the source video file is analyzed to obtain the video metadata, an audio track list and a subtitle list can be generated and displayed according to the video metadata. That is, a track list may be generated and displayed according to the track description information of each track included in the video metadata, and the track list may include language information of each track; and generating and displaying a subtitle list according to the subtitle description information of each subtitle included in the video metadata, wherein the subtitle list includes language information of each subtitle. The track list and the subtitle list may be displayed in a pull-down list manner.
Step 303, setting a transcoding audio track according to the preconfigured language priority information and the audio track description information of each audio track included in the video metadata.
The specific setting method of the transcoded audio track can be seen in fig. 3b, which includes the following steps:
3031, obtaining keywords corresponding to language priority information;
step 3032, detecting whether audio track description information matched with the keywords exists in the video metadata, if so, executing step 3033, otherwise, executing step 3034;
specifically, when detecting the track description information matching the keywords, it may be detected whether each track description information includes the keywords, and if so (i.e., includes at least one keyword), the track description information is considered to match the keywords, otherwise (i.e., none of the keywords is included), the track description information is considered not to match the keywords.
Step 3033, setting the audio track corresponding to the audio track description information matched with the keyword as a transcoding audio track;
step 3034, setting the first track in the list of tracks as the transcoded track.
In addition, in step 3032, if no audio track description information matching the keywords is detected, prompt information may be generated to prompt the user to manually select an audio track from the generated audio track list, and the transcoding audio track is then set according to the user's selection.
Step 304, determining whether the video metadata includes subtitle description information, if so, performing step 305, otherwise, performing step 306.
And 305, setting the transcoded subtitles according to the preconfigured language priority information and the subtitle description information of each subtitle included in the video metadata.
As shown in fig. 3c, a specific setting method of the transcoded subtitles includes the following steps:
3051, acquiring keywords corresponding to language priority information;
step 3052, detecting whether caption description information matched with the keywords exists in the video metadata, if so, executing step 3053, otherwise, executing step 3054;
specifically, when detecting the subtitle description information matching the keyword, it may be detected whether each subtitle description information includes the keyword, and if so (including at least one keyword), the subtitle description information is considered to be matched with the keyword, otherwise (none of them), the subtitle description information is considered to be not matched with the keyword.
Step 3053, setting the subtitle corresponding to the subtitle description information matched with the keyword as a transcoded subtitle;
step 3054, setting the first subtitle in the subtitle list as the transcoded subtitle.
In addition, in step 3052, if no subtitle description information matching the keywords is detected, prompt information may be generated to prompt the user to manually select a subtitle from the generated subtitle list, and the transcoding subtitle is then set according to the user's selection.
Step 306, loading a subtitle file with the same name as the source video file.
If the video metadata contains no subtitle description information, i.e., the source video file has no subtitles, a subtitle file with the same name as the source video file can be searched for; the found same-name subtitle file and the source video file are then loaded into the same group of data and transcoded to generate hard subtitles. That is, when the source video file has no subtitles, subtitles can still be obtained automatically during transcoding without manual intervention.
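The same-name lookup of step 306 might be sketched as follows, assuming a small set of common subtitle extensions (the extension list and function name are illustrative, not specified by the patent):

```python
import tempfile
from pathlib import Path

def find_same_name_subtitle(source_video, subtitle_exts=(".srt", ".ass", ".ssa")):
    """Search for a subtitle file sharing the source video file's name
    (step 306); return the first match or None if there is none."""
    video = Path(source_video)
    for ext in subtitle_exts:
        candidate = video.with_suffix(ext)   # same name, subtitle extension
        if candidate.exists():
            return candidate
    return None

# Demonstration with a throwaway directory:
d = Path(tempfile.mkdtemp())
(d / "movie.mkv").touch()
(d / "movie.srt").touch()
```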
Step 307, generating watermark text according to the file name description information included in the video metadata, and placing the generated watermark text at a preset position of the video.
The specific setting method of the watermark text is shown in fig. 3d and includes the following steps:
step 3071, generating watermark text according to the file name description information included in the video metadata;
the generated watermark text may be the name of the source video file.
Step 3072, determining a preset position;
the preset position can be set according to actual requirements; for example, the preset position may be the top left or the top right of the video.
Step 3073, placing the generated watermark text at the preset position of the video.
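A minimal sketch of steps 3071–3073, assuming drawtext-style position expressions (the position names and the coordinate scheme are illustrative assumptions, not specified by the patent):

```python
from pathlib import Path

def build_watermark(source_video, position="top_right", margin=10):
    """Derive the watermark text from the source file name (step 3071),
    determine a preset position (step 3072), and return both together
    (step 3073)."""
    text = Path(source_video).stem                     # file name as text
    anchors = {                                        # preset positions
        "top_left":  ("%d" % margin, "%d" % margin),
        "top_right": ("w-tw-%d" % margin, "%d" % margin),
    }
    x, y = anchors[position]
    return {"text": text, "x": x, "y": y}

wm = build_watermark("/films/movie.2017.mkv")
```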
Step 308, transcoding the source video file according to the settings.
Video transcoding converts a video stream that has already been compressed and encoded into another video stream, so as to adapt to different network bandwidths, different terminal processing capabilities, and different user requirements. Specifically, in this embodiment, a source video file with multiple audio tracks and multiple subtitles is transcoded into an audio track of a fixed language and a subtitle of a fixed language and then output.
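As one possible concrete form of such transcoding, the selected audio track and subtitle could be handed to ffmpeg roughly as follows (a sketch only; the patent does not specify the transcoder, container, or codecs — the `-map` stream selection and the `subtitles` filter's `si` stream-index option follow ffmpeg conventions):

```python
def build_transcode_cmd(src, dst, audio_index, subtitle_index):
    """Build an ffmpeg command that keeps one audio track of the chosen
    language and burns one subtitle stream in as hard subtitles."""
    return [
        "ffmpeg", "-i", src,
        "-map", "0:v:0",                                     # first video stream
        "-map", "0:a:%d" % audio_index,                      # matched audio track
        "-vf", "subtitles=%s:si=%d" % (src, subtitle_index), # hard subtitles
        dst,
    ]

cmd = build_transcode_cmd("in.mkv", "out.mp4", 1, 0)
```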
In this embodiment, a matching method is adopted: the transcoding audio track and the transcoded subtitle are set automatically using the video metadata and the pre-configured language priority information, and the watermark is set automatically according to the video metadata, thereby realizing automatic transcoding.
For a scene in which a large batch of film sources is transcoded, with the method of this embodiment only one global piece of language priority information needs to be configured, which greatly simplifies operation and improves transcoding efficiency.
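The batch scenario can be sketched as a loop in which a single global priority list drives every per-file selection (all names here are illustrative stand-ins for the steps of fig. 3, supplied by the caller):

```python
LANGUAGE_PRIORITY = ["chi", "eng"]   # the single global configuration

def batch_transcode(source_files, parse, select, transcode):
    """Apply the same pre-configured language priority to every source
    file; parse, select, and transcode are caller-supplied callables
    standing in for metadata parsing, track/subtitle matching, and the
    transcoding step itself."""
    for src in source_files:
        metadata = parse(src)
        audio = select(metadata["tracks"], LANGUAGE_PRIORITY)
        subtitle = select(metadata["subtitles"], LANGUAGE_PRIORITY)
        transcode(src, audio, subtitle)
```

Because the priority list is global, adding a new source file to the batch requires no extra configuration at all.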
In another embodiment, transcoding can also be performed directly according to the audio track and the subtitle selected by the user. As shown in fig. 4, the video transcoding method of this embodiment includes:
step 401, parsing the source video file to obtain video metadata.
Step 402, generating and displaying a track list according to the track description information of each track included in the video metadata.
The audio track list may include language information of each audio track, and the generated audio track list may be displayed in the form of a drop-down list.
Step 403, generating a subtitle list according to the subtitle description information of each subtitle included in the video metadata and displaying the subtitle list.
The subtitle list includes language information of each subtitle, and the generated subtitle list may be displayed in the form of a drop-down list.
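List generation in steps 402–403 amounts to mapping each stream's description information to a display entry; a minimal sketch (the metadata field names are assumptions for illustration):

```python
def build_language_list(stream_descriptions):
    """Turn per-stream description information into drop-down entries so
    the user can see each stream's language directly; fall back to a
    title field, then to 'unknown', when no language tag is present."""
    return [
        {"index": i, "label": desc.get("language", desc.get("title", "unknown"))}
        for i, desc in enumerate(stream_descriptions)
    ]

track_list = build_language_list([{"language": "eng"}, {"language": "chi"}])
```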
Step 404, acquiring the track selected by the user from the track list, and setting the track selected by the user as a transcoding track.
Step 405, acquiring the subtitle selected by the user from the subtitle list, and setting the subtitle selected by the user as the transcoded subtitle.
Step 406, generating watermark text according to the file name description information included in the video metadata, and placing the generated watermark text at a preset position of the video.
Step 407, transcoding the source video file according to the settings.
In this embodiment of the application, the audio track list and the subtitle list are generated from the video metadata; the generated audio track list includes the language information of each audio track, and the generated subtitle list includes the language information of each subtitle. The user can thus see at a glance every audio track and subtitle included in the source video file and accurately select the audio track and subtitle of the required language. This solves the problem in the existing manual transcoding scheme that, because audio track and subtitle identifiers are not intuitive and do not display the language, the user has to play the source video file in a video player beforehand to confirm the languages of the audio tracks and subtitles, and it therefore simplifies user operation.
In order to better implement the above method, an embodiment of the present invention further provides a video transcoding apparatus, as shown in fig. 5a, the video transcoding apparatus includes a parsing unit 501, a track setting unit 502, a subtitle setting unit 503, and a transcoding unit 504, as follows:
a parsing unit 501, configured to parse a source video file to obtain video metadata;
an audio track setting unit 502 configured to determine an audio track matching the pre-configured language priority information according to audio track description information of each audio track included in the video metadata, and set the matching audio track as a transcoding audio track;
a subtitle setting unit 503, configured to determine, according to subtitle description information of each subtitle included in the video metadata, a subtitle matching the pre-configured language priority information, and set the matched subtitle as a transcoded subtitle;
a transcoding unit 504, configured to transcode the source video file according to the setting.
In some embodiments, as shown in fig. 5b, the apparatus further includes a watermark setting unit 505, where the watermark setting unit 505 is configured to generate watermark text according to file name description information included in the video metadata, and set the generated watermark text in a preset position of the video.
In some embodiments, as shown in fig. 5b, the track setting unit 502 comprises:
a first obtaining unit 5021, configured to obtain a keyword corresponding to the language priority information;
a first detection unit 5022 for detecting the track description information matched with the keyword;
a first setting unit 5023 for setting an audio track corresponding to the audio track description information matching the keyword as a transcoded audio track.
In some embodiments, as shown in fig. 5b, the subtitle setting unit 503 includes:
a second obtaining unit 5031, configured to obtain a keyword corresponding to the language priority information;
a second detecting unit 5032, configured to detect subtitle description information matching the keyword;
a second setting unit 5033, configured to set a subtitle corresponding to the subtitle description information that matches the keyword as a transcoded subtitle.
In some embodiments, as shown in fig. 5b, the apparatus further comprises a track list generating unit 506 and a subtitle list generating unit 507, wherein the track list generating unit 506 is configured to generate and display a track list according to track description information of each track included in the video metadata, the track list including language information of each track; the subtitle list generating unit 507 is configured to generate and display a subtitle list according to the subtitle description information of each subtitle included in the video metadata, where the subtitle list includes language information of each subtitle.
In some embodiments, the first setting unit 5023 is further configured to set the first audio track in the audio track list as the transcoding audio track when the first detection unit 5022 detects no audio track description information matching the keyword;
the second setting unit 5033 is further configured to set the first subtitle in the subtitle list as the transcoded subtitle when the second detecting unit 5032 does not detect the subtitle description information matching the keyword.
In some embodiments, the track setting unit 502 is further configured to obtain a track selected by the user from the track list, and set the track selected by the user as a transcoded track; and
the subtitle setting unit 503 is further configured to obtain a subtitle selected by the user from the subtitle list, and set the subtitle selected by the user as a transcoded subtitle.
In some embodiments, as shown in fig. 5b, the apparatus further includes a loading unit 508, and the loading unit 508 is configured to load a same-name subtitle file of the source video file when there is no subtitle description information in the video metadata.
In some embodiments, as shown in fig. 5b, the apparatus further includes a configuration unit 509, where the configuration unit 509 is configured to configure language priority information according to an operation of a user.
In this embodiment, the parsing unit parses a source video file to obtain video metadata; the audio track setting unit determines an audio track matching the pre-configured language priority information according to the audio track description information of each audio track included in the video metadata, and sets the matched audio track as the transcoding audio track; the subtitle setting unit determines a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata, and sets the matched subtitle as the transcoded subtitle; and the transcoding unit transcodes the source video file according to these settings. By adopting a matching method, the apparatus of this embodiment can automatically set the transcoding audio track and the transcoded subtitle using the video metadata and the pre-configured language priority information, thereby realizing automatic transcoding.
An embodiment of the present invention further provides a video transcoding device, as shown in fig. 6, which shows a schematic structural diagram of a device according to an embodiment of the present invention, specifically:
the apparatus may include components such as a processor 601 of one or more processing cores, memory 602 of one or more computer-readable storage media, Radio Frequency (RF) circuitry 603, a power supply 604, an input unit 605, and a display unit 606. Those skilled in the art will appreciate that the device configuration shown in fig. 6 does not constitute a limitation of the device and may include more or fewer components than shown, or some components may be combined, or a different arrangement of components. Wherein:
the processor 601 is a control center of the apparatus, connects various parts of the entire apparatus using various interfaces and lines, performs various functions of the apparatus and processes data by running or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby monitoring the entire apparatus. Alternatively, processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 executes various functional applications and data processing by running the software programs and modules stored in the memory 602. The memory 602 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the device, and the like. Further, the memory 602 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 602 may also include a memory controller to provide the processor 601 with access to the memory 602.
RF circuit 603 may be used for receiving and transmitting signals during the process of transmitting and receiving information, and in particular, for receiving downlink information of a base station and then processing the received downlink information by one or more processors 601; in addition, data relating to uplink is transmitted to the base station. In general, the RF circuitry 603 includes, but is not limited to, an antenna, at least one Amplifier, a tuner, one or more oscillators, a Subscriber Identity Module (SIM) card, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like. In addition, the RF circuitry 603 may also communicate with networks and other devices via wireless communications. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Message Service (SMS), etc.
The apparatus further comprises a power supply 604 (e.g. a battery) for supplying power to the various components, and preferably, the power supply 604 is logically connected to the processor 601 via a power management system, so that functions of managing charging, discharging, and power consumption are implemented via the power management system. The power supply 604 may also include any component including one or more dc or ac power sources, recharging systems, power failure detection circuitry, power converters or inverters, power status indicators, and the like.
The apparatus may further include an input unit 605, and the input unit 605 may be used to receive input numeric or character information and generate a keyboard, mouse, joystick, optical or trackball signal input in relation to user settings and function control. In particular, in a particular embodiment, input unit 605 may include a touch-sensitive surface as well as other input devices. The touch-sensitive surface, also referred to as a touch display screen or a touch pad, may collect touch operations by a user (e.g., operations by a user on or near the touch-sensitive surface using a finger, a stylus, or any other suitable object or attachment) thereon or nearby, and drive the corresponding connection device according to a predetermined program. Alternatively, the touch sensitive surface may comprise two parts, a touch detection means and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 601, and can receive and execute commands sent by the processor 601. In addition, the touch sensitive surface can be implemented in various types, such as resistive, capacitive, infrared, and surface acoustic wave. The input unit 605 may include other input devices in addition to a touch-sensitive surface. In particular, other input devices may include, but are not limited to, one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The device may also include a display unit 606, which display unit 606 may be used to display information input by or provided to the user, as well as various graphical user interfaces of the device, which may be made up of graphics, text, icons, video, and any combination thereof. The Display unit 606 may include a Display panel, and optionally, the Display panel may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like. Further, the touch-sensitive surface may overlay the display panel, and when a touch operation is detected on or near the touch-sensitive surface, the touch operation is transmitted to the processor 601 to determine the type of the touch event, and then the processor 601 provides a corresponding visual output on the display panel according to the type of the touch event. Although in FIG. 6 the touch-sensitive surface and the display panel are two separate components to implement input and output functions, in some embodiments the touch-sensitive surface may be integrated with the display panel to implement input and output functions.
Although not shown, the device may further include a camera, a bluetooth module, etc., which will not be described herein. Specifically, in this embodiment, the processor 601 in the apparatus loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 601 runs the application programs stored in the memory 602, thereby implementing various functions as follows:
analyzing a source video file to obtain video metadata;
determining a track matched with pre-configured language priority information according to track description information of each track included in the video metadata, and setting the matched track to be a transcoding track;
determining a subtitle matched with pre-configured language priority information according to subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoded subtitle;
and transcoding the source video file according to the setting.
Further, before transcoding the source video file according to the settings, the processor 601 is further configured to perform the following steps:
and generating watermark characters according to the file name description information included in the video metadata, and setting the generated watermark characters at a preset position of the video.
Specifically, when determining a track matching with pre-configured language priority information from track description information of each track included in the video metadata and setting the matching track as a transcoding track, the processor 601 is configured to perform the following steps:
acquiring a keyword corresponding to the language priority information;
detecting audio track description information matched with the keywords;
and setting the audio track corresponding to the audio track description information matched with the key words as a transcoding audio track.
Specifically, when determining a subtitle matching the pre-configured language priority information according to the subtitle description information of each subtitle included in the video metadata and setting the matched subtitle as the transcoded subtitle, the processor 601 is configured to perform the following steps:
acquiring a keyword corresponding to the language priority information;
detecting caption description information matched with the keywords;
and setting the subtitle corresponding to the subtitle description information matched with the keyword as a transcoded subtitle.
Further, after parsing the source video file to obtain the video metadata, the processor 601 is further configured to perform the following steps:
generating and displaying a track list according to track description information of each track included in the video metadata, wherein the track list includes language information of each track;
and generating and displaying a subtitle list according to the subtitle description information of each subtitle included in the video metadata, wherein the subtitle list includes language information of each subtitle.
Further, the processor 601 is further configured to perform the following steps:
if the track description information matched with the keywords is not detected, setting the first track in the track list as a transcoding track; and
and if the subtitle description information matched with the keywords is not detected, setting the first subtitle in the subtitle list as the transcoded subtitle.
Further, the processor 601 is further configured to perform the following steps:
acquiring an audio track selected by a user from the audio track list, and setting the audio track selected by the user as a transcoding audio track; and
and acquiring the caption selected by the user from the caption list, and setting the caption selected by the user as a transcoding caption.
Further, the processor 601 is further configured to perform the following steps:
and if the video metadata does not have the subtitle description information, loading the subtitle file with the same name of the source video file.
Further, before parsing the source video file to obtain the video metadata, the processor 601 is further configured to perform the following steps:
and configuring language priority information according to the operation of a user.
The video transcoding device of this embodiment can, through a matching method, automatically set the transcoding audio track and the transcoded subtitle using the video metadata and the pre-configured language priority information, thereby realizing automatic transcoding. Throughout the transcoding process only the language priority information needs to be configured, and the transcoding audio track and the transcoded subtitle do not need to be set manually, which saves labor cost and improves transcoding efficiency.
An embodiment of the present application further provides a storage device, where the storage device stores a computer program, and when the computer program runs on a computer, the computer is caused to execute the video transcoding method in any one of the foregoing embodiments, for example: parsing a source video file to obtain video metadata; determining an audio track matching pre-configured language priority information according to audio track description information of each audio track included in the video metadata, and setting the matched audio track as a transcoding audio track; determining a subtitle matching the pre-configured language priority information according to subtitle description information of each subtitle included in the video metadata, and setting the matched subtitle as a transcoded subtitle; and transcoding the source video file according to the settings.
In the embodiment of the present application, the storage device may be a magnetic disk, an optical disk, a Read Only Memory (ROM), a Random Access Memory (RAM), or the like.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
It should be noted that, for the video transcoding method of the embodiment of the present application, it can be understood by a person having ordinary skill in the art that all or part of the process of implementing the video transcoding method of the embodiment of the present application can be implemented by controlling the relevant hardware through a computer program, where the computer program can be stored in a computer readable storage medium, such as a memory of an electronic device, and executed by at least one processor in the electronic device, and during the execution, the process of the embodiment of the video transcoding method can be included. The storage medium may be a magnetic disk, an optical disk, a read-only memory, a random access memory, etc.
For the video transcoding device according to the embodiment of the present application, each functional module may be integrated into one processing chip, or each module may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium, such as a read-only memory, a magnetic or optical disk, or the like.
The video transcoding method, the video transcoding device, and the storage device provided in the embodiments of the present application are described in detail above, and specific examples are applied in the description to explain the principles and embodiments of the present application, and the description of the embodiments above is only used to help understand the method and the core idea of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, the specific implementation manner and the application scope may be changed, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (3)

1. A method of video transcoding, comprising:
opening an analysis handle, executing stream analysis, and acquiring a plurality of stream data contained in a source video file;
reading a flow handle corresponding to each flow data to obtain the flow type of each flow data;
determining first stream data corresponding to subtitles and second stream data corresponding to an audio track contained in the source video file according to the stream types;
extracting title fields in video metadata corresponding to the first stream data and the second stream data to obtain audio track description information corresponding to each audio track and subtitle description information corresponding to each subtitle;
obtaining language priority information, and determining a keyword corresponding to the language priority information;
when a target audio track and a target subtitle which are matched with the keywords exist in the audio track description information and the subtitle description information, setting the target audio track as a transcoding audio track and setting the target subtitle as a transcoding subtitle;
when the target audio track and the target subtitle which are matched with the keywords do not exist in the audio track description information and the subtitle description information, generating and displaying an audio track list according to the audio track description information of each audio track included in the video metadata, wherein the audio track list includes language information of each audio track;
generating a subtitle list according to subtitle description information of each subtitle included in the video metadata and displaying the subtitle list, wherein the subtitle list includes language information of each subtitle;
acquiring an audio track selected by a user from the audio track list, and setting the audio track selected by the user as the transcoding audio track;
acquiring the subtitles selected by the user from the subtitle list, and setting the subtitles selected by the user as the transcoded subtitles;
transcoding the source video file according to the setting;
the method further comprises the following steps:
if the subtitle description information does not exist in the video metadata, searching a same-name subtitle file of the source video file; loading and pressing the same-name subtitle file and the source video file in the same group of data, and transcoding to generate hard subtitles;
when the format of the same-name subtitle file is not the format supported by the hard subtitles, converting the format of the same-name subtitle file into the format supported by the hard subtitles, and loading and pressing the same-name subtitle file after format conversion and the source video file into the same group of data;
before the transcoding the source video file according to the setting, further comprising:
automatically generating watermark characters according to file name description information included in the video metadata, wherein the generated watermark characters comprise the name of the source video file; determining a preset position; and setting the watermark characters at the preset position of the video.
2. A video transcoding apparatus, comprising:
the analysis unit is used for opening an analysis handle, executing stream analysis and acquiring a plurality of stream data contained in the source video file; reading a flow handle corresponding to each flow data to obtain the flow type of each flow data; determining first stream data corresponding to subtitles and second stream data corresponding to an audio track contained in the source video file according to the stream types; extracting title fields in video metadata corresponding to the first stream data and the second stream data to obtain audio track description information corresponding to each audio track and subtitle description information corresponding to each subtitle;
the setting unit is used for acquiring language priority information and determining keywords corresponding to the language priority information; when a target audio track and a target subtitle which are matched with the keywords exist in the audio track description information and the subtitle description information, setting the target audio track as a transcoding audio track and setting the target subtitle as a transcoding subtitle;
a sound track list display unit, configured to generate and display a sound track list according to sound track description information of each sound track included in the video metadata when the target sound track and the target subtitle matching the keyword do not exist in the sound track description information and the subtitle description information, where the sound track list includes language information of each sound track;
the caption list display unit is used for generating and displaying a caption list according to the caption description information of each caption included in the video metadata, wherein the caption list includes language information of each caption;
an audio track setting unit configured to acquire an audio track selected by a user from the audio track list, and set the audio track selected by the user as the transcoded audio track;
the subtitle setting unit is used for acquiring the subtitle selected by the user from the subtitle list and setting the subtitle selected by the user as the transcoded subtitle;
the transcoding unit is used for transcoding the source video file according to the setting;
a loading unit, configured to search a same-name subtitle file of the source video file if the subtitle description information is not included in the video metadata; loading and pressing the same-name subtitle file and the source video file in the same group of data, and transcoding to generate hard subtitles;
when the format of the same-name subtitle file is not the format supported by the hard subtitles, converting the format of the same-name subtitle file into the format supported by the hard subtitles, and loading and pressing the same-name subtitle file after format conversion and the source video file into the same group of data;
the device further comprises:
the watermark setting unit is used for automatically generating watermark characters according to file name description information included in the video metadata, and the generated watermark characters comprise the name of the source video file; determining a preset position; and setting the watermark characters at the preset position of the video.
3. A memory device storing a plurality of instructions adapted to be loaded by a processor and to perform the method of claim 1.
CN201711139159.1A 2017-11-16 2017-11-16 Video transcoding method and device and storage device Active CN109803173B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711139159.1A CN109803173B (en) 2017-11-16 2017-11-16 Video transcoding method and device and storage device

Publications (2)

Publication Number Publication Date
CN109803173A CN109803173A (en) 2019-05-24
CN109803173B true CN109803173B (en) 2022-08-19

Family

ID=66555616

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711139159.1A Active CN109803173B (en) 2017-11-16 2017-11-16 Video transcoding method and device and storage device

Country Status (1)

Country Link
CN (1) CN109803173B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114143613B (en) * 2021-12-03 2023-07-21 北京影谱科技股份有限公司 Video subtitle time alignment method, system and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101169783A (en) * 2007-06-15 2008-04-30 黄海 Digit water mark search engine for multimedia object
CN101330517A (en) * 2008-08-04 2008-12-24 上海维播信息技术有限公司 Method for distributing interactive paster advertisement
CN103093776A (en) * 2011-11-04 2013-05-08 腾讯科技(深圳)有限公司 Method and system of multi-audio-track content play in network seeing and hearing
US8706473B2 (en) * 2011-09-13 2014-04-22 Cisco Technology, Inc. System and method for insertion and removal of video objects
CN104487964A (en) * 2013-03-14 2015-04-01 尼尔森(美国)有限公司 Methods and apparatus to monitor media presentations
CN104598541A (en) * 2014-12-29 2015-05-06 乐视网信息技术(北京)股份有限公司 Identification method and device for multimedia file
CN105025319A (en) * 2015-07-09 2015-11-04 无锡天脉聚源传媒科技有限公司 Video pushing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104768052A (en) * 2015-04-02 2015-07-08 无锡天脉聚源传媒科技有限公司 Method and device for extracting voice frequency and subtitles according to language

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant