CN116647714A - Video generation method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN116647714A
Authority
CN
China
Prior art keywords
video
script
target
content
target video
Prior art date
Legal status
Pending
Application number
CN202310639629.XA
Other languages
Chinese (zh)
Inventor
顾廷飞
敖迎辉
陈权
胡涛
马也
王波
曹锡鹏
孙华衿
李长刚
刘一波
张云浩
杜川
岳景来
王浩
冯少云
苏璟文
李德智
张玕
陶亮
张美�
吴超
Current Assignee
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202310639629.XA
Publication of CN116647714A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N 21/23424 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs, involving splicing one content stream with another content stream, e.g. for inserting or substituting an advertisement
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N 21/44016 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs, involving splicing one content stream with another content stream, e.g. for substituting a video clip
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/47 End-user applications
    • H04N 21/488 Data services, e.g. news ticker
    • H04N 21/4884 Data services, e.g. news ticker, for displaying subtitles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/265 Mixing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N 5/278 Subtitling

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Television Signal Processing For Recording (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The disclosure relates to a video generation method, an apparatus, an electronic device, and a storage medium, belonging to the technical field of video processing. The method includes: acquiring a video script in response to a video script editing operation; and, in response to a video generation operation for the video script, processing at least one material corresponding to the video script according to the parsing result of the script to generate at least one target video, where each material corresponding to the script matches the material type and material content specified by the script, and the arrangement order of the materials in the generated target video conforms to the script's material arrangement order. At least one target video matching the video script is thus generated automatically, which simplifies the human-computer interaction involved in video generation and effectively improves video generation efficiency.

Description

Video generation method, device, electronic equipment and storage medium
Technical Field
The disclosure relates to the technical field of video processing, and in particular to a video generation method and apparatus, an electronic device, and a storage medium.
Background
Video is an effective carrier for disseminating information on the Internet. Many online information platforms, e-commerce platforms, short-video applications (APPs), and the like use video as their primary presentation format.
In the related art, a user can generate a video to be published through a video generation platform. For example, the user enters a video script on the platform and uploads the various materials required for the video; editing operations such as clipping and splicing are then performed on those materials according to the script, and the video is finally generated.
However, the human-computer interaction involved in this approach is cumbersome, and the user must spend considerable time operating the video generation platform, which makes the overall video generation process inefficient.
Disclosure of Invention
The disclosure provides a video generation method and apparatus, an electronic device, and a storage medium that can improve video generation efficiency. The technical solution of the present disclosure is as follows.
According to a first aspect of the embodiments of the present disclosure, there is provided a video generation method, applied to a server, the method including:
acquiring a video script in response to a video script editing operation, where the video script includes the material type, material content, and material arrangement order of a video to be generated;
parsing the video script in response to a video generation operation for the video script, to obtain the material type, the material content, and the material arrangement order; and
determining at least one material corresponding to the video script based on the material type and the material content, and processing the at least one material based on the material arrangement order to generate at least one target video, where the arrangement order of the materials in the target video conforms to the material arrangement order and each material matches the material type and the material content.
In this method, a video script is acquired in response to a video script editing operation; then, in response to a video generation operation for the video script, at least one material corresponding to the script is processed according to the parsing result of the script to generate at least one target video, where each material matches the material type and material content specified by the script and the arrangement order of the materials in the generated target video conforms to the script's material arrangement order. At least one target video matching the video script is thus generated automatically, which simplifies the human-computer interaction involved in video generation and effectively improves video generation efficiency.
In some embodiments, determining at least one material corresponding to the video script based on the material type and the material content, and processing the at least one material based on the material arrangement order to generate at least one target video, includes:
determining a plurality of materials corresponding to the video script based on the material type and the material content;
splicing the plurality of materials according to the material arrangement order to generate an intermediate video; and
adjusting video elements of the intermediate video based on the video content of the intermediate video to generate the target video.
In some embodiments, splicing the plurality of materials according to the material arrangement order to generate an intermediate video includes:
filtering out, from each material, the content that meets a filtering condition, and splicing the plurality of filtered materials according to the material arrangement order to generate the intermediate video.
Filtering out the content that meets the filtering condition in each material improves the accuracy of the intermediate video, so that the finally generated target video better meets the requirements of the video script.
In some embodiments, adjusting the video elements of the intermediate video based on the video content of the intermediate video to generate the target video includes at least one of:
adding subtitles to the intermediate video based on the video content of the intermediate video to obtain the target video; and
adding, to the intermediate video and based on its video content, media resources that match the video content to obtain the target video, where the media resources include one or more of pictures, dynamic special effects, audio, and prompt text.
In this way, subtitles appropriate to the video content can be added automatically, and matching media resources can likewise be added automatically, meeting personalized requirements, enriching the presentation of the video, simplifying human-computer interaction, and improving user experience.
In some embodiments, the method further includes:
adjusting picture display parameters of the target video based on the picture display effect of the target video to obtain an adjusted target video, where the picture display effect of the adjusted target video meets a picture display condition.
In this way, the picture display effect of the target video can be effectively improved.
In some embodiments, the video script further includes at least one of:
the material type, material content, and material splicing manner of picture-in-picture material;
video lines;
original text for text-to-speech (TTS) conversion; and
a background music style.
Enriching the content of the video script in this way enriches the presentation of the target video.
In some embodiments, determining at least one material corresponding to the video script based on the material type and the material content includes at least one of:
determining, from a material library and based on the material type and the material content, at least one first material corresponding to the video script; and
determining, in response to a material upload operation for the video script, at least one second material corresponding to the video script based on the material type and the material content.
Determining materials from the material library means that matching materials are retrieved automatically according to the video script, which improves video generation efficiency; responding to material upload operations meets personalized requirements for the materials of the video to be generated, which improves user experience.
In some embodiments, when a plurality of target videos are generated, each target video includes different target video elements.
In some embodiments, the target video elements include at least one of:
a subtitle style;
a style of the media resources matching the video content of the target video;
a material transition style; and
the materials determined from the material library.
In this way, multiple target videos can be generated in parallel, so that the user can choose among them as needed, which improves user experience.
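Illustratively (and outside the scope of the disclosure, which does not specify how element combinations are chosen), one way to realize "each target video includes different target video elements" is to sample distinct combinations of the listed elements; the element values in the sketch below are invented examples.

```python
import itertools

# Invented example pools; the disclosure only names the element categories.
SUBTITLE_STYLES = ["plain", "green_leaves", "vehicle_icons"]
TRANSITION_STYLES = ["cut", "fade", "wipe"]

def element_combinations(n: int):
    """Yield up to n distinct (subtitle_style, transition_style) pairs so
    that every generated target video differs in at least one element."""
    pairs = itertools.product(SUBTITLE_STYLES, TRANSITION_STYLES)
    return list(itertools.islice(pairs, n))

for subtitle_style, transition_style in element_combinations(3):
    print(f"variant -> subtitles: {subtitle_style}, transition: {transition_style}")
```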
In some embodiments, the plurality of target videos includes a first target video and at least one second target video, and the method further includes:
generating, based on the differing video elements contained in the first target video and the second target video, prompt information corresponding to the second target video, where the prompt information indicates the video elements of the second target video that differ from those of the first target video.
Generating prompt information for the second target video makes it convenient for the terminal to display, so that the user can intuitively see the differences between the videos, which simplifies human-computer interaction and improves user experience.
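A hedged sketch of how such prompt information could be derived is to diff the element assignments of the two videos; the dictionary keys and values below are assumptions for illustration only.

```python
def diff_prompt(first: dict, second: dict) -> str:
    """Build a hint listing the elements in which `second` differs from the
    reference video `first` (the keys used here are assumptions)."""
    changed = [f"{key}: {second[key]}"
               for key in second if second.get(key) != first.get(key)]
    return "Differs from video 1 in " + "; ".join(changed) if changed else ""

first_video = {"subtitle_style": "plain", "transition_style": "cut"}
second_video = {"subtitle_style": "green_leaves", "transition_style": "cut"}
print(diff_prompt(first_video, second_video))
# -> Differs from video 1 in subtitle_style: green_leaves
```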
According to a second aspect of the embodiments of the present disclosure, there is provided a video generation method, applied to a terminal, the method including:
acquiring a video script in response to a video script editing operation, where the video script includes the material type, material content, and material arrangement order of a video to be generated;
acquiring, in response to a video generation operation for the video script, at least one target video corresponding to the video script, where the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and each material matches the material type and the material content; and
displaying the at least one target video.
In some embodiments, when a plurality of target videos are generated, each target video includes different target video elements, and displaying the at least one target video includes:
displaying a first target video, at least one second target video, and prompt information corresponding to the second target video, where the prompt information indicates the video elements of the second target video that differ from those of the first target video.
Through the prompt information corresponding to the second target video, the user can intuitively see the differences between the videos, which simplifies human-computer interaction and improves user experience.
According to a third aspect of the embodiments of the present disclosure, there is provided a video generation apparatus, applied to a server, the apparatus including:
an acquisition unit configured to acquire a video script in response to a video script editing operation, the video script including the material type, material content, and material arrangement order of a video to be generated;
a parsing unit configured to parse the video script in response to a video generation operation for the video script, to obtain the material type, the material content, and the material arrangement order; and
a generation unit configured to determine at least one material corresponding to the video script based on the material type and the material content, and to process the at least one material based on the material arrangement order to generate at least one target video, where the arrangement order of the materials in the target video conforms to the material arrangement order and each material matches the material type and the material content.
In some embodiments, the generation unit includes:
a determination unit configured to determine a plurality of materials corresponding to the video script based on the material type and the material content;
a first sub-generation unit configured to splice the plurality of materials according to the material arrangement order to generate an intermediate video; and
a second sub-generation unit configured to adjust video elements of the intermediate video based on the video content of the intermediate video to generate the target video.
In some embodiments, the first sub-generation unit is configured to:
filter out, from each material, the content that meets a filtering condition, and splice the plurality of filtered materials according to the material arrangement order to generate the intermediate video.
In some embodiments, the second sub-generation unit is configured to perform at least one of:
adding subtitles to the intermediate video based on the video content of the intermediate video to obtain the target video; and
adding, to the intermediate video and based on its video content, media resources that match the video content to obtain the target video, where the media resources include one or more of pictures, dynamic special effects, audio, and prompt text.
In some embodiments, the apparatus further includes:
an adjustment unit configured to adjust picture display parameters of the target video based on the picture display effect of the target video to obtain an adjusted target video, where the picture display effect of the adjusted target video meets a picture display condition.
In some embodiments, the video script further includes at least one of:
the material type, material content, and material splicing manner of picture-in-picture material;
video lines;
original text for text-to-speech (TTS) conversion; and
a background music style.
In some embodiments, the determination unit is configured to perform at least one of:
determining, from a material library and based on the material type and the material content, at least one first material corresponding to the video script; and
determining, in response to a material upload operation for the video script, at least one second material corresponding to the video script based on the material type and the material content.
In some embodiments, when a plurality of target videos are generated, each target video includes different target video elements.
In some embodiments, the target video elements include at least one of:
a subtitle style;
a style of the media resources matching the video content of the target video;
a material transition style; and
the materials determined from the material library.
In some embodiments, the plurality of target videos includes a first target video and at least one second target video, and the generation unit is further configured to:
generate, based on the differing video elements contained in the first target video and the second target video, prompt information corresponding to the second target video, where the prompt information indicates the video elements of the second target video that differ from those of the first target video.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a video generation apparatus, applied to a terminal, the apparatus including:
a first acquisition unit configured to acquire a video script in response to a video script editing operation, the video script including the material type, material content, and material arrangement order of a video to be generated;
a second acquisition unit configured to acquire, in response to a video generation operation for the video script, at least one target video corresponding to the video script, where the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and each material matches the material type and the material content; and
a display unit configured to display the at least one target video.
In some embodiments, when a plurality of target videos are generated, each target video includes different target video elements, and the display unit is configured to perform:
displaying a first target video, at least one second target video, and prompt information corresponding to the second target video, where the prompt information indicates the video elements of the second target video that differ from those of the first target video.
According to a fifth aspect of the embodiments of the present disclosure, there is provided an electronic device, including:
one or more processors; and
a memory for storing program code executable by the one or more processors;
where the one or more processors are configured to execute the program code to implement the video generation method described above.
According to a sixth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium, where program code in the computer-readable storage medium, when executed by a processor of an electronic device, enables the electronic device to perform the video generation method described above.
According to a seventh aspect of the embodiments of the present disclosure, there is provided a computer program product including a computer program that, when executed by a processor, implements the video generation method described above.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure and do not constitute an undue limitation on the disclosure.
Fig. 1 is a schematic diagram of an implementation environment of a video generation method provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of a video generation method provided by an embodiment of the present disclosure;
Fig. 3 is a flowchart of another video generation method provided by an embodiment of the present disclosure;
Fig. 4 is a flowchart of another video generation method provided by an embodiment of the present disclosure;
Fig. 5 is a schematic diagram of a video script editing interface provided by an embodiment of the present disclosure;
Fig. 6 is a schematic diagram of displaying a target video provided by an embodiment of the present disclosure;
Fig. 7 is a schematic diagram of a video generation method provided by an embodiment of the present disclosure;
Fig. 8 is a block diagram of a video generation apparatus provided by an embodiment of the present disclosure;
Fig. 9 is a block diagram of a video generation apparatus provided by an embodiment of the present disclosure;
Fig. 10 is a block diagram of a terminal provided by an embodiment of the present disclosure;
Fig. 11 is a block diagram of a server provided by an embodiment of the present disclosure.
Detailed Description
In order to enable those skilled in the art to better understand the technical solutions of the present disclosure, the technical solutions of the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above figures are used to distinguish between similar objects and are not necessarily intended to describe a particular sequence or chronological order. It is to be understood that the data so used may be interchanged where appropriate, so that the embodiments of the disclosure described herein can be practiced in sequences other than those illustrated or described. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
It should be noted that, the information (including but not limited to user equipment information, user personal information, etc.), data (including but not limited to data for analysis, stored data, presented data, etc.), and signals related to the present disclosure are all authorized by the user or are fully authorized by the parties, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions. For example, video scripts and the like referred to in the embodiments of the present disclosure are all acquired with sufficient authorization.
Fig. 1 is a schematic diagram of an implementation environment of a video generation method according to an embodiment of the present disclosure. Referring to Fig. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 are connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of the present disclosure.
The terminal 101 is at least one of a smartphone, a smart watch, a desktop computer, a laptop computer, a virtual reality terminal, an augmented reality terminal, and a wireless terminal. The terminal 101 stands for any one of a plurality of terminals; the embodiments of the present disclosure are illustrated with the terminal 101 only, and those skilled in the art will recognize that the number of terminals may be greater or smaller. Illustratively, the terminal 101 can install and run an application that provides a video generation function, also referred to as a video generation platform. For example, a target object inputs a video script through the application interface displayed on the terminal 101 and uploads the corresponding materials to generate a video conforming to the script. Illustratively, the application takes the form of a web application, an applet, a client, or the like, to which the present disclosure is not limited. Here, an applet refers to a program that runs inside another, host application.
The server 102 is an independent physical server; it may also be a server cluster or distributed system composed of a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Networks), big data, and artificial intelligence platforms. The number of servers 102 may be greater or smaller, and the embodiments of the present disclosure are not limited in this regard. Illustratively, the server 102 provides background services for the application running on the terminal 101; for example, it parses a video script input on the terminal 101 by a target object and generates the corresponding video based on the parsing result. Of course, the server 102 may also include other functional servers to provide more comprehensive and diverse services. Illustratively, taking a web application that provides the video generation function and runs on the terminal 101 as an example, the server 102 provides background services for that web application: the terminal 101 presents the application interface of the web application, and the operations that the target object performs on this interface trigger the terminal 101 to interact with the server 102 to process them. For example, the server 102 acquires a video script in response to a video script editing operation performed by the target object on the application interface, and so on, to which the present disclosure is not limited.
In some embodiments, during video generation the server 102 undertakes the primary generation work and the terminal 101 undertakes the secondary generation work; alternatively, the server 102 undertakes the secondary generation work and the terminal 101 undertakes the primary generation work; alternatively, either the server 102 or the terminal 101 can undertake the generation work independently, which is not limited by the embodiments of the present disclosure.
The video generation method provided by the embodiments of the present disclosure is described below based on the above implementation environment.
Fig. 2 is a flowchart of a video generation method provided by an embodiment of the present disclosure. As shown in Fig. 2, the method is applied to a server and includes the following steps 201 to 203.
In step 201, the server acquires a video script in response to a video script editing operation, the video script including the material type, material content, and material arrangement order of a video to be generated.
In the embodiments of the present disclosure, the server is communicatively connected to a terminal, an application providing a video generation function runs on the terminal, and the server provides background services for that application. The terminal displays the application interface of the application, and the server acquires the video script in response to a video script editing operation performed by the target object on that interface. The video script is information capable of indicating the video content and video structure of a video (of course, it may indicate other information as well; the disclosure is not limited in this respect) and can be understood as the overall design idea for the video to be generated. By creating a video script, the video content, video structure, and other information of the video to be generated can be known intuitively and clearly, so that the materials (such as pictures and videos) required by the video to be generated can be determined according to the script, and the materials can be processed according to the script to generate a target video meeting the script's requirements, thereby improving video generation efficiency.
The material type refers to the type of material required by the video to be generated, such as real-shot footage, screen recordings, public (stock) materials, openings, endings, and the like. The material content refers to the content of the material required by the video to be generated, such as a passage of picture-description text, keywords, and the like. The material arrangement order refers to the order in which the required materials appear in the video to be generated. For example, if the required materials include material A (a clip showing an item's appearance), material B (a clip introducing the item's functions), and material C (a clip introducing the item's usage scenarios), a possible material arrangement order is material B, material C, material A. In some embodiments, the material arrangement order is expressed through mirror numbers, which is not limiting. In addition, the present disclosure does not limit the number of materials required by the video to be generated; there may be one or more.
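For concreteness, the script fields described above can be modeled as a simple data structure. The Python below is an illustrative sketch only; all field names are assumptions rather than part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class ScriptShot:
    """One mirror-number entry of the video script (hypothetical fields)."""
    mirror_no: int                     # position in the material arrangement order
    material_type: str                 # e.g. "real_shot", "screen_recording", "stock"
    material_content: str              # description text or keywords for the material
    video_lines: Optional[str] = None  # optional spoken lines for this shot

@dataclass
class VideoScript:
    name: str
    shots: List[ScriptShot] = field(default_factory=list)

    def arrangement_order(self) -> List[int]:
        # The material arrangement order is the shots sorted by mirror number.
        return [s.mirror_no for s in sorted(self.shots, key=lambda s: s.mirror_no)]

# The example above: order B, C, A expressed through mirror numbers.
script = VideoScript(name="demo", shots=[
    ScriptShot(3, "real_shot", "clip showing the item's appearance"),      # material A
    ScriptShot(1, "real_shot", "clip introducing the item's functions"),   # material B
    ScriptShot(2, "stock", "clip introducing the item's usage scenarios"), # material C
])
```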
In step 202, the server parses the video script in response to a video generation operation for the video script, to obtain the material type, the material content, and the material arrangement order.
In the embodiments of the present disclosure, the server parses the acquired video script in response to a video generation operation performed by the target object on the application interface displayed by the terminal, obtaining the material type, the material content, and the material arrangement order.
In step 203, the server determines at least one material corresponding to the video script based on the material type and the material content, and processes the at least one material based on the material arrangement order to generate at least one target video, where the arrangement order of the materials in the target video conforms to the material arrangement order and each material matches the material type and the material content.
In the embodiments of the present disclosure, any material corresponding to the video script matches the material type and material content in the script; that is, the material's type and content meet the script's requirements. After the at least one material corresponding to the video script is determined, it is processed according to the material arrangement order to generate at least one target video, namely a video that meets the requirements of the video script.
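Putting steps 202 and 203 together, a minimal server-side orchestration might look as follows. The helpers match_material and assemble_video are hypothetical stand-ins for the matching and splicing logic detailed later, not the disclosed implementation.

```python
def match_material(shot):
    # Stand-in: retrieve or accept a clip whose type and content match the shot.
    return f"clip_for_mirror_{shot.mirror_no}"

def assemble_video(ordered_clips):
    # Stand-in: splice clips in order; a fuller sketch appears under step 406.
    return " + ".join(ordered_clips)

def generate_target_videos(script, n_variants=1):
    """Sketch of steps 202-203: parse the script, match one material per
    shot, then process the materials in material-arrangement order."""
    order = script.arrangement_order()                    # step 202: parse
    materials = {s.mirror_no: match_material(s) for s in script.shots}
    return [assemble_video([materials[i] for i in order]) # step 203: process
            for _ in range(n_variants)]

print(generate_target_videos(script))  # uses `script` from the sketch above
```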
In the video generation method provided by this embodiment of the present disclosure, a video script is acquired in response to a video script editing operation; then, in response to a video generation operation for the script, at least one material corresponding to the script is processed according to the parsing result to generate at least one target video, where each material matches the material type and material content of the script and the arrangement order of the materials in the generated target video conforms to the script's material arrangement order. At least one target video matching the video script is thus generated automatically, which simplifies the human-computer interaction involved in video generation and effectively improves video generation efficiency.
Fig. 3 is a flowchart of another video generation method provided by an embodiment of the present disclosure. As shown in Fig. 3, the method is applied to a terminal and includes the following steps 301 to 303.
In step 301, the terminal acquires a video script in response to a video script editing operation, the video script including the material type, material content, and material arrangement order of a video to be generated.
In the embodiments of the present disclosure, an application providing a video generation function runs on the terminal; the terminal displays the application interface of that application and acquires the video script in response to a video script editing operation performed by the target object on the interface. For the specific meaning of the video script, refer to step 201 above; it is not repeated here.
In step 302, the terminal acquires, in response to a video generation operation for the video script, at least one target video corresponding to the video script, where the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and each material matches the material type and the material content.
In the embodiments of the present disclosure, the target video may be generated by the terminal (the process is the same as the foregoing steps and is therefore not repeated) or generated by the server and then sent to the terminal, which is not limited by the embodiments of the present disclosure. Taking generation by the server as an example, in this step the server parses the video script in response to the video generation operation performed by the target object on the application interface, generates at least one target video corresponding to the script based on the parsing result, and sends the at least one target video to the terminal.
In step 303, the terminal displays the at least one target video.
In the embodiments of the present disclosure, the terminal displays the at least one target video on the application interface.
In the video generation method provided by this embodiment of the present disclosure, a video script is acquired in response to a video script editing operation; then, in response to a video generation operation for the script, at least one material corresponding to the script is processed according to the parsing result to generate at least one target video, where each material matches the material type and material content of the script and the arrangement order of the materials in the generated target video conforms to the script's material arrangement order. At least one target video matching the video script is thus generated automatically, which simplifies the human-computer interaction involved in video generation and effectively improves video generation efficiency.
Figs. 2 and 3 above outline the flow of the video generation method provided by the embodiments of the present disclosure. The method is described in detail below based on the embodiment shown in Fig. 4.
Fig. 4 is a flowchart of another video generation method provided by an embodiment of the present disclosure. As shown in Fig. 4, taking the case where the method is implemented through interaction between a terminal and a server as an example, the video generation method includes the following steps 401 to 408.
In step 401, the terminal acquires a video script in response to a video script editing operation, the video script including the material type, material content, and material arrangement order of a video to be generated.
In the embodiments of the present disclosure, an application providing a video generation function runs on the terminal; the terminal displays the application interface of the application and acquires the video script in response to a video script editing operation performed by the target object on that interface. The application interface includes a first control for editing the video script. In response to a trigger operation on the first control, the terminal displays a video script editing interface, which includes a plurality of function controls providing editing functions for the video script, that is, implementing the various video script editing operations, so that the target object can conveniently create the video script through this interface.
In some embodiments, the video script further includes at least one of: the material type, material content, and material splicing manner of picture-in-picture material; video lines; original text for text-to-speech (TTS) conversion; a background music style; a script name; a video aspect ratio; and the like, to which the present disclosure is not limited. The material type and material content of the picture-in-picture material have the same meanings as described above and are not repeated here; the material splicing manner of the picture-in-picture material refers to the way it is composited with the main material, for example, top-bottom split screen, left-right split screen, top-right corner, bottom-right corner, or full overlay, and the disclosure is not limited thereto.
The process by which the terminal acquires the video script in response to video script editing operations is described below with reference to Fig. 5.
Fig. 5 is a schematic diagram of a video script editing interface provided by an embodiment of the present disclosure. As shown in part (a) of Fig. 5, the video script editing interface includes a plurality of function controls, such as a material type editing control 501, a material content editing control 502, a mirror number editing control 503 (used to indicate the material arrangement order; one mirror number corresponds to one material, although one mirror number may also correspond to a plurality of materials, which is not limiting), a video lines editing control 504, a background music style editing control 505, a picture-in-picture material editing control 506, and the like.
Illustratively, the terminal creates mirror number 1, that is, the first material (material 1) required by the video to be generated, in response to a trigger operation on the mirror number editing control 503. Continuing with mirror number 1 as the example: in response to a trigger operation on the material type editing control 501 (such as selecting a material type), the terminal determines the material type of material 1 as "real shot"; in response to a trigger operation on the material content editing control 502 (such as a text input operation), it determines the material content of material 1 as "show the appearance of the item"; in response to a trigger operation on the video lines editing control 504 (such as a text input operation), it determines the video lines corresponding to material 1 as "1. Let me introduce an XXX to you; 2. Next, the functions of this XXX …"; in response to a trigger operation on the background music style editing control 505 (such as selecting a background music style), it determines the background music style of material 1 as "lyrical"; and in response to a trigger operation on the picture-in-picture material editing control 506, it displays a picture-in-picture material editing area, as shown in part (b) of Fig. 5, which includes a plurality of function controls, such as a material type editing control 5061, a material content editing control 5062, and a material splicing manner editing control 5063, to which the present disclosure is not limited.
Step 401 above provides a way to create a video script, enabling the terminal to acquire the script and providing technical support for automatically generating the target video from it.
In step 402, the terminal sends the video script to the server.
In some embodiments, the video script editing interface includes a second control for saving the video script, and the terminal sends the video script to the server in response to a trigger operation on the second control. For example, the second control takes the form of a button labeled "confirm save", to which the present disclosure is not limited. That is, when the target object determines that the current video script is correct, it clicks "confirm save", triggering the terminal to send the script to the server, thereby saving communication resources between the terminal and the server. Of course, in other embodiments the terminal sends the information corresponding to each video script editing operation to the server as the operation occurs, that is, it sends the target object's edits to the server in real time, which is not limiting.
In step 403, the terminal sends a video generation request for the video script to the server in response to a video generation operation for the video script.
In the embodiments of the present disclosure, the application interface includes a third control for generating a video from the video script, and the terminal sends a video generation request for the script to the server in response to a trigger operation on the third control. For example, the third control takes the form of a button labeled "script to video", to which the present disclosure is not limited. That is, when the target object decides to generate a video based on the current script, it clicks "script to video", triggering the terminal to send the video generation request for the script to the server.
In step 404, the server parses the video script based on the received video generation request, obtaining the material type, the material content, and the material arrangement order.
In the embodiments of the present disclosure, the server parses the acquired video script based on the received video generation request, obtaining the material type, the material content, and the material arrangement order. Of course, the server may instead parse the video script as soon as it is obtained in step 402; the embodiments of the present disclosure are not limited in this regard.
In step 405, the server determines at least one material corresponding to the video script based on the material type and the material content, each material matching the material type and the material content.
In the embodiments of the present disclosure, a material corresponding to the video script may be determined from a material library, or may be uploaded by the target object through the terminal, that is, a local material, which is not limiting. The material library stores a plurality of materials for generating videos and can also be understood as a public material library. In some embodiments, this step includes at least one of the following:
Mode one: determining, from the material library and based on the material type and the material content, at least one first material corresponding to the video script. Illustratively, for any material required by the video to be generated, the server searches the material library based on that material's type and content to obtain a first material matching them. For example, the server first determines, from the material library and based on the material type, a plurality of target materials matching that type, then invokes a first network model to compute the matching degree between the required material content and the content of each target material, and determines the first material from the target materials based on the matching degree. It should be understood that the embodiments of the present disclosure do not limit the specific implementation of the first network model; for example, it may be a neural network model based on an image description (captioning) algorithm. In this way, the server automatically retrieves matching materials from the material library according to the video script, which improves video generation efficiency.
Mode two: determining, in response to a material upload operation for the video script, at least one second material corresponding to the video script based on the material type and the material content. The application interface provides a material upload function for the video script; the server obtains the materials uploaded by the target object in response to its upload operations on the interface and determines, based on the material type and the material content, at least one second material corresponding to the script. That is, given the uploaded materials, the server determines which of the materials indicated by the script they correspond to, based on type and content, which provides technical support for the subsequent automatic generation of the target video. In other words, while the target object uploads materials, the server can automatically map each uploaded material to the material indicated by the script based on type and content, without regard to upload order and the like, which greatly improves video generation efficiency. In addition, this mode meets personalized requirements for the materials of the video to be generated, improving user experience.
In step 406, the server processes the at least one material based on the material arrangement order to generate at least one target video, where the arrangement order of the materials in the target video conforms to the material arrangement order.
In the embodiments of the present disclosure, taking the case where a plurality of materials correspond to the video script as an example, the server's processing of those materials to generate the target video includes the following steps A and B.
Step A: splice the plurality of materials according to the material arrangement order to generate an intermediate video.
The server sorts the materials according to the material arrangement order so that they conform to it, and splices the sorted materials to obtain the intermediate video. In some embodiments, where the video script includes video lines, the server can splice the materials based on the video lines and the material arrangement order to generate the intermediate video, to which the present disclosure is not limited. In addition, as noted in the description of the video script above, in some embodiments the script further includes the material type, material content, and material splicing manner of picture-in-picture material, that is, the at least one material corresponding to the script also includes picture-in-picture material; accordingly, in this step the server splices the materials according to the picture-in-picture splicing manner and the material arrangement order to generate the intermediate video, which is not limiting.
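As a concrete but non-authoritative illustration, the splicing of step A could be performed with a general-purpose library such as moviepy 1.x, which is not named in the disclosure; the file names are examples, and picture-in-picture compositing is omitted.

```python
from moviepy.editor import VideoFileClip, concatenate_videoclips  # moviepy 1.x

def splice_intermediate(ordered_paths, out_path="intermediate.mp4"):
    """Load the matched materials in material-arrangement order and
    concatenate them into the intermediate video of step A."""
    clips = [VideoFileClip(p) for p in ordered_paths]
    intermediate = concatenate_videoclips(clips, method="compose")
    intermediate.write_videofile(out_path, codec="libx264", audio_codec="aac")

# Mirror-number order B, C, A from the earlier example.
splice_intermediate(["material_B.mp4", "material_C.mp4", "material_A.mp4"])
```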
In some embodiments, the server filters out, from each material, the content that meets a filtering condition, and splices the filtered materials according to the material arrangement order to generate the intermediate video. Illustratively, the filtering condition is that a material's pictures do not meet the video generation requirements, and the server can determine which content in each material meets the condition based on the video script. For example, suppose the video script includes video lines and a material corresponding to the script is a real-shot video uploaded by the target object through the terminal. If the script's line for this material is "… now let me introduce an XX item …", but the line actually spoken in the uploaded video is "… now let me introduce a YY item, no wait, an XX item …", the server can filter out, that is, clip away, the span corresponding to "a YY item, no wait," to obtain a filtered real-shot video that meets the video generation requirements. This can also be understood as removing mis-spoken segments from the material. As another example, suppose a material corresponding to the script is a video A retrieved by the server from the material library. If the material content indicated by the script is "a clip showing the appearance of item A", while video A contains both "a clip showing the appearance of item A" and "a clip showing the appearance of item B", the server can filter out, that is, clip away, the content corresponding to "a clip showing the appearance of item B" to obtain a filtered video A that meets the script's requirements. This can also be understood as removing redundant pictures from the material.
The above examples are merely illustrative and are not intended to limit the present disclosure; the specific content of the filtering condition can be set according to actual requirements. By filtering out the content that meets the filtering condition in each material, the accuracy of the intermediate video is improved, so that the finally generated video better meets the requirements of the video script.
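One hedged reading of the mis-spoken-segment example: align timestamped speech-recognition segments against the script line and keep only the spans that match. The overlap heuristic below is a toy; the disclosure does not specify the mechanism.

```python
def filter_segments(segments, script_line, min_overlap=0.5):
    """Keep (start, end) spans of ASR segments whose text sufficiently
    overlaps the script line; mis-spoken or redundant spans are dropped.

    segments: list of (start_sec, end_sec, text) from a speech recognizer.
    """
    wanted = set(script_line.lower().split())
    kept = []
    for start, end, text in segments:
        words = set(text.lower().split())
        overlap = len(words & wanted) / max(len(words), 1)
        if overlap >= min_overlap:
            kept.append((start, end))
    return kept  # spans to retain when re-cutting the material

print(filter_segments(
    [(0.0, 2.0, "this is a YY item"), (2.0, 4.0, "this is an XX item")],
    "introduce this XX item",
))  # -> [(2.0, 4.0)]
```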
Step B: adjust video elements of the intermediate video based on the video content of the intermediate video to generate the target video.
Here, video elements refer to elements related to the video content, such as materials, subtitles, stickers, dynamic special effects, and prompt text, to which the present disclosure is not limited. Illustratively, this step includes at least one of the following:
Mode one: add subtitles to the intermediate video based on its video content to obtain the target video.
The server recognizes the speech content of the intermediate video based on its video content and adds subtitles according to the speech content to obtain the target video. In some embodiments, the video script includes video lines, and the server adds subtitles to the intermediate video based on the video lines and the intermediate video's content. In other embodiments, the server determines attribute information of the intermediate video based on its video content and adds subtitles in the subtitle style corresponding to that attribute information. Illustratively, the attribute information indicates scene attributes, character attributes, item attributes, and the like of the intermediate video. For example, if the attribute information indicates that the scene of the intermediate video is a natural landscape, the subtitle style the server adds is related to natural scenery (such as green leaves surrounding the subtitles).
As another example, if the attribute information indicates that the item appearing in the intermediate video is a vehicle, the subtitle style the server adds is related to vehicles (such as small vehicle icons surrounding the subtitles). It should be understood that these examples are illustrative only and do not limit the present disclosure. In this way, subtitles suited to the video content can be added, meeting personalized requirements and improving user experience.
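Once the speech content is recognized, subtitle addition reduces to emitting timed caption entries. A minimal SRT serializer over (start, end, text) triples is sketched below; speech recognition and style rendering are out of the scope of this sketch.

```python
def to_srt(entries) -> str:
    """Serialize (start_sec, end_sec, text) triples as SRT subtitles, which
    can then be rendered onto the intermediate video in a chosen style."""
    def ts(t: float) -> str:
        h, rem = divmod(int(t * 1000), 3_600_000)
        m, rem = divmod(rem, 60_000)
        s, ms = divmod(rem, 1_000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"
    return "\n".join(f"{i}\n{ts(a)} --> {ts(b)}\n{text}\n"
                     for i, (a, b, text) in enumerate(entries, start=1))

print(to_srt([(0.0, 2.4, "Let me introduce an XXX to you"),
              (2.4, 5.0, "Next, the functions of this XXX")]))
```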
In the second mode, media resources matched with the video content are added to the intermediate video based on the video content of the intermediate video to obtain the target video.

The media resources include one or more of pictures, dynamic special effects, audio, and prompt text, for example, filters and transitions, and are not limited thereto. The server determines content information of the intermediate video based on the video content of the intermediate video, and adds media resources matched with the video content to the intermediate video based on the content information. For example, if the content information indicates a keyword of the intermediate video, the server can add, based on the content information, a media resource matching the keyword to the intermediate video (for example, if the keyword of a certain video segment is "red packet", a red-packet sticker and a red-packet sound effect are added to that segment). For another example, if the content information indicates a character expression in the intermediate video, the server can add a media resource matching the character expression to the intermediate video based on the content information (for example, if the character expression in a certain video segment is happy, a smiling-face sticker is added to that segment). For another example, if the content information indicates that the intermediate video includes content describing item A, the server can add media resources matching item A to the intermediate video based on the content information (for example, if item A is a vehicle, background music associated with vehicles is added to the intermediate video, or prompt text associated with vehicles is added to the intermediate video).
It is to be understood that the above examples are illustrative only and that the present disclosure is not limited thereto. In this way, matched media resources can be added automatically according to the video content, which meets personalized requirements, enriches the display effect of the video, simplifies human-computer interaction, and improves the user experience.
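A minimal sketch of this content-to-asset matching follows; the trigger words, asset file names, and the two-level (keyword, expression) lookup are illustrative assumptions rather than the disclosure's actual data structures.

```python
# Hypothetical content-driven asset matching: keywords and detected
# expressions map to stickers, sound effects, music, or prompt text.
ASSETS_BY_KEYWORD = {
    "red packet": [("sticker", "red_packet.png"), ("sfx", "coins.wav")],
    "vehicle": [("music", "engine_theme.mp3"), ("text", "Now available!")],
}
ASSETS_BY_EXPRESSION = {
    "happy": [("sticker", "smiley.png")],
}

def match_assets(keywords: list[str], expressions: list[str]) -> list[tuple[str, str]]:
    """Collect every asset whose trigger appears in the segment's content info."""
    matched: list[tuple[str, str]] = []
    for kw in keywords:
        matched.extend(ASSETS_BY_KEYWORD.get(kw, []))
    for ex in expressions:
        matched.extend(ASSETS_BY_EXPRESSION.get(ex, []))
    return matched

print(match_assets(["red packet"], ["happy"]))
# -> [('sticker', 'red_packet.png'), ('sfx', 'coins.wav'), ('sticker', 'smiley.png')]
```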
At least one material corresponding to the video script is processed through step A and step B to generate a target video. Step A may be understood as a video rough-cut process, and step B may be understood as a video fine-cut process: step A generates a complete video that conforms to the overall design of the video script, and step B then optimizes the intermediate video to generate the target video. In this way, the display effect of the video can be enriched on the basis of satisfying the video script, improving the user experience.
In other embodiments, the server adjusts the picture display parameters of the target video based on the picture display effect of the target video to obtain an adjusted target video, where the picture display effect of the adjusted target video meets the picture display condition. Illustratively, the server invokes a second network model to determine the picture display parameters of the target video and adjusts those parameters so that the picture display effect of the adjusted target video meets the picture display condition. The picture display condition is, for example, that the picture is not overexposed or that the picture does not shake, and the present disclosure is not limited thereto. It should be appreciated that embodiments of the present disclosure do not limit the specific implementation of the second network model; for example, the second network model may be a neural network model based on a picture color-correction algorithm, or a neural network model based on a picture stabilization algorithm. In this way, the picture display effect of the target video can be effectively improved.
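As a rough illustration of one such picture display condition, the sketch below detects overexposure by mean luminance and pulls it back with a gamma curve; a real second network model would learn this mapping, so this hand-rolled rule is an assumption for demonstration only.

```python
# Hypothetical overexposure check and correction on a single frame.
import numpy as np

def correct_overexposure(frame: np.ndarray, threshold: float = 0.85) -> np.ndarray:
    """frame: float array with values in [0, 1]. Returns a darkened copy
    if the mean luminance exceeds the threshold, else the frame unchanged."""
    if frame.mean() <= threshold:
        return frame  # picture display condition already met
    return np.clip(frame ** 1.8, 0.0, 1.0)  # gamma > 1 darkens highlights
```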
In other embodiments, the server determines, based on the video content of the target video, whether the target video meets the video generation condition; if so, the subsequent steps are performed, and if not, a notification message is sent to the terminal, where the notification message indicates that the target video does not meet the video generation condition, and the video generation condition refers to the video content meeting the quality requirement. For example, if the video content of the target video contains pornographic or violent content, the target video does not meet the video generation condition. It should be noted that the server may also determine whether the intermediate video meets the video generation condition during the process of generating the intermediate video, and the disclosure is not limited thereto.
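This check amounts to a moderation gate. A minimal sketch follows, assuming a hypothetical classifier has already produced per-category risk scores in [0, 1]; the category names and threshold are placeholders.

```python
# Hypothetical video generation condition: reject the video if any
# disallowed content category scores above a risk threshold.
RISK_THRESHOLD = 0.5

def meets_generation_condition(scores: dict[str, float]) -> bool:
    return all(score < RISK_THRESHOLD for score in scores.values())

print(meets_generation_condition({"pornographic": 0.02, "violent": 0.01}))  # True
```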
The foregoing describes the process by which the server generates a target video. In some embodiments, where the number of target videos is plural, each target video contains different target video elements, and the target video elements include at least one of: a subtitle style; a style of media resource matched with the video content of the target video; a material transition style; and the materials determined from the material library. This process can also be understood as the server generating, based on at least one material corresponding to the video script, a plurality of target videos that all conform to the video script but differ in their target video elements. For example, the difference between target video A and target video B is the background music: the background music of target video A is lyric song a, and the background music of target video B is lyric song b. In this way, a plurality of target videos can be generated synchronously, so that the user can choose among them as required, which improves the user experience.

In some embodiments, the plurality of target videos includes a first target video and at least one second target video, and the server is further capable of generating, based on the different video elements contained in the first target video and the second target video, prompt information corresponding to the second target video, where the prompt information is used to prompt the video elements in the second target video that differ from the first target video. The first target video can be understood as the main video, and the second target video as an associated video of the main video. Since the server generates the prompt information corresponding to the second target video based on the different video elements contained in the two videos, the terminal can subsequently display the prompt information, so that the user can intuitively understand the differences between videos, which simplifies human-computer interaction and improves the user experience. For example, if the difference between first target video A and second target video B is the background music, with the background music of first target video A being lyric song a and that of second target video B being lyric song b, the prompt information corresponding to second target video B is "the background music of this video differs from that of the main video; the background music of this video is lyric song b".
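The variant-plus-prompt behaviour can be sketched as follows; the VideoSpec container and the phrasing of the generated prompt are assumptions introduced for illustration.

```python
# Hypothetical sketch: describe each target video by its elements, then
# build the prompt text for a variant from the elements that differ
# from the main video.
from dataclasses import dataclass, field

@dataclass
class VideoSpec:
    name: str
    elements: dict[str, str] = field(default_factory=dict)

def diff_prompt(main: VideoSpec, variant: VideoSpec) -> str:
    parts = [
        f"the {key} of this video is {value}, while the main video uses {main.elements.get(key)}"
        for key, value in variant.elements.items()
        if value != main.elements.get(key)
    ]
    return "; ".join(parts)

main = VideoSpec("A", {"background music": "lyric song a"})
variant = VideoSpec("B", {"background music": "lyric song b"})
print(diff_prompt(main, variant))
# -> the background music of this video is lyric song b, while the main video uses lyric song a
```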
In step 407, the server transmits at least one target video to the terminal.
In some embodiments, in the case that the plurality of target videos includes the first target video and at least one second target video, the server also sends the prompt information corresponding to the second target video to the terminal, so that the terminal can display the prompt information.
In step 408, the terminal presents at least one target video.
In the embodiment of the disclosure, the terminal displays at least one target video on an application interface.
In some embodiments, when the plurality of target videos includes a first target video and at least one second target video, the terminal displays, on the application interface, a first target video, at least one second target video, and prompt information corresponding to the second target video, where the prompt information is used to prompt a video element in the second target video that is different from the first target video.
In some embodiments, the application running on the terminal further provides a video recommendation function: the terminal displays at least one target video on the application interface, the application interface further includes a fourth control for recommending the at least one target video, and the terminal recommends the at least one target video to at least one first object in response to a trigger operation on the fourth control. As described above, the server can generate a plurality of target videos containing different target video elements, which provides technical support for personalized recommendation when these target videos are recommended to first objects. For example, if the background music of the first target video A is lyric song a and the background music of the second target video B is lyric song b, then, when the first target video A and the second target video B are recommended to at least one first object, the first target video A may be recommended to a first object that prefers lyric song a, and the second target video B to a first object that prefers lyric song b, without being limited thereto. It should be noted that the present disclosure does not limit the recommendation policy for the target videos, and a specific recommendation policy can be set according to actual requirements.
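One way to realize such element-aware recommendation is sketched below, assuming hypothetical per-user preference records; the matching rule (route each user to the variant whose music matches their preference, falling back to the first variant) is an assumption for illustration only.

```python
# Hypothetical personalized routing of target-video variants to first objects.
def recommend(videos: list[dict], users: list[dict]) -> dict[str, list[str]]:
    """Map each video id to the ids of users it should be recommended to."""
    picks: dict[str, list[str]] = {v["id"]: [] for v in videos}
    for user in users:
        best = next(
            (v for v in videos if v.get("music") == user.get("preferred_music")),
            videos[0],  # fall back to the main video
        )
        picks[best["id"]].append(user["id"])
    return picks

videos = [{"id": "A", "music": "lyric song a"}, {"id": "B", "music": "lyric song b"}]
users = [{"id": "u1", "preferred_music": "lyric song b"}]
print(recommend(videos, users))  # {'A': [], 'B': ['u1']}
```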
Referring to fig. 6, fig. 6 is a schematic diagram of target video presentation provided by an embodiment of the present disclosure. As shown in fig. 6, the terminal displays, on the application interface, a first target video (that is, the main video), a plurality of second target videos (that is, associated videos of the main video), and the prompt information corresponding to the second target videos. The application interface includes a fourth control 601, labeled "batch recommendation", and the terminal, in response to a trigger operation on the fourth control 601, recommends the videos selected by the target object to at least one first object.
It should be noted that steps 401 to 408 describe the video generation method by taking the interaction between the terminal and the server as an example; in some embodiments, the process of generating the target video shown in steps 404 to 406 may also be performed by the terminal, and the present disclosure is not limited thereto.

In summary, in the video generation method provided by the embodiments of the present disclosure, a video script is obtained in response to a video script editing operation; then, in response to a video generation operation for the video script, at least one material corresponding to the video script is processed according to the result of parsing the video script to generate at least one target video, where the materials corresponding to the video script match the material type and material content of the video script, and the arrangement order of the materials in the generated target video conforms to the material arrangement order of the video script. This process automatically generates, for a video script, at least one target video matching the script, which simplifies the human-computer interaction involved in video generation and effectively improves video generation efficiency.
The video generation method provided by the present disclosure has been described through the embodiments shown in fig. 2 to 6. The method can be applied to various scenarios in which video is generated based on a video script, providing the ability to produce a finished video with one click once a video script and materials are input, which greatly improves video generation efficiency. For example, in a film and television production scenario, after a video script is input on the application interface and materials are uploaded, a film or television video can be generated automatically. For another example, in a product advertisement scenario, after a video script is input on the application interface and materials are uploaded (or, without uploading, retrieved from the material library), a product advertisement video can be generated automatically, and the present disclosure is not limited thereto.
The above-described process is exemplified below with reference to fig. 7. Fig. 7 is a schematic diagram of a video generating method according to an embodiment of the present disclosure. As shown in fig. 7, the video generation method includes the following steps:
Step 1: the target object performs a video script editing operation on the application interface to edit, that is, fill in, the video script, for example, by filling in the shot number, video lines (also called voice-over), material type, material content, TTS settings, music style, and picture-in-picture materials of the video script. After filling, the script is saved and the next step is executed.
Step 2: the target object uploads the materials corresponding to the video script on the application interface, for example, local materials on the terminal. Of course, materials in the material library may be used instead of uploading.
Step 3: the target object performs a video generation operation for the video script on the application interface to generate a plurality of target videos. This process includes a rough-cut process and a fine-cut process. Illustratively, the rough-cut process includes: splicing and clipping the materials corresponding to the video script according to the video script (for example, sorting the materials, clipping out redundant pictures, and clipping out misspoken segments) to obtain an intermediate video; this process also includes processing the picture-in-picture materials. The fine-cut process includes: adjusting the video elements of the intermediate video based on its video content to generate a target video (for example, adding subtitles, stickers, music, transitions, filters, and special effects). In addition, the picture display parameters of the target video can be adjusted so that the picture display effect of the adjusted target video meets the picture display condition. A consolidated sketch of this pipeline is given after step 4.
Step 4: a plurality of target videos containing different target video elements are displayed on the application interface, where the plurality of target videos includes a first target video and at least one second target video. The target object can select target videos as required and recommend them to at least one first object, thereby realizing recommendation optimization.
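Putting steps 1 to 4 together, the following end-to-end sketch shows one possible shape of the pipeline; every helper is a stub standing in for the behaviour described above, and all names are assumptions rather than the disclosure's actual functions.

```python
# Hypothetical end-to-end pipeline for one-click video generation.
def retrieve_from_library(script):  # step 2 fallback: no upload, use the library
    return [f"lib:{shot['content']}" for shot in script["shots"]]

def splice_and_filter(materials, script):  # step 3: rough cut
    return {"clips": materials}

def apply_elements(video, elements):  # step 3: fine cut (subtitles, music, ...)
    return {**video, **elements}

def adjust_display_parameters(video):  # exposure correction, stabilization
    return {**video, "display": "corrected"}

def generate_videos(script, uploads=None):
    materials = uploads or retrieve_from_library(script)
    rough = splice_and_filter(materials, script)
    variants = []
    for music in script.get("music_styles", ["default"]):  # one variant per element
        variants.append(adjust_display_parameters(apply_elements(rough, {"music": music})))
    return variants  # step 4: display and recommend

script = {"shots": [{"content": "item appearance"}], "music_styles": ["lyric a", "lyric b"]}
print(len(generate_videos(script)))  # -> 2 target videos with different elements
```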
For the basic materials involved in the video generation process, such as the material library, local materials, the music library, the sticker library, and the special-effects library, reference is made to the embodiment shown in fig. 4, and details are not repeated here.
In this way, at least one target video matching the video script is generated automatically for the video script, which simplifies the human-computer interaction involved in the video generation process and effectively improves video generation efficiency.
Fig. 8 is a block diagram of a video generating apparatus provided by an embodiment of the present disclosure. Referring to fig. 8, the apparatus is applied to a server, and includes an acquisition unit 801, an analysis unit 802, and a generation unit 803.
an obtaining unit 801 configured to obtain, in response to a video script editing operation, a video script, where the video script includes the material type, material content, and material arrangement order of the video to be generated;

a parsing unit 802 configured to parse, in response to a video generation operation for the video script, the video script to obtain the material type, the material content, and the material arrangement order;

and a generating unit 803 configured to determine at least one material corresponding to the video script based on the material type and the material content, and process the at least one material based on the material arrangement order to generate at least one target video, where the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content.
In some embodiments, the generating unit 803 includes:
a determining unit configured to execute determining a plurality of the materials corresponding to the video script based on the material type and the material content;
the first sub-generation unit is configured to splice a plurality of materials according to the material arrangement sequence to generate an intermediate video;
and a second sub-generation unit configured to perform adjustment of video elements of the intermediate video based on video content of the intermediate video, and generate the target video.
In some embodiments, the first sub-generation unit is configured to perform:
And filtering the content meeting the filtering conditions in each material, and splicing the plurality of filtered materials according to the arrangement sequence of the materials to generate the intermediate video.
In some embodiments, the second sub-generation unit is configured to perform at least one of:
adding subtitles to the intermediate video based on the video content of the intermediate video to obtain the target video;
and adding media resources matched with the video content to the intermediate video based on the video content of the intermediate video to obtain the target video, wherein the media resources comprise one or more of pictures, dynamic special effects, audios and prompt texts.
In some embodiments, the apparatus further comprises:
and the adjusting unit is configured to execute adjustment of the picture display parameters of the target video based on the picture display effect of the target video to obtain the adjusted target video, wherein the picture display effect of the adjusted target video accords with the picture display condition.
In some embodiments, the video script further comprises at least one of:
the picture-in-picture material type, material content and material splicing mode;
video lines;
the original text for text-to-speech conversion;
Background music style.
In some embodiments, the determining unit is configured to perform at least one of:
determining at least one first material corresponding to the video script from a material library based on the material type and the material content;
and responding to the material uploading operation aiming at the video script, and determining at least one second material corresponding to the video script based on the material type and the material content.
In some embodiments, in the case where the number of the target videos is plural, each target video contains a different target video element.
In some embodiments, the target video element comprises at least one of:
a subtitle style;
a style of media asset that matches the video content of the target video;
a material transition pattern;
and determining the materials from the material library.
In some embodiments, the plurality of target videos includes a first target video and at least one second target video, the generating unit 803 is further configured to perform:
and generating prompt information corresponding to the second target video based on different video elements contained in the first target video and the second target video, wherein the prompt information is used for prompting the video elements different from the first target video in the second target video.
It should be noted that when the video generating apparatus provided in the above embodiments generates a video, the division into the above functional modules is merely used as an example; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the video generating apparatus provided in the above embodiments belongs to the same concept as the video generation method embodiments; its specific implementation process is detailed in the method embodiments and is not repeated here.
Fig. 9 is a block diagram of a video generating apparatus provided by an embodiment of the present disclosure. Referring to fig. 9, the apparatus is applied to a terminal, and includes a first acquiring unit 901, a second acquiring unit 902, and a display unit 903.
a first obtaining unit 901 configured to obtain, in response to a video script editing operation, a video script, where the video script includes the material type, material content, and material arrangement order of the video to be generated;

a second obtaining unit 902 configured to obtain, in response to a video generation operation for the video script, at least one target video corresponding to the video script, where the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content;

and a presentation unit 903 configured to present at least one of the target videos.
In some embodiments, in the case that the number of the target videos is plural, each target video contains a different target video element, the presenting unit 903 is configured to perform:
displaying a first target video, at least one second target video and prompt information corresponding to the second target video, wherein the prompt information is used for prompting video elements different from the first target video in the second target video.
It should be noted that when the video generating apparatus provided in the above embodiments generates a video, the division into the above functional modules is merely used as an example; in practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. In addition, the video generating apparatus provided in the above embodiments belongs to the same concept as the video generation method embodiments; its specific implementation process is detailed in the method embodiments and is not repeated here.
In an exemplary embodiment, there is also provided an electronic device comprising a processor and a memory for storing at least one computer program that is loaded and executed by the processor to implement the video generation method in the embodiments of the present disclosure.
Taking the electronic device being a terminal as an example, fig. 10 is a structural block diagram of a terminal according to an embodiment of the disclosure. As shown in fig. 10, the terminal 1000 may be a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. Terminal 1000 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or other names.
In general, terminal 1000 can include: a processor 1001 and a memory 1002.
The processor 1001 may include one or more processing cores, for example, a 4-core processor or an 8-core processor. The processor 1001 may be implemented in at least one hardware form of a DSP (Digital Signal Processor), an FPGA (Field-Programmable Gate Array), or a PLA (Programmable Logic Array). The processor 1001 may also include a main processor and a coprocessor. The main processor is a processor for processing data in an awake state, also referred to as a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 1001 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1001 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1002 may include one or more computer-readable storage media, which may be non-transitory. Memory 1002 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1002 is used to store at least one program code for execution by processor 1001 to implement the processes performed by a terminal in the video generation method provided by the method embodiments of the present disclosure.
In some embodiments, terminal 1000 can optionally further include: a peripheral interface 1003, and at least one peripheral. The processor 1001, the memory 1002, and the peripheral interface 1003 may be connected by a bus or signal line. The various peripheral devices may be connected to the peripheral device interface 1003 via a bus, signal wire, or circuit board. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1004, a display 1005, a camera assembly 1006, audio circuitry 1007, a positioning assembly 1008, and a power supply 1009.
Peripheral interface 1003 may be used to connect I/O (Input/Output) related at least one peripheral to processor 1001 and memory 1002. In some embodiments, processor 1001, memory 1002, and peripheral interface 1003 are integrated on the same chip or circuit board; in some other embodiments, either or both of the processor 1001, memory 1002, and peripheral interface 1003 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1004 is used to receive and transmit RF (Radio Frequency) signals, also known as electromagnetic signals. The radio frequency circuit 1004 communicates with a communication network and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. Optionally, the radio frequency circuit 1004 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1004 may communicate with other terminals via at least one wireless communication protocol, including but not limited to metropolitan area networks, various generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1004 may also include NFC (Near Field Communication) related circuitry, which is not limited in the present disclosure.
The display screen 1005 is used to display a UI (User Interface), which may include graphics, text, icons, video, and any combination thereof. When the display 1005 is a touch screen, the display 1005 also has the ability to capture touch signals on or above its surface. The touch signal may be input to the processor 1001 as a control signal for processing. At this time, the display 1005 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display 1005, disposed on the front panel of terminal 1000; in other embodiments, there may be at least two displays 1005, respectively disposed on different surfaces of terminal 1000 or in a folded design; in still other embodiments, the display 1005 may be a flexible display disposed on a curved surface or a folded surface of terminal 1000. The display 1005 may even be arranged in a non-rectangular irregular pattern, that is, an irregularly shaped screen. The display 1005 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 1006 is used to capture images or video. Optionally, the camera assembly 1006 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so as to realize a background blurring function by fusing the main camera and the depth-of-field camera, panoramic shooting and VR (Virtual Reality) shooting functions by fusing the main camera and the wide-angle camera, or other fusion shooting functions. In some embodiments, the camera assembly 1006 may also include a flash, which may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 1007 may include a microphone and a speaker. The microphone is used to collect sound waves of users and the environment, convert the sound waves into electrical signals, and input them to the processor 1001 for processing, or input them to the radio frequency circuit 1004 for voice communication. For stereo acquisition or noise reduction, there may be a plurality of microphones, respectively disposed at different parts of terminal 1000. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 1001 or the radio frequency circuit 1004 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert an electrical signal not only into sound waves audible to humans, but also into sound waves inaudible to humans for ranging and other purposes. In some embodiments, the audio circuit 1007 may also include a headphone jack.
The positioning component 1008 is used to locate the current geographic position of terminal 1000 to enable navigation or LBS (Location Based Service).
Power supply 1009 is used to power the various components in terminal 1000. The power source 1009 may be alternating current, direct current, disposable battery or rechargeable battery. When the power source 1009 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1000 can further include one or more sensors 1010. The one or more sensors 1010 include, but are not limited to: acceleration sensor 1011, gyroscope sensor 1012, pressure sensor 1013, fingerprint sensor 1014, optical sensor 1015, and proximity sensor 1016.
The acceleration sensor 1011 can detect the magnitudes of accelerations on three coordinate axes of the coordinate system established with the terminal 1000. For example, the acceleration sensor 1011 may be used to detect components of gravitational acceleration in three coordinate axes. The processor 1001 may control the display screen 1005 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal acquired by the acceleration sensor 1011. The acceleration sensor 1011 may also be used for the acquisition of motion data of a game or a user.
The gyro sensor 1012 may detect the body direction and the rotation angle of the terminal 1000, and the gyro sensor 1012 may collect the 3D motion of the user to the terminal 1000 in cooperation with the acceleration sensor 1011. The processor 1001 may implement the following functions according to the data collected by the gyro sensor 1012: motion sensing (e.g., changing UI according to a tilting operation by a user), image stabilization at shooting, game control, and inertial navigation.
Pressure sensor 1013 may be disposed on a side frame of terminal 1000 and/or on an underlying layer of display 1005. When the pressure sensor 1013 is provided at a side frame of the terminal 1000, a grip signal of the terminal 1000 by a user can be detected, and the processor 1001 performs right-and-left hand recognition or quick operation according to the grip signal collected by the pressure sensor 1013. When the pressure sensor 1013 is provided at the lower layer of the display screen 1005, the processor 1001 controls the operability control on the UI interface according to the pressure operation of the user on the display screen 1005. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The fingerprint sensor 1014 is used to collect a fingerprint of the user, and the processor 1001 identifies the identity of the user based on the fingerprint collected by the fingerprint sensor 1014, or the fingerprint sensor 1014 identifies the identity of the user based on the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the processor 1001 authorizes the user to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying for and changing settings, etc. Fingerprint sensor 1014 may be disposed on the front, back, or side of terminal 1000. When a physical key or vendor Logo is provided on terminal 1000, fingerprint sensor 1014 may be integrated with the physical key or vendor Logo.
The optical sensor 1015 is used to collect ambient light intensity. In one embodiment, the processor 1001 may control the display brightness of the display screen 1005 based on the ambient light intensity collected by the optical sensor 1015. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 1005 is turned up; when the ambient light intensity is low, the display brightness of the display screen 1005 is turned down. In another embodiment, the processor 1001 may dynamically adjust the shooting parameters of the camera module 1006 according to the ambient light intensity collected by the optical sensor 1015.
Proximity sensor 1016, also referred to as a distance sensor, is typically located on the front panel of terminal 1000. Proximity sensor 1016 is used to collect the distance between the user and the front of terminal 1000. In one embodiment, when proximity sensor 1016 detects a gradual decrease in the distance between the user and the front face of terminal 1000, processor 1001 controls display 1005 to switch from the bright screen state to the off screen state; when proximity sensor 1016 detects a gradual increase in the distance between the user and the front of terminal 1000, processor 1001 controls display 1005 to switch from the off-screen state to the on-screen state.
Those skilled in the art will appreciate that the structure shown in fig. 10 is not limiting and that terminal 1000 can include more or fewer components than shown, or certain components can be combined, or a different arrangement of components can be employed.
Taking an electronic device as an example of a server, fig. 11 is a block diagram of a server according to an embodiment of the disclosure. Illustratively, the server 1100 may include one or more processors (Central Processing Units, CPU) 1101 and one or more memories 1102, where the one or more memories 1102 store at least one program code that is loaded and executed by the one or more processors 1101 to implement the video generation method provided by the above-described method embodiments. Of course, the server 1100 may also have a wired or wireless network interface, a keyboard, an input/output interface, etc. for performing input/output, and the server 1100 may also include other components for implementing device functions, which are not described herein.
In an exemplary embodiment, a computer readable storage medium is also provided, such as a memory 1102, comprising program code executable by the processor 1101 of the server 1100 to perform the video generation method described above. Alternatively, the computer readable storage medium may be a Read-only Memory (ROM), a random access Memory (Random Access Memory, RAM), a Compact-disk Read-only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, comprising a computer program which, when executed by a processor, implements the above-described video generation method.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following the general principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (16)

1. A video generation method, applied to a server, the method comprising:
in response to a video script editing operation, acquiring a video script, wherein the video script includes the material type, the material content, and the material arrangement order of the video to be generated;

in response to a video generation operation for the video script, parsing the video script to obtain the material type, the material content, and the material arrangement order;

and determining at least one material corresponding to the video script based on the material type and the material content, and processing the at least one material based on the material arrangement order to generate at least one target video, wherein the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content.
2. The video generation method according to claim 1, wherein the determining at least one material corresponding to the video script based on the material type and the material content, and processing the at least one material based on the material arrangement order to generate at least one target video includes:
determining a plurality of materials corresponding to the video script based on the material type and the material content;
splicing a plurality of materials according to the material arrangement sequence to generate an intermediate video;
and adjusting video elements of the intermediate video based on the video content of the intermediate video to generate the target video.
3. The method of generating video according to claim 2, wherein the splicing the plurality of materials according to the material arrangement order to generate the intermediate video includes:
and filtering the content meeting the filtering conditions in each material, and splicing the plurality of filtered materials according to the material arrangement sequence to generate the intermediate video.
4. The video generation method according to claim 2, wherein the adjusting video elements of the intermediate video based on the video content of the intermediate video generates the target video, comprising at least one of:
adding subtitles to the intermediate video based on the video content of the intermediate video to obtain the target video;
and adding media resources matched with the video content to the intermediate video based on the video content of the intermediate video to obtain the target video, wherein the media resources comprise one or more of pictures, dynamic special effects, audios and prompt texts.
5. The video generation method according to claim 1, characterized in that the method further comprises:
and adjusting the picture display parameters of the target video based on the picture display effect of the target video to obtain the adjusted target video, wherein the picture display effect of the adjusted target video accords with picture display conditions.
6. The video generation method of claim 1, wherein the video script further comprises at least one of:
the picture-in-picture material type, material content and material splicing mode;
video lines;
the original text for text-to-speech conversion;
background music style.
7. The method of generating video according to claim 1, wherein the determining at least one material corresponding to the video script based on the material type and the material content includes at least one of:
determining at least one first material corresponding to the video script from a material library based on the material type and the material content;
and responding to a material uploading operation aiming at the video script, and determining at least one second material corresponding to the video script based on the material type and the material content.
8. The method according to any one of claims 1 to 7, wherein, in the case where the number of the target videos is plural, each of the target videos contains a different target video element.
9. The video generation method of claim 8, wherein the target video element comprises at least one of:
A subtitle style;
a style of media asset that matches video content of the target video;
a material transition pattern;
and determining the materials from the material library.
10. The video generation method of claim 8, wherein the plurality of target videos includes a first target video and at least one second target video, the method further comprising:
and generating prompt information corresponding to the second target video based on different video elements contained in the first target video and the second target video, wherein the prompt information is used for prompting the video elements different from the first target video in the second target video.
11. A video generation method, applied to a terminal, the method comprising:
in response to a video script editing operation, acquiring a video script, wherein the video script includes the material type, the material content, and the material arrangement order of the video to be generated;

in response to a video generation operation for the video script, acquiring at least one target video corresponding to the video script, wherein the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content;

and presenting at least one of the target videos.
12. The method according to claim 11, wherein in a case where the number of the target videos is plural, target video elements included in each of the target videos are different, the displaying at least one of the target videos includes:
displaying a first target video, at least one second target video and prompt information corresponding to the second target video, wherein the prompt information is used for prompting video elements different from the first target video in the second target video.
13. A video generating apparatus for use with a server, the apparatus comprising:
an acquisition unit configured to acquire, in response to a video script editing operation, a video script, the video script including the material type, material content, and material arrangement order of the video to be generated;

an analysis unit configured to parse, in response to a video generation operation for the video script, the video script to obtain the material type, the material content, and the material arrangement order;

and a generating unit configured to determine at least one material corresponding to the video script based on the material type and the material content, and process the at least one material based on the material arrangement order to generate at least one target video, wherein the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content.
14. A video generating apparatus, characterized by being applied to a terminal, comprising:
a first acquisition unit configured to acquire, in response to a video script editing operation, a video script, the video script including the material type, material content, and material arrangement order of the video to be generated;

a second acquisition unit configured to acquire, in response to a video generation operation for the video script, at least one target video corresponding to the video script, wherein the target video is obtained by processing at least one material corresponding to the video script based on the material arrangement order, the at least one material is determined based on the material type and the material content, the arrangement order of the materials in the target video conforms to the material arrangement order, and the materials match the material type and the material content;

and a display unit configured to display at least one of the target videos.
15. An electronic device, the electronic device comprising:
one or more processors;
a memory for storing the processor-executable program code;
wherein the processor is configured to execute the program code to implement the video generation method of any one of claims 1 to 10 or the video generation method of any one of claims 11 to 12.
16. A computer readable storage medium, characterized in that program code in the computer readable storage medium, when executed by a processor of an electronic device, enables the electronic device to perform the video generation method of any one of claims 1 to 10, or the video generation method of any one of claims 11 to 12.