WO2021098151A1 - Special effect video synthesis method, device, computer equipment and storage medium - Google Patents


Info

Publication number
WO2021098151A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
special effect
template
initiator
effect video
Prior art date
Application number
PCT/CN2020/087712
Other languages
English (en)
French (fr)
Inventor
朱敏
Original Assignee
深圳壹账通智能科技有限公司
Priority date
Filing date
Publication date
Application filed by 深圳壹账通智能科技有限公司
Publication of WO2021098151A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N 5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/239 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H04N 21/2393 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests involving handling client requests
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs

Definitions

  • This application relates to the field of computer technology, and in particular to a special effect video synthesis method, device, computer equipment and storage medium.
  • Template videos are provided by a special effects platform: a user selects the special effect template they want through a terminal, uploads a video to the platform based on that template, and the platform integrates the user's video into the template to obtain the user's special effect video.
  • A special effect video synthesis method, comprising: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator's video information; fusing the image information of the initiator in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining an image information selection instruction sent by a receiver terminal that received the video shooting invitation; sending the unoccupied image information in the special effect video template to the receiver terminal according to the selection instruction; obtaining the image information of the special effect video template that the receiver terminal selected from the unoccupied image information, together with the receiver's video information; and fusing the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template, to obtain a multi-person special effect video.
  • A special effect video synthesis device, comprising: a synthesis instruction receiving module configured to receive a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator's video information; a first fusion module configured to fuse the image information of the initiator in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video; an invitation sending module configured to generate a video shooting invitation from the initiator's special effect video and send it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; an instruction receiving module configured to obtain the image information selection instruction sent by a receiver terminal that received the video shooting invitation, to send the unoccupied image information in the special effect video template to the receiver terminal, and to obtain the image information of the special effect video template that the receiver terminal selected from the unoccupied image information together with the receiver's video information; and a second fusion module configured to fuse the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template, to obtain a multi-person special effect video.
  • A computer device includes: one or more processors; a memory; and one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to carry out a special effect video synthesis method comprising the following steps: receiving a special effect video synthesis instruction sent by the initiator terminal, the instruction including the special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator's video information; fusing the image information of the initiator in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining the image information selection instruction sent by the receiver terminal that received the video shooting invitation; sending the unoccupied image information in the special effect video template to the receiver terminal according to the selection instruction; obtaining the image information of the special effect video template that the receiver terminal selected from the unoccupied image information, together with the receiver's video information; and fusing the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template to obtain a multi-person special effect video.
  • The above special effect video synthesis method, device, computer equipment and storage medium solve the problem that the production of special effect videos is restricted by time and space and is inconvenient to operate.
  • FIG. 1 is an application scene diagram of a special effect video synthesis method in an embodiment;
  • FIG. 2 is a schematic flowchart of a special effect video synthesis method in another embodiment;
  • FIG. 3 is a schematic diagram of a special effect video template in a special effect video synthesis method in an embodiment;
  • FIG. 4 is a schematic diagram of video frame images of a multi-person special effect video in a special effect video synthesis method in an embodiment;
  • FIG. 5 is a structural block diagram of a special effect video synthesis device in another embodiment;
  • FIG. 6 is an internal structure diagram of a computer device in an embodiment.
  • The special effect video synthesis method provided in this application belongs to the field of artificial intelligence technology and can be applied to the application environment shown in FIG. 1.
  • The terminal 102 communicates with the server 104 through a network.
  • The server 104 receives the special effect video synthesis instruction sent by the initiator terminal 102; the instruction includes the special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator's video information. The server 104 fuses the image information of the initiator in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video, generates a video shooting invitation from the initiator's special effect video, and sends it to the initiator terminal 102; the invitation is forwarded by the initiator terminal 102 to designated users.
  • The server 104 obtains the image information selection instruction sent by a receiver terminal 102 that received the video shooting invitation; sends the unoccupied image information in the special effect video template to the receiver terminal 102 according to the selection instruction; obtains the image information of the special effect video template that the receiver terminal 102 selected from the unoccupied image information, together with the receiver's video information; and fuses the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template to obtain the multi-person special effect video.
  • The terminal 102 includes the initiator terminal and the receiver terminal; there may be one or more initiator terminals and one or more receiver terminals.
  • the terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server 104 may be implemented by an independent server or a server cluster composed of multiple servers.
  • A special effect video synthesis method is provided. Taking the method as applied to the server in FIG. 1 as an example, the method includes the following steps:
  • Step S220: Receive a special effect video synthesis instruction sent by the initiator terminal.
  • the special effect video synthesis instruction includes the special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator video information.
  • the initiator terminal refers to the terminal held by the initiator of the special effects video.
  • The special effect video template can be preset, for example: a fireworks scene, a game scene, an animation scene, basketball, etc. FIG. 3 shows a special effect video template for a fireworks scene.
  • A special effect video template may contain only one piece of selectable image information, or several.
  • If the special effect video initiator only wants to synthesize a personal special effect video, he can select a special effect video template with one or more pieces of image information.
  • If the special effect video initiator wants to synthesize a multi-person special effect video, he can select a special effect video template with multiple pieces of image information.
  • the special effect video template identifier is used to identify each special effect video template, and each special effect video template corresponds to an identifier.
  • The image information refers to a special effect image; for example, the pig image in the special effect video template picture in FIG. 3 is a piece of image information.
  • The special effect video synthesis instruction is generated as follows: the special effect video initiator selects the special effect video template to be synthesized on the template selection page through the initiator terminal; the initiator terminal determines the corresponding special effect video template identifier based on the selected template, generates the special effect video synthesis instruction, and sends it to the server. Alternatively, the initiator may select a custom video special effect template on the template selection page; the initiator terminal then generates a special effect video synthesis instruction containing the custom template and sends it to the server.
  • the corresponding special effect video template can be obtained from the template database through the special effect video template identifier.
  • the image information of the initiator in the special effect video template can be used to determine the image that the initiator wants to synthesize.
  • the template database is used to store each special effect video template. According to the special effect video template identifier, a unique corresponding special effect video template can be found in the template database.
  • If the received special effect video synthesis instruction contains a custom video special effect template, the server processes the custom template: it determines the selectable image information in the custom template and the composite area corresponding to each piece of image information, so that the custom template has the same format as the video special effect templates in the template database, forming a processed custom video special effect template.
  • The processed custom video special effect template can also be stored in the template database and used as the personal video special effect template of the special effect video initiator.
  • Before the special effect video synthesis instruction is received from the initiator terminal, each special effect video template is sent to the initiator terminal so that the initiator can select a favorite special effect video template and its image information. Based on the selected template, the initiator terminal determines the special effect video template identifier and shoots the initiator's video information (that is, the personal video of the special effect video initiator). The initiator's video information must contain at least a certain number of face video frames, and the duration of the video must fall within a preset range. The initiator terminal then generates a special effect video synthesis instruction from the special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator's video information, and sends it to the server.
  • The initiator terminal can also check whether the captured video meets the requirements by detecting face video frames and the video duration; if the video does not reach the preset number of face video frames, or its duration is outside the preset range, the user is reminded to reshoot.
  • The preset number can be 50 to 1000 frames.
  • The preset duration can be 10 s to 500 s.
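As an illustration of the check described above, the following sketch validates a captured video before upload; the function and constant names are hypothetical, and only the thresholds (50 to 1000 face frames, 10 s to 500 s) come from the text.

```python
# Thresholds taken from the description; names are illustrative.
MIN_FACE_FRAMES, MAX_FACE_FRAMES = 50, 1000
MIN_DURATION_S, MAX_DURATION_S = 10, 500

def video_meets_requirements(num_face_frames: int, duration_s: float) -> bool:
    """Return True when the captured video contains enough face video frames
    and has an acceptable duration; otherwise the terminal should remind
    the user to reshoot."""
    if not (MIN_FACE_FRAMES <= num_face_frames <= MAX_FACE_FRAMES):
        return False
    return MIN_DURATION_S <= duration_s <= MAX_DURATION_S
```

A terminal would run this after capture and prompt a reshoot on a False result.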
  • Step S240: Fuse the image information of the initiator in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video.
  • The expression video frames of the video information are recognized by an expression recognition model to obtain facial expression frames.
  • The number of facial expression frames obtained must reach a preset number determined by the total number of frames of the special effect video template; for example, if the special effect video template has 10 frames, 10 facial expression frames are needed.
  • The face area of each facial expression frame is extracted, the composite area in each video frame of the special effect video template is determined from the image information, and the face area of each expression video frame is fused into the composite area of the corresponding template frame to obtain the personal special effect video.
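The requirement that the expression frames match the template's frame count can be sketched as follows; this is a minimal illustration, and the even-sampling strategy is an assumption not stated in the text.

```python
def select_expression_frames(expression_frames, template_frame_count):
    """Pick exactly template_frame_count frames from the recognized
    expression frames, sampled evenly across the list. Raises ValueError
    when there are too few frames, in which case the user would be asked
    to reshoot."""
    n = len(expression_frames)
    if n < template_frame_count:
        raise ValueError("not enough expression frames for this template")
    step = n / template_frame_count
    return [expression_frames[int(i * step)] for i in range(template_frame_count)]
```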
  • Step S260: Generate a video shooting invitation from the initiator's special effect video and send it to the initiator terminal; the invitation is forwarded by the initiator terminal to designated users.
  • The generated video shooting invitation may be a link, a QR code, etc.
  • Through the video shooting invitation, the currently synthesized special effect video can be viewed and the synthesis of the special effect video can be joined.
  • the generated video shooting invitation is sent to the initiator terminal.
  • The special effect video initiator can send the video shooting invitation to designated users through the terminal; a designated user is an account designated by the initiator to receive the invitation.
  • The initiator can limit who is invited to join. For example, if the special effect video initiator sends the video shooting invitation to the accounts of user A and user B, then the designated users are the accounts of user A and user B. The initiator may also leave the invitees unrestricted; for example, if the initiator posts the video shooting invitation to Moments, the account of any user who can see the invitation is a designated user.
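A minimal sketch of how a server might generate such an invitation follows; the URL scheme, field names and use of a random token are illustrative assumptions, not part of the described method.

```python
import uuid

def generate_video_shooting_invitation(initiator_video_id: str, base_url: str) -> dict:
    """Build a shareable invitation for the initiator's special effect video.
    The link lets a recipient view the current video and join the synthesis;
    it could equally be rendered as a QR code."""
    token = uuid.uuid4().hex  # opaque token identifying this invitation
    return {
        "video_id": initiator_video_id,
        "token": token,
        "link": f"{base_url}/invite/{initiator_video_id}?token={token}",
    }
```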
  • Step S280: Obtain the image information selection instruction sent by a receiver terminal that received the video shooting invitation.
  • the recipient is the designated user who receives the video shooting invitation.
  • After the receiver receives the video shooting invitation generated by the initiator, the receiver can view the initiator's special effect video through the invitation.
  • If the receiver accepts the invitation, the receiver clicks the video shooting invitation on the receiver terminal, enters the page where the initiator's special effect video can be viewed, and sends an image information selection instruction to the server from that page.
  • Step S300: Send the unoccupied image information in the special effect video template to the receiver terminal according to the image information selection instruction.
  • When the server receives an image information selection instruction, it looks up how the image information of the special effect video template involved in this synthesis is being used, and determines the unoccupied image information in the template accordingly.
  • For example, suppose the special effect video template has four pieces of image information, q, w, e and r, and the special effect video initiator can choose any of them. If the initiator selects image information w, the unoccupied image information in the template is the remaining three pieces, q, e and r, and those three pieces of image information are sent to the receiver terminal.
  • The unoccupied image information of the same special effect video template may be sent to multiple receiver terminals at the same time.
  • If multiple receiver terminals select the same piece of unoccupied image information, the image information is occupied by the recipient whose selection arrives first, and the later recipients are reminded to select again.
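The occupancy bookkeeping described above (first selection wins, later recipients must choose again) can be sketched as follows; the class and method names are illustrative.

```python
class TemplateSlots:
    """Track which image-information slots of a special effect video
    template are occupied. Concurrent claims are resolved
    first-come-first-served."""

    def __init__(self, slot_names):
        self.slots = list(slot_names)
        self.occupied = {}  # slot name -> user account holding it

    def unoccupied(self):
        """Slots that can still be sent to receiver terminals."""
        return [s for s in self.slots if s not in self.occupied]

    def claim(self, slot, user):
        """Return True if the claim succeeds; False if another recipient
        was first, in which case this recipient should select again."""
        if slot not in self.slots or slot in self.occupied:
            return False
        self.occupied[slot] = user
        return True
```

Mirroring the q, w, e, r example: after the initiator takes w, only q, e and r remain, and a second claim on an already-taken slot fails.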
  • Step S320: Obtain the image information of the special effect video template that the receiver terminal selected from the unoccupied image information, together with the receiver's video information.
  • The receiver can view the unoccupied image information in the special effect video template through the receiver terminal, select the image information to use for special effect video synthesis, and shoot a personal video; the selected image information and the video information are then sent to the server through the receiver terminal.
  • After the server receives the selected image information and the receiver's video information, it marks the selected image information as occupied.
  • Step S340: Fuse the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template, to obtain a multi-person special effect video.
  • The receiver's personal special effect video can be obtained from the receiver's image information in the special effect video template and the receiver's video information; the initiator's personal special effect video and the receiver's personal special effect video are then synthesized according to the special effect video template to obtain the multi-person special effect video.
  • A video frame of the multi-person special effect video includes the personal special effect image of the initiator and the personal special effect images of multiple recipients, as in the video frame image of the multi-person special effect video shown in FIG. 4. According to the image information corresponding to the initiator's personal special effect video and the image information corresponding to each recipient's personal special effect video, the personal special effect videos are merged to obtain the multi-person special effect video.
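The per-slot merging of personal special effect videos into one multi-person video can be sketched abstractly as follows; frames are modelled as dictionaries mapping image-information slots to content, which is a simplification for illustration.

```python
def compose_multi_person_video(template_frames, personal_overlays):
    """Merge several personal special effect videos into one multi-person
    video. template_frames is a list of frame dicts (slot name -> current
    content); personal_overlays maps each occupied slot name to the
    per-frame face content of the participant holding that slot."""
    merged = []
    for i, frame in enumerate(template_frames):
        out = dict(frame)  # copy so the template is left untouched
        for slot, faces in personal_overlays.items():
            out[slot] = faces[i]  # overlay this participant's frame i
        merged.append(out)
    return merged
```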
  • In summary, the initiator sends a special effect video synthesis instruction to the server through the terminal, selecting the special effect video template and image information and uploading the image information and video information. The server synthesizes the initiator's personal special effect video from the template, the image information and the video information, and generates the initiator's special effect video invitation, which the initiator sends to designated users so that they can participate in the synthesis.
  • A designated user only needs to upload the captured video information and the selected image information to the server through a terminal, and the server synthesizes the video information of the initiator and each receiver into the same special effect video. This realizes multi-person special effect video synthesis without requiring one camera to capture all participants at the same time and place; each participant only needs to upload personal video information to the server from wherever they are, and the server performs the multi-person special effect video synthesis on the uploaded video information.
  • The step of fusing the initiator's personal special effect video with the receiver's video information according to the receiver's image information in the special effect video template to obtain the multi-person special effect video includes: fusing the receiver's image information in the special effect video template with the receiver's video information to obtain the receiver's personal special effect video; and synthesizing the initiator's personal special effect video and the receiver's personal special effect video according to the special effect video template to obtain the multi-person special effect video.
  • The expression video frames of the video information are recognized by the expression recognition model to obtain facial expression frames.
  • The number of facial expression frames obtained must reach a preset number determined by the total number of frames of the special effect video template; for example, if the special effect video template has 10 frames, 10 facial expression frames are needed.
  • The face area of each facial expression frame is extracted, the composite area in each video frame of the special effect video template is determined from the image information, and the face area of each expression video frame is fused into the composite area of the corresponding template frame to obtain the personal special effect video.
  • A video frame of the multi-person special effect video includes the personal special effect image of the initiator and the personal special effect image of the recipient, as in the video frame image of the multi-person special effect video shown in FIG. 4.
  • The initiator's personal special effect video and the recipient's personal special effect video are merged to obtain the multi-person special effect video.
  • The step of synthesizing the initiator's personal special effect video and the recipient's personal special effect video according to the special effect video template to obtain the multi-person special effect video includes: merging the initiator's personal special effect video and the recipient's personal special effect video according to the image information corresponding to the initiator's personal special effect video and the image information corresponding to the recipient's personal special effect video, to obtain the multi-person special effect video.
  • The image information corresponding to the initiator's personal special effect video indicates which image information the initiator's video was synthesized into, and the image information corresponding to the recipient's personal special effect video indicates which image information the recipient's video was synthesized into.
  • Taking the initiator's personal special effect video as the base, the face area fused into each video frame of the recipient's personal special effect video is obtained and fused into the corresponding video frame of the initiator's personal special effect video, so that each frame contains both the image in the initiator's personal special effect video and the image in the recipient's personal special effect video (as shown in FIG. 4).
  • The image in a personal special effect video refers to the image obtained after the face area in the video information has been fused into the composite area of the image information.
  • In this way, the initiator's personal special effect video and the recipient's personal special effect video are merged.
  • Each time a multi-person special effect video is obtained, the currently synthesized special effect video associated with the special effect video invitation is updated, so that the synthesis progress of the current special effect video can be seen. For example, after the initiator sends a special effect video invitation to recipients, each user can see the currently synthesized special effect video through the invitation.
  • When the image information and video information of recipient A are received, step S340 to step S360 are executed to obtain the multi-person special effect video of the initiator and recipient A, whose frames include the images in the personal special effect videos of recipient A and the initiator. After the image information and video information of recipient B are received, step S340 to step S360 are executed again to obtain the multi-person special effect video of the initiator and recipients A and B.
  • When the multi-person special effect video synthesis is ended, the final multi-person special effect video is generated and a completion reminder is sent to the users who participated in it.
  • The initiator sends an instruction to end the multi-person special effect video synthesis through the terminal; based on the received instruction, the server takes the currently synthesized special effect video as the final multi-person special effect video and sends a completion reminder to the participating users.
  • Because a completion reminder is sent, users do not need to check the synthesis progress through the special effect video invitation, and the user can end the multi-person special effect video synthesis at any time.
  • The fusion method for personal special effect videos includes: recognizing the video information through an expression recognition model to obtain each expression video frame; extracting the face region in each expression video frame to obtain the face region corresponding to each expression video frame; determining the composite area in each video frame of the special effect video template according to the image information; and fusing the face region corresponding to each expression video frame into the composite area of the corresponding video frame of the special effect video template to obtain the personal special effect video.
  • The personal special effect video here may be the initiator's or the recipient's.
  • the expression recognition model is a model for recognizing expression video frames. It is built by collecting expression training pictures (web pictures, standard resource pictures, etc.); applying light/dark tone processing to the training pictures to strengthen the model's generalization ability; classifying the training pictures by expression (smiling, blinking, funny face, open mouth, etc.) to obtain pictures of each expression class; and feeding each class into a CNN-based model on the tensorflow framework for training. The resulting model can identify which class a picture belongs to, e.g., a smiling-expression picture, an open-mouth picture, a funny-face picture, or another class.
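As a small illustration of the light/dark tone augmentation step described above (the classifier itself would be a CNN trained on the tensorflow framework, as the text notes), here is a hedged numpy sketch; the function name and scale factors are assumptions:

```python
import numpy as np

# Expression classes named in the text, plus a catch-all "other".
EXPRESSION_CLASSES = ["smiling", "blinking", "funny", "open_mouth", "other"]

def tone_augment(image, factors=(0.6, 1.0, 1.4)):
    """Return light/dark variants of one training picture to improve
    the model's generalization ability (pixel values stay in 0..255)."""
    image = image.astype(np.float32)
    return [np.clip(image * f, 0, 255).astype(np.uint8) for f in factors]
```

Each original picture yields one darkened, one unchanged, and one brightened copy before being fed to the classifier.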
  • an expression video frame is a video frame of the video information in which the facial expression is smiling, blinking, a funny face, an open mouth, and so on.
  • the face area is the face portion of an expression video frame and can be identified with face recognition technology.
  • the composite area is the facial area of the character image in the video frames of the special effect video template. It can be identified with face recognition technology, or the facial area of each character image in the template can be pre-marked so that the composite area is determined directly from the image information.
  • fusing the face area corresponding to each expression video frame into the composite area of each template frame means replacing the composite area in each video frame of the special effect video template with the face area, obtaining the personal special effect video.
  • recognizing the expression video frames in the video information makes the synthesized personal special effect video more entertaining.
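The replacement step above (composite area swapped for the face area) can be sketched in numpy; the coordinates of the composite area are assumed to be known from the template's pre-marked image information:

```python
import numpy as np

def fuse_face(template_frame, face_region, top_left):
    """Replace the template frame's composite area with the face region.

    top_left: (row, col) of the composite area, assumed pre-marked
    in the template's image information."""
    frame = template_frame.copy()
    y, x = top_left
    h, w = face_region.shape[:2]
    frame[y:y + h, x:x + w] = face_region
    return frame
```

In practice the face region would first be resized to the composite area's dimensions; this sketch assumes matching sizes.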
  • the step of fusing the face area corresponding to each expression video frame into the composite area of each template frame to obtain a personal special effect video includes: determining the order of the expression frames from their order in the video information; determining the correspondence between each expression frame and each template frame from the order of the template frames and the order of the expression frames; and, according to that correspondence, fusing the face area of each expression frame into the composite area of the corresponding template frame to obtain the personal special effect video.
  • the order of the expression video frames in the video information is the order in which the video frames are played.
  • for example, if video frames l, k, j, and h are displayed in sequence when the video information is played, the order of the expression video frames is: frame l, frame k, frame j, frame h.
  • the sequence of the video frames of the special effects video template is similar to the sequence of the emoticon video frames, and will not be repeated here.
  • as for the correspondence between expression video frames and template frames: suppose the special effect video template contains frames p, y, i, and u, in the order p, y, i, u, and the video information contains expression frames l, k, j, and h, in the order l, k, j, h.
  • then template frame p corresponds to frame l, frame y to frame k, frame i to frame j, and frame u to frame h.
  • fusing the face areas accordingly means: the face area of frame l is fused into the composite area of frame p, that of frame k into the composite area of frame y, that of frame j into the composite area of frame i, and that of frame h into the composite area of frame u.
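The order-based pairing in the example above amounts to zipping the two frame sequences; a minimal sketch:

```python
# Frame names from the example, in playback order.
template_frames = ["p", "y", "i", "u"]
expression_frames = ["l", "k", "j", "h"]

# Pair the i-th template frame with the i-th expression frame.
correspondence = dict(zip(template_frames, expression_frames))
```

Each template frame's composite area is then replaced by the face area of the expression frame it maps to.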
  • the step of fusing the face area of each expression frame into the composite area of each template frame according to the correspondence, to obtain the personal special effect video, includes: fusing each face area into the corresponding composite area to obtain special effect video frames; feathering the edges of the composite area after the face area has been fused in each special effect video frame to obtain processed special effect video frames; and obtaining the personal special effect video from the processed frames.
  • feathering refers to blurring the edges of a selected image region.
  • the edges of the composite area after fusion with the face area are feathered in each special effect video frame to obtain the processed (feathered) special effect video frames; mean filtering and cvSnakeImage() can be used to smooth the contour, and applying full-image mean filtering to the mask widens the transition region. Feathering makes the synthesized special effect video frames look more natural.
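The mask-widening mean filter mentioned above can be sketched as a plain box blur on a binary mask (pure numpy; the radius is an assumption), after which the softened mask blends face and template pixels:

```python
import numpy as np

def feather_mask(mask, radius=1):
    """Box-blur (mean-filter) a 0/1 mask so its hard edge becomes
    a 0..1 ramp, widening the transition region."""
    padded = np.pad(mask.astype(float), radius, mode="edge")
    out = np.zeros(mask.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + mask.shape[0], dx:dx + mask.shape[1]]
    return out / (size * size)

def blend(face, template, soft_mask):
    """Alpha-blend the face over the template with the feathered mask."""
    return soft_mask * face + (1.0 - soft_mask) * template
```

Pixels near the seam take intermediate values between face and template, which is what makes the composite look natural.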
  • the step of recognizing the video information with the expression recognition model to obtain expression video frames includes: converting the video information to a preset video format to obtain the converted video information; and recognizing the converted video information with the expression recognition model to obtain the expression video frames.
  • the preset video format can be set according to production requirements.
  • the special effect video synthesis method of this application uniformly uses a format suited to network distribution, such as F4V, to synthesize multi-person special effect videos.
  • the video formats uploaded by different users may differ, e.g., AVI, WMV, RM, RMVB, MPEG1, MPEG2, F4V, and other formats, so the uploaded videos need to be converted to a unified format.
  • format conversion according to the preset video format can be done by obtaining the video's metadata according to its encoding and converting that metadata to the preset video format to obtain the converted video information.
  • format conversion enables multi-person special effect videos to be synthesized from videos of different formats, supporting multi-format special effect video synthesis.
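Such a normalization step is typically delegated to a transcoder like ffmpeg; here is a hedged sketch that only builds the command line (the codec choice and file names are illustrative assumptions, not from the patent):

```python
def build_convert_cmd(src_path, dst_path="unified.f4v"):
    """Build an ffmpeg command that re-encodes an upload (AVI, WMV, RM,
    RMVB, MPEG1, MPEG2, ...) into the unified F4V container."""
    return ["ffmpeg", "-y", "-i", src_path,
            "-c:v", "libx264",   # F4V carries H.264 video
            "-c:a", "aac",       # and AAC audio
            dst_path]
```

The server would run this command with `subprocess.run` before handing the converted file to the expression recognition model.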
  • the step of sending the unoccupied image information in the special effect video template to the recipient terminal according to the image information selection instruction includes: obtaining the user information in the selection instruction, the user information including user account information and user location information; and sending the unoccupied image information in the template to the recipient terminal when the user information verifies that the recipient is a designated user.
  • the user account information identifies each user.
  • the user location information is the area where the current user is located.
  • taking the case where this application implements the method as a mini program (applet) associated with WeChat, both initiator and recipient users can enter the special effect video synthesis applet through WeChat to synthesize special effect videos; on entering the applet, the user's WeChat account (the user account information) and the user's region are obtained.
  • the client on the terminal interacts with the server side (i.e., the server) of the special effect video synthesis applet.
  • when the image information selection instruction is sent through the recipient terminal, the recipient's user information is obtained and the selection instruction is generated based on it.
  • when the initiator sends the special effect video synthesis instruction through the initiator terminal, the initiator's user information may also be obtained, so that the synthesis instruction also carries user information.
  • the user information of each user can be collected for big-data analysis, enabling product demand analysis and product recommendation; obtaining user information through entertainment and recommending products improves work efficiency and the accuracy of product recommendations.
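The designated-user check on the user information carried by the selection instruction can be sketched as a simple lookup (the field names are hypothetical):

```python
def is_designated(user_info, designated_accounts):
    """Check the account carried by the selection instruction against
    the accounts the initiator designated when sending the invitation."""
    return user_info.get("account") in designated_accounts
```

Only when this check passes does the server send the unoccupied image information to the recipient terminal; an open invitation (e.g., posted to Moments) would treat every viewing account as designated.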
  • although the steps in the flowchart of FIG. 2 are displayed sequentially as indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict order to their execution and they may be executed in other orders. Moreover, at least some steps in FIG. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
  • a special effect video synthesis device including: a synthesis instruction receiving module 310, a first fusion module 320, an invitation sending module 330, an instruction receiving module 340, and an image information sending module 350 , The information acquisition module 360 and the second fusion module 370, where:
  • the synthesis instruction receiving module 310 is configured to receive the special effect video synthesis instruction sent by the initiator terminal.
  • the special effect video synthesis instruction includes the special effect video template identifier, the image information of the initiator in the special effect video template, and the initiator video information;
  • the first fusion module 320 is used for fusing the image information of the initiator and the video information of the initiator in the special effect video template to obtain the personal special effect video of the initiator;
  • the invitation sending module 330 is configured to generate a video shooting invitation according to the special effect video of the initiating party and send it to the initiating party terminal, and the video shooting invitation is sent to the designated user by the initiating party terminal;
  • the instruction receiving module 340 is configured to obtain the image information selection instruction sent by the receiving terminal receiving the video shooting invitation;
  • the image information sending module 350 is configured to send the unoccupied image information in the special effect video template to the receiving terminal according to the image information selection instruction;
  • the information acquisition module 360 is configured to acquire the image information of the special effect video template sent by the receiver terminal based on the unoccupied image information in the special effect video template, and the receiver's video information;
  • the second fusion module 370 is configured to merge the personal special effect video of the initiator and the video information of the recipient according to the image information of the recipient in the special effect video template to obtain a multi-person special effect video.
  • the second fusion module 370 is further configured to: fuse the recipient's image information in the special effect video template with the recipient's video information to obtain the recipient's personal special effect video; and synthesize the initiator's personal special effect video and the recipient's personal special effect video according to the template to obtain the multi-person special effect video.
  • the first fusion module 320 and the second fusion module 370 are further configured to: recognize the video information with the expression recognition model to obtain expression video frames; extract the face region in each expression frame to obtain the corresponding face area; determine the composite area in each template frame according to the image information; and fuse the face area of each expression frame into the composite area of the corresponding template frame to obtain the personal special effect video.
  • the first fusion module 320 and the second fusion module 370 are further configured to: determine the order of the expression video frames from their order in the video information; determine the correspondence between expression frames and template frames from the order of the template frames and the order of the expression frames; and fuse the face area of each expression frame into the composite area of the corresponding template frame according to that correspondence to obtain the personal special effect video.
  • the first fusion module 320 and the second fusion module 370 are further configured to: fuse the face area of each expression frame into the composite area of the corresponding template frame according to the correspondence to obtain special effect video frames; feather the edges of the composite area after fusion with the face area in each frame to obtain processed special effect video frames; and obtain the personal special effect video from the processed frames.
  • the first fusion module 320 and the second fusion module 370 are further configured to: convert the video information to a preset video format to obtain the converted video information; and recognize the converted video information with the expression recognition model to obtain each expression video frame.
  • the image information sending module 350 is further configured to: obtain the user information in the image information selection instruction, the user information including user account information and user location information; and send the unoccupied image information in the special effect video template to the recipient terminal when the user information verifies that the recipient is a designated user.
  • each module in the above-mentioned special effect video synthesis device can be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above-mentioned modules may be embedded in the form of hardware or independent of the processor in the computer equipment, or may be stored in the memory of the computer equipment in the form of software, so that the processor can call and execute the operations corresponding to the above-mentioned modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 6.
  • the computer equipment includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, a computer program, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the database of the computer equipment is used to store special effects video data.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program, when executed by the processor, implements a special effect video synthesis method, which includes the following steps: receiving a special effect video synthesis instruction sent by the initiator terminal, the instruction including a special effect video template identifier, the initiator's image information in the template, and the initiator's video information; fusing the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining the image information selection instruction sent by a recipient terminal that received the invitation; sending the unoccupied image information in the template to the recipient terminal according to the selection instruction; obtaining the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and fusing the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.
  • FIG. 6 is only a block diagram of part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • the specific computer device may include more or fewer components than shown in the figure, or combine some components, or have a different component arrangement.
  • a storage medium storing computer-readable instructions.
  • the storage medium is a volatile storage medium or a non-volatile storage medium.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps: receive a special effect video synthesis instruction sent by the initiator terminal, the instruction including the special effect video template identifier, the initiator's image information in the template, and the initiator's video information; fuse the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; generate a video shooting invitation from the initiator's special effect video and send it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtain the image information selection instruction sent by a recipient terminal that received the invitation; send the unoccupied image information in the template to the recipient terminal according to the selection instruction; obtain the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and fuse the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)
  • Studio Circuits (AREA)
  • Processing Or Creating Images (AREA)

Abstract

This application relates to a special effect video synthesis method, apparatus, computer device, and storage medium. Based on face recognition technology, an initiator sends a special effect video synthesis instruction from the initiator terminal to the server, selects a special effect video template and a character image, and sends the image information and video information to the server. The server synthesizes the initiator's personal special effect video from the template, image information, and video information, and generates a special effect video invitation for the initiator to send to designated users, who can then join in composing the special effect video. A designated user only needs to upload a recorded video and a chosen character image from their own terminal; the server merges the video information of the initiator and all recipients into a single special effect video, achieving multi-person special effect video synthesis. This solves the problem that the production scene of special effect videos is constrained by time and space and that operation is inconvenient.

Description

Special Effect Video Synthesis Method, Apparatus, Computer Device, and Storage Medium
This application claims priority to the Chinese patent application filed with the China National Intellectual Property Administration on November 21, 2019, with application number 201911147121.8 and invention title "Special effect video synthesis method, apparatus, computer device and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of computer technology, and in particular to a special effect video synthesis method, apparatus, computer device, and storage medium.
Background
With the development of Internet technology, resource-sharing platforms have developed a variety of Internet-based entertainment options, such as special effect synthesis technology that lets users produce special effect videos online.
The current way of producing special effect videos online is: the special effect platform provides template videos; the user selects the desired special effect template through a terminal and uploads a video to the platform, which merges the user's video into the template to obtain the user's special effect video.
When a special effect video involving multiple participants needs to be produced, after a template is selected, all participants must face the same camera at the same time, i.e., a single camera captures the scene containing all participants simultaneously. The inventor found that this subjects the production scene of special effect videos to time and space constraints and makes operation inconvenient.
Summary
On this basis, it is necessary to address the above technical problem by providing a convenient special effect video synthesis method, apparatus, computer device, and storage medium.
A special effect video synthesis method, the method comprising: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's image information in the special effect video template, and the initiator's video information; fusing the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining the image information selection instruction sent by a recipient terminal that has received the invitation; sending the unoccupied image information in the template to the recipient terminal according to the selection instruction; obtaining the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and fusing the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.
A special effect video synthesis apparatus, the apparatus comprising: a synthesis instruction receiving module configured to receive a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's image information in the template, and the initiator's video information; a first fusion module configured to fuse the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; an invitation sending module configured to generate a video shooting invitation from the initiator's special effect video and send it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; an instruction receiving module configured to obtain the image information selection instruction sent by a recipient terminal that has received the invitation; an image information sending module configured to send the unoccupied image information in the template to the recipient terminal according to the selection instruction; an information obtaining module configured to obtain the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and a second fusion module configured to fuse the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.
A computer device, comprising: one or more processors; a memory; and one or more computer programs stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to perform a special effect video synthesis method comprising the steps of: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's image information in the template, and the initiator's video information; fusing the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining the image information selection instruction sent by a recipient terminal that has received the invitation; sending the unoccupied image information in the template to the recipient terminal according to the selection instruction; obtaining the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and fusing the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements a special effect video synthesis method comprising the steps of: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's image information in the template, and the initiator's video information; fusing the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation from the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to designated users; obtaining the image information selection instruction sent by a recipient terminal that has received the invitation; sending the unoccupied image information in the template to the recipient terminal according to the selection instruction; obtaining the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information; and fusing the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video.
The above special effect video synthesis method, apparatus, computer device, and storage medium solve the problem that the production scene of special effect videos is constrained by time and space and operation is inconvenient.
Brief Description of the Drawings
FIG. 1 is an application scenario diagram of the special effect video synthesis method in one embodiment;
FIG. 2 is a schematic flowchart of the special effect video synthesis method in another embodiment;
FIG. 3 is a schematic diagram of a special effect video template in the special effect video synthesis method in one embodiment;
FIG. 4 is a schematic diagram of a video frame image of a multi-person special effect video in the special effect video synthesis method in one embodiment;
FIG. 5 is a structural block diagram of the special effect video synthesis apparatus in another embodiment;
FIG. 6 is an internal structure diagram of a computer device in one embodiment.
Detailed Description
The special effect video synthesis method provided by this application is applicable to the field of artificial intelligence and can be applied in the application environment shown in FIG. 1, in which a terminal 102 communicates with a server 104 over a network. The server 104 receives a special effect video synthesis instruction sent by the initiator terminal 102; the instruction includes a special effect video template identifier, the initiator's image information in the template, and the initiator's video information. The server 104 fuses the initiator's image information in the template with the initiator's video information to obtain the initiator's personal special effect video, generates a video shooting invitation from the initiator's special effect video, and sends it to the initiator terminal 102, which forwards the invitation to designated users. The server 104 obtains the image information selection instruction sent by a recipient terminal 102 that received the invitation, sends the unoccupied image information in the template to the recipient terminal 102 according to that instruction, obtains the template image information and the recipient's video information sent back by the recipient terminal 102 based on the unoccupied image information, and fuses the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video. The terminals 102 include initiator terminals and recipient terminals; there may be one or more of each. A terminal 102 may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet, or portable wearable device, and the server 104 may be implemented as an independent server or a cluster of servers.
In one embodiment, as shown in FIG. 2, a special effect video synthesis method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
Step S220: receive a special effect video synthesis instruction sent by the initiator terminal; the instruction includes a special effect video template identifier, the initiator's image information in the special effect video template, and the initiator's video information.
The initiator terminal is the terminal held by the person initiating the special effect video. Special effect video templates may be preset, e.g., a fireworks scene, a game scene, an animation scene, or a basketball scene; FIG. 3 shows a fireworks-scene template. A template may offer multiple selectable character images or only one. An initiator who only wants a personal special effect video can choose a template with one or more character images; an initiator who wants a multi-person special effect video can choose a template with multiple character images. The template identifier identifies each template; each template corresponds to one identifier. The image information refers to the special effect character, such as the piglet in the template picture of FIG. 3. The synthesis instruction is generated when the initiator selects the desired template on the template selection page: the initiator terminal determines the corresponding template identifier from the selection, generates the synthesis instruction, and sends it to the server. Alternatively, the initiator may choose a custom special effect template on the selection page, and the terminal generates a synthesis instruction containing the custom template and sends it to the server.
The corresponding template can be fetched from a template database by its identifier, and the initiator's image information in the template determines which character the initiator wants to synthesize. The template database stores all templates, and a unique template can be found in it by its identifier. When the received synthesis instruction carries a custom template, the custom template is processed: its selectable character images and the composite area corresponding to each image are determined so that its format matches the templates in the database, producing a processed custom template. The processed custom template can also be stored in the template database as the initiator's personal template.
Before receiving the synthesis instruction, the templates are sent to the initiator terminal so the initiator can pick a preferred template and character image. Based on the selection, the template identifier is determined and the initiator records the initiator video (the initiator's personal video), which must contain at least a certain number of face video frames and whose duration must fall within a preset range. The initiator terminal then generates the synthesis instruction from the template identifier, the initiator's image information, and the video, and sends it to the server. The terminal can also check the face video frames and the duration to decide whether the recording meets the requirements; if the recording has fewer than the preset number of face frames or its duration is out of range, the user is prompted to re-record. The preset number may be 50 to 1000 frames and the preset duration 10 s to 500 s.
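The recording check just described can be sketched as a small predicate; the thresholds below use the lower bounds from the stated ranges and the function name is an assumption:

```python
def recording_ok(face_frame_count, duration_s,
                 min_frames=50, min_s=10, max_s=500):
    """Re-prompt the user when the recording has too few face video
    frames or its duration falls outside the preset range."""
    return face_frame_count >= min_frames and min_s <= duration_s <= max_s
```

The terminal would run this check locally after recording and, on failure, ask the user to shoot again before anything is uploaded.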
Step S240: fuse the initiator's image information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video.
Specifically, an expression recognition model recognizes the expression frames of the video information to obtain face expression frames, whose number must reach a preset count determined by the total frame count of the template; e.g., if the template has 10 frames, 10 face expression frames are needed. The face area of each expression frame is extracted, the composite area in each template frame is determined from the image information, and the face area of each expression frame is fused into the composite area of the corresponding template frame to obtain the personal special effect video.
Step S260: generate a video shooting invitation from the initiator's special effect video and send it to the initiator terminal; the invitation is forwarded by the initiator terminal to designated users.
The generated invitation may be a link, a QR code, etc., through which the currently synthesized special effect video can be viewed and joined. After the invitation is sent to the initiator terminal, the initiator can forward it to designated users, i.e., the user accounts the initiator designates to receive it. The initiator can restrict who is invited, e.g., sending the invitation to the accounts of user A and user B, which then become the designated users; or leave the invitees open, e.g., posting the invitation to Moments, in which case any account that can see the invitation is a designated user.
Step S280: obtain the image information selection instruction sent by a recipient terminal that has received the video shooting invitation.
A recipient is a designated user who received the invitation. On receiving the initiator's invitation, the recipient can view the initiator's special effect video through it; if the recipient accepts, the recipient taps the invitation on the recipient terminal, opens the page showing the initiator's special effect video, and from that page sends an image information selection instruction to the server.
Step S300: send the unoccupied image information in the special effect video template to the recipient terminal according to the image information selection instruction.
When the server receives the selection instruction, it checks the usage of the template's character images for this synthesis and determines which are unoccupied. For example, if a template has four character images q, w, e, and r, the initiator may choose any one; if the initiator chose w, only q, e, and r remain unoccupied, so on the first selection instruction the server sends q, e, and r to the recipient terminal. If multiple recipient terminals send selection instructions simultaneously, the same unoccupied images are sent to all of them; if several recipients then choose the same image, the first sender occupies it, and later senders are reminded that the image is occupied and asked to choose again.
Step S320: obtain the template image information and the recipient's video information sent by the recipient terminal based on the unoccupied image information in the template.
After receiving the server's unoccupied image information, the recipient can view it on the recipient terminal, choose a character image for the synthesis, record a personal video, and send the chosen image information and the video information to the server through the recipient terminal. On receiving them, the server marks that image as occupied. Once a character image in the template has been chosen by another designated user it cannot be chosen again, and the recipient is notified to re-select.
Step S340: fuse the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the special effect video template to obtain a multi-person special effect video.
Specifically, the recipient's personal special effect video can be obtained from the recipient's image information in the template and the recipient's video information, and the initiator's and recipient's personal special effect videos are then synthesized according to the template to obtain the multi-person special effect video. The frames of the multi-person special effect video include the initiator's personal effect image and the personal effect images of the recipients, as shown in FIG. 4. The initiator's and recipient's personal special effect videos are fused according to the image information each corresponds to, obtaining the multi-person special effect video.
In the above method, the initiator sends a synthesis instruction to the server from the terminal, selects a template and character image, and sends image information and video information to the server; the server synthesizes the initiator's personal special effect video and generates the initiator's special effect video invitation, which the initiator sends to designated users so that they can join the synthesis. A designated user only needs to upload a recorded video and a chosen character image from their terminal; the server merges the video information of the initiator and all recipients into a single special effect video, achieving multi-person synthesis without one camera capturing all participants at the same time. Each participant simply uploads a personal video, and the server performs the multi-person synthesis on the uploaded videos, solving the problem that the production scene of special effect videos is constrained by time and space and operation is inconvenient.
In one embodiment, the step of fusing the initiator's personal special effect video with the recipient's video information according to the recipient's image information in the template to obtain a multi-person special effect video includes: fusing the recipient's image information in the template with the recipient's video information to obtain the recipient's personal special effect video; and synthesizing the initiator's and the recipient's personal special effect videos according to the template to obtain the multi-person special effect video.
Specifically, the expression recognition model recognizes the expression frames of the video information to obtain face expression frames, whose number must reach a preset count determined by the total frame count of the template (e.g., a 10-frame template needs 10 face expression frames). The face areas of the expression frames are extracted, the composite area in each template frame is determined from the image information, and each face area is fused into the corresponding composite area to obtain the personal special effect video. The frames of the multi-person special effect video include the personal effect images of the initiator and the recipient, as shown in FIG. 4; the two personal special effect videos are fused according to their respective image information to obtain the multi-person special effect video.
In one embodiment, the step of synthesizing the initiator's and the recipient's personal special effect videos according to the template to obtain the multi-person special effect video includes: fusing the initiator's and the recipient's personal special effect videos according to the image information each corresponds to, obtaining the multi-person special effect video.
The image information corresponding to the initiator's personal special effect video identifies which character the initiator synthesized, and the image information corresponding to the recipient's personal special effect video identifies which character the recipient synthesized. Starting from the initiator's personal special effect video, the face areas fused into each frame of the recipient's personal special effect video can be extracted and fused into the corresponding frames of the initiator's video, forming frames that contain the characters from both personal special effect videos (FIG. 4). Alternatively, starting from the recipient's personal special effect video, the face areas from the initiator's frames can be fused into the recipient's frames. A character in a personal special effect video means the character after the face area from the video information has been fused into its composite area.
In one embodiment, after the initiator's and a recipient's personal special effect videos are fused into a multi-person special effect video, the currently synthesized video in the special effect video invitation is updated so that users viewing the invitation can see the current synthesis progress. For example, after the initiator sends the invitation to the recipients, every user can see the currently synthesized video through it; when a recipient's image information and video information are received, steps S340 to S360 are executed to obtain the multi-person video, in the order in which the recipients' information arrives. For instance, if the initiator invited recipients A and B and A's image and video information arrive first, steps S340 to S360 produce the multi-person special effect video of recipient A and the initiator, containing the characters from both personal special effect videos; the invitation's current video is then updated to it, so anyone viewing through the invitation sees that video. When recipient B's image and video information later arrive, steps S340 to S360 are executed again to produce the multi-person special effect video of the initiator and recipients A and B, which viewers of the invitation then see. By updating the invitation's current video in real time, users learn the synthesis progress in real time.
In one embodiment, when all character images in the template have been selected and synthesized, the multi-person synthesis ends, the final multi-person special effect video is generated, and a completion reminder is sent to the participating users. When unselected character images remain in the template, the initiator may also send an end-synthesis instruction through the terminal, and the server, based on that instruction, takes the currently synthesized video as the final multi-person special effect video and sends the completion reminder to the participating users. Sending the reminder after the final video is generated means users need not check progress through the invitation, and the end-synthesis instruction lets the user stop the multi-person synthesis at any time.
In one embodiment, the fusion method for a personal special effect video includes: recognizing the video information with an expression recognition model to obtain expression video frames; extracting the face area in each expression frame to obtain the face area corresponding to each frame; determining the composite area in each template frame from the image information; and fusing each face area into the composite area of the corresponding template frame to obtain the personal special effect video.
The personal special effect video may be the initiator's or a recipient's. The expression recognition model is a model for recognizing expression video frames. It is built by collecting expression training pictures (web pictures, standard resource pictures, etc.); applying light/dark tone processing to the training pictures to strengthen the model's generalization ability; classifying the training pictures by expression (smiling, blinking, funny face, open mouth, etc.) to obtain pictures of each expression class; and feeding the classes into a CNN-based model on the tensorflow framework for training, yielding an expression recognition model that can identify which class a picture belongs to, e.g., a smiling-expression picture, an open-mouth picture, a funny-face picture, or another class.
An expression video frame is a frame of the video information in which the face shows an expression such as smiling, blinking, a funny face, or an open mouth. The face area is the face portion of an expression frame and can be identified with face recognition technology. The composite area is the facial area of the character image in a template frame; it can be identified with face recognition technology, or the facial areas of the template's character images can be pre-marked so the composite area is determined directly from the image information. Fusing each face area into the composite area of each template frame means replacing the composite area in each template frame with the face area to obtain the personal special effect video; recognizing the expression frames in the video information makes the synthesized personal special effect video more entertaining.
In one embodiment, the step of fusing each face area into the composite area of each template frame to obtain the personal special effect video includes: determining the order of the expression frames from their order in the video information; determining the correspondence between each expression frame and each template frame from the order of the template frames and the order of the expression frames; and, according to that correspondence, fusing the face area of each expression frame into the composite area of the corresponding template frame to obtain the personal special effect video.
The order of the expression frames in the video information is their playback order; e.g., if the video information contains frames l, k, j, and h, displayed in that sequence during playback, the order of the expression frames is: frame l, frame k, frame j, frame h. The order of the template frames is analogous and is not repeated here. As for the correspondence between expression frames and template frames: suppose the template contains frames p, y, i, and u, in the order p, y, i, u, and the video information contains expression frames l, k, j, and h, in the order l, k, j, h; then template frame p corresponds to frame l, frame y to frame k, frame i to frame j, and frame u to frame h. Fusing accordingly means the face area of frame l is fused into the composite area of frame p, that of frame k into the composite area of frame y, that of frame j into the composite area of frame i, and that of frame h into the composite area of frame u.
In one embodiment, the step of fusing each face area into the composite area of each template frame according to the correspondence to obtain the personal special effect video includes: fusing each face area into the corresponding composite area to obtain special effect video frames; feathering the edges of the composite area after the face area has been fused in each special effect frame to obtain processed special effect frames; and obtaining the personal special effect video from the processed frames.
Feathering blurs the edges of a selected image region. Mean filtering and cvSnakeImage() can both be used to smooth the contour, and applying full-image mean filtering to the mask widens the transition region, completing the feathering of the composite-area edges in each special effect frame and yielding the processed (feathered) frames. Feathering the special effect frames makes the synthesized frames look more natural.
In one embodiment, the step of recognizing the video information with the expression recognition model to obtain the expression frames includes: converting the video information to a preset video format to obtain the converted video information; and recognizing the converted video information with the model to obtain the expression frames.
The preset video format can be set by production requirements; for example, the special effect video synthesis method of this application uniformly synthesizes multi-person special effect videos in a format suited to network distribution, such as F4V. Because uploads from different users may be in different formats (AVI, WMV, RM, RMVB, MPEG1, MPEG2, F4V, etc.), they need to be converted to a unified format. The conversion can obtain the video's metadata according to its encoding and convert that metadata to the preset format to obtain the converted video information. Format conversion enables multi-person special effect videos to be made from videos of different formats, supporting multi-format special effect video synthesis.
In one embodiment, the step of sending the unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction includes: obtaining the user information in the character information selection instruction, the user information including user account information and user location information; and, when the user information verifies that the receiver is the designated user, sending the unoccupied character information of the special effect video template to the receiver terminal.
Here, the user account information can serve as an identifier for each user, and the user location information is the region where the user is currently located. Take as an example an implementation of the special effect video synthesis method as a mini program associated with WeChat: both the initiator and the receiver can enter the special effect video synthesis mini program through WeChat to perform special effect video synthesis. On entering the mini program, the user's WeChat account is obtained as the user account information, and the user's region is obtained as the user location information. The user then enters the mini program's client on the terminal, which interacts with the mini program's server. When the receiver terminal sends the character information selection instruction, the receiver's user information is obtained and the instruction is generated based on it. When the initiator sends the special effect video synthesis instruction through the initiator terminal, the initiator's user information can likewise be obtained so that the instruction also carries user information. The user information of each user can be collected for big data analysis, which in turn supports product demand analysis and product recommendations to users. Obtaining user information in this entertaining way and using it for product recommendation improves work efficiency and the precision of product recommendation.
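The verification gate before sending the unoccupied character information can be sketched as follows; all names (`designated_accounts`, the instruction dictionary keys, the role labels) are hypothetical:

```python
def unoccupied_roles(template_roles, occupied):
    """Character slots of the template not yet taken by any participant."""
    return [role for role in template_roles if role not in occupied]

def handle_selection(instruction, designated_accounts, template_roles, occupied):
    """Return the unoccupied character information only when the receiver's
    account (carried in the selection instruction) matches a user the
    initiator designated; otherwise refuse by returning None."""
    account = instruction.get("user_account")
    if account not in designated_accounts:
        return None
    return unoccupied_roles(template_roles, occupied)
```

In the described flow, `occupied` would at least contain the character the initiator already chose, so the receiver is only offered the remaining slots.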
It should be understood that although the steps in the flowchart of FIG. 2 are displayed sequentially as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, the execution of these steps is not strictly ordered, and they may be executed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different times; their execution order is likewise not necessarily sequential, and they may be executed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
In one embodiment, as shown in FIG. 5, a special effect video synthesis apparatus is provided, including: a synthesis instruction receiving module 310, a first fusion module 320, an invitation sending module 330, an instruction receiving module 340, a character information sending module 350, an information obtaining module 360, and a second fusion module 370, wherein:
the synthesis instruction receiving module 310 is configured to receive a special effect video synthesis instruction sent by the initiator terminal, the instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information;
the first fusion module 320 is configured to fuse the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video;
the invitation sending module 330 is configured to generate a video shooting invitation based on the initiator's special effect video and send it to the initiator terminal, the invitation being forwarded by the initiator terminal to the designated user;
the instruction receiving module 340 is configured to obtain the character information selection instruction sent by the receiver terminal that received the video shooting invitation;
the character information sending module 350 is configured to send the unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction;
the information obtaining module 360 is configured to obtain the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information of the template, as well as the receiver's video information; and
the second fusion module 370 is configured to fuse the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the special effect video template to obtain a multi-person special effect video.
In one embodiment, the second fusion module 370 is further configured to: fuse the receiver's character information in the special effect video template with the receiver's video information to obtain the receiver's personal special effect video; and synthesize the initiator's personal special effect video and the receiver's personal special effect video according to the special effect video template to obtain the multi-person special effect video.
In one embodiment, the first fusion module 320 and the second fusion module 370 are further configured to: recognize the video information with an expression recognition model to obtain expression video frames; extract the face region from each expression video frame to obtain the face region of each expression video frame; determine the composite region in each video frame of the special effect video template according to the character information; and fuse the face region of each expression video frame into the composite region of the corresponding template video frame to obtain a personal special effect video.
In one embodiment, the first fusion module 320 and the second fusion module 370 are further configured to: determine the order of the expression video frames according to their order in the video information; determine the correspondence between the expression video frames and the template video frames from the order of the template's video frames and the order of the expression video frames; and, according to this correspondence, fuse the face region of each expression video frame into the composite region of the corresponding template video frame to obtain a personal special effect video.
In one embodiment, the first fusion module 320 and the second fusion module 370 are further configured to: fuse, according to the correspondence between the expression video frames and the template video frames, the face region of each expression video frame into the composite region of the corresponding template video frame to obtain special effect video frames; feather the edges of the composite region of each special effect video frame after the face region has been fused in, to obtain processed special effect video frames; and obtain the personal special effect video from the processed special effect video frames.
In one embodiment, the first fusion module 320 and the second fusion module 370 are further configured to: convert the format of the video information according to a preset video format to obtain converted video information; and recognize the converted video information with the expression recognition model to obtain the expression video frames.
In one embodiment, the character information sending module 350 is further configured to: obtain the user information in the character information selection instruction, the user information including user account information and user location information; and, when the user information verifies that the receiver is the designated user, send the unoccupied character information of the special effect video template to the receiver terminal.
For the specific limitations of the special effect video synthesis apparatus, refer to the limitations of the special effect video synthesis method above, which are not repeated here. Each module of the above apparatus may be implemented wholly or partly in software, hardware, or a combination thereof. The modules may be embedded in or independent of a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server whose internal structure may be as shown in FIG. 6. The computer device includes a processor, a memory, a network interface, and a database connected via a system bus. The processor of the computer device provides computing and control capabilities. The memory includes a non-volatile storage medium and internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device stores special effect video data. The network interface communicates with external terminals over a network. When executed by the processor, the computer program implements a special effect video synthesis method, the method comprising the following steps: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information; fusing the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation based on the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to the designated user; obtaining the character information selection instruction sent by the receiver terminal that received the invitation; sending the unoccupied character information of the special effect video template to the receiver terminal according to the instruction; obtaining the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information, as well as the receiver's video information; and fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the template to obtain a multi-person special effect video.
Those skilled in the art will understand that the structure shown in FIG. 6 is merely a block diagram of part of the structure relevant to the solution of this application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In one embodiment, a storage medium storing computer-readable instructions is provided, the storage medium being a volatile or non-volatile storage medium. When executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps: receiving a special effect video synthesis instruction sent by an initiator terminal, the instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information; fusing the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video; generating a video shooting invitation based on the initiator's special effect video and sending it to the initiator terminal, the invitation being forwarded by the initiator terminal to the designated user; obtaining the character information selection instruction sent by the receiver terminal that received the invitation; sending the unoccupied character information of the special effect video template to the receiver terminal according to the instruction; obtaining the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information, as well as the receiver's video information; and fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the template to obtain a multi-person special effect video.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features of the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered within the scope of this specification.

Claims (20)

  1. A special effect video synthesis method, wherein the method comprises:
    receiving a special effect video synthesis instruction sent by an initiator terminal, the special effect video synthesis instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information;
    fusing the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video;
    generating a video shooting invitation based on the initiator's special effect video and sending it to the initiator terminal, the video shooting invitation being sent by the initiator terminal to a designated user;
    obtaining a character information selection instruction sent by a receiver terminal that received the video shooting invitation;
    sending unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction;
    obtaining the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information of the special effect video template, as well as the receiver's video information; and
    fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the special effect video template to obtain a multi-person special effect video.
  2. The method according to claim 1, wherein the step of fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information of the special effect video template to obtain a multi-person special effect video comprises:
    fusing the receiver's character information in the special effect video template with the receiver's video information to obtain the receiver's personal special effect video; and
    synthesizing the initiator's personal special effect video and the receiver's personal special effect video according to the special effect video template to obtain the multi-person special effect video.
  3. The method according to claim 1 or 2, wherein the fusion method for a personal special effect video comprises:
    recognizing the video information with an expression recognition model to obtain expression video frames;
    extracting the face region from each of the expression video frames to obtain the face region of each expression video frame;
    determining the composite region in each video frame of the special effect video template according to the character information; and
    fusing the face region of each expression video frame into the composite region of the corresponding video frame of the special effect video template to obtain the personal special effect video.
  4. The method according to claim 3, wherein the step of fusing the face region of each expression video frame into the composite region of each video frame of the special effect video template to obtain the personal special effect video comprises:
    determining the order of the expression video frames according to their order in the video information;
    determining the correspondence between the expression video frames and the video frames of the special effect video template according to the order of the video frames of the special effect video template and the order of the expression video frames; and
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video.
  5. The method according to claim 4, wherein the step of fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video comprises:
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain special effect video frames;
    feathering the edges of the composite region of each special effect video frame after the face region has been fused in, to obtain processed special effect video frames; and
    obtaining the personal special effect video from the processed special effect video frames.
  6. The method according to claim 3, wherein the step of recognizing the video information with the expression recognition model to obtain the expression video frames comprises:
    converting the format of the video information according to a preset video format to obtain converted video information; and
    recognizing the converted video information with the expression recognition model to obtain the expression video frames.
  7. The method according to claim 1, wherein the step of sending the unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction comprises:
    obtaining user information in the character information selection instruction, the user information including user account information and user location information; and
    sending the unoccupied character information of the special effect video template to the receiver terminal when the user information verifies that the receiver is the designated user.
  8. A special effect video synthesis apparatus, wherein the apparatus comprises:
    a synthesis instruction receiving module, configured to receive a special effect video synthesis instruction sent by an initiator terminal, the special effect video synthesis instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information;
    a first fusion module, configured to fuse the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video;
    an invitation sending module, configured to generate a video shooting invitation based on the initiator's special effect video and send it to the initiator terminal, the video shooting invitation being sent by the initiator terminal to a designated user;
    an instruction receiving module, configured to obtain a character information selection instruction sent by a receiver terminal that received the video shooting invitation;
    a character information sending module, configured to send unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction;
    an information obtaining module, configured to obtain the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information of the special effect video template, as well as the receiver's video information; and
    a second fusion module, configured to fuse the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the special effect video template to obtain a multi-person special effect video.
  9. A computer device, comprising:
    one or more processors;
    a memory; and
    one or more computer programs, wherein the one or more computer programs are stored in the memory and configured to be executed by the one or more processors, the one or more computer programs being configured to perform a special effect video synthesis method; wherein the special effect video synthesis method comprises the following steps:
    receiving a special effect video synthesis instruction sent by an initiator terminal, the special effect video synthesis instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information;
    fusing the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video;
    generating a video shooting invitation based on the initiator's special effect video and sending it to the initiator terminal, the video shooting invitation being sent by the initiator terminal to a designated user;
    obtaining a character information selection instruction sent by a receiver terminal that received the video shooting invitation;
    sending unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction;
    obtaining the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information of the special effect video template, as well as the receiver's video information; and
    fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the special effect video template to obtain a multi-person special effect video.
  10. The computer device according to claim 9, wherein the step of fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information of the special effect video template to obtain a multi-person special effect video comprises:
    fusing the receiver's character information in the special effect video template with the receiver's video information to obtain the receiver's personal special effect video; and
    synthesizing the initiator's personal special effect video and the receiver's personal special effect video according to the special effect video template to obtain the multi-person special effect video.
  11. The computer device according to claim 9 or 10, wherein the fusion method for a personal special effect video comprises:
    recognizing the video information with an expression recognition model to obtain expression video frames;
    extracting the face region from each of the expression video frames to obtain the face region of each expression video frame;
    determining the composite region in each video frame of the special effect video template according to the character information; and
    fusing the face region of each expression video frame into the composite region of the corresponding video frame of the special effect video template to obtain the personal special effect video.
  12. The computer device according to claim 11, wherein the step of fusing the face region of each expression video frame into the composite region of each video frame of the special effect video template to obtain the personal special effect video comprises:
    determining the order of the expression video frames according to their order in the video information;
    determining the correspondence between the expression video frames and the video frames of the special effect video template according to the order of the video frames of the special effect video template and the order of the expression video frames; and
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video.
  13. The computer device according to claim 12, wherein the step of fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video comprises:
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain special effect video frames;
    feathering the edges of the composite region of each special effect video frame after the face region has been fused in, to obtain processed special effect video frames; and
    obtaining the personal special effect video from the processed special effect video frames.
  14. The computer device according to claim 11, wherein the step of recognizing the video information with the expression recognition model to obtain the expression video frames comprises:
    converting the format of the video information according to a preset video format to obtain converted video information; and
    recognizing the converted video information with the expression recognition model to obtain the expression video frames.
  15. The computer device according to claim 9, wherein the step of sending the unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction comprises:
    obtaining user information in the character information selection instruction, the user information including user account information and user location information; and
    sending the unoccupied character information of the special effect video template to the receiver terminal when the user information verifies that the receiver is the designated user.
  16. A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements a special effect video synthesis method; wherein the special effect video synthesis method comprises the following steps:
    receiving a special effect video synthesis instruction sent by an initiator terminal, the special effect video synthesis instruction including a special effect video template identifier, the initiator's character information in the special effect video template, and the initiator's video information;
    fusing the initiator's character information in the special effect video template with the initiator's video information to obtain the initiator's personal special effect video;
    generating a video shooting invitation based on the initiator's special effect video and sending it to the initiator terminal, the video shooting invitation being sent by the initiator terminal to a designated user;
    obtaining a character information selection instruction sent by a receiver terminal that received the video shooting invitation;
    sending unoccupied character information of the special effect video template to the receiver terminal according to the character information selection instruction;
    obtaining the character information of the special effect video template sent by the receiver terminal based on the unoccupied character information of the special effect video template, as well as the receiver's video information; and
    fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information in the special effect video template to obtain a multi-person special effect video.
  17. The computer-readable storage medium according to claim 16, wherein the step of fusing the initiator's personal special effect video with the receiver's video information according to the receiver's character information of the special effect video template to obtain a multi-person special effect video comprises:
    fusing the receiver's character information in the special effect video template with the receiver's video information to obtain the receiver's personal special effect video; and
    synthesizing the initiator's personal special effect video and the receiver's personal special effect video according to the special effect video template to obtain the multi-person special effect video.
  18. The computer-readable storage medium according to claim 16 or 17, wherein the fusion method for a personal special effect video comprises:
    recognizing the video information with an expression recognition model to obtain expression video frames;
    extracting the face region from each of the expression video frames to obtain the face region of each expression video frame;
    determining the composite region in each video frame of the special effect video template according to the character information; and
    fusing the face region of each expression video frame into the composite region of the corresponding video frame of the special effect video template to obtain the personal special effect video.
  19. The computer-readable storage medium according to claim 18, wherein the step of fusing the face region of each expression video frame into the composite region of each video frame of the special effect video template to obtain the personal special effect video comprises:
    determining the order of the expression video frames according to their order in the video information;
    determining the correspondence between the expression video frames and the video frames of the special effect video template according to the order of the video frames of the special effect video template and the order of the expression video frames; and
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video.
  20. The computer-readable storage medium according to claim 19, wherein the step of fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain the personal special effect video comprises:
    fusing, according to the correspondence between the expression video frames and the video frames of the special effect video template, the face region of each expression video frame into the composite region of the corresponding video frame to obtain special effect video frames;
    feathering the edges of the composite region of each special effect video frame after the face region has been fused in, to obtain processed special effect video frames; and
    obtaining the personal special effect video from the processed special effect video frames.
PCT/CN2020/087712 2019-11-21 2020-04-29 Special effect video synthesis method and apparatus, computer device and storage medium WO2021098151A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911147121.8A CN111147766A (zh) 2019-11-21 2019-11-21 Special effect video synthesis method and apparatus, computer device and storage medium
CN201911147121.8 2019-11-21

Publications (1)

Publication Number Publication Date
WO2021098151A1 true WO2021098151A1 (zh) 2021-05-27

Family

ID=70517212

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/087712 WO2021098151A1 (zh) Special effect video synthesis method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN111147766A (zh)
WO (1) WO2021098151A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112153422B (zh) * 2020-09-25 2023-03-31 连尚(北京)网络科技有限公司 Video fusion method and device
CN112312163B (zh) * 2020-10-30 2024-05-28 北京字跳网络技术有限公司 Video generation method and apparatus, electronic device and storage medium
CN113806306B (zh) * 2021-08-04 2024-01-16 北京字跳网络技术有限公司 Media file processing method, apparatus, device, readable storage medium and product
CN114429611B (zh) * 2022-04-06 2022-07-08 北京达佳互联信息技术有限公司 Video synthesis method, apparatus, electronic device and storage medium

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000175171A (ja) * 1998-12-03 2000-06-23 Nec Corp Video generation device for a video conference and generation method therefor
CN102665026A (zh) * 2012-05-03 2012-09-12 华为技术有限公司 Method, device and system for realizing a remote group photo using a video conference
CN106331529A (zh) * 2016-10-27 2017-01-11 广东小天才科技有限公司 Image shooting method and device
CN106375193A (zh) * 2016-09-09 2017-02-01 四川长虹电器股份有限公司 Remote group photo method
CN107734257A (zh) * 2017-10-25 2018-02-23 北京玩拍世界科技有限公司 Group video shooting method and device
CN109040647A (zh) * 2018-08-31 2018-12-18 北京小鱼在家科技有限公司 Media information synthesis method, apparatus, device and storage medium
CN109785229A (zh) * 2019-01-11 2019-05-21 百度在线网络技术(北京)有限公司 Blockchain-based intelligent group photo method, apparatus, device and medium
CN110012352A (zh) * 2019-04-17 2019-07-12 广州华多网络科技有限公司 Image special effect processing method and apparatus, and live video streaming terminal
CN110166799A (zh) * 2018-07-02 2019-08-23 腾讯科技(深圳)有限公司 Live streaming interaction method, apparatus and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005033532A (ja) * 2003-07-14 2005-02-03 Noritsu Koki Co Ltd Photograph processing device
CN104680480B (zh) * 2013-11-28 2019-04-02 腾讯科技(上海)有限公司 Image processing method and device
US10057205B2 (en) * 2014-11-20 2018-08-21 GroupLuv, Inc. Systems and methods for creating and accessing collaborative electronic multimedia compositions
CN106355551A (zh) * 2016-08-26 2017-01-25 北京金山安全软件有限公司 Collage processing method and apparatus, electronic device and server
KR101894956B1 (ko) * 2017-06-21 2018-10-24 주식회사 미디어프론트 Image generation server and method using real-time augmented synthesis technology
CN108259788A (zh) * 2018-01-29 2018-07-06 努比亚技术有限公司 Video editing method, terminal and computer-readable storage medium
CN110121094A (zh) * 2019-06-20 2019-08-13 广州酷狗计算机科技有限公司 Method, apparatus, device and storage medium for displaying a co-shooting video template


Also Published As

Publication number Publication date
CN111147766A (zh) 2020-05-12


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20889519

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20889519

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 29-09-2022)